Receipt OCR

Description

I took one of my King Soopers receipts and wrote a script that extracts data from it using the Taggun API, which reads receipts with OCR (optical character recognition).

Here’s what the JSON response was.

[Screenshot: JSON response from the Taggun API]

I extracted the data and loaded it into a pandas DataFrame. It looks like this:

[Screenshot: extracted receipt data in a pandas DataFrame]

Imagine the time savings if we needed to extract totals from hundreds of receipts :). Not all of the extracted information was correct, but the total amount, tax, city, and country were. The response did not include the itemized lines, though Taggun might have other endpoints for that. This output came from their simple/file endpoint.

Data

The script takes an image file of a receipt as input. It makes an API call to Taggun and receives a JSON response containing the information Taggun was able to parse from the receipt.
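
The call described above might look roughly like the sketch below. It is not the project's actual script: the full URL, the `apikey` header name, and the `file` form field are assumptions based on the simple/file endpoint mentioned earlier and should be checked against Taggun's documentation. The helper names `guess_content_type` and `scan_receipt` are made up for this example.

```python
import mimetypes
import os

# Assumed endpoint URL; verify against Taggun's current documentation.
TAGGUN_URL = "https://api.taggun.io/api/receipt/v1/simple/file"


def guess_content_type(image_path):
    """Guess the MIME type of the receipt image from its file extension."""
    content_type, _ = mimetypes.guess_type(image_path)
    return content_type or "application/octet-stream"


def scan_receipt(image_path, api_key):
    """POST a receipt image to Taggun and return the parsed JSON response."""
    # Third-party dependency, imported here so the helper above
    # can be used without it being installed.
    import requests

    with open(image_path, "rb") as image_file:
        response = requests.post(
            TAGGUN_URL,
            # Assumed header name for the API key.
            headers={"apikey": api_key, "Accept": "application/json"},
            # Multipart upload: (file name, open file handle, MIME type).
            files={
                "file": (
                    os.path.basename(image_path),
                    image_file,
                    guess_content_type(image_path),
                )
            },
            timeout=30,
        )
    # Raise an exception on 4xx/5xx instead of parsing an error body.
    response.raise_for_status()
    return response.json()
```

Calling `scan_receipt("receipt.jpg", api_key)` with a valid key would return the response as a Python dictionary. The file name here is only a placeholder.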

Code

The script is pretty simple. It makes a POST request to the API endpoint, sending the open file handle along with it. It then loads the JSON data into a pandas DataFrame and writes that data to a new Excel workbook.
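
The JSON-to-Excel step could be sketched as follows. This assumes each extracted field in the response is an object with a `data` key (plus metadata such as a confidence score), which may not match the real response exactly. `flatten_response` and `write_to_excel` are hypothetical names, and writing `.xlsx` files with pandas also requires an Excel engine such as openpyxl.

```python
def flatten_response(response):
    """Keep only fields shaped like {"data": ...} and map each name to its value."""
    row = {}
    for field_name, field_value in response.items():
        # Assumed response shape: extracted fields are dicts with a "data" key.
        if isinstance(field_value, dict) and "data" in field_value:
            row[field_name] = field_value["data"]
    return row


def write_to_excel(response, workbook_path):
    """Load one receipt's fields into a DataFrame and save it as a workbook."""
    # Third-party dependency, imported here so flatten_response
    # can be used without it being installed.
    import pandas as pd

    # One receipt becomes one row; each extracted field becomes a column.
    frame = pd.DataFrame([flatten_response(response)])
    frame.to_excel(workbook_path, index=False)
    return frame
```

To process hundreds of receipts, the same idea extends naturally: collect one flattened row per receipt in a list and build a single DataFrame from that list before writing the workbook.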

Click here to view this project's repository