Introduction
While building a budgeting app, I identified a feature that had value beyond personal expense tracking. By enabling users to scan supermarket receipts, the application can extract structured purchase data and analyze individual spending patterns automatically.
This capability not only simplifies budgeting for users but also highlights a broader opportunity for the retail industry. Receipt-level data can provide insights into consumer behavior and enable retailers to deliver more targeted, data-driven promotional offers tailored to specific customers.
Problem Statement: Manual Expense Tracking Doesn’t Scale
Tracking expenses manually is time-consuming and error-prone. Most budgeting applications rely on users to enter purchase details by hand, which often leads to incomplete data and poor long-term adoption.
Supermarket receipts contain rich information—item names, prices, categories, totals—but this data is usually locked away in unstructured formats such as images or PDFs. Without automation, extracting and organizing this information becomes a significant challenge, limiting both accurate budget tracking and deeper spending analysis.
This problem becomes more pronounced as transaction volume grows, making a scalable, automated receipt-processing solution essential.
Feature Overview
The receipt-scanning feature allows users to capture supermarket receipts and automatically convert them into structured expense data within the budgeting app.
From a user’s perspective, the workflow is simple:
- The user uploads a photo of a supermarket receipt.
- The application processes the image and extracts key purchase details such as store name, items, prices, total amount, and purchase date.
- The extracted data is then categorized and stored, making it immediately available for budget tracking and spending analysis.
By automating this process, the feature removes the need for manual expense entry while enabling more accurate, item-level insights into consumer spending patterns.
Architecture Walkthrough: Receipt Processing Pipeline
This section walks through the architecture shown in the diagram below, focusing on how each AWS service contributes to the receipt-scanning feature, from ingestion to persistent storage.
The goal of this design is to keep the workflow event-driven, scalable, and simple, while clearly separating responsibilities between OCR, AI reasoning, and data storage.
Architecture Diagram
1. Ingestion: Uploading Receipts to Amazon S3
The workflow starts when a user uploads a receipt image using either a mobile application or a web interface.
- All receipt images are stored in an Amazon S3 bucket named `receipts`
- S3 acts as a durable, cost-effective entry point for unstructured data (images)
- The bucket is configured with an event notification that triggers processing as soon as a new object is uploaded
Using S3 for ingestion removes the need for a dedicated API layer just to accept images and ensures uploads scale automatically with usage.
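To make the wiring concrete, here is a minimal sketch of the event notification that connects the bucket to the Lambda function. The account ID and Lambda ARN are placeholders, and the `.jpg` suffix filter is an assumption, not something specified in the original setup:

```python
# Hypothetical S3 event notification configuration tying the "receipts"
# bucket to the receipt-analyzer Lambda. The ARN below is a placeholder.
notification_configuration = {
    'LambdaFunctionConfigurations': [
        {
            'LambdaFunctionArn': 'arn:aws:lambda:us-east-1:123456789012:function:receipt-analyzer',
            'Events': ['s3:ObjectCreated:*'],
            # Optional: only trigger on image uploads (assumed filter)
            'Filter': {
                'Key': {'FilterRules': [{'Name': 'suffix', 'Value': '.jpg'}]}
            },
        }
    ]
}

# This would be applied with:
# boto3.client('s3').put_bucket_notification_configuration(
#     Bucket='receipts',
#     NotificationConfiguration=notification_configuration,
# )
```

With this in place, every new object in the bucket invokes the function automatically, with no polling or API layer in between.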
2. Event Trigger: AWS Lambda (receipt-analyzer)
When a new receipt image is uploaded, S3 triggers an AWS Lambda function called receipt-analyzer.
This Lambda function acts as the orchestrator for the entire pipeline:
- It reads the S3 event metadata
- Coordinates calls to downstream services
- Normalizes and persists the final output
Because Lambda is event-driven and serverless, the system only runs compute when a receipt actually arrives.
```python
import json
from datetime import datetime
from decimal import Decimal
from urllib.parse import unquote_plus

import boto3

# Initialize clients
textract = boto3.client('textract', region_name='us-east-1')
bedrock = boto3.client('bedrock-runtime', region_name='us-east-1')
dynamodb = boto3.resource('dynamodb', region_name='us-east-1')

# Configuration
BEDROCK_MODEL_ID = 'anthropic.claude-3-sonnet-20240229-v1:0'
DYNAMODB_TABLE_NAME = 'receipt-processing-results'


def convert_floats_to_decimal(obj):
    """Recursively convert float values to Decimal for DynamoDB compatibility."""
    if isinstance(obj, list):
        return [convert_floats_to_decimal(item) for item in obj]
    elif isinstance(obj, dict):
        return {key: convert_floats_to_decimal(value) for key, value in obj.items()}
    elif isinstance(obj, float):
        return Decimal(str(obj))
    else:
        return obj


def lambda_handler(event, context):
    # Extract S3 bucket and object key from the event
    # (S3 URL-encodes object keys in event payloads, so decode before use)
    bucket_name = event['Records'][0]['s3']['bucket']['name']
    document_name = unquote_plus(event['Records'][0]['s3']['object']['key'])

    # Step 1: Extract raw text using Textract OCR
    textract_response = textract.detect_document_text(
        Document={'S3Object': {'Bucket': bucket_name, 'Name': document_name}}
    )

    # Step 2: Concatenate lines of text
    text_lines = [
        block['Text']
        for block in textract_response['Blocks']
        if block['BlockType'] == 'LINE'
    ]
    full_text = "\n".join(text_lines)

    # Step 3: Prepare prompt for Bedrock
    prompt = f"""You are an AI assistant that extracts structured data from receipts. Given the receipt text below, return a JSON with the following fields:
- supermarket_name
- location (address)
- items (list of item name and price)
- total_amount
- date_of_purchase

Receipt Text:
\"\"\"
{full_text}
\"\"\"

Return only the JSON object, no explanation."""

    # Step 4: Invoke Bedrock
    bedrock_response = bedrock.invoke_model(
        modelId=BEDROCK_MODEL_ID,
        contentType='application/json',
        accept='application/json',
        body=json.dumps({
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 1024,
            "messages": [{"role": "user", "content": prompt}],
            "temperature": 0.3,
            "top_p": 0.9
        })
    )

    # Step 5: Parse Bedrock response
    response_body = json.loads(bedrock_response['body'].read().decode())
    model_output = response_body['content'][0]['text'].strip()

    # Try to parse the model output as JSON
    try:
        extracted_data = json.loads(model_output)
    except json.JSONDecodeError:
        extracted_data = {"error": "Failed to parse response", "raw_output": model_output}

    # Step 6: Convert floats to Decimal for DynamoDB
    extracted_data_decimal = convert_floats_to_decimal(extracted_data)

    # Step 7: Save to DynamoDB
    table = dynamodb.Table(DYNAMODB_TABLE_NAME)
    dynamodb_item = {
        'document_id': document_name,
        'bucket_name': bucket_name,
        'processed_timestamp': datetime.utcnow().isoformat(),
        'extracted_data': extracted_data_decimal,
        'raw_text': full_text
    }

    try:
        table.put_item(Item=dynamodb_item)
        return {
            'statusCode': 200,
            'body': json.dumps({
                'message': 'Receipt processed and saved successfully',
                'document_id': document_name,
                'extracted_data': extracted_data  # original floats, JSON-serializable
            })
        }
    except Exception as e:
        return {
            'statusCode': 500,
            'body': json.dumps({
                'error': 'Failed to save to DynamoDB',
                'details': str(e),
                'extracted_data': extracted_data
            })
        }
```
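The handler's event parsing can be exercised locally with a hand-written event. The payload below is a trimmed, hypothetical example of what S3 sends (only the fields the handler reads); note that S3 URL-encodes object keys in event payloads, so keys should be decoded before being passed to Textract:

```python
from urllib.parse import unquote_plus

# Trimmed, hypothetical S3 event payload for local testing
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "receipts"},
                "object": {"key": "uploads/receipt+2025-06-14.jpg"}
            }
        }
    ]
}

bucket_name = sample_event['Records'][0]['s3']['bucket']['name']
# S3 URL-encodes object keys in event notifications ("+" here stands for a space)
document_name = unquote_plus(sample_event['Records'][0]['s3']['object']['key'])
```

Feeding events like this into `lambda_handler` (with the AWS clients stubbed out) makes it possible to test the orchestration logic without deploying.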
3. Text Extraction: Amazon Textract
The first processing step inside the Lambda is optical character recognition (OCR) using Amazon Textract.
- Textract extracts raw text from the receipt image
- All detected `LINE` blocks are concatenated into a single text representation
- No assumptions are made about receipt layout or formatting
At this stage, the data is still unstructured—just plain text—but it provides a reliable foundation for semantic analysis.
4. Semantic Parsing: Amazon Bedrock (Claude 3 Sonnet)
Once raw text is extracted, the Lambda invokes Amazon Bedrock using the anthropic.claude-3-sonnet model.
Instead of trying to manually parse receipts with rules or regex, the model is prompted to reason over the text and return a clean JSON structure containing:
- Supermarket name
- Store location
- Item list (name and price)
- Total amount
- Date of purchase
The prompt explicitly instructs the model to:
- Return only JSON
- Follow a fixed schema
This approach dramatically simplifies downstream processing and makes the output predictable enough for database storage.
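Because the model is still free-form text generation, the fixed schema is worth enforcing before storage. A minimal validation sketch (the function name and the strictness of the checks are my own additions, not part of the original handler):

```python
import json

# The five fields the prompt instructs the model to return
REQUIRED_FIELDS = {
    "supermarket_name", "location", "items", "total_amount", "date_of_purchase"
}

def validate_receipt_json(model_output: str) -> dict:
    """Parse the model's text output and verify the fixed schema.

    Raises ValueError if the output is not valid JSON or if any
    required field is missing."""
    try:
        data = json.loads(model_output)
    except json.JSONDecodeError as e:
        raise ValueError(f"Model output is not valid JSON: {e}")
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        raise ValueError(f"Missing fields: {sorted(missing)}")
    return data
```

A check like this turns silent schema drift into an explicit error that can be logged or retried.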
5. Persistence: Amazon DynamoDB
After successful extraction, the structured result is stored in Amazon DynamoDB, in a table named receipt-processing-results.
Each receipt is saved as a single item with the following attributes:
- document_id (String, primary key)
- bucket_name
- extracted_data (structured JSON)
- processed_timestamp
- raw_text (original OCR output)
Example of the `extracted_data` field, as stored (DynamoDB's attribute-value format, where `S`, `N`, `L`, and `M` mark strings, numbers, lists, and maps):

```json
{
  "location": {
    "S": "Markt 54, 3431 LB Nieuwegein"
  },
  "items": {
    "L": [
      {
        "M": {
          "name": {
            "S": "S. MARIA TORTILLA W.W"
          },
          "price": {
            "N": "2.99"
          }
        }
      }
    ]
  },
  "total_amount": {
    "N": "2.99"
  },
  "date_of_purchase": {
    "S": "14/06/2025"
  },
  "supermarket_name": {
    "S": "Dirk van den Broek"
  }
}
```
DynamoDB was chosen because it:
- Scales automatically with receipt volume
- Provides low-latency access for dashboards and queries
- Works well for item-centric access patterns (one item per receipt)
Storing both structured data and raw text allows future reprocessing if extraction logic or prompts improve.
Why This Architecture Works Well
This design has a few key advantages:
- Fully serverless – no servers to manage or scale
- Event-driven – processing happens only when new data arrives
- Separation of concerns – OCR, reasoning, and storage are cleanly isolated
- Extensible – easy to add user IDs, GSIs, or analytics pipelines later
It also keeps the system flexible: Textract can be swapped or enhanced, prompts can evolve, and DynamoDB schemas can grow without breaking the ingestion flow.
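As one example of that extensibility, adding a `user_id` attribute plus a global secondary index would allow per-user queries without touching the ingestion flow. A hypothetical table definition for that extension (the index name and key choices are illustrative):

```python
# Hypothetical extension of the receipt-processing-results table: a GSI on
# user_id so each user's receipts can be queried, sorted by processing time.
gsi_table_definition = {
    'TableName': 'receipt-processing-results',
    'AttributeDefinitions': [
        {'AttributeName': 'document_id', 'AttributeType': 'S'},
        {'AttributeName': 'user_id', 'AttributeType': 'S'},
        {'AttributeName': 'processed_timestamp', 'AttributeType': 'S'},
    ],
    'KeySchema': [
        {'AttributeName': 'document_id', 'KeyType': 'HASH'},
    ],
    'GlobalSecondaryIndexes': [
        {
            'IndexName': 'user_id-processed_timestamp-index',
            'KeySchema': [
                {'AttributeName': 'user_id', 'KeyType': 'HASH'},
                {'AttributeName': 'processed_timestamp', 'KeyType': 'RANGE'},
            ],
            'Projection': {'ProjectionType': 'ALL'},
        }
    ],
    'BillingMode': 'PAY_PER_REQUEST',
}

# Passed to boto3.client('dynamodb').create_table(**gsi_table_definition)
```

Because DynamoDB allows adding GSIs to an existing table, this change could also be rolled out after the fact.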
Conclusion
This feature demonstrates how a focused, single-purpose workflow can deliver meaningful value when built with the right AWS services. By combining Amazon S3, AWS Lambda, Amazon Textract, Amazon Bedrock (Claude 3 Sonnet), and Amazon DynamoDB, unstructured receipt images are transformed into structured, queryable data with minimal operational complexity.
The event-driven, serverless design scales automatically with usage and keeps costs aligned with demand. Separating OCR from AI-based reasoning also makes the solution flexible—models, prompts, or extraction logic can evolve over time without requiring architectural changes.
Most importantly, this approach removes manual effort for users while creating a strong foundation for future capabilities such as spending analytics, budget insights, and personalized recommendations. With small incremental extensions, this same architecture can support more advanced financial intelligence use cases without sacrificing simplicity or scalability.
