DEV Community

Edwards Tech Innovations
Edwards Tech Innovations

Posted on

Bank Reconciliation in Python: Building a Plaid Integration from Scratch

Bank Reconciliation in Python: Building a Plaid Integration from Scratch

If you're building a fintech product, a business dashboard, or any application that needs to pull bank data, you're going to need Plaid. But Plaid's API surface is large, and the official SDK has a lot of ceremony around request/response objects.

In this tutorial, I'll walk through a complete Plaid integration module -- 337 lines of Python -- that handles the full lifecycle: link tokens, account access, transaction sync, reconciliation, and spending categorization.

Prerequisites

You will need:

  • A Plaid developer account (sandbox is free)
  • Python 3.9+
  • The plaid-python SDK
pip install plaid-python python-dotenv
Enter fullscreen mode Exit fullscreen mode

Set up your .env:

PLAID_CLIENT_ID=your_client_id
PLAID_SECRET=your_sandbox_secret
PLAID_ENV=sandbox
Enter fullscreen mode Exit fullscreen mode

Step 1: Initialize the Client

The first design decision is handling the case where Plaid credentials aren't available. In a multi-integration system, not every deployment will have every service configured. We use a try/except on import and an initialized flag.

import os
from typing import Dict, Any, List
from datetime import datetime, timedelta

try:
    import plaid
    from plaid.api import plaid_api
    from plaid.model.products import Products
    from plaid.model.country_code import CountryCode
    from plaid.model.link_token_create_request import LinkTokenCreateRequest
    from plaid.model.link_token_create_request_user import LinkTokenCreateRequestUser
    from plaid.model.item_public_token_exchange_request import ItemPublicTokenExchangeRequest
    from plaid.model.transactions_get_request import TransactionsGetRequest
    from plaid.model.transactions_get_request_options import TransactionsGetRequestOptions
    from plaid.model.accounts_get_request import AccountsGetRequest
    from plaid.model.accounts_balance_get_request import AccountsBalanceGetRequest
    PLAID_AVAILABLE = True
except ImportError:
    PLAID_AVAILABLE = False


class PlaidIntegration:
    def __init__(self):
        self.client_id = os.getenv("PLAID_CLIENT_ID")
        self.secret = os.getenv("PLAID_SECRET")
        self.env = os.getenv("PLAID_ENV", "sandbox")
        self.initialized = bool(self.client_id and self.secret and PLAID_AVAILABLE)

        if self.initialized:
            if self.env == "production":
                host = plaid.Environment.Production
            elif self.env == "development":
                host = plaid.Environment.Development
            else:
                host = plaid.Environment.Sandbox

            configuration = plaid.Configuration(
                host=host,
                api_key={
                    "clientId": self.client_id,
                    "secret": self.secret
                }
            )
            api_client = plaid.ApiClient(configuration)
            self.client = plaid_api.PlaidApi(api_client)
Enter fullscreen mode Exit fullscreen mode

The environment toggle (sandbox / development / production) is read from an env var with a safe default. In production, you never want to accidentally hit sandbox.

Step 2: Link Tokens -- The Entry Point

Plaid Link is a drop-in frontend component that lets users securely connect their bank accounts. Your backend creates a link token, your frontend uses it to open the Link flow, and then you exchange the resulting public token for a persistent access token.

def create_link_token(self, user_id: str, products: List[str] = None) -> Dict[str, Any]:
    """Create a link token for Plaid Link"""
    try:
        product_list = [Products(p) for p in (products or ["transactions"])]

        request = LinkTokenCreateRequest(
            products=product_list,
            client_name="Your App Name",
            country_codes=[CountryCode("US")],
            language="en",
            user=LinkTokenCreateRequestUser(client_user_id=user_id)
        )

        response = self.client.link_token_create(request)
        return {
            "success": True,
            "link_token": response.link_token,
            "expiration": response.expiration.isoformat()
        }
    except Exception as e:
        return {"success": False, "error": str(e)}
Enter fullscreen mode Exit fullscreen mode

The products parameter controls what data you can access. Common options:

  • "transactions" -- account balances and transaction history
  • "auth" -- account and routing numbers (for ACH)
  • "identity" -- account holder name, address, email

You can request multiple products in one link session.

Step 3: Token Exchange

When the user completes the Link flow, Plaid gives your frontend a public_token. This is temporary. You exchange it for a permanent access_token that you store server-side.

def exchange_public_token(self, public_token: str) -> Dict[str, Any]:
    """Exchange public token for access token"""
    try:
        request = ItemPublicTokenExchangeRequest(public_token=public_token)
        response = self.client.item_public_token_exchange(request)
        return {
            "success": True,
            "access_token": response.access_token,
            "item_id": response.item_id
        }
    except Exception as e:
        return {"success": False, "error": str(e)}
Enter fullscreen mode Exit fullscreen mode

Important: The access_token is a credential. Store it encrypted. Never log it. Never send it to the frontend.

Step 4: Pulling Transactions

This is where it gets useful. Plaid's transaction endpoint supports date ranges, pagination, and account filtering.

def get_transactions(self, access_token: str, start_date: str = None,
                    end_date: str = None, account_ids: List[str] = None,
                    count: int = 100, offset: int = 0) -> Dict[str, Any]:
    """Get transactions for date range"""
    try:
        if not start_date:
            start_date = (datetime.now() - timedelta(days=30)).strftime("%Y-%m-%d")
        if not end_date:
            end_date = datetime.now().strftime("%Y-%m-%d")

        options = TransactionsGetRequestOptions(count=count, offset=offset)
        if account_ids:
            options.account_ids = account_ids

        request = TransactionsGetRequest(
            access_token=access_token,
            start_date=datetime.strptime(start_date, "%Y-%m-%d").date(),
            end_date=datetime.strptime(end_date, "%Y-%m-%d").date(),
            options=options
        )

        response = self.client.transactions_get(request)

        return {
            "success": True,
            "total_transactions": response.total_transactions,
            "transactions": [{
                "transaction_id": t.transaction_id,
                "account_id": t.account_id,
                "date": t.date.isoformat(),
                "name": t.name,
                "merchant_name": t.merchant_name,
                "amount": t.amount,
                "currency": t.iso_currency_code,
                "category": t.category,
                "pending": t.pending,
                "payment_channel": t.payment_channel.value if t.payment_channel else None
            } for t in response.transactions]
        }
    except Exception as e:
        return {"success": False, "error": str(e)}
Enter fullscreen mode Exit fullscreen mode

Plaid returns up to 500 transactions per request. For accounts with heavy activity, you need to paginate:

def get_all_transactions(self, access_token: str, start_date: str = None,
                        end_date: str = None) -> List[Dict]:
    """Get all transactions (handles pagination)"""
    all_transactions = []
    offset = 0
    count = 500

    while True:
        result = self.get_transactions(
            access_token, start_date, end_date,
            count=count, offset=offset
        )
        if not result.get("success"):
            break

        transactions = result.get("transactions", [])
        all_transactions.extend(transactions)

        if len(transactions) < count:
            break
        offset += count

    return all_transactions
Enter fullscreen mode Exit fullscreen mode

Step 5: Bank Reconciliation

Reconciliation is matching your internal records against what the bank actually shows. This is where business applications diverge from simple "show me my transactions" dashboards.

The approach here is straightforward: match by amount (within a penny tolerance), then report what matched, what's in the bank but not in your records, and what's in your records but not in the bank.

def reconcile_transactions(self, access_token: str, expected_transactions: List[Dict],
                          start_date: str = None, end_date: str = None) -> Dict[str, Any]:
    """Reconcile bank transactions against expected transactions"""
    try:
        bank_txns = self.get_all_transactions(access_token, start_date, end_date)

        matched = []
        unmatched_bank = []
        unmatched_expected = list(expected_transactions)

        for bank_txn in bank_txns:
            found_match = False
            for i, exp_txn in enumerate(unmatched_expected):
                if abs(bank_txn["amount"] - exp_txn.get("amount", 0)) < 0.01:
                    matched.append({
                        "bank_transaction": bank_txn,
                        "expected_transaction": exp_txn,
                        "status": "matched"
                    })
                    unmatched_expected.pop(i)
                    found_match = True
                    break

            if not found_match:
                unmatched_bank.append(bank_txn)

        return {
            "success": True,
            "matched_count": len(matched),
            "unmatched_bank_count": len(unmatched_bank),
            "unmatched_expected_count": len(unmatched_expected),
            "matched": matched,
            "unmatched_bank": unmatched_bank,
            "unmatched_expected": unmatched_expected,
            "reconciliation_status": "complete" if not unmatched_bank and not unmatched_expected else "needs_review"
        }
    except Exception as e:
        return {"success": False, "error": str(e)}
Enter fullscreen mode Exit fullscreen mode

The reconciliation_status field gives you a quick check: "complete" means everything lines up, "needs_review" means a human should look at the discrepancies.

Step 6: Spending Categorization

Plaid provides category data with transactions, but the raw categories are nested lists. This method rolls them up into a summary with totals:

def categorize_transactions(self, transactions: List[Dict]) -> Dict[str, Any]:
    """Categorize transactions by type"""
    categories = {}
    for txn in transactions:
        cat = txn.get("category", ["Uncategorized"])[0] if txn.get("category") else "Uncategorized"
        if cat not in categories:
            categories[cat] = {"count": 0, "total": 0, "transactions": []}
        categories[cat]["count"] += 1
        categories[cat]["total"] += txn.get("amount", 0)
        categories[cat]["transactions"].append(txn)

    return {
        "success": True,
        "categories": categories,
        "summary": {cat: {"count": data["count"], "total": data["total"]}
                   for cat, data in categories.items()}
    }
Enter fullscreen mode Exit fullscreen mode

And for a full spending report with top merchants:

def get_spending_summary(self, access_token: str, start_date: str = None,
                        end_date: str = None) -> Dict[str, Any]:
    """Get spending summary by category"""
    transactions = self.get_all_transactions(access_token, start_date, end_date)

    # In Plaid, positive amounts are debits (money out)
    expenses = [t for t in transactions if t["amount"] > 0]
    income = [t for t in transactions if t["amount"] < 0]

    categorized = self.categorize_transactions(expenses)

    return {
        "success": True,
        "period": {"start": start_date, "end": end_date},
        "total_expenses": sum(t["amount"] for t in expenses),
        "total_income": abs(sum(t["amount"] for t in income)),
        "transaction_count": len(transactions),
        "expense_categories": categorized.get("summary", {}),
        "top_merchants": self._get_top_merchants(expenses, limit=10)
    }
Enter fullscreen mode Exit fullscreen mode

One thing that catches people: Plaid uses positive numbers for debits and negative numbers for credits. This is the opposite of what most people expect. The code above handles this correctly by filtering on sign.

Putting It Together

With the singleton pattern, using this in your application is straightforward:

from dotenv import load_dotenv
load_dotenv()

from eti_integrations import get_plaid

plaid = get_plaid()

# Check if Plaid is configured
if not plaid.initialized:
    print("Set PLAID_CLIENT_ID and PLAID_SECRET in .env")
else:
    # Create a link token for your frontend
    link = plaid.create_link_token(user_id="user_123")
    print(f"Link token: {link['link_token']}")

    # After the user completes Link and you have an access token:
    summary = plaid.get_spending_summary(access_token, "2025-01-01", "2025-03-31")
    print(f"Total expenses: ${summary['total_expenses']:.2f}")
    print(f"Total income: ${summary['total_income']:.2f}")
    for category, data in summary["expense_categories"].items():
        print(f"  {category}: ${data['total']:.2f} ({data['count']} transactions)")
Enter fullscreen mode Exit fullscreen mode

What I'd Build Next

This module covers the core Plaid use cases. If I were extending it, I'd add:

  • Plaid Investments for brokerage account data
  • Plaid Liabilities for credit card and loan balances
  • Webhook handling for real-time transaction updates instead of polling
  • Caching for balance requests that don't need real-time data

This Plaid module is one of 9 production Python modules in the AI Integration Pack. The other 8 cover Stripe, PayPal, Twilio, Airtable, ClickUp, Outlook, and e-commerce platforms (Gumroad, LemonSqueezy, Paddle). 4,488 lines of code total, 152 tests, security-audited.

If you're interested: AI Integration Pack on Gumroad -- $49 for the full pack.

Top comments (0)