
Eliana Lam

Originally published at aws-user-group.com

Financial Transaction Data Reconciler at PayPal

Speaker: Jayaseelan Shanmugam @ AWS FSI Meetup 2025 Q4



Introduction to PayPal:

  • PayPal is a global payment service provider processing $1.7 trillion in annual payment volume.

  • Operates in 200 global markets with 430 million active accounts.

  • Processes approximately 900 transactions per second, with peaks during holiday seasons like Black Friday and Cyber Monday.

Critical Problem:

  • Ensuring the accuracy and reconciliation of the massive transaction volume.

  • Reconciling transactions across multiple systems within PayPal, with external processors, and networks.

  • Matching transactions to ensure no data or financial discrepancies.

  • Validating that financial records accurately reflect actual customer payments.

Reconciliation Process:

  • Transactions flow through PayPal’s system and are recorded in multiple internal ledgers.

  • Transactions are sent to external processors for clearing and confirmation by the network.

  • PayPal settles the transaction money to the merchant.

  • Reconciliation involves matching transactions across PayPal’s internal systems, processor acknowledgments, and funding settlement summaries (end of day, T+1, T+2).

  • Primary goal: Ensure transactions are not lost and there are no discrepancies.

Why Reconciliation Matters:

  • Three-way matching problem: PayPal internal ledger, external processor records, and network confirmations.

  • Critical for financial accuracy and customer trust.

  • Ensures that financial records reflect actual customer payments.
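The three-way match described above can be sketched as follows. This is a minimal illustration, not PayPal's implementation: the `Record` fields, the shared `txn_id` key, and the outcome labels are all assumptions made for the example.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Dict


@dataclass(frozen=True)
class Record:
    txn_id: str       # hypothetical identifier shared by all three sources
    amount: Decimal
    currency: str


def three_way_match(
    ledger: Dict[str, Record],
    processor: Dict[str, Record],
    network: Dict[str, Record],
) -> Dict[str, str]:
    """Classify every transaction id that appears in at least one source."""
    outcome: Dict[str, str] = {}
    for txn_id in sorted(set(ledger) | set(processor) | set(network)):
        sources = {
            "ledger": ledger.get(txn_id),
            "processor": processor.get(txn_id),
            "network": network.get(txn_id),
        }
        missing = [name for name, rec in sources.items() if rec is None]
        if missing:
            # The record is absent from one or more of the three sources.
            outcome[txn_id] = "MISSING_IN_" + "_AND_".join(missing).upper()
            continue
        # All three sources have the record: compare amount and currency.
        values = {(rec.amount, rec.currency) for rec in sources.values()}
        outcome[txn_id] = "MATCHED" if len(values) == 1 else "AMOUNT_MISMATCH"
    return outcome
```

Anything other than `MATCHED` would be a candidate exception for the operational team to review.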

High-Level Architecture:

  • Focuses on how PayPal achieved near real-time reconciliation.

  • Technologies and strategies used to handle the scale and complexity of the problem.

Business Impact:

  • Reduction in reconciliation time from 24 hours to 15 minutes.

  • Improved accuracy with minimal discrepancies.

  • Enhanced customer trust and operational efficiency.



Continued Discussion on PayPal’s Reconciliation System

Three-Way Matching Problem:

PayPal Internal Ledger:

  • Records transactions within PayPal’s systems.

External Processor:

  • Registers transactions in their local system.

Network Confirmation:

  • Confirms whether the transaction has been successfully made.

Responsibilities:

  • PayPal is responsible for matching transactions at every stage from entry into the system to settlement with the merchant.

  • Manual reconciliation is impractical due to the high volume of transactions.

Automated State Machine with Rule Engine:

  • Utilizes an automated state machine and high-level rule engine.

  • Configured to handle transaction processing with external vendors and timelines for acknowledgments and funding summaries.

  • Ensures transactions are reconciled efficiently.
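A state machine of this kind can be sketched as a transition table. The state names and event names below are hypothetical; the talk does not list the actual states, only that transactions move through acknowledgment and funding-summary stages toward a terminal state.

```python
from enum import Enum
from typing import Dict, Iterable, Tuple


class State(Enum):
    RECORDED = "recorded"            # written to the internal ledger
    SENT = "sent_to_processor"       # included in an outbound file
    ACKNOWLEDGED = "acknowledged"    # processor acknowledgment received
    RECONCILED = "reconciled"        # terminal: funding summary matched
    EXCEPTION = "exception"          # terminal: needs operational review


# (current state, event) -> next state; any other pair is treated as an exception.
TRANSITIONS: Dict[Tuple[State, str], State] = {
    (State.RECORDED, "outbound_file_sent"): State.SENT,
    (State.SENT, "ack_received"): State.ACKNOWLEDGED,
    (State.ACKNOWLEDGED, "funding_summary_matched"): State.RECONCILED,
}
TERMINAL = {State.RECONCILED, State.EXCEPTION}


def advance(state: State, event: str) -> State:
    """Apply one event; terminal states absorb all further events."""
    if state in TERMINAL:
        return state
    return TRANSITIONS.get((state, event), State.EXCEPTION)


def run(events: Iterable[str]) -> State:
    """Replay a transaction's events from the initial RECORDED state."""
    state = State.RECORDED
    for event in events:
        state = advance(state, event)
    return state
```

Keeping the rules in a table rather than in branching code is one way to make them configurable per vendor, which is the property the talk attributes to the rule engine.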

Importance of Reconciliation:

  • Critical for understanding "what should have happened versus what actually happened."

  • Ensures transactions are auditable and compliant with regulatory standards (PCI DSS).

  • Provides a clear record of when transactions were recorded and settled.

Current Gaps in Legacy System:

  • The legacy system relies on end-of-day batch processing.

  • Uses a store-and-process mechanism where transactions accumulate throughout the day and are reconciled at the end of the day.

  • Reconciliation does not read directly from the operational data store, to avoid performance and latency impacts on it.

  • Utilizes an ETL system sourcing data from Oracle GoldenGate.

  • ETL pipeline involves transformation and formatting, leading to potential data mismatches or inconsistencies.

Need for Improvement:

  • Move away from batch processing to near real-time processing.

  • Reduce reliance on ETL systems to minimize data transformation issues.

  • Enhance automation to ensure accurate and timely reconciliation.



Problems with Legacy System:

  • Reconciliation is delayed because all transactions accumulate until the end of the day.

  • Transactions are matched and account books are closed only at the end of the day.

  • This delay is a significant problem.

Objective:

  • Transition to a modern platform in the cloud.

  • Leverage AWS infrastructure to solve the aforementioned problems.

Key Objectives of the New Solution:

End-to-End Data Integrity Across Payment Lifecycle:

  • Ensure data integrity from the moment a record enters the real-time payment processing system.

  • Track transactions across multiple systems within PayPal.

  • Link all transactions with the correct identifier and timestamp.

  • Match outbound files sent to processors with inbound records received from networks or vendors.

Automated State-Driven Match Logic:

  • Move from a store-and-process mechanism to a stream-and-process mechanism.

  • Reduce the entire reconciliation cycle.
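The difference between store-and-process and stream-and-process can be illustrated with an incremental matcher that reconciles a record as soon as its counterpart arrives, instead of waiting for an end-of-day batch. This is a sketch under simplifying assumptions: two sources only, a hypothetical `txn_id` key, and in-memory state.

```python
from typing import Dict


class StreamingMatcher:
    """Match records from two sources as they arrive, one event at a time."""

    def __init__(self) -> None:
        # txn_id -> name of the source that reported it first
        self.pending: Dict[str, str] = {}

    def observe(self, source: str, txn_id: str) -> bool:
        """Return True when this event completes a match."""
        first_seen = self.pending.get(txn_id)
        if first_seen is None:
            self.pending[txn_id] = source   # wait for the counterpart
            return False
        if first_seen == source:
            return False                    # duplicate from the same source
        del self.pending[txn_id]            # counterpart arrived: matched
        return True
```

Whatever remains in `pending` after a cutoff is the set of unmatched records, so the backlog is visible continuously rather than only at the end of the day.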

Real-Time Monitoring:

  • Identify exceptions while matching transactions within the internal system or records from external vendors.

  • Record exceptions where matches fail between received records and local ledger transactions.

  • The operational team acts on these recorded exceptions.

Technical Architecture:

Data Ingestion:

  • Sources from which data is ingested into the reconciler.

Reconciliation Process:

  • Methods and processes involved in performing reconciliation.

Storage of Reconciliation Outcomes:

  • How the results of the reconciliation are stored.

Operational Team Leverage:

  • How the operational team uses reconciliation exceptions and acts on them.


High-Level Technical Reconciliation Overview

Scope Confinement:

  • Upstream payment processing systems are abstracted out.

  • Focus starts with the real-time payment card processor.

Real-Time Payment Card Processor:

  • Runs on Amazon EKS and receives millions of transactions per day (around 300 million expected daily).

  • Each transaction is recorded in Amazon DynamoDB, which serves as the source of truth and operational data store.

Data Flow:

  • [ 1 ] DynamoDB to Kinesis Data Streams:

      • Transactions recorded in DynamoDB are streamed through Amazon Kinesis Data Streams.

      • Kinesis manages the ordering of transactions.

  • [ 2 ] Amazon Data Firehose:

      • Transactions are bucketed and chunked based on different business parameters.

  • [ 3 ] Amazon S3:

      • Transactions are written to Amazon S3.

      • S3 acts as a secondary data store for transactions and as the primary store for file processing.
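The bucketing step in the data flow above can be pictured as deriving an S3 key prefix from each transaction. The talk does not name the business parameters, so `processor`, `market`, and the event time used below are assumptions, as is the `transactions/` root prefix.

```python
from datetime import datetime, timezone
from typing import Any, Dict


def s3_prefix(txn: Dict[str, Any]) -> str:
    """Build a Hive-style key prefix (key=value path segments) for one transaction."""
    ts = datetime.fromtimestamp(txn["epoch_seconds"], tz=timezone.utc)
    return (
        f"transactions/processor={txn['processor']}/market={txn['market']}/"
        f"dt={ts:%Y-%m-%d}/hour={ts:%H}/"
    )
```

Hive-style `key=value` segments are a common choice because query engines can treat each segment as a partition column and skip files outside the requested partner or date.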

Reconciliation Process:

  • [ 1 ] Inbound Transactions in PayPal:

      • Transactions processed in PayPal are written as files to Amazon S3.

  • [ 2 ] External Partner Processing:

      • Chunked transactions are translated into outbound files and processed with external partners.

      • Inbound records from external partners are returned to S3.

Amazon S3 as Central Source:

  • S3 holds both internal transaction footprints and external processed transactions received as inbound files.

Data Processing:

  • EventBridge Scheduling:

      • Triggers an Apache Spark job on an Amazon EMR cluster every 15 minutes.

      • Transactions in S3 are processed in a distributed fashion using a preconfigured rule engine.
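Each scheduled run needs to know which slice of data it is responsible for. One way to do that, sketched here as an assumption rather than the approach described in the talk, is to floor the trigger time to the 15-minute boundary and process the window that just closed.

```python
from datetime import datetime, timedelta
from typing import Tuple

CYCLE = timedelta(minutes=15)


def window_for(trigger: datetime) -> Tuple[datetime, datetime]:
    """Return the [start, end) window that closed most recently before `trigger`."""
    end = trigger.replace(
        minute=trigger.minute - trigger.minute % 15, second=0, microsecond=0
    )
    return end - CYCLE, end
```

A run triggered at 10:17:42 would cover 10:00 to 10:15. Deriving the window from the clock rather than from the previous run means a retried or delayed run still processes the same slice.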

Rule Engine:

  • Comprises multiple state graphs.

  • Categorizes transactions and determines terminal states.

  • Includes complex rules based on market operations, external partners, and cutoff times for data export/import.
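A cutoff-based rule of the kind mentioned above might look like the following. The `PartnerRule` shape, the state labels, and the use of calendar days (rather than business days) are all simplifying assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass(frozen=True)
class PartnerRule:
    partner: str
    settlement_lag_days: int   # 0 = end of day, 1 = T+1, 2 = T+2


def classify(
    txn_date: date, settled_on: Optional[date], as_of: date, rule: PartnerRule
) -> str:
    """Decide a transaction's reconciliation state relative to its settlement cutoff."""
    due = txn_date + timedelta(days=rule.settlement_lag_days)
    if settled_on is not None:
        return "RECONCILED" if settled_on <= due else "LATE_SETTLEMENT"
    # No settlement record yet: only an exception once the cutoff has passed.
    return "PENDING" if as_of <= due else "MISSING_SETTLEMENT"
```

The point of the cutoff is that a missing settlement record is not an exception until the partner's agreed timeline has elapsed, so the same transaction can be `PENDING` in one cycle and `MISSING_SETTLEMENT` in a later one.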



Technical Reconciliation Architecture

Amazon EMR Cluster and Rule Engine:

  • The Amazon EMR cluster uses the rule engine to match transactions.

  • Successful reconciliation results are written back to Amazon S3.

  • Exceptions are sent to Amazon EventBridge, which triggers a Lambda function to enrich and report exceptions back to S3.
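The enrichment step could be sketched as a Lambda handler like the one below. EventBridge delivers the custom payload under the event's `detail` key and the event timestamp under `time`. Everything else here is hypothetical: the fields inside `detail`, the severity threshold, and the S3 key layout. The actual S3 write is left as a comment so the sketch runs without AWS credentials.

```python
import json
from decimal import Decimal
from typing import Any, Dict

HIGH_VALUE_THRESHOLD = Decimal("1000")   # hypothetical triage threshold


def enrich(detail: Dict[str, Any], event_time: str) -> Dict[str, Any]:
    """Add triage fields to a raw reconciliation exception."""
    amount = Decimal(detail["amount"])
    return {
        **detail,
        "detected_at": event_time,
        "severity": "high" if amount >= HIGH_VALUE_THRESHOLD else "normal",
    }


def handler(event: Dict[str, Any], context: Any = None) -> Dict[str, str]:
    """Lambda entry point for an EventBridge event carrying one exception."""
    record = enrich(event["detail"], event["time"])
    key = f"exceptions/dt={event['time'][:10]}/{record['txn_id']}.json"
    body = json.dumps(record, sort_keys=True)
    # In a real deployment the record would be written here, for example with
    # boto3.client("s3").put_object(Bucket=<bucket>, Key=key, Body=body).
    return {"key": key, "body": body}
```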

Storage and Operational Aspects:

  • Data is stored as Parquet files in S3.

  • An AWS Glue Data Catalog is configured on top of the Parquet files.

  • The operational team can query the data with SQL using Amazon Athena.

  • A custom-built UI portal on top of the Glue Data Catalog provides detailed reconciliation states and outcomes for specific days, settlements, or partners.
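A query the operational team might run could look like the one below. The table name `reconciliation_outcomes` and its columns are hypothetical, since the talk does not describe the schema. To keep the example runnable without AWS, it executes the same SQL against an in-memory SQLite table as a local stand-in for Athena.

```python
import sqlite3
from typing import Iterable, List, Tuple

# Plain SQL: counts of outcomes per partner for one settlement date.
QUERY = """
SELECT partner, status, COUNT(*) AS txn_count
FROM reconciliation_outcomes
WHERE settlement_date = '2025-12-01'
GROUP BY partner, status
ORDER BY partner, status
"""


def run_locally(rows: Iterable[Tuple[str, str, str, str]]) -> List[Tuple[str, str, int]]:
    """Load sample rows into SQLite and run QUERY against them."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE reconciliation_outcomes "
        "(txn_id TEXT, partner TEXT, status TEXT, settlement_date TEXT)"
    )
    conn.executemany("INSERT INTO reconciliation_outcomes VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result
```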

Architecture Highlights:

Active-Active Architecture:

  • Operates across multiple AWS regions.

  • Ensures high availability with zero recovery point objective (RPO) and recovery time objective (RTO).

  • If one region goes down, another can process transactions seamlessly.

  • In-flight transactions are managed using Amazon Kinesis Data Streams and DynamoDB for consistency across regions.

Technology and Architecture Decisions:

  • Amazon EMR vs. Redshift:

      • Considered Redshift for a data lake solution but opted for Amazon EMR due to cost efficiency.

      • The EMR cluster acts as an extension of the core processing system and reuses the existing S3 data store.

      • Using Amazon EMR kept the solution low-cost while addressing the problem statement.



Business Impact of the New Reconciliation Solution

Accuracy:

  • Improved from three nines (99.9%) to four nines (99.99%).
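To put the accuracy figures in perspective, here is a small worked calculation. It assumes "accuracy" means the share of transactions reconciled correctly and uses the expected daily volume of 300 million quoted earlier; the talk does not define the metric precisely.

```python
def unreconciled_per_day(daily_txns: int, nines: int) -> int:
    """Transactions left unreconciled per day at an accuracy of `nines` nines."""
    # An accuracy of N nines corresponds to an error rate of 1 in 10**N.
    return daily_txns // 10 ** nines


DAILY_TXNS = 300_000_000
before = unreconciled_per_day(DAILY_TXNS, 3)   # 99.9%  -> 300,000 per day
after = unreconciled_per_day(DAILY_TXNS, 4)    # 99.99% -> 30,000 per day
```

Under these assumptions, one extra nine removes 270,000 discrepancies per day from the operational team's queue.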

Speed:

  • Reduced reconciliation time from 24 hours to a 15-minute cycle.

  • Horizontal scaling of the cluster keeps processing time consistent (at most 30 minutes) regardless of transaction volume (1 million to 300 million transactions).
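A rough sizing calculation shows what a 15-minute cycle implies at the upper end of the stated volume. It assumes transactions are spread evenly across the day, which real traffic is not, so treat the result as an average rather than a peak.

```python
CYCLE_MINUTES = 15


def cycles_per_day(cycle_minutes: int = CYCLE_MINUTES) -> int:
    """Number of reconciliation runs in 24 hours."""
    return 24 * 60 // cycle_minutes


def avg_txns_per_cycle(daily_txns: int, cycle_minutes: int = CYCLE_MINUTES) -> int:
    """Average transactions each run must reconcile, assuming uniform load."""
    return daily_txns // cycles_per_day(cycle_minutes)


runs = cycles_per_day()                        # 96 runs per day
per_run = avg_txns_per_cycle(300_000_000)      # 3,125,000 transactions per run
```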

Risk Reduction:

  • Faster reconciliation (within 15 minutes) minimizes potential fraud and risk.

  • Allows for quicker action on system or external issues.

Cost Optimization:

  • Chose an Amazon EMR cluster over data lake solutions for cost efficiency.

  • EMR cluster and Lambda functions operate on-demand, not continuously.

  • Computing instances have a limited lifetime, freeing up resources and minimizing costs once processing is complete.
