Speaker: Hafiz Syed Ashir Hassan @ AWS Amarathon 2025
Summary by Amazon Nova
Problem Statement:
Organisations struggle with unstructured data in various formats (documents, images, audio, video).
Manual processing is slow, inconsistent, and costly.
Existing automation systems are rigid, requiring templates, rules, and manual corrections.
Increasing demand for compliance, accuracy, and scalability.
Need for automating multi-format data processing with high accuracy using generative AI.
What is Bedrock Data Automation (BDA)?:
A fully managed document and media automation capability of Amazon Bedrock on Amazon Web Services (AWS).
Enables building end-to-end extraction, classification, and transformation pipelines using foundation models.
Processes documents, images, audio, and video at scale.
Orchestrates multi-step workflows using serverless automation.
Minimises custom code while maximising flexibility.
Input Asset:
[ 1 ] Supports various formats:
Documents (PDF, DOCX, scanned, structured/unstructured)
Images (PNG, JPG)
Audio (voice notes, call recordings)
Video (meetings, CCTV, webinars)
[ 2 ] Offers two types of output instructions:
Standard Output Configuration
Custom Schema based on matched blueprint
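As a rough illustration of how an input asset and an output location might be submitted together, the sketch below assembles the arguments for an asynchronous BDA job. The parameter names are assumed to mirror the boto3 `bedrock-data-automation-runtime` client's `invoke_data_automation_async` call and should be checked against the current API reference. The bucket names, ARNs, and the helper `build_bda_request` are hypothetical and were not part of the talk.

```python
from typing import Any, Dict


def build_bda_request(input_s3_uri: str, output_s3_uri: str,
                      project_arn: str, profile_arn: str) -> Dict[str, Any]:
    """Assemble keyword arguments for an async BDA invocation (assumed shape)."""
    # Both the input asset and the output location are expected to live in S3.
    for uri in (input_s3_uri, output_s3_uri):
        if not uri.startswith("s3://"):
            raise ValueError(f"expected an s3:// URI, got {uri!r}")
    return {
        "inputConfiguration": {"s3Uri": input_s3_uri},
        "outputConfiguration": {"s3Uri": output_s3_uri},
        # The project is assumed to hold the standard output configuration
        # and any blueprints used for custom output.
        "dataAutomationConfiguration": {
            "dataAutomationProjectArn": project_arn,
            "stage": "LIVE",
        },
        "dataAutomationProfileArn": profile_arn,
    }


# Hypothetical placeholder values, for illustration only.
request = build_bda_request(
    "s3://example-input-bucket/invoices/invoice-001.pdf",
    "s3://example-output-bucket/results/",
    "arn:aws:bedrock:us-east-1:123456789012:data-automation-project/example",
    "arn:aws:bedrock:us-east-1:123456789012:data-automation-profile/us.data-automation-v1",
)

# With boto3 installed and credentials configured, the call would look like:
#   import boto3
#   client = boto3.client("bedrock-data-automation-runtime")
#   response = client.invoke_data_automation_async(**request)
```

Keeping the request assembly separate from the network call makes the payload easy to inspect and test before any job is started.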
Output Response:
Linearized text representation of the asset, based on the configuration.
Output returned as JSON + additional files if selected in configuration.
Supported Formats & Information BDA Extracts:
[ 1 ] Documents:
Extracts fields, tables, entities
Classifies, transforms, summarises, and validates
[ 2 ] Images:
Offers OCR, document classification, object detection, and handwriting extraction
[ 3 ] Audio:
Provides transcription, summarisation, sentiment analysis, speaker detection, and intent extraction
[ 4 ] Video:
Offers video summaries, speech-to-text, scene detection, object recognition, and action understanding
Standard Output vs Custom Output (Blueprints):
[ 1 ] Standard Output:
Out-of-the-box extraction
Ideal for common documents
Zero setup, quick results
[ 2 ] Custom Output:
Based on blueprints
Blueprints can be generated from a prompt or defined by the user
Accelerates setup and maintains consistency
Suitable for industry-specific or complex documents
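To make the idea of a blueprint more concrete, here is a minimal sketch of a custom schema for invoices, plus a check that an extraction result covers every field the schema asks for. The layout is a simplified, hypothetical one loosely modelled on JSON Schema and is not claimed to be the exact BDA blueprint format; `invoice_blueprint`, `missing_fields`, and all field names are invented for illustration.

```python
from typing import Any, Dict, List

# Hypothetical blueprint: a document class plus the fields to extract,
# each with a type and a natural-language instruction.
invoice_blueprint: Dict[str, Any] = {
    "class": "invoice",
    "description": "A supplier invoice",
    "properties": {
        "invoice_number": {"type": "string", "instruction": "The invoice identifier"},
        "invoice_date": {"type": "string", "instruction": "The date the invoice was issued"},
        "total_amount": {"type": "number", "instruction": "The total amount due"},
    },
}


def missing_fields(blueprint: Dict[str, Any], extracted: Dict[str, Any]) -> List[str]:
    """Return blueprint fields that are absent or empty in the extracted output."""
    return [
        name
        for name in blueprint["properties"]
        if extracted.get(name) in (None, "")
    ]


# Example extraction result with one field missing.
extracted = {"invoice_number": "INV-001", "total_amount": 120.5}
print(missing_fields(invoice_blueprint, extracted))  # → ['invoice_date']
```

A fixed schema like this is what lets custom output stay consistent across documents: every result can be checked against the same list of expected fields.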
Types of Document Blueprints:
[ 1 ] Classification:
Invoice, bank statement, ID card, contract, HR letter, etc.
[ 2 ] Extraction:
Entities, fields, tables, metadata
[ 3 ] Transformation:
Modify or restructure data
[ 4 ] Normalization:
Standardise data values
[ 5 ] Validation:
Validate extracted fields against rules
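The normalization and validation steps listed above can be sketched in plain Python. This is an illustration of the concepts only, not how BDA implements them internally; the field names, accepted date formats, and rules are assumptions chosen for the example.

```python
from datetime import datetime
from typing import Any, Dict, List

# Date formats the sketch accepts (an assumption for this example).
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def normalise_date(raw: str) -> str:
    """Normalization: convert a date in any accepted format to ISO 8601."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognised date format: {raw!r}")


def normalise_amount(raw: str) -> float:
    """Normalization: strip currency symbols and thousands separators."""
    return float(raw.replace("$", "").replace(",", "").strip())


def validate(record: Dict[str, Any]) -> List[str]:
    """Validation: return a list of rule violations (empty if the record passes)."""
    errors: List[str] = []
    if not record["invoice_number"].startswith("INV-"):
        errors.append("invoice_number must start with 'INV-'")
    if record["total_amount"] <= 0:
        errors.append("total_amount must be positive")
    return errors


# Raw values as they might appear in a scanned invoice.
raw = {"invoice_number": "INV-001", "invoice_date": "05/03/2025", "total_amount": "$1,250.00"}
record = {
    "invoice_number": raw["invoice_number"],
    "invoice_date": normalise_date(raw["invoice_date"]),       # → '2025-03-05'
    "total_amount": normalise_amount(raw["total_amount"]),     # → 1250.0
}
print(record, validate(record))
```

Running normalization before validation means the rules only ever see one canonical representation of each value.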
Use Cases:
[ 1 ] Banking & Finance:
Automate bank statements, invoices, receipts, fraud checks
[ 2 ] Insurance:
Claims processing from forms, photos, reports
Auto-summaries, extraction, validation
[ 3 ] Customer Support:
Transcribe and summarise calls
Detect sentiment and customer intent
[ 4 ] HR & Legal:
Process resumes, contracts, offer letters
Extract skills, clauses, obligations
[ 5 ] Security & Operations:
Summaries from meeting recordings
CCTV context extraction (people, actions)
Key Takeaways:
Bedrock Data Automation (BDA) is a comprehensive, customizable, and scalable solution.
[ 1 ] One Platform for All Formats:
Automates document, image, audio, and video processing.
[ 2 ] Customizability:
Delivers highly accurate and customizable outputs using advanced foundation models.
Ensures trustworthy and consistent insights tailored to any business workflow.
[ 3 ] Enterprise-Ready:
Scales to thousands of files with high accuracy and compliance.
[ 4 ] Faster, Cheaper, Smarter:
Reduces manual workload and delivers clean, structured outputs instantly.