Eliana Lam for AWS Community On Air

Transforming Unstructured Data into Actionable Insights with Amazon Bedrock Data Automation

Speaker: Hafiz Syed Ashir Hassan @ AWS Amarathon 2025

Summary by Amazon Nova



Problem Statement:

  • Organisations struggle with unstructured data in various formats (documents, images, audio, video).

  • Manual processing is slow, inconsistent, and costly.

  • Existing automation systems are rigid, requiring templates, rules, and manual corrections.

  • Increasing demand for compliance, accuracy, and scalability.

  • Need for automating multi-format data processing with high accuracy using generative AI.

What is Bedrock Data Automation (BDA)?

  • A fully managed document and media automation capability of Amazon Bedrock on AWS.

  • Enables building end-to-end extraction, classification, and transformation pipelines using foundation models.

  • Processes documents, images, audio, and video at scale.

  • Orchestrates multi-step workflows using serverless automation.

  • Minimises custom code while maximising flexibility.
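
As a rough illustration of the low-code, serverless workflow described above, the sketch below assembles an asynchronous job request in Python. It assumes the boto3 `bedrock-data-automation-runtime` client and its `invoke_data_automation_async` operation; the parameter names are given as understood from the SDK at the time of writing and should be checked against the current documentation. The S3 paths and ARNs are placeholders, and `build_request` and `submit` are hypothetical helper names, not part of any AWS library.

```python
def build_request(input_s3_uri, output_s3_uri, project_arn, profile_arn):
    """Assemble keyword arguments for an asynchronous BDA invocation.

    Parameter names are assumptions based on the boto3 runtime client.
    """
    return {
        "inputConfiguration": {"s3Uri": input_s3_uri},
        "outputConfiguration": {"s3Uri": output_s3_uri},
        "dataAutomationConfiguration": {
            "dataAutomationProjectArn": project_arn,
            "stage": "LIVE",
        },
        "dataAutomationProfileArn": profile_arn,
    }


def submit(client, request):
    """Start the job and return the invocation ARN used for status polling."""
    response = client.invoke_data_automation_async(**request)
    return response["invocationArn"]


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   runtime = boto3.client("bedrock-data-automation-runtime")
#   request = build_request(
#       "s3://example-input/claim.pdf",    # placeholder input object
#       "s3://example-output/results/",    # placeholder output prefix
#       "<data-automation-project-arn>",   # placeholder
#       "<data-automation-profile-arn>",   # placeholder
#   )
#   invocation_arn = submit(runtime, request)
```

Keeping the request builder separate from the call makes it possible to check the payload offline before any job is started.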

Input Asset:

  • [ 1 ] Supports various formats:

      • Documents (PDF, DOCX, scanned, structured/unstructured)

      • Images (PNG, JPG)

      • Audio (voice notes, call recordings)

      • Video (meetings, CCTV, webinars)

  • [ 2 ] Offers two types of output instructions:

      • Standard Output Configuration

      • Custom Schema based on matched blueprint
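
When a pipeline receives a mix of assets, a small pre-sorting step can group them by modality before they are submitted. The sketch below is one way to do that in Python. The document and image extensions come from the formats listed above; the audio and video extensions, and the function names `guess_modality` and `group_by_modality`, are assumptions added for illustration.

```python
from pathlib import Path

# Document and image extensions follow the list above.
# Audio and video extensions are assumptions for illustration only.
MODALITY_BY_EXTENSION = {
    ".pdf": "document",
    ".docx": "document",
    ".png": "image",
    ".jpg": "image",
    ".jpeg": "image",
    ".mp3": "audio",   # assumed
    ".wav": "audio",   # assumed
    ".mp4": "video",   # assumed
    ".mov": "video",   # assumed
}


def guess_modality(filename: str) -> str:
    """Return the modality for a file name, or 'unknown' if it is unmapped."""
    suffix = Path(filename).suffix.lower()
    return MODALITY_BY_EXTENSION.get(suffix, "unknown")


def group_by_modality(filenames):
    """Group file names into {modality: [names]} preserving input order."""
    groups = {}
    for name in filenames:
        groups.setdefault(guess_modality(name), []).append(name)
    return groups
```

Files that land in the `unknown` group can be held back for manual review rather than submitted blindly.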

Output Response:

  • Linearized text representation of the asset, based on the configuration.

  • Output is returned as JSON, plus additional files if selected in the configuration.

Supported Formats & Information BDA Extracts:

  • [ 1 ] Documents:

      • Extracts fields, tables, entities

      • Classifies, transforms, summarises, and validates

  • [ 2 ] Images:

      • Offers OCR, document classification, object detection, and handwriting extraction

  • [ 3 ] Audio:

      • Provides transcription, summarisation, sentiment analysis, speaker detection, and intent extraction

  • [ 4 ] Video:

      • Offers video summaries, speech-to-text, scene detection, object recognition, and action understanding

Standard Output vs Custom Output (Blueprints):

  • [ 1 ] Standard Output:

      • Out-of-the-box extraction

      • Ideal for common documents

      • Zero setup, quick results

  • [ 2 ] Custom Output:

      • Based on blueprints

      • Supports prompt-generated or user-defined blueprints

      • Accelerates setup and maintains consistency

      • Suitable for industry-specific or complex documents
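
To make the idea of a user-defined blueprint more concrete, the sketch below builds a small schema for an invoice as a Python dictionary and prints it as JSON. It is only an illustration: the keys (`class`, `description`, `properties`, `instruction`) are assumptions written loosely in JSON Schema style, and `make_blueprint` is a hypothetical helper. The real blueprint format should be taken from the AWS documentation.

```python
import json


def make_blueprint(doc_class, description, fields):
    """Build a blueprint-like schema from {field_name: instruction} pairs.

    The key names used here are illustrative assumptions, not the official format.
    """
    return {
        "class": doc_class,
        "description": description,
        "type": "object",
        "properties": {
            name: {"type": "string", "instruction": instruction}
            for name, instruction in fields.items()
        },
    }


invoice_blueprint = make_blueprint(
    "invoice",
    "A supplier invoice with a total amount",
    {
        "invoice_number": "The unique invoice identifier",
        "total_amount": "The total amount due, including tax",
    },
)

print(json.dumps(invoice_blueprint, indent=2))
```

Describing each field with a plain-language instruction is what lets one schema be reused across documents of the same class, which is where the consistency benefit comes from.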



Types of Document Blueprints:

  • [ 1 ] Classification:

      • Invoice, bank statement, ID card, contract, HR letter, etc.

  • [ 2 ] Extraction:

      • Entities, fields, tables, metadata

  • [ 3 ] Transformation:

      • Modify or restructure data

  • [ 4 ] Normalization:

      • Standardise data values

  • [ 5 ] Validation:

      • Validate extracted fields against rules
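
The normalization and validation steps above can be pictured with a short Python sketch applied to already-extracted invoice fields. This is not how BDA implements them internally; the field names, the accepted date formats (day-first for slashed dates), and the rule that line items must sum to the total are all assumptions chosen for illustration.

```python
from datetime import datetime
from decimal import Decimal

# Accepted input formats are an assumption; slashed dates are read day-first.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def normalize_date(raw):
    """Convert a date string in a known format to ISO 8601 (YYYY-MM-DD)."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognised date: {raw!r}")


def normalize_amount(raw):
    """Strip a dollar sign and thousands separators, e.g. '$1,250.00'."""
    cleaned = raw.replace(",", "").replace("$", "").strip()
    return Decimal(cleaned)


def validate_invoice(fields):
    """Return a list of rule violations; an empty list means the record passed."""
    errors = []
    if not fields.get("invoice_number"):
        errors.append("invoice_number is missing")
    line_total = sum(fields["line_items"], Decimal("0"))
    if line_total != fields["total_amount"]:
        errors.append(
            f"line items sum to {line_total}, expected {fields['total_amount']}"
        )
    return errors
```

Returning a list of violations, rather than raising on the first one, lets a reviewer see every problem with a record in a single pass.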

Use Cases:

  • [ 1 ] Banking & Finance:

      • Automate bank statements, invoices, receipts, fraud checks

  • [ 2 ] Insurance:

      • Claims processing from forms, photos, reports

      • Auto-summaries, extraction, validation

  • [ 3 ] Customer Support:

      • Transcribe & summarize calls

      • Detect sentiment and customer intent

  • [ 4 ] HR & Legal:

      • Process resumes, contracts, offer letters

      • Extract skills, clauses, obligations

  • [ 5 ] Security & Operations:

      • Summaries from meeting recordings

      • CCTV context extraction (people, actions)

Key Takeaways:

  • Bedrock Data Automation (BDA) is a comprehensive, customizable, and scalable solution.

  • [ 1 ] One Platform for All Formats:

      • Automates document, image, audio, and video processing.

  • [ 2 ] Customizability:

      • Delivers highly accurate and customizable outputs using advanced foundation models.

      • Ensures trustworthy and consistent insights tailored to any business workflow.

  • [ 3 ] Enterprise-Ready:

      • Scales to thousands of files with high accuracy and compliance.

  • [ 4 ] Faster, Cheaper, Smarter:

      • Reduces manual workload and delivers clean, structured outputs instantly.



Team:

AWS FSI Customer Acceleration Hong Kong

AWS Amarathon Fan Club

AWS Community Builder Hong Kong
