Oracle Autonomous Database transforms how organizations access, analyze, and share data through integrated data lakehouse capabilities and modern data sharing protocols. By combining data warehouse and data lake technologies with secure, real-time data sharing, Autonomous Database enables organizations to break down data silos and accelerate insights across cloud platforms and organizational boundaries.
Oracle Autonomous AI Lakehouse
Multi-Cloud Data Lake Integration
Oracle Autonomous AI Lakehouse provides simple, secure multi-cloud access to all types of data, eliminating traditional boundaries between databases and data lakes.
Oracle Autonomous AI Lakehouse enables open, interoperable data access across multi-platform, multi-cloud environments. It combines Oracle Autonomous AI Database with the vendor-independent Apache Iceberg table format, enabling customers to run AI and analytics securely on all their data—available on OCI, AWS, Azure, Google Cloud, and Exadata Cloud@Customer.
Supported Cloud Object Stores:
- Oracle Cloud Infrastructure (OCI): Native object storage integration
- Amazon Web Services (AWS): S3 bucket connectivity
- Microsoft Azure: Azure Blob Storage support
- Google Cloud Platform (GCP): Google Cloud Storage access
Key Capabilities:
- Query or load data from multi-cloud object stores without data movement
- Unified SQL access across distributed data sources
- Integrated security and governance policies
- Cost-effective storage at object storage prices with database performance
Comprehensive Data Format Support
Autonomous Database supports all common data types and table formats, providing flexibility for diverse data sources and analytics workloads.
Supported File Formats:
- Parquet: Columnar storage format optimized for analytics
- ORC (Optimized Row Columnar): High-compression columnar format
- CSV (Comma-Separated Values): Traditional text-based data format
- JSON: Semi-structured hierarchical data format
- Avro: Row-based format with schema evolution support
Additional Format Support:
- Delimited text files
- Apache Iceberg tables
- Delta Lake format (via Delta Sharing)
- Proprietary formats through custom readers
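Across these formats, the same DBMS_CLOUD procedures apply; the format parameter selects the file type and per-format options. As a hedged sketch of commonly documented options (fragments only, not complete calls):

```sql
-- Parquet / ORC / Avro: the schema is read from the files themselves
format => JSON_OBJECT('type' value 'parquet')

-- CSV with a header row and a custom delimiter
format => JSON_OBJECT('type' value 'csv',
                      'delimiter' value '|',
                      'skipheaders' value '1')

-- JSON documents
format => JSON_OBJECT('type' value 'json')
```

Option names such as delimiter and skipheaders follow the documented DBMS_CLOUD format options; consult the package reference for the full list per file type.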
External Tables for Data Lake Access
Direct Object Store Querying:
External tables in Autonomous Database link to data in object storage, accessing the files in place without loading them into the database.
External Table Benefits:
- Zero Data Movement: Query data where it resides
- Rapid Data Exploration: Quickly assess data value without loading
- Transparent Access: Applications query external tables like regular tables
- Cost Efficiency: No storage costs within database for external data
External Table Types:
Standard External Tables:
BEGIN
  DBMS_CLOUD.CREATE_EXTERNAL_TABLE(
    table_name      => 'SALES_DATA',
    credential_name => 'OBJ_STORE_CRED',
    file_uri_list   => 'https://objectstorage.us-phoenix-1.oraclecloud.com/n/.../sales/*.parquet',
    format          => JSON_OBJECT('type' value 'parquet')
  );
END;
/
Partitioned External Tables:
Partitioned external tables deliver performance benefits at query time through partition pruning, which eliminates scanning of unnecessary data files.
BEGIN
  DBMS_CLOUD.CREATE_EXTERNAL_PART_TABLE(
    table_name      => 'SALES_PARTITIONED',
    credential_name => 'OBJ_STORE_CRED',
    file_uri_list   => 'https://objectstorage/.../sales/*.parquet',
    column_list     => 'sale_id NUMBER, amount NUMBER, sale_date DATE',
    format          => '{"type":"parquet", "partition_columns":["sale_date"]}'
  );
END;
/
When file_uri_list is used this way, the partition columns are derived from hive-style object names (for example, paths containing sale_date=2024-01-01), with partition_columns in the format parameter naming them.
Hybrid Partitioned Tables:
DBMS_CLOUD.CREATE_HYBRID_PART_TABLE creates a hybrid partitioned table, in which some partitions reside in Autonomous Database storage and others in object storage.
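A minimal sketch of a hybrid partitioned table follows; the credential, table, and object URIs here are hypothetical. External partitions carry an EXTERNAL LOCATION clause, while partitions without one are stored in the database:

```sql
BEGIN
  DBMS_CLOUD.CREATE_HYBRID_PART_TABLE(
    table_name      => 'SALES_HYBRID',
    credential_name => 'OBJ_STORE_CRED',
    format          => JSON_OBJECT('type' value 'csv'),
    column_list     => 'sale_id NUMBER, amount NUMBER, sale_year NUMBER',
    field_list      => 'sale_id, amount, sale_year',
    partitioning_clause =>
      'PARTITION BY RANGE (sale_year) (
         PARTITION p_archive VALUES LESS THAN (2024) EXTERNAL
           LOCATION (''https://objectstorage.../sales_2023.csv''),
         PARTITION p_current VALUES LESS THAN (2025)
       )'
  );
END;
/
```

Here p_archive reads its rows from object storage, while p_current is a regular internal partition—cold data stays at object storage prices, hot data keeps full database performance.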
Apache Iceberg Integration
Open Lakehouse Format:
Autonomous AI Lakehouse integrates with open data platforms across any cloud via Apache Iceberg, letting you query Iceberg tables in place with Oracle AI Database 26ai's built-in AI, machine learning, graph, and spatial—without data movement.
Iceberg Benefits:
- Open Standard: Vendor-independent table format
- Schema Evolution: Add, drop, and rename columns without rewriting data
- Time Travel: Query historical versions of tables
- ACID Transactions: Full transactional support for data lakes
- Hidden Partitioning: Automatic partition management
Multi-Catalog Support:
The Autonomous AI Database Data Catalog supports integration with external data sources and catalogs including:
- Oracle Cloud Infrastructure Data Catalog
- AWS Glue
- Databricks Unity Catalog
- Snowflake Polaris
- Apache Iceberg catalogs
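As a sketch, an Iceberg table registered in AWS Glue can be exposed as an external table through DBMS_CLOUD. The credential, region, and table path below are hypothetical, and the exact access_protocol keys should be verified against the current DBMS_CLOUD documentation:

```sql
BEGIN
  DBMS_CLOUD.CREATE_EXTERNAL_TABLE(
    table_name      => 'ICEBERG_SALES',
    credential_name => 'AWS_CRED',
    format          => '{"access_protocol":
                          {"protocol_type": "iceberg",
                           "protocol_config":
                             {"iceberg_catalog_type": "aws_glue",
                              "iceberg_glue_region":  "us-west-2",
                              "iceberg_table_path":   "my_db.sales"}}}'
  );
END;
/
```

Note that no file_uri_list or column_list is given: the catalog resolves the current metadata file, and the table schema comes from the Iceberg metadata itself.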
Data Lake Accelerator
High-Performance Query Engine:
Oracle Data Lake Accelerator significantly improves query speeds on external data stored in object stores, enabling faster analysis of large datasets without data movement or workflow changes.
Accelerator Features:
- Dynamic Scaling: Automatically scales network and compute resources based on query demands
- Petabyte-Scale Scans: Handles massive datasets efficiently
- Table Caching: Fast repeat queries through intelligent caching
- Consumption-Based Billing: Pay only for resources used during query execution
Performance Benefits:
Organizations using Data Lake Accelerator experience dramatically faster query speeds on Iceberg tables, with the ability to dynamically scale compute resources on demand while keeping costs under control.
Data Sharing in Autonomous Database
Modern Data Sharing Overview
Data sharing is the process of making data accessible to one or more recipients, enabling collaboration, innovation, and insights across organizational boundaries.
Autonomous Database Data Sharing:
Making data centrally and securely available is a fundamental capability in Autonomous Database, enabling real-time collaboration without data duplication.
Key Sharing Capabilities:
- ADB can share Oracle data with non-Oracle clients through open protocols
- Real-time data exchange across Oracle Autonomous Database instances
- Bi-directional sharing (both provide and consume data)
- Support for internal and external recipients
Data Sharing Methods
1. Delta Sharing Protocol:
Oracle Autonomous Database now supports Delta Sharing, enabling secure, open data collaboration across platforms without data copies or ETL.
Delta Sharing Characteristics:
- Open Protocol: Developed by Databricks and maintained as an open standard under the Linux Foundation
- Vendor-Agnostic: Works across cloud providers and platforms
- Secure: Strong authentication and authorization
- Efficient: No data duplication or complex pipelines
- Versioned: Share specific versions of data with recipients
Delta Sharing Architecture:
- Provider creates data share and publishes to object storage
- Recipients receive activation link with credentials
- Data shared in Parquet format via REST API
- Recipients query data without direct database access
2. Live Share (Real-Time Sharing):
Live Share enables optimized, native, and security-focused data sharing between Autonomous Database instances without data duplication.
Live Share Benefits:
- Real-Time Access: Data instantly available across databases
- Native Integration: Based on Autonomous Database Cloud Link technology
- Zero Duplication: No data copying or physical movement
- Low Latency: Smooth, high-performance connection between ADB instances
- Security-Focused: Native OCI security controls
Live Share vs. Delta Sharing:
- Live Share: For ADB-to-ADB real-time sharing within OCI
- Delta Sharing: For ADB-to-external platforms (Databricks, Power BI, Tableau, etc.)
Data Sharing Components
Provider:
The entity (person, institution, or software system) that shares data objects from their Autonomous Database.
Recipient:
An entity (individual, institution, or software system) that receives and accesses shared data from a provider.
Share:
A named collection of tables or views shared as a single entity, representing logical groupings of related data.
Cloud Storage Link:
Location where shared data is stored (Object Storage bucket with credentials) for versioned Delta Sharing.
Data Share Version:
Data shares are versioned; recipients typically access only the latest version unless configured otherwise.
Activation Link:
Secure link sent to recipients enabling them to download a data share profile with authentication tokens.
Data Sharing Use Cases
Internal Collaboration:
- Sharing data between departments without duplication
- Enabling self-service analytics across business units
- Providing test data to development teams
- Supporting data science and ML model development
External Collaboration:
- Sharing data with business partners and suppliers
- Providing data to customers for analytics
- Collaborating with research institutions
- Supporting third-party applications
Multi-Cloud Analytics:
- Analyzing Oracle data in Databricks workspaces
- Visualizing ADB data in Power BI or Tableau
- Integrating ADB with Snowflake analytics
- Building cross-cloud data products
Data Monetization:
- Creating data products for external customers
- Establishing data marketplaces
- Providing subscription-based data access
- Supporting SaaS and analytics applications
Implementing Data Lake and Sharing Solutions
Setting Up Data Lake Access
1. Create Cloud Credentials:
BEGIN
  DBMS_CLOUD.CREATE_CREDENTIAL(
    credential_name => 'OCI_CRED',
    username        => '<tenancy>/<username>',
    password        => '<auth_token>'
  );
END;
/
2. Create External Tables:
BEGIN
  DBMS_CLOUD.CREATE_EXTERNAL_TABLE(
    table_name      => 'CUSTOMER_DATA',
    credential_name => 'OCI_CRED',
    file_uri_list   => 'https://objectstorage.region.oraclecloud.com/n/namespace/b/bucket/o/customers/*.parquet',
    format          => JSON_OBJECT('type' value 'parquet')
  );
END;
/
3. Query External Data:
SELECT
  customer_id,
  customer_name,
  total_purchases
FROM CUSTOMER_DATA
WHERE purchase_date >= DATE '2024-01-01'
ORDER BY total_purchases DESC
FETCH FIRST 10 ROWS ONLY;
Configuring Data Sharing
As a Data Provider:
1. Access Data Share Tool:
Navigate to Database Actions > Data Studio > Data Share
2. Create Share:
- Click "Provide Share"
- Select tables or views to share
- Name the share and provide description
- Choose share type (versioned Delta Share or Live Share)
3. Add Recipients:
- Specify recipient email addresses
- Set access permissions and expiration
- Generate activation links
- Recipients receive email with profile download link
4. Publish Share:
BEGIN
  DBMS_SHARE.PUBLISH_SHARE(
    share_name => 'SALES_SHARE',
    comments   => 'Q4 2024 sales data'
  );
END;
/
As a Data Recipient:
1. Receive Activation Link:
Provider sends email with secure activation link
2. Download Profile:
Click activation link to download JSON profile with credentials
3. Configure Connection:
- In Database Actions, navigate to Data Share
- Click "Consume Share"
- Import JSON profile
- Configure access parameters
4. Access Shared Data:
-- Query shared data as if it were local
SELECT * FROM SHARED_SALES_DATA
WHERE region = 'WEST'
AND sale_date >= TRUNC(SYSDATE, 'MM');
Integrating with External Platforms
Databricks Integration:
# Python code in Databricks
import delta_sharing

# Path to the Delta Sharing profile JSON file
profile_file = "/dbfs/delta_sharing/oracle_share.json"

# Create a SharingClient
client = delta_sharing.SharingClient(profile_file)

# List available shares
shares = client.list_shares()

# Access a shared table
table_url = profile_file + "#share_name.schema_name.table_name"
df = delta_sharing.load_as_pandas(table_url)
Power BI Integration:
- In Power BI, select "Get Data" > "Delta Share"
- Import JSON profile from Oracle ADB
- Select tables to visualize
- Build reports and dashboards
Tableau Integration:
- Create Delta Share connection in Tableau
- Import profile JSON
- Connect to shared tables
- Create visualizations
Best Practices
Data Lake Best Practices
Performance Optimization:
- Use partitioned external tables for large datasets
- Leverage Data Lake Accelerator for complex queries
- Cache frequently accessed data
- Use appropriate file formats (Parquet for analytics)
Security:
- Implement least-privilege access for credentials
- Rotate credentials regularly
- Use private endpoints when possible
- Implement comprehensive audit logging
Cost Management:
- Store cold data in object storage, not database
- Use compression for file formats
- Monitor Data Lake Accelerator consumption
- Clean up unused external tables
Data Sharing Best Practices
Provider Best Practices:
- Document shared datasets with metadata and descriptions
- Version shares appropriately
- Monitor recipient access patterns
- Set appropriate expiration dates for external shares
- Maintain security and governance policies
Recipient Best Practices:
- Secure profile JSON files properly
- Rotate credentials per security policies
- Monitor data freshness and versions
- Validate data quality from external sources
- Document data lineage and dependencies
Governance:
- Establish data classification standards
- Implement approval workflows for external sharing
- Regular audits of shared data and recipients
- Compliance checks for regulatory requirements
- Data quality validation processes
Advanced Scenarios
Hybrid Data Lakehouse
Combining Internal and External Data:
-- Join database table with external data lake table
SELECT
  c.customer_id,
  c.customer_name,
  e.transaction_amount,
  e.transaction_date
FROM CUSTOMERS c
INNER JOIN EXTERNAL_TRANSACTIONS e
  ON c.customer_id = e.customer_id
WHERE e.transaction_date >= ADD_MONTHS(SYSDATE, -3);
Multi-Cloud Analytics Pipeline
End-to-End Flow:
- Data ingested from various sources to multi-cloud object storage
- Autonomous Database queries data in place using external tables
- ML models built on combined data
- Results shared via Delta Sharing to analytics platforms
- Business users consume via Power BI, Tableau, or custom applications
Real-Time Collaborative Analytics
Live Share Architecture:
- Production ADB instance shares live data
- Analytics ADB instances consume via Live Share
- Data scientists access real-time data without impacting production
- Business analysts query current state for decision-making
- No data replication or synchronization delays
Conclusion
Oracle Autonomous Database's integrated data lakehouse and data sharing capabilities represent a fundamental shift in how organizations access, analyze, and collaborate with data. By combining simple, secure multi-cloud access with modern sharing protocols, Autonomous Database eliminates traditional barriers to data-driven insights.
Key Capabilities:
Data Lake Integration:
- Multi-cloud object store access (OCI, AWS, Azure, GCP)
- Comprehensive format support (Parquet, ORC, CSV, JSON, Avro)
- External tables for zero-copy querying
- Apache Iceberg integration for open lakehouse
- Data Lake Accelerator for petabyte-scale performance
Data Sharing:
- Delta Sharing for cross-platform collaboration
- Live Share for real-time ADB-to-ADB exchange
- Bi-directional sharing capabilities
- No data duplication required
- Secure, versioned, and auditable
Business Impact:
- Break Down Silos: Unified access to data across platforms
- Accelerate Insights: Query data where it lives without movement
- Enable Collaboration: Secure sharing within and across organizations
- Reduce Costs: Object storage pricing with database performance
- Future-Proof: Open standards and multi-cloud flexibility
Whether querying petabytes of data across multiple clouds, sharing real-time data between Autonomous Database instances, or collaborating with external partners through Delta Sharing, Autonomous Database provides the comprehensive platform necessary for modern data-driven organizations.
The combination of powerful data lake analytics and secure data sharing capabilities positions Oracle Autonomous Database as the foundation for enterprise data platforms that demand openness, performance, security, and collaboration without compromise.