Oracle Autonomous Database integrates powerful machine learning capabilities and generative AI features that democratize data science and make database interactions accessible to everyone. Oracle Machine Learning AutoML automates complex machine learning workflows, while Select AI enables natural language database queries without SQL expertise.
Oracle Machine Learning for Python (OML4Py)
OML4Py Overview
Oracle Machine Learning for Python (OML4Py) is a Python API that supports the machine learning process including data exploration and preparation, machine learning modeling, and solution deployment using your Oracle AI Database or Oracle Autonomous AI Database.
Key Capabilities:
- In-Database Machine Learning: Run algorithms directly in the database
- Embedded Python Execution: Execute Python functions in database-spawned engines
- Automated Machine Learning (AutoML): Automated algorithm selection, feature selection, and model tuning
- Scalability: Leverage database parallel processing capabilities
- Data Proximity: Process data where it resides without extraction
AutoML: Automating Machine Learning Workflows
The AutoML Value Proposition
Automated Machine Learning (AutoML) enhances user productivity and machine learning results through automated algorithm and feature selection, as well as automated model tuning and selection.
AutoML automates tedious parts of data scientists' workflow and enables non-experts to produce models, democratizing machine learning across organizations.
Benefits:
- Enhanced Productivity: Reduces time spent on routine ML tasks
- Non-Expert Enablement: Allows business users to build models
- Best Practice Automation: Incorporates data science expertise automatically
- Reduced Errors: Eliminates manual mistakes in model selection
- Faster Time-to-Value: Accelerates ML project delivery
AutoML Workflow Stages
The AutoML pipeline consists of six essential stages that guide data from raw input to production-ready models:
1. Dataset Preparation:
- Prepare and clean your training data
- Create Oracle Machine Learning DataFrame proxy objects
- Separate predictors from target variables
- Handle missing values and data quality issues
2. Algorithm Selection:
Identify top-k algorithms most suitable for your dataset and prediction task.
Algorithm Selection Process:
from oml import automl
# Define algorithm selection object
as_object = automl.AlgorithmSelection(
mining_function='classification',
score_metric='accuracy',
parallel=2
)
# Select top algorithms
algorithm_ranking = as_object.select(X_data, y_data, k=4)
Supported Algorithms:
- Support Vector Machines (SVM) - Gaussian and Linear
- Neural Networks
- Random Forest
- Decision Trees
- Logistic Regression
- Naive Bayes
3. Adaptive Sampling:
Select a suitable subsample for model training to optimize performance and training time.
Sampling Benefits:
- Reduces training time for large datasets
- Maintains statistical significance
- Oracle's advanced meta-learning intelligence suggests optimal sample sizes
- Balances accuracy with computational efficiency
4. Feature Selection:
Select relevant features that contribute most significantly to prediction accuracy while reducing dimensionality.
Feature Selection Implementation:
# Define feature selection object
fs_object = automl.FeatureSelection(
mining_function='classification',
score_metric='accuracy',
parallel=2
)
# Reduce features
selected_features = fs_object.reduce(
selected_algorithm,
X_data,
y_data
)
Feature Selection Benefits:
- Reduces model training time without compromising performance
- Uses multiple feature-ranking algorithms
- Oracle advanced meta-learning techniques prune search space efficiently
- Handles both classification and regression problems
5. Model Tuning:
Identify optimal hyperparameters for the selected algorithm to maximize model performance.
Model Tuning Process:
# Define model tuning object
mt_object = automl.ModelTuning(
mining_function='classification',
score_metric='accuracy',
parallel=2
)
# Tune model
tuned_model = mt_object.tune(
selected_algorithm,
X_reduced,
y_data
)
Hyperparameter Optimization:
- Automated parameter search across algorithm-specific hyperparameter space
- Cross-validation for robust performance estimation
- Grid search and random search strategies
- Oracle meta-learning for intelligent parameter recommendations
6. Feature Prediction Impact:
Compute the impact of each input column on the predictions of the final tuned model.
Impact Analysis Benefits:
- Provides insights into tuned AutoML model behavior
- Helps business users understand model decisions
- Identifies most influential features
- Supports model explainability and transparency
AutoML UI: Visual Model Development
No-Code ML Development:
To use the Oracle Machine Learning AutoML UI, you start by creating an experiment. An experiment is a unit of work that minimally specifies the data source, prediction target, and prediction type.
AutoML UI Workflow:
- Create Experiment: Define data source, target, and prediction type
- Configure Settings: Set maximum models, run duration, metrics
- Start Experiment: Choose between "Faster Results" or "Better Accuracy"
- Review Results: Examine ranked models by quality metric
- Deploy or Generate Notebook: Deploy models or generate Python code
Experiment Configuration:
- Maximum Top Models: Select number of top models to create (default 5)
- Maximum Run Duration: Set time limit (default 8 hours)
- Database Service Level: Configure query parallelism
- Model Metric: Choose evaluation metric (accuracy, F1, precision, recall)
- Algorithm Selection: Select specific algorithms or use all candidates
After an experiment runs successfully, it presents you a list of machine learning models in order of model quality according to the metric selected.
Deployment Options:
- Direct Deployment: Deploy selected models for scoring
- Notebook Generation: Generate Python code using OML4Py and specific settings AutoML used to produce the model
- Model Comparison: Compare models across multiple metrics
- Model Explanation: Understand feature importance and prediction drivers
AutoML API: Programmatic Model Development
OML4Py AutoML Classes:
- automl.AlgorithmSelection: Rank algorithms by predicted score
- automl.FeatureSelection: Identify most relevant feature subsets
- automl.ModelTuning: Tune hyperparameters for specific algorithms
- automl.ModelSelection: Select top algorithm and tune automatically
Model Selection Example:
from oml import automl
# Automatic model selection and tuning
ms_object = automl.ModelSelection(
mining_function='classification',
score_metric='f1_weighted',
parallel=4
)
# Select and tune best model
best_model = ms_object.select(X_train, y_train)
# Score on test data
predictions = best_model.predict(X_test)
Select AI: Natural Language Database Queries
Select AI Overview
Select AI makes it easy to get answers about your business using natural language, transforming how users interact with databases by eliminating the need for SQL expertise.
How Select AI Works:
Select AI processes questions using an AI Large Language Model (LLM), translating natural language prompts into SQL queries that the database executes automatically.
The DBMS_CLOUD_AI package enables integration with a user-specified LLM for generating SQL code using natural language prompts.
Select AI Process:
- User Provides Natural Language Prompt: "What were our top 5 products by revenue last quarter?"
- Database Augments Prompt: Adds schema metadata to provide context
- LLM Generates SQL: Translates augmented prompt to SQL query
- Database Executes Query: Runs generated SQL and returns results
- Natural Language Response: Optionally narrates results in natural language
Key Select AI Features
Simple and Intuitive:
- Use plain language instead of complex SQL syntax
- No need to understand database schema or table relationships
- Multilingual support for global businesses
- Voice command support for hands-free queries
Future-Enabled:
- Integration with cutting-edge LLM technologies
- Continuous improvement as LLMs evolve
- Support for emerging AI capabilities
- Regular feature enhancements and updates
Secure:
- Respects existing database security policies
- Access control based on user permissions
- Data never leaves your organization
- Compliant with data governance requirements
Easy to Extend:
- Build new natural language processing applications
- Integrate with existing tools and workflows
- Develop custom AI-driven database interactions
- API support for programmatic access
Supported LLM Providers
Select AI supports multiple Large Language Model providers, giving organizations flexibility to choose their preferred AI platform:
OCI Generative AI:
- Oracle's native generative AI service
- Models from Meta (Llama) and Cohere hosted on OCI
- Strong data governance and security
- Custom models trained on customer data
- Region-specific deployment options
OpenAI:
- GPT-3.5 and GPT-4 models
- Advanced natural language understanding
- Broad language support
- Regular model updates and improvements
Cohere:
- Enterprise-focused LLMs
- Specialized text generation and understanding
- Multilingual capabilities
- Privacy-preserving features
Azure OpenAI:
- Microsoft's enterprise OpenAI service
- GPT models with Azure integration
- Enterprise security and compliance
- Hybrid cloud deployment options
Select AI Actions
Natural Language to SQL:
Generate and execute SQL queries from natural language prompts.
SELECT AI what were the top 5 selling products last month;
SHOWSQL - View Generated Query:
Display the SQL query generated from natural language without executing it.
SELECT AI SHOWSQL what are the top 5 products by revenue;
NARRATE - Natural Language Results:
Execute query and return results as natural language description instead of tabular data.
SELECT AI NARRATE what were our total sales by region last quarter;
CHAT - General AI Conversation:
Submit general requests and questions to the LLM, with optional context from vector stores.
SELECT AI CHAT explain the concept of customer lifetime value;
Retrieval Augmented Generation (RAG)
Select AI with RAG:
Select AI with RAG augments your natural language prompt by retrieving content from your specified vector store using semantic similarity search. This reduces hallucinations by using your specific and up-to-date content and provides more relevant natural language responses.
RAG Benefits:
- Reduced Hallucinations: LLM uses your actual data and documents
- Current Information: Access to latest business content
- Improved Accuracy: Context-specific responses
- Domain Expertise: Incorporate organizational knowledge
AI Vector Search:
AI Vector Search enables similarity queries on text, documents, images, and other unstructured data represented as vectors, powering RAG implementations.
Select AI Profile Configuration
Creating an AI Profile:
BEGIN
DBMS_CLOUD_AI.CREATE_PROFILE(
profile_name => 'my_ai_profile',
attributes => JSON_OBJECT(
'provider' VALUE 'oci',
'credential_name' VALUE 'OCI_CRED',
'object_list' VALUE JSON_ARRAY(
JSON_OBJECT(
'owner' VALUE 'ADMIN',
'name' VALUE 'CUSTOMERS'
),
JSON_OBJECT(
'owner' VALUE 'ADMIN',
'name' VALUE 'ORDERS'
)
)
)
);
END;
/
Profile Components:
- Provider: LLM provider (OCI, OpenAI, Cohere, Azure)
- Credentials: Authentication for LLM access
- Object List: Schemas, tables, and views available for queries
- Settings: Model-specific parameters and configurations
Select AI Use Cases
Business Intelligence:
- Sales performance analysis
- Customer behavior insights
- Market trend identification
- Financial reporting and analysis
Data Exploration:
- Ad-hoc data analysis by business users
- Quick insights without waiting for IT
- Exploratory data analysis for new datasets
- Validation of business hypotheses
Application Development:
- Build conversational AI applications
- Integrate natural language interfaces into existing apps
- Develop chatbots with database access
- Enable voice-activated database queries
Citizen Data Science:
- Empower non-technical users to analyze data
- Self-service analytics for departments
- Reduce IT bottlenecks for simple queries
- Democratize data access across organization
Integration: AutoML and Select AI
Complementary Capabilities
AutoML for Model Building:
Use AutoML to build predictive models from your data, automating the machine learning workflow.
Select AI for Model Insights:
Use Select AI to query model predictions and understand model behavior through natural language.
Combined Workflow:
- Use AutoML to build customer churn prediction model
- Deploy model in database
- Use Select AI to query: "Which customers are most likely to churn next month?"
- Get natural language explanation of predictions and contributing factors
End-to-End AI/ML Platform
Oracle Autonomous Database provides:
- Integrated machine learning platform (OML4Py, OML4R, OML4SQL)
- Automated model development (AutoML UI and API)
- Natural language data access (Select AI)
- Vector database capabilities (AI Vector Search)
- Generative AI integration (LLM providers)
- Scalable in-database processing
Best Practices
AutoML Best Practices
Data Preparation:
- Clean and validate data before AutoML
- Handle missing values appropriately
- Ensure sufficient training data
- Balance classes for classification problems
Experiment Configuration:
- Start with "Faster Results" for quick validation
- Use "Better Accuracy" for production models
- Set reasonable time limits based on dataset size
- Monitor resource consumption during training
Model Evaluation:
- Evaluate multiple top models, not just the best
- Test models on holdout data
- Consider business metrics alongside statistical metrics
- Validate model explainability and fairness
Select AI Best Practices
Profile Configuration:
- Include only relevant schemas and tables in profiles
- Use descriptive table and column comments
- Maintain accurate database metadata
- Create separate profiles for different use cases
Prompt Engineering:
- Be specific and clear in natural language prompts
- Provide context when needed
- Start simple and refine based on results
- Use SHOWSQL to understand generated queries
Security and Governance:
- Leverage existing database security policies
- Monitor LLM usage and costs
- Review generated SQL for sensitive operations
- Implement appropriate access controls
Quality Assurance:
- Validate generated SQL queries
- Review LLM responses for accuracy
- Test with various prompt formulations
- Implement guardrails for production use
Conclusion
Oracle Autonomous Database's integration of Oracle Machine Learning AutoML and Select AI represents a significant advancement in making advanced analytics accessible to all users. AutoML automates tedious parts of data scientists' workflow and enables non-experts to produce models, while Select AI makes it easy to get answers about your business using natural language.
Key Capabilities:
AutoML:
- Six-stage automated ML pipeline
- Algorithm selection, feature selection, and model tuning
- Visual AutoML UI for no-code development
- Programmatic API for advanced users
- Embedded Python execution for scalability
Select AI:
- Natural language to SQL generation
- Multiple LLM provider support (OCI, OpenAI, Cohere, Azure)
- RAG for improved accuracy with organizational data
- Simple, secure, and easily extensible
- Voice and multilingual support
Business Impact:
- Democratize Data Science: Enable business users to build ML models
- Accelerate Insights: Get answers in natural language instantly
- Reduce IT Bottlenecks: Self-service analytics and model building
- Improve Decision Making: Data-driven insights accessible to all
- Enhance Productivity: Automate routine ML and query tasks
Whether you're a data scientist looking to accelerate model development with AutoML, or a business user seeking insights through natural language queries with Select AI, Oracle Autonomous Database provides the integrated platform and automation necessary for modern AI/ML-driven applications.
Top comments (0)