Based on the scripts, I'll create a detailed deployment plan for Azure Functions (v2 model) that addresses the MongoDB/Oracle sources and MSSQL Managed Instance target. The solution will include proper connection management, logging, and incremental sync logic.
Project Structure
oracle-to-mssql-sync/
├── .vscode/
│ └── settings.json
├── oracle/ # Oracle Instant Client for Linux
│ ├── libclntsh.so
│ └── ... (other Oracle libs)
├── scripts/
│ ├── __init__.py
│ ├── accountstatementpremium.py
│ ├── agency.py
│ ├── agents.py
│ ├── allocatedpremiums.py
│ └── happilymaster.py
├── function_app.py
├── requirements.txt
├── host.json
├── local.settings.json
└── .gitignore
Step-by-Step Deployment Guide
1. Set Up Project Directory
mkdir oracle-to-mssql-sync
cd oracle-to-mssql-sync
mkdir .vscode scripts oracle
2. Install Oracle Instant Client
- Download Basic Light Package from Oracle Instant Client
- Unzip and place contents in
oracle/
directory - Verify files:
ls oracle
should showlibclntsh.so
and other shared libraries
3. Create Environment File (local.settings.json)
{
"IsEncrypted": false,
"Values": {
"FUNCTIONS_WORKER_RUNTIME": "python",
"AzureWebJobsStorage": "UseDevelopmentStorage=true",
"ORACLE_CONN_STR": "oracle+oracledb://reporter:reporter@40.112.50.79:1521/HEIRSDB",
"MONGODB_CONN_STR": "mongodb+srv://heirs-prod-retail-service:a0UQ3WLTD8J7sTng@production-middleware-d.3tprz.mongodb.net/",
"MSSQL_HLA_CONN_STR": "DRIVER={ODBC Driver 18 for SQL Server};SERVER=hig-prod-sql-mi.public.ae890fe634a5.database.windows.net,3342;DATABASE=HLA;UID=heirsdata;PWD=P@ssw0rd;Encrypt=yes;TrustServerCertificate=no;Connection Timeout=30;",
"MSSQL_MONGODB_CONN_STR": "DRIVER={ODBC Driver 18 for SQL Server};SERVER=hig-prod-sql-mi.public.ae890fe634a5.database.windows.net,3342;DATABASE=MONGODB;UID=heirsdata;PWD=P@ssw0rd;Encrypt=yes;TrustServerCertificate=no;Connection Timeout=30;",
"LD_LIBRARY_PATH": "/home/site/wwwroot/oracle"
}
}
4. requirements.txt
azure-functions==3.3.0
pandas==1.5.3
sqlalchemy==1.4.41
pyodbc==4.0.39
pymongo==4.3.3
cx_Oracle==8.3.0
python-dotenv==0.21.0
5. Refactored Scripts (common pattern)
scripts/init.py
import os
import logging
import cx_Oracle
from dotenv import load_dotenv
# Initialize environment
load_dotenv()
# Configure Oracle client
try:
cx_Oracle.init_oracle_client(lib_dir=os.getenv("LD_LIBRARY_PATH"))
except Exception as e:
logging.warning(f"Oracle client already initialized: {str(e)}")
# Common database functions
def get_mssql_engine(database_type="HLA"):
conn_str = os.getenv(f"MSSQL_{database_type}_CONN_STR")
quoted = urllib.parse.quote_plus(conn_str)
return create_engine(f"mssql+pyodbc:///?odbc_connect={quoted}")
def get_max_value(engine, table, column):
with engine.connect() as conn:
result = conn.execute(text(f"SELECT MAX({column}) FROM {table}"))
return result.scalar()
scripts/accountstatementpremium.py (example)
import pandas as pd
from . import get_mssql_engine, get_max_value
from pymongo import MongoClient
from sqlalchemy import text
import logging
import os
def sync():
logging.info("Starting AccountStatementPremium sync")
# Setup connections
mongo_client = MongoClient(os.getenv("MONGODB_CONN_STR"))
db = mongo_client['production-retail-db']
collection = db['AccountStatementPremium']
sql_engine = get_mssql_engine("MONGODB")
# Get max date from target
max_date = get_max_value(sql_engine, 'AccountStatementPremium1', 'date')
# Fetch new data
query = {"date": {"$gt": max_date}} if max_date else {}
data = list(collection.find(query).sort("date", -1))
if not data:
logging.info("No new data found")
return
df = pd.DataFrame(data)
# ... (rest of processing logic)
# Batch insert
batch_size = 10000
for i in range(0, len(df), batch_size):
batch = df.iloc[i:i+batch_size]
batch.to_sql('AccountStatementPremium1', sql_engine, if_exists='append', index=False)
logging.info(f"Inserted batch {i//batch_size + 1}/{(len(df)//batch_size)+1}")
logging.info(f"Completed sync: {len(df)} records")
6. function_app.py
import azure.functions as func
import logging
from scripts import (
accountstatementpremium,
agency,
agents,
allocatedpremiums,
happilymaster
)
app = func.FunctionApp()
@app.function_name(name="AccountStatementPremiumSync")
@app.schedule(schedule="*/2 * * * *", arg_name="timer", run_on_startup=True)
def account_sync(timer: func.TimerRequest):
accountstatementpremium.sync()
@app.function_name(name="AgencySync")
@app.schedule(schedule="*/2 * * * *", arg_name="timer", run_on_startup=True)
def agency_sync(timer: func.TimerRequest):
agency.sync()
@app.function_name(name="AgentsSync")
@app.schedule(schedule="*/2 * * * *", arg_name="timer", run_on_startup=True)
def agents_sync(timer: func.TimerRequest):
agents.sync()
@app.function_name(name="AllocatedPremiumsSync")
@app.schedule(schedule="*/2 * * * *", arg_name="timer", run_on_startup=True)
def allocated_sync(timer: func.TimerRequest):
allocatedpremiums.sync()
@app.function_name(name="HappylyMasterSync")
@app.schedule(schedule="*/2 * * * *", arg_name="timer", run_on_startup=True)
def happily_sync(timer: func.TimerRequest):
happilymaster.sync()
7. host.json
{
"version": "2.0",
"logging": {
"applicationInsights": {
"samplingSettings": {
"isEnabled": true,
"excludedTypes": "Request"
}
}
},
"extensionBundle": {
"id": "Microsoft.Azure.Functions.ExtensionBundle",
"version": "[3.15.0, 4.0.0)"
}
}
8. .vscode/settings.json
{
"azureFunctions.deploySubpath": ".",
"azureFunctions.projectLanguage": "Python",
"azureFunctions.projectRuntime": "~3",
"azureFunctions.scmDoBuildDuringDeployment": true,
"azureFunctions.pythonVenv": ".venv",
"azureFunctions.subscriptions": "<your-subscription-id>",
"azureFunctions.resourceGroup": "sync-resources"
}
9. .gitignore
.venv
__pycache__
.vscode
local.settings.json
.env
*.log
Deployment to Azure
- Create Azure Resources:
az group create --name sync-resources --location eastus
az storage account create --name syncstorageacc --location eastus --resource-group sync-resources --sku Standard_LRS
az functionapp create --resource-group sync-resources --consumption-plan-location eastus --runtime python --runtime-version 3.9 --functions-version 4 --name oracle-sync-app --storage-account syncstorageacc --os-type Linux
- Set Environment Variables in Azure:
az functionapp config appsettings set --name oracle-sync-app --resource-group sync-resources --settings \
"ORACLE_CONN_STR=oracle+oracledb://reporter:reporter@40.112.50.79:1521/HEIRSDB" \
"MONGODB_CONN_STR=mongodb+srv://heirs-prod-retail-service:a0UQ3WLTD8J7sTng@production-middleware-d.3tprz.mongodb.net/" \
"MSSQL_HLA_CONN_STR=DRIVER={...};SERVER=...;DATABASE=HLA;..." \
"MSSQL_MONGODB_CONN_STR=DRIVER={...};SERVER=...;DATABASE=MONGODB;..." \
"LD_LIBRARY_PATH=/home/site/wwwroot/oracle" \
"WEBSITE_TIME_ZONE=Africa/Lagos"
-
Deploy from VS Code:
- Install Azure Functions Extension
- Press F1 → "Azure Functions: Deploy to Function App"
- Select your subscription and "oracle-sync-app"
- Confirm deployment
Post-Deployment Configuration
-
Enable Managed Identity for SQL:
- In Azure Portal, go to SQL Managed Instance
- Add function app's managed identity as login:
CREATE USER [oracle-sync-app] FROM EXTERNAL PROVIDER ALTER ROLE db_datareader ADD MEMBER [oracle-sync-app] ALTER ROLE db_datawriter ADD MEMBER [oracle-sync-app]
-
Connection Security:
- Add Azure Function's outbound IPs to MongoDB Atlas IP whitelist
- Enable VNet integration for SQL Managed Instance connectivity
-
Monitoring:
- Configure Application Insights for logging
- Set up alerts for function timeouts (max duration: 10 minutes)
Key Improvements in Refactored Code
-
Secure Credential Management:
- All credentials moved to environment variables
- Azure Key Vault integration ready (use
@Microsoft.KeyVault
references)
-
Efficient Data Transfer:
- Batched inserts (10k records/batch)
- Incremental sync using MAX(date/id) checks
- Connection pooling with SQLAlchemy
-
Error Handling:
- Comprehensive logging to Application Insights
- Automatic retries built into Azure Functions
- Connection validation before data transfer
-
Oracle Client Management:
- Instant client included in deployment
- Proper Linux library path configuration
- Lazy initialization of connections
Troubleshooting Tips
- Oracle Client Issues:
# SSH into function app
func azure functionapp fetch-app-settings oracle-sync-app
func azure functionapp publish oracle-sync-app --build-native-deps
# Check library path
kudu console → ls -la oracle
-
Connection Problems:
- Test connectivity from Kudu Console:
python -c "import pyodbc; pyodbc.connect('DRIVER={ODBC Driver 18 for SQL Server};SERVER=...')"
-
Performance Optimization:
- Increase timeout in host.json:
"functionTimeout": "00:10:00"
- Use Premium plan for longer execution times
This implementation provides a production-ready solution that handles the hybrid data source scenario while following Azure Functions best practices and maintaining security compliance.
Top comments (0)