DEV Community

Goodluck Ekeoma Adiole
Goodluck Ekeoma Adiole

Posted on

MongoDB/Oracle sources and MSSQL Managed Instance target

Based on the scripts, I'll create a detailed deployment plan for Azure Functions (v2 model) that addresses the MongoDB/Oracle sources and MSSQL Managed Instance target. The solution will include proper connection management, logging, and incremental sync logic.

Project Structure

oracle-to-mssql-sync/
├── .vscode/
│   └── settings.json
├── oracle/                     # Oracle Instant Client for Linux
│   ├── libclntsh.so
│   └── ... (other Oracle libs)
├── scripts/
│   ├── __init__.py
│   ├── accountstatementpremium.py
│   ├── agency.py
│   ├── agents.py
│   ├── allocatedpremiums.py
│   └── happilymaster.py
├── function_app.py
├── requirements.txt
├── host.json
├── local.settings.json
└── .gitignore
Enter fullscreen mode Exit fullscreen mode

Step-by-Step Deployment Guide

1. Set Up Project Directory

mkdir oracle-to-mssql-sync
cd oracle-to-mssql-sync
mkdir .vscode scripts oracle
Enter fullscreen mode Exit fullscreen mode

2. Install Oracle Instant Client

  1. Download Basic Light Package from Oracle Instant Client
  2. Unzip and place contents in oracle/ directory
  3. Verify files: ls oracle should show libclntsh.so and other shared libraries

3. Create Environment File (local.settings.json)

{
  "IsEncrypted": false,
  "Values": {
    "FUNCTIONS_WORKER_RUNTIME": "python",
    "AzureWebJobsStorage": "UseDevelopmentStorage=true",
    "ORACLE_CONN_STR": "oracle+oracledb://reporter:reporter@40.112.50.79:1521/HEIRSDB",
    "MONGODB_CONN_STR": "mongodb+srv://heirs-prod-retail-service:a0UQ3WLTD8J7sTng@production-middleware-d.3tprz.mongodb.net/",
    "MSSQL_HLA_CONN_STR": "DRIVER={ODBC Driver 18 for SQL Server};SERVER=hig-prod-sql-mi.public.ae890fe634a5.database.windows.net,3342;DATABASE=HLA;UID=heirsdata;PWD=P@ssw0rd;Encrypt=yes;TrustServerCertificate=no;Connection Timeout=30;",
    "MSSQL_MONGODB_CONN_STR": "DRIVER={ODBC Driver 18 for SQL Server};SERVER=hig-prod-sql-mi.public.ae890fe634a5.database.windows.net,3342;DATABASE=MONGODB;UID=heirsdata;PWD=P@ssw0rd;Encrypt=yes;TrustServerCertificate=no;Connection Timeout=30;",
    "LD_LIBRARY_PATH": "/home/site/wwwroot/oracle"
  }
}
Enter fullscreen mode Exit fullscreen mode

4. requirements.txt

azure-functions==3.3.0
pandas==1.5.3
sqlalchemy==1.4.41
pyodbc==4.0.39
pymongo==4.3.3
cx_Oracle==8.3.0
python-dotenv==0.21.0
Enter fullscreen mode Exit fullscreen mode

5. Refactored Scripts (common pattern)

scripts/init.py
import os
import logging
import cx_Oracle
from dotenv import load_dotenv

# Initialize environment
load_dotenv()

# Configure Oracle client
try:
    cx_Oracle.init_oracle_client(lib_dir=os.getenv("LD_LIBRARY_PATH"))
except Exception as e:
    logging.warning(f"Oracle client already initialized: {str(e)}")

# Common database functions
def get_mssql_engine(database_type="HLA"):
    conn_str = os.getenv(f"MSSQL_{database_type}_CONN_STR")
    quoted = urllib.parse.quote_plus(conn_str)
    return create_engine(f"mssql+pyodbc:///?odbc_connect={quoted}")

def get_max_value(engine, table, column):
    with engine.connect() as conn:
        result = conn.execute(text(f"SELECT MAX({column}) FROM {table}"))
        return result.scalar()
Enter fullscreen mode Exit fullscreen mode
scripts/accountstatementpremium.py (example)
import pandas as pd
from . import get_mssql_engine, get_max_value
from pymongo import MongoClient
from sqlalchemy import text
import logging
import os

def sync():
    logging.info("Starting AccountStatementPremium sync")

    # Setup connections
    mongo_client = MongoClient(os.getenv("MONGODB_CONN_STR"))
    db = mongo_client['production-retail-db']
    collection = db['AccountStatementPremium']

    sql_engine = get_mssql_engine("MONGODB")

    # Get max date from target
    max_date = get_max_value(sql_engine, 'AccountStatementPremium1', 'date')

    # Fetch new data
    query = {"date": {"$gt": max_date}} if max_date else {}
    data = list(collection.find(query).sort("date", -1))

    if not data:
        logging.info("No new data found")
        return

    df = pd.DataFrame(data)
    # ... (rest of processing logic)

    # Batch insert
    batch_size = 10000
    for i in range(0, len(df), batch_size):
        batch = df.iloc[i:i+batch_size]
        batch.to_sql('AccountStatementPremium1', sql_engine, if_exists='append', index=False)
        logging.info(f"Inserted batch {i//batch_size + 1}/{(len(df)//batch_size)+1}")

    logging.info(f"Completed sync: {len(df)} records")
Enter fullscreen mode Exit fullscreen mode

6. function_app.py

import azure.functions as func
import logging
from scripts import (
    accountstatementpremium,
    agency,
    agents,
    allocatedpremiums,
    happilymaster
)

app = func.FunctionApp()

@app.function_name(name="AccountStatementPremiumSync")
@app.schedule(schedule="*/2 * * * *", arg_name="timer", run_on_startup=True)
def account_sync(timer: func.TimerRequest):
    accountstatementpremium.sync()

@app.function_name(name="AgencySync")
@app.schedule(schedule="*/2 * * * *", arg_name="timer", run_on_startup=True)
def agency_sync(timer: func.TimerRequest):
    agency.sync()

@app.function_name(name="AgentsSync")
@app.schedule(schedule="*/2 * * * *", arg_name="timer", run_on_startup=True)
def agents_sync(timer: func.TimerRequest):
    agents.sync()

@app.function_name(name="AllocatedPremiumsSync")
@app.schedule(schedule="*/2 * * * *", arg_name="timer", run_on_startup=True)
def allocated_sync(timer: func.TimerRequest):
    allocatedpremiums.sync()

@app.function_name(name="HappylyMasterSync")
@app.schedule(schedule="*/2 * * * *", arg_name="timer", run_on_startup=True)
def happily_sync(timer: func.TimerRequest):
    happilymaster.sync()
Enter fullscreen mode Exit fullscreen mode

7. host.json

{
  "version": "2.0",
  "logging": {
    "applicationInsights": {
      "samplingSettings": {
        "isEnabled": true,
        "excludedTypes": "Request"
      }
    }
  },
  "extensionBundle": {
    "id": "Microsoft.Azure.Functions.ExtensionBundle",
    "version": "[3.15.0, 4.0.0)"
  }
}
Enter fullscreen mode Exit fullscreen mode

8. .vscode/settings.json

{
  "azureFunctions.deploySubpath": ".",
  "azureFunctions.projectLanguage": "Python",
  "azureFunctions.projectRuntime": "~3",
  "azureFunctions.scmDoBuildDuringDeployment": true,
  "azureFunctions.pythonVenv": ".venv",
  "azureFunctions.subscriptions": "<your-subscription-id>",
  "azureFunctions.resourceGroup": "sync-resources"
}
Enter fullscreen mode Exit fullscreen mode

9. .gitignore

.venv
__pycache__
.vscode
local.settings.json
.env
*.log
Enter fullscreen mode Exit fullscreen mode

Deployment to Azure

  1. Create Azure Resources:
az group create --name sync-resources --location eastus
az storage account create --name syncstorageacc --location eastus --resource-group sync-resources --sku Standard_LRS
az functionapp create --resource-group sync-resources --consumption-plan-location eastus --runtime python --runtime-version 3.9 --functions-version 4 --name oracle-sync-app --storage-account syncstorageacc --os-type Linux
Enter fullscreen mode Exit fullscreen mode
  1. Set Environment Variables in Azure:
az functionapp config appsettings set --name oracle-sync-app --resource-group sync-resources --settings \
  "ORACLE_CONN_STR=oracle+oracledb://reporter:reporter@40.112.50.79:1521/HEIRSDB" \
  "MONGODB_CONN_STR=mongodb+srv://heirs-prod-retail-service:a0UQ3WLTD8J7sTng@production-middleware-d.3tprz.mongodb.net/" \
  "MSSQL_HLA_CONN_STR=DRIVER={...};SERVER=...;DATABASE=HLA;..." \
  "MSSQL_MONGODB_CONN_STR=DRIVER={...};SERVER=...;DATABASE=MONGODB;..." \
  "LD_LIBRARY_PATH=/home/site/wwwroot/oracle" \
  "WEBSITE_TIME_ZONE=Africa/Lagos"
Enter fullscreen mode Exit fullscreen mode
  1. Deploy from VS Code:
    • Install Azure Functions Extension
    • Press F1 → "Azure Functions: Deploy to Function App"
    • Select your subscription and "oracle-sync-app"
    • Confirm deployment

Post-Deployment Configuration

  1. Enable Managed Identity for SQL:

    • In Azure Portal, go to SQL Managed Instance
    • Add function app's managed identity as login:
     CREATE USER [oracle-sync-app] FROM EXTERNAL PROVIDER
     ALTER ROLE db_datareader ADD MEMBER [oracle-sync-app]
     ALTER ROLE db_datawriter ADD MEMBER [oracle-sync-app]
    
  2. Connection Security:

    • Add Azure Function's outbound IPs to MongoDB Atlas IP whitelist
    • Enable VNet integration for SQL Managed Instance connectivity
  3. Monitoring:

    • Configure Application Insights for logging
    • Set up alerts for function timeouts (max duration: 10 minutes)

Key Improvements in Refactored Code

  1. Secure Credential Management:

    • All credentials moved to environment variables
    • Azure Key Vault integration ready (use @Microsoft.KeyVault references)
  2. Efficient Data Transfer:

    • Batched inserts (10k records/batch)
    • Incremental sync using MAX(date/id) checks
    • Connection pooling with SQLAlchemy
  3. Error Handling:

    • Comprehensive logging to Application Insights
    • Automatic retries built into Azure Functions
    • Connection validation before data transfer
  4. Oracle Client Management:

    • Instant client included in deployment
    • Proper Linux library path configuration
    • Lazy initialization of connections

Troubleshooting Tips

  1. Oracle Client Issues:
   # SSH into function app
   func azure functionapp fetch-app-settings oracle-sync-app
   func azure functionapp publish oracle-sync-app --build-native-deps

   # Check library path
   kudu console → ls -la oracle
Enter fullscreen mode Exit fullscreen mode
  1. Connection Problems:

    • Test connectivity from Kudu Console:
     python -c "import pyodbc; pyodbc.connect('DRIVER={ODBC Driver 18 for SQL Server};SERVER=...')"
    
  2. Performance Optimization:

    • Increase timeout in host.json:
     "functionTimeout": "00:10:00"
    
  • Use Premium plan for longer execution times

This implementation provides a production-ready solution that handles the hybrid data source scenario while following Azure Functions best practices and maintaining security compliance.

Top comments (0)