Azure ML Feature Store is a specialized workspace that manages feature engineering, offline materialization to storage, and online serving with Redis. Terraform provisions the infrastructure; the SDK defines the feature sets. Here's how to build it.
In the previous posts, we set up the ML workspace and deployed endpoints. Now we need consistent features feeding those endpoints. Training uses historical features from batch sources. Inference needs the latest values in real time. When these diverge, your model's accuracy degrades silently.
Azure ML Feature Store is implemented as a special type of Azure ML workspace (kind = "FeatureStore"). It manages feature transformation pipelines, materializes features to offline storage (ADLS/Blob) and an online store (Redis), and provides point-in-time feature retrieval for training. Terraform provisions the infrastructure; the SDK defines entities, feature sets, and materialization schedules. 🎯
🏗️ Feature Store Architecture
| Component | What It Does |
|---|---|
| Feature Store | Specialized ML workspace with `kind = "FeatureStore"` |
| Entity | Logical key (e.g., `customer_id`, `account_id`) shared across feature sets |
| Feature Set | Collection of features with transformation code and source definition |
| Offline Store | ADLS/Blob storage for materialized historical features |
| Online Store | Redis cache for low-latency inference lookups |
| Materialization | Spark jobs that compute and sync features on a schedule |
The key concept: feature sets include transformation code. Raw data goes in, computed features come out. The same transformation runs for both offline materialization (training) and online materialization (inference), eliminating training-serving skew.
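To make that concrete, here's a toy plain-Python sketch (my own illustration, not the Azure ML API): one transformation function feeds both the batch path used for training and the online path used for inference.

```python
# Toy illustration of "same transformation, both paths" (not the Azure ML SDK).

def compute_features(transactions):
    """Derive features from raw transaction records."""
    amounts = [t["amount"] for t in transactions]
    return {
        "transaction_count": len(amounts),
        "avg_transaction_amount": sum(amounts) / len(amounts),
    }

raw = [{"amount": 10.0}, {"amount": 30.0}]

offline_features = compute_features(raw)  # materialized for training
online_features = compute_features(raw)   # computed at inference time

# One code path, one result: no room for training-serving skew.
assert offline_features == online_features
```

The feature store enforces this at the platform level: the transformer class you register runs in both the offline and online materialization jobs.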
🔧 Terraform: Provision Feature Store Infrastructure
Feature Store Workspace
```hcl
# feature_store/workspace.tf
resource "azurerm_machine_learning_workspace" "feature_store" {
  name                    = "${var.environment}-feature-store"
  location                = azurerm_resource_group.ml.location
  resource_group_name     = azurerm_resource_group.ml.name
  application_insights_id = azurerm_application_insights.ml.id
  key_vault_id            = azurerm_key_vault.ml.id
  storage_account_id      = azurerm_storage_account.ml.id
  kind                    = "FeatureStore"

  identity {
    type = "SystemAssigned"
  }

  tags = var.tags
}
```
`kind = "FeatureStore"` is the critical setting. This creates a workspace optimized for feature management rather than general ML development.
Offline Materialization Store
```hcl
# feature_store/offline_store.tf
resource "azurerm_storage_account" "offline_store" {
  name                     = "${var.environment}fsoffline${random_string.suffix.result}"
  location                 = azurerm_resource_group.ml.location
  resource_group_name      = azurerm_resource_group.ml.name
  account_tier             = "Standard"
  account_replication_type = var.storage_replication
  is_hns_enabled           = true # ADLS Gen2
  tags                     = var.tags
}

resource "azurerm_storage_container" "features" {
  name                  = "features"
  storage_account_id    = azurerm_storage_account.offline_store.id
  container_access_type = "private"
}
```
`is_hns_enabled = true` enables the ADLS Gen2 hierarchical namespace, which is required for efficient feature materialization with Parquet files.
Online Store (Redis Cache)
```hcl
# feature_store/online_store.tf
resource "azurerm_redis_cache" "online_store" {
  count               = var.enable_online_store ? 1 : 0
  name                = "${var.environment}-fs-redis"
  location            = azurerm_resource_group.ml.location
  resource_group_name = azurerm_resource_group.ml.name
  capacity            = var.redis_capacity
  family              = var.redis_family
  sku_name            = var.redis_sku
  minimum_tls_version = "1.2"

  redis_configuration {
    maxmemory_policy = "allkeys-lru"
  }

  tags = var.tags
}
```
The online store is optional. Enable it when you need low-latency feature lookups during inference. Skip it in dev if you only need offline features for training.
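If you do enable it, downstream inference services will need the Redis credentials. One option is to push the access key into the same Key Vault the workspace uses (a sketch; the secret name here is my own choice, and `primary_access_key` is the standard attribute on `azurerm_redis_cache`):

```hcl
# feature_store/online_store.tf (continued) — hypothetical addition
resource "azurerm_key_vault_secret" "redis_key" {
  count        = var.enable_online_store ? 1 : 0
  name         = "fs-redis-primary-key"
  value        = azurerm_redis_cache.online_store[0].primary_access_key
  key_vault_id = azurerm_key_vault.ml.id
}
```

This keeps the key out of state-adjacent config files and lets consumers fetch it with their managed identity.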
Compute for Materialization
```hcl
# feature_store/compute.tf
resource "azurerm_machine_learning_compute_cluster" "materialization" {
  name                          = "${var.environment}-materialization"
  machine_learning_workspace_id = azurerm_machine_learning_workspace.feature_store.id
  location                      = azurerm_resource_group.ml.location
  vm_size                       = var.materialization_vm_size
  vm_priority                   = "LowPriority"

  identity {
    type = "SystemAssigned"
  }

  scale_settings {
    min_node_count                       = 0
    max_node_count                       = var.materialization_max_nodes
    scale_down_nodes_after_idle_duration = "PT5M"
  }

  tags = var.tags
}
```
Materialization jobs run as Spark pipelines on this compute cluster. `min_node_count = 0` means you pay nothing when no materialization is running.
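A rough back-of-envelope shows why scale-to-zero plus low-priority VMs matters. The per-hour rates below are illustrative assumptions, not Azure quotes; check the pricing page for real numbers.

```python
# Illustrative cost sketch; all rates are hypothetical placeholders.
ON_DEMAND_PER_HOUR = 0.50     # assumed pay-as-you-go rate per node
LOW_PRIORITY_PER_HOUR = 0.10  # assumed low-priority rate per node
RUNS_PER_DAY = 4              # e.g. a 6-hour materialization schedule
HOURS_PER_RUN = 0.5           # assumed Spark job duration
NODES = 2

def monthly_cost(rate_per_hour):
    """Cost of scheduled materialization over a 30-day month."""
    return rate_per_hour * HOURS_PER_RUN * NODES * RUNS_PER_DAY * 30

print(round(monthly_cost(ON_DEMAND_PER_HOUR), 2))     # 60.0
print(round(monthly_cost(LOW_PRIORITY_PER_HOUR), 2))  # 12.0
```

With `min_node_count = 0`, those are the only compute hours you pay for; a cluster idling between runs would dwarf both figures.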
🐍 Define Entities and Feature Sets (SDK)
Terraform provisions infrastructure. The SDK defines the feature engineering logic:
Create an Entity
```python
from azure.ai.ml import MLClient
from azure.ai.ml.entities import FeatureStoreEntity, DataColumn
from azure.identity import DefaultAzureCredential

fs_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="...",
    resource_group_name="...",
    workspace_name="prod-feature-store",
)

account_entity = FeatureStoreEntity(
    name="account",
    version="1",
    index_columns=[DataColumn(name="accountID", type="string")],
    description="Account entity for transaction features",
)

fs_client.feature_store_entities.begin_create_or_update(account_entity).result()
```
Entities define shared join keys. Multiple feature sets can reference the same entity, ensuring consistent joins.
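As a plain-Python analogy (not the SDK), the entity's index column is simply the key that every feature set agrees to join on. Here `avg_balance_30d` stands in for a hypothetical second feature set sharing the same `account` entity:

```python
# Two feature sets keyed on the same entity index column, "accountID".
transactions = {"A-1": {"transaction_count_7d": 12}}
balances = {"A-1": {"avg_balance_30d": 1250.0}}  # hypothetical second set

# Because both sets share the entity key, the join is unambiguous.
features = {
    acct: {**transactions.get(acct, {}), **balances.get(acct, {})}
    for acct in transactions.keys() | balances.keys()
}
print(features)
# {'A-1': {'transaction_count_7d': 12, 'avg_balance_30d': 1250.0}}
```

Without a shared entity, two teams might key on `account_id` vs `accountID` and silently join nothing.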
Define Feature Set with Transformation Code
Feature set specification (YAML):
```yaml
# featuresets/transactions/spec/FeaturesetSpec.yaml
$schema: https://azuremlschemas.azureedge.net/latest/featureSetSpec.schema.json
source:
  type: parquet
  path: abfss://data@storage.dfs.core.windows.net/transactions/
  timestamp_column:
    name: timestamp
feature_transformation_code:
  path: ./transformation_code
  transformer_class: transaction_transform.TransactionFeatureTransformer
features:
  - name: transaction_count_7d
    type: integer
  - name: avg_transaction_amount_7d
    type: float
  - name: total_spend_3d
    type: float
  - name: max_transaction_amount
    type: float
index_columns:
  - name: accountID
    type: string
```
Transformation code (Spark):
```python
# transformation_code/transaction_transform.py
from pyspark.sql import DataFrame
from pyspark.sql import functions as F
from pyspark.sql.window import Window


class TransactionFeatureTransformer:
    def transform(self, raw_data: DataFrame) -> DataFrame:
        # rangeBetween requires a numeric ordering column, so cast the
        # timestamp to epoch seconds before applying the range frame.
        ts_seconds = F.col("timestamp").cast("long")
        window_7d = (
            Window.partitionBy("accountID").orderBy(ts_seconds).rangeBetween(-7 * 86400, 0)
        )
        window_3d = (
            Window.partitionBy("accountID").orderBy(ts_seconds).rangeBetween(-3 * 86400, 0)
        )
        return raw_data.select(
            "accountID",
            "timestamp",
            F.count("*").over(window_7d).alias("transaction_count_7d"),
            F.avg("amount").over(window_7d).alias("avg_transaction_amount_7d"),
            F.sum("amount").over(window_3d).alias("total_spend_3d"),
            F.max("amount").over(window_7d).alias("max_transaction_amount"),
        )
```
Register and Materialize
```python
from azure.ai.ml.entities import FeatureSet, FeatureSetSpecification

transaction_fset = FeatureSet(
    name="transactions",
    version="1",
    description="7-day and 3-day rolling transaction aggregations",
    entities=["azureml:account:1"],
    specification=FeatureSetSpecification(
        path="./featuresets/transactions/spec"
    ),
    tags={"data_type": "nonPII"},
)

fs_client.feature_sets.begin_create_or_update(transaction_fset).result()
```
Configure Materialization Schedule
```python
from azure.ai.ml.entities import (
    MaterializationSettings,
    MaterializationComputeResource,
    RecurrenceTrigger,
)

materialization = MaterializationSettings(
    resource=MaterializationComputeResource(instance_type="Standard_E8s_v3"),
    schedule=RecurrenceTrigger(frequency="Hour", interval=6),
    offline_enabled=True,
    online_enabled=True,
)

fset = fs_client.feature_sets.get(name="transactions", version="1")
fset.materialization_settings = materialization
fs_client.feature_sets.begin_create_or_update(fset).result()
```
🌍 Environment Configuration
```hcl
# environments/dev.tfvars
environment               = "dev"
enable_online_store       = false # No Redis in dev
storage_replication       = "LRS"
materialization_vm_size   = "Standard_E4s_v3"
materialization_max_nodes = 2
```

```hcl
# environments/prod.tfvars
environment               = "prod"
enable_online_store       = true
redis_sku                 = "Standard"
redis_capacity            = 1
redis_family              = "C"
storage_replication       = "GRS"
materialization_vm_size   = "Standard_E8s_v3"
materialization_max_nodes = 8
```
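For completeness, the tfvars files above imply a variables.tf roughly like this. The variable names come from the resources earlier in the post; the types and defaults are my assumptions:

```hcl
# feature_store/variables.tf — sketch inferred from the tfvars files
variable "environment" { type = string }

variable "tags" {
  type    = map(string)
  default = {}
}

variable "enable_online_store" {
  type    = bool
  default = false
}

variable "redis_sku" {
  type    = string
  default = "Standard"
}

variable "redis_capacity" {
  type    = number
  default = 1
}

variable "redis_family" {
  type    = string
  default = "C"
}

variable "storage_replication" {
  type    = string
  default = "LRS"
}

variable "materialization_vm_size" {
  type    = string
  default = "Standard_E4s_v3"
}

variable "materialization_max_nodes" {
  type    = number
  default = 2
}
```

Defaulting to the cheaper dev values means a bare `terraform apply` never accidentally provisions Redis or a large cluster.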
⚠️ Gotchas and Tips
Feature store is a workspace. It's implemented as `kind = "FeatureStore"` on azurerm_machine_learning_workspace. It needs the same dependencies (storage, Key Vault, App Insights) as a regular workspace.
Transformation code runs as Spark. Feature transformations execute on the materialization compute cluster using PySpark. Test your transformations locally with a Spark session before registering.
Entities enforce consistent joins. Define entities once (e.g., "account" with key "accountID") and reuse across feature sets. This prevents mismatched join keys between teams.
Materialization costs. Each scheduled run spins up the compute cluster, runs the Spark job, and writes to storage. LowPriority VMs reduce cost. min_node_count = 0 ensures you pay nothing between runs.
Redis cost for online store. Standard Redis starts at ~$40/month. Premium with replication is ~$200/month. Skip online store in dev unless you're testing real-time inference.
Feature set versioning. Feature sets are versioned. Changing the transformation logic? Create version "2". This maintains backward compatibility for models still using version "1".
⏭️ What's Next
This is Post 3 of the Azure ML Pipelines & MLOps with Terraform series.
- Post 1: Azure ML Workspace 🎬
- Post 2: Azure ML Online Endpoints 🌐
- Post 3: Azure ML Feature Store (you are here) 🏗️
- Post 4: Azure ML Pipelines + Azure DevOps
Your features have a home. ADLS for offline training, Redis for online inference, Spark transformations that run the same code for both. No training-serving skew. Versioned feature sets with scheduled materialization, all provisioned with Terraform. 🏗️
Found this helpful? Follow for the full ML Pipelines & MLOps with Terraform series! 🎬