Decommissioning a table sounds like a five-minute job — drop it, done. In a real enterprise environment, it almost never is. Somewhere downstream, someone built a report, a job, or a linked server query that still points at the "dead" table, and you won't find out until it breaks in production.
This is a walkthrough of how I approach decommissioning a table in Databricks/Delta Lake when there's a hybrid environment involved — specifically when SQL Server (accessed via SSMS) still has visibility into or dependencies on that data, which is common in enterprises mid-migration to Azure.
## The problem
A table gets marked for removal — maybe it's a legacy staging table, a duplicate created during a migration, or a dataset that's been replaced by a new Delta table. The instinct is to just drop it. But in most enterprise setups, tables don't exist in isolation:
- Linked servers or external tables in SQL Server may reference the same underlying data
- Power BI or Tableau reports may query it directly via SSMS-managed connections
- Downstream ETL/ELT jobs (Data Factory pipelines, stored procedures) may join against it
- Other teams may have built ad hoc queries against it without your knowledge
If you drop the Databricks table without checking any of this, you're one Monday-morning Power BI refresh away from an incident ticket.
*## Step 1: Confirm nobody is actively querying it
*
Before touching anything, check Databricks query history and cluster logs for read activity against the table over the past 30–90 days. If you have Unity Catalog enabled, table-level access logs make this much easier — you can see exactly which users, jobs, and service principals touched the table recently.
On the SSMS side, if the table is exposed via a linked server or external table definition (common when SQL Server is querying Azure Data Lake via Polybase or an external data source), check sys.dm_exec_query_stats or your SQL Server's query store for recent references to it. Don't rely on memory or documentation here — documentation is almost always out of date.
## Step 2: Check for hard dependencies, not just soft ones
This is the step people skip. Soft dependencies (a stakeholder mentioning "oh yeah someone might use that") are easy to catch by asking around. Hard dependencies — a stored procedure, a synonym, a linked server object, a view — are silent until something breaks.
In SSMS, search for:
- Views or synonyms referencing the table name
- Stored procedures with the table name in their definition (
sys.sql_modulesis useful for a text search across all procedure definitions) - Linked server objects pointing at the external table
In Databricks:
- Check if any Delta Live Tables pipelines, notebooks, or scheduled jobs reference the table
- If Unity Catalog is in use, check lineage — it will show you upstream/downstream dependencies directly, which is far more reliable than grepping through notebooks
## Step 3: Deprecate before you delete
Rather than dropping the table outright, I usually go through a short deprecation window:
- Rename the table with a
_deprecatedor_DEPR_<date>suffix rather than deleting it immediately - Revoke write access, but leave read access for a short window (1–2 weeks depending on how critical the workload is)
- Communicate the change to any team with known access — even if you think nobody uses it, a short "this table is going away on X date, reply if you need it" message costs you five minutes and saves a lot of pain
- Monitor query logs during the deprecation window for any read attempts against the renamed table — if something breaks, you'll see the failed queries and know exactly who to talk to
## Step 4: Drop, and clean up the metadata trail
Once the deprecation window has passed with no complaints:
- Drop the Delta table (
DROP TABLE) and, if you want to reclaim storage immediately, runVACUUMon the underlying path — otherwise Delta's time-travel retention will keep old files around for the default retention period - If the table was registered as an external table or linked object in SQL Server, drop that reference too — leftover external table definitions pointing at nothing are a common source of confusing errors months later
- Update any data catalog, wiki, or lineage documentation so the next person doesn't rediscover the same table by accident ** ## The actual lesson**
The technical part of dropping a table is trivial. The part that actually takes skill is confirming, with evidence rather than assumptions, that nothing downstream depends on it — especially in hybrid environments where Databricks and SQL Server/SSMS both have partial visibility into the same data. Query logs and lineage tools beat documentation and memory every time. If your platform doesn't have those (Unity Catalog lineage, query history, SQL Server query store), that's worth flagging as a gap before your next decommissioning task, not after an incident.
If you've hit a similar situation — decommissioning tables in a hybrid Databricks/SQL Server setup — I'd be curious how your team handles dependency checks. Drop a comment.
Top comments (0)