DEV Community

Cover image for Bulletproof Power Query (Part 2): A Smart, Fuzzy-Match Rename Function
Ahmed Essam
Ahmed Essam

Posted on

Bulletproof Power Query (Part 2): A Smart, Fuzzy-Match Rename Function

In Part 1 of this series, we tackled the fragile ChangeType step and built a function to handle data conversion errors gracefully.

But what happens before you change types? What if the column name itself changes?

We have all been there. You build a perfect Power BI report or Excel dashboard. It runs smoothly for weeks. Then, one day, the refresh fails.

The Error:

Expression.Error: The column 'Total Sales' of the table wasn't found.

The Cause:
The source system updated a column header from Total Sales to total_sales (case/format change) or Total Sale (typo).

Standard Power Query functions are rigid. They demand exact matches. However, in the real world, data is rarely that clean. You need a function that "thinks" like a human.

The Solution: Table.RenameColumnsSmart

I have developed a robust replacement for the standard rename function. Instead of blindly looking for an exact text match, this function uses a 3-tier search logic to find your columns, ensuring your query survives dirty data updates.

How it works:

  1. Exact Match: Checks if the column exists exactly as requested.
  2. Normalization: Ignores case, underscores, and extra spaces (e.g., Cust ID will match cust_id).
  3. Fuzzy Matching: Uses the Jaccard Similarity algorithm. If you ask for Revenue but the source has Revenu (typo), it calculates the similarity score. If it's high enough (e.g., >80%), it accepts the match automatically.

The "Professional" Touch: Metadata Logging

Just like in Part 1, we don't want to fail silently. If a column is truly missing (e.g., the similarity score is too low), this function:

  1. Does NOT break the query.
  2. Logs the warning to the table's Metadata.

This allows your data to keep flowing while giving you a hidden audit trail to check for quality issues.

The M Code

Copy and paste the code below into a Blank Query in Power Query and name it fxSmartRename.

let
    Table.RenameColumnsSmart = (
        SourceTable as table, 
        Renames as list, 
        optional SimilarityThreshold as number
    ) as table =>
    let
        /* CONFIGURATION 
           Default threshold is 0.8 (80% similarity) if not provided.
           1.0 = Exact match required.
           0.0 = Matches anything (Dangerous).
        */
        Threshold = if SimilarityThreshold = null then 0.8 else SimilarityThreshold,

        // 1. Helper: Clean text for comparison (Lower case, remove spaces/punctuation)
        CleanText = (t as text) => 
            Text.Lower(Text.Select(t, {"a".."z", "0".."9", "A".."Z"})),

        // 2. Helper: Calculate Jaccard Similarity between two strings
        GetSimilarity = (txt1 as text, txt2 as text) as number =>
            let
                l1 = Text.ToList(CleanText(txt1)),
                l2 = Text.ToList(CleanText(txt2)),
                Intersect = List.Count(List.Intersect({l1, l2})),
                Union = List.Count(List.Distinct(List.Combine({l1, l2}))),
                Score = if Union = 0 then 0 else Intersect / Union
            in
                Score,

        // Get actual column names from the source table
        ActualColumns = Table.ColumnNames(SourceTable),

        /* PROCESS RENAMES
           Iterate through the user's requested renames. 
           Try to find the best match in the actual table.
        */
        ProcessedRenames = List.Transform(Renames, (item) => 
            let
                TargetName = item{0},
                NewName = item{1},

                // Tier 1: Try Exact Match
                ExactMatch = List.Contains(ActualColumns, TargetName),

                // Tier 2: Try Cleaned Match (Case Insensitive + Ignore Space)
                CleanedMatch = if ExactMatch then TargetName else 
                    List.First(List.Select(ActualColumns, each CleanText(_) = CleanText(TargetName))),

                // Tier 3: Try Fuzzy Match (Similarity Score)
                BestMatch = if CleanedMatch <> null then CleanedMatch else
                    let
                        Scored = List.Transform(ActualColumns, each {_, GetSimilarity(TargetName, _)}),
                        Sorted = List.Sort(Scored, {each _{1}, Order.Descending}),
                        TopResult = List.First(Sorted),
                        IsGoodMatch = TopResult{1} >= Threshold
                    in
                        if IsGoodMatch then TopResult{0} else null
            in
                // Return a record with details on what happened
                [
                    Requested = TargetName, 
                    NewName = NewName, 
                    FoundMatch = BestMatch, 
                    Status = if BestMatch <> null then "Renamed" else "Missing"
                ]
        ),

        // Filter down to the list of successful renames for the native function
        ValidRenamesList = List.Transform(
            List.Select(ProcessedRenames, each [Status] = "Renamed"), 
            each {[FoundMatch], [NewName]}
        ),

        // Create a list of warnings (Missing columns)
        MissingColumns = List.Transform(
            List.Select(ProcessedRenames, each [Status] = "Missing"), 
            each [Requested]
        ),

        // Apply the standard RenameColumns function using our smart list
        // MissingField.Ignore is used as a safety net.
        RenamedTable = Table.RenameColumns(SourceTable, ValidRenamesList, MissingField.Ignore),

        // Attach the missing column warnings to the Table's Metadata.
        TableWithMeta = Value.ReplaceType(
            RenamedTable, 
            Value.Type(RenamedTable) meta [
                SmartRename_Warnings = MissingColumns, 
                SmartRename_Log = ProcessedRenames
            ]
        )
    in
        TableWithMeta
in
    Table.RenameColumnsSmart

Enter fullscreen mode Exit fullscreen mode

Usage Example

Imagine your source data is messy:

  • Expected: Employee ID, Annual Salary
  • Actual Data: emp_id, Anual Salary (Note the typo in Salary) Here is how you call the function:
// Apply the Smart Rename
CleanData = fxSmartRename(
    SourceTable, 
    {
        {"Employee ID", "ID"},      // Finds 'emp_id' via normalization
        {"Annual Salary", "Salary"} // Finds 'Anual Salary' via Fuzzy Match
    }, 
    0.8 // Sensitivity (80% match required)
)
Enter fullscreen mode Exit fullscreen mode

The Result

  1. emp_id is successfully renamed to ID.
  2. Anual Salary is successfully renamed to Salary.
  3. The query does not break.

Checking for Warnings

If you want to see if any columns were missed (and therefore skipped), you can check the metadata of your result step:

// Add a custom step to view warnings
CheckWarnings = Value.Metadata(CleanData)[SmartRename_Warnings]
Enter fullscreen mode Exit fullscreen mode

Conclusion

By combining the Smart Type Change from Part 1 and this Smart Rename from Part 2, you now have a "Self-Healing" ETL foundation. These functions allow you to build Power BI reports and Excel tools that are resilient to the chaos of real-world data.

Stay tuned for more robust M patterns!

Top comments (0)