DEV Community

Darian Vance

Posted on • Originally published at wp.me

Solved: About to do a mass license swap and I’m having trouble with the scripting part

🚀 Executive Summary

TL;DR: Mass Microsoft 365 license swap scripts often fail silently because M365 API operations are asynchronous, meaning the backend processes requests with a delay, creating race conditions. The solution involves implementing robust verification loops or batching methods to ensure changes are fully applied before proceeding, preventing data inconsistencies.

🎯 Key Takeaways

  • M365 API operations, including license changes, are asynchronous; initial ‘OK’ responses do not guarantee immediate backend processing.
  • Simple PowerShell foreach loops for license swaps can create race conditions, leading to silent failures where only a subset of users receive the intended license.
  • The ‘Trust but Verify’ method is a professional approach that actively checks a user’s license status in a loop, ensuring the new license is applied before removing the old one, providing resilience.
  • For large-scale operations (2000+ users), a ‘Batching & Logging’ method is recommended, decoupling data gathering, execution, and reporting for fault tolerance and recovery.
  • Passing -AddLicenses and -RemoveLicenses to a single Set-MsolUserLicense call gives the cleanest swap when services like Exchange Online are present in both old and new licenses; New-MsolLicenseOptions is only needed when individual service plans in the new license must be disabled.

Struggling with a mass Microsoft 365 license swap script that fails silently? We break down why simple PowerShell loops fail due to API delays and provide three battle-tested solutions, from a quick fix to a robust, permanent approach.

You’re Not Crazy, M365 License Swaps Are Hard: A Senior Engineer’s Playbook

I remember it like it was yesterday. It was 10 PM on a Thursday, staring at a PowerShell console on our management bastion, mgmt-util-01. The task was “simple”: swap 500 users in the finance department from an E3 to an E5 license before a compliance audit the next morning. I wrote a clean foreach loop, kicked it off, and watched it fly through all 500 users. No errors. Green text everywhere. I closed my laptop, confident in a job well done. The next morning, my Teams was a bonfire of panicked messages. Only about 350 of the users actually had the new license. The script lied to me. We’ve all been there, and that night taught me a hard lesson about the gap between what PowerShell tells you and what’s actually happening in the M365 backend.

The “Why”: The Lie of the Asynchronous API

Here’s the problem in a nutshell: when you run a command like Set-MsolUserLicense, you’re not actually changing the license yourself. You’re handing a request to a Microsoft 365 web service and saying, “Hey, please do this for me.” The service responds almost instantly with, “Sure thing, I’ve got your request!” and your script happily moves on to the next user. Your terminal shows success, but in reality, the M365 backend has just put your request in a queue. It might take a few seconds, or even a minute, for the change to actually be processed and replicated. If your script adds and removes licenses in quick succession within the same loop, you’re essentially creating a race condition. You’re telling the backend to remove the old license while the new one, from its perspective, might still be in a pending state. This is why scripts fail silently: the initial call succeeded, but the subsequent operations on the backend failed.

A Note from the Trenches: This applies to more than just licenses. Anytime you’re doing rapid, consecutive operations against the same object in M365 (or Azure AD), assume it’s asynchronous. Group memberships, user properties, you name it. Never trust the initial ‘OK’ response on a critical, state-changing operation.
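
The "never trust the initial OK" rule can be captured in a small reusable helper. What follows is a minimal sketch, not part of any Microsoft module: Wait-Until, its parameters, and the simulated backend are all hypothetical names invented for illustration. In a real script the condition block would re-read the object (user, group, property) you just changed.

```powershell
# Hypothetical helper: poll a condition until it is true or we run out of attempts.
function Wait-Until {
    param(
        [Parameter(Mandatory)] [scriptblock] $Condition,  # should return $true once the change is visible
        [int] $MaxAttempts = 10,
        [double] $DelaySeconds = 3
    )
    for ($attempt = 1; $attempt -le $MaxAttempts; $attempt++) {
        if (& $Condition) { return $true }
        # Only sleep if another attempt is coming.
        if ($attempt -lt $MaxAttempts) {
            Start-Sleep -Milliseconds ([int]($DelaySeconds * 1000))
        }
    }
    return $false
}

# Simulated backend (an assumption for the demo): the change only
# becomes visible on the third read.
$state = @{ Reads = 0 }
$visible = Wait-Until -DelaySeconds 0.01 -Condition {
    $state.Reads++
    $state.Reads -ge 3
}
Write-Host "Change visible: $visible after $($state.Reads) reads"
```

The bounded attempt count matters as much as the polling itself: if the backend never converges, the helper returns $false and the caller decides what to do, rather than hanging forever.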

Three Ways to Tame the Beast

Over the years, my team and I at TechResolve have developed a few methods to handle this, ranging from the quick-and-dirty to the enterprise-grade. Let’s break them down.

Solution 1: The ‘Get It Done’ Sleepy Loop (The Hacky Fix)

This is the first thing everyone tries, and I’ll be honest, sometimes it’s all you need for a small, one-off job. The idea is to just force the script to wait after each operation, giving the backend time to catch up. It’s ugly, slow, and unreliable, but it can work.

# The "Old" license SKU to remove
$oldLicenseSku = "techresolve:ENTERPRISEPACK" # E3

# The "New" license SKU to add
$newLicenseSku = "techresolve:ENTERPRISEPREMIUM" # E5

$users = Get-MsolUser -All | Where-Object {($_.licenses).AccountSkuId -contains $oldLicenseSku}

foreach ($user in $users) {
    Write-Host "Processing user: $($user.UserPrincipalName)..."

    # Add the new license first
    Set-MsolUserLicense -UserPrincipalName $user.UserPrincipalName -AddLicenses $newLicenseSku

    # This is the hack. Just... wait.
    Write-Host "Pausing for 5 seconds to let the API catch up..."
    Start-Sleep -Seconds 5

    # Now remove the old license
    Set-MsolUserLicense -UserPrincipalName $user.UserPrincipalName -RemoveLicenses $oldLicenseSku

    Write-Host "Successfully swapped license for $($user.UserPrincipalName)"
    Write-Host "--------------------------------------------------"
}

Why it’s bad: What if the API is slow and needs 6 seconds? Or 10? The script will still fail. You’re just guessing. For 1000 users, this adds over an hour of dead time to your script. Use this only if you’re in a pinch.
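
To put a number on that dead time, here is the arithmetic as a quick sketch. Get-SleepOverhead is a hypothetical helper name, and it counts only the fixed sleep, not the time the cmdlets themselves take.

```powershell
# Hypothetical helper: total time spent in Start-Sleep for a fixed-delay loop.
function Get-SleepOverhead {
    param(
        [Parameter(Mandatory)] [int] $UserCount,
        [Parameter(Mandatory)] [double] $SleepSeconds
    )
    [timespan]::FromSeconds($UserCount * $SleepSeconds)
}

$overhead = Get-SleepOverhead -UserCount 1000 -SleepSeconds 5
# 1000 users x 5 s = 5000 s, a bit over 83 minutes of pure waiting.
Write-Host ("Dead time: {0} seconds ({1} minutes)" -f $overhead.TotalSeconds, [math]::Floor($overhead.TotalMinutes))
```

Doubling the sleep to be "safe" doubles that figure, and still guarantees nothing.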

Solution 2: The Idempotent ‘Trust but Verify’ Loop (The Right Fix)

This is the professional approach. Instead of guessing with a Start-Sleep, we actively check the user’s status in a loop. We don’t proceed with removing the old license until we have programmatic confirmation that the new license has been successfully applied. This makes the script resilient to backend delays, and it reports clearly when a user could not be verified.

$oldLicenseSku = "techresolve:ENTERPRISEPACK" # E3
$newLicenseSku = "techresolve:ENTERPRISEPREMIUM" # E5

$users = Get-MsolUser -All | Where-Object {($_.licenses).AccountSkuId -contains $oldLicenseSku}

foreach ($user in $users) {
    Write-Host "Processing user: $($user.UserPrincipalName)..."
    Set-MsolUserLicense -UserPrincipalName $user.UserPrincipalName -AddLicenses $newLicenseSku

    # --- Verification Loop ---
    $licenseApplied = $false
    $attempts = 0
    while (-not $licenseApplied -and $attempts -lt 10) {
        $attempts++
        Write-Host "Verification attempt $attempts for $($user.UserPrincipalName)..."
        $currentUser = Get-MsolUser -UserPrincipalName $user.UserPrincipalName
        if (($currentUser.Licenses).AccountSkuId -contains $newLicenseSku) {
            $licenseApplied = $true
            Write-Host "Verified: New license is active."
        } else {
            # Wait a bit before checking again
            Start-Sleep -Seconds 3
        }
    }

    if ($licenseApplied) {
        Write-Host "Removing old license..."
        Set-MsolUserLicense -UserPrincipalName $user.UserPrincipalName -RemoveLicenses $oldLicenseSku
        Write-Host "Successfully swapped license for $($user.UserPrincipalName)."
    } else {
        Write-Host "ERROR: Failed to verify new license for $($user.UserPrincipalName) after $attempts attempts."
    }
    Write-Host "--------------------------------------------------"
}

This is my go-to for most jobs. It’s robust, gives you clear output, and only waits as long as it absolutely has to. The timeout (the $attempts counter) is crucial to prevent an infinite loop if something is genuinely broken.
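
The heart of the verification is the "does this user have this SKU yet?" check, and it is worth pulling into its own function so you can test it without touching a tenant. This is a sketch: Test-UserHasSku is a hypothetical name, and the mock objects below only imitate the Licenses/AccountSkuId shape that the script above reads from Get-MsolUser.

```powershell
# Hypothetical helper: does the user object carry the given license SKU?
function Test-UserHasSku {
    param(
        [Parameter(Mandatory)] $User,
        [Parameter(Mandatory)] [string] $SkuId
    )
    # @() keeps -contains well-behaved when the user has zero or one license.
    return @($User.Licenses.AccountSkuId) -contains $SkuId
}

# Mock users standing in for Get-MsolUser output (assumed shape).
$userBefore = [pscustomobject]@{
    Licenses = @([pscustomobject]@{ AccountSkuId = 'techresolve:ENTERPRISEPACK' })
}
$userAfter = [pscustomobject]@{
    Licenses = @(
        [pscustomobject]@{ AccountSkuId = 'techresolve:ENTERPRISEPACK' },
        [pscustomobject]@{ AccountSkuId = 'techresolve:ENTERPRISEPREMIUM' }
    )
}

$before = Test-UserHasSku -User $userBefore -SkuId 'techresolve:ENTERPRISEPREMIUM'
$after  = Test-UserHasSku -User $userAfter  -SkuId 'techresolve:ENTERPRISEPREMIUM'
Write-Host "Before: $before, After: $after"
```

In the loop above, this function would replace the inline -contains test, with $User coming from a fresh Get-MsolUser call on every attempt.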

Solution 3: The ‘Too Big to Fail’ Batching Method (The Architect’s Fix)

What if you have 20,000 users? The script above could run for hours, and if your session token expires or the bastion host mgmt-util-01 reboots, you lose all your progress. For massive-scale operations, we need to decouple the process into stages: data gathering, execution, and reporting.

Step 1: Export your target list.

Don’t rely on a live Get-MsolUser call in your main loop. Get the data first.

Get-MsolUser -All | Where-Object {($_.licenses).AccountSkuId -contains "techresolve:ENTERPRISEPACK"} | Select-Object UserPrincipalName | Export-Csv -Path "C:\temp\users_to_swap.csv" -NoTypeInformation

Step 2: Build a resilient, logged execution script.

This script reads from the CSV and writes successes and failures to different log files. This way, if it fails halfway through, you can re-run it against just the users listed in failures_log.txt.

# Same $oldLicenseSku and $newLicenseSku variables as before...

$users = Import-Csv -Path "C:\temp\users_to_swap.csv"

foreach ($user in $users) {
    $upn = $user.UserPrincipalName
    try {
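        # Add the new license and remove the old one in a single call, which is
        # more reliable than two separate requests racing each other.
        # -ErrorAction Stop turns cmdlet errors into exceptions the catch block can see.
        # To disable specific service plans in the new license, build an options object
        # with New-MsolLicenseOptions and pass it via -LicenseOptions in the same call.
        Set-MsolUserLicense -UserPrincipalName $upn -AddLicenses $newLicenseSku -RemoveLicenses $oldLicenseSku -ErrorAction Stop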

        # You can still add the 'Trust but Verify' loop here for maximum safety.
        # For simplicity in this example, we assume it works or throws an error.

        Write-Host "SUCCESS: $upn"
        $upn | Out-File -FilePath "C:\temp\success_log.txt" -Append
    }
    catch {
        Write-Host "FAILURE: $upn - Error: $($_.Exception.Message)"
        $upn | Out-File -FilePath "C:\temp\failures_log.txt" -Append
    }
}

This architectural approach separates your concerns. It’s built for failure and recovery, which is essential when you’re making changes that could impact the entire company.
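
Recovery after an interrupted run mostly comes down to "skip whoever is already in the success log". Here is one way that could look, as a sketch: Get-PendingUpn is a hypothetical helper, the example.com addresses are made up, and the demo uses throwaway temp files rather than the C:\temp paths from the script above.

```powershell
# Hypothetical helper: UPNs from the target CSV that are not yet in the success log.
function Get-PendingUpn {
    param(
        [Parameter(Mandatory)] [string] $TargetCsv,
        [Parameter(Mandatory)] [string] $SuccessLog
    )
    $done = @()
    if (Test-Path -LiteralPath $SuccessLog) {
        # One UPN per line; ignore blank lines.
        $done = @(Get-Content -LiteralPath $SuccessLog | Where-Object { $_ })
    }
    Import-Csv -LiteralPath $TargetCsv |
        Select-Object -ExpandProperty UserPrincipalName |
        Where-Object { $done -notcontains $_ }
}

# Demo with temporary files and made-up users.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ("swap-demo-" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $dir | Out-Null
$csv = Join-Path $dir 'users_to_swap.csv'
$log = Join-Path $dir 'success_log.txt'

'alice@example.com', 'bob@example.com', 'carol@example.com' |
    ForEach-Object { [pscustomobject]@{ UserPrincipalName = $_ } } |
    Export-Csv -Path $csv -NoTypeInformation
'alice@example.com' | Out-File -FilePath $log

$pending = @(Get-PendingUpn -TargetCsv $csv -SuccessLog $log)
Write-Host "Still to process: $($pending -join ', ')"

Remove-Item -LiteralPath $dir -Recurse
```

Feeding the execution loop from this list instead of the raw CSV makes re-runs safe to repeat: users who already succeeded are never touched twice.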

Pro Tip on Licensing Options: Be very careful when assigning a new license that contains services also present in the old one (like Exchange Online). If the old E3 is removed before the new E5 has actually been applied, you can temporarily de-provision the user’s mailbox. The safest pattern is to pass -AddLicenses and -RemoveLicenses to the same Set-MsolUserLicense call so the swap is submitted as one change. The -LicenseOptions parameter (built with New-MsolLicenseOptions) is for disabling individual service plans inside the new license, not for performing the swap itself.
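
The same single-call swap is available in Microsoft Graph PowerShell, which Microsoft positions as the replacement for the deprecated MSOnline module. This sketch swaps in Set-MgUserLicense, which is not what the scripts in this article use. New-LicenseSwapParams is a hypothetical helper that only builds the arguments, and the GUIDs and UPN are placeholders; real SKU IDs are tenant data you would look up with Get-MgSubscribedSku.

```powershell
# Hypothetical helper: build splattable parameters for Set-MgUserLicense.
function New-LicenseSwapParams {
    param(
        [Parameter(Mandatory)] [string] $UserId,
        [Parameter(Mandatory)] [guid] $AddSkuId,
        [Parameter(Mandatory)] [guid] $RemoveSkuId,
        [guid[]] $DisabledPlans = @()
    )
    @{
        UserId         = $UserId
        # One entry per license to add; DisabledPlans may be empty.
        AddLicenses    = @(@{
            SkuId         = $AddSkuId.ToString()
            DisabledPlans = @($DisabledPlans | ForEach-Object { $_.ToString() })
        })
        RemoveLicenses = @($RemoveSkuId.ToString())
    }
}

# Placeholder values, NOT real SKU IDs.
$newSku = '11111111-1111-1111-1111-111111111111'
$oldSku = '22222222-2222-2222-2222-222222222222'
$params = New-LicenseSwapParams -UserId 'user@example.com' -AddSkuId $newSku -RemoveSkuId $oldSku

# With the Microsoft.Graph.Users.Actions module loaded and Connect-MgGraph done,
# the add and remove go out in one request:
# Set-MgUserLicense @params
```

Building the parameters separately from the call keeps the logic checkable offline, and you would still want the 'Trust but Verify' check afterwards, since a single request is no promise of instant replication.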

Summary Table

Here’s a quick cheat sheet for when to use each method:

| Method | Complexity | Reliability | Best For |
|---|---|---|---|
| 1. Sleepy Loop | Low | Low | < 50 users, non-critical, one-off tasks. |
| 2. Trust but Verify | Medium | High | 50-2000 users, critical changes, standard practice. |
| 3. Batching & Logging | High | Very High | 2000+ users, enterprise-scale changes, automation pipelines. |

So next time you’re faced with a mass license swap, don’t just trust the green text. Remember that the API is working on its own schedule. Build verification into your scripts, plan for failure, and you’ll avoid that morning of panicked Teams messages. Happy scripting.



👉 Read the original article on TechResolve.blog


