DEV Community

uyriq

PowerShell script to fix incorrect encoding of MP3 ID3 tags

This PowerShell script corrects the encoding of ID3 tags in MP3 files. It targets one specific problem: tags encoded in WINDOWS-1251 (Cyrillic) that have been saved to files as WINDOWS-1252 text, so programs read them as a Western European encoding and display garbled characters such as "îñòðîâ" instead of "остров". The task is to undo that misreading and save the tags in modern UTF-8 encoding.

TagLibSharp.dll does all the work of reading tags from the file, so to make the script work you need to place this library in the directory from which the script is run. The main purpose of the script is to fix tag encoding, but by calling it with the arguments -AllArtist, -AllAlbum, -AllPicture, or -AllComment you can also batch-set the corresponding metadata for all files.

Do not use this script on tags written in languages that use diacritics (French or German, for example): it treats accented Latin characters as mis-decoded Cyrillic and would convert them to Cyrillic letters.
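
The mis-decoding described above can be reproduced in a few lines. This is a minimal sketch, assuming only the .NET encodings for code pages 1251 and 1252; the variable names are illustrative and not taken from the script, and the sample string is built from code points so the snippet does not depend on how the .ps1 file itself is saved.

```powershell
# PowerShell 7+ (Core): make sure the legacy code pages are registered.
# Assumption: Windows PowerShell 5.1 on .NET Framework already exposes 1251 and 1252.
if ($PSVersionTable.PSEdition -eq 'Core') {
    [System.Text.Encoding]::RegisterProvider([System.Text.CodePagesEncodingProvider]::Instance)
}
$cp1251 = [System.Text.Encoding]::GetEncoding(1251)
$cp1252 = [System.Text.Encoding]::GetEncoding(1252)

# "остров" built from code points U+043E U+0441 U+0442 U+0440 U+043E U+0432
$original = -join ([char[]](0x043E, 0x0441, 0x0442, 0x0440, 0x043E, 0x0432))

# How the damage happens: Windows-1251 bytes are decoded as Windows-1252
$garbled = $cp1252.GetString($cp1251.GetBytes($original))

# How it is undone: recover the raw bytes via 1252, then decode them as 1251
$repaired = $cp1251.GetString($cp1252.GetBytes($garbled))

$garbled                   # îñòðîâ
$repaired -ceq $original   # True
```

The repair works only because every byte involved has a character in both code pages, so no information is lost on the way back.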

Prerequisites

Before running this script, make sure you have the following:

  • PowerShell 5.1 or higher.
  • TagLibSharp.dll: required to work with MP3 tags. You can obtain TagLibSharp.dll either by downloading it from NuGet or by compiling the source code available on GitHub.
  • For PowerShell 5.1 only: System.Text.Encoding.CodePages.dll (netstandard 2.0). Obtain the binary package from NuGet. PowerShell Core includes this library by default.

How to use

  1. Get TagLibSharp.dll and, optionally for PowerShell 5.1, System.Text.Encoding.CodePages.dll.
  2. Place TagLibSharp.dll (and System.Text.Encoding.CodePages.dll, if you need it) in the same directory as the script: the script expects the libraries to be in the directory from which it is run.
  3. Prepare MP3 files: Make sure that all MP3 files you want to fix are placed in the same directory. The script will process all .mp3 files in the directory from which it is run.
  4. Run the script:
    • Open PowerShell and navigate to the directory containing the script and TagLibSharp.dll.
    • Run the script by typing .\convertIDv3tags.ps1 and pressing Enter. The script will process each MP3 file in the directory, correcting the ID3 tag encoding as described.
    • You can optionally pass the arguments -AllArtist, -AllAlbum, -AllPicture, and -AllComment to set the corresponding metadata for the whole group of files.
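
As an illustration of the last step, a batch run that also stamps every file with the same artist, album, cover, and comment might look like the sketch below. The tag values and the cover.jpg path are placeholders, and splatting a hashtable is just one convenient way to pass the parameters; typing them inline after .\convertIDv3tags.ps1 works equally well. The Test-Path guard only keeps the snippet harmless if it is pasted somewhere the script is absent.

```powershell
# Placeholder values; replace them with your own metadata
$tagOverrides = @{
    AllArtist  = 'Example Artist'
    AllAlbum   = 'Example Album'
    AllPicture = '.\cover.jpg'
    AllComment = 'Re-tagged with convertIDv3tags.ps1'
}

# Run from the directory that holds the script, TagLibSharp.dll, and the MP3 files
if (Test-Path -Path '.\convertIDv3tags.ps1' -PathType Leaf) {
    & '.\convertIDv3tags.ps1' @tagOverrides
}
```

Any parameter left out of the hashtable keeps its empty default, in which case the script only fixes the encoding of that tag.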

Notes

  • Back up your files: Before running the script, it is recommended that you back up your MP3 files to prevent unintentional data loss.
  • Script restrictions: The script is specifically restricted to working with MP3 files. Changing the -Filter *.mp3 parameter to work with other file formats may not produce the desired results, although TagLibSharp.dll itself supports other formats, even video.
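
A backup can be as simple as copying the files into a subfolder first. The sketch below is one way to do that; the function name Backup-Mp3Files and the mp3-backup folder name are hypothetical and not part of the script.

```powershell
function Backup-Mp3Files {
    param(
        [string]$SourceDir = (Get-Location).Path,
        # Hypothetical folder name; any location outside the processed set works
        [string]$BackupDir = (Join-Path (Get-Location).Path 'mp3-backup')
    )
    # Create the backup folder if it does not exist yet
    New-Item -ItemType Directory -Path $BackupDir -Force | Out-Null
    # Copy only the top-level .mp3 files, the same set the script processes
    Get-ChildItem -LiteralPath $SourceDir -Filter *.mp3 -File |
        Copy-Item -Destination $BackupDir -Force
    # Return the number of .mp3 files now present in the backup folder
    return @(Get-ChildItem -LiteralPath $BackupDir -Filter *.mp3 -File).Count
}

# Usage: run in the MP3 directory before running the tag-fixing script
# Backup-Mp3Files
```

Because the script does not recurse into subdirectories, copies kept in a subfolder are left untouched when the tags are rewritten.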

Nota bene: if you wish to experiment more with TagLibSharp, dive in with

# .description: "This script reads the ID3 tags of MP3 files in a directory, corrects the encoding from WINDOWS-1251 (incorrectly displayed as WINDOWS-1252) to UTF-8, and saves the corrected tags back to the files."
# .prerequisites: "To run this script, you must first obtain TagLibSharp.dll. This can be done by downloading it from https://nuget.org/packages/TagLibSharp/2.3.0 or by compiling the sources available at https://github.com/mono/taglib-sharp."
# .how_to_use: "Execute this script in the directory containing the MP3 files you wish to correct. Ensure TagLibSharp.dll is accessible to the script, adjusting the library loading path as necessary."
# You can optionally pass -AllArtist, -AllAlbum, -AllPicture, -AllComment to set the corresponding metadata for a group of files
param(
    # $null by default (PowerShell stores it as an empty string); an empty value means the tag is not overridden
    [string]$AllArtist = $null,
    [string]$AllAlbum = $null,
    [string]$AllPicture = $null,
    [string]$AllComment = $null
)
Add-Type -Path ".\TagLibSharp.dll"
# Windows PowerShell 5.1 does not ship the code page provider, so load it first.
# PowerShell Core (version 6 and above) already includes it.
if ($PSVersionTable.PSEdition -ne 'Core') {
    # obtain binary package from https://nuget.info/packages/System.Text.Encoding.CodePages/ (netstandard 2.0)
    Add-Type -Path ".\System.Text.Encoding.CodePages.dll"
}
# Register the code page provider to ensure Windows-1251 is available
[System.Text.Encoding]::RegisterProvider([System.Text.CodePagesEncodingProvider]::Instance)
function Test-NeedsEncodingCorrection {
    param (
        [string]$text
    )
    # Define the allowed Cyrillic range as code points, so the check does not depend on the script file's own encoding
    $cyrillicRangeStart = 0x0410 # 'А'
    $cyrillicRangeEnd = 0x044F # 'я'
    # Punctuation, space, and quotes that may appear in correctly encoded text
    $additionalAllowedChars = @('!', '=', '-', '+', '~', ' ', ',', '.', '\', '(', ')', "'", '"') | ForEach-Object { [int][char]$_ }
    # Ё (U+0401) and ё (U+0451) lie outside the contiguous range above, so allow them explicitly
    $additionalAllowedChars += 0x0401, 0x0451
    # Add the Latin alphabet (A-Z, a-z) and the digits 0-9
    $latinAndDigits = [char[]]'QWERTYUIOPASDFGHJKLZXCVBNMqwertyuiopasdfghjklzxcvbnm1234567890' | ForEach-Object { [int][char]$_ }
    $additionalAllowedChars += $latinAndDigits
    # Convert text to array of Unicode code points
    $codePoints = $text.ToCharArray() | ForEach-Object { [int]$_ }
    $needsCorrection = $false
    # Check each character against the allowed set
    foreach ($codePoint in $codePoints) {
        $isAllowed = ($codePoint -ge $cyrillicRangeStart -and $codePoint -le $cyrillicRangeEnd) -or $additionalAllowedChars -contains $codePoint
        # Write-Host "CodePoint: $codePoint IsAllowed: $isAllowed" # uncomment for debugging
        if (-not $isAllowed) {
            $needsCorrection = $true
            break
        }
    }
    # True if any character falls outside the allowed set, otherwise false
    return $needsCorrection
}
# Define the directory containing the MP3 files
$directoryPath = (Get-Location).Path
# Get all MP3 files in the directory
$mp3Files = Get-ChildItem -Path $directoryPath -Filter *.mp3
foreach ($file in $mp3Files) {
    # Use TagLib# to read the MP3 file
    $mp3 = [TagLib.File]::Create($file.FullName)
    # Display progress: current file name and total file count.
    # "`r" moves the cursor back to the start of the line so each update overwrites the previous one
    Write-Host $file.Name, "of total:", $mp3Files.Count, "`r" -NoNewline
    # Read the existing title
    $title = $mp3.Tag.Title
    # The tag bytes are Windows-1251 but were decoded as Windows-1252.
    # To repair a value, re-encode it as 1252 to recover the original bytes, then decode those bytes as 1251.
    # This step might need adjustments based on the actual encoding issues
    if (-not [string]::IsNullOrWhiteSpace($title) -and (Test-NeedsEncodingCorrection -text $title)) {
        $bytesTitle = [System.Text.Encoding]::GetEncoding(1252).GetBytes($title)
        $titleCorrected = [System.Text.Encoding]::GetEncoding(1251).GetString($bytesTitle)
    }
    else {
        $titleCorrected = $title # No conversion needed or title is empty
    }
    # -AllComment, when given, overrides the comment of every file
    if (-not [string]::IsNullOrEmpty($AllComment)) {
        $commentCorrected = $AllComment
    }
    else {
        if (-not [string]::IsNullOrWhiteSpace($mp3.Tag.Comment) -and (Test-NeedsEncodingCorrection -text ($mp3.Tag.Comment)) ) {
            $commentCorrected = [System.Text.Encoding]::GetEncoding(1251).GetString([System.Text.Encoding]::GetEncoding(1252).GetBytes($mp3.Tag.Comment))
        }
        else {
            $commentCorrected = $mp3.Tag.Comment # No conversion needed or comment is empty
        }
    }
    # -AllArtist, when given, overrides the performers of every file
    if (-not [string]::IsNullOrEmpty($AllArtist)) {
        $artistCorrected = $AllArtist
    }
    else {
        $performers = $mp3.Tag.Performers -join ", "
        if (-not [string]::IsNullOrWhiteSpace($performers) -and (Test-NeedsEncodingCorrection -text $performers)) {
            $artistCorrected = [System.Text.Encoding]::GetEncoding(1251).GetString([System.Text.Encoding]::GetEncoding(1252).GetBytes($performers))
        }
        else {
            $artistCorrected = $performers # No conversion needed or performers is empty
        }
    }
    # -AllAlbum, when given, overrides the album of every file
    if (-not [string]::IsNullOrEmpty($AllAlbum)) {
        $albumCorrected = $AllAlbum
    }
    else {
        $album = $mp3.Tag.Album
        if (-not [string]::IsNullOrWhiteSpace($album) -and (Test-NeedsEncodingCorrection -text $album)) {
            $albumCorrected = [System.Text.Encoding]::GetEncoding(1251).GetString([System.Text.Encoding]::GetEncoding(1252).GetBytes($album))
        }
        else {
            $albumCorrected = $album # No conversion needed or album is empty
        }
    }
    # If $AllPicture is a valid path to an existing image file, attach it to the MP3 file
    if (-not [string]::IsNullOrWhiteSpace($AllPicture) -and (Test-Path $AllPicture -PathType Leaf)) {
        $mp3.Tag.Pictures = [TagLib.Picture]::CreateFromPath($AllPicture)
    }
    # Update tags with corrected values
    $mp3.Tag.Title = $titleCorrected
    $mp3.Tag.Performers = $artistCorrected -split ", "
    $mp3.Tag.Album = $albumCorrected
    $mp3.Tag.Comment = $commentCorrected
    # Save the changes
    $mp3.Save()
    # Dispose the MP3 object to free resources
    $mp3.Dispose()
}
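
The allow-list check in Test-NeedsEncodingCorrection flags any tag that contains a character outside its list. As a possible alternative (not the author's method), the sketch below relies on two properties of this particular mix-up: Windows-1252 contains no Cyrillic letters, so mis-decoded text never contains any, and the letters А-я of Windows-1251 occupy bytes 0xC0-0xFF, which Windows-1252 shows as U+00C0-U+00FF. The function name Test-LooksMisdecoded is hypothetical.

```powershell
function Test-LooksMisdecoded {
    param([string]$Text)
    # Text that already contains Cyrillic (U+0400-U+04FF) cannot be a 1252 mis-decoding
    if ($Text -cmatch '[\u0400-\u04FF]') { return $false }
    # Bytes 0xC0-0xFF (А-я in 1251) appear as U+00C0-U+00FF when read as 1252
    return ($Text -cmatch '[\u00C0-\u00FF]')
}

# Sample strings built from code points, so the file's own encoding does not matter
$garbled  = -join ([char[]](0xEE, 0xF1, 0xF2, 0xF0, 0xEE, 0xE2))          # "îñòðîâ"
$cyrillic = -join ([char[]](0x43E, 0x441, 0x442, 0x440, 0x43E, 0x432))    # "остров"
$accented = 'caf' + [char]0xE9                                            # "café"

Test-LooksMisdecoded -Text $garbled              # True
Test-LooksMisdecoded -Text ($cyrillic + ': 1')   # False
Test-LooksMisdecoded -Text $accented             # True
```

The last result shows that the caveat about diacritics still applies: accented Western European text cannot be told apart from mis-decoded Cyrillic by character ranges alone.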
