
Luciano Strika

Originally published at datastuff.tech


Magic: The Gathering Meets Data Science

Magic: The Gathering has been one of my hobbies for years. Its large card base and long history make it a perfect fit for Data Analysis and Machine Learning.

In case you missed my previous article, I applied K-Means Clustering (an Unsupervised Learning technique) to a Magic: The Gathering Dataset I scraped myself from mtgtop8. That article explains the technical side, but doesn’t get into the results, because I didn’t think my readers would be into it.

Since many people have stood up to voice their disagreement, I will now show you some of the things the Algorithm learned.

This is neither the first nor the last time I’ll say it: unsupervised learning can be spooky with all it learns, even when you know how it works.

The Data

The Dataset I used for this project contained only professional decks from last year, from the Modern format. I did not include sideboards in this analysis. All of the decks I used for training and visualizations are available, alongside the code, in this GitHub project.

If you know of any good Dataset for casual decks, I’d be happy to hear about it in the comments. Otherwise, I may scrape one in the future.

For this analysis, I’m looking at 777 different decks, containing a total of 642 unique cards (counting lands).

The Results

First of all, I strongly encourage you to pull the repository and try the Jupyter Notebook yourself, as there may be some particular insights you find interesting that I may be missing.

That said, if you want to see what the Data say about a particular card (provided it is part of the competitive meta, which we’ve seen is small enough) and you don’t see it here, ask me in the comments!

Now, the first question we’ll ask ourselves is…

What does each Magic: The Gathering cluster look like?

Remember, we clustered decks, not cards, so we would expect each cluster to roughly represent an archetype, particularly one seeing play in the Modern meta.

First of all: here are the counts for each cluster. That is, how many decks fell into each.

Quantity of decks that fell on each cluster after applying K-Means Clustering.
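
As a rough sketch of how such counts can be obtained: assuming the clustering step produced one integer label per deck, counting decks per cluster is a single pass over those labels. The `labels` list below is made-up toy data, not the article's real output, and `decks_per_cluster` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical cluster labels, one per deck, in the shape a
# K-Means implementation would typically return them (toy data).
labels = [0, 1, 1, 2, 0, 1, 2, 2, 2]


def decks_per_cluster(labels):
    """Return {cluster id: number of decks}, ordered by cluster id."""
    counts = Counter(labels)
    return dict(sorted(counts.items()))


cluster_sizes = decks_per_cluster(labels)

# Flag unusually small clusters; the threshold is arbitrary for this toy data.
small_clusters = [c for c, n in cluster_sizes.items() if n < 3]
```

With the real labels, the same dictionary could be fed to any plotting library to draw the bar chart.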

We can see right off the bat there are two particularly small clusters, with fewer than 30 decks each. Let’s take a closer look.

Cards on each cluster

For cluster number 4, I took, for each deck in it, the set of the 40 cards that appeared the most times, and then intersected those sets to see what the decks all had in common. I repeated that procedure for cluster number 6.

Cluster number 4:
{'Devoted Druid', 'Horizon Canopy', 'Ezuri, Renegade Leader', 'Forest', 'Elvish Archdruid', 'Pendelhaven', "Dwynen\\'s Elite", 'Llanowar Elves', 'Collected Company', 'Windswept Heath', 'Temple Garden', 'Westvale Abbey', 'Razorverge Thicket', 'Heritage Druid', 'Elvish Mystic', 'Nettle Sentinel','Eternal Witness', 'Cavern of Souls', 'Chord of Calling', 'Vizier of Remedies', 'Selfless Spirit'}
Cluster number 6:
{'Funeral Charm', 'Liliana of the Veil', "Raven\\'s Crime", 'Fatal Push', 'Thoughtseize', 'Wrench Mind', 'Bloodstained Mire', 'Smallpox', 'Inquisition of Kozilek', 'Mutavault', 'Urborg, Tomb of Yawgmoth','Infernal Tutor', 'Swamp', 'The Rack', "Bontu\\'s Last Reckoning", 'Shrieking Affliction'}

It appears one of them is a green deck, built around elves and green lands, while the other is built around discard, with cards like Liliana of the Veil and Inquisition of Kozilek.
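
The per-cluster procedure (each deck's most-played cards, intersected across the cluster) can be sketched roughly as follows. This is a minimal illustration and not the notebook's actual code: it assumes each deck is a dict mapping card name to number of copies, and the function names `top_k_cards` and `common_cards`, as well as the toy decks, are hypothetical.

```python
def top_k_cards(deck, k=40):
    """Names of the k cards with the most copies in a single deck."""
    ranked = sorted(deck.items(), key=lambda item: item[1], reverse=True)
    return {name for name, _ in ranked[:k]}


def common_cards(decks, labels, cluster, k=40):
    """Cards that are in the top-k set of every deck in the given cluster."""
    top_sets = [
        top_k_cards(deck, k)
        for deck, label in zip(decks, labels)
        if label == cluster
    ]
    if not top_sets:
        return set()
    return set.intersection(*top_sets)


# Toy data (assumption): three tiny "decks" and their cluster labels.
decks = [
    {"Llanowar Elves": 4, "Forest": 10, "Elvish Mystic": 4},
    {"Llanowar Elves": 4, "Forest": 8, "Heritage Druid": 4},
    {"The Rack": 4, "Swamp": 12, "Smallpox": 4},
]
labels = [0, 0, 1]

shared = common_cards(decks, labels, cluster=0)
```

Note that a strict intersection shrinks quickly as a cluster grows, since a single deck missing a card removes it from the result.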

Here’s the result of the same procedure for all of the clusters; see if you can tell which archetype each belongs to. This also tells us about the distribution of the meta back when I got the data.

Cluster number 0:
{'Scavenger Grounds', 'Eternal Scourge', 'Thought-Knot Seer',
'Eldrazi Temple', 'Mutavault', 'Reality Smasher', 'Wastes',
'Blinkmoth Nexus', 'Eldrazi Mimic', 'Simian Spirit Guide', 'Matter Reshaper',
'Dismember', 'Sea Gate Wreckage', 'Chalice of the Void', 'Ghost Quarter',
'Gemstone Caverns', "Smuggler\\'s Copter", 'Serum Powder'}
Cluster number 1:
{'Scavenging Ooze', 'Liliana of the Veil', 'Forest', 'Tarmogoyf',
'Blood Crypt', 'Terminate', 'Verdant Catacombs', 'Maelstrom Pulse',
'Mountain', 'Fatal Push', 'Thoughtseize', 'Overgrown Tomb',
'Kalitas, Traitor of Ghet', "Kolaghan\\'s Command", 'Wooded Foothills',
'Dark Confidant', 'Bloodstained Mire', 'Inquisition of Kozilek',
'Grim Lavamancer', 'Treetop Village', 'Bloodbraid Elf', 'Raging Ravine',
'Collective Brutality', 'Lightning Bolt', 'Swamp', 'Stomping Ground',
'Blackcleave Cliffs', 'Liliana, the Last Hope'}
Cluster number 2:
{"Urza\\'s Mine", 'Ancient Stirrings', 'Oblivion Stone', 'Firespout',
'Forest', 'Chromatic Star', "Urza\\'s Power Plant", 'Chromatic Sphere',
'Karn Liberated', 'Expedition Map', 'Walking Ballista', 'Ugin, the Spirit Dragon',
'World Breaker', 'Sanctum of Ugin', "Urza\\'s Tower", 'Sylvan Scrying',
'Wurmcoil Engine', 'Ulamog, the Ceaseless Hunger', 'Sea Gate Wreckage',
"Kozilek\\'s Return", 'Grove of the Burnwillows'}
Cluster number 3:
{'Island', 'Flooded Strand', 'Mountain', 'Izzet Charm',
'Blood Moon', 'Desolate Lighthouse', 'Cryptic Command',
'Jace, the Mind Sculptor', 'Serum Visions', 'Snapcaster Mage',
'Bonfire of the Damned', 'Lightning Bolt', 'Remand',
'Opt', 'Steam Vents', 'Electrolyze', 'Through the Breach',
'Emrakul, the Aeons Torn', 'Sulfur Falls', 'Scalding Tarn'}
Cluster number 4:
{'Devoted Druid', 'Horizon Canopy', 'Ezuri, Renegade Leader', 'Forest',
'Elvish Archdruid', 'Pendelhaven', "Dwynen\\'s Elite", 'Llanowar Elves',
'Collected Company', 'Windswept Heath', 'Temple Garden', 'Westvale Abbey',
'Razorverge Thicket', 'Heritage Druid', 'Elvish Mystic', 'Nettle Sentinel',
'Eternal Witness', 'Cavern of Souls', 'Chord of Calling', 'Vizier of Remedies',
'Selfless Spirit'}
Cluster number 5:
{'Champion of the Parish', 'Horizon Canopy', 'Mirran Crusader',
'Reflector Mage', 'Seachrome Coast', "Thalia\\'s Lieutenant",
'Aether Vial', 'Dark Confidant', 'Kitesail Freebooter', 'Plains',
'Dire Fleet Daredevil', 'Ancient Ziggurat', 'Mantis Rider',
'Thalia, Guardian of Thraben', 'Noble Hierarch', 'Cavern of Souls',
'Meddling Mage', 'Phantasmal Image', 'Unclaimed Territory'}
Cluster number 6:
{'Funeral Charm', 'Liliana of the Veil', "Raven\\'s Crime", 'Fatal Push',
'Thoughtseize', 'Wrench Mind', 'Bloodstained Mire', 'Smallpox',
'Inquisition of Kozilek', 'Mutavault', 'Urborg, Tomb of Yawgmoth',
'Infernal Tutor', 'Swamp', 'The Rack', "Bontu\\'s Last Reckoning", 'Shrieking Affliction'}
Cluster number 7:
{"Artificer\\'s Intuition", 'Crucible of Worlds', 'Tectonic Edge',
'Island', 'Bottled Cloister', 'Tolaria West', 'Pithing Needle',
'Ipnu Rivulet', 'Mountain', "Tormod\\'s Crypt", 'Oboro, Palace in the Clouds',
'Mox Opal', 'Field of Ruin', 'Expedition Map',
"Inventors\\' Fair", "Grafdigger\\'s Cage", 'Darksteel Citadel',
'Sorcerous Spyglass', 'Tezzeret the Seeker', 'Welding Jar', 'Ensnaring Bridge',
"Mishra\\'s Bauble", 'Engineered Explosives', 'Chalice of the Void', 'Ghost Quarter',
'Academy Ruins', 'Witchbane Orb', 'Whir of Invention'}

The same analysis on a more recent Dataset may even be useful in and of itself, if you’re into competitive tournaments.

Particular Cards

Three cards stood out to me in those lists: “Mutavault”, “Inquisition of Kozilek” and “Llanowar Elves”.

I wonder if they’re more common in other clusters? I didn’t really know Mutavault was so common in competitive play, and I think Llanowar Elves appearing on a deck tells us some stuff about it.

Well, that’s a one-trick pony. Clearly one of the things characterizing Cluster number 4 is the presence of Llanowar Elves.

With 35 decks using it out of 777, Mutavault appears in 5 out of 8 clusters. Not bad, though that spread is not so unexpected for such a versatile card.

Inquisition of Kozilek appears in half of the clusters, but it’s three times as likely to appear in the first one.

As always, you can generate these graphs for any of the cards, or ask me if you’re interested in a particular one.
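
For readers who want the numbers behind such a graph, here is a minimal sketch of one way to compute a card's spread over the clusters. It assumes decks are dicts of card name to copies and that `labels` holds one cluster id per deck; `card_usage_by_cluster` is a hypothetical helper, and reporting a fraction per cluster (rather than a raw count) is my assumption about what is most informative.

```python
from collections import Counter


def card_usage_by_cluster(decks, labels, card):
    """Return {cluster: fraction of that cluster's decks that play `card`}."""
    totals = Counter(labels)
    hits = Counter(
        label for deck, label in zip(decks, labels) if card in deck
    )
    return {cluster: hits[cluster] / totals[cluster] for cluster in sorted(totals)}


# Toy data (assumption), not the article's real decks.
decks = [
    {"Mutavault": 2, "Swamp": 12},
    {"Forest": 10, "Llanowar Elves": 4},
    {"Mutavault": 4, "Wastes": 6},
    {"Mutavault": 1, "Island": 8},
]
labels = [0, 0, 1, 1]

usage = card_usage_by_cluster(decks, labels, "Mutavault")
```

Normalizing by cluster size keeps a large cluster from dominating the chart simply because it holds more decks.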

Versatile Cards

Lastly, I’ll define a new category of card: a card’s versatility will mean how many different clusters contain at least one deck that uses it.

Admittedly, that definition could be refined a bit more, for instance by counting appearances instead of just whether the card is in a deck or not.

However, the results this way are coherent enough, so I don’t think it needs any more tweaking. Here’s a list with the top 10 most versatile cards, after filtering Basic Lands out.

  1. Dismember
  2. Ghost Quarter
  3. Field of Ruin
  4. Cavern of Souls
  5. Thoughtseize
  6. Mutavault
  7. Sacred Foundry
  8. Stomping Ground
  9. Engineered Explosives
  10. Botanical Sanctum

They’re pretty much the ones you’d expect. However, I’m surprised Lightning Bolt didn’t make the cut. I wasn’t sure whether non-Basic Lands should count, but I left them in in the end.
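
The versatility measure defined above can be sketched in a few lines. This is an illustration under assumptions, not the notebook's code: decks are dicts of card name to copies, `labels` has one cluster id per deck, the `BASIC_LANDS` set is my guess at what was filtered out, and ties are broken alphabetically, which the article does not specify.

```python
# Assumed set of basic lands to filter out (the article does not list them).
BASIC_LANDS = {"Plains", "Island", "Swamp", "Mountain", "Forest", "Wastes"}


def versatility(decks, labels):
    """Map each card to the number of distinct clusters with a deck using it."""
    clusters_by_card = {}
    for deck, label in zip(decks, labels):
        for card in deck:
            clusters_by_card.setdefault(card, set()).add(label)
    return {card: len(clusters) for card, clusters in clusters_by_card.items()}


def most_versatile(decks, labels, n=10):
    """Top-n non-basic-land cards by versatility, ties broken by name."""
    scores = versatility(decks, labels)
    ranked = sorted(
        (item for item in scores.items() if item[0] not in BASIC_LANDS),
        key=lambda item: (-item[1], item[0]),
    )
    return ranked[:n]


# Toy data (assumption), not the article's real decks.
decks = [
    {"Dismember": 2, "Forest": 10},
    {"Dismember": 1, "Swamp": 12, "Thoughtseize": 4},
    {"Thoughtseize": 4, "Swamp": 10},
    {"Dismember": 3, "Island": 9},
]
labels = [0, 1, 1, 2]

top_cards = most_versatile(decks, labels)
```

Counting appearances instead, as suggested earlier, would only require replacing the per-card set of clusters with a running total of copies or decks.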

The fact that I have no idea which card “Engineered Explosives” is proves I’m out of touch with the state of the meta, and maybe I should be playing more, but that’s beside the point.

Conclusion

As we expected, Magic: The Gathering can be a fun source of Data, and I think we have all learned a bit by seeing all this.

Personally, I’m still surprised a bit of glorified linear algebra could learn all about the meta of competitive play.

I’d be even more surprised if it learned about archetypes in casual play, where decks are more diverse, though my intuition tells me with enough clusters, even that should be properly characterized.

What do you think? Would you have liked to see any other bits of information? Were you expecting the algorithm to perform well? And finally, what other domains do you think are fit for a proper Data Analysis, particularly using other Unsupervised Learning Techniques?

Please let me know any or all of that in the comments!

Follow me on Medium or Twitter for more Articles, tutorials and analysis.

The post Magic: The Gathering Meets Data Science appeared first on Data Stuff, my old blog.

You can see my current blog at Strikingloo.github.io
