
Damian R

Self-Education by AI


While studying for AWS Certified Cloud Practitioner, I started with AWS Skill Builder, completed the Cloud Practitioner Essentials and Technical Essentials courses, and took the official practice tests. I also looked for other free tests on the web, but the results were mixed: often limited by paywalls and generally uninspiring.

It's not news that LLMs can be used for self-education, but I wondered how good they would be at running a multiple-choice quiz.
And then I thought: while an LLM was testing me, why not turn the tables and test them back? Why not use this opportunity to try the quizzing capabilities of several mainstream LLMs and see what happens?

Meet Your Gladiators

The line-up for this battle of champions, in alphabetical order:

  • ChatGPT
  • Claude
  • Copilot
  • Gemini
  • Perplexity

These competitors were chosen by a combination of fame, appearances in the news and referrals at work. Five seemed like a nice round number.

Round 1: Presentation

Let's start with a simple prompt, and see how they do:

Simulate a multiple choice test on AWS services, asking me one question at a time and waiting for my response, keep going until I write "STOP".
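As an aside, the protocol this prompt describes (one question at a time, wait for an answer, stop on "STOP") is simple enough to sketch in plain Python. This is just an illustration of the quiz loop, with a hard-coded question bank standing in for the LLM:

```python
# A minimal sketch of the quiz protocol from the prompt above:
# ask one multiple-choice question at a time, wait for a response,
# and keep going until the user writes "STOP". The question bank
# here is hard-coded; in the real thing, the LLM generates it.

QUESTIONS = [
    {
        "question": "Which AWS service provides object storage?",
        "choices": {"A": "Amazon EC2", "B": "Amazon S3",
                    "C": "AWS Lambda", "D": "Amazon RDS"},
        "answer": "B",
    },
    {
        "question": "Which AWS service runs containers without managing servers?",
        "choices": {"A": "Amazon EC2", "B": "Amazon S3",
                    "C": "AWS Fargate", "D": "Amazon Route 53"},
        "answer": "C",
    },
]

def grade(question, response):
    """Return feedback for one answer, or None if the user wrote STOP."""
    response = response.strip().upper()
    if response == "STOP":
        return None
    if response == question["answer"]:
        return "Correct!"
    return f"Incorrect - the answer is {question['answer']}."

def run_quiz(questions, ask=input, show=print):
    """Present questions one at a time until the bank runs out or the user stops."""
    for q in questions:
        show(q["question"])
        for letter, choice in q["choices"].items():
            show(f"  {letter}. {choice}")
        feedback = grade(q, ask("Your answer: "))
        if feedback is None:
            break
        show(feedback)
```

The `ask` and `show` parameters default to `input` and `print` for interactive use, but can be swapped out for testing.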

ChatGPT came out of the gate with a plain but effective interface:

Image description

Claude was much the same:

Image description

Copilot offered a few suggestions:

Image description

Gemini broke the mould with a link to a blog where it found the material for the question, which I appreciated:

Image description

And Perplexity tried to outdo both Copilot and Gemini, providing suggestions at the bottom and links to several websites where it found the material for the question.

Image description

So who won the round? Well, it's a matter of taste, but I found the simplicity of ChatGPT and Claude preferable here. I was looking for a quiz; this is not the time to look up extra suggested information, à la Copilot and Perplexity. And Perplexity gets a call-out for crowding the screen.

Round 2: Correctness

Let's see what happens if we pick the right answer.

Oh, that's nice! ChatGPT displays a green tick. Intuitive.

Image description

Claude makes you work for it, with a rather bland response.

Image description

Copilot does the same, with its usual suggestions thrown in.

Image description

Gemini keeps it simple but effective. No green tick but clearly correct.

Image description

And Perplexity rounds off the field with a wall of text.

Image description

Round 3: The Scent of Failure

Let's pick the wrong answer this time.

ChatGPT makes you feel the burn with a nice red cross.

Image description

Claude is as reserved with its wrong answers as with its right answers.

Image description

Copilot is equally uninspiring.

Image description

After a good showing in round 2, Gemini drops the ball with an uninspiring response.

Image description

Perplexity. Another wall.

Image description

Intermission

What's the state of play so far?
Well, it's clear all the competitors can simulate a quiz quite effectively, though some pad it with extra baggage. Whether that's a hindrance to the user being tested is subjective.

But did you notice something else?

In the first round, ChatGPT and Copilot asked the same question. As did Claude and Gemini. Only Perplexity went it alone.
Curious.

In the second round, ChatGPT and Claude conspired to ask the same question. Copilot was in a similar vein to those two, but apparently didn't get the memo and went with ECS instead of Fargate.
Gemini teamed up with Perplexity to share a question.

In the third round, Claude, Gemini and Perplexity all shared AWS Lambda as the answer, whereas ChatGPT and Copilot each chose a unique answer.

Why the similarities? Surely, with the entirety of the internet (or whatever datasets these LLMs are trained on), there should be so much material that repeatedly picking the same questions by chance would be statistically unlikely.
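To put a rough number on "statistically unlikely": under the (strong) assumption that each model samples uniformly and independently from the same pool of candidate questions, the chance of any overlap among five models is a birthday-problem calculation:

```python
from math import prod

def p_collision(pool_size, n_models=5):
    """Probability that at least two of n_models pick the same question,
    assuming each samples uniformly and independently from a pool of
    pool_size equally likely questions (a birthday-problem estimate)."""
    p_all_distinct = prod((pool_size - k) / pool_size for k in range(n_models))
    return 1 - p_all_distinct

# Even a modest pool makes repeats rare:
# p_collision(1000) ≈ 0.01, p_collision(100) ≈ 0.1
```

With a pool of even 1,000 plausible questions, an overlap among five models should show up only about 1% of the time, so seeing repeats round after round suggests the models are drawing from a much narrower effective pool than that.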

There's something interesting going on here, but I'll look into that another time.

Rankings

So, who is the winner? Well, we are.
In addition to all our existing self-education resources, we now have an indefatigable teacher who can quiz you on any subject, at any time, and in any way you want.
In my experiments, one of the LLMs even asked if I would like to also be asked questions that required writing out answers!
This is a tool like no other. And it continues to surprise me.

Which LLM you pick is up to you. They seem fairly equal.
I have preferences on brevity of presentation; Claude seems slower to respond than the others; and I find Copilot's numbered questions less appealing than the letters all the others use. But these are trivial and subjective things.

I don't have any analysis of how accurate they are on the subject of AWS. I've been testing myself for a while and have found them spot on so far. Obviously, this will vary by subject, and you should always take LLM answers with a grain of salt.

The important thing is to get out there and educate yourself, in whatever takes your fancy. Use LLMs as the tools they are, and when they get it wrong, correct them.
That'll make them even better tools for all of us.

Get learning!

Top comments (2)

Dotallio

I've been using LLMs for study too, and it's wild how much more flexible and fun prep can be now.

Damian R

I'd love to hear any suggestions you have on ways to use it. I've got a few more ideas I'm going to try. If anything comes of it, I'll write them up.