Personally, I am not a big fan of Wikipedia. When I was studying vector mathematics for my Masters of Machine Learning degree I tried to use Wikipedia to understand some of the key concepts. It was like walking into a buzzsaw. My head certainly felt like it had been split open.
When the first Encyclopaedia Britannica was written, it was designed for lay people to look up and understand topics they otherwise wouldn’t be able to access.
My issue with Wikipedia, especially for technical topics, is that it is often edited by academics and individuals who want to show off their intellectual prowess. It makes reading the articles equivalent in difficulty to reading academic papers. The complexity can often be overwhelming.
That gave me the idea to write a script that would simplify Wikipedia articles.
That inspired another thought: what if you could summarise all of the knowledge on Wikipedia so that children could understand it? That became the concept for WikiKids: making the vast array of information on Wikipedia available to children by using OpenAI's natural language processing.
In this article, we’ll look at the beginnings of WikiKids with a Ruby code snippet that uses two powerful libraries, OpenAI and wikipedia-client, to retrieve and summarise a Wikipedia page’s introduction.
The OpenAI library (the ruby-openai gem) is a Ruby client that gives developers access to the OpenAI platform’s capabilities via a simple API. The models behind that API are trained with deep learning to process, understand, and generate human language.
The wikipedia-client gem is a Ruby library that allows developers to easily access the vast resources of Wikipedia. It provides a simple and easy-to-use API to retrieve information directly from Wikipedia.
Prerequisites
In order to follow along with this code you need an intermediate understanding of Ruby.
You should also have a basic understanding of the OpenAI API. I have written a beginner's guide to the OpenAI API in the article below.
The Code
Goal: To write a Ruby script which returns the introduction section from a Wikipedia page and summarises it in a way a 10-year-old can understand.
In this example we will use the AI to summarise the introduction on Obstetrics.
Wikipedia Introduction (Obstetrics):
Obstetrics is the field of study concentrated on pregnancy,
childbirth and the postpartum period. As a medical specialty,
obstetrics is combined with gynecology under the discipline
known as obstetrics and gynecology (OB/GYN), which is a
surgical field.
Begin by installing the two gems.
gem install ruby-openai
gem install wikipedia-client
You will need an OpenAI account and an API key. At this writing you can sign up for a free account at https://openai.com/api/.
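Pasting the key straight into a script makes it easy to leak through version control. As a minimal sketch, assuming you have stored the key in an environment variable called OPENAI_API_KEY (the variable name is my assumption, and fetch_api_key is a hypothetical helper, not part of either gem), you could read it like this:

```ruby
# Hypothetical helper: read the API key from the environment instead of
# hard-coding it. The variable name OPENAI_API_KEY is an assumed convention.
def fetch_api_key(env = ENV)
  key = env['OPENAI_API_KEY'].to_s.strip
  raise ArgumentError, 'Set the OPENAI_API_KEY environment variable first' if key.empty?

  key
end
```

The result can then be passed to the client in place of the literal string, for example OpenAI::Client.new(access_token: fetch_api_key).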
require 'ruby/openai'
require 'wikipedia' # the wikipedia-client gem is loaded as 'wikipedia'

# Create the OpenAI client with your API key.
client = OpenAI::Client.new(access_token: 'YOUR_API_KEY')

# Fetch the page and take its introduction.
wiki_page = 'Obstetrics'
page = Wikipedia.find(wiki_page)
introduction = page.summary

prompt = "Summarize the following Wikipedia introduction text so a 10-year-old can understand it: \n#{introduction}"

# Ask the model for a completion.
response = client.completions(
  parameters: {
    model: "text-davinci-003",
    prompt: prompt,
    temperature: 0.5,
    max_tokens: 2000
  }
)

# Print the generated summary without leading whitespace.
puts response['choices'][0]['text'].lstrip
After a few seconds the AI provides its response.
AI Response:
Obstetrics is a medical field that focuses on the care of
pregnant women, the process of giving birth and what happens
afterwards. It is linked to gynecology, which is the study of
women's health.
Compare the AI output to the Wikipedia entry from earlier. I think it has done a nice job simplifying the content.
What the code is doing
Using your API key, the Ruby script creates an instance of the OpenAI client. It looks up the Wikipedia page on “Obstetrics” using the wikipedia-client library. The page.summary method returns the article’s introduction, which is the section at the beginning of a Wikipedia page.
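If the title does not match an article, there may be no introduction to summarise. The sketch below guards against that case. It assumes the page object responds to summary and that a missing article yields nil or an empty string; introduction_for is a hypothetical helper name, not part of the gem.

```ruby
# Hypothetical guard around page.summary. Assumes the page object responds
# to #summary and that a missing article yields nil or an empty string.
def introduction_for(page, title)
  text = page.summary.to_s.strip
  raise ArgumentError, "No introduction found for '#{title}'" if text.empty?

  text
end
```

In the script this would wrap the lookup, for example introduction_for(Wikipedia.find('Obstetrics'), 'Obstetrics'), so an empty prompt is never sent to the model.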
The code then creates a prompt. These are the instructions that will be sent to the OpenAI model. The prompt is the most important part of the code, as it is effectively our instruction to the AI model. Prompt engineering (creating prompts that return the required results) is an important topic. There is an art to phrasing prompts in a way that causes the AI to provide us with the responses we want.
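Because the wording of the prompt matters so much, it can help to build it in one place so it is easy to adjust. Here is a small sketch that reuses the prompt text from the script and makes the reading age a parameter; summary_prompt and its age keyword are hypothetical names of my own.

```ruby
# Hypothetical helper: build the summarisation prompt for a given reading age.
# The wording is the same as the prompt used in the script.
def summary_prompt(introduction, age: 10)
  "Summarize the following Wikipedia introduction text so a " \
    "#{age}-year-old can understand it: \n#{introduction.strip}"
end
```

Calling summary_prompt(introduction, age: 8) would ask for a slightly simpler summary without touching the rest of the script.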
The code then sends this prompt to the OpenAI API and asks for a response, using the completions method of the client object. Three parameters are passed to the method along with the prompt. The first is the model to use: text-davinci-003 was the latest and most advanced model at the time of writing. The second is the temperature, a number that represents the ‘creativeness’ of the response. The API accepts values from 0 to 2, though values between 0 and 1 are most common, and you can adjust it to manage the AI's creativity in responding. The third is max_tokens, the maximum number of tokens the model may use in its response. A token is roughly 4 characters of English text.
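The four-characters-per-token figure gives a quick way to check that a long introduction still leaves room for the reply. The sketch below is an estimate only: the 4.0 ratio is a rule of thumb for English text, 4,097 tokens is the context window I am assuming for text-davinci-003, and both helper names are hypothetical.

```ruby
# Rough token estimate: about 4 characters per token (rule of thumb only).
def estimated_tokens(text, chars_per_token: 4.0)
  (text.length / chars_per_token).ceil
end

# Largest max_tokens value that still leaves room for the prompt.
# context_window defaults to 4097, an assumed limit for text-davinci-003.
def safe_max_tokens(prompt, requested: 2000, context_window: 4097)
  available = context_window - estimated_tokens(prompt)
  [[requested, available].min, 0].max
end
```

For a typical short introduction this simply returns the requested 2000, but for a very long article it shrinks the value rather than letting the request exceed the model's limit.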
Finally, the script writes the AI output to the console, which is a summary of the Wikipedia introduction about Obstetrics that a 10-year-old can understand.
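The last line of the script assumes the request succeeded. As a hedged sketch, assuming the response behaves like a Hash with string keys and that a failed request carries an 'error' entry instead of 'choices', a more defensive version might look like this (completion_text is a hypothetical helper):

```ruby
# Hypothetical helper: pull the completion text out of a parsed response.
# Assumes the response behaves like a Hash with string keys, and that API
# failures arrive as an 'error' entry instead of 'choices'.
def completion_text(response)
  raise "OpenAI API error: #{response.dig('error', 'message')}" if response['error']

  text = response.dig('choices', 0, 'text')
  raise 'The response contained no completion text' if text.nil?

  text.strip
end
```

With this in place the final line becomes puts completion_text(response), and a bad API key produces a readable message rather than a NoMethodError on nil.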
WikiKids
This script is just the beginning. I foresee a website where kids can visit any Wikipedia page and the AI model renders the simplified content in real time.
Imagine giving kids the ability to ask the AI questions about the content and get meaningful answers.
I modified the prompt to respond to children's questions:
prompt = "Provide a very friendly and encouraging response to
the following question that a 10-year-old can
understand: #{question}"
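For questions about a specific page, one option is to send the article introduction along with the question. This is only a sketch of that idea: the introduction keyword and the "Background from the article" wording are my own assumptions rather than something the script above does, and question_prompt is a hypothetical helper.

```ruby
# Hypothetical helper: wrap a child's question in the friendly prompt and,
# optionally, append the article introduction as background.
def question_prompt(question, introduction: nil, age: 10)
  prompt = "Provide a very friendly and encouraging response to the " \
           "following question that a #{age}-year-old can understand: " \
           "#{question.strip}"
  return prompt if introduction.nil? || introduction.strip.empty?

  "#{prompt}\n\nBackground from the article:\n#{introduction.strip}"
end
```

Without an introduction the helper produces the same prompt as above, so it can be dropped in without changing existing behaviour.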
Here are some real questions and answers from the AI model. Each prompt is modified so the AI gives an answer a 10-year-old could understand.
Question: What should I study at school to become an obstetrics doctor?
AI Response:
To become an obstetric doctor, you should study science
subjects like biology and chemistry in high school. Then you
would have to go to college and get a degree in a related
field like pre-medicine or biology. Then you would go to
medical school to learn how to become a doctor and after that,
you would complete a special training program called a
residency in obstetrics. It might seem like a lot of work but
remember, you're not alone, you can do it and you will have
lot of people to help you along the way.
Question: I don’t understand what a Gynaecologist is?
AI Response:
A gynaecologist is a doctor who helps women with their
reproductive health. It's totally normal not to know what it
is, there are many different types of doctors and it can be
hard to keep track of them all. But don't worry, if you're
interested in becoming a gynaecologist, you have plenty of
time to learn more about it and see if it's something you're
interested in pursuing.
Educational Impacts
I teach Finance at Deakin University in Australia. One problem I have always wanted to solve is how to reach and support every student regardless of their ability to study.
With AI we could provide educational content that can be modified and adjusted to an individual student's needs. We are no longer constrained by the need to provide a single educational model, but rather can meet each student or child where they are and give them the means to drive their own learning at a pace that suits them.
Summary
I am very interested in creating educational solutions utilising AI. If you would like to help me build WikiKids, please reach out. I would be very interested to speak with you.
Kane Hooper is the CEO of reinteractive, the longest running dedicated Ruby on Rails development firm in the world.
You can contact Kane directly for any help with your Ruby on Rails application.