<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Olorundara Akojede (dvrvsimi)</title>
    <description>The latest articles on DEV Community by Olorundara Akojede (dvrvsimi) (@dvrvsimi).</description>
    <link>https://dev.to/dvrvsimi</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1063257%2F1e9846f3-89e6-434b-acc5-6ef48240a588.jpeg</url>
      <title>DEV Community: Olorundara Akojede (dvrvsimi)</title>
      <link>https://dev.to/dvrvsimi</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/dvrvsimi"/>
    <language>en</language>
    <item>
      <title>Building on the Edge: A How-To Guide on Object Detection with Edge Impulse</title>
      <dc:creator>Olorundara Akojede (dvrvsimi)</dc:creator>
      <pubDate>Fri, 08 Sep 2023 11:55:28 +0000</pubDate>
      <link>https://dev.to/dvrvsimi/building-on-the-edge-a-how-to-guide-on-object-detection-with-edge-impulse-58hn</link>
      <guid>https://dev.to/dvrvsimi/building-on-the-edge-a-how-to-guide-on-object-detection-with-edge-impulse-58hn</guid>
      <description>&lt;p&gt;In the fast-evolving landscape of technology, edge devices have emerged as the unsung heroes, bringing innovation closer to humans than ever before. From smart homes that make living easier to  self-driving cars, these gadgets have redefined convenience and connectivity in ways that we could have never conceived.&lt;/p&gt;

&lt;p&gt;At the heart of this transformation lies object detection - a dynamic field of machine learning that equips machines with the ability to understand and infer from their visual surroundings.&lt;/p&gt;

&lt;p&gt;This article talks about how you can build your own object detection model - a simple face mask detection project on Edge Impulse.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prerequisite Knowledge:&lt;/strong&gt;&lt;br&gt;
Readers should have basic knowledge of machine learning concepts. Some understanding of IoT is an added advantage.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Requirements:&lt;/strong&gt;&lt;br&gt;
You would need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;an Edge Impulse account&lt;/li&gt;
&lt;li&gt;a Computer&lt;/li&gt;
&lt;li&gt;a &lt;a href="https://www.kaggle.com/" rel="noopener noreferrer"&gt;Kaggle&lt;/a&gt; account (optional)&lt;/li&gt;
&lt;li&gt;a Phone (optional)&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Introduction to Edge Computing
&lt;/h2&gt;

&lt;p&gt;As you may already know, edge computing brings machine learning capabilities directly to edge devices, enabling real-time processing and decision-making without relying on the cloud. This means that, in most cases, your data stays with you. Unlike the traditional machine learning cycle, building for edge devices requires more iterative processes before models can be optimally deployed. Certain trade-offs have to be made depending on the model's intended use, and it is usually a contest between the model's size and compute time versus its performance.&lt;/p&gt;

&lt;p&gt;Various optimization techniques are adopted for edge deployment and for compressing learning algorithms in general. Some of them include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Quantization: reduces the precision of the model's parameters, typically from 32-bit floating point values to 8-bit integer (int8) values, with minimal impact on accuracy.&lt;/li&gt;
&lt;li&gt;Pruning: reduces the size of a model by removing non-critical and redundant sections (classically, branches of a decision tree; in neural networks, weights or neurons).&lt;/li&gt;
&lt;li&gt;Distillation: transfers knowledge from a large model (teacher) to a smaller one (student).&lt;/li&gt;
&lt;li&gt;Decomposition: uses smaller matrices or vectors in place of larger matrices while still retaining as much information as the original matrix.&lt;/li&gt;
&lt;/ul&gt;
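&lt;p&gt;As a rough illustration of the first technique, here is a minimal sketch of affine quantization of float32 weights to int8. This is illustrative only - the function names and weight values are made up, and it is not Edge Impulse's implementation:&lt;/p&gt;

```python
# Toy post-training quantization: map float weights onto the int8 range.
def quantize(weights, num_bits=8):
    qmax = 2 ** (num_bits - 1) - 1                 # 127 for int8
    scale = max(abs(w) for w in weights) / qmax    # one scale for the whole tensor
    q = [int(round(w / scale)) for w in weights]
    return [max(-qmax, min(qmax, v)) for v in q], scale  # clamp to range

def dequantize(q, scale):
    # recover an approximation of the original weights
    return [v * scale for v in q]

weights = [0.82, -0.41, 0.05, -0.99]
q, scale = quantize(weights)
approx = dequantize(q, scale)
```

&lt;p&gt;Each 32-bit float is stored as a single signed byte plus one shared scale factor - roughly a 4x size reduction, at the cost of the small rounding error that &lt;code&gt;dequantize&lt;/code&gt; exposes.&lt;/p&gt;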

&lt;p&gt;See this &lt;a href="https://vitalflux.com/model-compression-techniques-machine-learning/" rel="noopener noreferrer"&gt;article&lt;/a&gt; for an extensive explanation of the above concepts. You can also check out this &lt;a href="https://docs.google.com/presentation/d/1S-ZIHDx9WzdQFYBJuQIuuCc5LzM9J6pQ/edit#slide=id.p19" rel="noopener noreferrer"&gt;presentation&lt;/a&gt; by &lt;a href="https://twitter.com/Iam_Dwiiight" rel="noopener noreferrer"&gt;Kenechi&lt;/a&gt;; it simplifies the techniques listed above and covers a couple of other relevant points.&lt;/p&gt;
&lt;h2&gt;
  
  
  What is Edge Impulse?
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://edgeimpulse.com/" rel="noopener noreferrer"&gt;Edge Impulse&lt;/a&gt; is a leading dev platform for machine learning on edge devices. It allows you to carry out everything from data collection to deployment and everything in between.&lt;br&gt;
Whether you are  dev looking to build models for deployment on your development board or an enterprise looking to deploy accurate AI solutions faster on literally any edge device, Edge Impulse has got you covered!&lt;/p&gt;

&lt;p&gt;You can get around the platform easily and build projects in simple steps - build your dataset, train your model, test your model, and deploy it!&lt;br&gt;
If you have done some form of ML before, these steps should be familiar to you.&lt;/p&gt;
&lt;h2&gt;
  
  
  Getting Started with Edge Impulse
&lt;/h2&gt;

&lt;p&gt;Head over to &lt;a href="https://edgeimpulse.com" rel="noopener noreferrer"&gt;https://edgeimpulse.com&lt;/a&gt; and create a free Developer account, or choose the Enterprise account if you intend to use your account for business purposes.&lt;/p&gt;

&lt;p&gt;Next, click the &lt;strong&gt;+ Create new project&lt;/strong&gt; button from the dropdown on the top-right of the screen and name your project.&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyhzr15s6dcqpc1bs7ohr.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyhzr15s6dcqpc1bs7ohr.jpg" alt="edge impulse - create new project"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Acquiring Data
&lt;/h2&gt;

&lt;p&gt;To build a model that can tell when it sees a face mask, you need to show it images of a variety of face masks. Whether you have a face mask and would like to create your own dataset from scratch, or you would rather use a public dataset to train your model, Edge Impulse provides data acquisition channels for both options.&lt;/p&gt;

&lt;p&gt;Scroll down on the Dashboard page and click on the channel you'd like to use. Depending on the application, collecting data through both channels can improve your model's accuracy.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy56qpz6b2nte1rfrh372.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy56qpz6b2nte1rfrh372.png" alt="edge impulse - dataset"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When adding existing data, you can either upload the data (as single files or as a folder) or add a storage bucket. You can find and download face mask datasets &lt;a href="https://www.kaggle.com/search?q=face+mask+dataset" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;To collect your own data, you have three options: your computer, your phone, and/or a development board. To use your phone, simply scan the QR code to load the data acquisition page on your mobile device; you can then collect data with your phone's camera.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Note that you can only use your back camera.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;See the list of Edge Impulse &lt;a href="https://docs.edgeimpulse.com/docs/development-platforms/fully-supported-development-boards" rel="noopener noreferrer"&gt;supported dev boards&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Preparing Data
&lt;/h2&gt;

&lt;p&gt;Once you start uploading or collecting data, you will see a pop-up asking if you want to build an object detection project. You can also scroll down to &lt;strong&gt;Project info&lt;/strong&gt; on your Dashboard and change the Labeling method to &lt;strong&gt;Bounding boxes (object detection)&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fihzs60rxonedoq1lx9y9.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fihzs60rxonedoq1lx9y9.jpg" alt="edge impulse - bounding box"&gt;&lt;/a&gt;&lt;br&gt;
After changing the Labeling method, you will notice a new section called &lt;strong&gt;Labeling queue&lt;/strong&gt; with a number in parentheses; that number represents the number of images collected or uploaded.&lt;/p&gt;

&lt;p&gt;Just like every drag and release operation you've done with your mouse/trackpad, select the region that contains the face mask and click on &lt;strong&gt;Save labels&lt;/strong&gt; to save that entry. Repeat this process for all your samples.&lt;/p&gt;

&lt;p&gt;If you decide that you don't want to train your model with some samples, simply click on &lt;strong&gt;Delete sample&lt;/strong&gt; to remove them as you go through your dataset.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh0icgezl9mr67ck6hqt1.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh0icgezl9mr67ck6hqt1.jpg" alt="labeling"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Still on the Labeling queue page, you will notice a dropdown menu called &lt;strong&gt;Label suggestions&lt;/strong&gt;. If you were detecting common objects like cars, animals, or even faces, you could switch to YOLOv5, which saves you the stress of having to draw bounding boxes manually. &lt;strong&gt;Track objects between frames&lt;/strong&gt; works best if you collected the series of data in the same session; it finds patterns and predicts where the bounding box should be in the next sample.&lt;/p&gt;
&lt;h2&gt;
  
  
  Creating an Impulse
&lt;/h2&gt;

&lt;p&gt;On the left menu, click on &lt;strong&gt;Impulse design&lt;/strong&gt;; this is where you create the pipeline from input to processing to output.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7t2pyrj2kpbi99we6j14.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7t2pyrj2kpbi99we6j14.jpg" alt="creating an impulse"&gt;&lt;/a&gt;&lt;br&gt;
Click &lt;strong&gt;Add an input block&lt;/strong&gt; and add &lt;strong&gt;Images&lt;/strong&gt;; you can change the Resize mode however you see fit. For &lt;strong&gt;Add a processing block&lt;/strong&gt;, add &lt;strong&gt;Image&lt;/strong&gt;. Finally, add the &lt;strong&gt;Object detection (Images)&lt;/strong&gt; learning block authored by Edge Impulse, and be sure to tick the Image box. Click &lt;strong&gt;Save Impulse&lt;/strong&gt; to save your configuration.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffh6ez4z1cby5zpcyxo2m.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffh6ez4z1cby5zpcyxo2m.jpg" alt="saving impulse"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Generating Features
&lt;/h2&gt;

&lt;p&gt;For each block you add, a new option is created under Impulse design. Click &lt;strong&gt;Image&lt;/strong&gt; to explore how you can generate features from the samples.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;It is advisable to change your colour depth from RGB to Grayscale; it significantly improves your model accuracy if you decide to use FOMO, which is covered in a later section of this article.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frsie5rpzmfhg3zs8tkk5.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frsie5rpzmfhg3zs8tkk5.jpg" alt="saving parameters"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Click on &lt;strong&gt;Save parameters&lt;/strong&gt; to go to the Generate features section and click &lt;strong&gt;Generate features&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx56cmiyzjj933a884uaa.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx56cmiyzjj933a884uaa.jpg" alt="features generation"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once you see the &lt;code&gt;Job completed&lt;/code&gt; status, hover back to the left menu and click on &lt;strong&gt;Object detection&lt;/strong&gt;. It's finally time to train your model!&lt;/p&gt;
&lt;h2&gt;
  
  
  Training your model
&lt;/h2&gt;

&lt;p&gt;Tweak the hyperparameters to your preference and choose your model.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkxmdzgsqlv8h09zcne3z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkxmdzgsqlv8h09zcne3z.png" alt="hyperparams"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As for model selection, you can see a mini info card for each model when you click on &lt;strong&gt;Choose a different model&lt;/strong&gt;. MobileNetV2 should do just fine for this use case, but if you read through the info cards, you'd deduce that FOMO is a better option.&lt;/p&gt;
&lt;h3&gt;
  
  
  Fear Of Missing Out?
&lt;/h3&gt;

&lt;p&gt;Anyone who hears or reads it for the first time would naturally think the same, but &lt;strong&gt;FOMO&lt;/strong&gt; (Faster Objects, More Objects) is a novel solution developed at Edge Impulse for more optimal object detection on highly constrained devices.&lt;/p&gt;

&lt;p&gt;Apart from its negligible size, it is also very fast and can work on a wide range of boards. Check the FOMO documentation &lt;a href="https://docs.edgeimpulse.com/docs/edge-impulse-studio/learning-blocks/object-detection/fomo-object-detection-for-constrained-devices" rel="noopener noreferrer"&gt;here&lt;/a&gt; to learn more and see how people are using it in interesting ways.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;To use FOMO, ensure that you set the learning rate to 0.001 before training your model.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;When you have adjusted the hyperparameters and selected the model of your choice, click on &lt;strong&gt;Start training&lt;/strong&gt; and wait for training to complete.&lt;/p&gt;
&lt;h2&gt;
  
  
  Evaluating your model
&lt;/h2&gt;

&lt;p&gt;After training, you should see an interface like the one below; it is a summary of the model's performance on the training data. Let's evaluate the model against new data.&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fycbtbt0dlirwhv12xn5c.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fycbtbt0dlirwhv12xn5c.jpg" alt="performance"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Notice how you can switch the model's version from int8 to float32.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;On the left menu, click on &lt;strong&gt;Model testing&lt;/strong&gt; to evaluate your model with your test data (remember that if you didn't specify a train-to-test ratio, Edge Impulse automatically splits the data 80:20). Click on &lt;strong&gt;Classify all&lt;/strong&gt; and wait for the result.&lt;/p&gt;
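&lt;p&gt;If you are curious what that automatic split looks like, a shuffled 80:20 train/test split can be sketched like this (a hypothetical illustration, not Edge Impulse's actual code):&lt;/p&gt;

```python
import random

# Illustrative 80:20 train/test split with a fixed seed for reproducibility.
def split(samples, train_frac=0.8, seed=42):
    rng = random.Random(seed)
    shuffled = samples[:]          # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]

train, test = split(list(range(100)))
```

&lt;p&gt;Shuffling before cutting matters: it keeps samples collected at different times from all landing on the same side of the split.&lt;/p&gt;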

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyw81yk5a5ot0ea4zyh0d.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyw81yk5a5ot0ea4zyh0d.jpg" alt="test data"&gt;&lt;/a&gt;&lt;br&gt;
If you used a diverse dataset, your number should be higher than the one in the image above - diverse in the sense that your model should learn from face mask images in different colours, lighting conditions, backgrounds, and angles. You probably also want to include people with different skin colours so you don't end up with a biased model.&lt;/p&gt;
&lt;h2&gt;
  
  
  Inferencing
&lt;/h2&gt;

&lt;p&gt;Now you can perform face mask detection in real time. Go back to your Dashboard and click on the button that says &lt;strong&gt;Launch in browser&lt;/strong&gt;, just below the QR code. Ensure that your computer has a built-in or external webcam; if it doesn't, you can scan the QR code with your phone camera and switch to &lt;strong&gt;Classification mode&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;A mini web app for inferencing will open in a new tab. When the build is complete, you will be asked to grant access to your webcam. Happy inferencing!&lt;/p&gt;
&lt;h2&gt;
  
  
  What next?
&lt;/h2&gt;

&lt;p&gt;You have learned how to build a simple object detection model on Edge Impulse but it gets more interesting when you start integrating hardware.&lt;/p&gt;

&lt;p&gt;A good application of this project would be an automatic door system that won't open for a person who isn't wearing a face mask. On an Arduino board, the .ino code* would be something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="n"&gt;softmax_threshold&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.6&lt;/span&gt; &lt;span class="c1"&gt;// depending on your model performance&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;softmax_prediction&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;softmax_threshold&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Serial&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;println&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Welcome"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;digitalWrite&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;door_gear&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;HIGH&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Serial&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;println&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"No entry without face mask"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;digitalWrite&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;door_gear&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;LOW&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;*this is nowhere near complete or accurate and is just meant to give you an idea. A complete sketch would include the &lt;code&gt;#include&lt;/code&gt; directives for the required drivers and &lt;code&gt;#define&lt;/code&gt; statements for the individual hardware pin configurations, but that is outside the scope of this article.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Continue exploring Edge Impulse's capabilities by building more advanced models and trying your hand at hardware; the documentation is all there for you.&lt;/p&gt;

&lt;p&gt;You can also explore other machine learning applications for edge computing.&lt;/p&gt;

&lt;p&gt;Until next time, Tschüss!&lt;/p&gt;

</description>
      <category>edgecomputing</category>
      <category>objectdetection</category>
      <category>howtoguide</category>
      <category>edgeimpulse</category>
    </item>
    <item>
      <title>The Magic of Attention: How Transformers Improved Generative AI</title>
      <dc:creator>Olorundara Akojede (dvrvsimi)</dc:creator>
      <pubDate>Wed, 19 Jul 2023 07:20:41 +0000</pubDate>
      <link>https://dev.to/dvrvsimi/the-magic-of-attention-how-transformers-improved-generative-ai-1h3c</link>
      <guid>https://dev.to/dvrvsimi/the-magic-of-attention-how-transformers-improved-generative-ai-1h3c</guid>
      <description>&lt;h3&gt;
  
  
  Table of Contents.
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Preamble&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Encoder-Decoder Architecture&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The Encoder Network&lt;/li&gt;
&lt;li&gt;The Decoder Network&lt;/li&gt;
&lt;li&gt;Training on Encoder-Decoder Architecture&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Limitations of the Traditional Encoder-Decoder.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;"Attention is all you need"&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Overview of Attention Mechanism&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Conclusion&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Preamble.
&lt;/h2&gt;

&lt;p&gt;Generative AI is the new buzzword in the world of AI. Big enterprises are looking to incorporate generative features into their solutions, and AI engineers are working now more than ever to train models that are taking strides once inconceivable to the human mind when it comes to generating content.&lt;/p&gt;

&lt;p&gt;Watch this video of Sundar Pichai (Chief Executive Officer at Google), it compiles all the times he said "AI" and "generative AI" during his keynote speech at Google IO, 2023: &lt;iframe class="tweet-embed" id="tweet-1669177584175677443-177" src="https://platform.twitter.com/embed/Tweet.html?id=1669177584175677443"&gt;
&lt;/iframe&gt;




&lt;/p&gt;

&lt;p&gt;Generative AI refers to a branch of artificial intelligence that focuses on generating new content based on patterns and examples from existing data. This content may take the form of a captivating story in text, an image of landscape scenery from the Paleolithic era, or even audio of what Mozart would sound like in a different genre like jazz.&lt;/p&gt;

&lt;p&gt;Generative AI involves training a model on large datasets with suitable algorithms, enabling it to produce near-original content that expands on the patterns it has learned. In this article, I will talk about the technologies on which generative AI models are built and how transformers have improved generative AI over the years, so stay glued!&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisite.
&lt;/h2&gt;

&lt;p&gt;Readers should have a good understanding of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Machine learning and&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Artificial intelligence.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Encoder-Decoder Architecture.
&lt;/h2&gt;

&lt;p&gt;To properly communicate with AI models, it is important to make them understand the information being conveyed, and regular human languages do not suffice. This is why the encoder-decoder architecture was developed: it is a neural network sequence-to-sequence architecture designed for machine translation, text summarization, question-answering, and other machine learning use cases.&lt;/p&gt;

&lt;p&gt;Just as its nomenclature suggests, it has two networks - the encoder and the decoder. These networks serve as the final gateways for input-output (I/O) operations in the model.&lt;/p&gt;

&lt;p&gt;At the encoder, an input sequence in natural language is converted into its corresponding vector representation. This vector representation attempts to capture all the relevant bits from the input sequence (or prompt).&lt;/p&gt;

&lt;p&gt;This vector representation is then fed into the decoder network which generates an output after a series of internal processes.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi222o8xreykynb3a9mi1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi222o8xreykynb3a9mi1.png" alt="a visual representation of the Encoder-Decoder architecture" width="800" height="555"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  The Encoder Network.
&lt;/h3&gt;

&lt;p&gt;For an encoder with a &lt;a href="https://medium.com/r/?url=https%3A%2F%2Fwww.geeksforgeeks.org%2Fintroduction-to-recurrent-neural-network%2F" rel="noopener noreferrer"&gt;Recurrent Neural Network (RNN)&lt;/a&gt; internal architecture, each token in an input sequence like &lt;strong&gt;&lt;em&gt;The man is going to the bank&lt;/em&gt;&lt;/strong&gt; must first be tokenized. This &lt;a href="https://www.mygreatlearning.com/blog/tokenization/#:~:text=Tokenisation%20is%20the%20process%20of,like%20parsing%20and%20text%20mining." rel="noopener noreferrer"&gt;tokenization&lt;/a&gt; process converts the natural language into understandable sets of bits that the model can process, and it recurs until the input sequence at the encoder has been completely tokenized.&lt;/p&gt;

&lt;p&gt;In many NLP tokenization schemes, each token corresponds to roughly 4 characters of text, so the example above (28 characters) would come to about 7 tokens.&lt;/p&gt;

&lt;p&gt;These tokens are then passed into an embedding layer, where they are converted into a single vector representation. The encoder passes the vector representation on to the decoder through a &lt;a href="https://en.wikipedia.org/wiki/Feedforward_neural_network" rel="noopener noreferrer"&gt;feedforward neural network&lt;/a&gt;.&lt;/p&gt;
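&lt;p&gt;The tokenization and embedding steps above can be sketched as follows - a toy word-level tokenizer, with made-up random vectors standing in for a learned embedding layer:&lt;/p&gt;

```python
import random

sentence = "The man is going to the bank"

vocab = {}
def tokenize(text):
    # assign each new word the next free integer id
    ids = []
    for word in text.lower().split():
        if word not in vocab:
            vocab[word] = len(vocab)
        ids.append(vocab[word])
    return ids

def embed(token_ids, dim=4, seed=0):
    # one fixed random vector per token id (a stand-in for a trained layer)
    rng = random.Random(seed)
    table = {i: [rng.random() for _ in range(dim)] for i in sorted(set(token_ids))}
    return [table[i] for i in token_ids]

ids = tokenize(sentence)       # "the" maps to the same id both times
vectors = embed(ids)
```

&lt;p&gt;Note that both occurrences of "the" share one id and therefore one vector; a real encoder would learn these vectors during training rather than draw them at random.&lt;/p&gt;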

&lt;h3&gt;
  
  
  The Decoder Network.
&lt;/h3&gt;

&lt;p&gt;The encoder and decoder can be built on different architectures and more complex blocks but cases of the same architecture are not unlikely.&lt;/p&gt;

&lt;p&gt;The decoder has its own set of input sequences, which are also tokenized and embedded. Introducing this sequence of tokens to the decoder triggers it to attempt a prediction of the next token based on the contextual understanding provided by the encoder; the first prediction is output through a softmax layer.&lt;/p&gt;

&lt;p&gt;After the first token is generated, the decoder repeats this prediction process until there are no more tokens left to predict. Special &lt;code&gt;&amp;lt;start&amp;gt;&lt;/code&gt; and &lt;code&gt;&amp;lt;end&amp;gt;&lt;/code&gt; tokens mark the beginning and end of the generated sequence.&lt;/p&gt;

&lt;p&gt;The final sequence of tokens is detokenized back into natural language. In a language translation use case, the output generated would be: &lt;strong&gt;&lt;em&gt;Der Mann geht zur Bank&lt;/em&gt;&lt;/strong&gt; for a German target language.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffqdhyp7c65onnt1lexxa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffqdhyp7c65onnt1lexxa.png" alt="encoder-decoder architecture showing the important blocks" width="800" height="682"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Training on Encoder-Decoder Architecture.
&lt;/h3&gt;

&lt;p&gt;Training on an encoder-decoder architecture is more complicated than training regular predictive models: you need a collection of input/output pairs - for example, sentences and their reference translations - for the model to imitate.&lt;/p&gt;

&lt;p&gt;Likewise, the decoder is trained on the correct previously translated token rather than on what it generates itself. This technique is called &lt;a href="https://en.wikipedia.org/wiki/Teacher_forcing" rel="noopener noreferrer"&gt;teacher forcing&lt;/a&gt;, and it is good practice only when you have a credible &lt;a href="https://datascience.stackexchange.com/questions/17839/what-is-ground-truth#:~:text=This%20is%20a%20simplified%20explanation%20%3A%20Ground%20truth%20is%20a%20term%20used%20in%20statistics%20and%20machine%20learning%20that%20means%20checking%20the%20results%20of%20machine%20learning%20for%20accuracy%20against%20the%20real%20world.%20The%20term%20is%20borrowed%20from%20meteorology%2C%20where%20%22ground%20truth%22%20refers%20to%20information%20obtained%20on%20site." rel="noopener noreferrer"&gt;ground truth&lt;/a&gt;.&lt;/p&gt;
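&lt;p&gt;A minimal sketch of what teacher forcing changes during training (a hypothetical helper, using "START" and "END" in place of the special start and end tokens): at every step, the decoder's input is the ground-truth previous token, never the model's own guess.&lt;/p&gt;

```python
# Build (decoder input, expected output) pairs under teacher forcing.
def teacher_forced_pairs(target, start_token="START"):
    pairs = []
    prev = start_token
    for tok in target:
        pairs.append((prev, tok))  # the input is always the ground truth
        prev = tok                 # advance regardless of any model prediction
    return pairs

target = ["Der", "Mann", "geht", "zur", "Bank", "END"]
pairs = teacher_forced_pairs(target)
```

&lt;p&gt;Without teacher forcing, &lt;code&gt;prev&lt;/code&gt; would be whatever the model predicted at the previous step, so one early mistake would corrupt every later training signal.&lt;/p&gt;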

&lt;p&gt;The decoder network generates the next token based on which token has the highest probability in the softmax layer. There are two common algorithms for “choosing” the next token in NLP:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Greedy search: this algorithm chooses the token with the highest conditional probability from the vocabulary as the next generated token. Take a look at the image below - can you tell what sentence the decoder generated? Note that the red saturation decreases as the probability decreases.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnhd436g77n1u8hlhhfq3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnhd436g77n1u8hlhhfq3.png" alt="an example of greedy search" width="603" height="592"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If your answer was &lt;strong&gt;&lt;em&gt;the last global war is abbreviated as WWII&lt;/em&gt;&lt;/strong&gt;, you are correct. Greedy search is easy to implement, but it does not always generate optimal results; a better approach is the beam search algorithm.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Beam search: instead of committing to the single most probable token at each step, this algorithm keeps the top few partial sequences at every step and searches for the whole sequence of tokens with the highest overall probability, so the example above would be chosen from a pool of candidate sequences rather than assembled one locally best token at a time. Beam search costs more computation than greedy search, but it generally produces more accurate output with better context.&lt;/li&gt;
&lt;/ul&gt;
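&lt;p&gt;The difference between the two algorithms can be sketched in a few lines of Python with a made-up conditional probability table (all tokens and probabilities below are hypothetical): greedy search commits to the locally best token at each step, while beam search keeps the top few partial sequences and picks the one with the highest overall probability.&lt;/p&gt;

```python
def next_probs(prev):
    """Toy conditional next-token distribution given the previous token."""
    table = {
        "^": {"the": 0.5, "a": 0.4, "an": 0.1},   # "^" marks sequence start
        "the": {"dog": 0.4, "cat": 0.3, "war": 0.3},
        "a": {"war": 0.9, "dog": 0.1},
        "an": {"apple": 1.0},
    }
    return table.get(prev, {"$": 1.0})  # "$" marks sequence end

def greedy(steps):
    """Pick the single most probable token at each step."""
    seq, prob, prev = [], 1.0, "^"
    for _ in range(steps):
        token, p = max(next_probs(prev).items(), key=lambda kv: kv[1])
        seq.append(token)
        prob *= p
        prev = token
    return seq, prob

def beam(steps, width=2):
    """Keep the `width` most probable partial sequences at each step."""
    beams = [([], 1.0)]
    for _ in range(steps):
        candidates = []
        for seq, p in beams:
            prev = seq[-1] if seq else "^"
            for token, q in next_probs(prev).items():
                candidates.append((seq + [token], p * q))
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:width]
    return beams[0]
```

&lt;p&gt;Here greedy search commits to "the" (0.5) and ends with a total probability of 0.2, while beam search discovers that starting with the lower-probability "a" (0.4) leads to a sequence with overall probability 0.36.&lt;/p&gt;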

&lt;p&gt;The encoder-decoder architecture is great because the input sequence and the generated output can be of varying lengths, which is very useful in image/video captioning as well as question-answering use cases. However, the architecture has a bottleneck that has rendered it largely obsolete over the years.&lt;/p&gt;

&lt;h2&gt;
  
  
  Limitations of the Traditional Encoder-Decoder.
&lt;/h2&gt;

&lt;p&gt;When the encoder converts an input sequence into a vector, it compresses all the contextual information into that single fixed-length vector, which becomes a problem when the input sequence is long. The encoder struggles to keep the relevant bits, and the decoder expends more time on decoding and may lose some relevant information in the process, regardless of how short the generated output is. This can make the generated output inaccurate.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7ug76jatbpj5bw1q34rv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7ug76jatbpj5bw1q34rv.png" alt="malfunctioning bot" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;How was this problem tackled without jeopardizing the context of the sequence?&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Funs3vpatsu6an7n06kd8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Funs3vpatsu6an7n06kd8.png" alt="money heist meme" width="800" height="386"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  "Attention is all you need"
&lt;/h2&gt;

&lt;p&gt;This is the title of a paper published in 2017 by Vaswani et al. This groundbreaking paper introduced the Transformer model, a novel architecture that revolutionized the field of NLP and became the foundation of the popular &lt;a href="https://en.wikipedia.org/wiki/Large_language_model" rel="noopener noreferrer"&gt;Large Language Models (LLMs)&lt;/a&gt; that are around today (GPT, PaLM, Bard, etc.). The paper proposes a neural network architecture built on an entirely attention-based mechanism instead of the traditional RNNs; click &lt;a href="https://arxiv.org/abs/1706.03762" rel="noopener noreferrer"&gt;here&lt;/a&gt; if you wish to read the paper.&lt;/p&gt;

&lt;p&gt;A transformer can be summarized as an encoder-decoder model with an attention mechanism. The image below is from the paper, note how the attention layers are grafted in both encoder and decoder.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj41jarln32vszhbv2y09.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj41jarln32vszhbv2y09.png" alt="Flow diagram of transformer model from the paper " width="428" height="581"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Overview of Attention Mechanisms.
&lt;/h3&gt;

&lt;p&gt;An attention mechanism is built to focus on the most important parts of the input sequence rather than its entirety. Instead of building a single context vector out of the encoder's last hidden state, an attention mechanism creates shortcuts between the entire input sequence and the context vector.&lt;/p&gt;

&lt;p&gt;The weights of these context vectors vary for each output element. Hence, the context vector learns the alignment of the input sequence with the target output by noting the emphasized tokens.&lt;/p&gt;

&lt;p&gt;"But how does the model know where to channel its attention?"&lt;br&gt;
It calculates a score known as alignment score which quantifies how much attention should be given to each input. Look at the heatmap below from the &lt;a href="https://arxiv.org/abs/1409.0473" rel="noopener noreferrer"&gt;Neural Machine Translation by Jointly Learning to Align and Translate&lt;/a&gt; paper showing how attention works in a translation use case.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fv2%2Fresize%3Afit%3A640%2Fformat%3Awebp%2F1%2ABg_Hg0p9ta4l95c5DvsqWA.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fv2%2Fresize%3Afit%3A640%2Fformat%3Awebp%2F1%2ABg_Hg0p9ta4l95c5DvsqWA.png" alt="heatmap" width="482" height="466"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With respect to the previous input sequence: &lt;strong&gt;&lt;em&gt;The man is going to the bank&lt;/em&gt;&lt;/strong&gt;, the translation should not pose any problem for regular encoder-decoder models but what if the sequence is longer and has more context?&lt;/p&gt;

&lt;p&gt;Take a new input sequence like &lt;strong&gt;&lt;em&gt;The man is going to the bank to fish&lt;/em&gt;&lt;/strong&gt;. For regular encoder-decoder models, the generated output in the target language may not align with the contextual meaning of the source language because "bank" now exists with more than one possible translation.&lt;/p&gt;

&lt;p&gt;While spotting this distinction is an easy feat for humans, it may be hard for the traditional encoder-decoder; hence, it may produce an output like &lt;strong&gt;&lt;em&gt;Der Mann geht zur Bank, um zu fischen&lt;/em&gt;&lt;/strong&gt; instead of &lt;strong&gt;&lt;em&gt;Der Mann geht zum Flussufer, um zu fischen&lt;/em&gt;&lt;/strong&gt;. The latter is more accurate and would make more sense to a German speaker because *Flussufer means riverbank. &lt;br&gt;
&lt;em&gt;*another translation could be "Ufer", which means "shore".&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;In the above instance, "bank" and "fish" would have the heaviest weight in an attention mechanism encoder-decoder.&lt;/p&gt;
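&lt;p&gt;This weighting can be sketched numerically: raw alignment scores are passed through a softmax to produce attention weights that sum to 1, and tokens like "bank" and "fish" would carry the largest weights. All scores below are made up for illustration; a real model learns them.&lt;/p&gt;

```python
import math

def softmax(scores):
    """Turn raw alignment scores into attention weights that sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical alignment scores for the tokens of
# "The man is going to the bank to fish" while translating "bank".
tokens = ["The", "man", "is", "going", "to", "the", "bank", "to", "fish"]
scores = [0.1, 0.3, 0.1, 0.5, 0.2, 0.1, 2.5, 0.2, 1.8]

weights = softmax(scores)
# Ranking tokens by weight puts "bank" and "fish" on top,
# mirroring the translation example above.
ranked = sorted(zip(tokens, weights), key=lambda tw: tw[1], reverse=True)
```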

&lt;p&gt;In application, attention layers need to be integrated with the regular encoder-decoder architecture and these layers exist in various types which include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Generalized attention layer&lt;/li&gt;
&lt;li&gt;Self-attention layer&lt;/li&gt;
&lt;li&gt;Multi-head attention layer&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To learn more about these layer types, check this &lt;a href="https://towardsdatascience.com/attention-and-its-different-forms-7fc3674d14dc" rel="noopener noreferrer"&gt;article&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion.
&lt;/h2&gt;

&lt;p&gt;The advent of attention mechanisms has revolutionized generative AI, enabling machines to better understand us and generate complex sequences with remarkable precision such that humans are sometimes bewildered by it. Applications across machine translation, question answering, text summarization, and more have benefitted from attention's ability to capture contextual relationships.&lt;/p&gt;

&lt;p&gt;As we look to the future, combining attention mechanisms with other architectural innovations holds immense potential for handling even more challenging tasks. Generative AI is at its biggest milestone yet, and it will continue to improve as attention-driven applications keep surpassing previous landmarks. It is the responsibility of humans to shape this trajectory for the betterment of life.&lt;/p&gt;

&lt;p&gt;If you enjoyed this article, I would appreciate it if you left a reaction or a comment. Know someone else who would find this article insightful? Shares are very much welcome too! I am on Twitter &lt;a href="https://twitter.com/dvrvsimi" rel="noopener noreferrer"&gt;@dvrvsimi&lt;/a&gt; and Medium &lt;a href="https://medium.com/@daraakojede01" rel="noopener noreferrer"&gt;@daraakojede01&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prost!&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>writing</category>
      <category>generativeai</category>
    </item>
    <item>
      <title>Creating a Twitter Bot with OpenAI Models in Python: A Beginner’s Guide</title>
      <dc:creator>Olorundara Akojede (dvrvsimi)</dc:creator>
      <pubDate>Tue, 06 Jun 2023 15:00:26 +0000</pubDate>
      <link>https://dev.to/dvrvsimi/creating-a-twitter-bot-with-openai-models-in-python-a-beginners-guide-18a5</link>
      <guid>https://dev.to/dvrvsimi/creating-a-twitter-bot-with-openai-models-in-python-a-beginners-guide-18a5</guid>
      <description>&lt;p&gt;It is no secret that AI is the new disruptive thing in the tech ecosystem and almost all other sectors are looking for innovative ways to adopt automated solutions in their repetitive processes and companies would pay for a seamless and more efficient workflow within their organizations.&lt;/p&gt;

&lt;p&gt;Generative Pre-trained Transformers (&lt;a href="https://en.wikipedia.org/wiki/GPT#:~:text=Generative%20pre%2Dtrained%20transformer%2C%20a%20family%20of%20artificial%20intelligence%20language%20models" rel="noopener noreferrer"&gt;GPTs&lt;/a&gt;) are a type of large language model (LLM) and a prominent framework for generative artificial intelligence. They have been in existence for a while, but OpenAI’s ChatGPT found a way to make them accessible even to the non-technical population.&lt;/p&gt;

&lt;p&gt;In this article, you will learn how to integrate one of OpenAI’s models into a Python program that automates a Twitter bot. Although there are tons of bots on Twitter, it would be cool to build your own as a developer. Who knows? Your idea might be featured on the Twitter Dev forum.&lt;/p&gt;

&lt;p&gt;The process is laid out in easy-to-follow, beginner-friendly steps. Let’s go!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Getting Started&lt;/strong&gt;&lt;br&gt;
Before starting properly, there are a couple of setups that should be put into place and they include:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Access to both Twitter API and OpenAi API.&lt;/li&gt;
&lt;li&gt;A text editor.&lt;/li&gt;
&lt;li&gt;A platform to periodically run a Python script.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;To gain access to the Twitter API, you should sign up on the &lt;a href="https://developer.twitter.com/en/portal/dashboard" rel="noopener noreferrer"&gt;Twitter Developer Portal&lt;/a&gt; for dev access. You cannot have a dev account without a Twitter account, so make sure to use an account that will be dedicated strictly to automated tweets; this lets other users know that your account is a bot account and prevents it from getting flagged. To learn how to properly set up an automated account, check &lt;em&gt;Automated Account labeling for bots&lt;/em&gt; in &lt;a href="https://developer.twitter.com/en/docs/apps/overview" rel="noopener noreferrer"&gt;Twitter’s docs&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;You will be asked a couple of questions to determine the scope of your API usage, and there are various levels of access that can be granted for the Twitter API. After a series of changes at Twitter, Elevated access was revoked for new applications, leaving Free access as the lowest tier. It has restricted functions, but it should do just fine for what you will be building. To learn more about Twitter access levels and versions, check the &lt;em&gt;&lt;a href="https://developer.twitter.com/en/docs/twitter-api/getting-started/about-twitter-api" rel="noopener noreferrer"&gt;About the Twitter API&lt;/a&gt;&lt;/em&gt; page.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl7w0ev44wagc10p1xyfa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl7w0ev44wagc10p1xyfa.png" alt="monthly cap usage for different tiers" width="700" height="122"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can proceed to the Dashboard page after being granted access. On this page, you can name your app and configure authentication settings; these settings determine what your generated tokens/secrets can and cannot do. The default environment is usually &lt;em&gt;Production&lt;/em&gt; and can be changed as development progresses. Remember to store all your tokens in a safe place.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzsggqcwwrf9bip5b49zd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzsggqcwwrf9bip5b49zd.png" alt="dashboard interface" width="689" height="181"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Likewise, look up OpenAI’s API page and create a personal account. The free trial should do for most basic call frequencies and should last a considerable while; however, if you plan to do some heavy lifting, go ahead and add your preferred subscription.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4st1ovv92v5r04rmczky.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4st1ovv92v5r04rmczky.png" alt="OpenAi usage metrics page" width="700" height="362"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;There are many platforms for hosting and deploying scripts; WayScript is a good choice, and we will get to set it up in a bit.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Preparing the code environment&lt;/strong&gt;&lt;br&gt;
Now that all the preparatory steps are out of the way, it is time to write the program. There is an endless list of cool features you could add to a Twitter bot, but you first need to install some dependencies whose built-in functions make writing the code efficient. You will build a bot that shares facts about a topic of the user’s choosing, with room to add a twist!&lt;/p&gt;

&lt;p&gt;Tweepy and OpenAI are the required libraries; run &lt;code&gt;pip install tweepy openai&lt;/code&gt; in your terminal to install them. Installing dependencies globally is not good practice; see this article on how to set up virtual environments from your terminal.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What next?&lt;/strong&gt;&lt;br&gt;
Create a &lt;code&gt;main.py&lt;/code&gt; file inside your project folder after you have activated your virtual environment, then import the installed libraries. You should also import the &lt;code&gt;time&lt;/code&gt; library, which will be used later on; do not install &lt;code&gt;time&lt;/code&gt;, since it is a built-in Python package.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# importing required libraries
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;tweepy&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now that you have tweepy imported, you can use one of its functions to authenticate the credentials you generated earlier. Before that, create a new .py file for storing your credentials and name it &lt;code&gt;config.py&lt;/code&gt;. In the file, assign a variable to each credential and save the file.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## config.py&lt;/span&gt;
consumer_key = "xxxxxxxxxxxxxxxxxxxxxx"
consumer_secret = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
access_token = "13812xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
access_token_secret = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
openai_key = "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
client_ID = "S1xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
client_secret = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can now import all your credentials by importing config into your main file.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# importing required libraries
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;tweepy&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="c1"&gt;# note the use of * to universally import config content
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You cannot access the Twitter API without authenticating these credentials. &lt;code&gt;OAuthHandler&lt;/code&gt;, also known as &lt;code&gt;OAuth1UserHandler&lt;/code&gt; in newer versions of tweepy, is a tweepy class for authenticating credentials, and it will suffice for this level of access. To learn more about other supported authentication methods, check &lt;em&gt;Authentication&lt;/em&gt; in the &lt;a href="https://docs.tweepy.org/en/stable/authentication.html#introduction" rel="noopener noreferrer"&gt;Tweepy docs&lt;/a&gt;. Set up your OpenAI credential too.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# authorizing with tweepy, note that "auth" is a variable
&lt;/span&gt;&lt;span class="n"&gt;auth&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tweepy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;OAuthHandler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;consumer_key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;consumer_secret&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# .set_access_token() is a tweepy method
&lt;/span&gt;&lt;span class="n"&gt;auth&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_access_token&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;access_token&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;access_token_secret&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# creating API object
&lt;/span&gt;&lt;span class="n"&gt;api&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tweepy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;API&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;auth&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# setting up OpenAi key
&lt;/span&gt;&lt;span class="n"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;openai_key&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Another good practice is to add &lt;code&gt;config.py&lt;/code&gt; to a &lt;code&gt;.gitignore&lt;/code&gt; file; this ensures that your credentials stay untracked if you push your project to GitHub.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;OpenAI chat models&lt;/strong&gt;&lt;br&gt;
There are various OpenAI chat models, each with its own syntax. All chat models generate responses with tokens: roughly every 4 characters of text costs 1 token, so longer prompts and completions cost more. It is good to keep this in mind so you do not exceed your API call limit.&lt;/p&gt;
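&lt;p&gt;As a rough sanity check, you can ballpark token usage with the ~4-characters-per-token heuristic mentioned above (the real tokenizer is more accurate; this sketch is only an estimate):&lt;/p&gt;

```python
def estimate_tokens(text):
    """Very rough token estimate: ~4 characters per token on average.
    The model's actual tokenizer may count differently."""
    return max(1, round(len(text) / 4))

prompt = "tell a fact about attention mechanisms"
# 38 characters, so roughly 10 tokens by this heuristic.
```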

&lt;p&gt;We will be using the &lt;code&gt;text-davinci-003&lt;/code&gt; model (a GPT-3.5 model), but &lt;code&gt;gpt-3.5-turbo&lt;/code&gt; is the most capable GPT-3.5 model because it is optimized for chat completions and costs fewer tokens. Read more about the chat models in &lt;a href="https://platform.openai.com/docs/guides/chat/introduction" rel="noopener noreferrer"&gt;OpenAI's docs&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Writing Functions&lt;/strong&gt;&lt;br&gt;
We need a &lt;code&gt;generate_fact()&lt;/code&gt; function that generates a response with our selected chat model for whatever topic the user decides. The &lt;code&gt;Completion&lt;/code&gt; class lets us create a chat completion instance with the option to tweak some parameters; code below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;#function to geenrate facts about a topic
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generate_fact&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
  &lt;span class="c1"&gt;# play around with the prompt
&lt;/span&gt;    &lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;you are a grumpy computer programmer, tell a fact about &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; in a rude and sarcarstic tone&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Completion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="c1"&gt;# bear in mind that this engine has a token limit of 4000+ tokens
&lt;/span&gt;        &lt;span class="n"&gt;engine&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text-davinci-003&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="c1"&gt;# note that max_token accomodates both prompt_token and completion_token so provide enough tokens
&lt;/span&gt;        &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="c1"&gt;# set to n=1 for single response and n&amp;gt;1 for multiple responses
&lt;/span&gt;        &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;stop&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="c1"&gt;# temperature ranges from 0.0 to 1.0, nearness to 0 would make the model more deterministic and repititive, nearness to 1 would make it fluid
&lt;/span&gt;        &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;fact&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;fact&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now that you have created a fact-generating function, you need a &lt;code&gt;handle_mentions()&lt;/code&gt; function that handles mentions of your bot account; you will use Tweepy to listen for the specific mentions that should trigger a response.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# function to handle mentions
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;handle_mentions&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;mentions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;api&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;mentions_timeline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# only retrieve last 5 mentions
&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;mention&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;mentions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;mention&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;in_reply_to_status_id&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="c1"&gt;# skips tweet replies(optional)
&lt;/span&gt;            &lt;span class="k"&gt;continue&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;mention&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;screen_name&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bot_account_username&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="c1"&gt;# skip mentions from self
&lt;/span&gt;            &lt;span class="k"&gt;continue&lt;/span&gt;

        &lt;span class="c1"&gt;# parse mention_text, it can be "tell me about", anything
&lt;/span&gt;        &lt;span class="n"&gt;mention_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mention&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;casefold&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="c1"&gt;# casefold to prevent any error that may arise from different cases
&lt;/span&gt;        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tell a fact about&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;mention_text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;continue&lt;/span&gt;
        &lt;span class="n"&gt;topic&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mention_text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tell a fact about&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="c1"&gt;# generate a fact by calling our initisl function
&lt;/span&gt;        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;fact&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;generate_fact&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="c1"&gt;# post the fact as a reply to the mention
&lt;/span&gt;            &lt;span class="n"&gt;api&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update_status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;@&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;mention&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;screen_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;fact&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;in_reply_to_status_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;mention&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# handle OpenAI API errors
&lt;/span&gt;        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OpenAIError&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;api&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update_status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;@&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;mention&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;screen_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; sorry, an error occurred while processing your request.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;in_reply_to_status_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;mention&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# handle Tweepy errors
&lt;/span&gt;        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;tweepy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TweepyException&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;api&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update_status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;@&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;mention&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;screen_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; sorry, an error occurred while processing your request.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;in_reply_to_status_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;mention&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# Handle other errors that may arise in your code
&lt;/span&gt;        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;api&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update_status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;@&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;mention&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;screen_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; sorry, an error occurred while processing your request.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;in_reply_to_status_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;mention&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Remember to replace &lt;code&gt;bot_account_username&lt;/code&gt; with your bot account's username. Adding error handlers like the ones above is also good Python practice.&lt;/p&gt;
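&lt;p&gt;One subtle point: &lt;code&gt;mention_text.split("tell a fact about")[1]&lt;/code&gt; raises an &lt;code&gt;IndexError&lt;/code&gt; when a mention doesn't actually contain the trigger phrase. A small guard avoids that; this is a sketch, and &lt;code&gt;extract_topic&lt;/code&gt; is a hypothetical helper name, not part of the original script:&lt;/p&gt;

```python
# hypothetical helper: safely pull the topic out of a mention's text
TRIGGER = "tell a fact about"

def extract_topic(mention_text):
    """Return the topic after the trigger phrase, or None if it is absent."""
    if TRIGGER not in mention_text:
        return None
    return mention_text.split(TRIGGER)[1].strip()
```

&lt;p&gt;You would then skip (or reply with a usage hint to) any mention where the helper returns &lt;code&gt;None&lt;/code&gt; instead of letting the bot crash.&lt;/p&gt;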

&lt;p&gt;To ensure that the program keeps checking for mentions and sending responses at a regular interval, wrap the call to &lt;code&gt;handle_mentions()&lt;/code&gt; in a loop.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# by adding if __name__ == "__main__":, we can ensure that certain code blocks, handle_mentions() in this case, will only be executed when main.py is run as the main program
&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nf"&gt;handle_mentions&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;tweepy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TweepyException&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="c1"&gt;# handle twitter API errors
&lt;/span&gt;            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;an error occurred while handling mentions:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="c1"&gt;# handle other errors
&lt;/span&gt;            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;an error occurred while handling mentions:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="c1"&gt;# wait for 2 minutes before checking for new mentions(reduce the time if traffic increases on your bot account, this is why we imported time)
&lt;/span&gt;        &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;120&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
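&lt;p&gt;The fixed two-minute sleep is fine for light traffic, but if the bot starts hitting Twitter's rate limits you may want to back off exponentially between failed attempts instead. A minimal sketch — the &lt;code&gt;backoff_delay&lt;/code&gt; helper and the 120-second base are assumptions, not part of the original script:&lt;/p&gt;

```python
# sketch: exponential backoff with a cap, as an alternative to a fixed sleep
def backoff_delay(attempt, base=120, cap=960):
    """Delay in seconds for the given retry attempt (0-indexed), capped."""
    return min(base * (2 ** attempt), cap)
```

&lt;p&gt;In the loop, you would track how many consecutive calls failed and pass that count as &lt;code&gt;attempt&lt;/code&gt;, resetting it to zero after a successful run.&lt;/p&gt;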



&lt;p&gt;&lt;strong&gt;NOTE:&lt;/strong&gt;&lt;br&gt;
When copying the above code blocks, you may run into &lt;code&gt;IndentationError&lt;/code&gt; instances, so be sure to check the indentation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Deploying the Python Script on WayScript.&lt;/strong&gt;&lt;br&gt;
Setting up WayScript can be a little challenging, but this &lt;a href="https://youtu.be/YtPyFAmFopg" rel="noopener noreferrer"&gt;tutorial video&lt;/a&gt; by WayScript breaks the process down thoroughly from start to finish.&lt;/p&gt;
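&lt;p&gt;When deploying, avoid hard-coding your Twitter and OpenAI keys in &lt;code&gt;main.py&lt;/code&gt;; reading them from environment variables (which WayScript and most hosts can inject as secrets) keeps them out of your repository. A sketch, where the variable names are illustrative rather than required by any API:&lt;/p&gt;

```python
import os

# read API credentials from the environment rather than hard-coding them;
# the variable names here are illustrative
def load_credentials():
    keys = ["TWITTER_API_KEY", "TWITTER_API_SECRET", "OPENAI_API_KEY"]
    missing = [k for k in keys if not os.environ.get(k)]
    if missing:
        raise RuntimeError(f"missing environment variables: {', '.join(missing)}")
    return {k: os.environ[k] for k in keys}
```

&lt;p&gt;Calling &lt;code&gt;load_credentials()&lt;/code&gt; at startup fails fast with a clear message if a secret was not configured, instead of failing later on an authentication error.&lt;/p&gt;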

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsue0kwu1ycezvqmzxl3i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsue0kwu1ycezvqmzxl3i.png" alt="WayScript home page" width="700" height="345"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Conclusion&lt;/strong&gt;&lt;br&gt;
I have been testing a bot with more features; take a look at the &lt;a href="https://github.com/dvrvsimi/twitter-assistant-bot" rel="noopener noreferrer"&gt;repository&lt;/a&gt;. I will periodically update the &lt;code&gt;README.md&lt;/code&gt; with more information.&lt;/p&gt;

&lt;p&gt;Want to read more of my articles? You can find them &lt;a href="https://medium.com/@daraakojede01" rel="noopener noreferrer"&gt;here&lt;/a&gt;, and you can also connect with me on &lt;a href="https://twitter.com/dvrvsimi" rel="noopener noreferrer"&gt;Twitter&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Great job on building your own Twitter bot with an OpenAI model! As you have learned, Python and OpenAI make it easy to create a powerful bot that can interact with users and provide valuable information. With the techniques you’ve learned, the possibilities are endless for what you can create. Whether it’s a language translator bot, a fact generator, or a reminder bot, the key is to use your imagination and build something that provides value to your audience. So go ahead, build your own Twitter bot, and watch it come to life!&lt;/p&gt;

</description>
      <category>python</category>
      <category>beginners</category>
      <category>openai</category>
      <category>twitter</category>
    </item>
  </channel>
</rss>
