<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Anthony Mipawa</title>
    <description>The latest articles on DEV Community by Anthony Mipawa (@tonyloyt).</description>
    <link>https://dev.to/tonyloyt</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F556970%2Fb3bc9b05-1dfd-4924-a080-d32b76bc583a.jpg</url>
      <title>DEV Community: Anthony Mipawa</title>
      <link>https://dev.to/tonyloyt</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/tonyloyt"/>
    <language>en</language>
    <item>
      <title>NLP Communities for Data Professionals to Join</title>
      <dc:creator>Anthony Mipawa</dc:creator>
      <pubDate>Wed, 30 Nov 2022 13:03:25 +0000</pubDate>
      <link>https://dev.to/neurotech_africa/nlp-communities-for-data-professionals-to-join-18do</link>
      <guid>https://dev.to/neurotech_africa/nlp-communities-for-data-professionals-to-join-18do</guid>
      <description>&lt;p&gt;Are you a data professional, engineer, or aspiring person to grow in NLP fields?&lt;/p&gt;

&lt;p&gt;Yes, this is for you&lt;/p&gt;

&lt;p&gt;One of the best methods to stay current with all the newest technologies and tools connected to NLP in the tech industry is to join NLP communities.&lt;/p&gt;

&lt;p&gt;Tech communities keep tech enthusiasts updated and motivated, from growing an idea to building impactful tools and seeing a project through to success. Even if you already work in the industry, having a channel to meet folks doing similar work helps you improve your expertise and, over time, exposes you to new tools. I bring this up because it is one of the strategies I have been using for more than four years.&lt;/p&gt;

&lt;p&gt;From that experience, I came across many people asking about NLP communities where they could engage and grow their expertise, so I decided to put all the pieces together and share them with folks out there.&lt;/p&gt;

&lt;p&gt;I hope this helps the many folks looking for these communities. Stay with me as we explore them.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Masakhane:&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.masakhane.io/"&gt;Masakhane&lt;/a&gt; pushing to build datasets and tools to facilitate Natural Language Processing in African languages and pose new research problems to enrich the NLP  research landscape. A research effort originally for &lt;a href="http://translate.masakhane.io/"&gt;Machine translation&lt;/a&gt; focused on African languages that are open-source, continent-wide, and distributed online. It aimed to build a community of Natural Language Processing researchers, connect and grow it, spurring and sharing further research to enable language preservation, tool building, and increasing its global visibility and relevance.&lt;/p&gt;

&lt;p&gt;You can join Masakhane slack community workspace through  👉  &lt;strong&gt;&lt;a href="https://masakhane-nlp.slack.com/join/shared_invite/enQtODM3ODA3ODE0ODIwLTAyYzg3M2E3Nzg4Y2I3NzgxNDg4MmNlZDE4OTBjMzBjMjg4NTcxMWZlYTg3ZDljMTU4M2FjOTk3MDVjOWM2NGM#/shared-invite/email"&gt;Join here&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You can join the Masakhane mail list group through  👉  &lt;strong&gt;&lt;a href="https://groupc/"&gt;Join here&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;NeuralSpace Community:&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;A group of NLP enthusiasts led by the NeuralSpace company, with the mission to create a platform that helps bridge the massive language gap that is prevalent around the world and prevents many from accessing vital services or education.&lt;/p&gt;

&lt;p&gt;They use Slack as a channel for exchanging information and organizing NLP events, in collaboration with experts from Meta AI, NeuralSpace, LoResMT, and Masakhane.&lt;/p&gt;

&lt;p&gt;You can join NeuralSpace slack community workspace through  👉  &lt;strong&gt;&lt;a href="https://neuralspacecommunity.slack.com/ssb/redirect"&gt;Join here&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Hugging Face Community:&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;A place where a broad community of data scientists, researchers, and ML engineers can come together to share ideas, get support, and contribute to open-source projects.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Hugging Face is a community and data science platform that provides tools that enable users to build, train and deploy ML models based on open-source code and technologies.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It is one of the most awesome communities I have ever encountered in the NLP space; each day people share cutting-edge tools that are essential to the NLP ecosystem. Everyone can exchange and examine models and datasets at the Hugging Face central hub. In order to democratize AI for everyone, they aspire to host the largest collection of models and datasets.&lt;/p&gt;

&lt;p&gt;You can join the Hugging Face community through  👉  &lt;strong&gt;&lt;a href="https://huggingface.co/"&gt;Join here&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Spark NLP:&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;A Slack group for developers and Spark NLP users that helps newcomers get started solving common NLP use cases and exchange ideas on best NLP practices. This community was built on the grounds of knowledge and communication management.&lt;/p&gt;

&lt;p&gt;You can join the Spark NLP community slack workspace through  👉  &lt;strong&gt;&lt;a href="https://app.slack.com/client/T9BRVC9AT/setup-people"&gt;Join here&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Lanfrica Community:&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Lanfrica aims to mitigate the difficulty encountered in discovering African language resources by creating a centralized hub. They organize a series of talks to highlight and showcase language technology efforts (research, projects, software, applications, datasets, models, initiatives, etc.) geared towards under-represented languages around the world.&lt;/p&gt;

&lt;p&gt;Lanfrica is equally interested in efforts targeting (or that can be transferred to) low-resource languages (these are languages with not much data, societal/research efforts or technologies, and recognition) and endangered languages.&lt;/p&gt;

&lt;p&gt;You can join the Lanfrica community mailing list through  👉  &lt;strong&gt;&lt;a href="https://lanfrica.com/mailing-list/subscribe"&gt;Join here&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You can join the Lanfrica community slack workspace through  👉  &lt;strong&gt;&lt;a href="https://lanfrica.slack.com/join/shared_invite/zt-12x0oo6i8-tZ182NK~aUXroVE5tgRNaw#/shared-invite/email"&gt;Join here&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Other DS &amp;amp; ML Communities:&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Kaggle:&lt;/strong&gt; a well-known data science competition platform. It boasts a community of over 5 million users, where you can compete and share datasets and projects. Inside Kaggle you’ll find all the code and data you need to do your data science work. Use over 50,000 public &lt;a href="https://www.kaggle.com/datasets"&gt;datasets&lt;/a&gt; and 400,000 public &lt;a href="https://www.kaggle.com/kernels"&gt;notebooks&lt;/a&gt; to conquer any analysis in no time. The thing I like best about Kaggle is its &lt;a href="https://www.kaggle.com/learn"&gt;well-structured and interactive learning&lt;/a&gt; environment, which even lets beginners start their journey in data science and machine learning.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Zindi Africa:&lt;/strong&gt; this platform played an essential role in my career. I won't claim it dumped everything into my head, but I worked through a lot of its challenges to improve my data science understanding.&lt;/p&gt;

&lt;p&gt;Zindi hosts the largest community of African data scientists, working to solve the world’s most pressing challenges using machine learning and Artificial Intelligence.&lt;/p&gt;

&lt;p&gt;You can join the Zindi community through  👉  &lt;strong&gt;&lt;a href="https://zindi.africa/?referralCode%3D4WtlJO"&gt;Join here&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Driven Data:&lt;/strong&gt; works on projects at the intersection of data science and social impact, in areas like international development, health, education, research and conservation, and public services. They focus on giving more organizations access to the capabilities of data science and engaging more data scientists with social challenges where their skills can make a difference.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;DataTalks:&lt;/strong&gt; another awesome community whose &lt;a href="https://datatalks.club/events.html"&gt;events&lt;/a&gt; and training programs I like to join. &lt;a href="https://datatalks.club/"&gt;DataTalks&lt;/a&gt; is the place to talk about data: a global online community of data enthusiasts. They also post their events on YouTube through their &lt;a href="https://www.youtube.com/@DataTalksClub"&gt;channel&lt;/a&gt;, which is a very resourceful platform for data professional growth.&lt;/p&gt;

&lt;p&gt;You can join the DataTalks community slack workspace through  👉  &lt;strong&gt;&lt;a href="https://datatalks.club/slack.html"&gt;Join here&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;MLOps Community:&lt;/strong&gt; a great community for learning about putting machine learning models into production. It fills the swiftly growing need to share real-world Machine Learning Operations best practices from engineers in the field.&lt;/p&gt;

&lt;p&gt;MLOps community hosts weekly talks and fireside chats about everything that has to do with the new space emerging around DevOps for Machine Learning aka MLOps aka Machine Learning Operations.&lt;/p&gt;

&lt;p&gt;Curious to dig more about this awesome community?&lt;/p&gt;

&lt;p&gt;You can join the MLOps community slack workspace through  👉 &lt;strong&gt;&lt;a href="https://home.mlops.community/"&gt;Join here&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Final Thoughts:&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;When it comes to the advancement of AI, the open-source community is becoming more and more significant. Sharing information and resources is where the future is headed, because no firm, not even the tech giants, will be able to "solve AI" on its own!&lt;/p&gt;

&lt;p&gt;I hope this article sparked new thoughts in the machine learning space. Please spread the love by sharing it with others on socials.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--M1JuqBZS--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/0wznc5nyxlai97tlaug7.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--M1JuqBZS--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/0wznc5nyxlai97tlaug7.jpg" alt="Image description" width="390" height="240"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>nlp</category>
      <category>machinelearning</category>
      <category>datascience</category>
      <category>community</category>
    </item>
    <item>
      <title>Understanding How to Evaluate Textual Problems</title>
      <dc:creator>Anthony Mipawa</dc:creator>
      <pubDate>Tue, 13 Sep 2022 09:47:52 +0000</pubDate>
      <link>https://dev.to/neurotech_africa/understanding-how-to-evaluate-textual-problems-32md</link>
      <guid>https://dev.to/neurotech_africa/understanding-how-to-evaluate-textual-problems-32md</guid>
      <description>&lt;p&gt;As a data professional, building models is a common topic what differs is just what that model is for? models, should solve certain challenges? then after we consider measuring the quality and performance of these models using &lt;a href="https://deepai.org/machine-learning-glossary-and-terms/evaluation-metrics"&gt;evaluation metrics&lt;/a&gt; and these are essential to confirm something concerning built models.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Evaluation metrics are used to measure the quality of the statistical or machine learning model.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This article was originally published on the &lt;a href="https://blog.neurotech.africa/evaluation-metrics-for-textual-problems/"&gt;Neurotech Africa&lt;/a&gt; blog.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Need for evaluation?&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;The aim of building AI solutions is to apply them to real-world challenges. Mind you, our real world is complicated, so how do we decide which model to use and when? That is where evaluation metrics come in.&lt;/p&gt;

&lt;p&gt;A failure to justify why you are choosing a certain model over others, or why a certain model is good or not, indicates you are not aware of what you are solving or of the model you built.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;"When you can measure what you are speaking of and express it in numbers, you know that on which you are discussing. But when you cannot measure it and express it in numbers, your knowledge is of a very meager and unsatisfactory kind." ~ Lord Kelvin&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Today, let's get a sense of the metrics used in Natural Language Processing challenges.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Textual Evaluation Metrics&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;In the Natural Language Processing (NLP) field, it is difficult to measure the performance of models across different tasks. Challenges with fixed labels are easier to evaluate, but for many NLP tasks the ground truth or expected result can vary.&lt;/p&gt;

&lt;p&gt;We have lots of downstream tasks such as text or sentiment analysis, language generation, question answering, text summarization, text recognition, and translation.&lt;/p&gt;

&lt;p&gt;It is possible for biases to creep into models through the dataset or the evaluation criteria. It is therefore necessary to establish standard performance benchmarks for NLP tasks. These performance metrics give us an indication of which model is better for which task.&lt;/p&gt;

&lt;p&gt;Let's jump right in to discuss some of the textual evaluation metrics  😊&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Accuracy:&lt;/strong&gt; a common metric in &lt;a href="https://en.wikipedia.org/wiki/Sentiment_analysis"&gt;sentiment analysis&lt;/a&gt; and &lt;a href="https://blog.neurotech.africa/swahili-text-classification-using-transformers/"&gt;classification&lt;/a&gt;. Not the best one, but it denotes the fraction of times the model makes a correct prediction compared to the total predictions it makes. It is best used when the output variable is categorical or discrete, for example, how often a sentiment classification algorithm is correct.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Confusion Matrix:&lt;/strong&gt; also used in &lt;a href="https://blog.neurotech.africa/swahili-text-classification-using-transformers/"&gt;classification&lt;/a&gt; challenges. It provides a clear report on the model's predictions in each category, and from this visualization the following questions can be answered:&lt;/p&gt;

&lt;p&gt;What percentage of the positive class is actually positive? (Precision)&lt;/p&gt;

&lt;p&gt;What percentage of the positive class gets captured by the model? (Recall)&lt;/p&gt;

&lt;p&gt;What percentage of predictions are correct? (Accuracy)&lt;/p&gt;

&lt;p&gt;Precision and recall are complementary metrics with an inverse relationship. If both are of interest to us, we use the F1 score to combine them into a single metric.&lt;/p&gt;
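&lt;p&gt;As a quick illustration (the confusion-matrix counts below are made up, not from any model in this article), the questions above reduce to a few ratios over the matrix cells:&lt;/p&gt;

```python
# Precision, recall, accuracy, and F1 from a binary confusion matrix.
# tp/fp/fn/tn are illustrative counts, not real model results.
tp, fp, fn, tn = 40, 10, 5, 45

precision = tp / (tp + fp)                  # % of predicted positives that are actually positive
recall = tp / (tp + fn)                     # % of the positive class the model captures
accuracy = (tp + tn) / (tp + fp + fn + tn)  # % of all predictions that are correct
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of precision and recall
```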

&lt;p&gt;&lt;strong&gt;Perplexity:&lt;/strong&gt; is a great probabilistic measure used to evaluate exactly how confused our model is. It’s typically used to evaluate &lt;a href="https://www.techtarget.com/searchenterpriseai/definition/language-modeling"&gt;language models&lt;/a&gt;, but it can be used in dialog generation tasks.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;A language model measures how similar machine-generated text is to text written by humans: given the previous w tokens, it scores the probability of generating the (w+1)-th token correctly. The lower the perplexity, the better the model.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Find this article about the perplexity evaluation metric, and take your time to explore &lt;em&gt;&lt;a href="https://towardsdatascience.com/perplexity-in-language-models-87a196019a94"&gt;Perplexity in Language Models&lt;/a&gt;&lt;/em&gt;.&lt;/p&gt;
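&lt;p&gt;As a minimal sketch (the per-token probabilities below are invented for illustration), perplexity is just the exponential of the average negative log-probability the model assigns to each observed token:&lt;/p&gt;

```python
import math

# Perplexity = exp(cross-entropy); the probabilities are made-up example values.
token_probs = [0.1, 0.25, 0.05, 0.2]  # model probability of each observed token

cross_entropy = -sum(math.log(p) for p in token_probs) / len(token_probs)
perplexity = math.exp(cross_entropy)  # equivalently, the inverse geometric mean of the probabilities
```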

&lt;p&gt;&lt;strong&gt;Bits-per-character (BPC) and bits-per-word:&lt;/strong&gt;  other metrics often used for language model evaluation. BPC measures exactly the quantity it is named after: the average number of bits needed to encode one character.&lt;/p&gt;

&lt;p&gt;“&lt;em&gt;if the language is translated into binary digits (0 or 1) in the most efficient way, the entropy is the average number of binary digits required per letter of the original language.&lt;/em&gt;" ~ Shannon&lt;/p&gt;

&lt;p&gt;Entropy here is measured in bits, so BPC is simply the per-character cross entropy expressed in bits. The reason some language models report both cross-entropy loss and BPC is purely technical.&lt;/p&gt;

&lt;p&gt;In practice, if everyone uses a different base, it is hard to compare results across models. For the sake of consistency, when we report entropy or cross-entropy, we report the values in bits.&lt;/p&gt;

&lt;p&gt;Mind you, BPC is specific to character-level language models. For word-level language models, the quantity is called bits-per-word (BPW): the average number of bits required to encode a word.&lt;/p&gt;
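&lt;p&gt;The base conversion is a one-liner; a tiny sketch (the cross-entropy value below is hypothetical), dividing a loss reported in nats by ln 2 to get bits:&lt;/p&gt;

```python
import math

# Convert a character-level cross-entropy from nats (natural log) to bits-per-character.
cross_entropy_nats = 1.2                # hypothetical training loss, in nats per character
bpc = cross_entropy_nats / math.log(2)  # change of base: bits = nats / ln(2)
```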

&lt;p&gt;&lt;strong&gt;General Language Understanding Evaluation (GLUE):&lt;/strong&gt; this is a multi-task benchmark based on different types of tasks rather than evaluating a single task. As language models are increasingly being used for the purposes of transfer learning to other NLP tasks, the intrinsic evaluation of a language model is less important than its performance on downstream tasks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Super General Language Understanding Evaluation (SuperGLUE):&lt;/strong&gt; methods for pretraining and transfer learning have driven striking performance improvements across a range of language understanding tasks. SuperGLUE is an improved version of the GLUE benchmark with a new set of more difficult language understanding tasks and improved resources, introduced after performance on GLUE came close to the level of non-expert humans.&lt;/p&gt;

&lt;p&gt;It comprises new ways to test creative approaches on a range of difficult NLP tasks, including sample-efficient, transfer, multitask, and self-supervised learning.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;BiLingual Evaluation Understudy (BLEU):&lt;/strong&gt; commonly used in &lt;a href="https://blog.neurotech.africa/understanding-the-concept-of-machine-translation/"&gt;machine translation&lt;/a&gt; and &lt;a href="https://machinelearningmastery.com/develop-a-deep-learning-caption-generation-model-in-python/"&gt;caption generation&lt;/a&gt;. Since manual evaluation by professional translators is very expensive, this metric compares a candidate translation (&lt;em&gt;by machine&lt;/em&gt;) to one or more reference translations (&lt;em&gt;by a human being&lt;/em&gt;). The output lies in the range 0-1, where a score closer to 1 indicates a good-quality translation.&lt;/p&gt;

&lt;p&gt;The calculation of BLEU involves the concept of n-gram precision and sentence brevity penalty.&lt;/p&gt;
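&lt;p&gt;A minimal sketch of the n-gram precision part (the sentences are invented and the brevity penalty is omitted): each candidate n-gram is "clipped" so it can match at most as many times as it appears in the reference:&lt;/p&gt;

```python
from collections import Counter

# Clipped n-gram precision, the core ingredient of BLEU (brevity penalty not shown).
def clipped_ngram_precision(candidate, reference, n=1):
    cand_ngrams = Counter(zip(*[candidate[i:] for i in range(n)]))
    ref_ngrams = Counter(zip(*[reference[i:] for i in range(n)]))
    # each candidate n-gram counts at most as often as it appears in the reference
    clipped = {g: min(c, ref_ngrams[g]) for g, c in cand_ngrams.items()}
    return sum(clipped.values()) / max(sum(cand_ngrams.values()), 1)

cand = "the cat sat on the mat".split()
ref = "there is a cat on the mat".split()
p1 = clipped_ngram_precision(cand, ref, n=1)  # unigram precision
p2 = clipped_ngram_precision(cand, ref, n=2)  # bigram precision
```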

&lt;p&gt;This metric has some drawbacks: it doesn’t consider meaning, it doesn’t directly consider sentence structure, and it doesn’t handle morphologically rich languages.&lt;/p&gt;

&lt;p&gt;Rachael Tatman wrote an amazing article about BLEU just take your time to read it &lt;a href="https://towardsdatascience.com/evaluating-text-output-in-nlp-bleu-at-your-own-risk-e8609665a213"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Self-BLEU:&lt;/strong&gt; a smart use of the traditional BLEU metric for capturing and quantifying diversity in generated text.&lt;/p&gt;

&lt;p&gt;The lower the Self-BLEU score, the higher the diversity in the generated text. Long-text generation tasks like story generation and news generation are a good fit for keeping an eye on this metric, which helps evaluate redundancy and monotony in the model. It can be complemented with other text generation evaluation metrics that account for the goodness and relevance of the generated text.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Metric for Evaluation of Translation with Explicit ORdering (METEOR):&lt;/strong&gt; a precision-based metric for measuring the quality of generated text; sort of a more robust BLEU. It allows synonyms and stemmed words to be matched with the reference word, and it is mainly used in machine translation.&lt;/p&gt;

&lt;p&gt;METEOR addresses two of BLEU's drawbacks: not taking recall into account, and only allowing exact 𝑛-gram matching. METEOR first performs exact word matching, followed by stemmed-word matching, and finally synonym and paraphrase matching, then computes the F-score using this relaxed matching strategy.&lt;/p&gt;

&lt;p&gt;Although METEOR only considers unigram matches, as opposed to 𝑛-gram matches, it seeks to reward longer contiguous matches using a penalty term known as the &lt;strong&gt;fragmentation penalty&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;BERTScore:&lt;/strong&gt; this is an automatic evaluation metric used for testing the goodness of text generation systems. Unlike existing popular methods that compute token-level syntactical similarity, BERTScore focuses on computing semantic similarity between tokens of reference and hypothesis.&lt;/p&gt;

&lt;p&gt;Bidirectional Encoder Representations from Transformers (BERT) computes the cosine similarity of each hypothesis token 𝑗 with each token 𝑖 in the reference sentence using contextualized embeddings. BERTScore uses a greedy matching approach instead of a time-consuming best-case matching approach, and then computes the F1 measure.&lt;/p&gt;

&lt;p&gt;BERTScore correlates better with human judgments and provides stronger model selection performance than existing metrics.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Character Error Rate (CER):&lt;/strong&gt;  this is a common metric of the performance of an automatic speech recognition system. This value indicates the percentage of characters that were incorrectly predicted. The lower the value, the better the performance of the ASR system with a CER of 0 being a perfect score.&lt;/p&gt;

&lt;p&gt;Tasks where CER can be applied to measure performance include speech recognition, optical character recognition (OCR), and handwriting recognition.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Word Error Rate (WER):&lt;/strong&gt; this is a common performance metric mainly used for speech recognition, optical character recognition (OCR), and handwriting recognition.&lt;/p&gt;

&lt;p&gt;When recognizing speech and transcribing it into text, some words may be left out or misinterpreted. WER compares the predicted output and the reference transcript word by word to figure out the number of differences between them.&lt;/p&gt;

&lt;p&gt;There are three types of errors considered when computing WER:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Insertions:&lt;/em&gt; when the predicted output contains additional words that are not present in the transcript (for example, &lt;em&gt;SAT&lt;/em&gt; becomes &lt;em&gt;essay tea&lt;/em&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Substitutions:&lt;/em&gt; when the predicted output contains some misinterpreted words that replace words in the transcript (for example, &lt;em&gt;noose&lt;/em&gt; is transcribed as &lt;em&gt;moose&lt;/em&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Deletions:&lt;/em&gt; when the predicted output doesn’t contain words that are present in the transcript (for example, &lt;em&gt;turn it around&lt;/em&gt; becomes &lt;em&gt;turn around&lt;/em&gt;).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For understanding let's consider the following reference transcript and predicted output:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reference transcript: “&lt;em&gt;Understanding textual evaluation metrics is awesome for a data professional&lt;/em&gt;”.&lt;/li&gt;
&lt;li&gt;Predicted output: “&lt;em&gt;Understanding textual metrics is great for a data professional&lt;/em&gt;”.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In this case, the predicted output has one deletion (the word “&lt;em&gt;evaluation&lt;/em&gt;” disappears) and one substitution (“&lt;em&gt;awesome&lt;/em&gt;” becomes “&lt;em&gt;great&lt;/em&gt;”).&lt;/p&gt;

&lt;p&gt;So, what is the Word Error Rate of this prediction? Basically, WER is the number of errors divided by the number of words in the reference transcript.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;WER = (num inserted + num deleted + num substituted) / num words in the reference&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Thus, in our example:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;WER = (0 + 1 + 1) / 10 = 0.2&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Lower WER often indicates that the Automated Speech Recognition (ASR) software is more accurate in recognizing speech. A higher WER, then, often indicates lower ASR accuracy.&lt;/p&gt;

&lt;p&gt;The drawback is that it assumes the impact of different errors is the same, while in practice an insertion error may have a bigger impact than a deletion. Another limitation is that this metric cannot distinguish a substitution error from a combined deletion and insertion error.&lt;/p&gt;
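&lt;p&gt;The formula above can be computed with a standard word-level edit distance. A minimal sketch (not any particular library's implementation), using the example transcript from this section:&lt;/p&gt;

```python
# WER via word-level edit distance (Levenshtein) between reference and hypothesis.
def wer(reference, hypothesis):
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between the first i reference words and first j hypothesis words
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + sub)  # substitution (or match)
    return d[-1][-1] / len(ref)

score = wer("understanding textual evaluation metrics is awesome for a data professional",
            "understanding textual metrics is great for a data professional")
```

&lt;p&gt;One deletion plus one substitution over ten reference words gives the 0.2 computed above.&lt;/p&gt;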

&lt;p&gt;&lt;strong&gt;Recall-Oriented Understudy for Gisting Evaluation&lt;/strong&gt; (&lt;strong&gt;ROUGE):&lt;/strong&gt; recall-based, unlike BLEU, which is precision-based. The ROUGE metric includes a set of variants: ROUGE-N, ROUGE-L, ROUGE-W, and ROUGE-S. ROUGE-N is similar to BLEU-N in counting the 𝑛-gram matches between the hypothesis and reference.&lt;/p&gt;

&lt;p&gt;This is a set of metrics for evaluating automatic summarization and machine translation software in natural language processing. The metrics compare an automatically produced summary or translation against one or more human-produced reference summaries or translations.&lt;/p&gt;

&lt;p&gt;Mind you, ROUGE shines in summarization tasks, where it’s important to evaluate how many of the reference words a model can recall (recall = the percentage of true positives out of all actual positives, i.e. true positives plus false negatives).&lt;/p&gt;

&lt;p&gt;Feel free to check out the python package &lt;em&gt;&lt;a href="https://pypi.org/project/rouge/"&gt;here&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
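&lt;p&gt;For intuition, ROUGE-1 recall can be sketched in a few lines of plain Python (the sentences are illustrative; real evaluations should use an established package such as the one linked above):&lt;/p&gt;

```python
from collections import Counter

# ROUGE-1 recall: fraction of reference unigrams recovered by the hypothesis.
def rouge1_recall(reference, hypothesis):
    ref_counts = Counter(reference.split())
    hyp_counts = Counter(hypothesis.split())
    # clipped overlap: each reference word is matched at most as often as it occurs
    overlap = sum(min(c, hyp_counts[w]) for w, c in ref_counts.items())
    return overlap / sum(ref_counts.values())

r = rouge1_recall("the cat sat on the mat", "the cat is on the mat")
```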

&lt;h3&gt;
  
  
  &lt;strong&gt;Final Thoughts:&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Understanding which performance measure to use, and which is best for the problem at hand, helps you validate the right solution for the needs of the particular challenge.&lt;/p&gt;

&lt;p&gt;The challenge with NLP solutions lies in measuring their performance across various tasks. For most other machine learning tasks, performance is easier to measure because the cost function or evaluation criteria are well defined, giving a clear picture of what is to be evaluated.&lt;/p&gt;

&lt;p&gt;One more reason is that labels are well-defined in other tasks, whereas in NLP tasks the ground truth can vary a lot. Coming up with the best model depends on various factors, but the evaluation metric is an essential factor to consider depending on the nature of the task you are solving.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;References:&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://thegradient.pub/understanding-evaluation-metrics-for-language-models/"&gt;Evaluation Metrics for Language Modeling&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://towardsdatascience.com/evaluating-text-output-in-nlp-bleu-at-your-own-risk-e8609665a213"&gt;Evaluating Text Output in NLP: BLEU at your own risk&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/pdf/2006.14799.pdf"&gt;Evaluation of Text Generation: A Survey&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://aclanthology.org/2021.triton-1.6.pdf"&gt;Evaluation of Metrics Performance in Assessing Critical Translation Errors in Sentiment-oriented Text&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/abs/1904.09675"&gt;Evaluating Text Generation with BERT&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blog.paperspace.com/automated-metrics-for-evaluating-generated-text/"&gt;Automated metrics for evaluating the quality of text generation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>nlp</category>
      <category>machinelearning</category>
      <category>ai</category>
    </item>
    <item>
      <title>The Future of Customer Service: What You Need to Know About Conversational AI</title>
      <dc:creator>Anthony Mipawa</dc:creator>
      <pubDate>Fri, 09 Sep 2022 13:27:13 +0000</pubDate>
      <link>https://dev.to/neurotech_africa/the-future-of-customer-service-what-you-need-to-know-about-conversational-ai-33nb</link>
      <guid>https://dev.to/neurotech_africa/the-future-of-customer-service-what-you-need-to-know-about-conversational-ai-33nb</guid>
      <description>&lt;p&gt;This article was originally published on the &lt;a href="https://blog.neurotech.africa/the-future-of-customer-service-what-you-need-to-know-about-conversational-ai/"&gt;Neurotech Africa blog&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Today’s consumers are more informed, connected, and demanding than ever before. As a result, brands that fail to meet their high standards face an uphill battle. 74% of consumers will not recommend a brand again after a negative experience. Moreover, 90% of customers expect to be able to communicate directly with a company through chat or messaging as if they were friends. Conversational AI has the potential to revolutionize the customer service experience by making it more personal and accessible for end users. This blog post breaks down the whys and hows of conversational AI in customer service, so keep reading to learn more.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;What is Conversational AI?&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Conversational AI, powered by natural language processing, is the ability of machines to understand human language and respond accordingly. Natural language processing is key to implementing conversational interfaces: interfaces that allow people to communicate with computers through spoken language and written text as if they were having a conversation with another person. A conversational interface has two main parts: an automated system (e.g. an IVR or an SMS-based solution) that detects and responds to user inputs, and a natural language processor (NLP) that analyses and understands the user input. The NLP component transforms the input into a machine-readable format and then triggers an appropriate response from the system.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--raweIVdk--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/700/0%2A7E76P5SryLng4orn.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--raweIVdk--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/700/0%2A7E76P5SryLng4orn.jpg" alt="https://miro.medium.com/max/700/0*7E76P5SryLng4orn.jpg" width="700" height="466"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Why is Customer Service Important?&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Customer service is a key aspect of any customer-facing business. It can be the difference between capturing a new customer and losing an existing one. It’s no wonder that the customer experience is a top priority for brands. According to a recent study, 69% of customers would pay more for a better experience. That’s why so many companies are turning to customer service AI, which builds on conversational interfaces, a technology that has been around since the 1960s. More recently, it has become increasingly important in commerce, health care, transportation, and other fields.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;How will Conversational AI Change Customer Service?&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;The rise of human-machine communication will transform customer service in the following ways:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Increased accessibility&lt;/strong&gt;: Customer service will become more accessible to everyone thanks to the rise of AI-powered virtual assistants. Meanwhile, AI customer service agents will be able to handle more requests from more people simultaneously.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Better quality of service&lt;/strong&gt;: High-quality, personalized service delivered by AI agents will boost customer satisfaction and retention.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Better customer satisfaction&lt;/strong&gt;: Satisfied customers generate more revenue for businesses than unhappy customers. AI customer service agents can increase customer satisfaction across the board.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Improved customer retention&lt;/strong&gt;: Businesses can retain customers by providing an exceptional customer service experience. AI can help businesses do just that.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Improved AI-human collaboration:&lt;/strong&gt; Businesses will unlock new levels of productivity by bringing AI and human agents together.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Better customer retention through personalized messaging&lt;/strong&gt;: AI agents will be able to deliver highly personalized messages to customers.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Limitations of Conversational AI in Customer Service&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;While conversational AI is poised to revolutionize the customer service experience, there are some limitations that we must account for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Customer expectations:&lt;/strong&gt; Customers have high standards and will be disappointed if AI falls short of their expectations.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data privacy and security:&lt;/strong&gt; Businesses must protect the privacy of their customer’s data. AI poses a particular concern in this regard, as hackers can use AI to take over machines and systems.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cultural shifts in customer service:&lt;/strong&gt; AI may not be a good fit for every culture, and businesses may have to adjust their strategies accordingly.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Human resources:&lt;/strong&gt; The implementation of AI may mean fewer human agents, which may pose problems for businesses that depend on human customer service.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Technical limitations:&lt;/strong&gt; While the promise of AI is great, the technology is not yet advanced enough to meet all of our expectations.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Shifting customer service strategies:&lt;/strong&gt; Customer service strategies may shift in the coming years, rendering today’s AI technologies obsolete.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;How to Achieve Success with Conversational AI in Customer Service&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Success with conversational AI in customer service starts with a strategic plan for implementation. Companies should consider the following:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--LbDIrz6m--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/700/0%2AfBGttCdBm2R3xSn_.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--LbDIrz6m--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/700/0%2AfBGttCdBm2R3xSn_.jpg" alt="https://miro.medium.com/max/700/0*fBGttCdBm2R3xSn_.jpg" width="700" height="466"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Defining the customer experience:&lt;/strong&gt; Companies must define their customer experience strategy, including how AI agents fit into that strategy.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Building a strategy for AI:&lt;/strong&gt; Companies should decide what type of AI to implement and how that AI will work within their strategy.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hiring the right talent:&lt;/strong&gt; Companies must hire the right people to implement their AI strategy. This includes both AI agents and human agents that will collaborate with them.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Investing in the right technology:&lt;/strong&gt; Companies must choose the right technology that supports their AI strategy.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Testing and training:&lt;/strong&gt; Companies must ensure that AI works as intended before launching it to customers. They must also train their human and AI agents to work together.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Final Thoughts&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;The future of customer service is bright, but businesses must act soon to take advantage of the benefits of conversational AI. Companies must prepare by defining their strategy, investing in the right technology, and hiring the right talent. They must also consider the limitations of AI and have a plan for overcoming them. Finally, businesses must act quickly before the benefits of conversational AI are claimed by others.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--HDZ3Uxm8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/tuybc5sjf3hmy2nk2dyz.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--HDZ3Uxm8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/tuybc5sjf3hmy2nk2dyz.jpg" alt="Image description" width="390" height="240"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>nlp</category>
      <category>ai</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>Redefining Customer Engagement as Digital Bank</title>
      <dc:creator>Anthony Mipawa</dc:creator>
      <pubDate>Fri, 09 Sep 2022 13:14:53 +0000</pubDate>
      <link>https://dev.to/neurotech_africa/redefining-customer-engagement-as-digital-bank-2d72</link>
      <guid>https://dev.to/neurotech_africa/redefining-customer-engagement-as-digital-bank-2d72</guid>
      <description>&lt;p&gt;This article means a lot to digital banks on how they can use conversational Artificial intelligence to acquire, engage and retain customers.&lt;/p&gt;

&lt;p&gt;This article was originally published on the &lt;a href="https://blog.neurotech.africa/redefining-customer-engagement-as-digital-bank/"&gt;Neurotech Africa&lt;/a&gt; blog.&lt;/p&gt;

&lt;p&gt;Glad the title of this article brought you here.&lt;/p&gt;

&lt;p&gt;I will be sharing my understanding of the degree and depth to which people interact with digital banks powered by artificial intelligence. Without further ado, let me lay out the topic in simple words.&lt;/p&gt;

&lt;p&gt;But you already have something in mind, right?&lt;/p&gt;

&lt;p&gt;Customer engagement is the means by which a company creates a relationship with its customer base to foster brand loyalty and awareness. This can be accomplished via marketing campaigns, new content created for and posted to websites, and outreach via social media and mobile and wearable devices, among other methods.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Customer engagement is the ongoing interactions between company and customer, offered by the company, chosen by the customer.” by Paul Greenberg&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Wonderful, now we are clear on what the topic is all about. When you let your customers choose how they’d like to engage with you, you’ll be more likely to uncover the types of interactions they find valuable. By making it easier for customers to engage in ways they find valuable, you’ll strengthen their emotional investment in your digital bank.&lt;/p&gt;

&lt;p&gt;Disruptive innovation in financial services is growing massively, and the challenges facing this industry keep professionals brainstorming the right ways to keep pace using existing technologies. In modern banking, artificial intelligence has taken on an important and distinguished set of roles, from security automation and loan automation to customer engagement.&lt;/p&gt;

&lt;p&gt;Companies with well-defined data strategies have realized the great role played by this technology to bring value to their products. The journey of handling customers differs from one organization to another depending on culture, strategies, goals, and so on.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Why should digital banks care about redefining customer engagement?&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;In digital banking the customer is king: the business is built on the interaction between the bank and people, and the same customers one bank needs are also courted by its competitors. Redefining the interaction between your customers and the bank is important to providing the good customer service a successful business requires. With the advent of digital, the scope of good customer service has extended from providing timely and high-quality products and/or services to providing an experience that delivers value outside the original sale.&lt;/p&gt;

&lt;p&gt;As the banking world has become more crowded, there’s been an overwhelming focus on clicks, conversions, and acquisition costs.&lt;/p&gt;

&lt;p&gt;However, these acquisition strategies alone won’t be enough to grow your business sustainably. Finding ways to engage with your customers in between purchases strengthens their emotional connection to your brand, helping you retain the customers you already have while sustainably growing your business.&lt;/p&gt;

&lt;p&gt;In fact, 95 percent of the revenue banks generate relies on effective customer engagement, through interest on loans and fees associated with their services.&lt;/p&gt;

&lt;p&gt;According to &lt;a href="https://www.constellationr.com/blog-news/research-summary-why-live-engagement-marketing-supercharges-event-marketing"&gt;Constellation Research&lt;/a&gt; on customer engagement, companies that have improved engagement increase cross-sell by 22 percent, drive up-sell revenue from 13 percent to 51 percent, and also increase order sizes from 5 percent to 85 percent.&lt;/p&gt;

&lt;p&gt;The statistics show the impact of engaging your customers and how significant the revenue can increase.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;About conversational Artificial intelligence&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Conversational AI involves three concepts: artificial intelligence, human language, and automation. We can define it as the type of artificial intelligence that enables consumers to interact with computer applications the way they would with other humans.&lt;/p&gt;

&lt;p&gt;The best conversational AI solutions show remarkable support for businesses. Think about the last time you communicated with a business online and received the answer to your question within seconds, all with little effort. That is conversational AI doing powerful work seamlessly and efficiently. The bonus? A conversational AI solution knows when to notify and transfer the customer to a live agent, all within the same conversation stream, when the situation warrants it.&lt;/p&gt;
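&lt;p&gt;The live-agent handoff described above is commonly implemented as a confidence threshold: if the system is not confident enough in its best answer, it escalates instead of replying. A minimal sketch, with an invented scoring function and threshold value:&lt;/p&gt;

```python
# Toy sketch of live-agent handoff: reply only when the model's
# confidence clears a threshold; otherwise escalate to a human.
# The canned answers, scores, and threshold are illustrative.

HANDOFF_THRESHOLD = 0.6

def score_answer(question: str) -> tuple[str, float]:
    """Stand-in for a model returning (best answer, confidence)."""
    canned = {"opening hours": ("We are open 9am-5pm, Monday to Friday.", 0.95)}
    for key, (answer, confidence) in canned.items():
        if key in question.lower():
            return answer, confidence
    return "", 0.0

def handle(question: str) -> str:
    answer, confidence = score_answer(question)
    if confidence >= HANDOFF_THRESHOLD:
        return answer
    return "Let me transfer you to a live agent."

print(handle("What are your opening hours?"))
print(handle("My card was charged twice, please help"))
```

&lt;p&gt;Because the transfer happens inside the same conversation turn, the customer never has to restart or switch channels.&lt;/p&gt;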

&lt;h2&gt;
  
  
  &lt;strong&gt;Conversational AI in customer engagement for digital banks&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;The process of acquiring, engaging, and retaining customers can be boosted with technologies like conversational AI. The technology by itself does not cover the whole process; it supports specific parts of it, and other factors still have to be considered. Here are the use cases for digital banks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Increasing customer attraction through socials:-&lt;/strong&gt; making digital banks’ services easier to access through social platforms like WhatsApp and Telegram can deepen engagement and goes a long way in keeping customers engaged over time. Conversational AI makes this kind of engagement easier to handle by holding a natural conversation with your customers.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Manage payments and transactions:-&lt;/strong&gt; Regularly, people have to clear bills, pay businesses, shop online, or perform some other kind of online transaction. A conversational AI can help the user make and track these payments. Clearing payments can often be urgent and time-bound, and in such cases switching platforms to complete transactions can be inconvenient. But with an omnichannel conversational AI, your customers can make payments right where they are and avoid any delays!&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Recommendation of new services:-&lt;/strong&gt; with conversational AI, digital banks can simplify the process of selecting the right services or products for specific customers from their day-to-day interactions. Meeting user expectations is a great win, and this can improve engagement with your bank.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Addressing frequently asked questions (FAQs)&lt;/strong&gt;:- with conversational AI, handling repetitive questions becomes easier: instead of calling an agent or scrolling through a long website page, customers can type or speak and get an answer to a query instantly.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Leads generation:-&lt;/strong&gt; Conversational AI solutions are unmatched when it comes to interaction. They can interact with customers for the first time and understand their needs and the sentiments behind the conversation. This very human interaction can help digital banks acquire new customers and collect their details, which are then passed to the sales team to take the conversation forward.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Driving referral campaigns with existing customers:-&lt;/strong&gt; with conversational AI, driving engagement doesn’t have to be solely between your customers and your brand; it can also be between customers. Empowering your best customers to easily share your brand with their friends and family can not only help you acquire new customers but also engage the ones you have.&lt;/li&gt;
&lt;/ul&gt;
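&lt;p&gt;The “manage payments” case above boils down to slot filling: collect the pieces of information a payment needs over several conversation turns, then confirm. A toy sketch, with invented slot names, prompts, and confirmation wording:&lt;/p&gt;

```python
# Toy slot-filling sketch of the payments use case: gather the
# slots a payment needs from successive user replies, then confirm.
# Slot names, prompts, and the confirmation text are invented.

SLOTS = [
    ("reference", "What is the bill reference number?"),
    ("amount", "How much would you like to pay?"),
]

def run_payment_flow(user_replies):
    """Fill each slot from the user's replies, then ask to confirm."""
    filled = {}
    for (slot, _prompt), reply in zip(SLOTS, user_replies):
        # In a live chat, the prompt would be sent and the reply awaited.
        filled[slot] = reply.strip()
    return f"Paying {filled['amount']} for bill {filled['reference']}. Confirm?"

print(run_payment_flow(["INV-001", "5000"]))
```

&lt;p&gt;Because the whole flow lives in one conversation, the customer never has to switch to another app to finish the payment.&lt;/p&gt;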

&lt;p&gt;How does &lt;a href="https://www.neurotech.africa/#contact"&gt;Neurotech’s&lt;/a&gt; conversational AI solution redefine customer engagement for digital banks?&lt;/p&gt;

&lt;p&gt;We offer customer support solutions for businesses to engage customers with a personalized experience at every touchpoint, across any digital channel, through our internal engine called &lt;a href="https://sarufi.io/"&gt;Sarufi&lt;/a&gt;. We care about the memorable experiences that happen when customers are free to speak naturally. Our conversational solution (chatbots) understands customers and provides seamless customer support across multiple platforms, enabling you to offer a more personalized, contextual service, reduce call-center overload, and ensure reliable customer support 24/7. You can explore more from &lt;a href="https://blog.neurotech.africa/how-can-neurotech-transform-your-business-with-conversational-ai/"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--ILOrD12C--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/700/0%2AOr7TTBGbnJci0NPN.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--ILOrD12C--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/700/0%2AOr7TTBGbnJci0NPN.jpg" alt="https://miro.medium.com/max/700/0*Or7TTBGbnJci0NPN.jpg" width="700" height="455"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can reach out for a demo of our banking conversational AI solution &lt;a href="https://www.neurotech.africa/#contact"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Final thoughts:&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Don’t confuse technology with business strategy. Consider relying on your strategies, which can then be boosted with technology like artificial intelligence.&lt;/p&gt;

&lt;p&gt;Great customer experiences across every channel are an imperative that digital banks cannot ignore. While the availability of digital footprints has made it possible to deliver pronounced mobile and digital experiences, digital banks need to ensure that the customer at the physical branch is not deprived of the same seamless and immersive experience that the digital-native or millennial customer is accustomed to.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s---YafQApo--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/0hb7cuw37taf5cl3nn4a.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s---YafQApo--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/0hb7cuw37taf5cl3nn4a.jpg" alt="Image description" width="390" height="240"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>nlp</category>
      <category>datascience</category>
      <category>ai</category>
    </item>
    <item>
      <title>How is conversational AI impacting the finance industry?</title>
      <dc:creator>Anthony Mipawa</dc:creator>
      <pubDate>Tue, 09 Aug 2022 07:21:18 +0000</pubDate>
      <link>https://dev.to/neurotech_africa/how-is-conversational-ai-impacting-the-finance-industry-agh</link>
      <guid>https://dev.to/neurotech_africa/how-is-conversational-ai-impacting-the-finance-industry-agh</guid>
      <description>&lt;p&gt;This article was originally published in the &lt;a href="https://blog.neurotech.africa/how-is-conversational-ai-imapacting-the-finance-industry/"&gt;neurotech Africa&lt;/a&gt; blog.&lt;/p&gt;

&lt;p&gt;The evolution of technology continues to spread across multiple industries, and finance cannot be left off the list of industries experiencing immense transformation, because it is one of the most important segments of the economy.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;About the finance industry&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;The finance sector is wide and constitutes at least 20% of the global economy and the impact of this sector on economic growth is significant.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;According to the finance and development department of the International Monetary Fund, financial services are the processes by which consumers or businesses acquire financial goods. For example, a payment system provider offers a financial service when it accepts and transfers funds between payers and recipients. This includes accounts settled through credit and debit cards, checks, and electronic funds transfers.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In developing countries, fintech firms are gaining prominence, aided by the rise of digital public goods and currencies. Migrating to online and mobile services will remain a priority for financial firms on the way to a cashless economy, and financial services companies such as banks, tax and accounting services, and insurers will need to compete with emerging financial firms. Building solutions in-house can be the best option for large companies, but not for every company or every solution; for small to medium microfinance institutions, the best way to migrate is to outsource to companies with the talent to build solutions that meet the demands of the digital economy.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Conversational AI use cases in the Finance industry:&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Conversational AI is one of the essential boosts to the finance sector across sales, marketing, and customer service. Conversational AI solutions allow customer service to be managed smoothly, quickly, and efficiently; the essential advantage of this technology is that it acts as a listening channel and gives you a better understanding of your customers.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--KQVg9R6N--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/08/benefit01.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--KQVg9R6N--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/08/benefit01.png" alt="https://blog.neurotech.africa/content/images/2022/08/benefit01.png" width="828" height="406"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It is a collective way of understanding which products are performing better, how customers view your services, what they like and don’t like, and what feedback and suggestions to work on. All of these pieces of information have the potential to improve your business by transforming services in a personalized way, and recommending new services or products will help customers get better service according to what they actually use.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--IQDY0WGH--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/08/benefit0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--IQDY0WGH--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/08/benefit0.png" alt="https://blog.neurotech.africa/content/images/2022/08/benefit0.png" width="853" height="425"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In fact, conversational AI solutions help businesses to reduce operational costs by improving the efficiency of their service, minimizing human error, and resolving customer queries quicker.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--2VwkrEI6--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/08/benefit02.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--2VwkrEI6--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/08/benefit02.png" alt="https://blog.neurotech.africa/content/images/2022/08/benefit02.png" width="877" height="398"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Benefits of a conversational AI solution&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What are the use cases of conversational AI in the finance industry?&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Manage payments and transactions:- Regularly, people have to clear bills, pay businesses, shop online, or perform some other kind of online transaction. A conversational AI can help the user make and track these payments. Clearing payments can often be urgent and time-bound, and in such cases switching platforms to complete transactions can be inconvenient. But with an omnichannel conversational AI, your customers can make payments right where they are and avoid any delays!&lt;/li&gt;
&lt;li&gt;Leads generation:- Conversational AI solutions are unmatched when it comes to interaction. They can interact with customers for the first time and understand their needs and the sentiments behind the conversation. This very human interaction can help banks acquire new customers and collect their details, which are then passed to the sales team to take the conversation forward.&lt;/li&gt;
&lt;li&gt;Resolve common and repetitive inquiries:- Some repetitive activities are really boring, and there are questions that many of your users ask frequently, such as "How do I restore an unsuccessful transaction?", "What are the steps to follow to get a loan?", or "What is the status of my loan application?". Instead of customers going through a long list of frequently asked questions, a conversational AI solution can handle these with an instant reply.&lt;/li&gt;
&lt;li&gt;Easy document collection and sharing:- Assume your customer wants to apply for a new loan but keeps getting sent back by the bank each time because of new inconsistencies in verification. Very annoying, right? Nobody is happy in this situation, neither your customer nor you, and it is a pretty common scene in a bank, mostly because of a lack of knowledge and awareness on the customer’s front. Form filling, document collection, and verification are therefore common conversational AI use cases in banking and insurance.&lt;/li&gt;
&lt;li&gt;Locate nearest service providers:- This may include ATMs, agents, and branches. Assume you're new in a city and need to find a certain bank branch or an ATM: instead of asking multiple people, a conversational AI solution with the geolocation of all your outlets can help your customers navigate to the nearest service provider more easily.&lt;/li&gt;
&lt;li&gt;Feedback collection:- Customers are happy to give feedback and reviews when their hard-earned money and the services around it are taken care of by the bank or insurance company. These reviews can be collected by the banking conversational AI: instead of using long survey forms, banks can now integrate chatbots into their websites and apps for collecting feedback and reviews.&lt;/li&gt;
&lt;li&gt;Handling suspicious activities:- Security and data privacy concern every business, but for banks and financial organizations their reputation relies on them. Conversational AI solutions can effectively monitor and recognize the warning signs of fraudulent activity and issue alerts directly to the customer and the bank.&lt;/li&gt;
&lt;/ul&gt;
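&lt;p&gt;The FAQ use case above can be sketched as a nearest-question lookup. A production bot would use a trained NLU model; this toy version uses Python’s standard difflib, and both the FAQ entries and the canned answers are invented examples.&lt;/p&gt;

```python
# Toy FAQ matcher: find the known question closest to the user's
# query and reply instantly, or escalate if nothing matches well.
# The FAQ questions and answers below are invented examples.
import difflib

FAQS = {
    "how do i restore an unsuccessful transaction": "Open History, select the transaction, then tap Retry.",
    "what are the steps to follow to get a loan": "Open the Loans tab and submit an application.",
    "what is the status of my loan application": "Check Loans, then My Applications, for the live status.",
}

def answer(question: str) -> str:
    query = question.lower().strip("?! ")
    # get_close_matches returns the known questions most similar
    # to the query; cutoff rejects weak matches so we can escalate.
    match = difflib.get_close_matches(query, FAQS, n=1, cutoff=0.5)
    return FAQS[match[0]] if match else "Let me connect you to an agent."

print(answer("What is the status of my loan application?"))
```

&lt;p&gt;The cutoff is the same handoff idea as before: when no stored question is close enough, the bot escalates rather than guessing.&lt;/p&gt;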

&lt;h3&gt;
  
  
  &lt;strong&gt;Conversational AI solution by Neurotech&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.neurotech.africa/"&gt;Neurotech&lt;/a&gt; we are an AI company that builds &lt;a href="https://www.neurotech.africa/#services"&gt;solutions&lt;/a&gt; for businesses currently we do develop conversational AI for business needs which are controlled by our internal engine goes by the name &lt;a href="https://sarufi.io/#_"&gt;Sarufi&lt;/a&gt;. We offer custom solutions to fit various business needs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why is it useful?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Our conversational AI solutions can provide seamless customer support across multiple platforms, enabling you to offer a more personalized, contextual service to customers; you can explore more from &lt;a href="https://blog.neurotech.africa/how-can-neurotech-transform-your-business-with-conversational-ai/"&gt;here&lt;/a&gt;. Our solutions are developed to understand the contextual meaning of an interaction or conversation with the targeted audience. Our custom chatbots can be deployed on social media platforms like WhatsApp, Facebook, Instagram, and Telegram, depending on what our customers need.&lt;/p&gt;

&lt;p&gt;Currently, our solutions work in two languages only, Swahili and English. They can help your business with customer support and locating nearby service providers, save on labor costs by paying a smaller support team fair wages instead of stretching to support a large staff, increase revenue, and build opportunities with every customer interaction 😊.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Final thoughts&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Businesses should focus on business needs, not on technology; after all, the aim is to earn more and make sure things are moving in the right direction. Technology is advancing at a rapid pace, which can sometimes be confusing. Executives and business professionals may find their decisions lagging behind rapidly growing technologies like conversational AI, blockchain, and data analysis. This misunderstanding may lead to consuming non-actionable insights into your business operations, which can be stressful and overwhelming for your customers.&lt;/p&gt;

&lt;p&gt;To avoid that mistake, and to avoid overloading customers with unnecessary information, executives should work closely with technology experts to clearly understand what can be solved based on their needs, with only actionable insights. This will help avoid unnecessary costs and expectations around something that can’t work for your business.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.neurotech.africa/#contact"&gt;Get in touch&lt;/a&gt; with Neurotech’s team to discover how you can benefit from our conversational solutions to boost your business, we do consultations on best practices for using data insights to address your business needs.&lt;/p&gt;

&lt;p&gt;Find the needs of your business and build solutions for them. Don’t implement something simply because ABC company has implemented it; do what has potential for your business growth.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--kA9NRsI3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/r3rilv2ydy6hsgnk0rn6.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--kA9NRsI3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/r3rilv2ydy6hsgnk0rn6.jpg" alt="Image description" width="390" height="240"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>nlp</category>
      <category>datascience</category>
      <category>machinelearning</category>
      <category>ai</category>
    </item>
    <item>
      <title>The cause of a decision in Swahili social media sentiments</title>
      <dc:creator>Anthony Mipawa</dc:creator>
      <pubDate>Tue, 09 Aug 2022 07:07:00 +0000</pubDate>
      <link>https://dev.to/neurotech_africa/the-cause-of-a-decision-in-swahili-social-media-sentiments-2jhp</link>
      <guid>https://dev.to/neurotech_africa/the-cause-of-a-decision-in-swahili-social-media-sentiments-2jhp</guid>
      <description>&lt;p&gt;This article was originally published in the &lt;a href="https://blog.neurotech.africa/the-cause-of-the-decision-in-swahili-social-media-sentiment/"&gt;neurotech Africa&lt;/a&gt; blog.&lt;/p&gt;

&lt;p&gt;As a data professional, one of the best practices is to be accountable for the solutions at hand by understanding how the model you have built performs and makes its predictions. I came across Swahili social media sentiments, and since I'm a Swahili speaker I was curious to understand the cause of decisions in Swahili sentiment analysis using machine learning algorithms.&lt;/p&gt;

&lt;p&gt;In today's article, I will walk you through building a machine learning model for Swahili social media sentiment classification, with the interpretability of each prediction of our final model provided by &lt;a href="https://github.com/marcotcr/lime"&gt;Local Interpretable Model-Agnostic Explanations&lt;/a&gt; (LIME).&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Why Should I Trust You?” Explaining the Predictions of Any Classifier ~ Marco Tulio Ribeiro, Sameer Singh and Carlos Guestrin.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Kiswahili is a lingua franca spoken by up to 150 million people across East Africa. It is an official language in Tanzania, DRC, Kenya, and Uganda. On social media, Swahili speakers tend to express themselves in their own local dialect.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Building Swahili social media sentiment classifier&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Sentiment analysis relies on multiple word senses and cultural knowledge and can be influenced by age, gender, and socio-economic status. In today's task, I will be using a Twitter dataset originally hosted at the &lt;a href="https://zindi.africa/competitions/swahili-social-media-sentiment-analysis-challenge"&gt;Google Natural Language Processing hack series&lt;/a&gt; by Zindi Africa, with the aim of classifying whether a Swahili sentence carries positive, negative, or neutral sentiment.&lt;/p&gt;

&lt;p&gt;The dataset has 2263 observations and three columns: &lt;code&gt;id&lt;/code&gt;, the unique ID of each Swahili tweet; &lt;code&gt;tweets&lt;/code&gt;, the actual text of the tweet; and &lt;code&gt;labels&lt;/code&gt;, the sentiment label, either negative (-1), neutral (0), or positive (1).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--WtXDEqFl--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/07/sw-head.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--WtXDEqFl--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/07/sw-head.png" alt="https://blog.neurotech.africa/content/images/2022/07/sw-head.png" width="671" height="241"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;How about label distribution?&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--RzGttkrK--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/07/class-dist.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--RzGttkrK--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/07/class-dist.png" alt="https://blog.neurotech.africa/content/images/2022/07/class-dist.png" width="720" height="504"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Most of the tweets collected are neutral, which shows that our labels are imbalanced.&lt;/p&gt;
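&lt;p&gt;A quick, minimal sketch of that imbalance check (toy counts stand in for the real data; only the &lt;code&gt;Labels&lt;/code&gt; column is assumed):&lt;/p&gt;

```python
import pandas as pd

# Toy stand-in for the real dataset: 6 neutral, 3 positive, 3 negative tweets.
df = pd.DataFrame({"Labels": [0] * 6 + [1] * 3 + [-1] * 3})

# Raw counts and proportions per class reveal the imbalance at a glance.
counts = df["Labels"].value_counts()
shares = df["Labels"].value_counts(normalize=True)
print(counts.idxmax())  # majority class: 0 (neutral) in this toy example
```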

&lt;p&gt;Let's work on preprocessing the dataset to make everything ready for building our final machine learning model. This will involve a range of text-cleaning steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;removing non-alphanumeric text&lt;/li&gt;
&lt;li&gt;removing stopwords&lt;/li&gt;
&lt;li&gt;converting all tweets to lowercase&lt;/li&gt;
&lt;li&gt;removing punctuation, links, emojis, and white space&lt;/li&gt;
&lt;li&gt;tokenizing the text into words&lt;/li&gt;
&lt;li&gt;finally, appending the clean tweets to a new column named &lt;code&gt;clean_tweets&lt;/code&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Point to note: &lt;code&gt;nltk&lt;/code&gt; doesn't include Swahili stopwords, so you have to create your own list and apply it to the tweets. I created a &lt;a href="https://github.com/Neurotech-HQ/Cause-of-decision-in-Swahili-sentiments/blob/main/data/Common%20Swahili%20Stop-words.csv"&gt;CSV&lt;/a&gt; file with a couple of Swahili stopwords like na, kwa, kama, lakini, ya, take, etc., which I will apply here.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Stop words are a set of commonly used words in any language. For example, in English, “the”, “is” and “and”, would easily qualify as stop words. In NLP and text mining applications, stop words are used to eliminate unimportant words, allowing applications to focus on the important words instead.&lt;/p&gt;
&lt;/blockquote&gt;
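&lt;p&gt;A minimal sketch of the stop-word filtering step, using a tiny hand-picked subset of the words above (the linked CSV would supply the full set):&lt;/p&gt;

```python
# Tiny illustrative stop-word set; the real list comes from the CSV linked above.
sw_stopwords = {"na", "kwa", "kama", "lakini", "ya"}

def remove_stopwords(text):
    """Keep only the tokens that are not in the stop-word set."""
    return " ".join(w for w in text.split() if w not in sw_stopwords)

print(remove_stopwords("habari ya leo na karibu"))  # habari leo karibu
```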

&lt;p&gt;To make things smooth let's just use one function to perform all of the tasks.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;clean_tweets&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tweet&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="s"&gt;'''
        function to clean tweet column, make it ready for transformation and modeling
    '''&lt;/span&gt;
    &lt;span class="n"&gt;tweet&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tweet&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="c1"&gt;#convert text to lower-case
&lt;/span&gt;    &lt;span class="n"&gt;tweet&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sub&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'[‘’“”…,]'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;''&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tweet&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# remove punctuation
&lt;/span&gt;    &lt;span class="n"&gt;tweet&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sub&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'[()]'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;''&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tweet&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# remove parenthesis
&lt;/span&gt;    &lt;span class="n"&gt;tweet&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sub&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"[^a-zA-Z]"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s"&gt;" "&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;tweet&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;#remove numbers and keep text/alphabet only
&lt;/span&gt;    &lt;span class="n"&gt;tweet_list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nltk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;word_tokenize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tweet&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;clean_tweets&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;tweet_list&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;swstopwords&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="c1"&gt;# remove stop words
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;' '&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;clean_tweets&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

   &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'clean_tweets'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'Tweets'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nb"&gt;apply&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;clean_tweets&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;function to clean tweet column, make it ready for transformation and modeling&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Now the tweets are clean and ready for further processing.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--jLFHrwEu--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/07/sw-clean-tweets.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--jLFHrwEu--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/07/sw-clean-tweets.png" alt="https://blog.neurotech.africa/content/images/2022/07/sw-clean-tweets.png" width="880" height="176"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;datasets after applying the clean_tweet function&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Time to work on the analysis of the Swahili tweets by looking at polarity and subjectivity. But wait, what do polarity and subjectivity mean?&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Polarity is the expression that determines the sentimental aspect of an opinion. In textual data, sentiment polarity can be determined at the entity, sentence, or document level. The polarity can be positive, negative, or neutral, and is usually expressed as a float that ranges from 1 (entirely positive) to -1 (entirely negative).&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Sentiment polarity for an element defines the orientation of the expressed sentiment, i.e., it determines if the text expresses the positive, negative or neutral sentiment of the user about the entity in consideration.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Subjectivity is the measure of how factual the text is, ranging from 0 (pure fact) to 1 (pure opinion).&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I will be using &lt;code&gt;textblob&lt;/code&gt; to analyze tweets&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;polarity_score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tweet&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
  &lt;span class="s"&gt;'''
        This function takes in a text data and returns the polarity of the text
        Polarity is a float which lies in the range of [-1,1], where 1 means a positive statement, 0 means a neutral statement,
        and -1 means a negative statement
    '''&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;TextBlob&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tweet&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;sentiment&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;polarity&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;subjectivity_score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tweet&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
  &lt;span class="s"&gt;'''
      This function takes in text data and returns the subjectivity of the text.
      Subjectivity sentences generally refer to personal opinion,
      emotion or judgment whereas objective refers to factual information.
      Subjectivity is also a float which lies in the range of [0,1].
  '''&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;TextBlob&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tweet&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;sentiment&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subjectivity&lt;/span&gt;

  &lt;span class="c1"&gt;#apply above functions to the data
&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'polarity_score'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'clean_tweets'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nb"&gt;apply&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;polarity_score&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'subjectivity_score'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'clean_tweets'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nb"&gt;apply&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;subjectivity_score&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;polarity score&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Now let's try to aggregate the overall polarity and subjectivity of the entire dataset&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The overall polarity of the tweet data is 0.01&lt;/p&gt;

&lt;p&gt;The overall subjectivity of the tweet data is 0.03&lt;/p&gt;
&lt;/blockquote&gt;
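&lt;p&gt;Those overall figures come from a simple mean over the score columns; a toy sketch (the values here are illustrative, not the real scores):&lt;/p&gt;

```python
import pandas as pd

# Toy stand-in for the per-tweet scores computed above; the article applies
# the same .mean() aggregation to the real polarity/subjectivity columns.
df = pd.DataFrame({
    "polarity_score": [0.0, 0.05, -0.02, 0.01],
    "subjectivity_score": [0.0, 0.1, 0.02, 0.0],
})
overall_polarity = round(df["polarity_score"].mean(), 2)
overall_subjectivity = round(df["subjectivity_score"].mean(), 2)
print(overall_polarity, overall_subjectivity)  # 0.01 0.03
```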

&lt;p&gt;The near-zero overall polarity indicates that the tweets are fairly neutral, and the low subjectivity suggests they are mostly factual rather than opinionated.&lt;/p&gt;

&lt;p&gt;Let's try to visualize the polarity and subjectivity distributions of each class independently&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# visualization
&lt;/span&gt;&lt;span class="n"&gt;fig&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;make_subplots&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;subplot_titles&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Polarity Score Distribution-Negative"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Subjectivity Score Distribution-Negative"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                                                   &lt;span class="s"&gt;"Polarity Score Distribution-Neutral"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Subjectivity Score Distribution-Neutral"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                                                   &lt;span class="s"&gt;'Polarity Score Distribution-Positive'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s"&gt;'Subjectivity Score Distribution-Positive'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                    &lt;span class="n"&gt;x_title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"Score"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;y_title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'Frequency'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;fig&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_trace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;go&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Histogram&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'Labels'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s"&gt;'polarity_score'&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;
    &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;fig&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_trace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;go&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Histogram&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'Labels'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s"&gt;'subjectivity_score'&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;
    &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;fig&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_trace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;go&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Histogram&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'Labels'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s"&gt;'polarity_score'&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;
    &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;fig&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_trace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;go&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Histogram&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'Labels'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s"&gt;'subjectivity_score'&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;
    &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;fig&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_trace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;go&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Histogram&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'Labels'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s"&gt;'polarity_score'&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;
    &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;fig&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_trace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;go&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Histogram&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'Labels'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s"&gt;'subjectivity_score'&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;
    &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;fig&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;renderer&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"colab"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now here we go,&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--wEAJUAFT--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/07/newplot.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--wEAJUAFT--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/07/newplot.png" alt="https://blog.neurotech.africa/content/images/2022/07/newplot.png" width="880" height="349"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;distribution of each class on polarity and subjectivity&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In terms of subjectivity, all three classes look similar and no significant difference stands out, but the polarity of the negative class differs from the positive and neutral classes in terms of skewness.&lt;/p&gt;
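&lt;p&gt;One way to quantify that skewness difference is a per-class &lt;code&gt;skew()&lt;/code&gt;; a toy sketch assuming the &lt;code&gt;Labels&lt;/code&gt; and &lt;code&gt;polarity_score&lt;/code&gt; columns built earlier (values are illustrative):&lt;/p&gt;

```python
import pandas as pd

# Toy stand-in: a low outlier in the negative class and a high outlier in the
# positive class produce left and right skew respectively.
df = pd.DataFrame({
    "Labels": [-1, -1, -1, -1, 0, 0, 0, 0, 1, 1, 1, 1],
    "polarity_score": [-0.9, -0.1, -0.1, -0.1,
                       0.0, 0.1, -0.1, 0.0,
                       0.1, 0.1, 0.1, 0.9],
})
# Sample skewness of polarity within each label group.
per_class_skew = df.groupby("Labels")["polarity_score"].skew()
print(per_class_skew)
```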

&lt;p&gt;Let's try to understand the content by visualizing the most-used words across all classes, and later we can look at each class independently.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;word_freq&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'clean_tweets'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;expand&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;stack&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="n"&gt;value_counts&lt;/span&gt;&lt;span class="p"&gt;()).&lt;/span&gt;&lt;span class="n"&gt;reset_index&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;word_freq&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;word_freq&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rename&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;columns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;'index'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s"&gt;'Word'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s"&gt;'Count'&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="n"&gt;fig&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;px&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;bar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;word_freq&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'Word'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;word_freq&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'Count'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;fig&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;update_layout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;xaxis_title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"Word"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;yaxis_title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"Count"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"Top 20 most Frequent words in across entire tweet data"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;fig&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;renderer&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"colab"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--kSKlqh3U--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/07/newplot--1-.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--kSKlqh3U--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/07/newplot--1-.png" alt="https://blog.neurotech.africa/content/images/2022/07/newplot--1-.png" width="880" height="349"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;habari&lt;/code&gt;, &lt;code&gt;leo&lt;/code&gt;, &lt;code&gt;siku&lt;/code&gt;, and &lt;code&gt;namba&lt;/code&gt; are the most frequent words across the overall tweet contents.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Negative Tweets Word Frequency
&lt;/span&gt;&lt;span class="n"&gt;word_freq_neg&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'Labels'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s"&gt;'clean_tweets'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;expand&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;stack&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="n"&gt;value_counts&lt;/span&gt;&lt;span class="p"&gt;()).&lt;/span&gt;&lt;span class="n"&gt;reset_index&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;word_freq_neg&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;word_freq_neg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rename&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;columns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;'index'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s"&gt;'Word'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s"&gt;'Count'&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="c1"&gt;# Neutral Tweets Word Frequency
&lt;/span&gt;&lt;span class="n"&gt;word_freq_neut&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'Labels'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s"&gt;'clean_tweets'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;expand&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;stack&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="n"&gt;value_counts&lt;/span&gt;&lt;span class="p"&gt;()).&lt;/span&gt;&lt;span class="n"&gt;reset_index&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;word_freq_neut&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;word_freq_neut&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rename&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;columns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;'index'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s"&gt;'Word'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s"&gt;'Count'&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="c1"&gt;# Positive Tweets Word Frequency
&lt;/span&gt;&lt;span class="n"&gt;word_freq_pos&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'Labels'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s"&gt;'clean_tweets'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;expand&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;stack&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="n"&gt;value_counts&lt;/span&gt;&lt;span class="p"&gt;()).&lt;/span&gt;&lt;span class="n"&gt;reset_index&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;word_freq_pos&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;word_freq_pos&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rename&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;columns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;'index'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s"&gt;'Word'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s"&gt;'Count'&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="n"&gt;fig&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;make_subplots&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;subplot_titles&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Top 20 most frequent words-Negative"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Top 20 most frequent words-Neutral"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Top 20 most frequent words-Positive"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                    &lt;span class="n"&gt;x_title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"Word"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;y_title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'Count'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;fig&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_trace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;go&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Bar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;word_freq_neg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'Word'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;iloc&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;word_freq_neg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'Count'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;iloc&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;
    &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;fig&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_trace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;go&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Bar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;word_freq_neut&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'Word'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;iloc&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;word_freq_neut&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'Count'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;iloc&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;
    &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;fig&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_trace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;go&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Bar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;word_freq_pos&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'Word'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;iloc&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;word_freq_pos&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'Count'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;iloc&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;
    &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;fig&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;renderer&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"colab"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--svlRm5r6--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/07/newplot--2-.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--svlRm5r6--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/07/newplot--2-.png" alt="https://blog.neurotech.africa/content/images/2022/07/newplot--2-.png" width="880" height="349"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Across the negative class, the most frequent words are &lt;code&gt;watu&lt;/code&gt;, &lt;code&gt;leo&lt;/code&gt;, and &lt;code&gt;siku&lt;/code&gt;; across the neutral class, &lt;code&gt;habari&lt;/code&gt;, &lt;code&gt;kazi&lt;/code&gt;, and &lt;code&gt;mtu&lt;/code&gt;; and across the positive class, &lt;code&gt;habari&lt;/code&gt;, &lt;code&gt;leo&lt;/code&gt;, and &lt;code&gt;asante&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Let's prepare our final dataset for modeling by splitting it into two groups (train and test).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# data split
&lt;/span&gt;&lt;span class="n"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"clean_tweets"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"Labels"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;seed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;42&lt;/span&gt;

&lt;span class="n"&gt;x_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;x_test&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;y_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;y_test&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;train_test_split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;stratify&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;test_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;random_state&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;seed&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, I make our data pipeline ready for training on Swahili sentiments by defining &lt;code&gt;TfidfVectorizer&lt;/code&gt; as the vectorizer and &lt;code&gt;LogisticRegression&lt;/code&gt; as the algorithm for building our model. Using the initialized pipeline, I can train the classifier on the training set of tweets.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# instantiating model pipeline
&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;make_pipeline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;TfidfVectorizer&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="n"&gt;LogisticRegression&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# training model
&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;y_train&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Great! I have trained our classifier for Swahili social media sentiments, and now it's time to evaluate the performance of our model.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Classification Report"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\t&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"_"&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;45&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;classification_report&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;y_test&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x_test&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;target_names&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"Negative"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Neutral"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s"&gt;"Positive"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;According to the classification report, the performance is not very good: our model has only 60% accuracy.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--qZY5TYTc--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/07/classification-report.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--qZY5TYTc--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/07/classification-report.png" alt="https://blog.neurotech.africa/content/images/2022/07/classification-report.png" width="701" height="277"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Results Interpretability&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;It's time to understand what drives our classifier's decisions. We bring in LIME to help interpret each individual prediction of our model; for illustration, let me pick out three kinds of prediction (negative, neutral, and positive).&lt;/p&gt;

&lt;p&gt;The higher the interpretability of a machine learning model, the easier it is for someone to comprehend why certain decisions or predictions have been made. One model is more interpretable than another if its decisions are easier for a human to comprehend.&lt;/p&gt;

&lt;p&gt;I predict probabilities with the LogisticRegression classifier instead of hard 0-or-1 labels, simply because LIME requires a model that produces a probability score for each class in order to explain the cause of a decision.&lt;/p&gt;
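&lt;p&gt;Concretely, the difference is between &lt;code&gt;predict&lt;/code&gt; and &lt;code&gt;predict_proba&lt;/code&gt; on the pipeline. A minimal sketch, using a hypothetical three-tweet corpus rather than the article's data:&lt;/p&gt;

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy corpus standing in for the cleaned tweets
texts = ["habari njema sana", "siku mbaya leo", "mtu alikuwa kazini"]
labels = [1, -1, 0]  # 1=positive, -1=negative, 0=neutral

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

# predict() returns a hard label; predict_proba() returns one score
# per class (columns ordered as model.classes_, i.e. [-1, 0, 1]) --
# the probability scores LIME needs to explain a decision.
hard = model.predict(["habari njema"])[0]
proba = model.predict_proba(["habari njema"])[0]
print(hard, proba)  # the probabilities sum to 1
```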

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--JQHuQhFc--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/07/temp-lime-00.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--JQHuQhFc--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/07/temp-lime-00.png" alt="https://blog.neurotech.africa/content/images/2022/07/temp-lime-00.png" width="880" height="258"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here we go: the observation above shows that the probability of the positive class (0.47) is higher than that of the other classes, and the decision is driven by the words &lt;code&gt;serikali&lt;/code&gt;, &lt;code&gt;mwisho&lt;/code&gt;, and &lt;code&gt;vyema&lt;/code&gt;, which also appeared back in our earlier visualization of the most frequent words for the positive class.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--a08D6MGg--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/07/temp-lime-01.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--a08D6MGg--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/07/temp-lime-01.png" alt="https://blog.neurotech.africa/content/images/2022/07/temp-lime-01.png" width="880" height="234"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The above observation shows that the probability of the neutral class (0.72) is higher than that of the other two classes, and the decision comes from the words &lt;code&gt;walimu&lt;/code&gt;, &lt;code&gt;walikuwa&lt;/code&gt;, and &lt;code&gt;mwanzoni&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Dpx2p333--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/07/temp-lime02_.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Dpx2p333--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/07/temp-lime02_.png" alt="https://blog.neurotech.africa/content/images/2022/07/temp-lime02_.png" width="880" height="224"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The above observation shows that all three classes carry comparable weight, but due to the high weighting of the word &lt;code&gt;polisi&lt;/code&gt; the tweet is predicted as negative.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;How can companies benefit from customer sentiment analysis?&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Sentiment analysis helps a business understand its customers by giving an overview of what's good and what's lacking. Based on those sentiments, the business can improve its marketing and operations strategy.&lt;/p&gt;

&lt;p&gt;Deep insights from sentiment can capture what specifically people don't like about a service, product, or policy; after the business has taken steps to fix the issue or improve a process, it can also track how that has improved customer satisfaction. Insights from customer sentiments can also distinguish feedback that is merely frequent from feedback that actually influences satisfaction scores.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Final thoughts&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Understanding the cause of individual predictions from classifiers is important for data professionals. Having explanations lets you make an informed decision about how much to trust a prediction, or the model as a whole, and provides insights that can be used to improve the model.&lt;/p&gt;

&lt;p&gt;The complete code used in this article can be found on the GitHub &lt;a href="https://github.com/Neurotech-HQ/Cause-of-decision-in-Swahili-sentiments"&gt;repository&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--yjfnx6Ed--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/wml0uak86qdh133yfty2.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--yjfnx6Ed--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/wml0uak86qdh133yfty2.jpg" alt="Image description" width="390" height="240"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>nlp</category>
      <category>machinelearning</category>
      <category>datascience</category>
    </item>
    <item>
      <title>How conversational AI is transforming the Insurance industry</title>
      <dc:creator>Anthony Mipawa</dc:creator>
      <pubDate>Tue, 26 Jul 2022 05:51:00 +0000</pubDate>
      <link>https://dev.to/neurotech_africa/how-conversational-ai-is-transforming-the-insurance-industry-22kd</link>
      <guid>https://dev.to/neurotech_africa/how-conversational-ai-is-transforming-the-insurance-industry-22kd</guid>
      <description>&lt;p&gt;This article was originally published in the &lt;a href="https://blog.neurotech.africa/how-can-conversational-ai-impact-the-insurance-industry/" rel="noopener noreferrer"&gt;Neurotech Africa blog&lt;/a&gt; post&lt;/p&gt;

&lt;p&gt;Every day, technology continues to evolve from different angles. In this blog post I will explain how leveraging the power of conversational AI can make a difference in the insurance industry.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;About the Insurance sector:&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;The insurance sector is made up of companies that offer risk management in the form of insurance contracts. The basic concept of insurance is that one party, the insurer, will guarantee payment for an uncertain future event. Meanwhile, another party, the insured or the policyholder, pays a smaller premium to the insurer in exchange for that protection on that uncertain future occurrence.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;According to the 2020 Tanzania insurance report, “the Tanzania insurance sector is growing steadily, with 30 insurance companies and 112 insurance brokers currently active in the market (2014 TIRA data)”.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;These statistics show that the contribution of insurance to the national Gross Domestic Product remains very limited, leaving plenty of room for further growth.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Digital transformation in the Insurance industry&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://en.wikipedia.org/wiki/Digital_transformation" rel="noopener noreferrer"&gt;Digital transformation&lt;/a&gt; varies across multiple industries, but the worth truth is that 70% of digital transformation fails in the sense that they don’t meet their objectives, this is based on studies from International Data Group. The fun fact is that a company or an industry can’t be fully digital transformed at once but better be staged. May begin with system operational to employees to be aware of what transformation is capable of and how their contribution can improve the whole process of adopting digital transformation.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblog.neurotech.africa%2Fcontent%2Fimages%2F2022%2F07%2Fdigital-insurance.svg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblog.neurotech.africa%2Fcontent%2Fimages%2F2022%2F07%2Fdigital-insurance.svg" alt="https://blog.neurotech.africa/content/images/2022/07/digital-insurance.svg"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Source: tibco.com&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The insurance industry is among the oldest financial businesses in the world. In fact, the industry tends to stay traditional and is slow to change; however, new technology trends have been impacting the insurance marketplace, creating extreme competition. The most intense experience came during Covid-19, when insurance companies found themselves in the middle of the storm: operations were done remotely while, at the same time, insurers were fielding calls about changing coverage, answering questions about business interruption policies, and continuing to pay claims for life, health, and disability insurance.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;The need for accelerating digital transformation in the insurance industry&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Digital transformation can help the insurance industry solve some of its challenges and improve its business strategies. Let me highlight some of the potential benefits of accelerating digital transformation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Customer experience: spend enough time getting to know customers and figuring out what they want and how to respond to it. This part of the process should occur throughout the customer lifecycle, from prospecting until the moment of withdrawal.&lt;/li&gt;
&lt;li&gt;Value generation through data: being data-driven is essential for decision-making in insurance companies. Understanding how you want to use data to create value is important, and this understanding should reach everyone from executives to ordinary employees. By doing that, it becomes possible to determine the various uses of data.&lt;/li&gt;
&lt;li&gt;Ecosystem development: redesigning insurance strategies involves tasks like measuring, controlling, and assessing risks, all of which are being transformed by the digital environment, and the leading insurance market leaders are aware of this. Understanding the ecosystem means knowing how strategies can be applied to particular regions or branches depending on the scenario, rather than reusing them simply because they perform well around the city or elsewhere.&lt;/li&gt;
&lt;li&gt;Margin management: the digital transformation of the insurance business can only do one of two things: either reduce costs or increase them. Either way, it hinges on making the right decisions and then adopting new technologies to create business models based on those decisions.&lt;/li&gt;
&lt;li&gt;Multichannel strategy: using several channels means that your brand utilizes two or more marketing methods to share your content and messaging across several platforms. In simple terms, a multichannel strategy makes it easier for consumers to complete their transactions and interact with brands through the platforms that suit them best.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Conversational AI use-case in the insurance industry&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Managing internal operations: by automating and speeding up repetitive tasks, employees can focus on more complicated work and on developing their skills to improve operations.&lt;/li&gt;
&lt;li&gt;Customer awareness and education: conversational AI can bring customers closer through awareness and education on how the process works, its benefits, and the availability of offers, and it can compare and suggest the optimal policy, from multiple carriers, based on the customer's profile and inputs. It also supports engagement and interaction with customers through websites or social media platforms.&lt;/li&gt;
&lt;li&gt;Risk evaluation: leveraging conversational AI can improve how insurers handle overwhelming amounts of data so as to assess risks with high accuracy, gain better insights, customize plans, and make better decisions.&lt;/li&gt;
&lt;li&gt;Claims management: this involves claim processing and payment assistance. Conversational AI can be trained to address your customers' insurance claims, follow up with them on existing ones, and even automate payment processes according to customer preferences.&lt;/li&gt;
&lt;li&gt;Customer feedback and reviews: most customers share their feedback immediately after a service, and rarely afterwards. Most studies suggest that customers are more likely to respond over live chat than email, and that they feel more comfortable contacting a business by message than by call.&lt;/li&gt;
&lt;li&gt;Fraud prevention: insurance firms must take care of customer data privacy and security. Conversational AI is efficient at monitoring and detecting warning signs of fraudulent activity and can alert both the insurer and the customer.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;About Neurotech’s conversational AI solutions&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.neurotech.africa/" rel="noopener noreferrer"&gt;Neurotech&lt;/a&gt; we are an AI company that builds &lt;a href="https://www.neurotech.africa/#services" rel="noopener noreferrer"&gt;solutions&lt;/a&gt; for businesses currently we do develop conversational AI for business needs which are controlled by our internal engine goes by the name &lt;a href="https://sarufi.io/#_" rel="noopener noreferrer"&gt;Sarufi&lt;/a&gt;. We offer custom solutions to fit various business needs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why is it useful?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Our conversational AI solutions can provide seamless customer support across multiple platforms, enabling you to offer a more personalized, contextual service to customers; you can explore more &lt;a href="https://blog.neurotech.africa/how-can-neurotech-transform-your-business-with-conversational-ai/" rel="noopener noreferrer"&gt;here&lt;/a&gt;. Our solutions are developed to understand the contextual meaning of an interaction or conversation with the targeted audience. Our custom chatbots can be deployed on social media platforms like WhatsApp, Facebook, Instagram, and Telegram, depending on what our customers need.&lt;/p&gt;

&lt;p&gt;Currently, our solutions work in just two languages, Swahili and English. They can help your business with customer support, save on labor costs (paying a smaller support team fair wages instead of stretching to maintain a large staff), increase revenue, and build opportunities with every customer interaction 😊.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Final thoughts&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Is there a need to customize the customer experience in the insurance industry? Absolutely yes. Innovation with conversational AI, transforming entire cycles of processes such as claims, can help improve the awareness and education of a large population with less cost, money, and effort. Conversational AI will ensure faster settlements and optimized customer experiences, leading to improved risk evaluation through technologies like machine learning and artificial intelligence, supporting appropriate decisions, and ensuring personalized, customized customer service and experience.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblog.neurotech.africa%2Fcontent%2Fimages%2F2022%2F07%2Fthankyou-1.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblog.neurotech.africa%2Fcontent%2Fimages%2F2022%2F07%2Fthankyou-1.jpg" alt="https://blog.neurotech.africa/content/images/2022/07/thankyou-1.jpg"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>nlp</category>
      <category>ai</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>How can neurotech Africa transform your business with Conversational AI</title>
      <dc:creator>Anthony Mipawa</dc:creator>
      <pubDate>Sun, 17 Jul 2022 20:09:33 +0000</pubDate>
      <link>https://dev.to/neurotech_africa/how-can-neurotech-africa-transform-your-business-with-conversational-ai-13p4</link>
      <guid>https://dev.to/neurotech_africa/how-can-neurotech-africa-transform-your-business-with-conversational-ai-13p4</guid>
      <description>&lt;p&gt;This article was originally published on the &lt;a href="https://blog.neurotech.africa/how-can-neurotech-transform-your-business-with-conversational-ai/"&gt;neurotech&lt;/a&gt; blog post&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;How does Neurotech use Conversational AI?&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;We build custom conversational solutions to help businesses improve their customer experiences and services with our internal tool, which goes by the name &lt;a href="https://sarufi.io/"&gt;Sarufi&lt;/a&gt;. The best thing about our solution is that we use Natural Language Processing to provide a more conversational approach to customer service and a deeper understanding of the context of what people say, depending on the industry of the business.&lt;/p&gt;

&lt;p&gt;Our approach differs per use case, depending on the customer's specifications. With our conversational AI solutions, you gain incredibly intelligent control of your business's market without needing to invest the time, money, and resources to build and train the solutions with an internal team.&lt;/p&gt;

&lt;p&gt;Our solutions can be deployed across a range of platforms, starting with your website if you have one, and social platforms like WhatsApp, Telegram, Instagram, and Facebook Messenger, depending on where the client prefers to host their business. At &lt;a href="https://www.neurotech.africa/#"&gt;Neurotech&lt;/a&gt;, we offer full support for our solutions from our talented team to make sure that our clients' businesses benefit from what we offer. This helps ensure you’re getting the most value out of a conversational AI solution for your business.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;How can Neurotech transform your business?&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Our experts and the Sarufi engine provide fast and easy deployment of solutions. With our solution, we transform everything into a custom experience that will help your business save costs and increase revenue, understand what is missing from your product's service, and keep in touch with your customers.&lt;/p&gt;

&lt;p&gt;Through user interaction with your business, you will be able to learn what works and what doesn't without extreme effort.&lt;/p&gt;

&lt;p&gt;This is a more comfortable transformation simply because the service is available 24/7 without paying any additional employee costs, and customers are able to start conversations in their natural languages. This can be achieved through a couple of steps:-&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Our team of experts will work with the client to determine the requirements and the efficient way the conversational experience will be integrated into the business.&lt;/li&gt;
&lt;li&gt;Then, we build the solution, training models to act on the inputs provided by consumers, with continuous review of the results.&lt;/li&gt;
&lt;li&gt;Finally, we deploy the solution and offer support and consulting services to our clients.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;What are the benefits of conversational AI solutions?&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Personalize customer experience:-&lt;/strong&gt; Businesses can provide a more personalized experience to both existing customers and potential clients by using conversational AI (such as chatbots) to create a deeper level of interactivity and familiarity with the brand.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Improving marketing experience:-&lt;/strong&gt; Conversational AI helps improve marketing by creating a better experience for each customer, based on their needs and desires. By combining various functionalities into one convenient mode of communication, it makes engagement convenient for customers across multiple channels.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost-effective:-&lt;/strong&gt; Depending on their training, these systems reduce the need for human agents to answer customer queries. They are also proficient at handling multiple chats simultaneously and accurately.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enhance operations beyond borders:-&lt;/strong&gt; Expand your business's outreach to new potential customers.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Self-evolving platforms from experience:-&lt;/strong&gt; Conversational AI systems learn from their experiences. The more they interact with human beings, the more quickly their intelligence improves. They also learn from any existing data, such as customer databases and previous customer interactions. Clever conversational interfaces learn from their mistakes just as human beings do: they take note of which questions customers ask and which kinds of responses prove informative, and they try new approaches until they find one that is both effective and efficient.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Insights driven:-&lt;/strong&gt; Conversational solutions make effective use of analytics, which essentially helps in gleaning data and information from outside the organization. A mix of both internal and external data can be a great advantage.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Round-the-clock support:-&lt;/strong&gt; Conversational AI can provide real-time customer assistance, meaning businesses can address customer queries and complaints as they occur, significantly improving customer satisfaction. It provides 24/7 client support, so existing and potential customers can solve their problems after work hours and on weekends.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fast-paced communication:-&lt;/strong&gt; Conversational AI can help businesses provide quicker, more efficient customer service because chatbots can handle a large number of customer inquiries simultaneously. They can also route customers to the right agent, reducing wait times, and they work 24/7/365, a huge advantage for businesses. Properly programmed chatbots are always polite, and their behavior does not depend on mood.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Conversational AI solutions are not perceived as a human replacement but rather as human augmentation, making business easier to access both internally and externally.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.neurotech.africa/#contact"&gt;Get in touch&lt;/a&gt; with Neurotech’s team to discover how you can benefit from our conversational solutions to boost your business, the time is now to leverage benefits from Artificial intelligence Technology.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--6LodlsLa--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/07/thankyou.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--6LodlsLa--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/07/thankyou.jpg" alt="https://blog.neurotech.africa/content/images/2022/07/thankyou.jpg" width="390" height="240"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>nlp</category>
      <category>ai</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>Potentials of conversational AI for businesses</title>
      <dc:creator>Anthony Mipawa</dc:creator>
      <pubDate>Thu, 14 Jul 2022 20:15:30 +0000</pubDate>
      <link>https://dev.to/neurotech_africa/potentials-of-conversational-ai-for-businesses-5c79</link>
      <guid>https://dev.to/neurotech_africa/potentials-of-conversational-ai-for-businesses-5c79</guid>
      <description>&lt;p&gt;This article was originally published on the &lt;a href="https://blog.neurotech.africa/potentials-about-conversational-ai-for-businesses/"&gt;Neurotech&lt;/a&gt; blog post.&lt;/p&gt;

&lt;p&gt;Speaking about the evolution of technology, you can't skip mentioning artificial intelligence, simply because we interact with it in our day-to-day activities, often without even knowing it. If you own a smartphone, laptop, smartwatch, desktop, or any of many other devices, then yes, you interact with artificial intelligence or use it to accomplish tasks: Google Search, cameras, meeting platforms like Zoom, Google Sheets, Microsoft Cortana, Apple Siri, Google Assistant, Google Maps, Apple Maps, Google Lens, social media interaction, and so on. The scope of artificial intelligence has expanded and evolved over time, so it is time to think about how you can leverage this technology to improve your business's revenue. In this article, I will highlight the potential of conversational artificial intelligence for businesses.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;About conversational AI&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Conversational AI involves three concepts: artificial intelligence, human language, and automation. We can define it as the type of artificial intelligence that enables consumers to interact with computer applications the way they would with other humans. Conversational AI has primarily taken the form of advanced chatbots that, in contrast to conventional chatbots, combine natural language processing with traditional software, voice assistants, or interactive voice recognition systems to help customers through either a spoken or typed conversational interface.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--h6tfXXjE--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.appmaster.io/api/_files/ooRtJGmcZqEaSfTL468d8U/download/" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--h6tfXXjE--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.appmaster.io/api/_files/ooRtJGmcZqEaSfTL468d8U/download/" alt="conversational chatbot" width="880" height="495"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;How does conversational AI work?&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Conversational AI involves three main components:-&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Natural language processing&lt;/li&gt;
&lt;li&gt;Algorithm Training and Machine Learning&lt;/li&gt;
&lt;li&gt;Sentiment Analysis&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Through the conversational interface, a user provides input either through voice or text. Text-based input requires &lt;a href="https://en.wikipedia.org/wiki/Natural-language_understanding"&gt;NLU&lt;/a&gt; to understand the contextual meaning of the input, while speech-based input requires &lt;a href="https://usabilitygeek.com/automatic-speech-recognition-asr-software-an-introduction/"&gt;ASR&lt;/a&gt; to parse audio into language tokens that can be analyzed. The best option is then returned as a response to the user, depending on how the system has been trained and programmed to perform its tasks.&lt;/p&gt;
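&lt;p&gt;As a rough, hypothetical illustration of the text branch of this pipeline, the Python sketch below stands in for the NLU model with simple keyword matching and for the trained response model with a lookup table; the intent names, keywords, and replies are invented for illustration only, not part of any real system.&lt;/p&gt;

```python
# Toy sketch of a text-based conversational pipeline:
# input text -> "NLU" (here: keyword matching) -> best response.
# Intents, keywords, and replies are hypothetical examples.

INTENT_KEYWORDS = {
    "greeting": {"hello", "hi", "hey"},
    "balance": {"balance", "account"},
}

RESPONSES = {
    "greeting": "Hello! How can I help you today?",
    "balance": "Sure, let me look up your account balance.",
    "fallback": "Sorry, I didn't understand that.",
}

def understand(text: str) -> str:
    """Stand-in for an NLU model: map text to an intent label."""
    tokens = set(text.lower().split())
    for intent, keywords in INTENT_KEYWORDS.items():
        if tokens & keywords:  # any keyword present in the input
            return intent
    return "fallback"

def respond(text: str) -> str:
    """Pick the best response for the detected intent."""
    return RESPONSES[understand(text)]

print(respond("hi there"))
```

&lt;p&gt;A production system would swap the keyword matcher for a learned intent classifier and the table for dialogue management, but the input, understanding, response loop stays the same.&lt;/p&gt;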

&lt;h3&gt;
  
  
  &lt;strong&gt;Use cases of conversational AI:&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Customer service:&lt;/strong&gt; Conversational AI has made a huge impact in this industry through the automation of customer support activities, improving access and reducing costs: activities such as travel booking, FAQs, helping customers pay bills, and handling complaints. Conversational AI is also well suited to running surveys with your customers to understand how they feel about what you provide, or about a new product.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Retail industry:&lt;/strong&gt; From lead generation, lead qualification, and lead nurturing to 24/7 concierge service, faster order fulfillment, and amplified marketing messages, much can be done with conversational AI. In the retail field, things can go further with product recommendations for customers and multichannel integrations that follow your customers to the platforms they love, like WhatsApp, Facebook, Instagram, and TikTok. Last but not least, you become able to serve your customers anytime they want service.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Finance and banking industry:&lt;/strong&gt; Conversational AI has greatly helped banking and financial services reduce operating costs, automate functions, and improve the overall customer experience. It can access and analyze users’ spending patterns or bank accounts to help them decide how to spend their money, resolve customer queries by automating repetitive processes that typically take a human agent much longer, and, through AI bots, help with checking balances, detecting fraudulent transactions, and more.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Health industry:&lt;/strong&gt; Conversational AI is being used across the health industry to automate the scheduling of hospital appointments, helping patients manage their appointments and paperwork. In Cognitive Behavioral Therapy, conversational AI creates an immersive way to manage anxiety and other mental health issues.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sales and marketing industry:&lt;/strong&gt; Most consumers prefer self-service technology for shopping experiences over human sales agents. Conversational AI generates and nurtures leads, optimizes the sales cycle, and retrieves and updates data instantly while maintaining accuracy through conversational automation.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The use cases of conversational AI go beyond the few I just mentioned; you can explore more use cases from &lt;a href="https://www.chatcompose.com/conversationalai.html"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;What are the impacts of conversational AI on your business?&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Customer retention&lt;/li&gt;
&lt;li&gt;Customer personalization&lt;/li&gt;
&lt;li&gt;Get customer feedback in a seamless manner&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All of this helps boost revenue and reduce costs with more accurate and timely marketing efforts, while ensuring a seamless and pleasant experience for your customers. It is not enough to have a chatbot on your website as a customer support solution; businesses need intelligent chatbots with natural language processing and understanding for the best customer support experience.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;How are Neurotech’s conversational AI solutions best for your business?&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;At &lt;a href="https://www.neurotech.africa/"&gt;Neurotech&lt;/a&gt;, we are an AI company that builds &lt;a href="https://www.neurotech.africa/#services"&gt;solutions&lt;/a&gt; for businesses. We currently develop conversational AI for business needs, powered by our internal engine, which goes by the name &lt;a href="https://sarufi.io/#_"&gt;Sarufi&lt;/a&gt;. We offer custom solutions to fit various business needs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why is it useful?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Our conversational AI solutions are developed to understand the contextual meaning of an interaction or conversation with the targeted audience, and our custom chatbots can be deployed on social media platforms like WhatsApp, Facebook, Instagram, and Telegram, depending on what our customers need. Currently, our solutions work in two languages only, Swahili and English. They can help your business with customer support, increase revenue, and build opportunities with every customer interaction 😊.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Final Thoughts&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Now think of your business: it is not too late to get closer to your customers using conversational AI, automating workflows for FAQs and the repetitive tasks your staff has to chase. The truth is that conversational AI continues to evolve, making itself absolutely necessary to various industries such as finance, online marketing, healthcare, real estate, customer support, retail, and more. But don't worry, we have &lt;a href="https://sarufi.io/#_"&gt;Sarufi&lt;/a&gt; for your business needs; if you are interested in a discussion with &lt;a href="https://www.neurotech.africa/#contact"&gt;Neurotech&lt;/a&gt;, don't hesitate to reach out, and we will consult on what would be best for your business challenges.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--n8WK959P--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://t3.ftcdn.net/jpg/04/48/13/40/240_F_448134055_3ygLHIrGKhm176wZnoRvDaY1iqljzVdZ.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--n8WK959P--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://t3.ftcdn.net/jpg/04/48/13/40/240_F_448134055_3ygLHIrGKhm176wZnoRvDaY1iqljzVdZ.jpg" alt="thank you" width="390" height="240"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>nlp</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>GET STARTED WITH TOPIC MODELLING USING GENSIM IN NLP</title>
      <dc:creator>Anthony Mipawa</dc:creator>
      <pubDate>Wed, 25 May 2022 03:49:49 +0000</pubDate>
      <link>https://dev.to/neurotech_africa/get-started-with-topic-modelling-using-gensim-in-nlp-1b4g</link>
      <guid>https://dev.to/neurotech_africa/get-started-with-topic-modelling-using-gensim-in-nlp-1b4g</guid>
      <description>&lt;h3&gt;
  
  
  &lt;strong&gt;INTRODUCTION&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;As one application of NLP, &lt;strong&gt;topic modeling&lt;/strong&gt; is used in many business areas to easily scan a series of documents, find groups of words (topics) within them, and automatically &lt;strong&gt;cluster&lt;/strong&gt; word groupings, which saves time and reduces costs.&lt;/p&gt;

&lt;p&gt;In this article, you're going to learn how to implement topic modeling with &lt;strong&gt;Gensim&lt;/strong&gt;. I hope you will enjoy it; let's get started.&lt;/p&gt;

&lt;p&gt;Have you ever wondered how hard it is to process 100,000 documents that contain 1,000 words each? That means 100,000 * 1,000 = 100,000,000 words to process across all documents. This can be hard, time-consuming, and memory-consuming if done manually. That's where &lt;strong&gt;topic modeling&lt;/strong&gt; comes into play, as it allows you to achieve all of that programmatically, and that's what you're going to learn in this article.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;WHAT IS TOPIC MODELLING?&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Topic Modelling&lt;/strong&gt; can be defined as a statistical, unsupervised classification method that uses techniques such as the Latent Dirichlet Allocation (LDA) topic model to discover the topics present in a collection of documents and recognize the words that make up those topics. This saves time and provides an efficient way to understand documents based on their topics.&lt;/p&gt;

&lt;p&gt;Topic modeling has many &lt;strong&gt;applications&lt;/strong&gt;, ranging from sentiment analysis to recommendation systems. Consider the diagram below for other applications.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--lpGgpa3H--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/02/topic_modelling.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--lpGgpa3H--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/02/topic_modelling.png" alt="https://blog.neurotech.africa/content/images/2022/02/topic_modelling.png" width="773" height="578"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;applications of topic modeling -&lt;a href="https://medium.com/@fatmafatma/industrial-applications-of-topic-model-100e48a15ce4"&gt;source&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Now that you have a clear understanding of what topic modeling means, let's see how to achieve it with Gensim. But wait, someone there asked: what is &lt;strong&gt;Gensim&lt;/strong&gt;?&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;WHAT IS GENSIM?&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Well, Gensim is a short form: &lt;strong&gt;Gen&lt;/strong&gt; from &lt;em&gt;generate&lt;/em&gt; and &lt;strong&gt;sim&lt;/strong&gt; from &lt;em&gt;similar&lt;/em&gt;. It is an open-source, fully specialized Python library written by &lt;strong&gt;Radim Rehurek&lt;/strong&gt; to represent document vectors as efficiently (computer-wise) and painlessly (human-wise) as possible.&lt;/p&gt;

&lt;p&gt;Gensim is designed for topic modeling tasks, extracting semantic topics from documents. Gensim is your tool if you want to process large chunks of textual data; internally it uses algorithms like &lt;em&gt;Word2Vec&lt;/em&gt;, &lt;em&gt;FastText&lt;/em&gt;, &lt;em&gt;Latent Semantic Indexing&lt;/em&gt; (LSI, LSA, LsiModel), and &lt;strong&gt;Latent Dirichlet Allocation&lt;/strong&gt; (LDA, LdaModel).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--JX3sIa4w--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/02/gensim_history-1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--JX3sIa4w--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/02/gensim_history-1.png" alt="https://blog.neurotech.africa/content/images/2022/02/gensim_history-1.png" width="785" height="578"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Gensim history - source &lt;a href="https://radimrehurek.com/"&gt;Radim Rehurek&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;WHY GENSIM?&lt;/strong&gt;
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;It has efficient implementations of various vector space algorithms, as mentioned above.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;It also provides similarity queries for documents in their semantic representation.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;It provides I/O wrappers and converters around several popular data formats.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Gensim is fast because of its design of data access and its implementation of numerical processing.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;HOW TO USE GENSIM FOR TOPIC MODELLING IN NLP.&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;We have come to the meat of our article, so grab a cup of coffee and a fun playlist, and open a Jupyter Notebook ready for hands-on work. Let's start.&lt;/p&gt;

&lt;p&gt;In this section, we'll see a practical implementation of &lt;strong&gt;Gensim&lt;/strong&gt; for topic modelling using the &lt;strong&gt;Latent Dirichlet Allocation&lt;/strong&gt; (LDA) topic model.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Installation&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Here we install &lt;a href="https://radimrehurek.com/gensim/"&gt;the gensim library&lt;/a&gt; in a Jupyter notebook so we can use it in our project; consider the code below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;! pip install --upgrade gensim
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;strong&gt;Loading the datasets and importing important libraries&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;We are going to use an open-source dataset containing millions of news headlines sourced from the reputable Australian news agency ABC (Australian Broadcasting Corporation), agency site: (&lt;a href="http://www.abc.net.au/"&gt;ABC&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;The dataset contains two columns, publish_date and headline_text, with millions of headlines.&lt;/p&gt;

&lt;p&gt;Consider the below code for importing the required libraries.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#importing library
import pandas as pd #loading dataframe
import numpy as np  #for mathematical calculations

import matplotlib.pyplot as plt #visualization
import seaborn as sns #visualization
import zipfile #for extracting the zip file datasets

import gensim #library for topic modelling
from gensim.models import LdaMulticore
from gensim.utils import simple_preprocess
from gensim.parsing.preprocessing import STOPWORDS

import nltk   #natural language toolkit for preprocessing the text data

from nltk.stem import WordNetLemmatizer   #used to Lemmatize using WordNet's    #built-in morphy function.Returns the input word unchanged if it cannot #be found in WordNet.

from nltk.stem import SnowballStemmer #used for stemming in NLP
from nltk.stem.porter import * #porter stemming

from wordcloud import WordCloud #visualization techniques for #frequently repeated texts

nltk.download('wordnet')  #database of words in more than 200 #languages
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--IvcCub_b--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/02/capture1-1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--IvcCub_b--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/02/capture1-1.png" alt="https://blog.neurotech.africa/content/images/2022/02/capture1-1.png" width="631" height="89"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now we have managed to install &lt;strong&gt;Gensim&lt;/strong&gt; and import the supporting libraries into our working environment. Consider the code below to install the other libraries, if they are not yet installed in your Jupyter notebook:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;! pip install nltk       #installing nltk library
! pip install wordcloud  #installing wordcloud library
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After successfully importing the above libraries, let's now extract the zipped dataset into a folder named data_for_Topic_modelling, as shown in the code below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#Extracting the Datasets
with zipfile.ZipFile("./abcnews-date-text.csv.zip") as file_zip:
    file_zip.extractall("./data_for_Topic_modelling")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Nice, we have successfully unzipped the data using the zipfile library that we imported above, remember? Now let's load the data into a variable called &lt;em&gt;data&lt;/em&gt;. Since the dataset has millions of headlines, for this tutorial we'll take 500,000 rows of ABC headline news using Python slicing.&lt;/p&gt;

&lt;p&gt;Consider the code below for doing that:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#loading the data
#Here we have taken 500,000 rows of our dataset for implementation

data=pd.read_csv("./data_for_Topic_modelling/abcnews-date-text.csv")
data=data[:500000] #500000 rows taken
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;EDA and processing the data&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Nice, now that we have the data in our &lt;em&gt;data&lt;/em&gt; variable as shown above, we have to check what it looks like (EDA means exploratory data analysis), and we will do some processing to make sure the dataset is ready for the algorithm to be trained.&lt;/p&gt;

&lt;p&gt;In the code below, we use the &lt;em&gt;.head()&lt;/em&gt; function, which prints the first five rows of the dataset; this helps us understand the structure of the data and confirms that it consists of text.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#Checking the first columns
data.head()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--RXIYaOeS--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/02/capture2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--RXIYaOeS--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/02/capture2.png" alt="https://blog.neurotech.africa/content/images/2022/02/capture2.png" width="582" height="180"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here we check the shape (&lt;em&gt;dimensions&lt;/em&gt;) of the dataset and confirm we have the number of rows we selected when loading the data, so we are ready to go.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#checking the shape
#as you can see, there are 500000 rows of headline news, as we selected above.

data.shape
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--8pfvOxBX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/02/capture3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--8pfvOxBX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/02/capture3.png" alt="https://blog.neurotech.africa/content/images/2022/02/capture3.png" width="224" height="56"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now we delete the &lt;strong&gt;publish_date&lt;/strong&gt; column from the dataset using the &lt;strong&gt;del&lt;/strong&gt; keyword, as shown in the code below. &lt;strong&gt;Why?&lt;/strong&gt; Because we don't need it: our main focus is modeling the topics in the many headline news items, so we keep only the headline_text column.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#Deleting the publish data column since we want only headline_text #columns.

del data['publish_date']

#confirm deletion
data.head()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--k0SC3czz--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/02/capture5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--k0SC3czz--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/02/capture5.png" alt="https://blog.neurotech.africa/content/images/2022/02/capture5.png" width="442" height="172"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now we are left with our important column, headline_text, as seen above. Next we use a &lt;strong&gt;wordcloud&lt;/strong&gt; to look at the most frequently appearing words in the headline_text column; this builds a better understanding of the dataset. Consider the code below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#word cloud visualization for the headline_text
wc = WordCloud(
    background_color='black',
    max_words = 100,
    random_state = 42,
    max_font_size=110
    )
wc.generate(' '.join(data['headline_text']))
plt.figure(figsize=(50,7))
plt.imshow(wc)
plt.show()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
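&lt;p&gt;Under the hood, the frequencies a word cloud visualizes are just token counts. A minimal pure-Python sketch of that counting, using two invented headlines:&lt;/p&gt;

```python
# Count word frequencies across headlines, the same statistic a word
# cloud turns into font sizes; the two headlines here are hypothetical.
from collections import Counter

headlines = ["police probe fire", "fire crews contain fire"]
counts = Counter(" ".join(headlines).split())
print(counts.most_common(3))
```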



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--KaZqOS0c--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/02/c6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--KaZqOS0c--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/02/c6.png" alt="https://blog.neurotech.africa/content/images/2022/02/c6.png" width="761" height="376"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After visualizing the data, we process it, starting with &lt;strong&gt;stemming&lt;/strong&gt;: the process of reducing a word to its word &lt;strong&gt;stem&lt;/strong&gt; by stripping affixes (suffixes and prefixes), for example reducing &lt;strong&gt;cared&lt;/strong&gt; to &lt;strong&gt;care&lt;/strong&gt;. A closely related idea is reducing a word to its dictionary root, known as its &lt;strong&gt;lemma&lt;/strong&gt;. Here we use the &lt;em&gt;SnowballStemmer&lt;/em&gt; algorithm that we imported from &lt;strong&gt;&lt;a href="https://www.nltk.org/"&gt;nltk&lt;/a&gt;&lt;/strong&gt; earlier, remember?&lt;/p&gt;

&lt;p&gt;Consider the function below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#function to perform the pre processing steps on the  dataset
#stemming

stemmer = SnowballStemmer("english")
def lemmatize_stemming(text):
    return stemmer.stem(WordNetLemmatizer().lemmatize(text, pos='v'))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
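&lt;p&gt;SnowballStemmer applies a carefully tuned set of language-specific rules. To illustrate just the idea of suffix stripping, here is a deliberately naive sketch; it is not the real algorithm and mangles many words:&lt;/p&gt;

```python
# Naive suffix-stripping "stemmer", for illustration only; real stemmers
# like SnowballStemmer use far more careful rule sets.
def naive_stem(word):
    for suffix in ("ing", "ed", "es", "s"):
        # only strip when a reasonably long stem remains
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

print(naive_stem("played"), naive_stem("jumps"))
```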



&lt;p&gt;We then &lt;strong&gt;tokenize&lt;/strong&gt; and &lt;strong&gt;lemmatize&lt;/strong&gt;: each headline is split into a list of smaller words (tokenization), and each token is passed through the &lt;strong&gt;lemmatize_stemming&lt;/strong&gt; function above before being appended to the result list, as shown below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Tokenize and lemmatize

def preprocess(text):
    result=[]
    for token in gensim.utils.simple_preprocess(text) :
        if token not in gensim.parsing.preprocessing.STOPWORDS and len(token) &amp;gt; 3:
            #Apply lemmatize_stemming on the token, then add to the results list
            result.append(lemmatize_stemming(token))
    return result
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After the above steps, we simply apply the &lt;strong&gt;preprocess()&lt;/strong&gt; function to every headline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#calling the preprocess function above
processed_docs = data['headline_text'].map(preprocess)
processed_docs[:10]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--uX3zMw4d--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/02/image.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--uX3zMw4d--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/02/image.png" alt="https://blog.neurotech.africa/content/images/2022/02/image.png" width="507" height="267"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Next, create a dictionary from 'processed_docs' using gensim.corpora; it maps each word to an id and records how often the word appears in the training set. We name it &lt;strong&gt;dictionary&lt;/strong&gt;; consider the code below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; dictionary = gensim.corpora.Dictionary(processed_docs)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then, with the dictionary from the above code in hand, we implement the &lt;strong&gt;bag-of-words model&lt;/strong&gt; (BoW). &lt;strong&gt;BoW&lt;/strong&gt; is a representation of text that records only the occurrence counts of words within each document, discarding everything else such as word order and document structure. We also pick a sample document, &lt;strong&gt;document_num&lt;/strong&gt;, and assign it the value 4310.&lt;/p&gt;

&lt;p&gt;Note: you can pick any sample document of your own.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#Create the Bag-of-words(BoW) model for each document
document_num = 4310
bow_corpus = [dictionary.doc2bow(doc) for doc in processed_docs]

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
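&lt;p&gt;The counting that doc2bow performs can be sketched in pure Python. Token ids below are assigned in a simple first-seen order rather than Gensim's own ordering, so this is an illustration of the idea, not of Gensim's exact output:&lt;/p&gt;

```python
# Minimal bag-of-words: map tokens to integer ids and count occurrences
# per document; word order and structure are discarded, as described above.
from collections import Counter

vocab = {}  # token -> integer id, assigned in first-seen order

def to_bow(tokens):
    counts = Counter(tokens)
    for tok in counts:
        vocab.setdefault(tok, len(vocab))
    return sorted((vocab[t], n) for t, n in counts.items())

print(to_bow(["rain", "help", "dampen", "rain"]))
```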



&lt;p&gt;Checking the bag-of-words corpus for our sample document, which is a list of (token_id, token_count) pairs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;bow_corpus[document_num]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--yjH73YRP--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/02/image-2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--yjH73YRP--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/02/image-2.png" alt="https://blog.neurotech.africa/content/images/2022/02/image-2.png" width="662" height="39"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Modeling using LDA (Latent Dirichlet Allocation) from bags of words above&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We have come to the final part: using LDA, specifically &lt;strong&gt;LdaMulticore&lt;/strong&gt; from Gensim for faster, parallel training, to create our first topic model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#Modelling part
lda_model = gensim.models.LdaMulticore(bow_corpus,
                                       num_topics=10,
                                       id2word = dictionary,
                                       passes = 2,
                                       workers=2)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
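&lt;p&gt;To build intuition for what the trained model represents (a toy illustration, not Gensim's actual inference): each topic is a probability distribution over words, and a document leans toward the topics whose distributions best explain its words. The two topics and their word probabilities below are invented:&lt;/p&gt;

```python
import math

# Hypothetical topics, each a probability distribution over words.
topics = {
    "weather": {"rain": 0.4, "storm": 0.3, "flood": 0.3},
    "sport":   {"match": 0.5, "coach": 0.3, "win": 0.2},
}

def log_score(tokens, dist, eps=1e-9):
    # Log-likelihood of the tokens under one topic's word distribution;
    # eps handles words the topic has never seen.
    return sum(math.log(dist.get(t, eps)) for t in tokens)

doc = ["rain", "flood"]
best = max(topics, key=lambda name: log_score(doc, topics[name]))
print(best)
```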



&lt;p&gt;For each topic, we will explore the words occurring in that topic and their relative weight&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#Here it should give you a ten topics as example shown below image
for idx, topic in lda_model.print_topics(-1):
    print("Topic: {} \nWords: {}".format(idx, topic))
    print("\n")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--S8rq5VhD--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/02/image-6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--S8rq5VhD--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/02/image-6.png" alt="https://blog.neurotech.africa/content/images/2022/02/image-6.png" width="743" height="349"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let's finish with a performance check: using the LDA bag-of-words model, we see which topics the test document we created earlier belongs to. Consider the code below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Our test document is document number 4310
for index, score in sorted(lda_model[bow_corpus[document_num]], key=lambda tup: -1*tup[1]):
    print("\nScore: {}\t \nTopic: {}".format(score, lda_model.print_topic(index, 10)))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--06K8aiSZ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/02/image-5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--06K8aiSZ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/02/image-5.png" alt="https://blog.neurotech.africa/content/images/2022/02/image-5.png" width="880" height="328"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Congrats if you have managed to reach the end of this article! As you can see above, we have implemented a working LDA model from the Gensim library, using a bag-of-words representation to model the topics present in a corpus of 500,000 news headlines. The full code and datasets used can be found &lt;strong&gt;&lt;a href="https://github.com/sarufi-io/Topic-Modelling-With-Gensim"&gt;here&lt;/a&gt;&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Relationship Between Neurotech and Natural Language Processing(NLP)&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Natural Language Processing is a powerful tool for solving business challenges and supporting the digital transformation of companies and startups. &lt;a href="https://sarufi.io/"&gt;Sarufi&lt;/a&gt; and &lt;a href="https://www.neurotech.africa/#services"&gt;Neurotech&lt;/a&gt; offer high-standard conversational AI (chatbot) solutions. Improve your business experience today with NLP &lt;a href="https://sarufi.io/solutions"&gt;solutions&lt;/a&gt; built by experienced technical experts.&lt;/p&gt;

&lt;p&gt;Hope you find this article useful, sharing is caring.&lt;/p&gt;

</description>
      <category>nlp</category>
      <category>gensim</category>
      <category>topicmodeling</category>
      <category>getstarted</category>
    </item>
    <item>
      <title>Swahili Text Classification Using Transformers</title>
      <dc:creator>Anthony Mipawa</dc:creator>
      <pubDate>Tue, 24 May 2022 14:56:00 +0000</pubDate>
      <link>https://dev.to/neurotech_africa/swahili-text-classification-using-transformers-4850</link>
      <guid>https://dev.to/neurotech_africa/swahili-text-classification-using-transformers-4850</guid>
      <description>&lt;p&gt;I will not explain how transformer models work but to show their applications on multilingual use-cases, I will be using Swahili datasets to train &lt;a href="https://towardsdatascience.com/multilingual-transformers-ae917b36034d"&gt;multilingual&lt;/a&gt; transformers and you can access the data from &lt;a href="https://zindi.africa/competitions/swahili-news-classification/data"&gt;here&lt;/a&gt; to understand the problem collected to solve.&lt;/p&gt;

&lt;p&gt;I assume that the reader has prior knowledge of classical Machine learning and Deep learning.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;What are transformers?&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Transformers are transfer-learning deep learning models, trained on large datasets, that perform a variety of Natural Language Processing tasks such as text classification, question answering, machine translation, speech recognition, and so on.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;According to Wikipedia, a Transformer is a deep learning model that adopts the mechanism of self-attention, differentially weighting the significance of each part of the input data. Transformers are used primarily in the fields of Natural Language Processing (NLP) and Computer Vision (CV).&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Training deep learning models on small datasets makes &lt;a href="https://machinelearningmastery.com/overfitting-and-underfitting-with-machine-learning-algorithms/#:~:text=Overfitting%20in%20Machine%20Learning&amp;amp;text=Overfitting%20happens%20when%20a%20model,as%20concepts%20by%20the%20model."&gt;overfitting&lt;/a&gt; likely: because deep learning models recognize complex patterns using a large number of parameters, they require large datasets to perform and generalize well on the problem at hand.&lt;/p&gt;

&lt;p&gt;The advantage of transformers is that they are trained on a large corpus of data from &lt;a href="https://www.wikipedia.org/"&gt;Wikipedia&lt;/a&gt; and various book collections, which makes things smooth when you want to apply the model to a new set of data for your specific problem. Here are some techniques I know for working on challenges whose datasets are not in English, such as Arabic, Swahili, or German sentiments:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Translate your data (data in your own language) into English, then solve the challenge using English-trained models. This depends on the nature of the problem you're trying to solve and on whether there is a good model to perform the translation task.&lt;/li&gt;
&lt;li&gt;Augment your training data (data in your own language). Take a large English dataset, translate it into your own language to match the nature of the challenge you're solving, combine it with your small native-language dataset, and then fine-tune the transformer models.&lt;/li&gt;
&lt;li&gt;The last technique is to retrain the large (transformer) models on data in your own language, commonly known as &lt;strong&gt;&lt;a href="https://en.wikipedia.org/wiki/Transfer_learning"&gt;Transfer Learning&lt;/a&gt; in Natural Language Processing&lt;/strong&gt;; this is the technique I will be working with today.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Let's dive into the main topic of this article: training a transformer model for Swahili news classification. Since transformers are large, we make the task simpler by selecting a wrapper to work with. If you are comfortable with &lt;a href="https://pytorch.org/"&gt;PyTorch&lt;/a&gt;, you can use &lt;a href="https://www.pytorchlightning.ai/"&gt;PyTorch Lightning&lt;/a&gt;, a wrapper for high-performance AI research, but today let's go with &lt;a href="https://pypi.org/project/ktrain/"&gt;ktrain&lt;/a&gt;, a Python library built on &lt;a href="https://www.tensorflow.org/"&gt;TensorFlow&lt;/a&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;ktrain is a lightweight wrapper for the deep learning library TensorFlow Keras (and other libraries) to help build, train, and deploy neural networks and other machine learning models. Inspired by ML framework extensions like fastai and ludwig, ktrain is designed to make deep learning and AI more accessible and easier to apply for both newcomers and experienced practitioners. With only a few lines of code, ktrain allows you to easily and quickly.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If you want to learn more about ktrain, start &lt;a href="https://towardsdatascience.com/ktrain-a-lightweight-wrapper-for-keras-to-help-train-neural-networks-82851ba889c"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I recommend using &lt;a href="https://colab.research.google.com/"&gt;Google Colab&lt;/a&gt; or a &lt;a href="https://www.kaggle.com/code"&gt;Kaggle kernel&lt;/a&gt;; they keep setup simple, and the computational power these platforms provide makes things go smoothly.&lt;/p&gt;

&lt;p&gt;The first task is to update pip and install ktrain in your working environment. This can be done by running the few commands below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# update pip package &amp;amp; install ktrain&lt;/span&gt;

&lt;span class="o"&gt;!&lt;/span&gt;pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-U&lt;/span&gt; pip
&lt;span class="o"&gt;!&lt;/span&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;ktrain
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then import all the required libraries for preprocessing text data and other computational purposes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;pandas&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;seaborn&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;sns&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;matplotlib.pyplot&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;nltk&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;word_tokenize&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;string&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;re&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;warnings&lt;/span&gt;
&lt;span class="n"&gt;warnings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;filterwarnings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"ignore"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It's time to load our dataset and inspect what it looks like and which features it contains.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# loading datasets from data dir
&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read_csv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"data/Train.csv"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Total Records
&lt;/span&gt;&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Total Records: "&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="c1"&gt;# preview data from top
&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;head&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--wNPzxf41--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/02/dt-inspect.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--wNPzxf41--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/02/dt-inspect.png" alt="https://blog.neurotech.africa/content/images/2022/02/dt-inspect.png" width="880" height="209"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;data preview in first 5 rows&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You can see the dataset has 3 columns: ID, a unique identifier for each news item; content, which contains the news text; and category, which holds the label of each news item. Let's visualize the target column to understand the distribution of the news classes in the dataset.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Let's visualize the Label distiributions using seaborn
&lt;/span&gt;&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;figure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;figsize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;sns&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;countplot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'category'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"DISTRIBUTION OF LABELS"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--rOweiZI9--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/02/classes.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--rOweiZI9--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/02/classes.png" alt="https://blog.neurotech.africa/content/images/2022/02/classes.png" width="880" height="334"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Labels distribution&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You can see the dataset we are working with contains 5 classes (labels) and is not well balanced: the majority of the news items are &lt;code&gt;kitaifa&lt;/code&gt;, while &lt;code&gt;kimataifa&lt;/code&gt; and &lt;code&gt;Burudani&lt;/code&gt; are minorities.&lt;/p&gt;

&lt;p&gt;Before fitting our transformer, we first clean the text: removing punctuation, removing digits, converting to lower case, removing unnecessary white space, removing emojis, removing stopwords, tokenization, and so on. I created the &lt;code&gt;clean_text&lt;/code&gt; function to perform these tasks.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# function to clean text
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;clean_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sentence&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="s"&gt;'''
        function to clean content column, make it ready for transformation and modeling
    '''&lt;/span&gt;
    &lt;span class="n"&gt;sentence&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sentence&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;                &lt;span class="c1"&gt;#convert text to lower-case
&lt;/span&gt;    &lt;span class="n"&gt;sentence&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sub&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'â€˜'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s"&gt;''&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;sentence&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;    &lt;span class="c1"&gt;# remove the text â€˜ which appears to occur flequently
&lt;/span&gt;    &lt;span class="n"&gt;sentence&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sub&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'[‘’“”…,]'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;''&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sentence&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# remove punctuation
&lt;/span&gt;    &lt;span class="n"&gt;sentence&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sub&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'[()]'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;''&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sentence&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;#remove parentheses
&lt;/span&gt;    &lt;span class="n"&gt;sentence&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sub&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"[^a-zA-Z]"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s"&gt;" "&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;sentence&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;#remove numbers and keep text/alphabet only
&lt;/span&gt;    &lt;span class="n"&gt;sentence&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;word_tokenize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sentence&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;      &lt;span class="c1"&gt;# remove repeated characters (tanzaniaaaaaaaa to tanzania)
&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;' '&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sentence&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
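&lt;p&gt;For a feel of what this cleaning does, here is a stdlib-only sketch of the same steps (lowercasing, keeping letters only, collapsing whitespace). The article's version additionally tokenizes with nltk's word_tokenize, and the input sentence below is made up:&lt;/p&gt;

```python
import re

def clean_text_sketch(sentence):
    sentence = sentence.lower()                  # convert to lower case
    sentence = re.sub(r"[^a-z]", " ", sentence)  # keep alphabet only
    return " ".join(sentence.split())            # collapse extra spaces

print(clean_text_sketch("Timu ya Taifa, imeshinda 3-0!"))
```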



&lt;p&gt;Then it is time to apply &lt;code&gt;clean_text&lt;/code&gt; to the content column:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Applying our clean_text function on contents
&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'content'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'content'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nb"&gt;apply&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;clean_text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;head&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--YcMl-hvE--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/02/clean_text.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--YcMl-hvE--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/02/clean_text.png" alt="https://blog.neurotech.africa/content/images/2022/02/clean_text.png" width="880" height="193"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now you can notice the difference in the &lt;code&gt;content&lt;/code&gt; of our datasets.&lt;/p&gt;

&lt;p&gt;Then let's split datasets into training and validation sets, training set will be used to train our model and the validation set for model validation of its performance on new data.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="s"&gt;'category'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'content'&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
&lt;span class="n"&gt;SEED&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2020&lt;/span&gt;
&lt;span class="n"&gt;df_train&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;frac&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.85&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;random_state&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;SEED&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;df_test&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;drop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df_train&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df_train&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df_test&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;85% of the dataset will be used to train the model and 15% to validate it, so we can check whether it generalizes well and evaluate its performance. Don't forget to set a seed value: it gives predictable, repeatable results on every run, whereas without a seed we get different random numbers at every invocation.&lt;/p&gt;
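&lt;p&gt;The reproducibility point can be seen with Python's own random module; the sampling below stands in for the random_state parameter pandas uses:&lt;/p&gt;

```python
import random

# Same seed, same "random" sample every time.
random.seed(2020)
first = random.sample(range(100), 5)
random.seed(2020)
second = random.sample(range(100), 5)
print(first == second)  # the seed makes the draw repeatable
```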

&lt;p&gt;With the Transformer API in ktrain, we can select any &lt;a href="https://huggingface.co/"&gt;Hugging Face&lt;/a&gt; transformer model appropriate for our data. Since we are dealing with Swahili, we will use &lt;strong&gt;&lt;a href="https://huggingface.co/bert-base-multilingual-cased"&gt;multilingual BERT&lt;/a&gt;&lt;/strong&gt;, which is what ktrain's alternative text_classifier API normally uses for non-English datasets. You can opt for any other multilingual transformer model, though.&lt;/p&gt;

&lt;p&gt;Let's import ktrain and set some common parameters for our model. The important thing is to specify which transformer model you are going to use and to make sure it is compatible with the problem you're solving. We set our transformer model to &lt;strong&gt;&lt;a href="https://huggingface.co/bert-base-multilingual-uncased"&gt;bert-base-multilingual-uncased&lt;/a&gt;&lt;/strong&gt; and parameters such as:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;MAXLEN&lt;/strong&gt; specifies that only the first 128 words of each news item are considered. Adjust this to the computational power of your machine: setting it high means covering more of each document, and if your machine cannot handle that much computation it will cry with a &lt;code&gt;Resource exhausted error&lt;/code&gt;, so take this into consideration.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;batch_size&lt;/strong&gt;, the number of training examples used in one iteration; let's use 32.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;learning_rate&lt;/strong&gt;, the amount by which the weights are updated during training; let's use a learning rate of 5e-5. You can adjust it to see how your model behaves, but a small value is recommended.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;epochs&lt;/strong&gt;, the number of complete passes the algorithm makes over the entire training dataset; let's use 3 epochs for now so that training only takes a few minutes.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;ktrain&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;ktrain&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;

&lt;span class="c1"&gt;# selecting transformer to use
&lt;/span&gt;&lt;span class="n"&gt;MODEL_NAME&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;'bert-base-multilingual-uncased'&lt;/span&gt;

&lt;span class="c1"&gt;# Common parameters
&lt;/span&gt;&lt;span class="n"&gt;MAXLEN&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;128&lt;/span&gt;
&lt;span class="n"&gt;batch_size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;32&lt;/span&gt;
&lt;span class="n"&gt;learning_rate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;5e-5&lt;/span&gt;
&lt;span class="n"&gt;epochs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With the parameters set, it's time to train our first transformer on the Swahili dataset. We specify the training and validation sets so the preprocessor can work on the text, then fit the model so it can learn from the data. The process can take a couple of minutes to complete, depending on the computational power available.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Transformer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;MODEL_NAME&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;maxlen&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;MAXLEN&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;trn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;preprocess_train&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df_train&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;df_train&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;val&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;preprocess_test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df_test&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;df_test&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get_classifier&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;learner&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ktrain&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get_learner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;train_data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;trn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;val_data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;learner&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;learning_rate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;epochs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Congrats, the task of training a transformer on the Swahili dataset is done. Now it's time to test the performance of the Swahili model we have trained.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--TI6RHMGQ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/02/trainu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--TI6RHMGQ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/02/trainu.png" alt="https://blog.neurotech.africa/content/images/2022/02/trainu.png" width="880" height="115"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Our trained model reaches an accuracy of 88.40% on the training set and 84.35% on the validation set.&lt;/strong&gt;&lt;/p&gt;
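&lt;p&gt;Beyond the two accuracy numbers, ktrain can break performance down per category. This is a sketch that must run in the same session, reusing the &lt;code&gt;learner&lt;/code&gt; and preprocessor &lt;code&gt;t&lt;/code&gt; from the training step:&lt;/p&gt;

```python
# Print a per-class precision/recall/F1 report and a confusion matrix
# for the validation set
learner.validate(class_names=t.get_classes())

# Show the validation examples with the highest loss, which helps
# spot mislabeled or ambiguous news items
learner.view_top_losses(n=3, preproc=t)
```

A per-class report matters here because news categories are rarely balanced; overall accuracy can hide a category the model handles poorly.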

&lt;p&gt;I copied the first news item from the &lt;code&gt;Train.csv&lt;/code&gt; file to see how the Swahili model handles it, and it makes the right classification. Because that sentence is long, you can check it in the &lt;a href="https://github.com/sarufi-io/Swahili-sentiment-Analysis-using-transformers/tree/main/notebook"&gt;notebook&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Let's try predicting on short content that falls within the categories of the dataset used to train the model (Kitaifa, Michezo, Biashara, Kimataifa, Burudani) and see what the output will be.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Swahili = "Simba SC ni timu bora kwa misimu miwili iliyopita katika ligi kuu ya Tanzania"&lt;/p&gt;

&lt;p&gt;English = "Simba S.C are the best team for the last two seasons in the Tanzanian Premier League "&lt;/p&gt;
&lt;/blockquote&gt;
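&lt;p&gt;Making a prediction like this is a two-step sketch in ktrain: wrap the trained model and its preprocessor in a predictor, then call &lt;code&gt;predict&lt;/code&gt;. It assumes the &lt;code&gt;learner&lt;/code&gt; and &lt;code&gt;t&lt;/code&gt; objects from the training step are still in scope; the save path is just an example name:&lt;/p&gt;

```python
# Bundle the trained model with the text preprocessor
predictor = ktrain.get_predictor(learner.model, preproc=t)

swahili_text = ("Simba SC ni timu bora kwa misimu miwili "
                "iliyopita katika ligi kuu ya Tanzania")

# Returns one of the five category labels, e.g. "michezo" for sports content
predictor.predict(swahili_text)

# Persist the predictor so it can be reloaded later without retraining
predictor.save('swahili_news_predictor')
# reloaded = ktrain.load_predictor('swahili_news_predictor')
```

Saving the predictor keeps the preprocessing and the model weights together, so at serving time you only pass in raw text.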

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Uu5oEQRL--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/02/mic.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Uu5oEQRL--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/02/mic.png" alt="https://blog.neurotech.africa/content/images/2022/02/mic.png" width="880" height="119"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Correct predictions from the Swahili model&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The content of this text is indeed about sports (Michezo). And that is how you can train your own transformer model for Swahili news classification.&lt;/p&gt;

&lt;p&gt;If you want to access the full code used in this article, &lt;strong&gt;&lt;a href="https://github.com/sarufi-io/Swahili-sentiment-Analysis-using-transformers"&gt;here&lt;/a&gt;&lt;/strong&gt; you go.&lt;/p&gt;

&lt;p&gt;Thank you. I hope you enjoyed and learned a lot from this article; feel free to share it with others.&lt;/p&gt;

</description>
      <category>nlp</category>
      <category>transformers</category>
      <category>swahili</category>
    </item>
    <item>
      <title>MISTAKES TO AVOID WHEN IMPLEMENTING CHATBOTS FOR YOUR BUSINESS</title>
      <dc:creator>Anthony Mipawa</dc:creator>
      <pubDate>Tue, 24 May 2022 07:31:19 +0000</pubDate>
      <link>https://dev.to/neurotech_africa/mistakes-to-avoid-when-implementing-chatbots-for-your-business-13ec</link>
      <guid>https://dev.to/neurotech_africa/mistakes-to-avoid-when-implementing-chatbots-for-your-business-13ec</guid>
      <description>&lt;p&gt;Had you taken some time to think about which kind of &lt;a href="https://en.wikipedia.org/wiki/Chatbot"&gt;chatbot&lt;/a&gt; is best for your business?&lt;/p&gt;

&lt;p&gt;Not every business requires the same &lt;a href="https://en.wikipedia.org/wiki/Chatbot"&gt;chatbot&lt;/a&gt; implementation. When you want to upgrade the services of your business or company with a chatbot, don't just call in developers to build one. Think of it as recruiting another worker: you must know what tasks that new worker is capable of doing and what value it can add to the product or business. Be specific, do some research, ask questions, and list all the weaknesses you want to solve with chatbot technology, then share those insights with the development team so they can build something that truly serves your business. If you don't understand how a chatbot can serve your business, &lt;a href="https://blog.neurotech.africa/understand-how-chatbots-can-take-your-business-to-the-next-level/"&gt;here&lt;/a&gt; you go.&lt;/p&gt;

&lt;p&gt;Nowadays &lt;a href="https://en.wikipedia.org/wiki/Chatbot"&gt;chatbots&lt;/a&gt; are common tools for driving business, and they provide a high impact when their implementation meets the demands of the specific business, but that is not true in every case. Some implementations are ineffective for various reasons, so today I will take you through the mistakes that can reduce the effectiveness of your chatbot.&lt;/p&gt;

&lt;p&gt;But you may ask yourself: why chatbots?&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--0wX_2hvI--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/02/istockphoto-1252494221-612x612.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--0wX_2hvI--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/02/istockphoto-1252494221-612x612.jpg" alt="https://blog.neurotech.africa/content/images/2022/02/istockphoto-1252494221-612x612.jpg" width="612" height="408"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://blog.neurotech.africa/understand-how-chatbots-can-take-your-business-to-the-next-level/"&gt;Here&lt;/a&gt; is the answer&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;What is a chatbot?&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;A &lt;strong&gt;chatbot&lt;/strong&gt; is a computer program that simulates human conversation through voice commands, text chats, or both. You can say it is the output interface of &lt;a href="https://en.wikipedia.org/wiki/Natural_language_processing"&gt;Natural Language Processing&lt;/a&gt;. Chatbot, short for &lt;strong&gt;chatterbot&lt;/strong&gt;, is an artificial intelligence (AI) feature that can be embedded and used through any major messaging application. Well-known chatbots include &lt;a href="https://www.apple.com/siri/"&gt;Siri&lt;/a&gt;, &lt;a href="https://assistant.google.com/"&gt;Google Assistant&lt;/a&gt;, and &lt;a href="https://developer.amazon.com/en-US/alexa"&gt;Alexa&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--8QzE2PWP--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/02/cf.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--8QzE2PWP--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.neurotech.africa/content/images/2022/02/cf.webp" alt="https://blog.neurotech.africa/content/images/2022/02/cf.webp" width="880" height="469"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Mistakes to avoid when implementing a chatbot for your business&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Let's look at a couple of mistakes to avoid when designing and implementing chatbots for a business:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;No value for the customers:&lt;/strong&gt; people put a chatbot out there just to save cost and effort for the business while forgetting about customer care. They don't look at it from the customer's standpoint, where it might actually increase effort and add friction. As you look to save costs, remember that your customers are valuable assets.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No attention to customer preferences:&lt;/strong&gt; some people don't want to deal with a chatbot, or don't want a chat at that particular point in the customer journey. Be sure to offer alternative options for those customers to avoid losing them along the way.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Not understanding your own target group:&lt;/strong&gt; the way a chatbot communicates is a representation of the company it works for. Making sure to address your audience the right way will make or break the sale. Further, a chatbot is meant to be a quicker customer support solution. Keep the interaction &lt;strong&gt;simple&lt;/strong&gt;, &lt;strong&gt;fast&lt;/strong&gt;, and &lt;strong&gt;focused.&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lacking conversational direction:&lt;/strong&gt; the conversational flow of chatbot interactions is crucial. Conversational dead ends are frustrating for your customers and often lead to the interaction being dropped. The more clarity your chatbot can provide, the easier customers can get the help and experience they are looking for. So track the flow and be creative to keep your customers' attention in your hands.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Shallow or non-existent personality&lt;/strong&gt;: a remarkable brand has a distinctive personality that people can relate to. A bland, boring chatbot can do the job, but it will never excite your customers or make them come back just for the experience they enjoyed so much.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No strategic focus:&lt;/strong&gt; a strategic approach calls for a deep understanding of the space you're entering. You must be very clear about how chatbots work and what benefits they can provide to your customers and business. Also, take into consideration the language most of your customers speak, then build something that fits the context and adds value to your product or business.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Too much typing, not enough clicking:&lt;/strong&gt; this is an important principle of user interface design. As humans, we don't like interacting with systems that require a lot of typed input, so design an interface that minimizes typing and relies on clicking instead.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;There are many things to consider, but I have shared some of the mistakes I experienced myself and heard users complain about. You can also add considerations such as "How about matching the colors of the specific brand?" or "Consider a chatbot able to access data and learn from it", and so on.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Relation Between Chatbot and Neurotech&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;At Neurotech we are building &lt;a href="https://sarufi.io/"&gt;Sarufi&lt;/a&gt;, a Swahili conversational API that helps solve customer problems in businesses. With &lt;a href="https://sarufi.io/"&gt;Sarufi&lt;/a&gt;, you can easily build conversational AI or chatbots to communicate with customers. It is a no-code chatbot builder that provides all the solutions you need to build and deploy a chatbot for your business!&lt;/p&gt;

&lt;p&gt;Thank you for making it to the end of this informative article, and don't forget that sharing is caring.&lt;/p&gt;

</description>
      <category>business</category>
      <category>chatbot</category>
      <category>python</category>
      <category>nlp</category>
    </item>
  </channel>
</rss>
