<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: TAQİ EDDİNE EL MAMOUNİ</title>
    <description>The latest articles on DEV Community by TAQİ EDDİNE EL MAMOUNİ (@taqiddin).</description>
    <link>https://dev.to/taqiddin</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3268560%2F8713d2e7-c6f3-4554-9c61-34568d45b7e4.jpg</url>
      <title>DEV Community: TAQİ EDDİNE EL MAMOUNİ</title>
      <link>https://dev.to/taqiddin</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/taqiddin"/>
    <language>en</language>
    <item>
      <title>🧠 NLP: From Tokenization to Vectorization (with Practical Insights)</title>
      <dc:creator>TAQİ EDDİNE EL MAMOUNİ</dc:creator>
      <pubDate>Thu, 19 Jun 2025 20:01:47 +0000</pubDate>
      <link>https://dev.to/taqiddin/nlp-from-tokenization-to-vectorization-with-practical-insights-2he1</link>
      <guid>https://dev.to/taqiddin/nlp-from-tokenization-to-vectorization-with-practical-insights-2he1</guid>
      <description>&lt;p&gt;Natural Language Processing (NLP) bridges the gap between human language and machine intelligence. In this blog, we’ll explore foundational steps like tokenization, stemming, lemmatization, vectorization, and modern tools like Transformers. Whether you're just starting or want a refresher, this is your guide to transforming raw text into machine-readable format.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;🔤 Tokenization
Tokenization breaks text into smaller units, called tokens, which can be words, subwords, or sentences.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;🔹 Word Tokenization&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Input: "Natural Language Processing"
Tokens: ["Natural", "Language", "Processing"]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;🔹 Sentence Tokenization&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Input: "NLP is fascinating. It has endless applications!"
Tokens: ["NLP is fascinating.", "It has endless applications!"]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
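In practice you would call nltk.word_tokenize / nltk.sent_tokenize or use spaCy; as a minimal dependency-free sketch, the two examples above can be reproduced with regular expressions (the naive sentence splitter below would break on abbreviations like "Dr."):

```python
import re

def word_tokenize(text):
    # Words stay whole; punctuation marks become their own tokens.
    return re.findall(r"\w+|[^\w\s]", text)

def sent_tokenize(text):
    # Naive splitter: break after ., ! or ? followed by whitespace.
    return re.split(r"(?<=[.!?])\s+", text.strip())

print(word_tokenize("Natural Language Processing"))
# ['Natural', 'Language', 'Processing']
print(sent_tokenize("NLP is fascinating. It has endless applications!"))
# ['NLP is fascinating.', 'It has endless applications!']
```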



&lt;ol start="2"&gt;
&lt;li&gt;✂️ Stemming
Stemming reduces words to their root by stripping prefixes/suffixes — but it may not always produce real words.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Words: "running", "runs", "runner"
Stems: "run", "run", "runner"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;🟢 Use Case: Fast text indexing and search systems.&lt;/p&gt;
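To see why stems need not be real words, here is a toy stemmer with two hand-picked rules (strip a common suffix, then undouble a trailing consonant). It reproduces the example above; real stemmers such as NLTK's PorterStemmer apply dozens of ordered rules:

```python
def simple_stem(word):
    # Toy stemmer: strip one common suffix, then undouble a trailing
    # consonant ("runn" becomes "run"). Purely illustrative.
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            stem = word[: -len(suffix)]
            if len(stem) >= 3 and stem[-1] == stem[-2] and stem[-1] not in "aeiou":
                stem = stem[:-1]
            return stem
    return word

print([simple_stem(w) for w in ["running", "runs", "runner"]])
# ['run', 'run', 'runner']
```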

&lt;ol start="3"&gt;
&lt;li&gt;🧬 Lemmatization
Lemmatization brings words to their proper dictionary root (lemma) using morphological analysis.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Words: "running", "ran", "runs"
Lemmas: "run", "run", "run"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;🟢 Use Case: Sentiment analysis, text classification.&lt;/p&gt;
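Unlike stemming, lemmatization can map irregular forms such as "ran" to "run", which no suffix rule recovers. A dictionary-lookup sketch of the idea; real tools like NLTK's WordNetLemmatizer or spaCy use full morphological dictionaries plus the word's POS tag:

```python
# Tiny hand-made lemma dictionary, for illustration only.
LEMMAS = {"running": "run", "ran": "run", "runs": "run",
          "better": "good", "mice": "mouse"}

def lemmatize(word):
    # Look the word up; fall back to the word itself if unknown.
    return LEMMAS.get(word.lower(), word)

print([lemmatize(w) for w in ["running", "ran", "runs"]])
# ['run', 'run', 'run']
```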

&lt;ol start="4"&gt;
&lt;li&gt;🛑 Stop Word Removal
Stop words are common words like “the”, “is”, “and” that are usually removed before analysis.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Input: "AI is transforming the world."
Output: "AI transforming world"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
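A sketch with a tiny hand-picked stop list (real lists, e.g. NLTK's English set, run to a few hundred entries, and whether to remove stop words at all depends on the task):

```python
STOP_WORDS = {"the", "is", "a", "an", "and", "of", "to", "in", "it"}

def remove_stop_words(text):
    # Strip sentence punctuation so "world." matches "world", then filter.
    tokens = [t.strip(".,!?") for t in text.split()]
    return " ".join(t for t in tokens if t.lower() not in STOP_WORDS)

print(remove_stop_words("AI is transforming the world."))
# "AI transforming world"
```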



&lt;ol start="5"&gt;
&lt;li&gt;🏷️ Part-of-Speech (POS) Tagging
This tags each word with its grammatical role: noun, verb, adjective, etc.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Input: "AI transforms industries."
Output: [('AI', 'NNP'), ('transforms', 'VBZ'), ('industries', 'NNS')]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="6"&gt;
&lt;li&gt;🔢 Text Normalization (Often Skipped but Important!)
Before further processing, normalize the text:&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Lowercasing&lt;/p&gt;

&lt;p&gt;Removing punctuation/numbers&lt;/p&gt;

&lt;p&gt;Removing extra spaces&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import re
text = "AI is Changing the WORLD! 2025."
clean = re.sub(r"[^a-zA-Z\s]", "", text.lower())
# Result: "ai is changing the world"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="7"&gt;
&lt;li&gt;🔠 TF-IDF (Vectorization)
TF-IDF weighs words by importance across documents.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from sklearn.feature_extraction.text import TfidfVectorizer

docs = ["AI is the future", "AI transforms industries"]
tfidf = TfidfVectorizer()
matrix = tfidf.fit_transform(docs)

print(tfidf.get_feature_names_out())
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
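The weighting itself is easy to compute by hand. A sketch of the classic unsmoothed formula, term frequency multiplied by log(N/df); note that scikit-learn smooths the IDF and L2-normalizes each row, so its numbers differ slightly:

```python
import math

# Classic TF-IDF: frequency in this document (TF) times rarity across
# the corpus (IDF = log(N / document frequency)).
docs = [["ai", "is", "the", "future"], ["ai", "transforms", "industries"]]
N = len(docs)

def tf_idf(term, doc):
    tf = doc.count(term) / len(doc)
    df = sum(term in d for d in docs)
    return tf * math.log(N / df)

print(tf_idf("ai", docs[0]))       # 0.0: "ai" appears in every document
print(tf_idf("future", docs[0]))   # positive: "future" is distinctive
```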



&lt;ol start="8"&gt;
&lt;li&gt;🌐 Word Embeddings (Word2Vec, GloVe, FastText)
These convert words to dense vectors with semantic meaning.&lt;/li&gt;
&lt;/ol&gt;

&lt;table&gt;
&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Model&lt;/th&gt;&lt;th&gt;Description&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;Word2Vec&lt;/td&gt;&lt;td&gt;Learns word vectors from surrounding context windows&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;GloVe&lt;/td&gt;&lt;td&gt;Combines local context with global co-occurrence statistics&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;FastText&lt;/td&gt;&lt;td&gt;Captures subword information (e.g., prefixes and suffixes)&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
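What "dense vectors with semantic meaning" buys you is a usable notion of distance: related words end up close together. A sketch using made-up 3-dimensional vectors (real embeddings have hundreds of dimensions; the values below are invented for illustration):

```python
import math

def cosine(u, v):
    # Cosine similarity: the cosine of the angle between two vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# Hypothetical embeddings: "king" and "queen" should be near each other.
king  = [0.9, 0.8, 0.1]
queen = [0.85, 0.82, 0.15]
apple = [0.1, 0.2, 0.9]

print(cosine(king, queen) > cosine(king, apple))  # True
```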

&lt;ol start="9"&gt;
&lt;li&gt;🤖 Transformers (BERT, RoBERTa, GPT)
Modern NLP uses transformer-based models that understand context much better than traditional methods.
&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from transformers import pipeline
clf = pipeline("sentiment-analysis")
print(clf("I love NLP and transformers!"))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;🟢 Use Cases: Sentiment analysis, question answering, summarization, translation, etc.&lt;/p&gt;

&lt;p&gt;🔧 10. Build a Simple NLP Pipeline (Practical Example)&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from sklearn.pipeline import Pipeline
from sklearn.linear_model import LogisticRegression
from sklearn.feature_extraction.text import TfidfVectorizer

model = Pipeline([
    ('tfidf', TfidfVectorizer(stop_words='english')),
    ('clf', LogisticRegression())
])

X = ["I love this product", "This is terrible"]
y = [1, 0]

model.fit(X, y)
print(model.predict(["Awesome experience"]))  # e.g. [1]; with only two training samples, treat this as a toy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;🚀 Where to Go From Here?&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Topic&lt;/th&gt;&lt;th&gt;Description&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;🔍 NER&lt;/td&gt;&lt;td&gt;Recognize names, organizations, locations&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;🧩 Dependency Parsing&lt;/td&gt;&lt;td&gt;Understand how words relate&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;🏷️ Text Classification&lt;/td&gt;&lt;td&gt;Categorize emails, reviews, etc.&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;📚 Topic Modeling&lt;/td&gt;&lt;td&gt;Discover themes in documents&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;🤖 Transformers&lt;/td&gt;&lt;td&gt;BERT, GPT for deep understanding&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;📝 Summarization&lt;/td&gt;&lt;td&gt;Shorten long documents&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;💬 Chatbots&lt;/td&gt;&lt;td&gt;Build intelligent assistants&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;🛠 NLP Project&lt;/td&gt;&lt;td&gt;Use spaCy, NLTK, or HuggingFace to combine all steps&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

</description>
      <category>webdev</category>
      <category>nlp</category>
      <category>machinelearning</category>
      <category>deeplearning</category>
    </item>
    <item>
      <title>Medical Chatbot</title>
      <dc:creator>TAQİ EDDİNE EL MAMOUNİ</dc:creator>
      <pubDate>Thu, 19 Jun 2025 10:11:06 +0000</pubDate>
      <link>https://dev.to/taqiddin/medical-chat-bot-3k50</link>
      <guid>https://dev.to/taqiddin/medical-chat-bot-3k50</guid>
      <description>&lt;p&gt;🚀 Launching Our AI-Powered Turkish Health Support Chatbot! 🇹🇷💬&lt;br&gt;
In regions where healthcare access is limited, we built a 24/7 Turkish-language chatbot that provides users with fast, reliable answers to basic health-related questions using cutting-edge LLM and NLP technologies.&lt;/p&gt;

&lt;p&gt;🧠 🔹 Project Overview:&lt;br&gt;
Users can ask natural questions like: “I have a headache” or “I feel nauseous”, and the bot replies with possible causes and suggestions.&lt;br&gt;
Designed for native Turkish speakers and optimized to improve health literacy and reduce unnecessary hospital visits.&lt;br&gt;
Accessible, real-time, and developed with a focus on public benefit.&lt;/p&gt;

&lt;p&gt;📊 🔹 Dataset Information:&lt;br&gt;
Format: CSV file with 15,000 question-answer pairs.&lt;br&gt;
Source: Translated from SQuAD (Stanford Question Answering Dataset).&lt;br&gt;
Sample:&lt;br&gt;
 Q: “How can I relieve a headache?”&lt;br&gt;
 A: “Rest, drink plenty of water, and take painkillers if needed.”&lt;/p&gt;

&lt;p&gt;🧹 🔹 Data Preprocessing Steps:&lt;br&gt;
Text cleaning: Removed HTML tags, links, and special characters.&lt;br&gt;
Normalization: Lowercasing, punctuation handling, whitespace trimming.&lt;br&gt;
Tokenization: Used meta-llama/Llama-3.2-1B-Instruct tokenizer for LLM compatibility.&lt;br&gt;
Libraries: transformers, datasets, os, torch, Flask.&lt;/p&gt;

&lt;p&gt;🔧 🔹 Model Development:&lt;br&gt;
Model: meta-llama/Llama-3.2-1B-Instruct, a compact 1B-parameter instruction-tuned LLaMA 3.2 model.&lt;br&gt;
Architecture: Causal decoder, fine-tuned on domain-specific healthcare QA data.&lt;br&gt;
Training Configuration:&lt;br&gt;
Epochs: 3&lt;br&gt;
Batch Size: 8&lt;br&gt;
Learning Rate: 2e-5&lt;br&gt;
Optimizer: AdamW&lt;br&gt;
Results:&lt;br&gt;
Initial: Loss = 2.78, Accuracy = 11%&lt;br&gt;
Final: Loss = 0.12, Accuracy = 73%&lt;br&gt;
Training loss decreased steadily, indicating strong learning performance.&lt;br&gt;
🌐 🔹 Web Interface:&lt;/p&gt;

&lt;p&gt;Built with Flask for seamless user interaction.&lt;br&gt;
Users submit questions through a simple HTML interface.&lt;br&gt;
Backend:&lt;br&gt;
Checks if the question was asked before.&lt;br&gt;
If new, the model generates and stores the answer.&lt;br&gt;
Responses are returned in JSON.&lt;br&gt;
“Clear Chat” button allows resetting the session.&lt;/p&gt;
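The cache-then-generate flow described above can be sketched as follows, with the Flask routing and the real model call stubbed out (handle_question, fake_model_generate, and answer_cache are illustrative names, not the project's actual code):

```python
# Answers are cached by normalized question text, so the model is only
# called for questions that have not been asked before.
answer_cache = {}

def fake_model_generate(question):
    # Stand-in for the fine-tuned LLaMA model's generation step.
    return f"Answer to: {question}"

def handle_question(question):
    key = question.strip().lower()      # normalize so repeats hit the cache
    if key not in answer_cache:         # new question: generate and store
        answer_cache[key] = fake_model_generate(question)
    # Returned as a dict, which Flask would serialize to JSON.
    return {"question": question, "answer": answer_cache[key]}

print(handle_question("I have a headache"))
```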

&lt;p&gt;💡 🔹 Project Impact:&lt;br&gt;
 ✅ Promotes Turkish-language NLP applications&lt;br&gt;
 ✅ Real-world health chatbot use-case using LLaMA 3&lt;br&gt;
 ✅ End-to-end AI integration (data, training, deployment)&lt;br&gt;
 ✅ Fully functional Flask web app with real-time responses&lt;/p&gt;

&lt;p&gt;👨‍💻 Developer: Taqi Eddine El Mamouni&lt;br&gt;
 👥 Teammate: ILYASS ELMAMOUNI&lt;br&gt;
 🎓 Advisor: Dr. Kadir TOHMA&lt;br&gt;
 📅 Project Date: May 29, 2025&lt;/p&gt;

&lt;p&gt;#AI #HealthcareAI #LLM #LLaMA3 #NLP #TurkishLanguage #DeepLearning #Chatbot #MachineLearning #Flask #OpenSource #TaqiEddineElMamouni #HealthTech #DataScience&lt;/p&gt;

</description>
    </item>
    <item>
      <title>🤖 What is Artificial Intelligence (AI)?</title>
      <dc:creator>TAQİ EDDİNE EL MAMOUNİ</dc:creator>
      <pubDate>Mon, 16 Jun 2025 13:20:40 +0000</pubDate>
      <link>https://dev.to/taqiddin/what-is-artificial-intelligence-ai-4b22</link>
      <guid>https://dev.to/taqiddin/what-is-artificial-intelligence-ai-4b22</guid>
      <description>&lt;p&gt;Artificial Intelligence (AI) is a branch of computer science focused on building machines and systems that can perform tasks that typically require human intelligence. These tasks include problem-solving, learning, understanding language, recognizing patterns, and even making decisions.&lt;/p&gt;

&lt;p&gt;🧠 Types of AI&lt;br&gt;
Narrow AI:&lt;br&gt;
Designed for a specific task (e.g., voice assistants like Siri, or Netflix's recommendation engine).&lt;/p&gt;

&lt;p&gt;General AI:&lt;br&gt;
A theoretical form of AI that can perform any intellectual task that a human can do.&lt;/p&gt;

&lt;p&gt;Superintelligent AI:&lt;br&gt;
A hypothetical future AI that surpasses human intelligence across all domains.&lt;/p&gt;

&lt;p&gt;🔍 How Does AI Work?&lt;br&gt;
AI systems work by processing large amounts of data using algorithms that find patterns and make predictions or decisions. Some key concepts include:&lt;/p&gt;

&lt;p&gt;Machine Learning (ML) – AI that learns from data.&lt;/p&gt;

&lt;p&gt;Natural Language Processing (NLP) – Understanding and generating human language.&lt;/p&gt;

&lt;p&gt;Computer Vision – Interpreting and understanding visual information from the world.&lt;/p&gt;

&lt;p&gt;🛠️ Real-World Applications&lt;br&gt;
Healthcare: Diagnosing diseases, analyzing medical images.&lt;/p&gt;

&lt;p&gt;Finance: Fraud detection, algorithmic trading.&lt;/p&gt;

&lt;p&gt;Transportation: Self-driving cars.&lt;/p&gt;

&lt;p&gt;Customer Service: Chatbots and virtual assistants.&lt;/p&gt;

&lt;p&gt;Creativity: Generating art, music, and even writing.&lt;/p&gt;

&lt;p&gt;💡 Why It Matters&lt;br&gt;
AI is transforming industries and changing how we live and work. From simplifying daily tasks to solving complex global challenges, AI has become one of the most important technologies of the 21st century.&lt;/p&gt;

</description>
    </item>
  </channel>
</rss>
