<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: saboor</title>
    <description>The latest articles on DEV Community by saboor (@saboorhamedi).</description>
    <link>https://dev.to/saboorhamedi</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F484950%2Ffa4fc6c2-f514-4ce9-9efc-08c7b9af8667.jpeg</url>
      <title>DEV Community: saboor</title>
      <link>https://dev.to/saboorhamedi</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/saboorhamedi"/>
    <language>en</language>
    <item>
      <title>Dev Snippet — A Local-First Markdown Editor That Thinks With You</title>
      <dc:creator>saboor</dc:creator>
      <pubDate>Fri, 02 Jan 2026 02:56:59 +0000</pubDate>
      <link>https://dev.to/saboorhamedi/dev-snippet-a-local-first-markdown-editor-that-thinks-with-you-4m23</link>
      <guid>https://dev.to/saboorhamedi/dev-snippet-a-local-first-markdown-editor-that-thinks-with-you-4m23</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fae4w2tut0ql1b3vyibkm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fae4w2tut0ql1b3vyibkm.png" alt=" " width="800" height="506"&gt;&lt;/a&gt;&lt;br&gt;
Free. Offline. No Cloud. No Tracking. Just pure, high-performance knowledge crafting.&lt;/p&gt;

&lt;p&gt;After nine years of coding and months of focused development, I built the Markdown editor I always wished existed — one that respects your focus, your privacy, and your intelligence.&lt;/p&gt;

&lt;p&gt;Why Dev Snippet stands out:&lt;/p&gt;

&lt;p&gt;True Local-First: All data stays on your machine via SQLite and a secure snippet:// protocol. No accounts. No forced sync.&lt;br&gt;
Flow Mode: A borderless, shadowed editor designed to induce deep work — neither VS Code nor Obsidian offers this.&lt;br&gt;
Scientist Mode: Typewriter scrolling and minimal UI, optimized for thesis writing and long-form technical drafting.&lt;br&gt;
Full Mermaid and Mathematical Support: Render diagrams, equations, and architectural sketches directly in the live preview.&lt;br&gt;
Semantic Linking: Connect ideas with wiki-links [[snippet_name]], categorize with tags (#), and reference concepts with mentions (@).&lt;br&gt;
Hybrid Search: Press Ctrl+Shift+F to search across all snippets by tag, mention, language, or content — powered by FTS5 and BM25.&lt;br&gt;
Stable Editing Experience: A cursor-aware rendering architecture eliminates layout jumps when switching between modes.&lt;br&gt;
Thoughtful Theming: Four built-in themes inspired by modern CLI tools, with real-time sync between editor and preview.&lt;/p&gt;

&lt;p&gt;Built for:&lt;/p&gt;

&lt;p&gt;Researchers drafting papers in Markdown (arXiv-ready)&lt;br&gt;
Developers documenting systems and code&lt;br&gt;
Students organizing lecture notes with structure and links&lt;br&gt;
Anyone who values privacy, speed, and cognitive clarity&lt;/p&gt;

&lt;p&gt;Tech Stack: Electron, React, CodeMirror 6, SQLite (WAL mode), and unified.js&lt;br&gt;
License: MIT, fully open source, no hidden costs&lt;/p&gt;
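
&lt;p&gt;To give a feel for the FTS5 and BM25 search mentioned above, here is a minimal sketch using Python's built-in sqlite3 module. The table name, columns, and sample rows are hypothetical and are not Dev Snippet's actual schema; it also assumes the SQLite build bundled with Python has FTS5 enabled.&lt;/p&gt;

```python
import sqlite3

# Hypothetical schema for illustration; not Dev Snippet's actual tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE snippets USING fts5(title, body)")
conn.executemany(
    "INSERT INTO snippets (title, body) VALUES (?, ?)",
    [
        ("regex-email", "Pattern for validating an email address with a regex"),
        ("curl-auth", "curl command with a bearer token header"),
        ("nginx-proxy", "Reverse proxy config for a local dev server"),
    ],
)


def search(query):
    """Return matching titles, best match first.

    FTS5's bm25() yields lower (more negative) values for better
    matches, so ascending order puts the most relevant row first.
    """
    rows = conn.execute(
        "SELECT title FROM snippets WHERE snippets MATCH ? "
        "ORDER BY bm25(snippets)",
        (query,),
    ).fetchall()
    return [title for (title,) in rows]


print(search("regex"))
print(search("curl OR proxy"))
```

&lt;p&gt;Ordering by bm25() is what turns a plain full-text match into a ranked result list; tag and mention filters would be layered on top of a query like this one.&lt;/p&gt;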

&lt;p&gt;Download v1.2.2 for Windows, macOS, and Linux: &lt;a href="https://github.com/Saboor-Hamedi/dev-snippet/releases" rel="noopener noreferrer"&gt;releases&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Source and documentation: &lt;a href="https://github.com/Saboor-Hamedi/dev-snippet" rel="noopener noreferrer"&gt;repo&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Tools should adapt to cognition — not force cognition to adapt to tools.&lt;/p&gt;

&lt;p&gt;I welcome your feedback — especially from researchers, developers, and serious note-takers. What would make this your daily driver?&lt;/p&gt;

</description>
      <category>electron</category>
      <category>react</category>
      <category>sqlite</category>
      <category>md</category>
    </item>
    <item>
      <title>Quick Snippets — A Small Tool for Big Focus</title>
      <dc:creator>saboor</dc:creator>
      <pubDate>Mon, 01 Dec 2025 13:15:56 +0000</pubDate>
      <link>https://dev.to/saboorhamedi/quick-snippets-a-small-tool-for-big-focus-hjl</link>
      <guid>https://dev.to/saboorhamedi/quick-snippets-a-small-tool-for-big-focus-hjl</guid>
      <description>&lt;p&gt;&lt;strong&gt;Built for developers who hate interrupting their flow.&lt;/strong&gt;&lt;br&gt;
Future updates will be posted at &lt;a href="https://dev-dialect.com/" rel="noopener noreferrer"&gt;dev-dialect&lt;/a&gt;.&lt;br&gt;
When you're deep in the zone and need to save that perfect regex, API response, or config snippet, you don't want to open a new IDE tab or search through messy text files. Quick Snippets gives you a lightning-fast, distraction-free space for those micro-knowledge pieces that deserve to be saved but don't belong in a full project.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn1uhskbm3q4so46j0sf7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn1uhskbm3q4so46j0sf7.png" alt=" " width="800" height="560"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Philosophy: Less Friction, More Flow
&lt;/h2&gt;

&lt;p&gt;Quick Snippets was born from frustration with existing tools that were either:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Too heavy&lt;/strong&gt; (full IDEs for 10 lines of code)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Too simple&lt;/strong&gt; (plain text files with no organization)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Too slow&lt;/strong&gt; (cloud apps requiring authentication)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I wanted something that feels like an extension of my muscle memory — a tool that appears when I need it and disappears when I don't.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Makes It Special
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Instant Capture&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Ctrl/Cmd+N&lt;/strong&gt; → New snippet immediately focused and ready&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Drag &amp;amp; drop&lt;/strong&gt; files directly into the app&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auto-save&lt;/strong&gt; every keystroke (never lose work)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Smart Organization&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Command Palette (Ctrl/Cmd+P)&lt;/strong&gt; – Fuzzy-search through all snippets instantly&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Live Markdown Preview&lt;/strong&gt; – See formatted results as you type&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SQLite Backend&lt;/strong&gt; – Local, fast, reliable storage that syncs nothing to the cloud&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Keyboard-First Workflow&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Ctrl+R&lt;/strong&gt; – Rename selected snippet&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ctrl+Shift+C&lt;/strong&gt; – Copy snippet to clipboard&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Delete key&lt;/strong&gt; – Remove with confirmation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Esc&lt;/strong&gt; – Smart modal hierarchy (closes only what's relevant)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Clean, Focused Interface&lt;/strong&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// No bloat. No distractions.&lt;/span&gt;
&lt;span class="c1"&gt;// Just your code and a live preview.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  See It in Action
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Split-Pane Productivity&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F770it6xfporyzvmhcoxf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F770it6xfporyzvmhcoxf.png" alt=" " width="800" height="358"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Left: Clean editor. Right: Instant Markdown rendering. No switching tabs.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Perfect For...
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Daily Developer Tasks
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Saving &lt;strong&gt;one-off commands&lt;/strong&gt; you always forget&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;API examples&lt;/strong&gt; and curl commands&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Config snippets&lt;/strong&gt; for different environments&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bug reproduction&lt;/strong&gt; templates&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Code review&lt;/strong&gt; notes and templates&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Beyond Just Code
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Meeting notes in Markdown&lt;/li&gt;
&lt;li&gt;Quick calculations&lt;/li&gt;
&lt;li&gt;Project ideas&lt;/li&gt;
&lt;li&gt;Contact templates&lt;/li&gt;
&lt;li&gt;Issue descriptions&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Installation
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install
&lt;/span&gt;npm run dev  &lt;span class="c"&gt;# For development&lt;/span&gt;
npm run build  &lt;span class="c"&gt;# For production&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  First 60 Seconds
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Ctrl+N&lt;/strong&gt; – Create your first snippet&lt;/li&gt;
&lt;li&gt;Give it a name (the language is auto-detected from the extension, e.g. &lt;code&gt;.js&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Type some code – watch it auto-save&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ctrl+P&lt;/strong&gt; – Search for it&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ctrl+Shift+C&lt;/strong&gt; – Copy it back to your main project&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Why This Works
&lt;/h2&gt;

&lt;p&gt;The magic is in the constraints:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No folders&lt;/strong&gt; – Search instead of organizing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No cloud sync&lt;/strong&gt; – Local-first means instant&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No tabs&lt;/strong&gt; – Single focus reduces cognitive load&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No settings&lt;/strong&gt; – It just works&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu5q497ldig5eaa5ob0ye.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu5q497ldig5eaa5ob0ye.png" alt=" " width="800" height="259"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Roadmap Ideas
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Snippet tagging and collections&lt;/li&gt;
&lt;li&gt;Quick export to gist/git&lt;/li&gt;
&lt;li&gt;Theme customization&lt;/li&gt;
&lt;li&gt;Plugin system for syntax highlighting&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  License
&lt;/h2&gt;

&lt;p&gt;MIT – Use it, modify it, share it. Just keep the credits if you redistribute.&lt;/p&gt;




&lt;h2&gt;
  
  
  Want to Contribute?
&lt;/h2&gt;

&lt;p&gt;Found a bug? Have a feature idea? The code is intentionally simple so you can jump right in. Issues and PRs are welcome!&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Quick Snippets isn't trying to be your main editor. It's the sticky note on your monitor that actually gets used.&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>snippet</category>
      <category>electron</category>
      <category>node</category>
      <category>sqlite</category>
    </item>
    <item>
      <title>Cracking Down on Cyber Scams: A Breakthrough in Email Threat Detection Using AI</title>
      <dc:creator>saboor</dc:creator>
      <pubDate>Wed, 11 Jun 2025 12:09:22 +0000</pubDate>
      <link>https://dev.to/saboorhamedi/cracking-down-on-cyber-scams-a-breakthrough-in-email-threat-detection-using-ai--4049</link>
      <guid>https://dev.to/saboorhamedi/cracking-down-on-cyber-scams-a-breakthrough-in-email-threat-detection-using-ai--4049</guid>
      <description>&lt;p&gt;We're students of Information Technology (IT) at the &lt;strong&gt;University of Pamulang (Universitas Pamulang)&lt;/strong&gt;. It's one of the best private universities, providing excellent classes for various majors.&lt;br&gt;
&lt;strong&gt;Student Names:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Abdul Saboor Hamedi
&lt;/li&gt;
&lt;li&gt;Esa Rizki Hari Utama
&lt;/li&gt;
&lt;li&gt;Anydya Relbi Wayah Pandeyani &lt;/li&gt;
&lt;li&gt;Moh. Erland Sumantri &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This blog is an assignment for our &lt;strong&gt;Computer System and Networking&lt;/strong&gt; subject. In this blog, we will go through a paper sourced from Scopus, titled "Machine learning algorithm for detecting suspicious email messages using Natural Language Processing NLP."&lt;br&gt;
You can access the paper &lt;a href="https://doi.org/10.1016/j.aej.2025.04.067" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;In our increasingly connected world, email isn't just for sending holiday snaps or coordinating a Friday arvo barbie. It's a fundamental part of global connectivity and even drives economic growth. But with this convenience comes a serious downside: email is a prime target for cyber threats. We're talking sophisticated phishing schemes and sneaky malware distribution that can hit individuals, companies, and even institutions hard.&lt;/p&gt;

&lt;p&gt;Traditional security measures, bless 'em, are finding it tough to keep up with how quickly these nasty tactics evolve. And let's be fair, there's a serious shortage of cybersecurity pros to fight this battle – one survey across eight countries found about 82% of employers are feeling the pinch. Data from the US also shows unfilled cybersecurity jobs have jumped over 50% since 2015, with projections of a global deficit reaching a whopping 1.8 million roles soon. This talent gap just makes the problem worse.&lt;/p&gt;

&lt;p&gt;When security systems can't adapt, we end up with frustrating classification errors: &lt;strong&gt;false positives (FP)&lt;/strong&gt; and &lt;strong&gt;false negatives (FN)&lt;/strong&gt;. FPs are when a perfectly harmless email gets flagged as a threat, ruining the reliability of email communication. Even worse are FNs, where a genuinely harmful email slips through the net. These errors can lead to data breaches, losing your hard-earned cash, or damaging reputations. With email being crucial for business, sorting out email security is a huge deal, demanding fresh ideas to fix the gaps in older systems.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Rise of Machine Learning and NLP
&lt;/h3&gt;

&lt;p&gt;Over the years, folks have tried various ways to combat email threats. Early efforts moved beyond just labelling emails as 'ham' (good) or 'spam' (bad) to a three-class system using Artificial Neural Networks (ANN). Hybrid machine learning techniques also showed promise. More recently, tackling &lt;strong&gt;targeted malicious emails (TMEs)&lt;/strong&gt; has been a focus. Some approaches have used methods like SpamAssassin and ClamAV, while others have found success with Support Vector Machine (SVM) algorithms. However, dealing with the sheer volume and complexity of spam means selecting the right features in the email data is crucial for boosting performance.&lt;/p&gt;

&lt;p&gt;Despite the progress, existing systems still chuck up too many false positives, don't quite grasp the full context of a phishing attempt, and struggle to adapt to new threats. This is where combining &lt;strong&gt;Machine Learning (ML)&lt;/strong&gt; and &lt;strong&gt;Natural Language Processing (NLP)&lt;/strong&gt; comes in. NLP helps systems analyse the actual content of emails, spotting tricky language and common patterns found in phishing attempts. This improved accuracy, cutting down on both FPs and FNs. Later on, NLP started looking at the context and meaning of the content, working out the intent behind the text to tell suspicious emails apart from everyday ones. Analysing linguistic patterns and sentiment became key, flagging emails that use persuasive or urgent language. It turns out NLP works particularly well when teamed up with SVM models.&lt;/p&gt;

&lt;h3&gt;
  
  
  Our Proposed Solution: SVC with NLP and BERT
&lt;/h3&gt;

&lt;p&gt;This research builds on previous efforts by bringing together a &lt;strong&gt;Support Vector Classifier (SVC)&lt;/strong&gt; with &lt;strong&gt;NLP-based feature extraction&lt;/strong&gt;, including the advanced &lt;strong&gt;BERT&lt;/strong&gt; model, to really nail down classification accuracy and cut those pesky false positives. While past studies often used Random Forest, Naïve Bayes, or standard SVM models, our work shows that an optimised SVC model using smart feature selection techniques achieves higher accuracy (an impressive 98.65%) and is more effective at filtering spam.&lt;/p&gt;

&lt;h3&gt;
  
  
  How Does It Work? Unpacking the Methodology
&lt;/h3&gt;

&lt;p&gt;Let's break down the process.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;The Data&lt;/strong&gt;: The study used the &lt;strong&gt;Kaggle Email Spam Classification Dataset&lt;/strong&gt;. This dataset is a benchmark and contains details from 5172 emails, each labelled as either spam (1) or non-spam (0). Each email is represented by word counts across a massive 3002 columns, plus its label. About 39.4% were spam and 60.6% non-spam, showing the dataset had a class imbalance.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Getting the Data Ready (Preprocessing)&lt;/strong&gt;: Before the ML model could chew on the data, it needed a clean-up. This involved a few steps:

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Lowercasing&lt;/strong&gt;: All text was converted to lowercase to ensure consistency.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Tokenization&lt;/strong&gt;: Emails were broken down into individual words or "tokens" for detailed analysis.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Stop Word Removal&lt;/strong&gt;: Common, uninformative words like "the", "is", and "and" were removed to focus on words that actually carry meaning. Other cleaning included removing emojis, HTML tags, special characters, and URLs.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Lemmatization&lt;/strong&gt;: Words were reduced to their base form.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Extracting Features&lt;/strong&gt;: Once the text was clean, important features were pulled out using NLP techniques.

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;One-Hot Encoding&lt;/strong&gt;: Categorical features (like email subject) were turned into a binary format. While efficient, it doesn't capture meaning or relationships between words, limiting its use for complex phishing detection.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;TF-IDF Vectorization&lt;/strong&gt;: This technique turns the preprocessed text into numerical features, giving words a weight based on how often they appear in an email compared to the whole dataset. It's simple and good for basic text classification, but misses the context between words, which is a limitation for sophisticated phishing.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;BERT Embeddings&lt;/strong&gt;: To fix the limitations of TF-IDF and One-Hot Encoding, BERT was brought in. BERT is a state-of-the-art NLP model that creates contextual embeddings, helping the model understand the meaning of words based on their surroundings. This is a game-changer for spotting subtle linguistic cues in phishing emails, though it does require a fair bit of computational power. Combining One-Hot, TF-IDF, and BERT showed the best performance in feature extraction tests.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Handling Imbalance&lt;/strong&gt;: Because there were more non-spam emails than spam, the dataset was imbalanced. To counter this bias and improve performance, especially in reducing false negatives, several techniques were used:

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;SMOTE&lt;/strong&gt;: This technique creates synthetic examples for the minority class (spam) by interpolating between existing ones. This boosted recall.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Under-sampling&lt;/strong&gt;: This reduces the number of majority class (non-spam) examples. This slightly reduced overall accuracy but improved precision for spam detection.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Algorithmic Adjustments&lt;/strong&gt;: The &lt;code&gt;class_weight&lt;/code&gt; parameter in the SVC was set to 'balanced', giving more importance to the minority class during training. Combining these methods helped balance recall and precision for better overall reliability.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Choosing and Training the Model&lt;/strong&gt;: An &lt;strong&gt;SVC model&lt;/strong&gt; was selected because it's great at binary classification (suspicious or not suspicious), handles high-dimensional data well (like text features), and finds an optimal separation boundary that helps prevent overfitting. Its performance was compared against other popular classifiers like Random Forest, Neural Networks, Decision Trees, and Naive Bayes. The SVC model came out on top, especially for critical metrics like recall and F1-score, which are vital for catching true threats. The training involved using &lt;strong&gt;k-fold cross-validation (with 5 folds)&lt;/strong&gt; to check for overfitting and evaluate the model better. &lt;strong&gt;GridSearchCV&lt;/strong&gt; was used to find the best settings (hyperparameters) for the SVC model. Specific parameters for the SVC included C=1.0, kernel='rbf' (the RBF kernel, suited to non-linear data), gamma='scale', and class_weight='balanced'. The model achieved a training accuracy of 98.89%.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Understanding the SVC Structure&lt;/strong&gt;: At its heart, the SVC finds the best &lt;strong&gt;hyperplane&lt;/strong&gt; (a decision boundary) to separate the different classes in the data. It does this by maximising the &lt;strong&gt;margin&lt;/strong&gt; (distance) between the boundary and the closest data points from each class, known as &lt;strong&gt;Support Vectors&lt;/strong&gt;. The model uses a &lt;strong&gt;Kernel Function&lt;/strong&gt; to calculate similarity between data points.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;The Procedure in Steps&lt;/strong&gt;: The overall process involved Exploratory Data Analysis (EDA) to visualise the dataset, the preprocessing and splitting (80% for training, 20% for testing), loading and initialising the SVC model, training the model, and finally testing it. The process can be visualised as data preprocessing -&amp;gt; handling imbalance -&amp;gt; model training -&amp;gt; model evaluation -&amp;gt; output.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Tools Used&lt;/strong&gt;: The research employed standard computing gear and several software tools, including &lt;strong&gt;Python&lt;/strong&gt; (3.7), &lt;strong&gt;Pandas&lt;/strong&gt;, &lt;strong&gt;NumPy&lt;/strong&gt;, &lt;strong&gt;Scikit-learn&lt;/strong&gt;, &lt;strong&gt;NLTK&lt;/strong&gt;, &lt;strong&gt;Matplotlib/Seaborn&lt;/strong&gt;, &lt;strong&gt;BeautifulSoup&lt;/strong&gt;, &lt;strong&gt;Joblib&lt;/strong&gt;, &lt;strong&gt;Uvicorn&lt;/strong&gt;, and &lt;strong&gt;FastAPI&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Deployment&lt;/strong&gt;: The proposed email security solution is envisioned as a &lt;strong&gt;browser extension&lt;/strong&gt; installed on a user's personal computer. The ML and NLP modules would work within a security engine in the extension to analyse emails and provide results to the user.&lt;/li&gt;
&lt;/ol&gt;
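
&lt;p&gt;To make a few of the steps above concrete, here is a minimal standard-library sketch of the preprocessing, TF-IDF weighting, and 'balanced' class-weight ideas. It is an illustration under our own assumptions and not the paper's code: the stop-word list and sample emails are made up, lemmatization, BERT, and SMOTE are left out, and the TF-IDF formula is the plain textbook form, not the smoothed variant libraries such as scikit-learn use.&lt;/p&gt;

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list; the list used in the paper is not given.
STOP_WORDS = {"the", "is", "and", "a", "to", "your", "for", "at"}


def preprocess(text):
    """Lowercase, remove URLs and non-letter characters, tokenize, drop stop words."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove URLs
    text = re.sub(r"[^a-z\s]", " ", text)      # remove digits, punctuation, emojis
    return [tok for tok in text.split() if tok not in STOP_WORDS]


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Weight = (term count / document length) * log(N / document frequency).
    """
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    return {cls: len(labels) / (len(counts) * n) for cls, n in counts.items()}


# Made-up sample emails with labels (1 = spam, 0 = non-spam).
emails = [
    "Claim your FREE prize at http://example.com now!!!",
    "The meeting is moved to Friday",
    "Free prize inside, claim now",
]
labels = [1, 0, 1]

tokens = [preprocess(e) for e in emails]
print(tokens[0])
print(tf_idf(tokens)[1])
print(balanced_class_weights(labels))
```

&lt;p&gt;Words that appear in only one email (like "meeting") get a higher weight than words shared across emails (like "free"), and the rarer class gets the larger class weight, which is the same idea as setting &lt;code&gt;class_weight&lt;/code&gt; to 'balanced'.&lt;/p&gt;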

&lt;h3&gt;
  
  
  The Results Are In!
&lt;/h3&gt;

&lt;p&gt;Our model achieved an impressive &lt;strong&gt;accuracy of 98.65%&lt;/strong&gt; on the test set, which makes it pretty darn good at telling spam/phishing emails from legitimate ones. The study used a test set of 1034 emails. The results, shown in the confusion matrix, highlight the model's effectiveness:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;True Positives (TP)&lt;/strong&gt;: 731 phishing emails correctly identified.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;True Negatives (TN)&lt;/strong&gt;: 290 non-phishing emails correctly identified.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;False Positives (FP)&lt;/strong&gt;: Only 11 non-suspicious emails were wrongly flagged as suspicious.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;False Negatives (FN)&lt;/strong&gt;: Only 3 suspicious emails were wrongly missed.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This high precision means the model is reliable for real-world use.&lt;/p&gt;
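
&lt;p&gt;As a quick check, the headline metrics can be recomputed from the four confusion-matrix counts listed above. This is plain arithmetic on the reported numbers and nothing more; note that the four counts sum to 1,035, one more than the stated test-set size, and the 98.65% accuracy matches that total.&lt;/p&gt;

```python
# Counts reported in the confusion matrix above.
tp, tn, fp, fn = 731, 290, 11, 3

total = tp + tn + fp + fn
accuracy = (tp + tn) / total          # share of all emails classified correctly
precision = tp / (tp + fp)            # share of flagged emails that were real threats
recall = tp / (tp + fn)               # share of real threats that were caught
f1 = 2 * precision * recall / (precision + recall)

print(f"emails:    {total}")
print(f"accuracy:  {accuracy:.4f}")
print(f"precision: {precision:.4f}")
print(f"recall:    {recall:.4f}")
print(f"f1:        {f1:.4f}")
```

&lt;p&gt;Recall is the number to watch for false negatives, and precision for false positives, which is why both are reported alongside accuracy.&lt;/p&gt;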

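&lt;p&gt;Comparing our SVC approach with others mentioned in the literature showed it performed very competitively:&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Paper&lt;/th&gt;
&lt;th&gt;Model Used&lt;/th&gt;
&lt;th&gt;Accuracy&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Amin et al.&lt;/td&gt;
&lt;td&gt;Random Forest Classifier&lt;/td&gt;
&lt;td&gt;91%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Khamis et al.&lt;/td&gt;
&lt;td&gt;SVM&lt;/td&gt;
&lt;td&gt;88.80%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Ghaleb et al.&lt;/td&gt;
&lt;td&gt;MOGOA and EGOA&lt;/td&gt;
&lt;td&gt;98.3%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Magdy et al.&lt;/td&gt;
&lt;td&gt;ANN&lt;/td&gt;
&lt;td&gt;99.5%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;M. Dewis and T. Viana&lt;/td&gt;
&lt;td&gt;MLP&lt;/td&gt;
&lt;td&gt;94%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Y. Li&lt;/td&gt;
&lt;td&gt;Naïve Bayes&lt;/td&gt;
&lt;td&gt;99.2%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Our SVC Approach&lt;/td&gt;
&lt;td&gt;Integrated NLP and SVC&lt;/td&gt;
&lt;td&gt;98.65%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;&lt;em&gt;Table derived from source.&lt;/em&gt;&lt;/p&gt;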

&lt;p&gt;While some studies showed slightly higher accuracy (like Magdy et al. and Y. Li), our approach particularly excels in reducing FPs and FNs, which is crucial for reliability.&lt;/p&gt;

&lt;h3&gt;
  
  
  Balancing Performance and Efficiency
&lt;/h3&gt;

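&lt;p&gt;It's also worth looking at how the models perform computationally:&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Classifier&lt;/th&gt;
&lt;th&gt;Training time (s)&lt;/th&gt;
&lt;th&gt;Inference latency (ms/email)&lt;/th&gt;
&lt;th&gt;Memory usage (MB)&lt;/th&gt;
&lt;th&gt;Accuracy (%)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Random Forest&lt;/td&gt;
&lt;td&gt;18.5&lt;/td&gt;
&lt;td&gt;1.8&lt;/td&gt;
&lt;td&gt;210&lt;/td&gt;
&lt;td&gt;97.20&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Neural Network&lt;/td&gt;
&lt;td&gt;210.3&lt;/td&gt;
&lt;td&gt;5.6&lt;/td&gt;
&lt;td&gt;350&lt;/td&gt;
&lt;td&gt;97.80&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Naive Bayes&lt;/td&gt;
&lt;td&gt;4.2&lt;/td&gt;
&lt;td&gt;0.9&lt;/td&gt;
&lt;td&gt;80&lt;/td&gt;
&lt;td&gt;95.40&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gradient Boosting&lt;/td&gt;
&lt;td&gt;35.7&lt;/td&gt;
&lt;td&gt;2.1&lt;/td&gt;
&lt;td&gt;180&lt;/td&gt;
&lt;td&gt;97.60&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SVC (Proposed)&lt;/td&gt;
&lt;td&gt;42.1&lt;/td&gt;
&lt;td&gt;3.2&lt;/td&gt;
&lt;td&gt;120&lt;/td&gt;
&lt;td&gt;98.65&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;&lt;em&gt;Table derived from source.&lt;/em&gt;&lt;/p&gt;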

&lt;p&gt;Our SVC model takes longer to train (42.1 s) than Naive Bayes (4.2 s), Random Forest (18.5 s), or Gradient Boosting (35.7 s), mainly because of its kernel-based optimisation. Its memory use (120 MB) is higher than Naive Bayes (80 MB) but lower than Random Forest (210 MB) and Gradient Boosting (180 MB). In return, it pairs a moderate inference latency (3.2 ms/email) with the highest accuracy in the table (98.65%). The Neural Network came second on accuracy (97.80%), but at a significant computational cost (210.3 s to train, 350 MB of memory), which limits scalability for large-scale deployments. This shows that the choice of model depends on what you need: SVC is great for high accuracy, but others might suit better if computing resources are tight.&lt;/p&gt;

&lt;h3&gt;
  
  
  Dealing with Errors: False Positives and False Negatives
&lt;/h3&gt;

&lt;p&gt;Let's revisit those classification errors, as they matter a lot in detecting phishing emails.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;False Positives (FPs)&lt;/strong&gt; are annoying. They can make users distrust the system, lead to lost productivity from checking quarantined emails, and mean you might miss important messages.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;False Negatives (FNs)&lt;/strong&gt; are dangerous. When a phishing email slips through, it can lead to successful attacks, compromising sensitive info, and causing financial and reputational damage. FNs are considered more critical in this context.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;How can we tackle these?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Threshold Adjustment&lt;/strong&gt;: Tweaking the model's decision threshold can help balance catching threats (sensitivity) with not flagging good emails (specificity). This was tested and reduced FNs by 20%.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Ensemble Methods&lt;/strong&gt;: Combining multiple models (like SVC with others) can make the system tougher and more accurate, helping to reduce both FPs and FNs.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Cost-sensitive Learning&lt;/strong&gt;: Designing models that penalise missing a threat (FN) more heavily than flagging a safe email (FP) can bias the model towards minimising FNs.&lt;/li&gt;
&lt;/ul&gt;
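
&lt;p&gt;The threshold adjustment idea can be sketched in a few lines. The scores and labels below are made up for illustration and are not the paper's data; with a scikit-learn SVC, comparable scores could come from &lt;code&gt;decision_function&lt;/code&gt;, though the paper's exact procedure is not described here.&lt;/p&gt;

```python
# Hypothetical decision scores (higher = more suspicious) and true labels
# (1 = phishing, 0 = legitimate); these numbers are made up for illustration.
scores = [0.95, 0.80, 0.55, 0.45, 0.40, 0.30, 0.20, 0.05]
labels = [1, 1, 1, 1, 0, 1, 0, 0]


def count_errors(threshold):
    """Return (false_positives, false_negatives) at the given threshold."""
    fp = fn = 0
    for score, label in zip(scores, labels):
        flagged = score >= threshold
        if flagged and label == 0:
            fp += 1      # safe email wrongly flagged
        elif not flagged and label == 1:
            fn += 1      # phishing email missed
    return fp, fn


# Lowering the threshold flags more emails: fewer misses, more false alarms.
for threshold in (0.5, 0.35, 0.25):
    fp, fn = count_errors(threshold)
    print(f"threshold={threshold:.2f}  FP={fp}  FN={fn}")
```

&lt;p&gt;Since FNs are treated as the costlier error here, picking the lower threshold trades one extra false alarm for catching every threat in this toy example, which is the same bias cost-sensitive learning builds into training.&lt;/p&gt;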

&lt;p&gt;Even with few errors, there's always room to improve. The complexity of email content and very subtle text details can still cause misclassifications.&lt;/p&gt;

&lt;h3&gt;
  
  
  Conclusion and What's Next
&lt;/h3&gt;

&lt;p&gt;This study put forward a solid email security framework combining machine learning and NLP to get better at spotting suspicious emails. By teaming SVC with advanced feature extraction like BERT embeddings, the model hit an accuracy of 98.65%, outdoing many older spam detection methods. The results clearly show this system is effective at cutting down both false positives and false negatives, making email communication more reliable.&lt;/p&gt;

&lt;p&gt;However, there are still challenges:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Dataset Scope&lt;/strong&gt;: The dataset used is a benchmark, which is good for evaluation, but it might not fully represent the wild diversity and constantly changing nature of real-world phishing emails, especially in businesses. It lacks some real-world context like sender reputation.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Overfitting Risks&lt;/strong&gt;: Complex models like BERT can risk overfitting, although cross-validation helps. Using more data and regularization techniques could further mitigate this.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Language and Domain&lt;/strong&gt;: The model was mainly trained on English emails. It might not work as well for other languages or phishing attacks specific to different cultures or domains.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Dataset Biases&lt;/strong&gt;: The dataset might have biases from its source or how it was labelled, potentially affecting performance in varied real-world situations.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Looking ahead, future work will focus on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Cross-Lingual Training&lt;/strong&gt;: Making the framework work for emails in different languages using models like mBERT.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Bias Mitigation and External Validation&lt;/strong&gt;: Testing the model on real corporate datasets and using techniques to identify and correct biases from public datasets.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Generalizability&lt;/strong&gt;: Using transfer learning to help the model adapt to new email types, including regional threats and emerging tactics like AI-generated phishing.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These steps are about making the model continuously better and adaptable to the ever-changing world of email threats.&lt;/p&gt;

&lt;p&gt;Ultimately, this research highlights the huge potential of using AI-driven solutions to enhance email security, helping to protect our digital communication channels from cyber threats.&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>nlp</category>
      <category>springboot</category>
      <category>svc</category>
    </item>
    <item>
      <title>MySQL charset</title>
      <dc:creator>saboor</dc:creator>
      <pubDate>Sat, 31 Oct 2020 07:30:50 +0000</pubDate>
      <link>https://dev.to/saboorhamedi/mysql-charset-11l7</link>
      <guid>https://dev.to/saboorhamedi/mysql-charset-11l7</guid>
      <description>&lt;p&gt;I'm not sure whether this is the right place to ask this type of question.&lt;br&gt;
Anyway, I have a problem with MySQL's "utf8mb4" charset. I have built a website and everything works fine except for Arabic text. Here is the case:&lt;br&gt;
I can insert and update Arabic records in my table through a PHP script, but when I open Workbench or DBeaver I see the characters "ØµØ¨ÙˆØ±" where I expect "صبور". On my website, however, I get the correct result. It seems like an encrypt-and-decrypt process.&lt;br&gt;
The opposite also happens: when I update the record directly in MySQL, I see my name, "صبور", but when I fetch it on my website I get question marks: "?????".&lt;br&gt;
Any help would be much appreciated. Thank you in advance.&lt;/p&gt;
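
&lt;p&gt;For anyone trying to reproduce the symptom, here is a small Python sketch of what seems to be going on. It is only one possible reading and not a confirmed diagnosis: it assumes the PHP connection treats the text as a single-byte charset (cp1252, which MySQL calls latin1) while the GUI tools read it as UTF-8.&lt;/p&gt;

```python
name = "صبور"

# The UTF-8 bytes of the name, misread as cp1252, reproduce the garbled
# string seen in Workbench and DBeaver.
mojibake = name.encode("utf-8").decode("cp1252")
print(mojibake)

# Reversing the same misreading recovers the original text, which would
# explain why the website (using the same mismatched connection) looks fine.
recovered = mojibake.encode("cp1252").decode("utf-8")
print(recovered == name)

# Correctly stored Arabic text pushed through a single-byte charset has
# nothing to map to, so each character degrades to a question mark.
print(name.encode("latin-1", errors="replace").decode("latin-1"))
```

&lt;p&gt;If this reading is right, the data written through PHP is double-encoded, and the place to look is the connection charset on the PHP side.&lt;/p&gt;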

</description>
      <category>mysql</category>
      <category>php</category>
      <category>sql</category>
    </item>
    <item>
      <title>How to decrypt hash password</title>
      <dc:creator>saboor</dc:creator>
      <pubDate>Mon, 12 Oct 2020 09:12:05 +0000</pubDate>
      <link>https://dev.to/saboorhamedi/how-decrypt-hash-password-30cb</link>
      <guid>https://dev.to/saboorhamedi/how-decrypt-hash-password-30cb</guid>
      <description>&lt;p&gt;Hello everyone! This is my first time posting here. I have a question: is it possible to decrypt hashed passwords and fetch them all in a table?&lt;br&gt;
Here is the example&lt;/p&gt;

</description>
      <category>php</category>
      <category>mysql</category>
      <category>sql</category>
      <category>mysqli</category>
    </item>
  </channel>
</rss>
