<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Steven Mathew</title>
    <description>The latest articles on DEV Community by Steven Mathew (@stevenmathew).</description>
    <link>https://dev.to/stevenmathew</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F951372%2F80e0d928-0af3-4d03-a6c4-3838828ef168.jpeg</url>
      <title>DEV Community: Steven Mathew</title>
      <link>https://dev.to/stevenmathew</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/stevenmathew"/>
    <language>en</language>
    <item>
      <title>Alibaba releases new Qwen-2-VL which can analyze 20mins long videos</title>
      <dc:creator>Steven Mathew</dc:creator>
      <pubDate>Fri, 13 Sep 2024 14:04:09 +0000</pubDate>
      <link>https://dev.to/stevenmathew/alibaba-releases-new-qwen-2-vl-which-can-analyze-20mins-long-videos-2dci</link>
      <guid>https://dev.to/stevenmathew/alibaba-releases-new-qwen-2-vl-which-can-analyze-20mins-long-videos-2dci</guid>
      <description>&lt;p&gt;Alibaba has recently unveiled Qwen-2-VL, an advanced AI model designed for analyzing long-format videos, particularly those exceeding 20 minutes. This model represents a significant advancement in multimodal AI, as it can process both video and audio content, offering more precise insights from complex visual and auditory data. Unlike previous models that focused on short clips, Qwen-2-VL can comprehend intricate narratives and patterns over extended durations.&lt;/p&gt;

&lt;p&gt;One of the standout features of Qwen-2-VL is its ability to understand contextual information from long videos. By analyzing content in real time, the model can identify important moments, summarize key points, and generate rich interpretations. This makes it ideal for applications in education, media production, and entertainment, where long-format videos are common and detailed analysis is crucial.&lt;/p&gt;

&lt;p&gt;Qwen-2-VL also excels in bridging the gap between text, images, and video. Its multimodal capabilities mean it can answer questions based on video content and create summaries that incorporate both visual and textual elements. This could revolutionize how video-based information is processed, enabling faster insights in sectors like marketing, content creation, and e-learning.&lt;/p&gt;

&lt;p&gt;By releasing Qwen-2-VL, Alibaba demonstrates its commitment to advancing AI technology, focusing on models that provide greater utility in real-world applications. This AI model could pave the way for more efficient content analysis, offering deeper insights from videos in ways that were previously difficult for AI to achieve.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Benefits of Alibaba AI Model Qwen-2-VL&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Long Video Analysis:&lt;/strong&gt; Unlike previous AI models that struggle with longer content, Qwen-2-VL can analyze videos exceeding 20 minutes, providing more in-depth analysis and understanding of complex sequences.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Multimodal Processing:&lt;/strong&gt; It can handle both video and audio content simultaneously, offering enhanced insights compared to single-modality models.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Real-time Analysis:&lt;/strong&gt; Qwen-2-VL processes content as it plays, making it highly effective for live video summarization and analysis.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Comparison to Existing Models&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Long-Content Capability:&lt;/strong&gt; Most existing AI models, like OpenAI’s GPT-4 and Google’s PaLM, are excellent at handling text but struggle with extended video content. Qwen-2-VL fills this gap by focusing on video understanding, particularly long-format videos.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Contextual Understanding:&lt;/strong&gt; While some models are optimized for short clips or image-based tasks (like OpenAI’s CLIP), Qwen-2-VL is more robust in comprehending intricate and evolving narratives in longer videos.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Integrated Multimodal Performance:&lt;/strong&gt; Unlike older models that handled text, video, or audio separately, Qwen-2-VL integrates these modalities, making it more versatile for real-world use cases like educational videos, media, and entertainment analysis.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Sarcasm Detection AI Model (97% Accuracy) Trained With Reddit Comments - Training &amp; Testing</title>
      <dc:creator>Steven Mathew</dc:creator>
      <pubDate>Sun, 07 Jul 2024 04:20:56 +0000</pubDate>
      <link>https://dev.to/stevenmathew/sarcasm-detection-ai-model-trained-with-reddit-comments-training-testing-2e32</link>
      <guid>https://dev.to/stevenmathew/sarcasm-detection-ai-model-trained-with-reddit-comments-training-testing-2e32</guid>
<description>&lt;p&gt;Now we are going to split the data into training and testing sets to check the model's accuracy.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;df = pd.read_csv('labeled_reddit_comments.csv')
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This line reads the previously saved CSV file (labeled_reddit_comments.csv) containing cleaned Reddit comments and their corresponding labels into a Pandas DataFrame (df).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Splitting Data into Training and Testing Sets&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;X_train, X_test, y_train, y_test = train_test_split(df['cleaned_comment'], df['label'], test_size=0.2, random_state=42)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here, we split the data into two parts:&lt;br&gt;
X_train and y_train: These variables contain 80% of the data (df['cleaned_comment'] and df['label']) which will be used for training the model.&lt;/p&gt;

&lt;p&gt;X_test and y_test: These variables contain the remaining 20% of the data, which will be used to evaluate how well the trained model performs on new, unseen data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Creating a Pipeline with a Random Forest Classifier&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pipeline = Pipeline([
    ('tfidf', TfidfVectorizer()),
    ('clf', RandomForestClassifier(random_state=42))
])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This sets up a pipeline (pipeline) that sequentially applies two steps to the data:&lt;br&gt;
Step 1 ('tfidf', TfidfVectorizer()): Converts the text data (X_train and X_test) into numerical TF-IDF (Term Frequency-Inverse Document Frequency) vectors.&lt;/p&gt;
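Before moving on to the classifier step, it can help to see what the TF-IDF transform actually computes. Below is a simplified, pure-Python sketch of the idea (sklearn's TfidfVectorizer additionally smooths the idf term and L2-normalizes each row, so its exact numbers differ):

```python
import math

def tf_idf(corpus):
    # Toy TF-IDF: term frequency times log(N / document frequency).
    docs = [doc.lower().split() for doc in corpus]
    n = len(docs)
    df = {}
    for d in docs:
        for w in set(d):
            df[w] = df.get(w, 0) + 1
    return [
        {w: (d.count(w) / len(d)) * math.log(n / df[w]) for w in set(d)}
        for d in docs
    ]

scores = tf_idf(["yeah right great idea", "great idea truly great"])
# 'great' occurs in both documents, so its idf is log(2/2) = 0 and it
# scores 0, while words unique to one document get a positive weight.
```

Words that appear in every document get an idf of log(1) = 0, which is why very common words contribute little to the features the classifier sees.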

&lt;p&gt;Step 2 ('clf', RandomForestClassifier(random_state=42)): Trains a Random Forest classifier on the TF-IDF vectors. The random_state=42 ensures reproducibility of results.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Defining Hyperparameters for Tuning&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;param_grid = {
    'tfidf__max_features': [10000, 20000, None],
    'clf__n_estimators': [50, 100],
    'clf__max_depth': [None, 10],
    'clf__min_samples_split': [2, 5],
    'clf__min_samples_leaf': [1, 2]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This dictionary (param_grid) specifies different hyperparameter values to explore during the grid search process:&lt;br&gt;
'tfidf__max_features': Limits the number of features generated by TfidfVectorizer.&lt;br&gt;
'clf__n_estimators', 'clf__max_depth', 'clf__min_samples_split', 'clf__min_samples_leaf': Parameters that control the behavior of the Random Forest classifier.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Performing GridSearchCV for Hyperparameter Tuning&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;grid_search = GridSearchCV(pipeline, param_grid, cv=5, scoring='accuracy', verbose=1, error_score='raise')
grid_search.fit(X_train, y_train)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here, GridSearchCV is used to search for the best combination of hyperparameters (param_grid) for the pipeline (pipeline). It:&lt;br&gt;
Divides the data into 5 folds (cv=5) for cross-validation.&lt;/p&gt;

&lt;p&gt;Uses accuracy (scoring='accuracy') as the metric to evaluate the performance of each combination of hyperparameters.&lt;br&gt;
Prints detailed messages (verbose=1) during the search process and raises errors (error_score='raise') if an error occurs.&lt;/p&gt;
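As a rough sketch of what cv=5 means, here is how five folds can partition n samples (illustrative only: for classifiers, GridSearchCV actually uses stratified folds, and sklearn's splitters return train/validation index pairs):

```python
from itertools import accumulate

def k_fold_indices(n, k):
    # Fold sizes: the first n % k folds receive one extra sample.
    q, r = divmod(n, k)
    sizes = [q + 1] * r + [q] * (k - r)
    bounds = [0] + list(accumulate(sizes))
    return [list(range(bounds[i], bounds[i + 1])) for i in range(k)]

folds = k_fold_indices(10, 5)
print(folds)  # [[0, 1], [2, 3], [4, 5], [6, 7], [8, 9]]
```

Each fold serves once as the validation set while the remaining four train the pipeline, so every hyperparameter combination in param_grid is scored five times and the scores are averaged.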

&lt;p&gt;&lt;strong&gt;Evaluating the Best Model&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;best_model = grid_search.best_estimator_
y_pred = best_model.predict(X_test)

# Print evaluation metrics
print(f"Accuracy: {accuracy_score(y_test, y_pred)}")
print(classification_report(y_test, y_pred))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After finding the best set of hyperparameters (best_model), the code evaluates this model's performance on the test data (X_test) that was set aside earlier (y_test). &lt;/p&gt;

&lt;p&gt;It:&lt;br&gt;
Predicts labels (y_pred) for the test data.&lt;br&gt;
Calculates and prints the accuracy score (accuracy_score) of the predictions compared to the actual labels (y_test).&lt;/p&gt;

&lt;p&gt;Prints a detailed classification report (classification_report) showing precision, recall, F1-score, and support for each class (sarcasm and non-sarcasm).&lt;/p&gt;
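The metrics in that report are simple ratios; a minimal hand-rolled version (with sample labels invented for illustration) shows how they relate:

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    # tp: predicted positive and actually positive; fp: predicted positive
    # but actually negative; fn: positives the model missed.
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1]
print(precision_recall_f1(y_true, y_pred))  # (0.75, 0.75, 0.75)
```

classification_report computes these per class, plus "support", which is just the number of true examples of each class in the test set.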

&lt;p&gt;&lt;strong&gt;After training and testing, I got an accuracy of 97%.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzj8wdqgpmd7mf1n92ez1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzj8wdqgpmd7mf1n92ez1.png" alt="Image description" width="800" height="266"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Testing with sample text&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa8segcd9rs1qs7je18uj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa8segcd9rs1qs7je18uj.png" alt="Image description" width="800" height="184"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Checking on the top 5 comments on a post on Reddit&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8nrm4f692w0sfs5bk5bl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8nrm4f692w0sfs5bk5bl.png" alt="Image description" width="800" height="278"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;GITHUB: &lt;a href="https://github.com/stevie1mat/Sarcasm-Detection-With-Reddit-Comments" rel="noopener noreferrer"&gt;https://github.com/stevie1mat/Sarcasm-Detection-With-Reddit-Comments&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Author: &lt;a href="https://stevenmathew.dev" rel="noopener noreferrer"&gt;Steven Mathew&lt;/a&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Sarcasm Detection AI Model (97% Accuracy) Trained With Reddit Comments - Cleaning and Saving The Data</title>
      <dc:creator>Steven Mathew</dc:creator>
      <pubDate>Sun, 07 Jul 2024 04:11:05 +0000</pubDate>
      <link>https://dev.to/stevenmathew/sarcasm-detection-ai-model-trained-with-reddit-comments-cleaning-and-saving-the-data-46dj</link>
      <guid>https://dev.to/stevenmathew/sarcasm-detection-ai-model-trained-with-reddit-comments-cleaning-and-saving-the-data-46dj</guid>
<description>&lt;p&gt;Now we will clean the data and save it for training and testing in the next part.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def clean_comment(text):
    text = re.sub(r'http\S+', '', text)  # Remove any web URLs in the text
    text = re.sub(r'/u/\w+', '', text)  # Remove mentions of Reddit users (like /u/username)
    text = re.sub(r'r/\w+', '', text)  # Remove mentions of subreddits (like r/subreddit)
    text = re.sub(r'\n', ' ', text)  # Replace new line characters with spaces
    text = re.sub(r'[^A-Za-z0-9\s]', '', text)  # Remove any characters that are not letters, numbers, or spaces
    return text.lower()  # Convert the cleaned text to lowercase
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This function takes in a piece of text (text) and cleans it up by removing web URLs, mentions of Reddit users and subreddits, new line characters, and any characters that are not letters, numbers, or spaces. Finally, it converts the cleaned text to lowercase.&lt;/p&gt;
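To see the cleaner in action, here is the same function run on a made-up comment (double spaces remain where tokens were stripped, which does not affect later tokenization):

```python
import re

def clean_comment(text):
    text = re.sub(r'http\S+', '', text)         # strip URLs
    text = re.sub(r'/u/\w+', '', text)          # strip /u/username mentions
    text = re.sub(r'r/\w+', '', text)           # strip r/subreddit mentions
    text = re.sub(r'\n', ' ', text)             # newlines to spaces
    text = re.sub(r'[^A-Za-z0-9\s]', '', text)  # drop punctuation
    return text.lower()

sample = "Check this https://example.com from /u/bob in r/python\nNice!!!"
print(clean_comment(sample))  # check this  from  in  nice
```

The URL, the user mention, the subreddit mention, and the exclamation marks are all gone, and the result is lowercase, ready for the labeling step below.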






&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Load data from a CSV file into a DataFrame
df = pd.read_csv('reddit_comments.csv')

# Apply the cleaning function to each comment and create a new column for cleaned comments
df['cleaned_comment'] = df['comment'].apply(clean_comment)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here, we load data from a CSV file (reddit_comments.csv) into a table-like structure called a DataFrame. Then, for each comment in the 'comment' column of this DataFrame, we use the clean_comment function we defined earlier to clean up the text. The cleaned versions of the comments are stored in a new column named 'cleaned_comment'.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
# Manually assign labels to the comments
labels = [0, 1] * (len(df) // 2)  # Create a list of labels alternating between 0 and 1
if len(labels) &amp;lt; len(df):
    labels.append(0)  # Add one more label to match the number of comments

df['label'] = labels  # Assign the labels to a new column named 'label' in the DataFrame
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this part, we assign labels to each comment to indicate whether it's sarcastic or not. For demonstration purposes, we alternate between labels 0 (for non-sarcastic) and 1 (for sarcastic). We make sure that each comment gets a corresponding label. These labels are stored in a new column named 'label' in the DataFrame.&lt;br&gt;
&lt;/p&gt;
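The alternating-label arithmetic can be checked in isolation (as noted above, alternating labels are a placeholder for demonstration; a real dataset needs genuine sarcasm annotations):

```python
def alternating_labels(n):
    # Alternate 0 and 1; for an odd n the integer division leaves the
    # list one short, so pad with a final 0 to match the comment count.
    labels = [0, 1] * (n // 2)
    if len(labels) != n:
        labels.append(0)
    return labels

print(alternating_labels(5))  # [0, 1, 0, 1, 0]
```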

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Remove rows where the cleaned comment is empty or NaN (missing)
df = df.dropna(subset=['cleaned_comment'])  # Remove rows where 'cleaned_comment' is NaN
df = df[df['cleaned_comment'].str.strip() != '']  # Remove rows where 'cleaned_comment' is empty or only whitespace

# Save the cleaned and labeled data to a new CSV file
df.to_csv('labeled_reddit_comments.csv', index=False)  # Save DataFrame to CSV without including the index
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Finally, we clean up the data further by removing any rows where the cleaned comment is empty or missing (NaN). We also remove rows where the cleaned comment consists only of whitespace. &lt;/p&gt;

&lt;p&gt;After cleaning and filtering, we save the cleaned and labeled data (including the 'cleaned_comment' and 'label' columns) to a new CSV file named labeled_reddit_comments.csv. &lt;/p&gt;

&lt;p&gt;Note:&lt;br&gt;
The index=False parameter ensures that the CSV file does not include an extra column for row numbers.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://dev.to/stevenmathew/sarcasm-detection-ai-model-trained-with-reddit-comments-training-testing-2e32"&gt;Read the Part 3 - Sarcasm Detection From Reddit Comments : Training &amp;amp; Testing&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;GITHUB: &lt;a href="https://github.com/stevie1mat/Sarcasm-Detection-With-Reddit-Comments" rel="noopener noreferrer"&gt;https://github.com/stevie1mat/Sarcasm-Detection-With-Reddit-Comments&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Author: &lt;a href="https://stevenmathew.dev" rel="noopener noreferrer"&gt;Steven Mathew&lt;/a&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Sarcasm Detection AI Model (97% Accuracy) Trained With Reddit Comments - Part 1</title>
      <dc:creator>Steven Mathew</dc:creator>
      <pubDate>Sun, 07 Jul 2024 04:04:00 +0000</pubDate>
      <link>https://dev.to/stevenmathew/sarcasm-detection-ai-model-trained-with-reddit-comments-55kf</link>
      <guid>https://dev.to/stevenmathew/sarcasm-detection-ai-model-trained-with-reddit-comments-55kf</guid>
      <description>&lt;p&gt;I have trained a Sarcasm Detection AI model using Reddit comments. This is how you can do it too.&lt;/p&gt;

&lt;p&gt;Requirements:&lt;br&gt;
Google Colab&lt;br&gt;
Reddit API Credentials&lt;br&gt;
Lots of time&lt;br&gt;
Coffee&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;First we will import the necessary libraries.
&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import asyncio  # For asynchronous programming in Python.
import asyncpraw  # Python Reddit API Wrapper for asynchronous Reddit API interactions.
import pandas as pd  # Data manipulation and analysis tool.
import nest_asyncio  # Necessary for allowing nested asyncio run loops.
import re  # Regular expressions for pattern matching and text manipulation.
from sklearn.model_selection import train_test_split  # Splits data into training and testing sets.
from sklearn.feature_extraction.text import TfidfVectorizer  # Converts text data into TF-IDF feature vectors.
from sklearn.ensemble import RandomForestClassifier  # Random Forest classifier for machine learning.
from sklearn.metrics import accuracy_score, classification_report  # Metrics for evaluating model performance.
from imblearn.over_sampling import SMOTE  # Oversampling technique for handling class imbalance.
from sklearn.pipeline import Pipeline  # Constructs a pipeline of transformations and estimators.
from sklearn.model_selection import GridSearchCV  # Performs grid search over specified parameter values.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ol&gt;
&lt;li&gt;Connecting to Reddit API
Get your API credentials from &lt;a href="https://www.reddit.com/prefs/apps" rel="noopener noreferrer"&gt;https://www.reddit.com/prefs/apps&lt;/a&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;client_id = 'your_client_id'
client_secret = 'your_client_secret'
user_agent = 'MyRedditApp/0.1 by your_username'

reddit = asyncpraw.Reddit(client_id=client_id,
                          client_secret=client_secret,
                          user_agent=user_agent)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;This code sets up authentication credentials (client_id, client_secret, user_agent) and creates a Reddit API connection. The Reddit object initializes a connection to Reddit's API, allowing the Python script to interact with Reddit, retrieve data, and perform various actions programmatically on the platform.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Initialization and Setup
&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;nest_asyncio.apply()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;This line ensures that asyncio can be used in a nested manner, which is necessary when using asynchronous operations in environments that already have an event loop running.&lt;/p&gt;

&lt;p&gt;Asynchronous Function Definition&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;async def collect_reddit_comments(subreddit_name, keyword, limit=1000):
    reddit = asyncpraw.Reddit(
        client_id=client_id,
        client_secret=client_secret,
        user_agent=user_agent
    )
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Defines an asynchronous function collect_reddit_comments to retrieve comments from Reddit. It initializes a Reddit instance using asyncpraw, passing in credentials (client_id, client_secret, user_agent) for API authentication.&lt;/p&gt;

&lt;p&gt;Fetching Subreddit and Comments&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;subreddit = await reddit.subreddit(subreddit_name)
comments = []
count = 0
after = None
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Asynchronously fetches the subreddit object based on subreddit_name. Initializes an empty list comments to store comment data, and sets counters (count) and pagination marker (after) for comment retrieval.&lt;/p&gt;

&lt;p&gt;Looping Through Submissions and Comments&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;while len(comments) &amp;lt; limit:
    try:
        async for submission in subreddit.search(keyword, limit=None, params={'after': after}):
            await submission.load()
            submission.comment_limit = 0
            await submission.comments.replace_more(limit=0)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This enters a loop to fetch submissions matching keyword within the specified subreddit. It asynchronously loads submission details and retrieves all comments for each submission, handling cases where more comments are nested (replace_more, which must be awaited in asyncpraw).&lt;/p&gt;

&lt;p&gt;Collecting and Storing Comments&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;            for comment in submission.comments.list():
                if isinstance(comment, asyncpraw.models.Comment):
                    author_name = comment.author.name if comment.author else '[deleted]'
                    comments.append([comment.body, author_name, comment.created_utc])
                    count += 1

                    if count &amp;gt;= limit:
                        break

            after = submission.id  # Sets the 'after' parameter for pagination

            if count &amp;gt;= limit:
                break
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Iterates through each comment in the submission, checking if it's a valid comment. Collects comment details such as body, author name, and creation time (created_utc). Controls the loop with count and limit to ensure the specified number of comments (limit) is collected.&lt;/p&gt;

&lt;p&gt;Handling API Exceptions&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    except asyncpraw.exceptions.APIException as e:
        print(f"API exception occurred: {e}")
        wait_time = 60  # Wait for 1 minute before retrying
        print(f"Waiting for {wait_time} seconds before retrying...")
        await asyncio.sleep(wait_time)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Catches and handles API exceptions that may occur during Reddit API interactions. Prints the exception message, waits for a minute (wait_time) before retrying, and then resumes fetching comments.&lt;/p&gt;
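This wait-and-retry pattern generalizes beyond asyncpraw. The standalone sketch below (function names are illustrative, and the delay is shortened from the article's 60 seconds so it runs instantly) shows the shape of it:

```python
import asyncio

async def with_retry(task, retries=3, wait_time=0.01):
    # Retry an async callable after failures, sleeping between attempts.
    for attempt in range(1, retries + 1):
        try:
            return await task()
        except Exception as exc:
            print(f"Attempt {attempt} failed: {exc}")
            await asyncio.sleep(wait_time)
    raise RuntimeError("all retries failed")

calls = {"n": 0}

async def flaky():
    # Fails on the first call, succeeds on the second -- a stand-in for
    # a rate-limited Reddit API request.
    calls["n"] += 1
    if calls["n"] == 1:
        raise ValueError("rate limited")
    return "ok"

print(asyncio.run(with_retry(flaky)))  # ok
```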

&lt;p&gt;Returning Results&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;return comments[:limit]  # Returns up to 'limit' number of comments
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Returns a list of collected comments, limited by the specified limit, ensuring only the required number of comments are returned.&lt;/p&gt;

&lt;p&gt;Main Function to Execute Collection&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;async def main():
    comments = await collect_reddit_comments('sarcasm', 'sarcastic', limit=5000)  # Adjust limit as needed
    df = pd.DataFrame(comments, columns=['comment', 'author', 'created_utc'])
    df.to_csv('reddit_comments.csv', index=False)
    print(f"Total comments collected: {len(df)}")
    print(df.head())
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Defines an asynchronous main function to orchestrate the comment collection process. Calls collect_reddit_comments with parameters subreddit_name='sarcasm', keyword='sarcastic', and limit=5000 (can be adjusted). Converts collected comments into a Pandas DataFrame (df), stores it as a CSV file (reddit_comments.csv), and prints summary information about the collected data.&lt;/p&gt;

&lt;p&gt;Running the Main Function&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;await main()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Executes the main function asynchronously, initiating the process of collecting Reddit comments, processing them into a DataFrame, saving them to a CSV file, and providing feedback on the number of comments collected and a preview of the data.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://dev.to/stevenmathew/sarcasm-detection-ai-model-trained-with-reddit-comments-cleaning-and-saving-the-data-46dj"&gt;Read the Part 2 - Sarcasm Detection From Reddit Comments : Cleaning &amp;amp; Saving The Data&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;GITHUB: &lt;a href="https://github.com/stevie1mat/Sarcasm-Detection-With-Reddit-Comments" rel="noopener noreferrer"&gt;https://github.com/stevie1mat/Sarcasm-Detection-With-Reddit-Comments&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Author: &lt;a href="https://stevenmathew.dev" rel="noopener noreferrer"&gt;Steven Mathew&lt;/a&gt;&lt;/p&gt;

</description>
      <category>machin</category>
      <category>reddit</category>
      <category>sarcasm</category>
    </item>
    <item>
      <title>Flutter Youtube List With PHP &amp; MySql</title>
      <dc:creator>Steven Mathew</dc:creator>
      <pubDate>Fri, 21 Oct 2022 17:20:01 +0000</pubDate>
      <link>https://dev.to/stevenmathew/flutter-youtube-list-with-php-mysql-kkp</link>
      <guid>https://dev.to/stevenmathew/flutter-youtube-list-with-php-mysql-kkp</guid>
      <description>&lt;p&gt;While working on a project, I came across the need to create a dynamic Youtube list of videos using the youtube_player_flutter package.&lt;/p&gt;

&lt;p&gt;So here is how I went about developing the code.&lt;/p&gt;

&lt;p&gt;To start with, I created a new Flutter project. Then I opened phpMyAdmin to create the tables for storing the dynamic data that will be shown to the user.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- Create a SQL table
CREATE TABLE `videosapp` (
  `youtubeid` varchar(100) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

ALTER TABLE `videosapp` ADD PRIMARY KEY (`youtubeid`);
COMMIT;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ol&gt;
&lt;li&gt;Create the PHP file to provide the JSON data from the PHPMyAdmin database.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Connect to the MySQL database.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;static $DB_SERVER = "";

static $DB_NAME = "";

static $USERNAME = "";

static $PASSWORD = "";
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Fetch the data and convert it into JSON format.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;while ($row = $result-&amp;gt;fetch_array()) {
    array_push($spacecrafts, array("youtubeid" =&amp;gt; $row['youtubeid']));
}

print(json_encode(array_reverse($spacecrafts)));
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ol&gt;
&lt;li&gt;The final step is to parse the value in the Flutter application.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Use a ListView builder to create a widget that builds a list from the YouTube IDs retrieved from the PHP file in JSON format.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ListView.builder(
  itemCount: widget.spacecrafts.length,
  itemBuilder: (context, int currentIndex) {
    return createViewItem(widget.spacecrafts[currentIndex], context);
  },
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Create a list of YoutubePlayerController objects by providing the YouTube IDs from the database to each controller's initialVideoId.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;final List&amp;lt;YoutubePlayerController&amp;gt; _controllers = [spacecraft.youtubeid]
    .map&amp;lt;YoutubePlayerController&amp;gt;(
      (videoId) =&amp;gt; YoutubePlayerController(
        initialVideoId: videoId,
        flags: YoutubePlayerFlags(autoPlay: false),
      ),
    )
    .toList();
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Create a FutureBuilder, retrieve the snapshot data, and display it on the screen. You can also use Firebase or any other backend service to display the list of videos by using a StreamBuilder.&lt;/p&gt;

&lt;p&gt;You can find the Full Source Code of the project here.&lt;br&gt;
&lt;/p&gt;
&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--A9-wwsHG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/stevie1mat" rel="noopener noreferrer"&gt;
        stevie1mat
      &lt;/a&gt; / &lt;a href="https://github.com/stevie1mat/Flutter-Youtube-List-With-PHP-MySql" rel="noopener noreferrer"&gt;
        Flutter-Youtube-List-With-PHP-MySql
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      Flutter Youtube List From PHP - MySQL (using phpmyadmin)
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;p&gt;
&lt;a rel="noopener noreferrer nofollow" href="https://camo.githubusercontent.com/d913070c7ea706914ad3eecfb8674f16271239c16cfaa13de0e1fe44fe7be00b/68747470733a2f2f79612d77656264657369676e2e636f6d2f696d616765732f796f75747562652d6c6f676f2d627574746f6e2d706e672e706e67"&gt;&lt;img src="https://camo.githubusercontent.com/d913070c7ea706914ad3eecfb8674f16271239c16cfaa13de0e1fe44fe7be00b/68747470733a2f2f79612d77656264657369676e2e636f6d2f696d616765732f796f75747562652d6c6f676f2d627574746f6e2d706e672e706e67" height="100px" width="100px"&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;
  &lt;b&gt;Flutter Youtube List From PHP - MySQL (using phpmyadmin)&lt;/b&gt;
&lt;/p&gt;
  &lt;p&gt;Flutter YouTube list view using PHP and MySQL by converting the data into JSON and then parsing it in the Flutter app. You can also use JSON directly if you don't want to use PHP. (All of these work remotely, so you can edit and change it.)&lt;/p&gt;
  &lt;br&gt;
  &lt;p&gt;Packages Used&lt;/p&gt;
  &lt;ul&gt;
  &lt;li&gt;&lt;a href="https://pub.dev/packages/http" rel="nofollow noopener noreferrer"&gt;HTTP&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://pub.dev/packages/youtube_player_flutter" rel="nofollow noopener noreferrer"&gt;Youtube Player Flutter&lt;/a&gt;&lt;/li&gt;
  &lt;/ul&gt;
&lt;p&gt;&lt;i&gt;Update to the latest version so that you don't face any errors.&lt;/i&gt;&lt;/p&gt;
  &lt;br&gt;
  &lt;p&gt;
  Steps For Setting This Up:
  &lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Create A SQL Table&lt;/li&gt;
    &lt;code&gt;
      CREATE TABLE `videosapp` (
  `youtubeid` varchar(100) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
&lt;br&gt;
&lt;/code&gt;&lt;p&gt;&lt;code&gt;ALTER TABLE &lt;code&gt;videosapp&lt;/code&gt;
ADD PRIMARY KEY (&lt;code&gt;youtubeid&lt;/code&gt;);
COMMIT;
&lt;/code&gt;&lt;/p&gt;
  &lt;/ul&gt;
  &lt;ul&gt;
  &lt;li&gt;Edit the videoapp.php file with your credentials &amp;amp; table name and upload it to your server.
  &lt;/li&gt;
&lt;/ul&gt;
  &lt;ul&gt;
  &lt;li&gt;Finally, change the link to the PHP file in the list.dart file. A sample link has been provided; please don't misuse it in any way.
  &lt;/li&gt;
&lt;/ul&gt;
  &lt;br&gt;
  &lt;p&gt;&lt;b&gt; Screen Shots&lt;/b&gt;&lt;/p&gt;
  &lt;p&gt;
  &lt;a rel="noopener noreferrer nofollow" href="https://camo.githubusercontent.com/5be213cb3829478394f780002153623ab995157f9c161d0545ae17ea304d750e/68747470733a2f2f736a6d6f64656c6167656e63792e636f6d2f6170702f332e6a706567"&gt;&lt;img src="https://camo.githubusercontent.com/5be213cb3829478394f780002153623ab995157f9c161d0545ae17ea304d750e/68747470733a2f2f736a6d6f64656c6167656e63792e636f6d2f6170702f332e6a706567" width="400px"&gt;&lt;/a&gt;
  &lt;a rel="noopener noreferrer nofollow" href="https://camo.githubusercontent.com/9f52f6643ad7e915d0687aa3e6b2eaa72939893ba2311a7d59ba1766eef6a79e/68747470733a2f2f736a6d6f64656c6167656e63792e636f6d2f6170702f322e6a706567"&gt;&lt;img src="https://camo.githubusercontent.com/9f52f6643ad7e915d0687aa3e6b2eaa72939893ba2311a7d59ba1766eef6a79e/68747470733a2f2f736a6d6f64656c6167656e63792e636f6d2f6170702f322e6a706567" width="400px"&gt;&lt;/a&gt;
&lt;/p&gt;
  
&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;Buy Me A Coffee&lt;/h1&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://rzp.io/l/jlOOFVXJ" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/72278bb447c293ce476ddcb25f47200b807646ee9a2282dc1b0478b3bc04eeec/68747470733a2f2f73332e61702d736f757468656173742d312e616d617a6f6e6177732e636f6d2f696d616765732e64656363616e6368726f6e69636c652e636f6d2f64632d436f7665722d753062333439757071756766696f31393573346c706b383134342d32303139303231333132303330332e4d6564692e6a706567" width="200" height="100"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;/div&gt;
&lt;br&gt;
&lt;br&gt;
  &lt;/div&gt;
&lt;br&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/stevie1mat/Flutter-Youtube-List-With-PHP-MySql" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;br&gt;
&lt;/div&gt;
&lt;br&gt;


&lt;p&gt;Author: &lt;a href="https://stevenmathew.dev" rel="noopener noreferrer"&gt;Steven Mathew&lt;/a&gt;&lt;/p&gt;

</description>
      <category>flutter</category>
      <category>dart</category>
      <category>mysql</category>
      <category>android</category>
    </item>
  </channel>
</rss>
