<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Nigel Okoth</title>
    <description>The latest articles on DEV Community by Nigel Okoth (@anvilicious).</description>
    <link>https://dev.to/anvilicious</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F578536%2Fea4dd25a-0a18-444b-a0e2-60a0de766cde.jpg</url>
      <title>DEV Community: Nigel Okoth</title>
      <link>https://dev.to/anvilicious</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/anvilicious"/>
    <language>en</language>
    <item>
      <title>🌐 How to Create a Free WordPress Website with Free Hosting &amp; Domain (SSL Included)</title>
      <dc:creator>Nigel Okoth</dc:creator>
      <pubDate>Wed, 16 Apr 2025 12:43:14 +0000</pubDate>
      <link>https://dev.to/anvilicious/how-to-create-a-free-wordpress-website-with-free-hosting-domain-ssl-included-1njl</link>
      <guid>https://dev.to/anvilicious/how-to-create-a-free-wordpress-website-with-free-hosting-domain-ssl-included-1njl</guid>
      <description>&lt;p&gt;Have you ever wanted to launch your own WordPress website without paying a dime for hosting or a domain? Whether you're building a personal portfolio, blog, e-commerce site, or even a news platform, this guide will walk you through how to do it completely free of cost — including installing SSL (Secure Socket Layer) for a secure connection.&lt;br&gt;
Let’s get started!&lt;/p&gt;

&lt;h2&gt;
  
  
  🚀 Step 1: Use InfinityFree for Free Hosting and Subdomain
&lt;/h2&gt;

&lt;p&gt;InfinityFree is one of the most reliable and popular free hosting providers out there. It has been offering ad-free, fast, and free website hosting for over 10 years!&lt;br&gt;
Key Features:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No ads on your website&lt;/li&gt;
&lt;li&gt;Fast loading times&lt;/li&gt;
&lt;li&gt;Free subdomain support&lt;/li&gt;
&lt;li&gt;Option to use a custom domain&lt;/li&gt;
&lt;li&gt;cPanel-style control panel with Softaculous installer&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  📝 Step 2: Sign Up for InfinityFree
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Head to InfinityFree&lt;/li&gt;
&lt;li&gt;Click Sign Up&lt;/li&gt;
&lt;li&gt;Enter your email address and password&lt;/li&gt;
&lt;li&gt;Click Create New Profile&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If you already have an account, just log in.&lt;/p&gt;

&lt;h2&gt;
  
  
  🌍 Step 3: Create Your Free Domain
&lt;/h2&gt;

&lt;p&gt;Once you're logged in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Click Create Account&lt;/li&gt;
&lt;li&gt;Under Free Subdomain, choose a domain extension (e.g., .epizy.com, .rf.gd)&lt;/li&gt;
&lt;li&gt;Enter your desired website name&lt;/li&gt;
&lt;li&gt;If the name is unavailable, try another&lt;/li&gt;
&lt;li&gt;Click Create Account&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;✨ Your domain is now created!&lt;/p&gt;

&lt;h2&gt;
  
  
  ⚙️ Step 4: Install WordPress
&lt;/h2&gt;

&lt;p&gt;To install WordPress:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Go to the Control Panel&lt;/li&gt;
&lt;li&gt;Approve the SSL notice if prompted&lt;/li&gt;
&lt;li&gt;Scroll down and click on Softaculous Apps Installer&lt;/li&gt;
&lt;li&gt;Select WordPress &amp;gt; Install&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Choose:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Protocol: https://&lt;/li&gt;
&lt;li&gt;Domain: your new subdomain&lt;/li&gt;
&lt;li&gt;Admin Username&lt;/li&gt;
&lt;li&gt;Password&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Scroll down and click Install.&lt;br&gt;
✅ WordPress will be installed shortly.&lt;/p&gt;

&lt;h2&gt;
  
  
  🔒 Step 5: Fix HTTPS and Add Free SSL Certificate
&lt;/h2&gt;

&lt;p&gt;Sometimes after installation, your site may not load over HTTPS. Here's how to fix that:&lt;br&gt;
Add Free SSL:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Go back to the InfinityFree dashboard&lt;/li&gt;
&lt;li&gt;Click on Free SSL Certificate&lt;/li&gt;
&lt;li&gt;Paste your subdomain&lt;/li&gt;
&lt;li&gt;Choose the recommended SSL provider&lt;/li&gt;
&lt;li&gt;Set up the CNAME record to verify your domain&lt;/li&gt;
&lt;li&gt;Wait about 20–60 minutes until status turns green&lt;/li&gt;
&lt;li&gt;Click Request Certificate&lt;/li&gt;
&lt;li&gt;Wait a few more minutes for SSL to be issued&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;After that:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Go to the SSL/TLS section&lt;/li&gt;
&lt;li&gt;Install your Private Key and Certificate&lt;/li&gt;
&lt;li&gt;Refresh your browser&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;✅ Your site will now work over a secure HTTPS connection.&lt;/p&gt;

&lt;h2&gt;
  
  
  🎨 Step 6: Design Your WordPress Site
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Login to your WordPress dashboard (yourdomain.epizy.com/wp-admin)&lt;/li&gt;
&lt;li&gt;Go to Appearance &amp;gt; Themes&lt;/li&gt;
&lt;li&gt;Choose a free theme like Astra&lt;/li&gt;
&lt;li&gt;Click Get Started to import starter templates&lt;/li&gt;
&lt;li&gt;Select a free pre-built design&lt;/li&gt;
&lt;li&gt;Follow the setup wizard to build your site&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;👏 You now have a fully functional, professional-looking WordPress site — totally free!&lt;/p&gt;

&lt;h2&gt;
  
  
  ✅ Final Thoughts
&lt;/h2&gt;

&lt;p&gt;You’ve just created a free WordPress website with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Free hosting via InfinityFree&lt;/li&gt;
&lt;li&gt;A free subdomain&lt;/li&gt;
&lt;li&gt;HTTPS enabled with SSL&lt;/li&gt;
&lt;li&gt;A beautiful design powered by free themes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can now customize your site, add content, install plugins, and make it your own.&lt;/p&gt;

&lt;h2&gt;
  
  
  💬 Need Help?
&lt;/h2&gt;

&lt;p&gt;If you found this guide helpful or have any questions, feel free to leave a comment or reach out. Happy building!&lt;/p&gt;

</description>
      <category>wordpress</category>
      <category>webdev</category>
      <category>programming</category>
      <category>php</category>
    </item>
    <item>
      <title>Clustering with R Programming</title>
      <dc:creator>Nigel Okoth</dc:creator>
      <pubDate>Mon, 14 Aug 2023 10:21:21 +0000</pubDate>
      <link>https://dev.to/anvilicious/clustering-with-r-programming-1jmk</link>
      <guid>https://dev.to/anvilicious/clustering-with-r-programming-1jmk</guid>
      <description>&lt;p&gt;Clustering is a machine learning technique that falls under the category of unsupervised learning. In unsupervised learning, we work with datasets that consist of input data without labeled responses or target variables. This means that we do not have predefined categories or classes for the data.&lt;/p&gt;

&lt;p&gt;The main objective of clustering is to identify meaningful patterns and structures within a dataset by grouping similar data points together and separating dissimilar data points. This is done without any prior knowledge of the underlying class labels or categories.&lt;/p&gt;

&lt;p&gt;The process of clustering involves dividing a population or dataset into a number of distinct groups or clusters. The goal is to ensure that data points within the same cluster are more similar to each other compared to data points in different clusters. At the same time, data points in different clusters should be dissimilar or less similar to each other.&lt;/p&gt;

&lt;p&gt;In this chapter, different clustering techniques in R are showcased, which encompass k-means clustering, k-medoids clustering, hierarchical clustering, and density-based clustering. The initial two sections illustrate the application of the k-means and k-medoids algorithms to cluster the iris dataset. Following that, the third section demonstrates an example of hierarchical clustering using the same dataset. Lastly, the concept of density-based clustering and the utilization of the DBSCAN algorithm are discussed, along with a demonstration of clustering using DBSCAN and assigning labels to new data points using the clustering model.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;K-Means Clustering&lt;/strong&gt;&lt;br&gt;
We begin by using the iris dataset for k-means clustering. In the code below, the species column is removed before we can implement the kmeans() method on the new dataset. We then store the clustering result inside kmeans.result and set the cluster number to 3.&lt;/p&gt;
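&lt;p&gt;In text form, the steps described above might look like the following sketch (the seed value is illustrative; any fixed seed makes the run reproducible):&lt;/p&gt;

```r
# k-means on iris: drop the label, cluster into 3 groups, inspect the result
iris2 = iris
iris2$Species = NULL                  # remove the species column before clustering
set.seed(8953)                        # results vary across runs without a fixed seed
kmeans.result = kmeans(iris2, centers = 3)

# compare the cluster assignments with the original species labels
table(iris$Species, kmeans.result$cluster)

# plot the first two dimensions, marking each cluster center with an asterisk
plot(iris2[c("Sepal.Length", "Sepal.Width")], col = kmeans.result$cluster)
points(kmeans.result$centers[, c("Sepal.Length", "Sepal.Width")],
       col = 1:3, pch = 8, cex = 2)
```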

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2ypkbkmieec1w019dw08.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2ypkbkmieec1w019dw08.png" alt="Image description" width="749" height="472"&gt;&lt;/a&gt;&lt;br&gt;
The next step is to compare our clustering output with the Species label to check for classification accuracy. Based on the results, we can see that the 'setosa' cluster is the easiest to segment. However, the features of the 'virginica' and 'versicolor' clusters tend to overlap, albeit to a small degree.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftwpiw6enf2cgre124yx6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftwpiw6enf2cgre124yx6.png" alt="Image description" width="376" height="136"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Subsequently, the clusters and their centers are visualized in a plot (refer to the figure below). It is important to note that although the data contains four dimensions, only the first two dimensions are considered for the plot presented below. In the plot, certain black data points appear to be near the green centroid (marked as an asterisk); however, in the four-dimensional space, they may actually be closer to the black centroid. Additionally, it is crucial to acknowledge that the outcomes of k-means clustering can differ across different runs due to the random selection of initial cluster centroids.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftmfkz68n22t6gzl3htv5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftmfkz68n22t6gzl3htv5.png" alt="Image description" width="709" height="495"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The k-Medoids Clustering&lt;/strong&gt;&lt;br&gt;
In this section, k-medoids clustering is demonstrated using the functions pam() and pamk() in R. K-medoids clustering is similar to k-means clustering, but with a key difference: while k-means represents a cluster with its center, k-medoids represents a cluster with the data point closest to the center. This characteristic makes k-medoids more robust in the presence of outliers.&lt;/p&gt;

&lt;p&gt;The PAM (Partitioning Around Medoids) algorithm is a classic approach for k-medoids clustering. However, it can be inefficient for clustering large datasets. To address this limitation, the CLARA algorithm is introduced. CLARA draws multiple samples of the data, applies the PAM algorithm to each sample, and returns the best clustering result. This approach performs better on larger datasets. In R, the functions pam() and clara() from the cluster package are implementations of PAM and CLARA, respectively.&lt;/p&gt;

&lt;p&gt;To specify the desired number of clusters (k), the user needs to provide the value as input when using the pam() or clara() functions. However, an enhanced version called pamk() from the fpc package does not require the user to explicitly choose k. Instead, it estimates the optimal number of clusters using the average silhouette width.&lt;/p&gt;

&lt;p&gt;The code snippet provided below demonstrates how to perform k-medoids clustering using the pam() and pamk() functions in R.&lt;/p&gt;
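&lt;p&gt;As a sketch, the two calls might look like this (pamk() picks the number of clusters itself, while pam() takes it as an argument):&lt;/p&gt;

```r
# k-medoids on iris with pamk() and pam()
library(fpc)       # provides pamk()
library(cluster)   # provides pam()

iris2 = iris
iris2$Species = NULL

pamk.result = pamk(iris2)       # estimates k via the average silhouette width
pamk.result$nc                  # number of clusters chosen
table(pamk.result$pamobject$clustering, iris$Species)

pam.result = pam(iris2, 3)      # here we supply k = 3 ourselves
table(pam.result$clustering, iris$Species)
```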

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F14e9h8bx8y488sq4gkra.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F14e9h8bx8y488sq4gkra.png" alt="Image description" width="800" height="344"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu6sv5y3695sw4yexekqk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu6sv5y3695sw4yexekqk.png" alt="Image description" width="800" height="250"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyc5jfw8l1mdjtlnbl6zq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyc5jfw8l1mdjtlnbl6zq.png" alt="Image description" width="547" height="513"&gt;&lt;/a&gt;&lt;br&gt;
In the above example, the function pamk() yields two clusters: one representing the "setosa" species and the other comprising a mixture of the "versicolor" and "virginica" species. The figure above showcases the results through two charts. The left chart, known as a "clusplot" or clustering plot, visualizes the two clusters in a 2-dimensional space, with lines indicating the distances between the clusters. The right chart displays the silhouettes of the clusters. The silhouette coefficient (si) measures the quality of clustering for each observation. A high si value close to 1 indicates that the observation is well-clustered, while a small si value around 0 suggests that the observation lies between two clusters. Observations with a negative si value are likely assigned to the wrong cluster. In the given silhouette, the average si values are 0.81 and 0.62 for the two clusters, indicating that the identified clusters are well-separated and appropriately clustered.&lt;/p&gt;

&lt;p&gt;Let us now try pam() using k = 3.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvpzt8z3ibargeyu31u38.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvpzt8z3ibargeyu31u38.png" alt="Image description" width="703" height="253"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fspa5cl24fyaqh737mycg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fspa5cl24fyaqh737mycg.png" alt="Image description" width="681" height="510"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F80xin55dhlcdcrtfml4d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F80xin55dhlcdcrtfml4d.png" alt="Image description" width="484" height="417"&gt;&lt;/a&gt;&lt;br&gt;
In the above outputs generated using pam(), we have a 'setosa' cluster which is distinct from the other two clusters that primarily account for the 'versicolor' and 'virginica' species. I am still not sure whether pam() outperforms pamk(), because the answer is contingent on domain knowledge and the target problem. However, pam() seems more suitable in our current context because it discerned the three main clusters. The caveat is that we had to instruct pam() to use 3 clusters by setting k equal to 3; we already knew beforehand that there were three species in the iris dataset.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hierarchical Clustering&lt;/strong&gt;&lt;br&gt;
hclust() provides a straightforward way for R data scientists to conduct hierarchical clustering. The preliminary step is to retrieve 40 samples from the iris dataset to avoid ending up with an overcrowded clustering plot. Just like we did in the previous sections, we have to remove the Species variable before performing the clustering.&lt;/p&gt;
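&lt;p&gt;A minimal sketch of this sampling and clustering step (the seed and the linkage method are illustrative choices):&lt;/p&gt;

```r
# hierarchical clustering on a 40-row sample of iris
set.seed(2835)
idx = sample(1:dim(iris)[1], 40)        # 40 unique row indices
irisSample = iris[idx, ]
irisSample$Species = NULL               # drop the label before clustering

hc = hclust(dist(irisSample), method = "ave")    # average linkage
plot(hc, hang = -1, labels = iris$Species[idx])  # dendrogram labeled by species
groups = cutree(hc, k = 3)              # cut the dendrogram into 3 clusters
```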

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fefoh27o8ukhfg2clmy3w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fefoh27o8ukhfg2clmy3w.png" alt="Image description" width="503" height="182"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo1otm4wp2vfeuvfjpjwf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo1otm4wp2vfeuvfjpjwf.png" alt="Image description" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I know the first line is a bit confusing, so let's break it down:&lt;/p&gt;

&lt;p&gt;dim(iris) returns the dimensions of the iris dataset, where the number of rows is obtained by accessing the first element. 1:dim(iris)[1] creates a sequence from 1 to the number of rows in the iris dataset. sample(1:dim(iris)[1], 40) randomly selects 40 values from the sequence generated in the previous step, without replacement. This means that each selected index will be unique and not repeated within the sample. Finally, the selected indices are assigned to the variable idx.&lt;/p&gt;

&lt;p&gt;As demonstrated in k-means clustering, the above figure shows that while the 'setosa' cluster is easy to distinguish, the model overlaps some of the classifications for the 'virginica' and 'versicolor' clusters.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Density-based Clustering&lt;/strong&gt;&lt;br&gt;
Finally, the fpc package supplies the DBSCAN algorithm to help us cluster numeric data based on density. DBSCAN primarily relies on two parameters: eps, which defines the neighborhood size (or reachability distance), and MinPts, which sets the minimum number of points required for a neighborhood to count as dense.&lt;/p&gt;

&lt;p&gt;If a point α has at least MinPts neighbors, it is considered a dense point. All points within its neighborhood are considered density-reachable from α and are assigned to the same cluster as α.&lt;/p&gt;

&lt;p&gt;The advantages of density-based clustering are its ability to detect clusters of different shapes and sizes and its robustness to noise. In contrast, the k-means algorithm tends to identify clusters that are spherical in shape and have similar sizes.&lt;/p&gt;

&lt;p&gt;Here is an illustration of density-based clustering applied to the iris dataset.&lt;/p&gt;
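&lt;p&gt;A sketch of the call (the eps and MinPts values here are illustrative; they set the neighborhood radius and the density threshold described above):&lt;/p&gt;

```r
# DBSCAN on the iris data with the Species column removed
library(fpc)
iris2 = iris[-5]                           # drop the Species column
ds = dbscan(iris2, eps = 0.42, MinPts = 5)
table(ds$cluster, iris$Species)            # cluster 0 collects the noise points
plot(ds, iris2)                            # scatter-plot matrix colored by cluster
```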

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh3vf6i63kda6kh9k03u6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh3vf6i63kda6kh9k03u6.png" alt="Image description" width="388" height="229"&gt;&lt;/a&gt;&lt;br&gt;
In the table above, the numbers "1" to "3" in the first column represent three distinct clusters that have been identified, while the number "0" denotes noise points or outliers, which are data points that do not belong to any cluster. These outliers are represented as black circles in the plot below.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv64dlhpkf3rd9cvsklvh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv64dlhpkf3rd9cvsklvh.png" alt="Image description" width="508" height="470"&gt;&lt;/a&gt;&lt;br&gt;
The clusters are shown below in a scatter plot using the first and fourth columns of the data.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fum5p9kb3tgz0ucjhw22l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fum5p9kb3tgz0ucjhw22l.png" alt="Image description" width="517" height="469"&gt;&lt;/a&gt;&lt;br&gt;
An alternative approach to visualizing the clusters is by utilizing the plotcluster() function from the fpc package. It's worth mentioning that in this representation, the data points are projected in a manner that highlights the distinctions between the different classes or clusters. This projection allows for a clearer visualization of the cluster boundaries and the separation between different groups within the dataset.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffugoo0hmcq89jbn7gzt9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffugoo0hmcq89jbn7gzt9.png" alt="Image description" width="510" height="452"&gt;&lt;/a&gt;&lt;br&gt;
The clustering model can also be employed to assign labels to new data points by assessing their similarity to existing clusters. To illustrate this, consider the following example: a sample of 10 objects is drawn from the iris dataset, and small random noises are introduced to create a new dataset for labeling. These random noises are generated using the runif() function, which employs a uniform distribution. By comparing the new data points to the existing clusters, the clustering model can assign appropriate labels to them based on their similarity to the pre-existing clusters.&lt;/p&gt;
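&lt;p&gt;The labeling step might be sketched like this (the model is re-fitted first so the snippet stands alone; the noise range 0–0.2 is illustrative):&lt;/p&gt;

```r
# fit DBSCAN, then label 10 noisy samples drawn from iris
library(fpc)
iris2 = iris[-5]
ds = dbscan(iris2, eps = 0.42, MinPts = 5)

set.seed(435)
idx = sample(1:nrow(iris), 10)
newData = iris[idx, -5]
# add small uniform noise to each of the 4 measurements
newData = newData + matrix(runif(4 * 10, min = 0, max = 0.2), nrow = 10, ncol = 4)

pred = predict(ds, iris2, newData)    # label the new points with the fitted model
table(pred, iris$Species[idx])        # how many are labeled correctly?
```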

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7h5f1bg40bor4sc66d10.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7h5f1bg40bor4sc66d10.png" alt="Image description" width="639" height="332"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftcfxlnerl3m9yxc8dzc7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftcfxlnerl3m9yxc8dzc7.png" alt="Image description" width="800" height="800"&gt;&lt;/a&gt;&lt;br&gt;
As demonstrated above, only 8 of the 10 new unlabeled data points are correctly labeled. The new data points are shown as asterisks ("*") in the figure above, and the colors stand for cluster labels.&lt;/p&gt;

</description>
      <category>r</category>
      <category>datascience</category>
      <category>webdev</category>
      <category>javascript</category>
    </item>
    <item>
      <title>Decision Trees and Random Forest in R Programming</title>
      <dc:creator>Nigel Okoth</dc:creator>
      <pubDate>Fri, 28 Jul 2023 07:52:55 +0000</pubDate>
      <link>https://dev.to/anvilicious/decision-trees-and-random-forest-in-r-programming-2404</link>
      <guid>https://dev.to/anvilicious/decision-trees-and-random-forest-in-r-programming-2404</guid>
      <description>&lt;p&gt;In this section, the process of constructing predictive models in R using the party, rpart, and randomForest packages is demonstrated. The chapter commences by constructing decision trees using the party package and employing the generated tree for classification purposes. Subsequently, an alternative approach to constructing decision trees using the rpart package is introduced. Finally, an example is provided to showcase the training of a random forest model using the randomForest package.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Decision Trees using Package party&lt;/strong&gt;&lt;br&gt;
This section illustrates the process of constructing a decision tree for the iris data using the ctree() function from the party package. More specifically, the features Sepal.Length, Sepal.Width, Petal.Length, and Petal.Width are utilized to predict the species of flowers. The ctree() function within the package builds the decision tree, while predict() enables predictions for new data. Prior to modeling, the iris data is divided into two subsets: training (70%) and test (30%). To ensure reproducibility of the results, a fixed value is set for the random seed.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F68ww6b72b1owkld8mv5d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F68ww6b72b1owkld8mv5d.png" alt="Image description" width="800" height="399"&gt;&lt;/a&gt;&lt;br&gt;
The code below shows how to load the party package and build the decision tree model before outputting the prediction result. myFormula outlines our target variable (Species) while initializing the other variables as independent parameters.&lt;/p&gt;
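&lt;p&gt;In text form, the split, the model fit, and the predictions might look like this sketch (the seed and the 70/30 proportions follow the description above):&lt;/p&gt;

```r
# split iris 70/30, fit a conditional inference tree, and evaluate it
library(party)

set.seed(1234)
ind = sample(2, nrow(iris), replace = TRUE, prob = c(0.7, 0.3))
trainData = iris[ind == 1, ]
testData = iris[ind == 2, ]

# Species is the target; the four measurements are the predictors
myFormula = Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width
iris_ctree = ctree(myFormula, data = trainData)

table(predict(iris_ctree), trainData$Species)   # fit on the training data
print(iris_ctree)                               # the split rules as text
plot(iris_ctree)                                # tree with terminal bar plots

testPred = predict(iris_ctree, newdata = testData)
table(testPred, testData$Species)               # accuracy on unseen data
```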

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwqvxagcdexnebg9xjhq3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwqvxagcdexnebg9xjhq3.png" alt="Image description" width="680" height="111"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F563qrcd18rhour1ha3jo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F563qrcd18rhour1ha3jo.png" alt="Image description" width="571" height="91"&gt;&lt;/a&gt;&lt;br&gt;
Now let us explore our built tree using the print function to output the rules and by plotting the tree.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa6j0whszvphjc24xdzgf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa6j0whszvphjc24xdzgf.png" alt="Image description" width="616" height="321"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuiazs97u64q2zywclt8t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuiazs97u64q2zywclt8t.png" alt="Image description" width="574" height="478"&gt;&lt;/a&gt;&lt;br&gt;
In the figure above, the bar plot at each terminal node shows the probability of an instance being assigned to each of the three classes. In the figure below, these probabilities are represented as "y" within the nodes. To illustrate, node 2 is denoted "n=40, y=(1, 0, 0)", indicating the presence of 40 training instances, all of which belong to the first class, "setosa". Next, the constructed tree must undergo testing using test data.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh9j3yo0umh1omy9ihmja.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh9j3yo0umh1omy9ihmja.png" alt="Image description" width="517" height="483"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu689a0rha23iw6rhg1p0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu689a0rha23iw6rhg1p0.png" alt="Image description" width="418" height="178"&gt;&lt;/a&gt;&lt;br&gt;
The current version of the ctree() function (0.9-9995 at the time of writing) does not handle missing values robustly. An instance with a missing value may be assigned to the left or right sub-tree inconsistently, possibly because of surrogate rules.&lt;/p&gt;

&lt;p&gt;Another concern arises when a variable is present in the training data and provided to ctree(), but does not appear in the constructed decision tree. In such instances, the test data must also contain that variable in order for predictions to be made successfully using the predict() function. Additionally, if the categorical variable levels in the test data differ from those in the training data, prediction on the test data will fail.&lt;/p&gt;

&lt;p&gt;To address the aforementioned issues, one possible solution is to construct a new decision tree using ctree() after the initial tree is built. This new tree should only include variables that exist in the first tree. Furthermore, it is essential to explicitly set the categorical variable levels in the test data to match the corresponding variable levels in the training data.&lt;/p&gt;
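&lt;p&gt;A minimal sketch of that workaround might look as follows (the data frame names train.data and test.data are placeholders, not taken from the original code):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;library(party)
# Build the tree on the training data
iris.tree = ctree(Species ~ ., data = train.data)

# Align the factor levels of the test data with those of the training data
for (v in names(train.data)) {
  if (is.factor(train.data[[v]])) {
    test.data[[v]] = factor(test.data[[v]], levels = levels(train.data[[v]]))
  }
}
pred = predict(iris.tree, newdata = test.data)
&lt;/code&gt;&lt;/pre&gt;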

&lt;p&gt;&lt;strong&gt;Decision Trees with Package rpart&lt;/strong&gt;&lt;br&gt;
To mix things up a little, we will use the bodyfat dataset alongside the rpart package to create a decision tree model. rpart() builds the model, from which we select the decision tree with the least prediction error. Thereafter, we apply the model to data it has never seen before and generate predictions using the usual suspect: predict(). But first things first, let us load the bodyfat dataset.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyfuev9n2cc8btk0x5ryb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyfuev9n2cc8btk0x5ryb.png" alt="Image description" width="653" height="274"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0l4rnplzf1wuz93sskmt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0l4rnplzf1wuz93sskmt.png" alt="Image description" width="422" height="530"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpt6euu7zabohsdu4xy3m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpt6euu7zabohsdu4xy3m.png" alt="Image description" width="800" height="288"&gt;&lt;/a&gt;&lt;br&gt;
The following code splits the dataset into training and test subsets before a decision tree model is built on the training subset.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F51kp13krjfze365zqe6i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F51kp13krjfze365zqe6i.png" alt="Image description" width="765" height="177"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4zhrdokev5ry4ay2dw1w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4zhrdokev5ry4ay2dw1w.png" alt="Image description" width="422" height="499"&gt;&lt;/a&gt;&lt;br&gt;
Let us visualize the built tree.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4u78yqoefg2ylxar01da.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4u78yqoefg2ylxar01da.png" alt="Image description" width="520" height="483"&gt;&lt;/a&gt;&lt;br&gt;
Now let us identify the tree with the lowest cross-validated prediction error.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn481vsfxrw6cb9jc5qda.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn481vsfxrw6cb9jc5qda.png" alt="Image description" width="429" height="382"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3w9tabidleir9fjcs0o7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3w9tabidleir9fjcs0o7.png" alt="Image description" width="424" height="441"&gt;&lt;/a&gt;&lt;br&gt;
We can now use the best tree to predict values and compare them with the actual values in the bodyfat dataset. The following code uses abline() to draw a diagonal line representing the actual values; if the model is good enough, most of the points should lie on or close to this line.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Felfyd4g36zkv12wt8050.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Felfyd4g36zkv12wt8050.png" alt="Image description" width="800" height="512"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Random Forest&lt;/strong&gt;&lt;br&gt;
Finally, let us install the randomForest package for our next predictive model, which will use the iris dataset. Note that randomForest cannot handle datasets with missing values, and every categorical variable can have at most 32 levels; variables with more levels must be transformed before the data is fed to the model. Alternatively, you can leverage package party's cforest() function, which does not limit categorical attributes to 32 levels. Nonetheless, you will still use more memory and spend more time training the model when a variable has too many levels. We begin by splitting the dataset into training and test subsets.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6k6wkhm3s8ps7q35x866.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6k6wkhm3s8ps7q35x866.png" alt="Image description" width="604" height="78"&gt;&lt;/a&gt;&lt;br&gt;
The code below loads the required package and begins the training process. The logic is much the same as in the previous examples.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5dmpwhp48tyfd7nqpcq8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5dmpwhp48tyfd7nqpcq8.png" alt="Image description" width="576" height="360"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5llg510neweo3y892gmv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5llg510neweo3y892gmv.png" alt="Image description" width="576" height="360"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe47g9dg7gafa2bw73ald.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe47g9dg7gafa2bw73ald.png" alt="Image description" width="669" height="260"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5mvwcacbwph1bqewyx9z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5mvwcacbwph1bqewyx9z.png" alt="Image description" width="347" height="516"&gt;&lt;/a&gt;&lt;br&gt;
After that, we plot the error rates against the number of trees.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgbopauel0hece1k6masy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgbopauel0hece1k6masy.png" alt="Image description" width="401" height="418"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffiza2uy3twv6dnms6vvz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffiza2uy3twv6dnms6vvz.png" alt="Image description" width="285" height="244"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa6ty54lvmx6bfj686qij.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa6ty54lvmx6bfj686qij.png" alt="Image description" width="800" height="800"&gt;&lt;/a&gt;&lt;br&gt;
Finally, the built random forest is tested on the test data, and the results are checked with the functions table() and margin(). The margin of a data point is defined as the proportion of votes for the correct class minus the maximum proportion of votes for the other classes. Generally speaking, a positive margin means correct classification.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwfq8ty7t77euobggqs3z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwfq8ty7t77euobggqs3z.png" alt="Image description" width="332" height="151"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fopkhnmozwwbm93iaxedh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fopkhnmozwwbm93iaxedh.png" alt="Image description" width="408" height="462"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>4 Outlier Detection Strategies with R</title>
      <dc:creator>Nigel Okoth</dc:creator>
      <pubDate>Thu, 27 Jul 2023 10:55:26 +0000</pubDate>
      <link>https://dev.to/anvilicious/4-outlier-detection-strategies-with-r-2ehd</link>
      <guid>https://dev.to/anvilicious/4-outlier-detection-strategies-with-r-2ehd</guid>
      <description>&lt;p&gt;In this article, I focus on 4 examples of outlier detection using R. I will start with univariate outlier detection before providing an example using Local Outlier Factor clustering, and time series outlier detection.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Univariate Outlier Detection&lt;/strong&gt;&lt;br&gt;
We sometimes need to detect outliers when dealing with a single variable. R provides an elegant method for this task: boxplot.stats(). The function returns the statistics typically used to generate boxplots, and its 'out' component contains a list of all the outliers in the data. We can also use the 'coef' argument to change the whiskers' boundaries. But that's a story for another day.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo7dg74cidj6m75ubnfgq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo7dg74cidj6m75ubnfgq.png" alt="Image description" width="354" height="227"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvfoqayzkve56916wiibt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvfoqayzkve56916wiibt.png" alt="Image description" width="800" height="800"&gt;&lt;/a&gt;&lt;br&gt;
The same univariate outlier detection process also works for multivariate data. The code below shows how we can use the variables x and y inside a dataframe to detect outliers. We begin by detecting each variable's outliers separately before achieving the same result for columns x and y combined.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvj2ylhx0joj9vmf7imve.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvj2ylhx0joj9vmf7imve.png" alt="Image description" width="498" height="322"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftt05ls9ab42panyng4ft.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftt05ls9ab42panyng4ft.png" alt="Image description" width="800" height="800"&gt;&lt;/a&gt;&lt;br&gt;
We can also assume that outliers are any extreme values in either column.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn5hl9tvxffi55khvq6r0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn5hl9tvxffi55khvq6r0.png" alt="Image description" width="445" height="282"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2n9xn0jaya6dmnoa1sw5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2n9xn0jaya6dmnoa1sw5.png" alt="Image description" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Outlier Detection with LOF&lt;/strong&gt;&lt;br&gt;
LOF stands for "Local Outlier Factor." It is an unsupervised machine learning algorithm used for outlier detection. The LOF algorithm quantifies the 'outlierness' of each data point with respect to its local neighborhood: a point's local density is compared with that of its neighbors, and a point whose LOF value is substantially greater than 1 sits in a sparser region than its neighbors and is regarded as an outlier. &lt;/p&gt;

&lt;p&gt;LOF is particularly useful when dealing with datasets containing irregularly shaped clusters or non-uniform density distributions. Unfortunately, we can only use this method on numeric data. We use lofactor() to estimate the local outlier factors using R's dprep and DMwR packages. In the example below, we detect the outliers using variable k as the assumed number of neighbors during the calculation. We have also generated a chart of the factor scores.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F216bdyn9r23cxuchnrue.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F216bdyn9r23cxuchnrue.png" alt="Image description" width="494" height="165"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpzzd4smrbvvlmohr8g4s.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpzzd4smrbvvlmohr8g4s.png" alt="Image description" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flnrav3oimxlrbbkwmur8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flnrav3oimxlrbbkwmur8.png" alt="Image description" width="497" height="382"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpw1po651o98ujn5ivhld.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpw1po651o98ujn5ivhld.png" alt="Image description" width="800" height="800"&gt;&lt;/a&gt;&lt;br&gt;
In the code above, we use prcomp() to perform the principal component analysis and biplot() to generate the chart by plotting the data with the first two components. Principal Component Analysis (PCA) is a widely used statistical technique for data analysis and dimensionality reduction. It transforms high-dimensional data into a lower-dimensional space while preserving as much of the data's original variability as possible, by finding a new set of orthogonal axes, known as principal components, along which the data has the maximum variance. In the chart above, the x- and y-axis are respectively the first and second principal components, the arrows show the original columns (variables), and the five outliers are labeled with their row numbers.&lt;/p&gt;
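&lt;p&gt;The biplot step can be sketched as follows, assuming a numeric data frame iris2 and an outliers vector of row numbers as produced by the LOF step:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;n = nrow(iris2)
labels = 1:n
labels[-outliers] = "."     # label only the outliers with their row numbers
biplot(prcomp(iris2), cex = 0.8, xlabs = labels)
&lt;/code&gt;&lt;/pre&gt;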

&lt;p&gt;Additionally, we can visualize outliers in a pairs plot, where outliers are marked with a red "+" symbol. The code is demonstrated below:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9pbq1p9gzcxj5qn3bvlm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9pbq1p9gzcxj5qn3bvlm.png" alt="Image description" width="481" height="524"&gt;&lt;/a&gt;&lt;br&gt;
Package dbscan has the function lof(), another implementation of the LOF algorithm. While it serves the same purpose as lofactor(), it additionally supports several distance metrics and multiple choices of k. The following code shows an application of lof(): we first calculate the outlier scores and then identify the outliers by displaying the top values.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7r5erf5tt8m8zh67dfxq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7r5erf5tt8m8zh67dfxq.png" alt="Image description" width="648" height="511"&gt;&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;3. Outlier Detection by Clustering&lt;/strong&gt;&lt;br&gt;
When we cluster data, the data points that do not belong to any specific group or category can be regarded as outliers. A good example is DBSCAN's density-based clustering, which groups objects into a single cluster if they belong to the same densely populated area. As a result, objects that are not assigned to any cluster are treated as outliers due to their isolation.&lt;/p&gt;

&lt;p&gt;However, in this section, we will use the k-means algorithm to achieve our goal, as demonstrated in the code below. We first partition the data into k groups by assigning each data point to its nearest cluster center. The next step is to calculate the distance between every item and its cluster center, and to pick the items with the largest distance values as outliers.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5kpgobljjzz9owa4uskt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5kpgobljjzz9owa4uskt.png" alt="Image description" width="593" height="524"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvcs6troy4i3jvo85qfmx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvcs6troy4i3jvo85qfmx.png" alt="Image description" width="448" height="166"&gt;&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;4. Outlier Detection from Time Series&lt;/strong&gt;&lt;br&gt;
We will now use a time series dataset for the purpose of outlier detection. We begin by decomposing the series with the robust regression methodology of the function stl(). The process is also known as robust fitting, and it allows us to identify the outliers. Check out &lt;a href="http://cs.wellesley.edu/%7Ecs315/Papers/stl%20statistical%20model" rel="noopener noreferrer"&gt;http://cs.wellesley.edu/~cs315/Papers/stl%20statistical%20model&lt;/a&gt; if you want to learn more about STL.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2lr0nsrlt10lk2ppaoco.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2lr0nsrlt10lk2ppaoco.png" alt="Image description" width="682" height="478"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl4ydq6ex058mkku1qy3r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl4ydq6ex058mkku1qy3r.png" alt="Image description" width="800" height="800"&gt;&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Conclusion&lt;/strong&gt;&lt;br&gt;
If you made it this far, the least I could do is offer you a veritable easter egg in the form of a book quote:&lt;/p&gt;

&lt;p&gt;"Mario, what do you get when you cross an insomniac, an unwilling agnostic and a dyslexic?"&lt;/p&gt;

&lt;p&gt;"I give."&lt;/p&gt;

&lt;p&gt;"You get someone who stays up all night torturing himself mentally over the question of whether or not there's a dog."&lt;/p&gt;

&lt;p&gt;David Foster Wallace - Infinite Jest&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>programming</category>
      <category>beginners</category>
      <category>r</category>
    </item>
    <item>
      <title>How to Generate and Save Fancy Graphs and 3D Plots as PDF in R Programming</title>
      <dc:creator>Nigel Okoth</dc:creator>
      <pubDate>Mon, 03 Jul 2023 13:57:20 +0000</pubDate>
      <link>https://dev.to/anvilicious/how-to-generate-and-save-fancy-graphs-and-3d-plots-as-pdf-in-r-programming-kap</link>
      <guid>https://dev.to/anvilicious/how-to-generate-and-save-fancy-graphs-and-3d-plots-as-pdf-in-r-programming-kap</guid>
      <description>&lt;p&gt;This is a continuation of my &lt;a href="https://dev.to/anvilicious/fundamentals-of-data-analysis-in-r-programming-r-cheat-sheet-code-included-3m1p"&gt;previous article&lt;/a&gt; on fundamentals of data analysis using R. You can access all the code used in both articles from my &lt;a href="https://github.com/BDFL669/Data_Analysis_with_R/blob/main/Fundamentals_of_Data_Analysis_in_R_Programming.ipynb" rel="noopener noreferrer"&gt;Github repo&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;In this section, various visually appealing graphs are showcased, such as 3D plots, level plots, contour plots, interactive plots, and parallel coordinates. &lt;/p&gt;

&lt;h2&gt;
  
  
  Scatter Plot
&lt;/h2&gt;

&lt;p&gt;We begin by creating a 3D scatter plot using the scatterplot3d package.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F89gafdi6qddj8shsdc2g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F89gafdi6qddj8shsdc2g.png" alt="Image description" width="464" height="398"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Heatmaps
&lt;/h2&gt;

&lt;p&gt;A heat map is a visual representation of a two-dimensional data matrix, which can be created using the heatmap() function in R. The following code demonstrates how to calculate the similarity between various flowers in the iris data using dist() and subsequently visualize it as a heat map.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxzj68dlrs4b2bdd1zkij.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxzj68dlrs4b2bdd1zkij.png" alt="Image description" width="357" height="377"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Level Plots
&lt;/h2&gt;

&lt;p&gt;A level plot is a type of graphical representation that displays the variation in a two-dimensional dataset using contour-like regions or color gradients. To generate a level plot, you can utilize the levelplot() function available in the lattice package. The grey.colors() function can be used to create a vector of gamma-corrected gray colors. Alternatively, you can use the rainbow() function, which generates a vector of continuous colors.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj9wg6gdxk1oitxs59jdk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj9wg6gdxk1oitxs59jdk.png" alt="Image description" width="633" height="403"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Contour Plots
&lt;/h2&gt;

&lt;p&gt;A contour plot is a graphical representation that illustrates the variation of a two-dimensional dataset by displaying lines or regions of constant values, often used to visualize continuous data or functions. You can create contour plots in R using the contour() and filled.contour() functions from the graphics package, or by utilizing the contourplot() function available in the lattice package.&lt;/p&gt;
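
&lt;p&gt;A quick sketch using the built-in volcano elevation matrix (any two-dimensional numeric matrix works):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Plain contour lines
contour(volcano)
# Filled contour regions with a color gradient
filled.contour(volcano, color.palette = terrain.colors)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;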

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw16vfqfcnaw145g3z56t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw16vfqfcnaw145g3z56t.png" alt="Image description" width="620" height="349"&gt;&lt;/a&gt;&lt;br&gt;
To interpret the above contour plot, you can examine the lines or regions of constant values, which represent the levels or contours of the data. The contour lines indicate areas of similar values, with closer lines indicating a steeper change in the underlying data. The spacing between contour lines can provide insights into the data's gradient or rate of change. Additionally, the colors or shading used in a filled contour plot further indicate the magnitude or intensity of the data values.&lt;/p&gt;

&lt;h2&gt;
  
  
  Surface Plots
&lt;/h2&gt;

&lt;p&gt;An alternative method to visualize a numerical matrix is by using a 3D surface plot, which can be created using the persp() function.&lt;/p&gt;
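
&lt;p&gt;A sketch with the built-in volcano matrix; theta and phi set the viewing angles and the values here are arbitrary:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# 3D surface of the volcano elevation data
persp(volcano, theta = 30, phi = 30, expand = 0.5, col = "lightblue")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;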

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbnn5b5ak8tquysbbangs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbnn5b5ak8tquysbbangs.png" alt="Image description" width="436" height="351"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Parallel Coordinates
&lt;/h2&gt;

&lt;p&gt;Parallel coordinates also offer an effective means of visualizing multidimensional data, and you can generate a parallel coordinates plot using the parcoord() function from the MASS package, or by utilizing the parallelplot() function available in the lattice package.&lt;/p&gt;
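
&lt;p&gt;Both calls take a numeric matrix or data frame; a sketch with the iris data, coloring each line by species:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;library(MASS)
parcoord(iris[, 1:4], col = as.integer(iris$Species))

# The lattice equivalent
library(lattice)
parallelplot(iris[, 1:4])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;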

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx9i2uxzl9fd78c76gehl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx9i2uxzl9fd78c76gehl.png" alt="Image description" width="391" height="385"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuz6koweonwbscxxxr11f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuz6koweonwbscxxxr11f.png" alt="Image description" width="424" height="386"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Save Charts into Files
&lt;/h2&gt;

&lt;p&gt;If you generate numerous graphs during data exploration, it is advisable to save them as files. R offers several functions for this purpose. Below are examples of saving charts as PDF and PS files using pdf() and postscript() respectively. Additionally, you can generate picture files in BMP, JPEG, PNG, and TIFF formats using bmp(), jpeg(), png(), and tiff() respectively. Remember to close the files (or graphics devices) by using graphics.off() or dev.off() after plotting.&lt;/p&gt;
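
&lt;p&gt;For example (the file names here are hypothetical):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Save a chart as a PDF file
pdf("myPlot.pdf")
plot(iris$Sepal.Length)
dev.off()   # close the device so the file is written

# Save a chart as a PNG file
png("myPlot.png")
hist(iris$Sepal.Length)
dev.off()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;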

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmr5r9ckcf4grtej7h4ko.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmr5r9ckcf4grtej7h4ko.png" alt="Image description" width="235" height="187"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;I hope you had as much fun reading this as I did creating the content. Remember to install packages before importing them with the library() function. Shoutout to Zhao's 'R and Data Mining' textbook; I wouldn't have done this without you!&lt;/p&gt;

</description>
      <category>r</category>
      <category>datascience</category>
      <category>data</category>
      <category>beginners</category>
    </item>
    <item>
      <title>Fundamentals of Data Analysis in R Programming (R Cheat Sheet Code Included)</title>
      <dc:creator>Nigel Okoth</dc:creator>
      <pubDate>Mon, 03 Jul 2023 07:59:43 +0000</pubDate>
      <link>https://dev.to/anvilicious/fundamentals-of-data-analysis-in-r-programming-r-cheat-sheet-code-included-3m1p</link>
      <guid>https://dev.to/anvilicious/fundamentals-of-data-analysis-in-r-programming-r-cheat-sheet-code-included-3m1p</guid>
      <description>&lt;p&gt;The code used here is available on &lt;a href="https://github.com/BDFL669/Data_Analysis_with_R/blob/main/Fundamentals_of_Data_Analysis_in_R_Programming.ipynb" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;. Simply click on the 'Open in Colab' button to seamlessly run the code. &lt;/p&gt;

&lt;p&gt;Now that that's out of the way, before any data analysis can begin, we first need dataset(s). R provides the Iris and BodyFat datasets to anyone who knows where to look.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#TH.data package is needed for access to the BodyFat dataset
install.packages("TH.data", repos = "http://cran.r-project.org")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We get the following output when we import the package and use str() to examine the number of variables and entries (columns and rows).&lt;/p&gt;
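
&lt;p&gt;The commands behind that output are, roughly:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;library(TH.data)                      # import the package installed above
data("bodyfat", package = "TH.data")  # attach the BodyFat dataset
str(bodyfat)                          # variables (columns) and observations (rows)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;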

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft9un67ty8jvibhxha1cs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft9un67ty8jvibhxha1cs.png" alt="Image description" width="647" height="275"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We can also probe the Iris dataset like so: &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F92108w7an9n9ioaecmts.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F92108w7an9n9ioaecmts.png" alt="Image description" width="534" height="149"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Data import and export in R
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsdptuiozi9llp2d129e5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsdptuiozi9llp2d129e5.png" alt="Image description" width="328" height="141"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;a &amp;lt;- 1:10: This line creates a numeric vector a containing values from 1 to 10. The colon (:) operator is used to generate a sequence of numbers.&lt;/li&gt;
&lt;li&gt;save(a, file="./data/dumData.Rdata"): The save() function is used to save the variable a into a file named "dumData.Rdata". The file is saved in the "./data/" directory.&lt;/li&gt;
&lt;li&gt;rm(a): The rm() function is used to remove the variable a from the current R session. This means that the variable a is deleted and no longer accessible.&lt;/li&gt;
&lt;li&gt;load("./data/dumData.Rdata"): The load() function is used to load the previously saved "dumData.Rdata" file back into the R session. This action reads the file and restores the saved variable a along with its associated values.&lt;/li&gt;
&lt;li&gt;print(a): Finally, the print() function is used to display the contents of the variable a. Since a was loaded from the saved file, it contains the values from 1 to 10 that were previously assigned.&lt;/li&gt;
&lt;/ol&gt;
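
&lt;p&gt;The five steps above, collected into one runnable snippet (it assumes a ./data/ directory already exists):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;a &amp;lt;- 1:10                              # step 1: create the vector
save(a, file = "./data/dumData.Rdata")  # step 2: save it to disk
rm(a)                                   # step 3: delete it from the session
load("./data/dumData.Rdata")            # step 4: restore it from the file
print(a)                                # step 5: display the restored values
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;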

&lt;p&gt;The below example demonstrates how to create a dataframe (df1), save it as a CSV file using write.csv(), and then load the dataframe from the file into df2 using read.csv().&lt;/p&gt;
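
&lt;p&gt;A sketch of that round trip (the column names and values are made up):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;df1 &amp;lt;- data.frame(id = 1:3, score = c(8.5, 9.1, 7.8))
write.csv(df1, "./data/df1.csv", row.names = FALSE)  # save to CSV
df2 &amp;lt;- read.csv("./data/df1.csv")                    # load it back
print(df2)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;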

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1utn31g8tq6d63r5w2ge.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1utn31g8tq6d63r5w2ge.png" alt="Image description" width="459" height="266"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Data Exploration
&lt;/h2&gt;

&lt;p&gt;To examine the size and structure of data, you can use various functions in R. Here are examples that showcase the usage of dim(), names(), str(), and attributes():&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flk2tgprlphpgvpugwqyz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flk2tgprlphpgvpugwqyz.png" alt="Image description" width="609" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;dim() returns the dimensions (number of rows and columns) of an object, such as a matrix or dataframe.&lt;/li&gt;
&lt;li&gt;names() retrieves the names of the variables or columns in an object, such as a dataframe.&lt;/li&gt;
&lt;li&gt;str() provides the structure of an object, displaying the data type and overall structure of the variables or columns.&lt;/li&gt;
&lt;li&gt;attributes() retrieves the attributes associated with an object, which can include additional information or metadata about the data.&lt;/li&gt;
&lt;/ol&gt;
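
&lt;p&gt;Applied to the iris data:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;dim(iris)        # 150 rows, 5 columns
names(iris)      # the five column names
str(iris)        # type and a preview of each column
attributes(iris) # names, class, and row.names
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;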

&lt;p&gt;We can also retrieve the first and last rows of data using the head() and tail() functions.&lt;/p&gt;
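
&lt;p&gt;For example:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;head(iris)      # first 6 rows by default
tail(iris, 3)   # last 3 rows
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;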

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs9brbkeblfeq6nwxsi0d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs9brbkeblfeq6nwxsi0d.png" alt="Image description" width="434" height="518"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpe7lme7vg5tok5oog9t5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpe7lme7vg5tok5oog9t5.png" alt="Image description" width="514" height="405"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Analyzing a Specific Variable
&lt;/h2&gt;

&lt;p&gt;You can use the summary() function to examine the distribution of each numeric variable in your data. It provides important summary statistics such as the minimum, maximum, mean, median, and quartiles (25% and 75%). Additionally, for factors or categorical variables, summary() displays the frequency of each level or category.&lt;/p&gt;
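
&lt;p&gt;For example:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;summary(iris)           # numeric columns: min, quartiles, median, mean, max
summary(iris$Species)   # factor: frequency of each level
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;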

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F316p04nvnwl4i394ae8r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F316p04nvnwl4i394ae8r.png" alt="Image description" width="401" height="203"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can obtain the mean, median, and range of a variable using the functions mean(), median(), and range(), respectively. Additionally, if you need to calculate quartiles or percentiles, you can use the quantile() function.&lt;/p&gt;

&lt;p&gt;To assess the variance of the Sepal.Length variable, you can use the var() function. Furthermore, you can examine its distribution by creating a histogram and density plot. For the histogram, you can utilize the hist() function, while the density() function enables you to generate a density plot.&lt;/p&gt;
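
&lt;p&gt;Putting those functions together for Sepal.Length:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;mean(iris$Sepal.Length)
median(iris$Sepal.Length)
range(iris$Sepal.Length)
quantile(iris$Sepal.Length, c(0.25, 0.75))  # quartiles
var(iris$Sepal.Length)                      # variance
hist(iris$Sepal.Length)                     # histogram
plot(density(iris$Sepal.Length))            # density plot
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;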

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F06aqvedpwwupla25kulr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F06aqvedpwwupla25kulr.png" alt="Image description" width="408" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm9gczrvt6tqulis0prr7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm9gczrvt6tqulis0prr7.png" alt="Image description" width="406" height="371"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The evidence suggests that the variable is normally distributed with a relatively low deviation from the mean.&lt;/p&gt;

&lt;p&gt;You can also determine the frequency of factors in a dataset using the table() function. Once you have calculated the frequencies, you can visualize them using either a pie chart created with the pie() function or a bar chart created with the barplot() function.&lt;/p&gt;
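
&lt;p&gt;For example, the species frequencies and their two visualizations:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;freq &amp;lt;- table(iris$Species)   # 50 of each species
pie(freq)
barplot(freq)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;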

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fct0bpd1i7kff0ouqzy4m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fct0bpd1i7kff0ouqzy4m.png" alt="Image description" width="357" height="415"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxx5ndf7adpdt1bxanp8e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxx5ndf7adpdt1bxanp8e.png" alt="Image description" width="367" height="370"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Explore Multiple Variables
&lt;/h2&gt;

&lt;p&gt;Once we have examined the distributions of individual variables, the next step is to explore the relationships between two variables. To accomplish this, we can calculate the covariance and correlation between the variables using the cov() and cor() functions, respectively. The cov() function provides the covariance, which measures the linear association between variables, while the cor() function calculates the correlation, which measures the strength and direction of the linear relationship between variables.&lt;/p&gt;
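
&lt;p&gt;For example:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cov(iris$Sepal.Length, iris$Petal.Length)   # covariance of two variables
cor(iris$Sepal.Length, iris$Petal.Length)   # correlation of two variables
cor(iris[, 1:4])                            # full correlation matrix
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;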

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0tvj6kqf21ohdi107v07.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0tvj6kqf21ohdi107v07.png" alt="Image description" width="448" height="428"&gt;&lt;/a&gt;&lt;br&gt;
To visualize the distribution of a variable, we can employ the boxplot() function, which generates a box plot, also known as a box-and-whisker plot. This plot displays key statistical measures, including the median, first quartile (25th percentile), third quartile (75th percentile), and any outliers present.&lt;/p&gt;

&lt;p&gt;The median is represented by a horizontal line within the box, while the box itself represents the interquartile range (IQR), indicating the range between the 25th and 75th percentiles. Outliers, if present, are depicted as individual points beyond the whiskers.&lt;/p&gt;

&lt;p&gt;Essentially, a box plot provides a concise summary of the central tendency, spread, and presence of outliers in a distribution.&lt;/p&gt;
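
&lt;p&gt;For example, one box per species:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Distribution of sepal length, split by species
boxplot(Sepal.Length ~ Species, data = iris)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;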

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0cwtujobjx2o0lf42dcz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0cwtujobjx2o0lf42dcz.png" alt="Image description" width="379" height="371"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F26iqeji4sgrty97q9nhp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F26iqeji4sgrty97q9nhp.png" alt="Image description" width="499" height="180"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;aggregate() provides a way to generate the descriptive stats (min, 1st quartile, median, mean, 3rd quartile, and max, respectively) for existing variables as shown above.&lt;/p&gt;
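
&lt;p&gt;A sketch of such a call, producing the six summary statistics of sepal length per species:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aggregate(Sepal.Length ~ Species, data = iris, FUN = summary)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;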

&lt;p&gt;To create a scatter plot for two numeric variables in R, you can use the plot() function. By using the with() function, you can avoid the need to explicitly add "iris$" before variable names. Additionally, in the provided code snippet, the colors (col) and symbols (pch) of the data points are set based on the Species variable.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn7jjjhfaxgpfq4rzowgu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn7jjjhfaxgpfq4rzowgu.png" alt="Image description" width="525" height="368"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In situations where there are numerous data points, it is possible for some of them to overlap. To address this issue, we can employ the jitter() function to introduce a slight amount of randomness or noise to the data prior to plotting.&lt;/p&gt;
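
&lt;p&gt;Combining the two ideas above, a sketch of the plain and the jittered scatter plots:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Scatter plot colored and shaped by species
with(iris, plot(Sepal.Length, Sepal.Width,
                col = Species, pch = as.numeric(Species)))

# Add a little noise with jitter() to separate overlapping points
with(iris, plot(jitter(Sepal.Length), jitter(Sepal.Width)))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;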

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frsmz805n49lysrb0b5w1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frsmz805n49lysrb0b5w1.png" alt="Image description" width="404" height="368"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Bringing it all together
&lt;/h2&gt;

&lt;p&gt;In this section, I have focused mainly on 'zero to minimum viable descriptive analysis' code. In part 2, I will double down on the more complex visualization techniques. However, the code in this article should be enough to generate a comprehensive data analysis report. &lt;/p&gt;

</description>
      <category>r</category>
      <category>beginners</category>
      <category>datascience</category>
      <category>colab</category>
    </item>
    <item>
      <title>A JavaScript/React Developer’s Guide to Building WordPress Themes from Scratch (Including Boilerplate Code)</title>
      <dc:creator>Nigel Okoth</dc:creator>
      <pubDate>Sat, 01 Jul 2023 20:47:39 +0000</pubDate>
      <link>https://dev.to/anvilicious/a-javascriptreact-developers-guide-to-building-wordpress-themes-from-scratch-including-boilerplate-code-2e1g</link>
      <guid>https://dev.to/anvilicious/a-javascriptreact-developers-guide-to-building-wordpress-themes-from-scratch-including-boilerplate-code-2e1g</guid>
      <description>&lt;p&gt;I have spent the past 4 weeks trying to create a boilerplate for my current and future wordpress projects. A lot has happened during this period: from almost learning PHP to having to start again after accidentally deleting a great chunk of my code. Anyway, why not save every other React developer from suffering a similar ordeal?&lt;/p&gt;

&lt;p&gt;First things first, if you’re a React developer, you should be able to code along, and might have a fully functional project within hours. Spoiler alert, I have not worked on the CSS. The site looks uglier than a 90-year-old vitiligo patient. No offence to my brothers and sisters suffering from this horrible disease #VitiligoAwareness.&lt;/p&gt;

&lt;p&gt;I chose &lt;a href="https://liquidtelecom.dl.sourceforge.net/project/xampp/XAMPP%20Windows/8.2.4/xampp-windows-x64-8.2.4-0-VS16-installer.exe" rel="noopener noreferrer"&gt;XAMPP&lt;/a&gt; to launch WordPress on my Windows laptop. XAMPP is a widely used open-source software package that provides a local development environment for web applications. It includes a combination of software components commonly used in web development, making it easy to set up a local server environment for testing and development purposes.&lt;/p&gt;

&lt;p&gt;After downloading the installer from the provided link, click on it and follow the prompts to launch the application. Click ‘Start’ to start the Apache and MySQL services.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2vy9b7hglie5vtb9sdjw.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2vy9b7hglie5vtb9sdjw.jpeg" alt="Image description" width="665" height="429"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now head over to the &lt;a href="https://wordpress.org/latest.zip" rel="noopener noreferrer"&gt;WordPress&lt;/a&gt; site to download the appropriate zip file. Extract the file's contents to C:\Xampp\htdocs or wherever the htdocs folder is in your local environment. When you navigate to localhost/wordpress in your browser, Bitnami will ask you to follow a series of steps to get WordPress up and running locally. &lt;/p&gt;

&lt;p&gt;Open your command line (I’m using git on VS Code). Navigate to C:\Xampp\htdocs\wordpress\wp-content\themes. Fork and git clone the repository found here: &lt;a href="https://github.com/BDFL669/wp-theme-barebones.git" rel="noopener noreferrer"&gt;https://github.com/BDFL669/wp-theme-barebones.git&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;First &lt;code&gt;cd&lt;/code&gt; into /wp-theme-barebones, navigate to react-src, and npm install the packages. Thereafter, navigate to localhost/wordpress/wp-admin and scroll to the Appearances &amp;gt; Theme pane. You should see the React theme we just git cloned appear in the list. Click to activate and navigate to localhost/wordpress again to see the magic in action. &lt;/p&gt;

&lt;p&gt;Btw, for the contact form on the About page to work, please register your recipient email with &lt;a href="https://postmail.invotes.com/" rel="noopener noreferrer"&gt;PostMail&lt;/a&gt;, an open source email server that sends all client requests directly to your email. Remember to safely save your token as an environment variable and plug it into the About component. &lt;/p&gt;

&lt;p&gt;Suffice it to say, we have now set up WordPress as our CMS platform and React as the frontend. WordPress handles all the data management and content delivery tasks; all you need to do is make the right calls. I’ll publish another article soon to break down the code in case some of us find it challenging. Otherwise, I got most of the code from the Internet, especially Michael Sirano’s three-part series on the same subject – and I am merely his vessel. &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;JavaScript Devs are some of the toughest people to wear shoes&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>javascript</category>
      <category>webdev</category>
      <category>wordpress</category>
      <category>beginners</category>
    </item>
    <item>
      <title>6 Mistakes I Made as a Junior React Dev</title>
      <dc:creator>Nigel Okoth</dc:creator>
      <pubDate>Tue, 20 Jun 2023 17:17:26 +0000</pubDate>
      <link>https://dev.to/anvilicious/6-mistakes-i-made-as-a-junior-react-dev-3lfj</link>
      <guid>https://dev.to/anvilicious/6-mistakes-i-made-as-a-junior-react-dev-3lfj</guid>
      <description>&lt;p&gt;It’s nearly three years since I built my first application, but I am yet to secure a software development gig. Looking back, these are some of the preconditions for my current predicament.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Tutorial Hell (is one hellova drug)&lt;/strong&gt;&lt;br&gt;
For sure. What better way to learn than to watch and copy someone else do it? I have been too scared to venture into the wilderness, so I constantly pursue the divine guidance of YouTube University. The longer the tutorial the more rewarding it feels. I haven’t been bothered to code along for a while, not since I realized I could just git clone the project codebase. While the tutorials offer quick introductions to complex, abstract ideas, they do very little to improve my mastery and confidence in my skill. The happy feelings after finishing a tutorial do not last long anyway because imposter syndrome is always right around the corner. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Not Building Enough Projects From the Get Go (juggling work and code ain't easy)&lt;/strong&gt;&lt;br&gt;
So, like I have said before, I wish I spent the last few years populating my portfolio page. In my defense, I did shift to data science right after learning React to escape the harsh reality of my divorce. Another reason would be my lack of confidence after building the first 3 beginner React projects. As a result, I never got to build something more complex from scratch as I had discovered the plethora of YouTube tutorials by then. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Copy Pasting Code (type answers from the internet instead of copy pasting)&lt;/strong&gt;&lt;br&gt;
I’ll have to get on this high horse for just a second. I have noticed that I tend to forget most of the code I paste. However, typing imbues in me enough muscle memory to at least shorten the idea-to-implementation funnel the next time I am solving a similar challenge. Of course, there is no point in typing every piece of code I’ll ever use for the rest of my career, but I’ll be typing as much as I can – especially when learning concepts I deeply care about (like recursion lol). &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. To Be or not To Be a Writer (one has to Read a lot of code)&lt;/strong&gt;&lt;br&gt;
I’m not a programmer, I’m a serial typist. However, as Lizzie Logan would have it, I’m a writer the same way a potato is a battery. I can’t just wake up and code if I don’t read a lot of and about coding. We’ll never know why I never got to finish Eloquent JavaScript, or the other data science books I picked up. What I do know is the greats are the greats for a reason. Interestingly, 5 books a year is more than enough to improve my programming skills if I dedicate the rest of my time to gaining experiential knowledge. That’s literally a book every 73 days–if you were wondering whether Carl Gustav Jung was right to postulate the will to laziness as stronger than our innate sex drives…&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Tweeting about Coding when I Could be Coding&lt;/strong&gt;&lt;br&gt;
In lieu of coding and reading about programming, I have been interleaving my 9-5 with Twitter. Hours I could have dedicated to optimizing my LinkedIn profile to attract recruiters were spent tending to my FOMO. It is true that I have learned a lot and made meaningful connections (and crushes), but I have nothing to show for it yet. I’ll stick to writing articles about software dev whenever the urge to Tweet overwhelms me, especially if I want to rant about a bug that took me a week to fix. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;6. Open Source&lt;/strong&gt;&lt;br&gt;
If I had known better, I would have started contributing to open-source projects three months after my first encounter with JavaScript. Other than preparing me for my future git workflows, pull requests gimme the confidence to venture out on my own more frequently instead of relying on YouTube tutorials most of the time. If you’re worried about skill level then just use Eddie’s Good First Issue Finder for the most suitable issues. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Bringing it All Together&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Anyway, instead of a conclusion section, lemme finish this article with quotes from Community (the series). It aptly describes my connection to software development. &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Annie:&lt;br&gt;
The Dean had his seventh epiphany today, which has given me an epiphany of my own: the Dean is a genius. He has to be. If he isn't, then I've given almost two weeks of my life to an idiot; that is unacceptable. Therefore, the Dean is a genius, and I will die protecting his vision.&lt;br&gt;
Abed:&lt;br&gt;
Are you by any chance familiar with Stockholm syndrome?&lt;br&gt;
Annie:&lt;br&gt;
Is it something that the Dean created? Because if not, I don't care.&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>react</category>
      <category>javascript</category>
      <category>beginners</category>
      <category>webdev</category>
    </item>
    <item>
      <title>A Complete Beginner’s Guide for Creating a React JavaScript Chat Application from Scratch -- No Coding Experience Required</title>
      <dc:creator>Nigel Okoth</dc:creator>
      <pubDate>Fri, 02 Jun 2023 09:22:50 +0000</pubDate>
      <link>https://dev.to/anvilicious/a-complete-beginners-guide-for-creating-a-react-javascript-chat-application-from-scratch-no-coding-experience-required-1043</link>
      <guid>https://dev.to/anvilicious/a-complete-beginners-guide-for-creating-a-react-javascript-chat-application-from-scratch-no-coding-experience-required-1043</guid>
      <description>&lt;p&gt;I love JavaScript even though I’m not good at it (and might never be good at it?). So, when my friend @ developed &lt;a href="https://tinkr.tech/learn/" rel="noopener noreferrer"&gt;a course on how to create a chat application using JS&lt;/a&gt;, I couldn’t resist the urge to give it a read and implement &lt;a href="https://hopeful-bell-553050.netlify.app/" rel="noopener noreferrer"&gt;my own version of the project&lt;/a&gt;. You can find the course &lt;a href="https://tinkr.tech/learn/" rel="noopener noreferrer"&gt;here&lt;/a&gt; and give it a try too -- you’re also welcome to develop your own courses and publish them on his platform!&lt;/p&gt;

&lt;p&gt;Without further ado, let's create a boilerplate for our chat app before adding the necessary functionality.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: Create a React App
&lt;/h2&gt;

&lt;p&gt;Its connections to Facebook aside, React is a near-perfect library. For the purposes of this project, I will fire up a React app on Replit because it's literally the easiest way for any beginner to get started. Tools like Visual Studio Code and GitHub are also highly recommended. &lt;/p&gt;

&lt;p&gt;If you already have an account, simply log in. If you don't have one, please click &lt;a href="https://replit.com/signup?from=landing" rel="noopener noreferrer"&gt;here&lt;/a&gt; to set one up. Don't worry, they won't charge you a thing for what we're about to do.&lt;/p&gt;

&lt;p&gt;Now that we're all on the same page (haha get it? Same page :)),  add a new repl by clicking the blue 'plus' button on the right. When prompted for a language please type and select:  &lt;/p&gt;

&lt;pre&gt;
Create React App
&lt;/pre&gt;  

&lt;p&gt;Next, click the 'Create Repl' button and presto! You now have your React boilerplate. Press 'Ctrl+Enter' on the keyboard to see the following screen:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw80lndiqzvsb9ci0w8la.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw80lndiqzvsb9ci0w8la.png" alt="Alt Text" width="800" height="411"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: Create the UI Component
&lt;/h2&gt;

&lt;p&gt;Now navigate to your App.js file and add the following code:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3zblb9yaorjidlomq648.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3zblb9yaorjidlomq648.png" alt="React UI Components" width="614" height="476"&gt;&lt;/a&gt;&lt;/p&gt;
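&lt;p&gt;Since the code above lives in a screenshot, here is a rough, framework-free sketch of the component's shape. The h() helper stands in for what JSX compiles to, and the element ids are my assumptions, not the course's exact code:&lt;/p&gt;

```javascript
// Framework-free sketch of the UI component's shape. h() stands in for
// React.createElement / compiled JSX; in the real App.js the component
// would return JSX directly. Ids and placeholders are assumptions.
function h(type, props, ...children) {
  return { type, props: props || {}, children };
}

function ChatInput() {
  // A functional component is just a function returning a tree of elements.
  return h("div", { className: "chat-input" },
    h("input", { id: "user-id", placeholder: "User ID" }),
    h("input", { id: "message", placeholder: "Type a message" })
  );
}
```

&lt;p&gt;The point is only that a functional component is a plain function returning a tree of elements; the screenshot shows the real JSX version.&lt;/p&gt;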

&lt;p&gt;The first two lines let us leverage React's core functionality and style our components using the App.css file. The next block of code defines a functional component that returns the HTML input fields where we will key in the message and user ID while interacting with our chat application. Before moving on to the JavaScript, let's add the required CSS by pasting the following lines:  &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feulqtam6wtyfr4t3bm84.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feulqtam6wtyfr4t3bm84.png" alt="CSS Code" width="800" height="1089"&gt;&lt;/a&gt;&lt;br&gt;
Most of the CSS is pretty self-explanatory. However, I'd like to point out the differences between position fixed, relative, and absolute. An element with position: fixed is removed from the normal document flow, so it should typically be given 'top' or 'bottom' (and 'left' or 'right') values to place it exactly where it needs to be. Think about a nav element, for instance. &lt;/p&gt;

&lt;p&gt;In addition, the relative and absolute values can be confusing. The rule of thumb is that an absolutely positioned element is placed in relation to its nearest positioned ancestor, most commonly a parent with position: relative. &lt;/p&gt;
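&lt;p&gt;As a quick illustration of those three values (the class names here are made up for the example):&lt;/p&gt;

```css
/* Hypothetical selectors, purely to illustrate the three position values. */
.navbar {
  position: fixed;    /* removed from normal flow; pin it with top/bottom */
  top: 0;
  width: 100%;
}
.card {
  position: relative; /* stays in flow; anchors absolutely positioned children */
}
.card .badge {
  position: absolute; /* placed against the nearest positioned ancestor (.card) */
  top: 8px;
  right: 8px;
}
```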

&lt;h2&gt;
  
  
  Step 3: Bring the Chat App to Life with JavaScript
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhhbhlemljgyc9cyzve20.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhhbhlemljgyc9cyzve20.png" alt="fetchGet function" width="800" height="449"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0cq0o3bw1a886e6u0mkc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0cq0o3bw1a886e6u0mkc.png" alt="populate" width="800" height="542"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbe0rv75ml9j8nh3ocu2t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbe0rv75ml9j8nh3ocu2t.png" alt="onLoad" width="800" height="496"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk0di8c5nuoj6dzyova27.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk0di8c5nuoj6dzyova27.png" alt="onKey up" width="800" height="542"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Few5sj56xnnxz6697xm8r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Few5sj56xnnxz6697xm8r.png" alt="get messages" width="800" height="401"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvx3n7cvk6pahlqm4vul4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvx3n7cvk6pahlqm4vul4.png" alt="final function" width="800" height="548"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>javascript</category>
      <category>react</category>
      <category>css</category>
      <category>codenewbie</category>
    </item>
    <item>
      <title>JavaScript Security Best Practices: Protecting Your Applications from Common Vulnerabilities (Plus Code Snippets)</title>
      <dc:creator>Nigel Okoth</dc:creator>
      <pubDate>Fri, 02 Jun 2023 09:14:27 +0000</pubDate>
      <link>https://dev.to/anvilicious/javascript-security-best-practices-protecting-your-applications-from-common-vulnerabilities-plus-code-snippets-56al</link>
      <guid>https://dev.to/anvilicious/javascript-security-best-practices-protecting-your-applications-from-common-vulnerabilities-plus-code-snippets-56al</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnw3rcy7slgwhgqiahe3d.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnw3rcy7slgwhgqiahe3d.jpg" alt="Image description" width="570" height="300"&gt;&lt;/a&gt;&lt;br&gt;
JavaScript is pivotal in client-side development, demanding heightened vigilance to counter prevalent security vulnerabilities. In this article, I will explore optimal practices for JavaScript security, providing you with invaluable insights and techniques to safeguard your applications against potential threats.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Validation and Sanitization of Inputs&lt;/strong&gt;&lt;br&gt;
Validating and sanitizing user inputs constitutes a pivotal step in securing your JavaScript applications. By implementing robust input validation, you can effectively preempt common vulnerabilities like Cross-Site Scripting (XSS) attacks. Employing JavaScript libraries like DOMPurify lets you sanitize user inputs, effectively neutralizing potentially malicious code. Installing the requisite package before utilization is vital to ensure seamless execution. A particularly enlightening resource that helped me write this article can be accessed through this &lt;a href="https://www.youtube.com/watch?v=YjFTidoXOOk" rel="noopener noreferrer"&gt;link&lt;/a&gt;. &lt;/p&gt;
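&lt;p&gt;Validation can be as simple as a strict whitelist check before any input reaches the DOM or your server. Here is a minimal sketch; the pattern is an example, not a rule, and for sanitizing rich HTML, DOMPurify's sanitize() remains the right tool:&lt;/p&gt;

```javascript
// Minimal validation sketch: accept only usernames matching a strict
// whitelist, so markup or query fragments never reach the DOM or the
// database. (The 3-20 chars / letters-digits-underscores rule is an
// example policy, not a recommendation.)
function isValidUsername(input) {
  return /^[A-Za-z0-9_]{3,20}$/.test(input);
}
```

&lt;p&gt;Rejecting bad input early like this complements sanitization; it never replaces it.&lt;/p&gt;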

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz8ioync59xue55hzsvnx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz8ioync59xue55hzsvnx.png" alt="Image description" width="643" height="270"&gt;&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;2. Steering Clear of Eval()&lt;/strong&gt;&lt;br&gt;
Utilizing the eval() function has emerged as a significant security hazard, given its potential to execute arbitrary code, thereby enabling code injection vulnerabilities. Instead, consider adopting alternative approaches such as JSON.parse() for parsing JSON data or leveraging the Function() constructor for executing dynamic code. Here is an example to illustrate my point:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwadxbfnsvopfq7t00zhd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwadxbfnsvopfq7t00zhd.png" alt="Image description" width="800" height="404"&gt;&lt;/a&gt;&lt;br&gt;
The first example demonstrates the usage of JSON.parse() to securely parse JSON data. This method transforms a JSON string into a JavaScript object, enabling convenient access to the data using object properties.&lt;/p&gt;

&lt;p&gt;In the second example, the code utilizes the Function() constructor to execute dynamic code. By passing a JavaScript code string to the constructor, a new function is created, which can be invoked. This alternative approach offers enhanced control and security compared to the usage of eval().&lt;/p&gt;
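&lt;p&gt;In runnable form, the two alternatives look like this (the values are examples):&lt;/p&gt;

```javascript
// Alternative 1: JSON.parse() turns a JSON string into an object
// without executing anything.
const user = JSON.parse('{"name": "Ada", "role": "admin"}');
// user.name is "Ada", user.role is "admin"

// Alternative 2: the Function() constructor compiles a string into a
// function without access to the local scope, unlike eval(). Still only
// feed it strings you control, never user input.
const add = new Function("a", "b", "return a + b;");
// add(2, 3) returns 5
```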

&lt;p&gt;&lt;strong&gt;3. Preventing Cross-Site Scripting (XSS)&lt;/strong&gt;&lt;br&gt;
Cross-Site Scripting (XSS) incidents occur when untrusted data is rendered in browsers without adequate escape measures. To mitigate this risk, employ content security policies (CSP) to discern trusted content sources, blocking unauthorized script execution. Imposing stringent CSP rules significantly diminishes the likelihood of XSS vulnerabilities. Here is how I do this in my HTML files:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqhyw7plw8x3pyys0nwza.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqhyw7plw8x3pyys0nwza.png" alt="Image description" width="800" height="229"&gt;&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;4. Safeguarding Against Cross-Site Request Forgery (CSRF)&lt;/strong&gt;&lt;br&gt;
To avert CSRF attacks, your JavaScript applications must integrate robust CSRF protection mechanisms. This entails generating and incorporating anti-CSRF tokens within your forms, validating them on the server side, and ensuring their correspondence with the user's session. Frameworks like Express.js offer middleware, such as csurf, simplifying the implementation of CSRF protection.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj5ba0qnpaum2zbqdsnml.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj5ba0qnpaum2zbqdsnml.png" alt="Image description" width="624" height="275"&gt;&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;5. Content Security Policy (CSP) Headers&lt;/strong&gt;&lt;br&gt;
Content Security Policy (CSP) headers let you control which types of content can be loaded on your website, and from which sources. By implementing CSP headers, you effectively impede unauthorized script execution, inline script injection, and other prospective security risks.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F94x166iu7ce723w2y5pz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F94x166iu7ce723w2y5pz.png" alt="Image description" width="744" height="348"&gt;&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;6. Frequent Dependency Updates&lt;/strong&gt;&lt;br&gt;
This one is a no-brainer. Failure to use the most recent version of a framework or library exposes your web app to a plethora of risks. Yep, I used ‘plethora’ like some medieval warlord. Anyway, regularly updating your dependencies ensures you benefit from the latest security patches and enhancements.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;7. Enforcing Access Control&lt;/strong&gt;&lt;br&gt;
Effective access control is pivotal in protecting sensitive data and functionalities. Strictly authorize and authenticate users, permitting access solely to authenticated and authorized individuals. You can effectively enforce access control by implementing user roles, permission levels, and robust authentication mechanisms. This process requires a series of articles, so lemme know in the comment section below if you’d like me to delve deeper. Meanwhile, YouTube University is always your friend. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;8. Securing Password Storage&lt;/strong&gt;&lt;br&gt;
When handling user passwords, it is imperative to avoid storing them in plaintext. Instead, adopt a secure password-hashing function like bcrypt to store hashed and salted passwords, ensuring robust security. Feel free to leverage the following example after installing ‘bcrypt’ with npm or yarn:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F15j1gsrsfdhiez9r3j1b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F15j1gsrsfdhiez9r3j1b.png" alt="Image description" width="781" height="338"&gt;&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Bring it all together&lt;/strong&gt;&lt;br&gt;
I hope this article was worth your time. I'm planning to publish a series of posts this year so lemme know in the comment section if there are any specific topics you'd like me to focus on. Until we meet again, adieu! &lt;/p&gt;

</description>
      <category>javascript</category>
      <category>webdev</category>
      <category>beginners</category>
      <category>devops</category>
    </item>
    <item>
      <title>Manipulating the DOM using JavaScript Methods</title>
      <dc:creator>Nigel Okoth</dc:creator>
      <pubDate>Tue, 24 Aug 2021 09:55:38 +0000</pubDate>
      <link>https://dev.to/anvilicious/manipulating-the-dom-using-javascript-methods-492d</link>
      <guid>https://dev.to/anvilicious/manipulating-the-dom-using-javascript-methods-492d</guid>
      <description>&lt;p&gt;What is the best method for manipulating the DOM using JavaScript without exposing your app to new threats or reducing its speed? &lt;/p&gt;

&lt;p&gt;Hint: It is certainly not .innerHTML. &lt;/p&gt;

&lt;p&gt;In this post, I've defined what DOM manipulation is, explained when you shouldn't use .innerHTML and when to use it, and provided alternative methods that won't reduce the quality of your output when working with large text data!&lt;/p&gt;

&lt;p&gt;First, I'd like to assert that I'm a big fan of .innerHTML when manipulating short text data. For instance, my friend @KrisVii (on Twitter) has created this &lt;a href="https://tinkr.tech/learn/" rel="noopener noreferrer"&gt;awesome Chat App course&lt;/a&gt; on Tinkr.tech where anyone can post their course or learn programming from scratch. The JavaScript course helped me create this &lt;a href="https://hopeful-bell-553050.netlify.app/" rel="noopener noreferrer"&gt;demo&lt;/a&gt; that's currently hosted on Netlify.  &lt;/p&gt;

&lt;h2&gt;
  
  
  What is the Document Object Model?
&lt;/h2&gt;

&lt;p&gt;Moving on, Mozilla defines the Document Object Model (DOM) as "a programming interface for web documents. It represents the page so that programs can change the document structure, style, and content. The DOM represents the document as nodes and objects; that way, programming languages can interact with the page."&lt;/p&gt;

&lt;p&gt;For example, if you want to render some text on the DOM when the user clicks a button, you can add an event listener that listens for click events by starting with an HTML boilerplate and adding a div with an ID called "container" like so:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpytez8bo0ql9tqo3h552.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpytez8bo0ql9tqo3h552.png" alt="insert div in html" width="597" height="420"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Manipulating the DOM using .innerHTML
&lt;/h2&gt;

&lt;p&gt;Now it's time for some bitter-sweet JavaScript. First, we add an event listener that can only be triggered after the page has loaded like so:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F222jz53hloo49eo18uhe.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F222jz53hloo49eo18uhe.png" alt="DOM content loaded" width="597" height="240"&gt;&lt;/a&gt; &lt;/p&gt;

&lt;p&gt;Once the DOM content has loaded, the application will execute an arrow function that takes in zero arguments and executes the manipulation methods we're about to insert. Let's start with the .innerHTML method. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu8ta5zuulrby1qx8mayc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu8ta5zuulrby1qx8mayc.png" alt="Illustrating .innerHTML" width="673" height="276"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We'll first assign a variable called "container" that selects the div#container we created in our HTML boilerplate using the .querySelector method. Using .innerHTML, we can welcome a user to their profile once the page has loaded. But there are some caveats to using this strategy. &lt;/p&gt;
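&lt;p&gt;Put together as runnable text, that step looks roughly like this (the element id comes from the boilerplate; the greeting and structure are a sketch of the screenshot, not its exact code):&lt;/p&gt;

```javascript
// Sketch of the .innerHTML step. Assigning innerHTML replaces and
// re-parses everything inside the element -- fine for a short greeting.
function renderWelcome(container) {
  container.innerHTML = "Welcome to your profile";
}

// Browser glue: run once the DOM is ready. The guard keeps the snippet
// inert outside a browser.
if (typeof document !== "undefined") {
  document.addEventListener("DOMContentLoaded", () => {
    renderWelcome(document.querySelector("#container"));
  });
}
```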

&lt;h2&gt;
  
  
  Disadvantages of .innerHTML
&lt;/h2&gt;

&lt;p&gt;The .innerHTML method is slow (especially when working with large text documents) because the browser has to reparse and rebuild the element's entire content every time it is assigned. &lt;/p&gt;

&lt;p&gt;Mozilla defines parsing as "analyzing and converting a program into an internal format that a runtime environment can actually run." Unfortunately, any previously added event listeners will be removed when the div#container is being reparsed. Moreover, hackers can steal session cookies that typically contain private user data using cross-site scripting. So what other options do we have?&lt;/p&gt;

&lt;h2&gt;
  
  
  Alternatives to .innerHTML
&lt;/h2&gt;

&lt;p&gt;Instead of using innerHTML, we are going to rely on the createElement() and append() methods, together with the innerText property, to manipulate the DOM. Before that, I need to explain again that it is perfectly fine to use innerHTML when working with small text documents. When you simply alter or insert text inside HTML p or div tags using innerHTML, it won't really affect the quality of your output. &lt;/p&gt;

&lt;p&gt;In contrast, the use-case I explained in paragraph 3 above should not be executed with innerHTML, for reasons that should now be obvious. First, let's write some code, then I'll explain what's going on in the next paragraph.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6o074wqkb575qmx5dw1h.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6o074wqkb575qmx5dw1h.png" alt="New methods" width="622" height="402"&gt;&lt;/a&gt; &lt;/p&gt;

&lt;p&gt;In the previous .innerHTML example, the client only sees the text "Welcome to your profile" when the page loads, because that approach is only ideal for small amounts of text. With the .createElement() and .append() methods, however, we can optimize the experience so the client also sees their profile photo and maybe a button for switching the theme between white and black. Since this is a tutorial about DOM manipulation methods, I didn't write any code for changing the theme.&lt;/p&gt;
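&lt;p&gt;Here is a sketch of how those three tools combine (the tag choices and image path are my assumptions, not the screenshot's exact code):&lt;/p&gt;

```javascript
// createElement/innerText/append alternative to innerHTML. append() can
// take several nodes at once and does not re-parse the container's
// existing children, unlike an innerHTML assignment.
function renderProfile(doc, container) {
  const heading = doc.createElement("h1");
  heading.innerText = "Welcome to your profile";

  const photo = doc.createElement("img");
  photo.src = "profile.jpg"; // hypothetical path
  photo.alt = "Profile photo";

  container.append(heading, photo);
}

// Browser glue; the guard keeps the snippet inert outside a browser.
if (typeof document !== "undefined") {
  document.addEventListener("DOMContentLoaded", () => {
    renderProfile(document, document.querySelector("#container"));
  });
}
```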

&lt;h2&gt;
  
  
  Bringing it all together:
&lt;/h2&gt;

&lt;p&gt;And we're done! That was simple, wasn't it? If you've spotted a typo or an error you'd like to correct, kindly reach out in the comments section. I'm also looking for suggestions on what topics you'd like me to cover next. &lt;/p&gt;

&lt;p&gt;Until next time, hüvasti sõbrad! &lt;/p&gt;

&lt;p&gt;(Which is the Estonian version of adios amigos). &lt;/p&gt;

</description>
      <category>javascript</category>
      <category>html</category>
    </item>
  </channel>
</rss>
