<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Awadelrahman M. A. Ahmed</title>
    <description>The latest articles on DEV Community by Awadelrahman M. A. Ahmed (@awadelrahman).</description>
    <link>https://dev.to/awadelrahman</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F655381%2F3f374ebe-a135-4ee0-9eb0-4af17e23ae85.png</url>
      <title>DEV Community: Awadelrahman M. A. Ahmed</title>
      <link>https://dev.to/awadelrahman</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/awadelrahman"/>
    <language>en</language>
    <item>
      <title>Chatting with the Cloud: Setting up a ChatGPT Service on AWS</title>
      <dc:creator>Awadelrahman M. A. Ahmed</dc:creator>
      <pubDate>Sat, 11 Feb 2023 17:28:10 +0000</pubDate>
      <link>https://dev.to/awadelrahman/chatting-with-the-cloud-setting-up-a-chatgpt-service-on-aws-3pg0</link>
      <guid>https://dev.to/awadelrahman/chatting-with-the-cloud-setting-up-a-chatgpt-service-on-aws-3pg0</guid>
      <description>&lt;p&gt;Are you interested in setting up a ChatGPT service in the cloud? In this tutorial, I'll walk you through the steps of getting a ChatGPT service up and running on Amazon Web Services (AWS).&lt;/p&gt;

&lt;p&gt;Here's what you'll need before you get started:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;An AWS account&lt;/li&gt;
&lt;li&gt;An OpenAI API key&lt;/li&gt;
&lt;li&gt;An SSH client like PuTTY or Terminal (if you're using a Windows machine)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Step 1: Launch an EC2 Instance
&lt;/h2&gt;

&lt;p&gt;The first step is to launch an EC2 instance, which is essentially a virtual machine in the cloud that you can use to run your ChatGPT service. Here's how:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Log in to the AWS Management Console and navigate to the EC2 dashboard.&lt;/li&gt;
&lt;li&gt;Click the "Launch Instance" button.&lt;/li&gt;
&lt;li&gt;Choose the Amazon Linux 2 AMI.&lt;/li&gt;
&lt;li&gt;Select an instance type that meets the requirements for OpenAI's GPT-3 API. OpenAI recommends using an instance with at least 16 GB of RAM and 2 vCPUs, such as the c5.large instance type.&lt;/li&gt;
&lt;li&gt;Follow the steps to configure your instance and launch it.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Step 2: Connect to Your Instance
&lt;/h2&gt;

&lt;p&gt;Once your instance is up and running, you'll need to connect to it using an SSH client. Here's how:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Open your SSH client.&lt;/li&gt;
&lt;li&gt;Enter the public DNS or IP address of your EC2 instance.&lt;/li&gt;
&lt;li&gt;Log in as the &lt;code&gt;ec2-user&lt;/code&gt; user.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Step 3: Install Dependencies
&lt;/h2&gt;

&lt;p&gt;Once you're connected to your EC2 instance, you'll need to install Git, Docker, and the dependencies required to run OpenAI's GPT-3 API. Here's the code you'll need to run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Install Git
sudo yum install git -y

# Install Docker
sudo amazon-linux-extras install docker
sudo service docker start
sudo usermod -a -G docker ec2-user

# Install Docker Compose
sudo curl -L "https://github.com/docker/compose/releases/download/1.26.0/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose

# Install pip
sudo yum install python3-pip -y
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 4: Clone the OpenAI Repository
&lt;/h2&gt;

&lt;p&gt;Now that you've installed the necessary dependencies, you can clone the OpenAI repository and install the dependencies required to run the API. Here's the code you'll need to run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Clone the OpenAI repository
git clone https://github.com/openai/api
cd api

# Install the dependencies
pip install -r requirements.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 5: Start the API
&lt;/h2&gt;

&lt;p&gt;Finally, you can start the API by running the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker-compose up -d

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will start the API in the background. To test that your API is working, you can make a request to the endpoint &lt;code&gt;http://your-instance-ip:8080/request&lt;/code&gt; and provide your OpenAI API key in the &lt;code&gt;Key&lt;/code&gt;header. You should receive a response from the model!&lt;/p&gt;

&lt;p&gt;And that's it! You now have a fully functional ChatGPT service running on AWS. Start building your conversational applications today!&lt;/p&gt;

</description>
      <category>career</category>
      <category>learning</category>
      <category>motivation</category>
      <category>creativity</category>
    </item>
    <item>
      <title>How I use AWS SageMaker DeepAR to Manage my Child Diabetes! Part [1/3]</title>
      <dc:creator>Awadelrahman M. A. Ahmed</dc:creator>
      <pubDate>Sat, 12 Mar 2022 08:24:40 +0000</pubDate>
      <link>https://dev.to/awadelrahman/how-i-use-aws-sagemaker-deepar-to-manage-my-child-diabetes-part-13-4649</link>
      <guid>https://dev.to/awadelrahman/how-i-use-aws-sagemaker-deepar-to-manage-my-child-diabetes-part-13-4649</guid>
      <description>&lt;h1&gt;
&lt;center&gt;&lt;strong&gt; Introducing the Problem&lt;/strong&gt;&lt;/center&gt; 
&lt;/h1&gt; 




&lt;h3&gt;
  
  
  &lt;em&gt;"I Am Not a Medical Professional” Disclaimer&lt;/em&gt; !
&lt;/h3&gt;

&lt;p&gt;&lt;em&gt;I clearly state here that I am not a doctor or any kind of medical professional! I am writing these posts only to reflect my experience and share my story of how I used technology at a personal level to tackle one of the tedious and critical tasks we have in our daily life. So, this content does not constitute any professional medical advice, diagnosis, or treatment!&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Pre-
&lt;/h2&gt;

&lt;p&gt;I split my write-up into 3 parts which I post successively. In this first post, I intend to describe the problem and provide essential domain knowledge to make it smoother to follow up the following posts.&lt;/p&gt;

&lt;p&gt;The second post will describe the data, data extraction, preparation and transformation. The last (third) post will explain the modelling part and how I used AWS DeepAR algorithm to tackle the discussed problem. &lt;/p&gt;

&lt;p&gt;&lt;em&gt;At this moment, the other two post are not yet published, I will update those lines with the links when published.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  2. What am I sharing?
&lt;/h2&gt;

&lt;p&gt;Being a parent of a child with type 1 diabetes (T1D) is a kind of a hassle, even with being fortunate to acquire smart systems like continuous glucose monitoring system (CGM) and insulin pumps. Those are small computerized devices to mimic the way the human pancreas works by delivering small doses of short acting insulin continuously. However, these devices still need a lot of careful tuning and parameterization to deliver the correct amount of insulin at the right time. The parameters and factors' values are basically calculated and set by diabetes doctors along with the discussion with the patient (parent in our case). &lt;/p&gt;

&lt;p&gt;Periodically (e.g., every 2 months), the diabetes doctor looks through the blood glucose levels registered by the CGM and the insulin amounts delivered by the insulin pump and tries to adjust particular factors, aiming to keep the blood glucose in range. Normally, the values are set following some rules and learning from experience, patterns and trends.&lt;/p&gt;

&lt;p&gt;I believe we are so privileged to have a doctor who regards discussing the technical details. So every check-up appointment, we have a deep discussion akin to brain storming session to decide which parameter to keep and which to adjust. &lt;/p&gt;

&lt;p&gt;Frankly to say, those discussions always tickle my sentiment regarding extracting knowledge and insights from data, as what we basically try to do is to find some patterns in the historical data and try to optimize the pump parameters in order to keep the blood glucose within range! &lt;strong&gt;It looks like a very legit machine learning problem!&lt;/strong&gt; That stimulated my current passion regarding &lt;strong&gt;AWS&lt;/strong&gt; and its machine learning algorithms, so I tried to experiment using AWS to enlighten our parameters-setting discussions! Yes, it might not yet be obvious how my problem is an &lt;em&gt;awesome&lt;/em&gt; use case of AWS, so let me explain the details of the problem to be more acquainted with technical terms. &lt;/p&gt;

&lt;h2&gt;
  
  
  3. Acting like a pancreas
&lt;/h2&gt;

&lt;p&gt;It is ironic that we can not actualize either the wise saying of the scientist and philosopher, Benjamin Franklin (1706-1790) &lt;em&gt;"Eat to live, and not live to eat."&lt;/em&gt; or the sarcastic quote by anonymous &lt;em&gt;"I don't eat to live, I live to eat"&lt;/em&gt;; without possessing a well-functioning pancreas!&lt;/p&gt;

&lt;p&gt;The pancreas is an organ that plays a very essential role in converting the food we eat into fuel for the body's cells. A principal function of the pancreas is to regulate blood glucose. The pancreas secretes a hormone called &lt;strong&gt;insulin&lt;/strong&gt; in response to elevated blood glucose levels. The insulin acts as a &lt;em&gt;key&lt;/em&gt; 🗝️ to unlock the cell allowing the glucose to enter the cell and being used for energy. I tried to visually simplify the idea in the animation below. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpfuhrtu5ztftlyc1v8ks.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpfuhrtu5ztftlyc1v8ks.gif" alt="InsulinKey" width="1083" height="707"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;So, without insulin, the glucose remains in the bloodstream, which can lead to dangerous complications like what is called &lt;em&gt;hyperglycemia&lt;/em&gt;. If hyperglycemia is not treated, it may lead to even more serious problem, a condition called &lt;em&gt;ketoacidosis&lt;/em&gt; (diabetic coma!). Without proper production of insulin, the body can't thrust glucose into cells and the body reacts by breaking down the body fat to use for energy!&lt;/p&gt;

&lt;p&gt;A consequence of this temporary remedy of producing energy by breaking down fats is the production of waste products called &lt;em&gt;ketones&lt;/em&gt;. It's a sad thing to know that our bodies cannot tolerate large amounts of ketones and tries to get rid of them through the urinary system. But it is even sadder to know that if the body cannot release all the ketones and they build up in the bloodstream, the ketoacidosis will be life-threatening and needs immediate treatment.&lt;/p&gt;

&lt;p&gt;The plot below from &lt;a href="https://en.wikipedia.org/wiki/Blood_sugar_level" rel="noopener noreferrer"&gt;[1]&lt;/a&gt; shows how the insulin production goes hand in hand with the glucose levels increase caused by the meals. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq3t12642fz9wi2e9oqhw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq3t12642fz9wi2e9oqhw.png" alt="insulineday" width="640" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You might get many insights from the above plot, but two essential observations to understand for our machine learning problem which also map sharply to two insulin components used by the insulin pump to imitate the pancreas are:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;There are peaks (bursts) of insulin production by the pancreas every time the body consumes food. The insulin pump tries to imitate these peaks by injecting a quick-acting insulin before mealtimes and it is called &lt;strong&gt;Bolus&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;The plot above never crosses 0 of the Y-axis. In other words, the pancreas, continuously, produces some base amount of insulin around the clock. To mimic this, &lt;strong&gt;Basal&lt;/strong&gt; insulin is continuously injected throughout the day as a longer-acting insulin that helps keep glucose levels steady day and night.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This plot from &lt;a href="https://www.keshred.com/insulin/" rel="noopener noreferrer"&gt;[2]&lt;/a&gt; shows the two insulin types Basal and Bolus working together to mimic the pancreas function. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5hy8v77cq1piiwuin0nw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5hy8v77cq1piiwuin0nw.png" alt="basalbolus" width="600" height="353"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  4. "How much insulin to inject" is a factors' game
&lt;/h2&gt;

&lt;p&gt;Now we know that insulin pumps are smart devices that attached to the patient body and tries to mimic the function of the pancreas by delivering both types of insulin, the Basal and the Bolus. &lt;/p&gt;

&lt;p&gt;We tried using different types of insulin pumps and each had its pros and cons for us. Lastly, we ended up using &lt;a href="https://www.omnipod.com/" rel="noopener noreferrer"&gt;Omnipod&lt;/a&gt; system for a couple of years now. Even though Omnipod still lacks the closed-loop control system compared to &lt;a href="https://www.medtronicdiabetes.com" rel="noopener noreferrer"&gt;Medtronic&lt;/a&gt; products, i.e., it does not allow the CGM (glucose sensor) to communicate directly to inject the insulin, we still see its tubeless system is a great advantage for our active kid.&lt;/p&gt;

&lt;h3&gt;
  
  
  4.1 Basal program
&lt;/h3&gt;

&lt;p&gt;The insulin pump PDM allows setting a &lt;em&gt;basal program&lt;/em&gt; (i.e., schedule) to deliver a pre-set insulin amount for each pre-set time period. I thought at the beginning that the basal should have only a single value for the whole day/week, but it turned out the need for insulin changes during the day based on the person's activity level and other conditions. We can also have &lt;em&gt;multiple&lt;/em&gt; basal programs to meet different activity levels and conditions such as weekends or gym-days. The image below shows how a basal program looks like as a graph and as a list that stored in the Omnipod PDM (personal diabetes manager).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyzgorpkmegehdp34qpdm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyzgorpkmegehdp34qpdm.png" alt="basalPDM" width="517" height="269"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  4.2 Bolus settings
&lt;/h3&gt;

&lt;p&gt;When deciding to have a meal, we should also decide how much insulin we take for that a meal. A that moment, we should garb a pen and a piece of paper and calculate a bolus dose that consists of two components: &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A component that is proportional to the &lt;strong&gt;carbohydrate-content&lt;/strong&gt; of the meal with a coefficient of proportionality that is called &lt;strong&gt;Insulin-to-Carbohydrate factor (ICF)&lt;/strong&gt;. This ICF defines how many carbohydrates can be handled by one unit of insulin.&lt;/li&gt;
&lt;li&gt;It is typical that the blood glucose be at an undesired level due to some error propagation. For instance if we incorrectly calculated the carbs for the breakfast, the blood glucose may stay high till the dinner time if we do not &lt;em&gt;correct&lt;/em&gt; as rapidly as possible. Hence a second bolus component is calculated when having a meal or whenever required during the day. This &lt;strong&gt;correction component&lt;/strong&gt; is controlled by a &lt;strong&gt;Correction Factor (CF)&lt;/strong&gt; which defines how much one unit of insulin lowers the blood glucose level, and then we can calculate how much insulin we need to reach our an optimal (preferred) target blood glucose level, e.g. 5.0 mmol/L.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;So, to calculate a bolus dose we use this simple equation:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fljxjdxmbxbtbtisf3arz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fljxjdxmbxbtbtisf3arz.png" alt="boluseq" width="800" height="102"&gt;&lt;/a&gt;&lt;br&gt;
As an example, suppose we pre-decided that these factor to be, ICF=20 and CF=10. And suppose the patient would like to drink a cup of milk which contains 10 gm of carbohydrates. At that moment, we measured the blood glucose and it was 7.0 mmol/L; however, we wish if we had it at 5.0 mmol/L. So to calculate the bolus, we plug those values into the above equation and that will lead to a bolus dose of 0.70 units as below:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq4g2kt6jhobkd838vinr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq4g2kt6jhobkd838vinr.png" alt="boluseq3" width="800" height="47"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now you can imagine doing this calculation every time the patient needs to eat something! Things become more complicated when you know that those ICF and CF usually have different values for different times of the day, e.g. ICF might be 20.0 for the breakfast but 15.0 for the lunch! This why being equipped with an insulin pump that has a &lt;strong&gt;bolus calculator&lt;/strong&gt; within it along with a CGM (continuous glucose monitoring) can save a lot of hassle. &lt;/p&gt;

&lt;p&gt;We can compare the situation in which we do not have smart devices and the situation of having them. In the table below, I summarize the steps/activities that we must go through if we "decide" to have a meal. I describe how we perform each step in both situation.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9ofmhjmois16kv8uhdih.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9ofmhjmois16kv8uhdih.png" alt="table" width="800" height="230"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  5. Motivation to use machine learning?
&lt;/h2&gt;

&lt;p&gt;Yes, so far that looks satisfactory and smart, and the table above gives a fairly positive picture of how the CGM and the insulin pump smartly lift a huge burden off our shoulders. However, to ensure an effective and tight control over blood sugar, we must &lt;em&gt;optimally&lt;/em&gt; design the the basal program and the bolus factors. We have already mentioned that those factors can have a wide range of settings, not only at the factor value dimension but also the time and the activity dimensions. In other words, those factors have different optimal values during the day and based on the activity and many other conditions. I envision that to look like the plot below which this leads to unlimited number of values. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhxbt1e2rz7eaxm67rokg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhxbt1e2rz7eaxm67rokg.png" alt="factorscloud" width="419" height="279"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Normally, the optimal values of the factors are set following some rules and learning from experience. As still there is a substantial portion of the setting procedure done empirically, I  take this as a strong motivation to leverage machine learning to  at least give some insights. My motivation is also supported by the fact that those devices register considerable amount of data to learn from.&lt;/p&gt;

&lt;p&gt;In the next post, I try to formulate the problem in the terminology of data science and machine learning context and discuss the data extraction and transformation.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>machinelearning</category>
      <category>datascience</category>
      <category>diabetes</category>
    </item>
    <item>
      <title>Tutorial: Graph Neural Networks for Social Networks Using PyTorch</title>
      <dc:creator>Awadelrahman M. A. Ahmed</dc:creator>
      <pubDate>Wed, 07 Jul 2021 16:41:10 +0000</pubDate>
      <link>https://dev.to/awadelrahman/tutorial-graph-neural-networks-for-social-networks-using-pytorch-2kf</link>
      <guid>https://dev.to/awadelrahman/tutorial-graph-neural-networks-for-social-networks-using-pytorch-2kf</guid>
      <description>&lt;h2&gt;
  
  
  1. Set your expectations of this tutorial
&lt;/h2&gt;

&lt;p&gt;You can follow this tutorial if you would like to know about &lt;strong&gt;Graph Neural Networks (GNNs)&lt;/strong&gt; through a practical example using PyTorch framework. &lt;br&gt;
I am aiming, at the end of this step-by-step tutorial, that you will be able to:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Gain insights about what graph neural networks (GNNs) are and what type of problems they may solve.&lt;/li&gt;
&lt;li&gt;Know how graph datasets, which are expected by GNNs, look like. We will download and explore a social network dataset collected from GitHub.&lt;/li&gt;
&lt;li&gt;Construct graphs and visualize them using code.&lt;/li&gt;
&lt;li&gt;Construct and train a simple GNN model for node classification  task based on convolutional GNN using &lt;code&gt;torch_geometric&lt;/code&gt;, the geometric deep learning extension library for &lt;code&gt;PyTorch&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;em&gt;Of course you are so welcome to post comments if you have any question or you feel that I deviated from what I promised you here!&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  2. GNN 

&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;≈\approx&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;≈&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 Graph + Convolution Neural Network!
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;A &lt;strong&gt;BIG&lt;/strong&gt; &lt;strong&gt;caveat&lt;/strong&gt; here is to emphasize that I do not mean GNNs are just CNNs that operate on graphs, but what I want to say is that I felt comfortable with GNNs when I linked it to my understanding of CNNs and learnt something about graphs. For sure there are many other variations of GNNs but let us stick to this for those 10 minutes of reading. I hope this works for you as well, note that  I put the 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;≈\approx&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;≈&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 sign to avoid causing some people to cringe.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Graph neural networks, as their name tells, are neural networks that work on graphs. And the graph is a data structure that has two main ingredients: nodes (a.k.a. vertices) which are connected by the second ingredient: edges. You can conceptualize the nodes as the graph entities or objects and the edges are any kind of relation that those nodes may have. Also you should know that both nodes and edges &lt;em&gt;can&lt;/em&gt; have features (a.k.a. attributes), note the &lt;em&gt;can&lt;/em&gt; as having features is not a &lt;em&gt;must&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Mathematicians like to be concise and they represent the graph as the set  
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;G=(V,E)\mathcal{G = (V, E)} &lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathcal"&gt;G&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mopen"&gt;(&lt;/span&gt;&lt;span class="mord mathcal"&gt;V&lt;/span&gt;&lt;span class="mpunct"&gt;,&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mord mathcal"&gt;E&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 where 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;G,V,E\mathcal{G, V, E} &lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathcal"&gt;G&lt;/span&gt;&lt;span class="mpunct"&gt;,&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mord mathcal"&gt;V&lt;/span&gt;&lt;span class="mpunct"&gt;,&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mord mathcal"&gt;E&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 are the graph, vertices and edges sets respectively. But for our tutorial, let us make some visually appealing illustrations. The figure below  represents a graph that consists of 4 nodes, represented by those blue circles with indices (1,2,3 and 4) and connected by &lt;em&gt;directed&lt;/em&gt; edges represented by those green arrows. This graph is directed because the edges may represent asymmetric relationship, for instance if the nodes represent social network users and node 1 follows node 2 it does not mean that node 2 follows node 1. The contrary is the &lt;em&gt;undirected&lt;/em&gt; graph which has bidirectional edges, then the relation could be friendship for example. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7yxbpebsz6oqjf7fv3lo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7yxbpebsz6oqjf7fv3lo.png" alt="graph0"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Since you are now reading this post about GNNs, I can confidently assume that you already have knowledge about CNNs. Yes, convolutional neural networks, the deep neural network class, that is most commonly applied to image analysis and computer vision applications. &lt;/p&gt;

&lt;p&gt;Most of us, if not all, already know that images consist of pixels and the CNN leverages the convolution operation to calculate latent (hidden) features for different pixels based on their &lt;em&gt;surrounding&lt;/em&gt; pixel values. It does this by sliding a kernel (a.k.a. filter) over the input image and calculating the dot product between that small filter and the overlapping image area. This dot product leads to aggregate the neighboring pixel values to one representative scaler. Now let us twist our conceptualization of images a little bit and think of images as a graphs. You can think of the pixels of the image as the nodes of the graph and the edges are present only between the adjacent pixels. You might have your own way to represent the image as a graph, what matters is that you just have the image-to-graph concept in your mind. If you get this then it is easier for you to think of GNNs as a generic version of CNNs that can operate on graphs.&lt;/p&gt;

&lt;p&gt;The benefit of using GNNs is the provision of a generalized form that enables us to exploit non-Euclidean space data. In contrast to the pixels representation of images, the graph can be visualized as a collection of nodes and edges without having any order. So if we can learn node features in the latent space, we can do many interesting things. Those things are under three main categories:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;em&gt;Graph classification&lt;/em&gt;: where we can answer questions like: can we predict chemical compounds activity against lung cancer? Remember that chemical compounds can be represented as graphs with compound atoms as its nodes and chemical bonds are its edges.&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Edge classification&lt;/em&gt;: as an example we can predict the congestion in a particular street, if we are given the city map. Then our graph has the streets as its edges and their intersections as its nodes.&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Node classification&lt;/em&gt;: our tutorial problem falls under this category, as we are given a graph of a social network where users represent its nodes and the relationships between them are its edges and our task is to predict each user group (i.e. node class).&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;
  
  
  3. Adjacency matrix? Degree matrix? &lt;em&gt;an example&lt;/em&gt;
&lt;/h2&gt;

&lt;p&gt;Let us take a simple type of graphs, the &lt;em&gt;undirected&lt;/em&gt; graph, this graph type is simple because the relationship between nodes is symmetric. Let us have an example of undirected graphs which has some numerical features attached to its nodes as well as its edges. Those features can represent any attributes that describe a node (blue boxes) or an edge (green boxes). Features are normally vectors, but we represent them by scalers here for simplicity.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw4admdyj8cvgr3191b5w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw4admdyj8cvgr3191b5w.png" alt="graph1"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We can express the above graph using two matrices which can capture its structure. The first is called the adjacency matrix and mostly denoted by &lt;strong&gt;A&lt;/strong&gt; which is a square matrix used to represent finite graphs like this one. The elements of &lt;strong&gt;A&lt;/strong&gt; indicate whether pairs of nodes are adjacent (i.e. connected by edges) or not in the graph. Those elements can be weighted (e.g. by edge features) as in our case; or can be unweighted (i.e. zeros &lt;br&gt;
for disconnected nodes and ones for the adjacent nodes). For our graph we come up with a symmetric matrix with zero diagonal as shown below. The other matrix is the features matrix &lt;strong&gt;H&lt;/strong&gt; (a.k.a. graph signal &lt;strong&gt;X&lt;/strong&gt;) which holds the nodes' features of the graph.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1rp7gummqde0jbxnseal.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1rp7gummqde0jbxnseal.png" alt="HandA"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;An interesting operation that we may like to do is to calculate the dot product between those two matrices. This product, if we closely look at it, is generating a new feature vector that captures the feature of the surrounding nodes for each node based on how those surrounding nodes are connected to it. You can take a few seconds to relate the numbers on the calculation to the graph.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4bb0obvymmv1w1k2i8k1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4bb0obvymmv1w1k2i8k1.png" alt="eq1"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fntbtrlzhle5fwetir5rz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fntbtrlzhle5fwetir5rz.png" alt="hdash1"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjtbjvngengfh8s02mcq1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjtbjvngengfh8s02mcq1.png" alt="hdash2"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now, if we take a critical look at the result we can observe two problems. The first is that when calculating the entry of the features matrix &lt;strong&gt;H'&lt;/strong&gt; for any of the nodes we &lt;em&gt;exclude&lt;/em&gt; that particular node features from the calculation as shown by red zeros. Which seems to be absurd! in the sense that we are ignoring our &lt;em&gt;central&lt;/em&gt; node! To solve this we can add self-connections to every node on the graph and this can be achieved by adding the identity matrix &lt;strong&gt;
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;I{I}&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;I&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
&lt;/strong&gt; to the graph adjacency matrix, which will alter its zero-valued diagonal to ones. So, now we can define the modified adjacency matrix &lt;strong&gt;
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;A~\tilde{A}&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord accent"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;A&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="accent-body"&gt;&lt;span class="mord"&gt;~&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
&lt;/strong&gt; as follows:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1kmabqemuvw6asoou45d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1kmabqemuvw6asoou45d.png" alt="abar"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The second problem is the risk that may result from &lt;em&gt;aggregating&lt;/em&gt; the features as we might end up with extremely high values which can lead to computation instability. We can see already signs of that in our simple example on the latent feature matrix &lt;strong&gt;H'&lt;/strong&gt; which has entries at least 10 times more than the entries in the original &lt;strong&gt;H&lt;/strong&gt;. The solution of this problem is to &lt;em&gt;normalize&lt;/em&gt; the latent features by dividing by the graph &lt;em&gt;degree&lt;/em&gt; matrix &lt;strong&gt;D&lt;/strong&gt; (specifically we multiply by its inverse). The degree matrix is a diagonal matrix that contains information about the number of edges attached to each vertex. Our graph degree matrix looks like this: &lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3ys8qq5snrztj0hrw1zw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3ys8qq5snrztj0hrw1zw.png" alt="degree"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Then the latent space feature representation can be calculated by the following, note that you may see different ways to solve this problem in the literature, but their common goal is to achieve some level of normalization by exploiting the graph structure:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3p9fzpzt7hd0xxm21p19.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3p9fzpzt7hd0xxm21p19.png" alt="eq4"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  4. Let us add some intelligence (learning)
&lt;/h2&gt;

&lt;p&gt;So far, we discussed how we can calculate latent features of a graph data structure. But if we want to accomplish a particular task we can &lt;em&gt;guide&lt;/em&gt; this calculation toward our goal. Particularly, we can leverage our last equation to be multiplied by a &lt;em&gt;learnable&lt;/em&gt; weight matrix &lt;strong&gt;W&lt;/strong&gt; and apply a nonlinear function to the product as in the next equation. We can learn the optimal &lt;strong&gt;W&lt;/strong&gt; based on the task at hand. And here is where we can start doing interesting things. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frha625rk7xhh16jmu1cv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frha625rk7xhh16jmu1cv.png" alt="gnneq1"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Implementation wise, &lt;strong&gt;W&lt;/strong&gt; can be a linear transformation (e.g. a linear layer) of the product and 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;ϕ\phi&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;ϕ&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 is a non-linear activation function. I do not know if now you got the same intuition of mine (which is true) that we can stack as many functions of those as we like and learn the weight matrices &lt;strong&gt;W&lt;/strong&gt; using backpropagation. Stacking those functions together promote this equation to be a deep learning model.&lt;/p&gt;
&lt;h2&gt;
  
  
  5. Time to get hands dirty!
&lt;/h2&gt;

&lt;p&gt;Even though the above functions look natural and manageable to implement using many machine learning frameworks,  &lt;code&gt;torch_geometric&lt;/code&gt; (&lt;a href="https://pytorch-geometric.readthedocs.io/en/latest/index.html" rel="noopener noreferrer"&gt;PyTorch Geometric&lt;/a&gt;), a geometric deep learning extension library for &lt;code&gt;PyTorch&lt;/code&gt;, provides many &lt;em&gt;variations&lt;/em&gt; of deep learning on graphs and other irregular structures.&lt;/p&gt;

&lt;p&gt;Of course, as this field is evolving, you will find many interesting original GNN models and many "bells and whistles" for better suiting your problem. But for the sake of this tutorial, we will keep it simple and use only one flavor of GNNs which is the graph convolutional operator class &lt;code&gt;GCNConv&lt;/code&gt; which is implemented from the paper &lt;a href="https://arxiv.org/abs/1609.02907" rel="noopener noreferrer"&gt;“Semi-supervised Classification with Graph Convolutional Networks”&lt;/a&gt;, if you would like to have a look at how it was developed.&lt;/p&gt;

&lt;p&gt;The particular problem we will be working on is a typical node classification problem. We have a large social network of GitHub developers named &lt;a href="https://snap.stanford.edu/data/github-social.html" rel="noopener noreferrer"&gt;musae-github&lt;/a&gt; which was collected from the public API. Nodes represent developers (platform users) who have starred at least 10 repositories and edges are &lt;em&gt;mutual&lt;/em&gt; follower relationships between them (note that word mutual indicates &lt;em&gt;undirected&lt;/em&gt; relationship). The node features are extracted based on the location, repositories starred, employer and e-mail address. Our task is to predict whether the GitHub user is a web or a machine learning developer. This target feature was derived from the job title of each user. The dataset referenced paper is &lt;a href="https://arxiv.org/abs/1909.13021" rel="noopener noreferrer"&gt;"Multi-scale Attributed Node Embedding"&lt;/a&gt;.   &lt;/p&gt;

&lt;p&gt;Working with data, a golden rule is to learn as we go about the dataset. I am pasting here a dataset statistics table from the same source, so we can have a preliminary idea about what we are dealing with. This statistics tell us that our data set has those many nodes and those many edges. It also tells us that the edges has no features and no directions. Also, the data has no temporal property. All this information can give us a feeling that our dataset is a right candidate for this tutorial in term of simplicity. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8eiarf89ehm31tj71m7k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8eiarf89ehm31tj71m7k.png" alt="datset"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After downloading the dataset, we can see 3 important files:&lt;br&gt;
&lt;code&gt;musae_git_edges.csv&lt;/code&gt; which contains the edges' indices.&lt;br&gt;
&lt;code&gt;musae_git_features.json&lt;/code&gt; which contains the nodes' features.&lt;br&gt;
&lt;code&gt;musae_git_target.csv&lt;/code&gt;  which contains the targets, i.e. node labels.&lt;/p&gt;
&lt;h3&gt;
  
  
  5.1 Required packages
&lt;/h3&gt;

&lt;p&gt;To begin with, we must download and import the required packages. To download &lt;code&gt;torch_geometric&lt;/code&gt; please follow strictly the &lt;a href="https://pytorch-geometric.readthedocs.io/en/latest/notes/installation.html#" rel="noopener noreferrer"&gt;installation guide here&lt;/a&gt;, there are different options if you want to install using Pip Wheels, please check your syntax to download the compatible version based on your machine and software.&lt;br&gt;
We will also need &lt;code&gt;NetworkX&lt;/code&gt; Python package, we will use it to visualize the graph, &lt;a href="https://networkx.org/documentation/stable/install.html" rel="noopener noreferrer"&gt;I installed with pip&lt;/a&gt;. From &lt;code&gt;torch_geometric&lt;/code&gt; we need to import &lt;code&gt;AddTrainValTestMask&lt;/code&gt;, this will help us to segregate the training and test sets later. We also need &lt;code&gt;json&lt;/code&gt;,&lt;code&gt;pandas&lt;/code&gt; and &lt;code&gt;numpy&lt;/code&gt; and some other packages.&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;

&lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="n"&gt;matplotlib&lt;/span&gt; &lt;span class="n"&gt;inline&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;collections&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pandas&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;matplotlib&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;matplotlib.pyplot&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;torch.nn&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;torch.nn.functional&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;torch_geometric.data&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Data&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;torch_geometric.transforms&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AddTrainValTestMask&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;masking&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;torch_geometric.utils.convert&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;to_networkx&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;torch_geometric.nn&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;GCNConv&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;networkx&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;nx&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
  
  
  5.2 Dataset exploration
&lt;/h3&gt;

&lt;p&gt;I assume the dataset files are in a subfolder named &lt;code&gt;data&lt;/code&gt;. We read them and plot the top 5 rows and the last 5 rows from the labels file. Even though we see 4 columns but only 2 columns concern us here: &lt;code&gt;id&lt;/code&gt; of the node (i.e. user) and &lt;code&gt;ml_target&lt;/code&gt; which is 1 if the user is a machine learning community user and 0 otherwise. Now, we are sure that our task is a binary classification problem, as we only have 2 classes.&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data/musae_git_features.json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;json_data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;data_raw&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json_data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;edges&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read_csv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data/musae_git_edges.csv&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;target_df&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read_csv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data/musae_git_target.csv&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;5 top nodes labels&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;target_df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;head&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;to_markdown&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;5 last nodes&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;target_df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;tail&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;to_markdown&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Febkho4uzy0a30jzahmbp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Febkho4uzy0a30jzahmbp.png" alt="headandtail"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Another important property we need to look at the class balance, this is because severe class imbalances may cause the classifier to simply guessing the majority class and not making any evaluation on the underrepresented class. By plotting the histogram (frequency distribution), we see some imbalance as the machine learning (label=1) are fewer than the other classes. This leads to some issues, but let us say, for now, it is not a severe problem.&lt;/p&gt;

&lt;p&gt;We may also see how nodes are different based on the number of features. The second histogram tells that most of the users have between 15 and about 23 features, and very few of them have more than 30 and less than 5 features. The third histogram shows the most common features accross users, we can see different peaks on the distribution.&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;

&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;hist&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;target_df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ml_target&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;bins&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;title&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Classes distribution&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;hist&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;feat_counts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;bins&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;title&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Number of features per graph distribution&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;hist&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;feats&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;bins&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;title&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Features distribution&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo74ml5tspubldlcttkkb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo74ml5tspubldlcttkkb.png" alt="hists"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  5.3 Features encoding
&lt;/h3&gt;

&lt;p&gt;The nodes features tell us which feature is attached to each node. We can one-hot encode those features by writing our function &lt;code&gt;encode_data&lt;/code&gt;. Our plan is to use this function to encode a light subset of the graph (e.g. only 60 nodes) for the purpose of visualization. Here is the function.&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;encode_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;light&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;light&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;nodes_included&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;light&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;nodes_included&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data_raw&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;data_encoded&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{}&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nodes_included&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;&lt;span class="c1"&gt;# 
&lt;/span&gt;        &lt;span class="n"&gt;one_hot_feat&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;array&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;feats&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="n"&gt;this_feat&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;data_raw&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
        &lt;span class="n"&gt;one_hot_feat&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;this_feat&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="n"&gt;data_encoded&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;one_hot_feat&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;light&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;sparse_feat_matrix&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;zeros&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;feats&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nodes_included&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;temp&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;array&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data_encoded&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;)]).&lt;/span&gt;&lt;span class="nf"&gt;reshape&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;sparse_feat_matrix&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;concatenate&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;sparse_feat_matrix&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;temp&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;&lt;span class="n"&gt;axis&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;sparse_feat_matrix&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;sparse_feat_matrix&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:,:]&lt;/span&gt;
        &lt;span class="nf"&gt;return&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data_encoded&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;sparse_feat_matrix&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;light&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;return&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data_encoded&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;We can plot the first 250 features columns (the total is 4005) of the code for the first 60 users. Below, we can see how sparse is the matrix that we construct for nodes features.&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;

&lt;span class="n"&gt;data_encoded_vis&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;sparse_feat_matrix_vis&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;encode_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;light&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;figure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;figsize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;25&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;25&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;imshow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sparse_feat_matrix_vis&lt;/span&gt;&lt;span class="p"&gt;[:,:&lt;/span&gt;&lt;span class="mi"&gt;250&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="n"&gt;cmap&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Greys&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4auwrxa91qjlb4f37e0o.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4auwrxa91qjlb4f37e0o.png" alt="sparse"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  5.4 Graph construction and visualization
&lt;/h3&gt;

&lt;p&gt;To construct our graph, we will use &lt;code&gt;torch_geometric.data.Data&lt;/code&gt; which is a plain old python object to model a single graph with various (optional) attributes. We will construct our graph object using this class and passing the following attributes, noting that all arguments are torch tensors. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;x&lt;/code&gt;: will be assigned to the encoded node features, its shape is &lt;code&gt;[number_of_nodes, number_of_features]&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;y&lt;/code&gt;: will be assigned to the node labels, its shape is &lt;code&gt;[number_of_nodes]&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;edge_index&lt;/code&gt;: to represent an &lt;em&gt;undirected&lt;/em&gt; graph, we need to extend the original edge indices in a way that we can have two separate &lt;em&gt;directed&lt;/em&gt; edges connecting the same two nodes but pointing opposite to each other. For example, we need to have 2 edges between node 100 and node 200, one edge points from 100 to 200 and the other points from 200 to 100. This is a way to represent the undirected graph if we are given the edge indecies. The tensor shape will be &lt;code&gt;[2,2*number_of_original_edges]&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It deserve mentioning here that this &lt;code&gt;Data&lt;/code&gt; class is very abstract in a sense that you can add any attribute that you think it describes your graph. For instance we can add a metadata attribute &lt;code&gt;g["meta_data"]="bla bla bla"&lt;/code&gt; which make it flexible to encapsulate any information you would like. Now, we will build &lt;code&gt;construct_graph&lt;/code&gt; function which does the following:&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;construct_graph&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data_encoded&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;light&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;node_features_list&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data_encoded&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;values&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="n"&gt;node_features&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;tensor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node_features_list&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;node_labels&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;tensor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;target_df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ml_target&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;edges_list&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;edges&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;tolist&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;edge_index01&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;tensor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;edges_list&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dtype&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;long&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;
    &lt;span class="n"&gt;edge_index02&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;zeros&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;edge_index01&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dtype&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;long&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="c1"&gt;#.T
&lt;/span&gt;    &lt;span class="n"&gt;edge_index02&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,:]&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;edge_index01&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,:]&lt;/span&gt;
    &lt;span class="n"&gt;edge_index02&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,:]&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;edge_index01&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,:]&lt;/span&gt;
    &lt;span class="n"&gt;edge_index0&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cat&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;edge_index01&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;edge_index02&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;&lt;span class="n"&gt;axis&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;g&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;node_features&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;node_labels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;edge_index&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;edge_index0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;g_light&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;node_features&lt;/span&gt;&lt;span class="p"&gt;[:,&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
                     &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;node_labels&lt;/span&gt;   &lt;span class="p"&gt;,&lt;/span&gt;
                     &lt;span class="n"&gt;edge_index&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;edge_index0&lt;/span&gt;&lt;span class="p"&gt;[:,:&lt;/span&gt;&lt;span class="mi"&gt;55&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;light&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;return&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;g_light&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;return&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;For drawing a graph, we construct our &lt;code&gt;draw_graph&lt;/code&gt; function. We need to convert our homogeneous graph to a NetworkX graph then pot using &lt;code&gt;NetworkX.draw&lt;/code&gt;.&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;draw_graph&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data0&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;#node_labels=data0.y
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;data0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;num_nodes&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;This is a big graph, can not plot...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt;

    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;data_nx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;to_networkx&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;node_colors&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;data0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data_nx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;nodes&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
        &lt;span class="n"&gt;pos&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;spring_layout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data_nx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;scale&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;figure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;figsize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="n"&gt;nx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;draw&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data_nx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pos&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="n"&gt;cmap&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_cmap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Set1&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="n"&gt;node_color&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;node_colors&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;node_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;600&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;connectionstyle&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;angle3&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;width&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;with_labels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;edge_color&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;k&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;arrowstyle&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;-&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;Now, we can draw a sub graph by calling &lt;code&gt;construct_graph&lt;/code&gt;  with &lt;code&gt;light=True&lt;/code&gt;. Then we can pass it to &lt;code&gt;draw_graph&lt;/code&gt;, to show the following graph. You can see how the nodes are connected by edges and labeled by color.&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;

&lt;span class="n"&gt;g_sample&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;construct_graph&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data_encoded&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;data_encoded_vis&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;light&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;draw_graph&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;g_sample&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk3r2tdjyexhogymoussd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk3r2tdjyexhogymoussd.png" alt="graphlish"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  5.5 Construct and train our GNN model
&lt;/h3&gt;

&lt;p&gt;We start by encoding the whole data by calling &lt;code&gt;encode_data&lt;/code&gt;  with &lt;code&gt;light=False&lt;/code&gt; and construct the full graph by calling &lt;code&gt;construct_graph&lt;/code&gt;  with &lt;code&gt;light=False&lt;/code&gt;. We will not try to visualize this big graph as I am assuming that you are using your local computer with limited resources.&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;

&lt;span class="n"&gt;data_encoded&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;encode_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;light&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;construct_graph&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data_encoded&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;data_encoded&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;light&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;The usual immediate step after data preparation in the machine learning pipelines is to perform data segregation and split subsets of the dataset to train the model and further validate how it performs against new data and save a test segment to report the overall performance.&lt;/p&gt;

&lt;p&gt;The splitting is not straight forward in our case as we have a single giant graph that should be taken all at once (there are some other ways to deal with graph segments, but let us assume this is the case for now). In order to tell the training phase which nodes that should be included during training and tell the inference phase which are the test data we can use &lt;em&gt;masks&lt;/em&gt; which are binary vectors that indicate (using 0 and 1) which nodes belong to each particular mask. &lt;code&gt;torch_geometric.transforms.AddTrainValTestMask&lt;/code&gt; class can take our graph and let us set how we want our masks to formed and it will add a node-level split via &lt;code&gt;train_mask&lt;/code&gt;, &lt;code&gt;val_mask&lt;/code&gt; and &lt;code&gt;test_mask&lt;/code&gt; attributes. In our training, we use 30% as a validation set and 60% as test set while we keep only 10% for the training. You may have different split ratios, but in this way we may have a more realistic performance and we will not overfit easily (I know you might disagree with me on this point)! We can also print the graph information and the number of nodes each set (mask). The numbers between the brackets is the shape of each attribute tensor.&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;

&lt;span class="n"&gt;msk&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;masking&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;split&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;train_rest&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_splits&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_val&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_test&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.6&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;msk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;training samples&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;train_mask&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;item&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;validation samples&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;val_mask&lt;/span&gt; &lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;item&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;test samples&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;test_mask&lt;/span&gt; &lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;item&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fatav9s2229mama09f4xa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fatav9s2229mama09f4xa.png" alt="split"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;At this point, I believe you agree with me that we are ready to construct our GNN model class. As we agreed we will use &lt;code&gt;torch_geometric.nn.GCNConv&lt;/code&gt; class, however there are many other layers you can try on &lt;a href="https://pytorch-geometric.readthedocs.io/en/latest/modules/nn.html#convolutional-layers" rel="noopener noreferrer"&gt;PyTorch Geometric documentation&lt;/a&gt;.&lt;br&gt;
We will stack two &lt;code&gt;GCNConv&lt;/code&gt; layers, the first has input features equal to the number of features in our graph and some arbitrary number of output features &lt;code&gt;f&lt;/code&gt;. Then we apply a &lt;code&gt;relu&lt;/code&gt; activation function and deliver the latent features to the second layer which has output nodes equal to the number our classes (i.e. 2). In the &lt;code&gt;forward&lt;/code&gt; function, &lt;code&gt;GCNConv&lt;/code&gt; can accept many arguments &lt;code&gt;x&lt;/code&gt; as the nodes features, &lt;code&gt;edge_index&lt;/code&gt; and &lt;code&gt;edge_weight&lt;/code&gt;, in our case we only use the first two arguments.&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;SocialGNN&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;num_of_feat&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nf"&gt;super&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SocialGNN&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conv1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;GCNConv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_of_feat&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conv2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;GCNConv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;forward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;float&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;edge_index&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;  &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;edge_index&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;conv1&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;edge_index&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;edge_index&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;relu&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;conv2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;edge_index&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;As our model will predict the class of all nodes in the graph, we, however, want to calculate the loss and accuracy for specific set based on the phase we are in. For instance, during training we want to calculate the training loss and accuracy only based on the training set and here where we should use our masks. We will define functions &lt;code&gt;masked_loss&lt;/code&gt; and &lt;code&gt;masked_accuracy&lt;/code&gt; for which we pass the respective masks and it returns the corresponding loss and accuracy. The idea is to calculate the loss and accuracy for all nodes and multiply it by the mask to zero out the not required nodes.&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;masked_loss&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;predictions&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;labels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;mask&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;mask&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;mask&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;float&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;mask&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;mask&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mask&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;criterion&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;predictions&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;labels&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;mask&lt;/span&gt;
    &lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;    

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;masked_accuracy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;predictions&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;labels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;mask&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;mask&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;mask&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;float&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;mask&lt;/span&gt;&lt;span class="o"&gt;/=&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mask&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;accuracy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;argmax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;predictions&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;axis&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="n"&gt;labels&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;long&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;accuracy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;mask&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;accuracy&lt;/span&gt;
    &lt;span class="n"&gt;accuracy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;accuracy&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;accuracy&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;    


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;Now, we can define our training function in which we will use &lt;code&gt;torch.optim.Adam&lt;/code&gt; optimizer. We will run the training for some number of epochs and we keep track for the best validation accuracy. We also plot the losses and accuracies across epochs for the training. &lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;train_social&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;net&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;epochs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;lr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.01&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;optimizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;optim&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Adam&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;net&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;lr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;lr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# 00001
&lt;/span&gt;    &lt;span class="n"&gt;best_accuracy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;

    &lt;span class="n"&gt;train_losses&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;train_accuracies&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;

    &lt;span class="n"&gt;val_losses&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;val_accuracies&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;

    &lt;span class="n"&gt;test_losses&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;test_accuracies&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;ep&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;epochs&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;zero_grad&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;net&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;masked_loss&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;predictions&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                         &lt;span class="n"&gt;labels&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                         &lt;span class="n"&gt;mask&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;train_mask&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;backward&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;step&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;train_losses&lt;/span&gt;&lt;span class="o"&gt;+=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;train_accuracy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;masked_accuracy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;predictions&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                                       &lt;span class="n"&gt;labels&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
                                       &lt;span class="n"&gt;mask&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;train_mask&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;train_accuracies&lt;/span&gt;&lt;span class="o"&gt;+=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;train_accuracy&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

        &lt;span class="n"&gt;val_loss&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;masked_loss&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;predictions&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                             &lt;span class="n"&gt;labels&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
                             &lt;span class="n"&gt;mask&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;val_mask&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;val_losses&lt;/span&gt;&lt;span class="o"&gt;+=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;val_loss&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

        &lt;span class="n"&gt;val_accuracy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;masked_accuracy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;predictions&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                                     &lt;span class="n"&gt;labels&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
                                     &lt;span class="n"&gt;mask&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;val_mask&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;val_accuracies&lt;/span&gt;&lt;span class="o"&gt;+=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;val_accuracy&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

        &lt;span class="n"&gt;test_accuracy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;masked_accuracy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;predictions&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                                      &lt;span class="n"&gt;labels&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
                                      &lt;span class="n"&gt;mask&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;test_mask&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;test_accuracies&lt;/span&gt;&lt;span class="o"&gt;+=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;test_accuracy&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;val_accuracy&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;best_accuracy&lt;/span&gt; &lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Epoch {}/{}, Train_Loss: {:.4f}, Train_Accuracy: {:.4f}, Val_Accuracy: {:.4f}, Test_Accuracy: {:.4f}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ep&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;epochs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;item&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;train_accuracy&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;val_accuracy&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="n"&gt;test_accuracy&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="n"&gt;best_accuracy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;val_accuracy&lt;/span&gt;

    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;train_losses&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;val_losses&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;test_losses&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;train_accuracies&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;val_accuracies&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;test_accuracies&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
    &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;At this point we constructed all the required functions and ready to instantiate our model and train it. We will construct the model with 16 filters and will use &lt;code&gt;nn.CrossEntropyLoss&lt;/code&gt; as our loss criterion. Below, we can see that our simple model reached a very decent accuracy on the test set, more than 87%. We can also see the learning curves (losses) and accuracies development through epochs on the top and bottom plots respectively.&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;

&lt;span class="n"&gt;num_of_feat&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;num_node_features&lt;/span&gt;
&lt;span class="n"&gt;net&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;SocialGNN&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_of_feat&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;num_of_feat&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;criterion&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;CrossEntropyLoss&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nf"&gt;train_social&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;net&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;epochs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;lr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzoqh07c6rpbw0xg3hdsz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzoqh07c6rpbw0xg3hdsz.png" alt="epochs"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffl8bpi4un15dx2qfkitl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffl8bpi4un15dx2qfkitl.png" alt="plots"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  6. Conclusion and next steps
&lt;/h3&gt;

&lt;p&gt;We have gone through this step-by-step tutorial covering fundamental concepts about graph neural networks and developed our simple GNN model based on convolutional GNN on &lt;code&gt;PyTorch&lt;/code&gt; framework using &lt;code&gt;torch_geometric&lt;/code&gt; extension. You might want to access the github repository of this code. &lt;/p&gt;
&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev.to%2Fassets%2Fgithub-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/Awadelrahman" rel="noopener noreferrer"&gt;
        Awadelrahman
      &lt;/a&gt; / &lt;a href="https://github.com/Awadelrahman/GNN4SocialNWTutorial" rel="noopener noreferrer"&gt;
        GNN4SocialNWTutorial
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      
    &lt;/h3&gt;
  &lt;/div&gt;
&lt;/div&gt;


&lt;h4&gt;
  
  
  6.1 Further steps that might be interesting to you:
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Solving other categories of problems that GNNs can solve, i.e. graph classification and edges classification.&lt;/li&gt;
&lt;li&gt;You might want to check other GNN layers that &lt;code&gt;torch_geometric&lt;/code&gt; offers.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  7. Some resources that helped me to prepare this tutorial:
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;a href="https://pytorch-geometric.readthedocs.io/en/latest/index.html#" rel="noopener noreferrer"&gt;PyTorch Geometric Documentation&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://arxiv.org/abs/1609.02907" rel="noopener noreferrer"&gt;“Semi-supervised Classification with Graph Convolutional Networks” Paper&lt;/a&gt; by Kipf, Thomas N., and Max Welling.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.youtube.com/watch?v=8owQBFAHw7E" rel="noopener noreferrer"&gt;Intro to graph neural networks presentation&lt;/a&gt; by Petar Veličković on YouTube.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Thank you for following this tutorial, I hope it was of help for you🙂.&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>deeplearning</category>
      <category>graphs</category>
      <category>pytorch</category>
    </item>
  </channel>
</rss>
