<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Tasfin Mahmud</title>
    <description>The latest articles on DEV Community by Tasfin Mahmud (@tasfinmahmud).</description>
    <link>https://dev.to/tasfinmahmud</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3946759%2F8e034471-faea-427b-acc1-9b58c10ccb7f.jpeg</url>
      <title>DEV Community: Tasfin Mahmud</title>
      <link>https://dev.to/tasfinmahmud</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/tasfinmahmud"/>
    <language>en</language>
    <item>
      <title>Debiasing Graph Neural Networks for Recommendation with Causal RL</title>
      <dc:creator>Tasfin Mahmud</dc:creator>
      <pubDate>Sat, 23 May 2026 05:24:37 +0000</pubDate>
      <link>https://dev.to/tasfinmahmud/debiasing-graph-neural-networks-for-recommendation-with-causal-rl-2783</link>
      <guid>https://dev.to/tasfinmahmud/debiasing-graph-neural-networks-for-recommendation-with-causal-rl-2783</guid>
      <description>&lt;p&gt;As part of my undergraduate research in &lt;strong&gt;Graph Neural Networks (GNNs)&lt;/strong&gt; and &lt;strong&gt;Causal Inference&lt;/strong&gt;, I've been exploring a major flaw in modern recommender systems: &lt;strong&gt;observational bias&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Standard recommendation algorithms, even strong GNN models like LightGCN and NGCF, learn from biased data. Popular items get shown more often, which leads to more clicks, creating a feedback loop that reinforces popularity bias and buries niche items.&lt;/p&gt;

&lt;p&gt;To address this, I built an open-source framework that combines &lt;strong&gt;GNNs with Causal Reinforcement Learning&lt;/strong&gt; to debias recommendations.&lt;/p&gt;

&lt;p&gt;Here is how I approached it.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;👋 &lt;strong&gt;Hi, I'm Tasfin Mahmud!&lt;/strong&gt; I'm a CS Researcher at BRAC University and an open-source contributor. You can learn more about my work on my &lt;a href="https://tasfinmahmud.github.io" rel="noopener noreferrer"&gt;portfolio website&lt;/a&gt; or my &lt;a href="https://github.com/TasfinMahmud" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;🏗️ The Baseline GNNs&lt;/h2&gt;

&lt;p&gt;I started by implementing three solid baseline architectures in PyTorch Geometric (PyG):&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;LightGCN:&lt;/strong&gt; The minimalist approach that drops feature transformations and non-linear activations, keeping only neighborhood aggregation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;NGCF (Neural Graph Collaborative Filtering):&lt;/strong&gt; Explicitly models high-order connectivities with feature interaction terms.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GAT-CF:&lt;/strong&gt; Graph Attention Networks adapted for collaborative filtering.&lt;/li&gt;
&lt;/ol&gt;
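
&lt;p&gt;To make the LightGCN idea concrete, here is a minimal NumPy sketch of its propagation rule: embeddings are repeatedly multiplied by the symmetrically normalised adjacency matrix and the layer outputs are averaged. The function name, the dense adjacency matrix, and the default of two layers are illustrative assumptions; this is not the PyG code from my framework.&lt;/p&gt;

```python
import numpy as np

def lightgcn_embeddings(adjacency, embeddings, num_layers=2):
    """Propagate embeddings with symmetric normalisation and average the layers.

    adjacency: square 0/1 matrix over all user and item nodes (assumed dense here)
    embeddings: one row of initial embedding per node
    """
    adj = np.asarray(adjacency, dtype=float)
    degree = adj.sum(axis=1)
    # D^{-1/2}, leaving isolated nodes at zero instead of dividing by zero
    inv_sqrt = np.zeros_like(degree)
    nonzero = degree > 0
    inv_sqrt[nonzero] = degree[nonzero] ** -0.5
    norm_adj = inv_sqrt[:, None] * adj * inv_sqrt[None, :]

    layers = [np.asarray(embeddings, dtype=float)]
    for _ in range(num_layers):
        # Pure neighbourhood aggregation: no weight matrix, no activation
        layers.append(norm_adj @ layers[-1])
    return np.mean(layers, axis=0)
```

&lt;p&gt;A real implementation would use a sparse adjacency matrix, but the arithmetic is the same.&lt;/p&gt;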

&lt;p&gt;These models score well on standard metrics. But there is a catch: when the test split is drawn from the same observational logs, it shares the training data's exposure bias, so part of that score reflects how well the model reproduces popularity rather than how well it captures preference.&lt;/p&gt;




&lt;h2&gt;🧪 Injecting Causal RL&lt;/h2&gt;

&lt;p&gt;To break the popularity loop, I implemented four complementary causal debiasing techniques.&lt;/p&gt;

&lt;h3&gt;1. Inverse Propensity Scoring (IPS)&lt;/h3&gt;

&lt;p&gt;The simplest way to correct for exposure bias is to reweight the Bayesian Personalised Ranking (BPR) training loss. IPS divides each interaction's loss term by the item's estimated exposure probability (its propensity). Rarely shown items therefore contribute a larger gradient signal, while very popular items are scaled down.&lt;/p&gt;
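
&lt;p&gt;The reweighting can be sketched in plain Python as follows. The function names, the dictionary of per-item exposure probabilities, and the &lt;code&gt;clip&lt;/code&gt; floor (a common safeguard against exploding weights) are assumptions made for illustration, not the framework's actual API.&lt;/p&gt;

```python
import math

def bpr_loss(pos_score, neg_score):
    """BPR loss for one (user, positive item, negative item) triple."""
    # -log(sigmoid(pos - neg)) equals softplus(neg - pos); this form is stable
    diff = neg_score - pos_score
    return max(diff, 0.0) + math.log1p(math.exp(-abs(diff)))

def ips_bpr_loss(triples, propensity, clip=0.05):
    """Mean IPS-weighted BPR loss.

    triples: list of (pos_item, pos_score, neg_score)
    propensity: dict mapping item to its estimated exposure probability
    clip: assumed lower bound on the propensity, to cap the largest weights
    """
    total = 0.0
    for item, pos_score, neg_score in triples:
        weight = 1.0 / max(propensity[item], clip)
        total += weight * bpr_loss(pos_score, neg_score)
    return total / len(triples)
```

&lt;p&gt;With an exposure probability of 0.1, a niche item's loss term counts ten times; at 0.9, a popular item's term counts roughly 1.1 times.&lt;/p&gt;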

&lt;h3&gt;2. Causal Embeddings (CausE)&lt;/h3&gt;

&lt;p&gt;Here, the model maintains two separate embedding spaces:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A &lt;strong&gt;factual&lt;/strong&gt; space (learned from the biased data)&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;counterfactual&lt;/strong&gt; space (representing uniform exposure)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A discrepancy regularizer pulls the factual representations toward the unbiased counterfactual ones, preventing the model from overfitting to the exposure distribution.&lt;/p&gt;
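
&lt;p&gt;As a rough illustration, a discrepancy regularizer can be written as a weighted distance between the two embedding tables. The squared L2 distance, the function name, and the default &lt;code&gt;weight&lt;/code&gt; below are assumptions for this sketch; the distance actually used in a given CausE implementation may differ.&lt;/p&gt;

```python
def discrepancy_penalty(factual, counterfactual, weight=0.1):
    """Weighted mean squared L2 distance between paired item embeddings.

    factual, counterfactual: lists of equal-length vectors, one per item,
    in the same item order (an assumption of this sketch).
    """
    if len(factual) != len(counterfactual):
        raise ValueError("embedding tables must cover the same items")
    total = 0.0
    for f_vec, c_vec in zip(factual, counterfactual):
        total += sum((f - c) ** 2 for f, c in zip(f_vec, c_vec))
    return weight * total / len(factual)
```

&lt;p&gt;This term is added to the ranking loss, so the penalty is zero only when the two spaces agree.&lt;/p&gt;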

&lt;h3&gt;3. Causal Policy Gradient&lt;/h3&gt;

&lt;p&gt;Treating recommendation as a sequential decision-making problem, I used the &lt;strong&gt;REINFORCE&lt;/strong&gt; algorithm. The core idea here is &lt;strong&gt;Causal Reward Shaping&lt;/strong&gt;: decomposing observed rewards into the "true preference" (causal component) and the "popularity bias" (confounding component). Doubly Robust (DR) estimation combines a learned reward model with an importance-weighted correction on the logged rewards, which typically lowers variance compared with importance weighting alone and makes learning from logged data more stable.&lt;/p&gt;
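
&lt;p&gt;For readers unfamiliar with DR, here is a minimal sketch of the standard doubly robust off-policy value estimate: a reward-model prediction plus an importance-weighted correction on the logged reward. The log format and the names &lt;code&gt;doubly_robust_value&lt;/code&gt;, &lt;code&gt;target_policy&lt;/code&gt;, and &lt;code&gt;reward_model&lt;/code&gt; are hypothetical and chosen for illustration; the policy-gradient code in the framework is more involved.&lt;/p&gt;

```python
def doubly_robust_value(logs, target_policy, reward_model):
    """Doubly robust estimate of a target policy's expected reward.

    logs: list of (context, action, reward, logging_propensity)
    target_policy(context) -> dict mapping action to probability
    reward_model(context, action) -> predicted reward
    """
    total = 0.0
    for context, action, reward, propensity in logs:
        probs = target_policy(context)
        # Direct-method term: expected predicted reward under the target policy
        direct = sum(p * reward_model(context, a) for a, p in probs.items())
        # Correction term: importance-weighted residual on the logged action
        ratio = probs.get(action, 0.0) / propensity
        total += direct + ratio * (reward - reward_model(context, action))
    return total / len(logs)
```

&lt;p&gt;If the reward model is exact, the correction term vanishes; if it is useless but the propensities are right, the estimate falls back to plain importance weighting.&lt;/p&gt;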

&lt;h3&gt;4. Causal Discovery&lt;/h3&gt;

&lt;p&gt;How do we know what the confounders are if they aren't explicitly measured? I implemented a causal discovery module using Truncated SVD on the exposure matrix to automatically identify latent confounding factors, which are then integrated into the reward shaping process.&lt;/p&gt;
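
&lt;p&gt;The decomposition step can be sketched with NumPy as below. The function name, the choice of &lt;code&gt;k&lt;/code&gt;, and the use of &lt;code&gt;np.linalg.svd&lt;/code&gt; followed by truncation are assumptions for illustration; how the resulting factors feed into reward shaping is not shown here.&lt;/p&gt;

```python
import numpy as np

def latent_exposure_factors(exposure, k=2):
    """Rank-k truncated SVD of a user-by-item exposure matrix.

    Returns per-user loadings, per-item loadings, and the top-k singular values.
    """
    matrix = np.asarray(exposure, dtype=float)
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    user_factors = u[:, :k] * s[:k]   # each user's loading on each latent factor
    item_factors = vt[:k].T           # each item's loading on each latent factor
    return user_factors, item_factors, s[:k]
```

&lt;p&gt;Multiplying the user factors by the transposed item factors gives the best rank-k approximation of the exposure matrix, so the leading factors capture its dominant exposure patterns.&lt;/p&gt;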




&lt;h2&gt;📊 The Results&lt;/h2&gt;

&lt;p&gt;I benchmarked these approaches using LightGCN on the &lt;strong&gt;MovieLens 100K&lt;/strong&gt; dataset:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model Mode&lt;/th&gt;
&lt;th&gt;Recall@20&lt;/th&gt;
&lt;th&gt;NDCG@20&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Standard (Baseline)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;0.1676&lt;/td&gt;
&lt;td&gt;0.1624&lt;/td&gt;
&lt;td&gt;Standard biased observational learning&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IPS Debiasing&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;0.1453&lt;/td&gt;
&lt;td&gt;0.1543&lt;/td&gt;
&lt;td&gt;Re-weights rare items; expected to drop on biased test data&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;CausE&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;0.1675&lt;/td&gt;
&lt;td&gt;0.1625&lt;/td&gt;
&lt;td&gt;Regularized against uniform exposure&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Causal PG (DR)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;0.1593&lt;/td&gt;
&lt;td&gt;0.1602&lt;/td&gt;
&lt;td&gt;Doubly robust policy gradient&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;em&gt;(Note: Debiased models tend to score the same or lower on a standard biased test set, because that test set rewards the same exposure bias the models were trained to ignore. Measuring the true effect of debiasing requires unbiased logged data, for example from uniformly randomized exposure.)&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;💡 The Takeaway&lt;/h2&gt;

&lt;p&gt;GNNs are powerful tools for recommendation, but trained on observational data alone they tend to amplify the biases already present in the dataset. Techniques like IPS and Causal Policy Gradients help build recommendation systems that model user preference rather than just popularity.&lt;/p&gt;

&lt;p&gt;🔗 &lt;strong&gt;Check out the full framework on my GitHub:&lt;/strong&gt; &lt;br&gt;
&lt;a href="https://github.com/TasfinMahmud/gnn-collaborative-filtering" rel="noopener noreferrer"&gt;gnn-collaborative-filtering&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;🔗 &lt;strong&gt;Learn more about my research and open-source work:&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://tasfinmahmud.github.io" rel="noopener noreferrer"&gt;tasfinmahmud.github.io&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let me know in the comments if you've worked with Causal Inference for recommendation systems!&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>python</category>
      <category>opensource</category>
      <category>gnn</category>
    </item>
  </channel>
</rss>
