<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: theralavineela</title>
    <description>The latest articles on DEV Community by theralavineela (@theralavineela).</description>
    <link>https://dev.to/theralavineela</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3743847%2Fe248dc95-743f-495f-af84-9809a14b667d.png</url>
      <title>DEV Community: theralavineela</title>
      <link>https://dev.to/theralavineela</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/theralavineela"/>
    <language>en</language>
    <item>
      <title>My GSoC Journey So Far: Exploring Geographically Weighted Learning in PySAL</title>
      <dc:creator>theralavineela</dc:creator>
      <pubDate>Sat, 31 Jan 2026 13:56:02 +0000</pubDate>
      <link>https://dev.to/theralavineela/my-gsoc-journey-so-far-exploring-geographically-weighted-learning-in-pysal-4179</link>
      <guid>https://dev.to/theralavineela/my-gsoc-journey-so-far-exploring-geographically-weighted-learning-in-pysal-4179</guid>
      <description>&lt;p&gt;&lt;strong&gt;Introduction&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Over the past couple of months, I’ve been gradually ramping up my involvement with the PySAL / gwlearn ecosystem as part of my preparation for Google Summer of Code (GSoC).&lt;/p&gt;

&lt;p&gt;December and January were a bit of a balancing act — semester exams on one side, and open-source exploration on the other. While I couldn’t code full-time throughout this period, I stayed consistently engaged by following community discussions, testing compatibility issues, experimenting locally, and strengthening my understanding of geographically weighted (GW) models.&lt;/p&gt;

&lt;p&gt;This post summarizes what I worked on, what I learned, and the research direction I’m currently exploring.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;December: Foundations, Setup, and First Experiments&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Revisiting the Basics&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I began by revisiting geospatial data science fundamentals, using:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;ISRO geospatial course notes&lt;/li&gt;
&lt;li&gt;PySAL documentation and mentor-recommended resources&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This helped solidify the theoretical grounding required for geographically weighted models, especially around spatial relationships and locality-aware learning.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Installing and Verifying gwlearn&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I installed the gwlearn package following the official GitHub documentation and verified the installation by checking the package version in Python.&lt;/p&gt;

&lt;p&gt;Instead of modifying library files directly, I created a playground script using the Guerry dataset — effectively a “Hello World” example for gwlearn — to ensure everything worked end to end.&lt;/p&gt;

&lt;p&gt;This single example validated:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GeoPandas integration&lt;/li&gt;
&lt;li&gt;Kernel computations (tricube kernel)&lt;/li&gt;
&lt;li&gt;Scikit-learn–style .fit() API&lt;/li&gt;
&lt;li&gt;Model execution through the core gwlearn code path&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Seeing successful output gave me confidence that my environment and dependencies were set up correctly.&lt;/p&gt;
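&lt;p&gt;For readers unfamiliar with the tricube kernel mentioned above, here is a minimal plain-Python sketch of the standard formula: the weight is (1 - (d/b)^3)^3 when the distance d is below the bandwidth b, and 0 otherwise. The function name tricube_weight is hypothetical and is not part of gwlearn's API; gwlearn's own implementation may differ in details.&lt;/p&gt;

```python
def tricube_weight(distance, bandwidth):
    """Standard tricube kernel weight for one distance (illustrative only)."""
    if not bandwidth > 0:
        raise ValueError("bandwidth must be positive")
    ratio = abs(distance) / bandwidth
    # Observations at or beyond the bandwidth get no weight at all.
    if ratio >= 1.0:
        return 0.0
    return (1.0 - ratio ** 3) ** 3


# Weights fall smoothly from 1 at the focal point to 0 at the bandwidth.
weights = [tricube_weight(d, bandwidth=10.0) for d in (0.0, 5.0, 10.0, 12.0)]
```

&lt;p&gt;Nearby observations dominate each local fit, while anything outside the bandwidth is ignored entirely.&lt;/p&gt;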

&lt;p&gt;&lt;strong&gt;Learning About scikit-learn Metadata Routing&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Later in December, I explored metadata routing, a newer feature introduced in recent versions of scikit-learn.&lt;/p&gt;

&lt;p&gt;Metadata routing allows non-standard data (like spatial geometry) to flow through pipelines without breaking sklearn’s strict estimator API. This is especially important for spatial models, where geometry is essential but not traditionally part of X or y.&lt;/p&gt;

&lt;p&gt;Understanding this concept turned out to be critical for my first real contribution.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Contribution: Making gwlearn More scikit-learn Compatible&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Problem&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Scikit-learn expects estimators to follow this signature:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;fit(X, y)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But gwlearn models inherently require geometry:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;fit(X, y, geometry)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This mismatch caused:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Failures in sklearn.utils.estimator_checks&lt;/li&gt;
&lt;li&gt;Incompatibility with Pipeline and GridSearchCV&lt;/li&gt;
&lt;li&gt;Geometry being dropped during cloning or prediction&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The obvious workarounds, passing geometry via the constructor or embedding it into X, caused issues of their own.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Solution&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I worked on improving compatibility by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Enabling sklearn metadata routing&lt;/li&gt;
&lt;li&gt;Declaring geometry as required routed metadata&lt;/li&gt;
&lt;li&gt;Refactoring fit() to accept **kwargs instead of explicitly requiring geometry&lt;/li&gt;
&lt;li&gt;Introducing an adapter pattern so gwlearn remains expressive internally while presenting a clean sklearn-style interface externally&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This approach allowed geometry to pass safely through pipelines, cross-validation, and hyperparameter search — without violating sklearn’s API constraints.&lt;/p&gt;
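&lt;p&gt;To make the adapter idea concrete, here is a rough plain-Python sketch. It is not the code from the PR and it does not use scikit-learn's actual routing machinery; the class names GeometryAwareModel and SklearnStyleAdapter are hypothetical. It only shows the shape of the pattern: the outer object exposes fit(X, y, **kwargs) and forwards geometry to an inner model that still takes it explicitly.&lt;/p&gt;

```python
class GeometryAwareModel:
    """Hypothetical stand-in for a model whose native fit needs geometry."""

    def fit(self, X, y, geometry):
        if len(geometry) != len(X):
            raise ValueError("geometry must have one entry per row of X")
        self.n_locations_ = len(geometry)
        return self


class SklearnStyleAdapter:
    """Hypothetical adapter exposing fit(X, y, **kwargs) around the model."""

    def __init__(self, model):
        self.model = model

    def fit(self, X, y, **kwargs):
        # Geometry arrives as keyword metadata, not as a positional argument.
        if "geometry" not in kwargs:
            raise TypeError("fit() requires 'geometry' to be passed as metadata")
        self.model.fit(X, y, geometry=kwargs["geometry"])
        return self


X = [[1.0], [2.0], [3.0]]
y = [0, 1, 0]
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]

adapter = SklearnStyleAdapter(GeometryAwareModel()).fit(X, y, geometry=points)
```

&lt;p&gt;In the real library the keyword would be declared as routed metadata, so that tools like Pipeline and GridSearchCV know to pass it along; this sketch skips that step.&lt;/p&gt;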

&lt;p&gt;&lt;strong&gt;Pull Request&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;These changes were submitted as:&lt;/p&gt;

&lt;p&gt;PR #45: Add sklearn metadata routing support and stabilize Cross-Validation&lt;br&gt;
&lt;a href="https://github.com/pysal/gwlearn/pull/45"&gt;https://github.com/pysal/gwlearn/pull/45&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The PR:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fixes critical sklearn compatibility issues&lt;/li&gt;
&lt;li&gt;Stabilizes cross-validation behavior&lt;/li&gt;
&lt;li&gt;Keeps changes minimal and backward compatible&lt;/li&gt;
&lt;li&gt;Was verified using the Guerry dataset with Pipeline and GridSearchCV&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It also sparked valuable design discussions with mentors, a great reminder that exploratory PRs are part of healthy open-source development.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;January: Testing and Compatibility Work&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pandas 3.0 Testing&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Even while focusing on semester exams, I kept checking community updates and running compatibility tests.&lt;/p&gt;

&lt;p&gt;I ran the gwlearn test suite against both pandas 2.3.3 and pandas 3.0.0 using a GitHub Actions compatibility matrix:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;pandas 2.3.3: all tests pass, including runs with -W error enabled&lt;/li&gt;
&lt;li&gt;pandas 3.0.0: three test failures in test_base.py, where objects expected to be trained models become scalars (float)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This results in:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;AttributeError: 'float' object has no attribute 'predict_proba'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I discussed these results with mentors and am currently investigating the underlying cause, which appears related to changes in pandas behavior affecting model persistence or reconstruction.&lt;/p&gt;
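&lt;p&gt;While the root cause is still open, the symptom itself is easy to reproduce in isolation. The sketch below is purely illustrative and assumes, hypothetically, that a collection of per-location models ends up holding a NaN float in one slot. TinyModel and find_non_models are made-up names, not gwlearn code, and this is not a claim about what pandas 3.0 actually changes.&lt;/p&gt;

```python
class TinyModel:
    """Hypothetical stand-in for a fitted local classifier."""

    def predict_proba(self, rows):
        return [[0.5, 0.5] for _ in rows]


# Assumption for illustration: one slot holds a NaN float instead of a model.
local_models = [TinyModel(), float("nan"), TinyModel()]


def find_non_models(models):
    """Return positions of entries that lack a predict_proba method."""
    return [i for i, m in enumerate(models) if not hasattr(m, "predict_proba")]


bad_positions = find_non_models(local_models)

# Calling predict_proba on the float reproduces the same kind of error.
try:
    local_models[1].predict_proba([[0.0]])
except AttributeError as exc:
    error_message = str(exc)
```

&lt;p&gt;A check like find_non_models can help narrow down where in the pipeline the model objects are being replaced.&lt;/p&gt;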

&lt;p&gt;This highlighted the importance of testing against future dependency versions, not just current stable releases.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Current Focus: Geographically Weighted Matrix Decomposition&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;At the moment, I’m studying geographically weighted matrix decomposition algorithms, particularly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Geographically Weighted Principal Component Analysis (GWPCA)&lt;/li&gt;
&lt;li&gt;Graph-based spatial representations using libpysal.graph.Graph&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Research Direction&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Proposed Focus:&lt;/em&gt;&lt;br&gt;
Implement geographically weighted matrix decomposition algorithms (such as GWPCA) on top of libpysal.graph.Graph, to be included in the gwlearn sub-package of the PySAL federation, with a scikit-learn-compatible API.&lt;/p&gt;

&lt;p&gt;This direction aligns well with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;gwlearn’s goal of scalable, sklearn-friendly spatial models&lt;/li&gt;
&lt;li&gt;libpysal’s evolving graph-based infrastructure&lt;/li&gt;
&lt;li&gt;My interest in combining spatial statistics, machine learning, and clean API design&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I’m currently working through the theory and existing implementations to understand how these algorithms can be expressed cleanly within gwlearn’s estimator framework.&lt;/p&gt;
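&lt;p&gt;As a rough picture of what GWPCA computes, here is a small plain-Python sketch under simplifying assumptions: two variables only (so the eigenvalues have a closed form), a tricube kernel with a fixed bandwidth, and a single focal location. It does not use libpysal.graph.Graph, and the names tricube, local_covariance and eigenvalues_2x2 are hypothetical. The general idea, as I understand it, is to build a kernel-weighted covariance matrix around each location and eigendecompose it.&lt;/p&gt;

```python
import math


def tricube(distance, bandwidth):
    """Tricube kernel weight; zero at or beyond the bandwidth."""
    if distance >= bandwidth:
        return 0.0
    return (1.0 - (distance / bandwidth) ** 3) ** 3


def local_covariance(rows, weights):
    """Weighted covariance terms of two-column rows around the weighted mean."""
    total = sum(weights)
    mean_a = sum(w * a for w, (a, b) in zip(weights, rows)) / total
    mean_b = sum(w * b for w, (a, b) in zip(weights, rows)) / total
    var_a = sum(w * (a - mean_a) ** 2 for w, (a, b) in zip(weights, rows)) / total
    var_b = sum(w * (b - mean_b) ** 2 for w, (a, b) in zip(weights, rows)) / total
    cov_ab = sum(
        w * (a - mean_a) * (b - mean_b) for w, (a, b) in zip(weights, rows)
    ) / total
    return var_a, cov_ab, var_b


def eigenvalues_2x2(var_a, cov_ab, var_b):
    """Closed-form eigenvalues of [[var_a, cov_ab], [cov_ab, var_b]]."""
    half_trace = (var_a + var_b) / 2.0
    spread = math.sqrt(((var_a - var_b) / 2.0) ** 2 + cov_ab ** 2)
    return half_trace + spread, half_trace - spread


# Toy data: three nearby points with b = 2a, plus one distant outlier.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
rows = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (10.0, 0.0)]
focal = (0.0, 0.0)

weights = [tricube(math.dist(focal, c), 2.0) for c in coords]
var_a, cov_ab, var_b = local_covariance(rows, weights)
major, minor = eigenvalues_2x2(var_a, cov_ab, var_b)
```

&lt;p&gt;Because the outlier lies outside the bandwidth it gets zero weight, so the local structure around the focal point is a single dominant component. Repeating this for every location is what makes the decomposition geographically weighted.&lt;/p&gt;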

&lt;p&gt;&lt;strong&gt;Reflection&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Even during a semester-heavy period, staying consistently engaged — through testing, reading, and discussion — helped me maintain momentum.&lt;/p&gt;

&lt;p&gt;Key takeaways so far:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Compatibility work is as important as new features&lt;/li&gt;
&lt;li&gt;Metadata routing is essential for spatial ML in sklearn ecosystems&lt;/li&gt;
&lt;li&gt;Testing against upcoming dependency versions surfaces issues early&lt;/li&gt;
&lt;li&gt;Open-source contribution is as much about communication as code&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I’m excited to build on this foundation and move toward deeper implementation work in the coming months.&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>python</category>
      <category>machinelearning</category>
      <category>geospatial</category>
    </item>
  </channel>
</rss>
