How I explored INT8 quantization, biological graphs, and CPU-only inference using PyTorch Geometric.
Healthcare AI is often discussed in terms of massive cloud infrastructure and expensive GPUs.
But many real-world systems do not operate inside large datacenters.
Small clinics, portable medical systems, rural deployments, and edge diagnostic devices frequently depend on:
- low-power CPUs
- limited memory
- unstable connectivity
- compact hardware environments
That raises an important engineering question:
Can graph neural networks become smaller and more deployable without completely losing their predictive behavior?
This project explores that question using biological graph data, Graph Neural Networks (GNNs), and manual INT8 quantization.
The Project
BioGraph-Edge-Quantizer GitHub Repository
The repository focuses on:
- biological graph inference
- resource-aware deployment
- CPU-only execution
- model compression
- reproducible benchmarking
The system uses:
- Python
- PyTorch Geometric
- GraphSAGE
- TorchScript
- Laravel API integration
The goal was not to build a “medical AI product.”
Instead, the focus was:
understanding how graph-based AI systems behave under hardware constraints.
Why Graphs Matter in Biology
Many biological systems naturally behave like graphs.
For example:
- proteins interact with other proteins
- genes regulate other genes
- molecular pathways form connected networks
In this project, the graph structure comes from protein interaction relationships inspired by the STRING dataset.
Each node represents a protein.
Each edge represents a relationship or interaction.
The model then attempts a binary node classification task.
Simplified examples:
- Does this protein belong to a target functional category?
- Is this interaction pattern significant?
This is where Graph Neural Networks become useful.
Why Use Graph Neural Networks?
Traditional neural networks process:
- images
- text
- tabular data
But biological systems are highly interconnected.
GNNs are useful because they learn:
- relationships
- neighborhood behavior
- graph structure
This project uses GraphSAGE, which is designed for inductive graph learning.
That means:
- the model can generalize to unseen nodes
- inference is more flexible for evolving graphs
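The repository uses PyG's SAGEConv layers; the snippet below is only a plain-PyTorch sketch of the mean-aggregation idea behind GraphSAGE, with a hypothetical toy graph (the feature sizes, weights, and edges are made up for illustration).

```python
import torch

def sage_mean_layer(x, edge_index, w_self, w_neigh):
    """One GraphSAGE-style layer: combine each node's own features
    with the mean of its neighbors' features."""
    src, dst = edge_index                  # edges point src -> dst
    n = x.size(0)
    agg = torch.zeros_like(x)
    agg.index_add_(0, dst, x[src])         # sum neighbor features per node
    deg = torch.zeros(n).index_add_(0, dst, torch.ones(src.size(0)))
    agg = agg / deg.clamp(min=1).unsqueeze(1)   # turn sums into means
    return torch.relu(x @ w_self + agg @ w_neigh)

# Toy graph: 3 proteins, 4-dim features, undirected edges 0-1 and 1-2.
x = torch.randn(3, 4)
edge_index = torch.tensor([[0, 1, 1, 2],
                           [1, 0, 2, 1]])
w_self = torch.randn(4, 8)
w_neigh = torch.randn(4, 8)
h = sage_mean_layer(x, edge_index, w_self, w_neigh)
print(h.shape)  # torch.Size([3, 8])
```

Because the layer only needs a node's local neighborhood, the same weights apply to nodes that were never seen during training, which is what makes the approach inductive.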
The Real Problem: Edge Deployment
Most machine learning tutorials stop after:
“The model works.”
But deployment creates a different set of challenges:
- memory limits
- latency stability
- CPU constraints
- model size
- reproducibility
This project explores:
- INT8 weight packing
- TorchScript deployment
- bounded inference variance
- edge-device behavior
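The TorchScript part is worth a concrete sketch. A stand-in model is compiled and saved below (the repository scripts its quantized GraphSAGE instead; the file name here is arbitrary):

```python
import torch

# A stand-in model; the real project exports its GNN this way.
model = torch.nn.Sequential(torch.nn.Linear(8, 4), torch.nn.ReLU())
model.eval()

scripted = torch.jit.script(model)   # compile to TorchScript
scripted.save("model_scripted.pt")   # self-contained artifact

# The saved file loads without the original Python class definition,
# which simplifies CPU-only deployment on edge devices.
restored = torch.jit.load("model_scripted.pt")
out = restored(torch.randn(2, 8))
print(out.shape)  # torch.Size([2, 4])
```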
What is INT8 Quantization?
Most neural networks store weights using FP32 (32-bit floating point values).
Quantization reduces precision.
Instead of 32-bit floating-point weights, we store 8-bit integer weights.
The tradeoff:
- smaller model
- lower memory usage
- possible accuracy reduction
In this project:
- weights were manually converted to INT8
- scale factors were stored separately
- dequantization happened during inference
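A minimal sketch of that scheme, assuming symmetric per-tensor quantization (the repository's exact packing format and scale granularity may differ):

```python
import torch

def quantize_int8(w):
    """Map an FP32 weight tensor to INT8 plus a per-tensor scale."""
    scale = w.abs().max() / 127.0   # symmetric: zero-point is 0
    q = torch.clamp(torch.round(w / scale), -127, 127).to(torch.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an FP32 approximation at inference time."""
    return q.to(torch.float32) * scale

w = torch.randn(64, 64)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
err = (w - w_hat).abs().max().item()
print(q.dtype)             # torch.int8
print(err < scale.item())  # True: rounding error stays below one scale step
```

Each INT8 weight costs one byte instead of four, and the FP32 scale factors stored alongside are negligible by comparison.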
The Results
The interesting part was not raw speed.
It was understanding where quantization actually helps.
x86 Laptop Results
Hardware:
- Intel i5-10210U
- 8 GB RAM
- Windows 11
Results:
- ~75% reduction in model size
- very small latency improvement
- accuracy drop below 1%
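The ~75% figure follows directly from the storage math (the parameter count below is hypothetical, just to show the ratio):

```python
# FP32 weights: 4 bytes each; INT8 weights: 1 byte each.
n_params = 100_000                 # hypothetical parameter count
fp32_bytes = n_params * 4
int8_bytes = n_params * 1 + 8      # plus a few bytes of scale factors
reduction = 1 - int8_bytes / fp32_bytes
print(f"{reduction:.1%}")  # 75.0%
```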
That initially seemed disappointing.
But the explanation matters.
Graph neural networks are often:
- memory-bound
- aggregation-heavy
The bottleneck was not matrix multiplication alone.
It was:
- graph traversal
- feature movement
- neighbor aggregation
ARM Edge Device Results
The same experiment was then repeated on:
- Raspberry Pi 4
- Cortex-A72 CPU
- 4 GB RAM
This time the gains became more noticeable.
Why?
Because smaller devices have:
- tighter memory limits
- smaller cache capacity
- lower memory bandwidth
In those environments:
- reduced model size matters more
- memory pressure becomes a real constraint
This is an important observation for:
- hardware vendors
- embedded AI systems
- edge healthcare infrastructure
Current System Architecture
The repository currently separates:
- ML inference
- API infrastructure
ML Layer
Python + PyTorch Geometric
API Layer
Laravel-based gateway
Current flow:
Laravel → Python subprocess → GNN inference → API response
This is intentionally simple for experimentation.
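The Python side of that flow can be sketched as a handler that parses a JSON request and returns a JSON response (the field names and dummy scores are assumptions for illustration, not the repository's actual interface):

```python
import json

def handle_request(raw_json):
    """Parse a request from the Laravel gateway, run inference,
    and return a JSON response string."""
    request = json.loads(raw_json)
    node_ids = request["node_ids"]
    # Placeholder for the real TorchScript GNN forward pass.
    scores = {str(n): 0.5 for n in node_ids}
    return json.dumps({"scores": scores})

# In the real flow, Laravel spawns the script as a subprocess and
# exchanges this JSON over stdin/stdout.
print(handle_request('{"node_ids": [101, 202]}'))
```

Spawning a fresh interpreter per request is the main source of the subprocess overhead noted below.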
The repository also documents:
- subprocess overhead
- scalability limitations
- future migration plans
Why This Matters for Medical Informatics
Medical informatics is not only about large AI models.
It is also about:
- interoperability
- infrastructure
- reproducibility
- deployment reliability
- hardware-aware engineering
Even experimental systems benefit from:
- deterministic execution
- controlled benchmarking
- transparent limitations
This project is not a clinical system.
It is a systems-engineering exploration around biological graph inference.
Important Limitations
A good engineering project should clearly state its limitations.
Current limitations include:
- limited benchmarking scope
- no clinical validation
- subprocess overhead
- no distributed inference
- limited quantization optimization
- prototype-level architecture
The project intentionally avoids claiming:
- medical accuracy
- production readiness
- diagnostic capability
Why I Shared This Publicly
I wanted to document:
- how edge AI systems behave
- where quantization helps
- where it does not
- how biological graph workloads differ from standard AI pipelines
Many tutorials simplify deployment problems.
But practical ML engineering often involves:
- bottlenecks
- memory constraints
- unstable performance behavior
- tradeoffs between accuracy and footprint
Understanding those tradeoffs is valuable for:
- developers
- researchers
- hardware engineers
- students entering medical informatics
Repository
GitHub
BioGraph-Edge-Quantizer Repository
Areas Open for Collaboration
I would especially love feedback from people working in:
- graph neural networks
- embedded inference
- medical informatics
- edge hardware systems
- PyTorch optimization
- ONNX / TVM / ExecuTorch
- biological network analysis
About the Author
Swapin Vidya
Interested in:
- edge AI systems
- reproducible ML infrastructure
- biological graph computing
- hardware-aware inference pipelines
- healthcare-oriented systems engineering
GitHub:
Swapin Vidya GitHub
ORCID:
Swapin Vidya ORCID