DEV Community

Rizwan Saleem

Designing a Personal Knowledge Graph for Software Engineers

A personal knowledge graph (PKG) is a structured, queryable map of your skills, experiences, projects, and learning connections. It helps you reason about career moves, plan learning paths, and communicate your capabilities clearly to teams, recruiters, and future you. In this tutorial, you’ll build a practical PKG from scratch, grounded in real-world software engineering needs. You’ll see why PKGs matter, how to model them, and how to use lightweight tooling to grow your graph over time.

1. Why a PKG matters for software engineers

  • It aligns learning with career goals: by explicitly modeling skills and projects, you can spot gaps and prioritize learning.
  • It makes performance reviews and resumes explicit: your graph can generate tailored summaries for different roles.
  • It reduces cognitive load: you don’t have to remember every project; the graph stores relationships and provenance.
  • It scales with complexity: from a few skills to dozens of projects, teams, and technologies.

Example: a PKG lets you answer “Which projects demonstrate my proficiency in distributed systems, and how did I learn those concepts?” with a single query.

2. PKG data model: entities and relationships

A compact, practical model focuses on a few core entity types and their links.

  • Person: your profile with metadata (name, contact, current role).
  • Skill: a technology or domain capability (e.g., “Rust,” “Distributed Systems,” “Testing”).
  • Project: a concrete software initiative you contributed to.
  • Learning: an instance of learning activity (course, book, article, conference talk).
  • Experience: a job role or assignment with time bounds.
  • Evidence: artifacts that demonstrate capability (code, talks, blog posts, repos, PRs).

Relationships (edges):

  • Person has Skill
  • Project uses Skill
  • Project involves Experience
  • Learning teaches Skill
  • Experience includes Project
  • Evidence supports Skill or Project or Experience

This keeps the graph intuitive and queryable while remaining lightweight.
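
The entity and edge types above can be sketched as plain (source, relation, target) triples. The following is a minimal illustration rather than a prescribed implementation: the names `ALLOWED_EDGES`, `Node`, and `add_edge` are hypothetical, and the relation labels are simply the verbs from the edge list.

```python
from dataclasses import dataclass

# (source type, relation) -> allowed target types, mirroring the edge list above.
ALLOWED_EDGES = {
    ("Person", "has"): {"Skill"},
    ("Project", "uses"): {"Skill"},
    ("Project", "involves"): {"Experience"},
    ("Learning", "teaches"): {"Skill"},
    ("Experience", "includes"): {"Project"},
    ("Evidence", "supports"): {"Skill", "Project", "Experience"},
}

@dataclass(frozen=True)
class Node:
    kind: str  # one of the six entity types
    name: str

def add_edge(edges, source, relation, target):
    """Append an edge after checking it against ALLOWED_EDGES."""
    allowed = ALLOWED_EDGES.get((source.kind, relation), set())
    if target.kind not in allowed:
        raise ValueError(f"{source.kind} -{relation}-> {target.kind} is not in the model")
    edges.append((source, relation, target))

you = Node("Person", "You")
ts = Node("Skill", "TypeScript")
checkout = Node("Project", "E-commerce Checkout Service")

edges = []
add_edge(edges, you, "has", ts)
add_edge(edges, checkout, "uses", ts)

# Which projects use TypeScript?
projects = [s.name for s, rel, t in edges if rel == "uses" and t == ts]
print(projects)
```

Validating edges at insert time keeps the graph consistent with the model, so later queries never need to guard against, say, a Skill that "has" a Person.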

Illustration (text-based):

  • Person: You
  • Skill: TypeScript, System Design, Cloud Native, Testing
  • Project: E-commerce Checkout Service
  • Learning: “Designing Data-Intensive Applications” (book)
  • Experience: Senior Software Engineer at Acme Corp (2022-present)
  • Evidence: GitHub PRs, design docs, talks

Edges:

  • You has Skill TypeScript
  • E-commerce Checkout Service uses Skill TypeScript
  • E-commerce Checkout Service involves Experience Senior Software Engineer at Acme Corp
  • Design doc for Checkout (Evidence) supports Skill System Design
  • “Designing Data-Intensive Applications” (Learning) teaches Skill System Design
  • GitHub PR #123 (Evidence) supports Project E-commerce Checkout Service

3. Minimal, practical schema you can implement today

A simple, query-friendly schema to start with:

  • Entities

    • Person: id, name, title, current_company, contact
    • Skill: id, name, domain (frontend, backend, devops, data)
    • Project: id, name, description, start_date, end_date
    • Experience: id, title, company, start, end
    • Learning: id, type (book/course/article/conference), title, author/instructor, date
    • Evidence: id, type (repo, talk, post), url, caption, related_entity_id
  • Relationships (edges)

    • person_skill(person_id, skill_id)
    • project_skill(project_id, skill_id)
    • project_experience(project_id, experience_id)
    • learning_skill(learning_id, skill_id)
    • experience_project(experience_id, project_id)
    • evidence_entity(evidence_id, related_entity_id)

You can store this in a lightweight graph database (Neo4j, Dgraph) or in a simple relational store with join tables. For portability, a local SQLite database with a small ORM or even a JSON-LD structure works.
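
As one way to realize the relational option, the sketch below creates a subset of the schema in SQLite using Python's standard-library `sqlite3` module. Table and column names follow the lists above; the `create_schema` helper and the choice of TEXT columns for dates are assumptions made for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE person  (id INTEGER PRIMARY KEY, name TEXT NOT NULL, title TEXT);
CREATE TABLE skill   (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE, domain TEXT);
CREATE TABLE project (id INTEGER PRIMARY KEY, name TEXT NOT NULL, description TEXT,
                      start_date TEXT, end_date TEXT);
-- Join tables: one row per edge.
CREATE TABLE person_skill (
    person_id INTEGER NOT NULL REFERENCES person(id),
    skill_id  INTEGER NOT NULL REFERENCES skill(id),
    PRIMARY KEY (person_id, skill_id)
);
CREATE TABLE project_skill (
    project_id INTEGER NOT NULL REFERENCES project(id),
    skill_id   INTEGER NOT NULL REFERENCES skill(id),
    PRIMARY KEY (project_id, skill_id)
);
"""

def create_schema(conn: sqlite3.Connection) -> None:
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(SCHEMA)

conn = sqlite3.connect(":memory:")
create_schema(conn)
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)
```

The composite primary key on each join table prevents the same edge from being recorded twice. The remaining entities (Experience, Learning, Evidence) would follow the same pattern.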

4. Getting started: a minimal, runnable example

We’ll implement a tiny PKG using SQLite and Python with SQLAlchemy for a quick start. You can adapt to TypeScript/Node if you prefer.

Dependencies:

  • Python 3.11+
  • SQLAlchemy
  • SQLite (built-in)

Install:

  • python3 -m venv venv
  • source venv/bin/activate
  • pip install sqlalchemy

Code: models.py

  • Define ORM models for Person, Skill, Project, Experience, Learning, Evidence and association tables.

Code: seed.py

  • Seed some sample data: your name, a couple of skills, a project, a learning item.

Code: query.py

  • Simple CLI to list projects by skill, or list evidence for a project.

Sample snippets:

models.py (conceptual):

    from sqlalchemy import Column, ForeignKey, Integer, String, Table
    from sqlalchemy.orm import declarative_base, relationship

    Base = declarative_base()

    # Association table for the many-to-many Person <-> Skill edge
    person_skill = Table(
        "person_skill", Base.metadata,
        Column("person_id", ForeignKey("persons.id"), primary_key=True),
        Column("skill_id", ForeignKey("skills.id"), primary_key=True),
    )

    class Person(Base):
        __tablename__ = "persons"
        id = Column(Integer, primary_key=True)
        name = Column(String)
        skills = relationship("Skill", secondary=person_skill, back_populates="persons")

    class Skill(Base):
        __tablename__ = "skills"
        id = Column(Integer, primary_key=True)
        name = Column(String)
        persons = relationship("Person", secondary=person_skill, back_populates="skills")

    class Project(Base):
        __tablename__ = "projects"
        id = Column(Integer, primary_key=True)
        name = Column(String)
        description = Column(String)

seed.py and query.py follow standard SQLAlchemy patterns: seed.py opens a session, adds the sample objects, and commits; query.py joins through the association tables to filter by skill or project.

Run:

  • python seed.py
  • python query.py list-projects-by-skill TypeScript

Output will show which projects used a given skill and what you recorded as evidence.
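
To show the shape of the seed and query steps, here is a compact, dependency-free sketch that uses the standard-library `sqlite3` module instead of SQLAlchemy. The table layout, the `seed` and `projects_by_skill` helpers, and the in-memory database are assumptions for illustration; only the `list-projects-by-skill` subcommand name comes from the run step above.

```python
import argparse
import sqlite3

def seed(conn):
    """Create two entity tables plus a join table and insert sample rows."""
    conn.executescript("""
        CREATE TABLE skill   (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE project (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE project_skill (project_id INTEGER, skill_id INTEGER);
        INSERT INTO skill (id, name) VALUES (1, 'TypeScript'), (2, 'System Design');
        INSERT INTO project (id, name) VALUES (1, 'E-commerce Checkout Service');
        INSERT INTO project_skill VALUES (1, 1), (1, 2);
    """)

def projects_by_skill(conn, skill_name):
    """Return names of projects linked to the given skill, alphabetically."""
    rows = conn.execute(
        """SELECT p.name FROM project p
           JOIN project_skill ps ON ps.project_id = p.id
           JOIN skill s ON s.id = ps.skill_id
           WHERE s.name = ? ORDER BY p.name""",
        (skill_name,),
    )
    return [name for (name,) in rows]

def main(argv):
    parser = argparse.ArgumentParser(prog="query.py")
    sub = parser.add_subparsers(dest="command", required=True)
    cmd = sub.add_parser("list-projects-by-skill")
    cmd.add_argument("skill")
    args = parser.parse_args(argv)

    conn = sqlite3.connect(":memory:")  # a real script would open a file on disk
    seed(conn)
    for name in projects_by_skill(conn, args.skill):
        print(name)

# Equivalent to: python query.py list-projects-by-skill TypeScript
main(["list-projects-by-skill", "TypeScript"])
```

In a real script you would pass `sys.argv[1:]` to `main` and keep seeding in its own file, so that running a query never rewrites your data.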

5. Practical workflows to grow your PKG

  • Add with intention: after every project, add a Project entity and connect the relevant Skills and Experiences. Attach Evidence (repo links, design docs, talks).
  • Track learning as you go: add Learning items and link to Skills they improve. This builds a traceable learning path.
  • Reflect quarterly: generate reports like “Projects by Skill” or “Learning-to-Skill mapping.” Use these to prepare performance reviews or update resumes.
  • Integrate with your resume and profiles: export a tailored resume from the PKG, selecting sections by skill clusters (Frontend, Backend, Cloud, Data).
  • Visualize relationships: simple graph visualization reveals clusters and gaps. Tools like Graphviz or small browser-based viewers can render your PKG.
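
For the visualization step, one lightweight option is to emit Graphviz DOT text from your edges. This is a sketch under simple assumptions: edges are (source, relation, target) string triples, names contain no double quotes, and the `to_dot` helper is hypothetical.

```python
def to_dot(edges):
    """Render (source, relation, target) triples as a Graphviz digraph."""
    lines = ["digraph pkg {", "  rankdir=LR;"]
    for source, relation, target in edges:
        lines.append(f'  "{source}" -> "{target}" [label="{relation}"];')
    lines.append("}")
    return "\n".join(lines)

edges = [
    ("You", "has", "TypeScript"),
    ("E-commerce Checkout Service", "uses", "TypeScript"),
    ("Designing Data-Intensive Applications", "teaches", "System Design"),
]

dot = to_dot(edges)
print(dot)  # save as pkg.dot, then render with: dot -Tsvg pkg.dot -o pkg.svg
```

Skills with many incoming edges show up as hubs in the rendered graph, while skills with none are candidate gaps.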

Example query goals:

  • What projects demonstrate end-to-end expertise in distributed systems?
  • Which skills have the strongest supporting evidence (PRs, talks, docs)?
  • Which learning items most effectively bridge the gap between current and target roles?
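
Two of these goals reduce to small set and counting operations. The sketch below uses hypothetical in-memory sample data (the "Conference talk" item and the target-role skill set are invented for illustration); in practice the rows would come from your store.

```python
from collections import Counter

# Hypothetical sample data.
evidence_supports = [  # (evidence, skill)
    ("GitHub PR #123", "TypeScript"),
    ("Design doc for Checkout", "System Design"),
    ("Conference talk", "System Design"),
]
learning_teaches = [  # (learning item, skill)
    ("Designing Data-Intensive Applications", "Distributed Systems"),
    ("Designing Data-Intensive Applications", "System Design"),
]
my_skills = {"TypeScript", "System Design"}
target_role_skills = {"TypeScript", "System Design", "Distributed Systems"}

# Rank skills by how many evidence items support them.
evidence_per_skill = Counter(skill for _, skill in evidence_supports)
ranked = evidence_per_skill.most_common()

# Learning items that teach a skill the target role needs but you lack.
gaps = target_role_skills - my_skills
bridging = sorted({item for item, skill in learning_teaches if skill in gaps})

print(ranked)
print(bridging)
```

Counting evidence is only a rough proxy for strength; weighting by evidence type (a talk versus a small PR) would be a natural refinement.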

6. Practical tooling options

  • Lightweight: JSON-LD or a local SQLite DB

    • Pros: portable, human-readable, easy to version with Git
    • Cons: limited tooling for complex queries
  • Graph databases:

    • Neo4j: powerful query language (Cypher), good for multi-hop relationships
    • Dgraph or ArangoDB: scalable graph options
  • Hybrid approach:

    • Store core entities in SQLite, export graphs to Graphviz or d3.js for visualization
    • Use a small REST or GraphQL API to query the PKG from your development environment
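
For the hybrid approach, a browser-based viewer typically wants nodes and links as JSON. The sketch below assumes edges are available as (source, relation, target) string triples; the `to_node_link` helper and the `label` key are hypothetical, and the `id`/`source`/`target` keys follow the convention d3-force can consume when its link force is configured to look nodes up by `id`.

```python
import json

def to_node_link(edges):
    """Convert (source, relation, target) triples to a nodes/links dict."""
    names = sorted({name for s, _, t in edges for name in (s, t)})
    return {
        "nodes": [{"id": name} for name in names],
        "links": [{"source": s, "target": t, "label": r} for s, r, t in edges],
    }

edges = [
    ("You", "has", "TypeScript"),
    ("E-commerce Checkout Service", "uses", "TypeScript"),
]

graph = to_node_link(edges)
print(json.dumps(graph, indent=2))
```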

7. Extending the model: when to stop and what to add next

  • Add motivation and impact: attach a short impact narrative to Projects or Experiences (outcomes, metrics).
  • Introduce certifications and awards as a separate entity with dates and linking to Skills.
  • Make provenance explicit: record who reviewed or approved your work (mentor, manager) for credibility.
  • Add sentiment or confidence scores: a lightweight scale that helps you prioritize learning.
  • Integrate with external tools: link your PKG to GitHub repos, LinkedIn sections, or a personal website.

As you grow, keep the model focused. The more you add, the more you need to maintain. Schedule regular pruning sessions to retire outdated projects or reclassify skills.
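
Confidence scores and pruning can both stay very small. The following sketch assumes a 1 to 5 self-assessment scale and a five-year staleness cutoff; both are arbitrary choices, and `SkillRating`, `learning_priorities`, `stale_projects`, and the "Legacy Billing Tool" project are hypothetical names.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class SkillRating:
    name: str
    confidence: int  # self-assessed, 1 (novice) to 5 (expert)
    importance: int  # relevance to your target role, 1 to 5

def learning_priorities(ratings):
    """Sort so that important but low-confidence skills come first."""
    return sorted(ratings, key=lambda r: (r.confidence - r.importance, r.name))

def stale_projects(projects, today, max_age_days=5 * 365):
    """Names of projects whose end date is more than max_age_days before today."""
    cutoff = today - timedelta(days=max_age_days)
    return [name for name, end in projects if end is not None and end < cutoff]

ratings = [
    SkillRating("TypeScript", confidence=4, importance=4),
    SkillRating("Distributed Systems", confidence=2, importance=5),
    SkillRating("Testing", confidence=3, importance=3),
]
priorities = [r.name for r in learning_priorities(ratings)]

projects = [
    ("Legacy Billing Tool", date(2016, 3, 1)),  # ended long ago
    ("E-commerce Checkout Service", None),      # still ongoing
]
stale = stale_projects(projects, today=date(2025, 1, 1))

print(priorities)
print(stale)
```

Flagging stale projects for review, rather than deleting them automatically, keeps the pruning decision with you.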

8. Example skeleton: a starter JSON-LD representation

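If you prefer a portable, standards-based format, here is a compact JSON-LD skeleton you can adapt. Treat it as a starting point rather than validated schema.org markup: Skill and domain are custom terms, schema.org defines knows as a Person-to-Person link (knowsAbout is the property for topics), and hasPart and mentions are defined for CreativeWork rather than Person.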

{
  "@context": "https://schema.org",
  "@type": "Person",
  "name": "Alex Engineer",
  "url": "https://alex.example.com",
  "knows": [
    {
      "@type": "Skill",
      "name": "TypeScript",
      "domain": "Frontend"
    },
    {
      "@type": "Skill",
      "name": "Distributed Systems",
      "domain": "Backend"
    }
  ],
  "hasPart": [
    {
      "@type": "CreativeWork",
      "name": "E-commerce Checkout Service",
      "startDate": "2023-06",
      "description": "High-availability checkout service with idempotent operations.",
      "keywords": ["TypeScript", "Node.js", "Cloud"]
    }
  ],
  "mentions": [
    {
      "@type": "CreativeWork",
      "name": "Design Doc: Checkout Architecture",
      "author": "Alex Engineer",
      "url": "https://repos.example.com/design-doc"
    }
  ]
}

This structure is easy to version, parse, and export to your CV or portfolio pages.
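
As a sketch of the export step, the snippet below parses an abbreviated copy of the skeleton (same keys, fewer fields) and renders a Markdown CV section filtered by skill domain. The `render_cv` helper and the Markdown layout are assumptions, not a fixed format.

```python
import json

# Abbreviated copy of the skeleton above, using the same keys.
PKG_JSON = """
{
  "@type": "Person",
  "name": "Alex Engineer",
  "knows": [
    {"@type": "Skill", "name": "TypeScript", "domain": "Frontend"},
    {"@type": "Skill", "name": "Distributed Systems", "domain": "Backend"}
  ],
  "hasPart": [
    {"@type": "CreativeWork", "name": "E-commerce Checkout Service",
     "keywords": ["TypeScript", "Node.js", "Cloud"]}
  ]
}
"""

def render_cv(pkg, domain=None):
    """Return Markdown lines for a CV section, optionally filtered by skill domain."""
    skills = [s["name"] for s in pkg.get("knows", [])
              if domain is None or s.get("domain") == domain]
    lines = [f"# {pkg['name']}", "", "## Skills"]
    lines += [f"- {name}" for name in skills]
    lines += ["", "## Projects"]
    for work in pkg.get("hasPart", []):
        lines.append(f"- {work['name']} ({', '.join(work.get('keywords', []))})")
    return lines

pkg = json.loads(PKG_JSON)
cv = render_cv(pkg, domain="Backend")
print("\n".join(cv))
```

Because the file is plain JSON, the same data can feed several templates (resume, portfolio page) without a database.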

9. Quick-start checklist

  • [ ] Define your PKG scope: decide which skills and domains matter for your current and target roles.
  • [ ] Pick a storage approach: SQLite with SQLAlchemy for a first build, or a JSON-LD file for portability.
  • [ ] Build the core entities and relationships (Person, Skill, Project, Experience, Learning, Evidence).
  • [ ] Seed your PKG with at least 3 projects, 5 skills, and 3 learning items.
  • [ ] Create simple queries to surface your strengths and gaps.
  • [ ] Create export templates for resumes, LinkedIn, and portfolio pages.
  • [ ] Schedule quarterly reviews to prune, add, and refine.

-

Rizwan Saleem | https://rizwansaleem.co
