I Built a JSON Schema for Complex Relationship Data
I've been working on a database with 592+ interconnected entries, and I wanted to share how I structured the JSON schema to handle complex relationships efficiently.
The Data Problem
Relationship data has multiple dimensions:
- Entity A + Entity B → Severity level
- Evidence type (clinical, case study, theoretical)
- Mechanism of action
- Reference citations
A flat list doesn't scale. Here's the schema I ended up with.
Schema Design
{
  "interaction_id": "relation_001",
  "entity_a": {
    "name": "Sample Entity",
    "code": "ABC"
  },
  "entity_b": {
    "name": "Related Entity",
    "class": "type",
    "code": "XYZ"
  },
  "relationship": {
    "severity": "major",
    "mechanism": "enzyme_inhibition",
    "evidence_level": "clinical_trial",
    "description": "Affects the interaction"
  },
  "references": [
    {
      "id": "12345",
      "title": "Reference study",
      "year": 2020
    }
  ]
}
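To make the shape of a record concrete, here is a minimal validation sketch in Python. The list of required keys is an assumption read off the sample record above (for example, `class` is treated as optional because only `entity_b` carries it), and `validate_record` is a hypothetical helper, not part of the published schema.

```python
import json

# Assumed required keys, inferred from the sample record; the real schema may differ.
REQUIRED_TOP_LEVEL = ("interaction_id", "entity_a", "entity_b", "relationship", "references")
REQUIRED_ENTITY = ("name", "code")
REQUIRED_RELATIONSHIP = ("severity", "mechanism", "evidence_level", "description")


def validate_record(record: dict) -> list[str]:
    """Return a list of problems found in one record (an empty list means none found)."""
    problems = []
    for key in REQUIRED_TOP_LEVEL:
        if key not in record:
            problems.append(f"missing top-level key: {key}")
    for side in ("entity_a", "entity_b"):
        entity = record.get(side)
        if isinstance(entity, dict):
            for key in REQUIRED_ENTITY:
                if key not in entity:
                    problems.append(f"{side} is missing {key}")
    relationship = record.get("relationship")
    if isinstance(relationship, dict):
        for key in REQUIRED_RELATIONSHIP:
            if key not in relationship:
                problems.append(f"relationship is missing {key}")
    if "references" in record and not isinstance(record["references"], list):
        problems.append("references must be a list")
    return problems


sample = json.loads("""
{
  "interaction_id": "relation_001",
  "entity_a": {"name": "Sample Entity", "code": "ABC"},
  "entity_b": {"name": "Related Entity", "class": "type", "code": "XYZ"},
  "relationship": {
    "severity": "major",
    "mechanism": "enzyme_inhibition",
    "evidence_level": "clinical_trial",
    "description": "Affects the interaction"
  },
  "references": [{"id": "12345", "title": "Reference study", "year": 2020}]
}
""")

print(validate_record(sample))
```

A check like this could run once when the file is loaded, so malformed entries surface before any query touches them.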
Key Design Decisions
1. Separate Entities
Each entity and relationship is a separate object. This allows:
- Querying by entity: "What does this connect to?"
- Querying by relationship type
- Querying by severity
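The queries listed above can be sketched as small in-memory indexes. This is one possible approach, not necessarily how the project does it: `build_indexes` and `connected_codes` are hypothetical helper names, and the second record (`relation_002`, code `DEF`) is made-up sample data added only so the lookups have more than one result.

```python
from collections import defaultdict

# Trimmed-down records; only the fields the indexes need are included.
records = [
    {
        "interaction_id": "relation_001",
        "entity_a": {"name": "Sample Entity", "code": "ABC"},
        "entity_b": {"name": "Related Entity", "code": "XYZ"},
        "relationship": {"severity": "major", "mechanism": "enzyme_inhibition"},
    },
    {
        # Hypothetical second record, for illustration only.
        "interaction_id": "relation_002",
        "entity_a": {"name": "Sample Entity", "code": "ABC"},
        "entity_b": {"name": "Another Entity", "code": "DEF"},
        "relationship": {"severity": "minor", "mechanism": "enzyme_inhibition"},
    },
]


def build_indexes(records):
    """Group records by entity code (either side) and by severity."""
    by_entity = defaultdict(list)
    by_severity = defaultdict(list)
    for record in records:
        by_entity[record["entity_a"]["code"]].append(record)
        by_entity[record["entity_b"]["code"]].append(record)
        by_severity[record["relationship"]["severity"]].append(record)
    return by_entity, by_severity


def connected_codes(by_entity, code):
    """Return the sorted codes of every entity that shares a record with `code`."""
    others = set()
    for record in by_entity.get(code, []):
        for side in ("entity_a", "entity_b"):
            other = record[side]["code"]
            if other != code:
                others.add(other)
    return sorted(others)


by_entity, by_severity = build_indexes(records)
print(connected_codes(by_entity, "ABC"))
print([r["interaction_id"] for r in by_severity["major"]])
```

An index keyed on `mechanism` would follow the same pattern, if that is what "relationship type" refers to.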
2. Evidence Levels
I used a simple enum:
- clinical_trial: Controlled studies
- case_report: Individual case studies
- theoretical: Predictions
3. Severity Codes
- major: Avoid combination
- moderate: Monitor or adjust
- minor: Informational
- unknown: No data
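If the data is consumed from Python, the evidence levels and severity codes could be mirrored as enums so that a typo such as "sever" fails loudly instead of slipping through. The class names and `parse_relationship` below are assumptions for illustration; only the string values come from the lists above.

```python
from enum import Enum


class EvidenceLevel(str, Enum):
    CLINICAL_TRIAL = "clinical_trial"  # controlled studies
    CASE_REPORT = "case_report"        # individual case studies
    THEORETICAL = "theoretical"        # predictions


class Severity(str, Enum):
    MAJOR = "major"        # avoid combination
    MODERATE = "moderate"  # monitor or adjust
    MINOR = "minor"        # informational
    UNKNOWN = "unknown"    # no data


def parse_relationship(relationship: dict) -> tuple[Severity, EvidenceLevel]:
    """Convert the raw strings to enum members; raises ValueError on unknown values."""
    return (
        Severity(relationship["severity"]),
        EvidenceLevel(relationship["evidence_level"]),
    )


severity, evidence = parse_relationship(
    {"severity": "major", "evidence_level": "clinical_trial"}
)
print(severity.name, evidence.name)
```

Subclassing `str` keeps the members JSON-serializable as their plain string values, so the file format does not have to change.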
4. Reference Linking
Each relationship has a references array. This makes the data verifiable.
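A quick way to keep that property honest is to flag any record whose `references` array is missing or empty. The helper name and the second record below are hypothetical.

```python
def records_without_references(records):
    """Return the interaction_id of every record with no references attached."""
    return [r["interaction_id"] for r in records if not r.get("references")]


records = [
    {
        "interaction_id": "relation_001",
        "references": [{"id": "12345", "title": "Reference study", "year": 2020}],
    },
    # Hypothetical record with an empty references array.
    {"interaction_id": "relation_002", "references": []},
]

print(records_without_references(records))
```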
Implementation Notes
I store the relationships in a single JSON file for simplicity. For a production app, I'd normalize this into a proper database.
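As a rough idea of what that normalization might look like, here is a sketch using Python's built-in `sqlite3` module. The table and column names (`entity`, `interaction`, `citation`) are assumptions chosen to mirror the JSON keys; a real production schema would likely add indexes and constraints.

```python
import sqlite3

# Assumed table layout: one row per entity, per relationship, and per reference.
SCHEMA = """
CREATE TABLE entity (
    code  TEXT PRIMARY KEY,
    name  TEXT NOT NULL,
    class TEXT
);
CREATE TABLE interaction (
    interaction_id TEXT PRIMARY KEY,
    entity_a       TEXT NOT NULL REFERENCES entity(code),
    entity_b       TEXT NOT NULL REFERENCES entity(code),
    severity       TEXT NOT NULL,
    mechanism      TEXT,
    evidence_level TEXT,
    description    TEXT
);
CREATE TABLE citation (
    interaction_id TEXT NOT NULL REFERENCES interaction(interaction_id),
    ref_id         TEXT NOT NULL,
    title          TEXT,
    year           INTEGER
);
"""


def load(conn, records):
    """Create the tables and insert each JSON record across them."""
    conn.executescript(SCHEMA)
    for r in records:
        for side in ("entity_a", "entity_b"):
            e = r[side]
            # The same entity appears in many records, so ignore repeat inserts.
            conn.execute(
                "INSERT OR IGNORE INTO entity (code, name, class) VALUES (?, ?, ?)",
                (e["code"], e["name"], e.get("class")),
            )
        rel = r["relationship"]
        conn.execute(
            "INSERT INTO interaction VALUES (?, ?, ?, ?, ?, ?, ?)",
            (r["interaction_id"], r["entity_a"]["code"], r["entity_b"]["code"],
             rel["severity"], rel["mechanism"], rel["evidence_level"], rel["description"]),
        )
        for ref in r["references"]:
            conn.execute(
                "INSERT INTO citation VALUES (?, ?, ?, ?)",
                (r["interaction_id"], ref["id"], ref["title"], ref["year"]),
            )
    conn.commit()


records = [{
    "interaction_id": "relation_001",
    "entity_a": {"name": "Sample Entity", "code": "ABC"},
    "entity_b": {"name": "Related Entity", "class": "type", "code": "XYZ"},
    "relationship": {
        "severity": "major",
        "mechanism": "enzyme_inhibition",
        "evidence_level": "clinical_trial",
        "description": "Affects the interaction",
    },
    "references": [{"id": "12345", "title": "Reference study", "year": 2020}],
}]

conn = sqlite3.connect(":memory:")
load(conn, records)
rows = conn.execute(
    "SELECT entity_b FROM interaction WHERE entity_a = ? AND severity = ?",
    ("ABC", "major"),
).fetchall()
print(rows)
```

Splitting entities into their own table means a name or class is stored once, rather than repeated in every record that mentions it.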
Open Source
The schema is available as open source. Feel free to fork, extend, or use it in your projects.
What database schemas have you designed for complex relationship data?