undefined | Dark Hacker News

1 points by kalturnbull 114 days ago

kalturnbull 114 days ago |

Most organisational context (why a decision was made, what depends on what, who owns which risk) lives in people's heads or is scattered across docs and chat threads. I've been building a tool that tries to make this context machine-readable by modelling it as a typed, constrained graph. Wanted to share the data modelling decisions and get feedback from people who work with schemas.

The modelling problem

You need to represent fundamentally different concepts in a single graph. The obvious approaches each break in their own way:

- Separate tables per type: graph traversal becomes a JOIN/UNION mess and your edge table needs polymorphic foreign keys

- Fully generic nodes: no semantic guardrails: anything can connect to anything, and an AI operating on the graph will happily propose nonsensical relationships

- PG table inheritance: poor tooling, painful migrations, query planning quirks

What I landed on

Single nodes table with a polymorphic type column (I called them "archetypes"), a JSONB components column for type-specific data, and a node_edges table with typed, directional edges. Built on Postgres via Supabase, no native graph DB.

14 node archetypes: task, project, goal, milestone, issue, risk, assumption, decision, constraint, resource, system, person, team, event.

11 verb-based edge types: contains, precedes, achieves, impacts, mitigates, defines, involves, evidences, uses, evaluates, relates.

I started with noun-based edges (hierarchy, dependency, regulation) and migrated to verbs because nouns are ambiguous about direction. "Is dependency the thing that depends, or the thing depended on?" With precedes, there's no question.

The constraint matrix

Not all connections are valid. A 143-entry matrix defines which archetype pairs can connect via which edge type and direction. A risk can impact a project, but a project can't impact a risk. A task can achieve a goal, but can't contain one. This is enforced at the application layer, not the DB.

An AI agent proposes graph mutations (create nodes, add edges), and its output is treated as untrusted input. Every proposed mutation hits pre-flight validation (edge physics, component schemas, duplicate detection, circular dependency checks) before the user sees it. The principle: if a quality rule matters, enforce it in code, not in a prompt.

Trade-offs I'm currently living with

- Constraint matrix is application-layer only: direct SQL can create invalid edges

- 14 archetypes is already high; each new one touches ~30 files across enums, rules, prompts, and tests

- JSONB components trade DB-level type safety for schema flexibility

- Closed 11-type edge vocabulary covers ~95% of relationships; the rest gets shoehorned into relates

- Graph-on-Postgres works at current scale but I have no idea where it stops working

What I'd like feedback on

Has anyone moved graph topology constraints from app layer into the DB (triggers, check constraints, etc.) and found it worth the complexity?

14 node types in a closed vocabulary: too many, too few, or does the right number depend entirely on the domain?

The link is a demo that loads sample projects if you want to see the model in action, no signup needed.