Graph Manifestation
Graph Manifestation is the resumable backend stage that turns completed raw graph extraction candidates into graph-node vectors and traversable Neo4j graph records.
It is not the same thing as extraction. Extraction creates trusted raw candidates. Manifestation proves which of those candidates have been embedded, written as graph nodes, and connected as graph relationships.
Why Graph Manifestation Exists
VySol needs a separate boundary between "the model extracted these candidates" and "the local graph systems can use these candidates." Qdrant node-vector writes and Neo4j graph writes can fail independently, so extraction completion cannot be treated as graph-storage completion.
Graph Manifestation exists to make that boundary explicit, resumable, and safe to inspect after interruption.
Who This Page Is For
This page is for contributors, power users, and AI coding agents that need to change graph-node embedding, Qdrant node-vector writes, Neo4j node or relationship writes, manifestation manifests, graph cleanup, or manifestation resume behavior.
What Graph Manifestation Owns
Graph Manifestation owns:
- per-book
graph_manifestation.jsonstate - graph-node embedding work for extracted node candidates
- Qdrant graph-node vector writes
- Neo4j extracted-node writes
- Neo4j extracted-relationship writes
- edge dependency tracking against manifested endpoint nodes
- chunk-scoped cleanup when raw graph candidates change
- manifestation warnings and result summaries
What Graph Manifestation Does Not Own
Graph Manifestation does not own:
- source-file ingestion
- text splitting
- chunk vector creation
- raw graph extraction prompts, parsing, or gleaning
- provider key scheduling policy
- model registry metadata
- cross-chunk entity resolution
- canonical entity merging
- Neo4j startup or credential creation
Normal Flow
Manifestation starts only after the book's graph extraction manifest is completed. The service loads the completed extraction manifest, then loads or creates the matching manifestation manifest for that book.
Each extracted node becomes one node state. VySol embeds the node text, writes the node vector to the Qdrant node-vector collection, then writes the corresponding ExtractedNode record to Neo4j. A node is only considered manifested after both the vector write and the Neo4j node write are confirmed.
Edges are not embedded. An edge waits until both endpoint nodes are manifested. When both endpoint nodes are ready, VySol writes the relationship to Neo4j as an extracted relationship.
The manifestation manifest records node embedding status, Neo4j node status, edge dependency status, retry counts, warnings, and summary counts so later runs can continue from saved state.
Inputs
Graph Manifestation receives:
- completed graph extraction manifests
- world metadata and the locked embedding profile
- existing manifestation manifests
- node embedding provider outcomes
- Qdrant node-vector store state
- Neo4j graph-writer outcomes
- provider key scheduler availability through the node embedder
Outputs
Graph Manifestation produces:
- graph-node vectors in Qdrant
ExtractedNoderecords in Neo4j- extracted relationship records in Neo4j
- per-book manifestation manifests
- book-level manifestation result summaries
- warning events when graph storage is unavailable
- chunk-scoped cleanup requests for stale graph outputs
Saved State And Resume Behavior
The manifestation manifest is separate from the extraction manifest because manifestation can be partially complete even when extraction is fully complete.
If node vectors are already saved, a later manifestation pass does not embed those nodes again. If node vectors are saved but Neo4j node writes are still pending, a later pass retries only the Neo4j side before edge writes can proceed.
If the existing manifestation manifest matches the same world, ingestion run, book, and raw candidate fingerprints, VySol preserves trusted node and edge state. If a chunk's raw candidate fingerprint changes, manifestation resets that chunk's graph outputs and rebuilds its candidates from pending state.
Retry, Pause, And Abort Behavior
Manifestation is safe to pause or resume because each write boundary saves state after confirmed progress.
Node embedding failures, Neo4j node write failures, and Neo4j edge write failures use retry counts inside the manifestation manifest. Failures that reach the retry limit are treated as failed for that pass, but later manifestation runs can reset those failed states and try again after credentials, provider availability, or Neo4j health changes.
Neo4j unavailability is handled differently from a rejected write. When Neo4j is missing, stopped, or unreachable, manifestation leaves graph writes pending and records a warning instead of turning completed extraction output into a hard failure.
Failure Behavior
Manifestation rejects missing or incomplete extraction manifests before starting vector or graph writes.
Qdrant node-vector write failure is blocking because the node cannot be safely marked embedded until the vector store confirms the write.
Neo4j unavailability leaves node or edge writes pending. Neo4j write rejection can mark node or edge states failed after retries, but it must not rewrite raw extraction state as failed extraction.
Corrupt manifestation manifests are reset and rebuilt from the completed extraction manifest, with a warning saved in the new manifestation state.
User-Facing Behavior
The current behavior is backend-facing. Users and future UI surfaces see manifestation through ingestion result summaries, warning events, and saved progress state rather than through a dedicated graph manifestation screen.
A book can report partial ingestion completion when extraction finished but graph manifestation is still pending or failed.
System Interactions
Graph Manifestation interacts with:
- Knowledge Graph Extraction Pipeline, which produces completed raw node and edge candidates
- World Storage, which stores per-book manifestation manifests beside ingestion outputs
- Vector Storage And Chunk Embeddings, which provides the locked embedding profile used for node vectors
- Qdrant Vector Store, which stores graph-node vectors in profile-specific node collections
- Neo4j Graph Store, which stores extracted nodes and relationships
- Provider Key Scheduler, which schedules node embedding calls through shared provider credentials
Internal Edge Cases
- The extraction manifest can be missing or incomplete, so manifestation must reject it before writes start.
- A manifestation manifest can be corrupt, so it is rebuilt from completed extraction output.
- A node vector can be written while its Neo4j node write remains pending.
- A node embedder can return neither a vector nor a failure for a node, so the node is marked with a missing-embedding failure.
- An edge can wait because one or both endpoint nodes are not manifested yet.
- An edge can fail by dependency when an endpoint node fails.
- A dependency-ready edge write can fail and remain retryable until its retry budget is exhausted for that pass.
- A chunk's raw candidates can change, so stale node vectors and Neo4j rows for that chunk must be cleaned before rebuilt candidates are trusted.
Cross-System Edge Cases
- Qdrant node vectors and Qdrant chunk vectors share local vector storage but must stay in separate profile-specific collections.
- Provider key cooldowns or missing eligible credentials can pause node embedding without corrupting extraction or chunk embedding state.
- Neo4j can be unavailable after extraction and node-vector work complete, so graph writes must remain pending without invalidating raw extraction.
- Full-world re-ingest must clean old chunk vectors, node vectors, Neo4j rows, and derived book outputs while preserving stored source copies.
- Future entity resolution must treat manifested records as extracted candidates, not canonical entities.
Implementation Landmarks
Graph manifestation orchestration, models, adapters, storage helpers, structured errors, and Neo4j writing live under backend/ingestion/graph_manifestation.
Node-vector persistence is implemented through the Qdrant node-vector helpers in backend/embeddings. Ingestion wires manifestation after graph extraction through the text-source ingestion service.
What AI/Coders Must Check Before Changing This System
Before changing Graph Manifestation, check:
- extraction manifest completion checks
- manifestation manifest reconciliation
- node embedding status transitions
- Neo4j node status transitions
- edge dependency transitions
- retry-count reset behavior across later passes
- chunk-scoped stale cleanup
- Qdrant node collection separation from chunk collections
- Neo4j unavailable versus Neo4j write-failed behavior
- ingestion's definition of a fully completed book pipeline
Invariants That Must Not Be Broken
- Raw extraction completion is not manifestation completion.
- Manifestation must only consume completed extraction manifests.
- A node is manifested only after both its node vector and Neo4j node write are confirmed.
- An edge must not be written before both endpoint nodes are manifested.
- Node vectors and chunk vectors must not share the same Qdrant collection contract.
- Cleanup must stay scoped to the world, ingestion run, book, and chunk boundary being rebuilt.
- Neo4j unavailability must leave graph writes resumable instead of corrupting extraction state.
- Manifestation must not perform canonical entity resolution.