Modern graph databases often represent dynamic systems: applications evolving over time, relationships appearing and disappearing, and entities acquiring new attributes as data changes.
When the underlying graph is user-facing, maintaining a complete history of nodes and relationships becomes a critical capability.
This article presents a production-grade, bitemporal versioning model for Neo4j, supporting:
- Accurate historical reconstruction
- Time-travel queries
- Temporal relationship tracking
- Efficient ingestion
- Minimal impact on existing “current” queries
The approach is designed for high-read systems where graph state changes incrementally and users must view data at any point in time.
1. Design Goals
A temporal graph versioning system must satisfy the following constraints:
1.1 Minimal disruption to existing queries
Everyday queries (fetching the “current” graph) must remain simple:
MATCH (n) WHERE NOT n:Deleted
MATCH ()-[r]->() WHERE r.Status = "Active"
No complex temporal logic in the majority of queries.
1.2 Complete bitemporal representation
Every node or relationship must encode:
StartDate — when it became valid
EndDate — when it stopped being valid (NULL = current)
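This enables time-travel queries and historical reconstruction. Note that StartDate and EndDate together describe a single validity interval; lastUpdated records only when the pipeline last saw the entity, so retroactive corrections are not tracked on a separate transaction-time axis.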
1.3 Deterministic version merging
Each node and relationship must have a stable primary key so the ingestion pipeline can decide:
- Should this entity be created?
- Should it be updated?
- Should old versions be closed?
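The three questions above can be sketched as a small decision function. This is a minimal illustration, not the article's pipeline: the `Action` type, the `fingerprint` helper, and the use of string-valued property maps are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Action is the outcome of comparing an incoming record with the stored
// current version (hypothetical type, introduced for this sketch).
type Action int

const (
	Create     Action = iota // no open version exists for this Id
	Touch                    // unchanged: only refresh lastUpdated
	NewVersion               // attributes differ: close the old version, create a new one
)

// fingerprint renders properties in sorted key order so that equal maps
// always produce the same string, regardless of map iteration order.
func fingerprint(props map[string]string) string {
	keys := make([]string, 0, len(props))
	for k := range props {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	var sb strings.Builder
	for _, k := range keys {
		fmt.Fprintf(&sb, "%q=%q;", k, props[k])
	}
	return sb.String()
}

// decide picks the ingestion action. current is nil when the entity has
// no open version in the graph.
func decide(current, incoming map[string]string) Action {
	if current == nil {
		return Create
	}
	if fingerprint(current) == fingerprint(incoming) {
		return Touch
	}
	return NewVersion
}
```

Because the fingerprint is order-independent, the same input always leads to the same action, which is the property "deterministic merging" depends on.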
1.4 Efficient deletion detection
We cannot “blindly” delete nodes. Instead, the pipeline must:
- Mark entities touched in this ingestion cycle (via lastUpdated)
- Infer deletions by comparing against the process date
1.5 Neo4j MERGE limitations must be respected
Neo4j does not support:
MERGE (a)-[r:LINK {EndDate: NULL}]->(b)
This is why relationships use a Status property rather than attempting NULL-based merges.
2. Data Model
2.1 Versioned Nodes
Each logical entity is represented as multiple immutable node versions:
(:Entity {
Id: "E123",
StartDate: datetime("2024-01-10T00:00:00Z"),
EndDate: null,
lastUpdated: datetime("2024-12-01T10:00:00Z")
})
When a node becomes invalid:
- EndDate is set
- :Deleted label is added
ASCII Diagram
+------------------+       +------------------+
| Entity (v1)      | ----> | Entity (v2)      |
| Id: E123         |       | Id: E123         |
| Start: T1        |       | Start: T2        |
| End: T2          |       | End: null        |
| Label: Deleted   |       | Label: <none>    |
+------------------+       +------------------+
2.2 Versioned Relationships
Like nodes, relationships also maintain temporal state:
(a)-[:LINK {
Id: "R987",
StartDate: datetime("2024-01-10T00:00:00Z"),
EndDate: null,
Status: "Active",
lastUpdated: datetime("2024-12-01T10:00:00Z")
}]->(b)
Why we need Status
Neo4j cannot MERGE on EndDate = NULL, so we use:
- Status = "Active"
- Status = "Deleted"
This provides a safe, deterministic merge target.
3. Ingestion Architecture (Multi-Phase)
The ingestion pipeline runs in three phases, in order, so that versioning stays consistent.
+-----------------------------------------------------+
|                 Ingestion Pipeline                  |
+-----------------------------------------------------+
|                                                     |
| Phase 1: Nodes    → Create or update nodes          |
| Phase 2: Links    → Create or update relationships  |
| Phase 3: Clean-up → Close missing versions          |
|                                                     |
+-----------------------------------------------------+
3.1 Phase 1 — Node Ingestion
For each incoming node:
- MERGE by Id
- If the node exists and its attributes differ → close the old version, create a new one
- Update lastUpdated = processDate
Cypher (simplified)
MERGE (n:Entity {Id: $id})
ON MATCH SET
n.lastUpdated = $processDate
ON CREATE SET
n.StartDate = $processDate,
n.lastUpdated = $processDate
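When attributes have changed, the ingestion process:
- Sets EndDate on the previous version
- Adds :Deleted to it
- Creates a fresh version
Because every version of an entity shares the same Id, the simplified MERGE above also matches closed versions; a production query has to restrict the match to the open version (the one without :Deleted).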
3.2 Phase 2 — Relationship Ingestion
For each incoming relationship:
MATCH (a:Entity {Id: $src})
MATCH (b:Entity {Id: $dst})
MERGE (a)-[r:LINK {Id: $id, Status: "Active"}]->(b)
ON MATCH SET
r.lastUpdated = $processDate
ON CREATE SET
r.StartDate = $processDate,
r.Status = "Active",
r.lastUpdated = $processDate
If a relationship changed (attribute changes), the pipeline must:
- Mark the old relationship as closed:
  r.EndDate = $processDate
  r.Status = "Deleted"
- Create a new version:
  StartDate = $processDate
  Status = "Active"
3.3 Phase 3 — Version Closure (Deletion Detection)
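After phases 1 and 2, the pipeline infers deletions. This only works if every write in the run used the exact same $processDate value.
Any open node whose lastUpdated differs from $processDate was absent from this run and is no longer valid: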
MATCH (n:Entity)
WHERE n.lastUpdated <> $processDate AND NOT n:Deleted
SET n.EndDate = $processDate, n:Deleted
Same for relationships:
MATCH ()-[r:LINK]->()
WHERE r.lastUpdated <> $processDate AND r.Status = "Active"
SET r.EndDate = $processDate, r.Status = "Deleted"
This allows ingestion to determine “missing = deleted” without manual intervention.
4. Querying the Current Graph
This versioning design keeps “current state” queries simple:
Nodes
MATCH (n:Entity)
WHERE NOT n:Deleted
RETURN n
Relationships
MATCH (a)-[r:LINK]->(b)
WHERE r.Status = "Active"
RETURN a, r, b
Minimal logic.
High performance.
Clean integration with UI/API.
5. Querying Historical Snapshots
To reconstruct graph state for a given timestamp T:
Nodes
MATCH (n:Entity)
WHERE n.StartDate <= $T AND (n.EndDate IS NULL OR n.EndDate > $T)
RETURN n
Relationships
MATCH (a)-[r:LINK]->(b)
WHERE r.StartDate <= $T AND (r.EndDate IS NULL OR r.EndDate > $T)
RETURN a, r, b
This produces an accurate, complete view of the graph at time T.
6. Go + Neo4j Driver Pseudo-code
Below is idiomatic Go pseudocode demonstrating versioned ingestion logic.
6.1 Creating/Updating a Node
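// ingestNode upserts one node inside a write transaction.
// Assumes a package-level `driver neo4j.Driver` created elsewhere.
// Returns the error instead of exiting, so one bad row does not kill the process.
func ingestNode(id string, props map[string]interface{}, processDate time.Time) error {
	session := driver.NewSession(neo4j.SessionConfig{AccessMode: neo4j.AccessModeWrite})
	defer session.Close()

	_, err := session.WriteTransaction(func(tx neo4j.Transaction) (interface{}, error) {
		params := map[string]interface{}{
			"id":          id,
			"processDate": processDate,
			"props":       props,
		}
		query := `
			MERGE (n:Entity {Id: $id})
			ON MATCH SET
				n.lastUpdated = $processDate
			ON CREATE SET
				n.StartDate = $processDate,
				n.lastUpdated = $processDate,
				n += $props
		`
		result, err := tx.Run(query, params)
		if err != nil {
			return nil, err
		}
		// Consume the result before the transaction function returns.
		return result.Consume()
	})
	return err
}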
6.2 Closing Stale Nodes
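// closeStaleNodes ends every open node that was not touched in this run.
// Assumes the same package-level `driver neo4j.Driver` as ingestNode.
func closeStaleNodes(processDate time.Time) error {
	session := driver.NewSession(neo4j.SessionConfig{AccessMode: neo4j.AccessModeWrite})
	defer session.Close()

	result, err := session.Run(`
		MATCH (n:Entity)
		WHERE n.lastUpdated <> $processDate AND NOT n:Deleted
		SET n.EndDate = $processDate, n:Deleted
	`, map[string]interface{}{
		"processDate": processDate,
	})
	if err != nil {
		return err
	}
	// Consume the result so that execution errors surface here.
	_, err = result.Consume()
	return err
}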
7. Common Pitfalls & How This Model Solves Them
7.1 MERGE cannot match on NULL
Many developers attempt:
MERGE (a)-[r:LINK {EndDate: NULL}]->(b)
Neo4j rejects this query with an error, because MERGE cannot use a null property value.
Solution:
Use Status for deterministic relationship merging.
7.2 Avoid overwriting nodes
You never update older versions.
Instead:
- Close old version (EndDate, :Deleted)
- Create new version
This preserves full history.
7.3 Efficient current-state filtering
Instead of comparing timestamps, we rely on:
- NOT n:Deleted
- r.Status = "Active"
Both are cheap checks: a label test on the node and an equality test on a single relationship property that can be indexed.
8. Performance Considerations
Indexes
You should index:
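Node: Entity(Id)
Rel: LINK(Id)
Rel: LINK(Status)
:Deleted is a label rather than a property, so it needs no property index; Neo4j already tracks label membership. Historical snapshot queries may also benefit from indexes on StartDate and EndDate.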
Batching
Batching ingestion improves performance substantially.
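One common way to batch is to split the incoming rows into fixed-size chunks and send each chunk as a single list parameter, for example to a statement that starts with `UNWIND $rows AS row`. The helper below is a minimal sketch of the chunking step only; the `chunk` function and the `$rows` parameter name are assumptions for illustration, and a suitable batch size has to be found by measurement.

```go
package main

// chunk splits rows into consecutive batches of at most size elements.
// The batches share the backing array of rows; nothing is copied.
func chunk[T any](rows []T, size int) [][]T {
	if size <= 0 {
		return nil // a non-positive size cannot produce batches
	}
	var batches [][]T
	for start := 0; start < len(rows); start += size {
		end := start + size
		if end > len(rows) {
			end = len(rows)
		}
		batches = append(batches, rows[start:end])
	}
	return batches
}
```

Each batch would then be written in its own transaction, which keeps transaction size bounded while avoiding one round trip per row.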
Avoiding deep history scans
Historical reconstruction always uses date filtering, not traversal of version chains.
9. Summary of the Model
Nodes:
- Id
- StartDate
- EndDate
- lastUpdated
- :Deleted label
Relationships:
- Id
- StartDate
- EndDate
- Status ("Active"/"Deleted")
- lastUpdated
Ingestion phases:
1. Node ingest
2. Relationship ingest
3. Close stale versions
Key benefits:
- Clean, fast “current” queries
- Complete historical accuracy
- Deterministic version merging
- No risk of MERGE-on-NULL issues
- Scales with batched ingestion and indexed lookups
10. Conclusion
Temporal versioning in Neo4j is not just a schema change—it is an architectural decision that affects ingestion pipelines, storage models, and query semantics.
The strategy described above enables:
- Efficient ingestion without overwriting data
- Simple current-state queries
- Accurate time-travel analysis
- Clean separation of active vs. historical data
- A scalable, deterministic versioning model
This design supports both high-performance applications and advanced tooling such as diffing, history exploration, and lineage tracking.
If you are building any graph system where state changes matter, this approach provides a strong, production-grade foundation for temporal graph modeling.
Top comments (2)
Can you add some pseudo code how you handle finding and closing previous relations?
Great question.
The key idea is that relationships are treated as immutable history records.
Whenever a new relationship version arrives:
- Find the currently active relationship
- Close it by setting valid_to
- Create the new relationship with a fresh valid_from
- Keep only one “active” edge at a time
Pseudo flow:
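function upsertRelationship(source, target, type, newProperties, ts):
    active = find the relationship of this type from source to target where valid_to IS NULL
    if active exists:
        set active.valid_to = ts and active.status = 'HISTORICAL'
    create a relationship of this type from source to target
        with newProperties, valid_from = ts, valid_to = NULL, status = 'ACTIVE'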
In Cypher, the “close previous relation” step typically looks like:
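// Anchor both endpoints (assuming the article's Entity label and Id key)
// so that only this pair's active edge is closed
MATCH (a:Entity {Id: $src})-[r:DEPENDS_ON]->(b:Entity {Id: $dst})
WHERE r.valid_to IS NULL
SET r.valid_to = $timestamp,
r.status = 'HISTORICAL'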
Then the new edge is inserted:
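// Match the current endpoints first; an unbound (a) or (b) in CREATE would create new nodes
MATCH (a:Entity {Id: $src}), (b:Entity {Id: $dst})
WHERE NOT a:Deleted AND NOT b:Deleted
CREATE (a)-[:DEPENDS_ON {
valid_from: $timestamp,
valid_to: NULL,
status: 'ACTIVE'
}]->(b)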
This guarantees:
- only one active relationship exists
- full historical reconstruction remains possible
- historical queries become simple interval filters
Example historical query:
MATCH (a)-[r]->(b)
WHERE r.valid_from <= $time
AND (r.valid_to IS NULL OR r.valid_to > $time)
RETURN a, r, b
In production systems, this logic is usually wrapped inside ingestion middleware, transactional write services, or CDC/event-processing pipelines to guarantee atomicity and prevent overlapping active edges.