Snapshots
Understanding snapshots and how they optimize event replay and improve system performance.
A snapshot is a point-in-time representation of an aggregate's state. Instead of replaying all events from the beginning of time, you can start from a snapshot and only replay events that occurred after the snapshot was taken.
Snapshots are a performance optimization technique that reduces the time and resources needed to reconstruct the current state of an entity. They're particularly valuable for aggregates with long event histories.
Example: If an Order has 1,000 events, replaying all of them every time you need the current state would be expensive. A snapshot taken at event 900 means you only need to replay the last 100 events.
Dramatically reduce the time needed to load an aggregate by starting from a recent snapshot instead of replaying thousands of events. Critical for aggregates with long histories.
Lower read latency means faster response times for your users. Commands can be processed more quickly when less time is spent reconstructing state.
Reduce CPU and memory usage by minimizing the number of events that need to be deserialized, processed, and applied when loading aggregate state.
Snapshots provide faster recovery in disaster scenarios. Restore system state quickly without processing the entire event history from the beginning.
1. Event Stream Without Snapshot
Must replay all 1,000 events every time.
2. Event Stream With Snapshot
Load snapshot, then replay only 100 events.
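The difference between the two streams can be illustrated with a small TypeScript sketch. The event and state shapes (`OrderEvent`, `OrderState`) and the `replay` helper are hypothetical names chosen for this example, not part of any particular library; the point is only that replaying the tail after a snapshot yields the same state as a full replay.

```typescript
// Hypothetical event and state shapes, for illustration only.
type OrderEvent = { version: number; type: "ItemAdded"; quantity: number };
type OrderState = { version: number; totalItems: number };

const emptyState: OrderState = { version: 0, totalItems: 0 };

// Apply a single event to the state (pure function).
function apply(state: OrderState, event: OrderEvent): OrderState {
  return { version: event.version, totalItems: state.totalItems + event.quantity };
}

// Rebuild state by folding events over a starting state.
function replay(start: OrderState, events: OrderEvent[]): OrderState {
  return events.reduce(apply, start);
}

// 1,000 events with versions 1..1000, each adding one item.
const events: OrderEvent[] = Array.from({ length: 1000 }, (_, i) => ({
  version: i + 1,
  type: "ItemAdded" as const,
  quantity: 1,
}));

// Without a snapshot: replay all 1,000 events.
const fullReplay = replay(emptyState, events);

// With a snapshot at version 900: replay only the events after it.
// Here the snapshot state is computed in place; a real system would read it from storage.
const snapshotState = replay(emptyState, events.slice(0, 900));
const tail = events.filter((e) => e.version > snapshotState.version);
const fromSnapshot = replay(snapshotState, tail);
```

Both paths end in the same state; the second applies only the 100 events in `tail`.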
A snapshot typically contains:
{
  "snapshotId": "snapshot-abc123",
  "aggregateId": "order-12345",
  "aggregateType": "Order",
  "version": 900,
  "timestamp": "2024-01-15T10:30:00Z",
  "state": {
    "orderId": "order-12345",
    "customerId": "customer-789",
    "status": "SHIPPED",
    "items": [
      {
        "productId": "product-456",
        "quantity": 2,
        "price": 29.99
      }
    ],
    "totalAmount": 59.98,
    "currency": "USD",
    "shippingAddress": { ... },
    "paymentStatus": "PAID"
  }
}
Event Count Threshold
Create a snapshot every N events (e.g., every 100 events). Simple and predictable.
if (eventCount % 100 === 0) createSnapshot()
Time-Based
Create snapshots at regular time intervals (e.g., daily). Good for predictable maintenance windows.
Take snapshot at midnight each day
On-Demand
Create snapshots when loading takes too long. Adaptive to actual performance needs.
if (loadTime > threshold) createSnapshot()
Hybrid
Combine multiple strategies. For example: every 100 events OR after 24 hours, whichever comes first.
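A hybrid policy like this could be sketched in TypeScript as a single decision function. Everything named here (`SnapshotPolicyInput`, `shouldSnapshot`, the two constants) is a hypothetical illustration. The sketch also makes two assumptions that go beyond the text: it counts events since the last snapshot instead of using a modulo check, so a batch append that skips over a multiple of 100 still triggers a snapshot, and it requires at least one new event before the time rule fires.

```typescript
// Hypothetical inputs to a snapshot decision; names are illustrative.
interface SnapshotPolicyInput {
  currentVersion: number;      // version of the latest event in the stream
  lastSnapshotVersion: number; // 0 if no snapshot exists yet
  lastSnapshotAtMs: number;    // epoch milliseconds of the last snapshot
  nowMs: number;               // current time in epoch milliseconds
}

const EVENT_THRESHOLD = 100;
const MAX_AGE_MS = 24 * 60 * 60 * 1000; // 24 hours

function shouldSnapshot(input: SnapshotPolicyInput): boolean {
  const eventsSince = input.currentVersion - input.lastSnapshotVersion;
  const ageMs = input.nowMs - input.lastSnapshotAtMs;
  // Assumption: an idle aggregate (no new events) is never re-snapshotted.
  const stale = ageMs >= MAX_AGE_MS && eventsSince > 0;
  return eventsSince >= EVENT_THRESHOLD || stale;
}
```

The compact form of the same rule follows.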
if (eventCount % 100 === 0 || timeSinceLastSnapshotMs > 24 * 60 * 60 * 1000) createSnapshot()
The process of loading aggregate state with snapshots:
1. Look for most recent snapshot
Query snapshot store for latest snapshot of the aggregate
2. Load snapshot state (if exists)
Deserialize snapshot and use it as starting state
3. Load events after snapshot
Query event store for events with version > snapshot version
4. Replay remaining events
Apply events to snapshot state to get current state
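The loading process above can be sketched in TypeScript as follows. This is a minimal sketch, assuming in-memory `Map` objects as stand-ins for the snapshot store and event store, and assuming each stream is already ordered by version. The names `CartEvent`, `CartState`, `CartSnapshot`, and `loadAggregate` are hypothetical.

```typescript
// Hypothetical shapes; a real system would define these per aggregate.
type CartEvent = { version: number; quantity: number };
type CartState = { totalItems: number };
type CartSnapshot = { aggregateId: string; version: number; state: CartState };

// In-memory stand-ins for a snapshot store and an event store.
const snapshotStore = new Map<string, CartSnapshot>();
const eventStore = new Map<string, CartEvent[]>();

function applyEvent(state: CartState, event: CartEvent): CartState {
  return { totalItems: state.totalItems + event.quantity };
}

function loadAggregate(
  aggregateId: string
): { state: CartState; version: number; replayed: number } {
  // Look for the most recent snapshot and, if one exists, start from its state.
  const snapshot = snapshotStore.get(aggregateId);
  let state: CartState = snapshot ? snapshot.state : { totalItems: 0 };
  const fromVersion = snapshot ? snapshot.version : 0;

  // Load only events with version > snapshot version.
  // The filter stands in for a range query against a real event store.
  const stream = eventStore.get(aggregateId) || [];
  const remaining = stream.filter((e) => e.version > fromVersion);

  // Replay the remaining events in order to reach the current state.
  let version = fromVersion;
  for (const e of remaining) {
    state = applyEvent(state, e);
    version = e.version;
  }
  return { state, version, replayed: remaining.length };
}
```

When no snapshot exists, `fromVersion` is 0 and the function falls back to a full replay, so the same code path serves both cases.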
- Keep events as source of truth: Snapshots are just an optimization, not a replacement
- Store version number: Track which event version the snapshot represents
- Make snapshots deletable: You should always be able to rebuild from events
- Don't snapshot everything: Only create snapshots for aggregates that need them
- Consider snapshot size: Large snapshots may not provide performance benefits
- Handle snapshot failures gracefully: Fall back to full event replay if snapshot is corrupt
- Version your snapshots: Handle schema changes in snapshot structure
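Two of these practices, falling back to full replay and versioning snapshots, can be combined in one read path. The TypeScript sketch below is illustrative only: `StoredSnapshot`, `schemaVersion`, `tryReadSnapshot`, and `loadAccount` are hypothetical names, and treating a snapshot with an outdated schema as missing is one possible choice (migrating it would be another).

```typescript
type AccountEvent = { version: number; amount: number };
type AccountState = { balance: number };

// Hypothetical stored snapshot: a schema version plus a serialized payload.
type StoredSnapshot = { schemaVersion: number; version: number; payload: string };

const CURRENT_SCHEMA_VERSION = 2;

// Returns the snapshot state and version, or null if the snapshot is unusable.
function tryReadSnapshot(
  stored: StoredSnapshot | undefined
): { state: AccountState; version: number } | null {
  // Missing snapshot or outdated schema: treat as absent (assumption of this sketch).
  if (!stored || stored.schemaVersion !== CURRENT_SCHEMA_VERSION) return null;
  try {
    const parsed: unknown = JSON.parse(stored.payload);
    const balance = (parsed as { balance?: unknown } | null)?.balance;
    if (typeof balance === "number") {
      return { state: { balance }, version: stored.version };
    }
    return null; // payload parsed but does not have the expected shape
  } catch {
    return null; // corrupt payload
  }
}

function loadAccount(stored: StoredSnapshot | undefined, events: AccountEvent[]): AccountState {
  const snap = tryReadSnapshot(stored);
  // With no usable snapshot, fromVersion is 0 and every event is replayed.
  let state: AccountState = snap ? snap.state : { balance: 0 };
  const fromVersion = snap ? snap.version : 0;
  for (const e of events) {
    if (e.version > fromVersion) state = { balance: state.balance + e.amount };
  }
  return state;
}
```

Because events remain the source of truth, discarding an unreadable snapshot costs only load time, never correctness.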
Good Candidates
- ✓ Aggregates with hundreds or thousands of events
- ✓ Frequently accessed aggregates
- ✓ Long-lived entities with complex state
- ✓ Performance-critical operations
Skip Snapshots
- ✗ Aggregates with only a few events
- ✗ Rarely accessed aggregates
- ✗ Short-lived entities
- ✗ When replay is already fast enough
Benefits:
- Faster aggregate loading
- Reduced CPU and memory usage
- Better scalability
Costs:
- Additional storage space
- Complexity in snapshot management
- Need to handle snapshot versioning
- Snapshot creation overhead