Transactions, WAL & Crash Recovery
A bank transfer: debit account A, then credit account B. What if the power dies between those two steps? Without transactions, A is poorer and B is no richer: the money just evaporates. This module shows how databases make that impossible.
Why should you care?
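When an AI writes "we'll save the order and send the email in parallel", that's wrong: the email can go out for an order that was never durably saved. When it suggests cache.set() before db.commit(), that's wrong too: a crash or a failed commit leaves the cache holding data the database never accepted. Recognizing the "log first" pattern lets you catch real durability bugs in AI-written code quickly.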
First, feel the problem
A transfer is really two steps: take $100 from A, then give $100 to B. Between those steps the money exists nowhere. If the power dies right then, where did it go? Run it both ways — leave the crash switched on — and keep an eye on the system's total.
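To make the lost money concrete, here is a minimal sketch of the unprotected transfer. `NaiveBank`, `Transfer`, and the `crashBetweenSteps` flag are hypothetical names invented for this illustration, and the "crash" is simulated with an exception rather than a real power failure.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical illustration: a transfer with no transaction around it.
public class NaiveBank
{
    public Dictionary<string, decimal> Balances { get; } = new()
    {
        ["A"] = 500m,
        ["B"] = 500m
    };

    // The invariant we care about: the system's total should never change.
    public decimal Total => Balances.Values.Sum();

    public void Transfer(string from, string to, decimal amount, bool crashBetweenSteps)
    {
        Balances[from] -= amount;   // step 1: debit
        if (crashBetweenSteps)
            throw new InvalidOperationException("simulated power loss");
        Balances[to] += amount;     // step 2: credit (never runs on a crash)
    }
}
```

Calling `Transfer("A", "B", 100m, crashBetweenSteps: true)` and catching the exception leaves `Total` at 900 instead of 1000: the debit landed and the credit did not.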
The promise a transaction makes: ACID
A transaction wraps a group of changes in four guarantees. The money demo you just ran was atomicity and durability in action — here are all four.
Atomicity
Every step of the transaction happens, or none does. Never half-done.
Consistency
Rules like "total balance doesn't change" are true before and after.
Isolation
Concurrent transactions look like they ran in sequence, never mid-mess.
Durability
Committed means committed — even if the server explodes one millisecond later.
The fix: write it down before you do it
The write-ahead log (WAL) is an append-only file. Every change is recorded there — durably, in order — before the database itself is touched. The ordering is the entire trick. Compare the two possible orderings and what a crash does to each:
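Apply first, log second (unsafe):
1. apply the change to the database
💥 CRASH HERE
2. write the change to the log ← never runs, so nothing on disk records the change

Log first, apply second (safe):
1. write the change to the log, flush
💥 CRASH HERE
2. apply the change ← not yet done, but recovery can replay it from the log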
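The two orderings can be simulated in a few lines. This is only a sketch: `OrderingDemo` and `CrashAfterFirstStep` are hypothetical names, a list stands in for the log on disk, and the crash is modeled by clearing the in-memory list.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical simulation: "disk" survives a crash, "memory" does not.
public static class OrderingDemo
{
    // Returns what is left on disk after a crash between step 1 and step 2.
    public static List<string> CrashAfterFirstStep(bool logFirst)
    {
        var logOnDisk = new List<string>();        // survives the crash
        var databaseInMemory = new List<string>(); // wiped by the crash

        if (logFirst)
            logOnDisk.Add("SET x = 1");            // step 1: write and flush the log
        else
            databaseInMemory.Add("SET x = 1");     // step 1: apply to the database

        // Crash: step 2 never runs and everything in memory is lost.
        databaseInMemory.Clear();

        return logOnDisk;                          // all that recovery has to work with
    }
}
```

With `logFirst: true` the returned log holds the change, so recovery has something to replay. With `logFirst: false` it comes back empty and the change is gone.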
Append-only
Entries are only ever added to the end — never edited in place. Appending is fast and crash-friendly.
Sequenced
Every entry carries a sequence number, giving a total order so recovery replays events exactly as they happened.
Durable on commit
A change is only "real" once its record is flushed to disk — that fsync is the heartbeat of durability.
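The three properties above fit in a very small class. The sketch below is an assumption-laden illustration, not this engine's `_walManager`: `TinyWal`, its one-line-per-entry text format, and the sequence counter that restarts at 1 on every open are all invented here. The only real API it leans on is `FileStream`, including `Flush(true)`, which is the closest .NET counterpart to an fsync.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical minimal log: one text line per entry, not the engine's real format.
public sealed class TinyWal : IDisposable
{
    private readonly FileStream _file;
    private long _nextSequence = 1; // simplification: a real log would resume from the last entry

    public TinyWal(string path)
    {
        // FileMode.Append positions every write at the end of the file.
        _file = new FileStream(path, FileMode.Append, FileAccess.Write, FileShare.Read);
    }

    // Appends one entry and returns the sequence number it was given.
    public long Append(string payload)
    {
        long sequence = _nextSequence++;
        byte[] line = Encoding.UTF8.GetBytes($"{sequence}|{payload}\n");
        _file.Write(line, 0, line.Length);
        return sequence;
    }

    // Flush(true) also asks the OS to push its own buffers to the device.
    public void Flush() => _file.Flush(flushToDisk: true);

    public void Dispose() => _file.Dispose();
}
```

An entry returned from `Append` is not yet durable. Only after `Flush` returns may the caller treat it as surviving a crash.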
The WAL Crash Lab
Now drive the engine yourself. Begin a transaction, insert a few rows, and watch the WAL (left, on disk) fill up while the database (right, in memory) stays empty — that is deferred writing. Then either Commit to apply the rows, or pull the plug with Crash and press Recover to see exactly what the log can rebuild.
The whole module in one experiment. Inserts pile up in the log but never touch the database until commit. A crash wipes memory but not the log. Recovery replays only what was committed. Atomicity, durability, and deferred writes — all visible in one panel.
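For readers without the interactive lab, the same experiment can be approximated in code. `ToyEngine` and everything inside it are hypothetical: a list plays the role of the WAL on disk, a dictionary plays the in-memory database, and there is no real file I/O or locking.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical toy engine; the Log list stands in for the WAL file on disk.
public class ToyEngine
{
    public record LogRecord(long TxnId, string Kind, string? Key = null, string? Value = null);

    public List<LogRecord> Log { get; } = new();                              // survives Crash()
    public Dictionary<string, string> Database { get; private set; } = new(); // lost on Crash()

    private readonly Dictionary<long, List<LogRecord>> _pending = new();
    private long _nextTxnId = 1;

    public long Begin()
    {
        long id = _nextTxnId++;
        _pending[id] = new List<LogRecord>();
        return id;
    }

    public void Insert(long txnId, string key, string value)
    {
        var record = new LogRecord(txnId, "Insert", key, value);
        Log.Add(record);             // logged right away...
        _pending[txnId].Add(record); // ...but only buffered, not applied
    }

    public void Commit(long txnId)
    {
        Log.Add(new LogRecord(txnId, "Commit"));  // commit marker first
        foreach (var rec in _pending[txnId])      // then apply the buffer
            Database[rec.Key!] = rec.Value!;
        _pending.Remove(txnId);
    }

    public void Crash()
    {
        Database = new Dictionary<string, string>(); // memory is gone
        _pending.Clear();
    }

    public void Recover()
    {
        // Replay inserts only for transactions whose COMMIT reached the log.
        var committed = Log.Where(e => e.Kind == "Commit").Select(e => e.TxnId).ToList();
        foreach (long txnId in committed)
            foreach (var rec in Log.Where(e => e.TxnId == txnId && e.Kind == "Insert"))
                Database[rec.Key!] = rec.Value!;
    }
}
```

Commit one transaction, leave a second one open, then call `Crash()` and `Recover()`: the committed row comes back and the uncommitted one does not.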
Watch a commit happen
The actors talk it out. Notice the order: the WAL is flushed to disk before the index is touched. That flush is the exact moment "maybe" becomes "definitely."
The commit, in code
public void Commit()
{
    _lock.EnterWriteLock();
    try
    {
        // 1. Log the commit
        _walManager.AppendEntry(new WALEntry
        {
            TransactionId = _transactionId,
            OperationType = WALOperationType.Commit
        });
        // 2. Force flush to ensure durability
        _walManager.Flush();
        // 3. Apply buffered writes only AFTER the WAL is durable
        _commitApplyCallback(_entries);
        _state = TransactionState.Committed;
    }
    finally { _lock.ExitWriteLock(); }
}
The ACID-defining moment
AppendEntry(Commit) — write a COMMIT marker into the log.
Flush() — this line is durability. It forces the log to disk. If we crash one instruction later, recovery will replay this transaction.
_commitApplyCallback — only now do the buffered changes land in the B+ tree. Flush before apply. Always.
What a log entry remembers
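public class WALEntry
{
    public long TransactionId { get; set; }               // owning transaction
    public WALOperationType OperationType { get; set; }   // e.g. Commit or Rollback
    public string TableName { get; set; } = string.Empty; // table the change targets
    public object? Key { get; set; }                      // row key; null for markers such as Commit
    public byte[]? OldValue { get; set; }                 // value before the change (for undo)
    public byte[]? NewValue { get; set; }                 // value after the change (for redo)
    public long SequenceNumber { get; set; }              // position in the log's total order
}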
Enough to redo or undo
OldValue + NewValue together — keep both and you can redo a change or undo it.
SequenceNumber — a total order across all entries, so recovery replays them in exactly the order they happened.
TransactionId — which transaction this belongs to. Recovery groups by this to find complete transactions.
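Why keeping both values is enough can be shown with a short sketch. `Change`, `RedoUndo`, and the convention that a null value means "the key does not exist" are assumptions made for this example. The record is a cut-down stand-in for the log entry, with a string key instead of `object?`.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a log entry, reduced to the fields redo and undo need.
public record Change(string Key, byte[]? OldValue, byte[]? NewValue);

public static class RedoUndo
{
    // Redo: move the store forward to the state after the change.
    public static void Redo(Dictionary<string, byte[]> store, Change change) =>
        Write(store, change.Key, change.NewValue);

    // Undo: move the store back to the state before the change.
    public static void Undo(Dictionary<string, byte[]> store, Change change) =>
        Write(store, change.Key, change.OldValue);

    // In this sketch a null value means "the key does not exist".
    private static void Write(Dictionary<string, byte[]> store, string key, byte[]? value)
    {
        if (value is null) store.Remove(key);
        else store[key] = value;
    }
}
```

An insert is a change with a null `OldValue`, a delete is one with a null `NewValue`, and an update carries both. The engine in this module only needs the redo direction, because deferred writes mean uncommitted changes never reach the B+ tree.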
Crash recovery, step by step
The power came back. Press play to watch the engine rebuild a correct state from nothing but the log on disk — replaying committed work, discarding everything else.
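public void RecoverFromWAL(Action<WALEntry> applyEntry)
{
    var entries = _walManager.ReadEntriesForRecovery();
    var transactions = new Dictionary<long, List<WALEntry>>();
    // A List (not a HashSet) keeps transaction ids in the order their COMMITs
    // appear in the log; HashSet enumeration order is not guaranteed.
    var committed = new List<long>();
    foreach (var entry in entries)
    {
        if (entry.OperationType == WALOperationType.Commit)
            committed.Add(entry.TransactionId);
        else if (entry.OperationType != WALOperationType.Rollback)
            Bucket(transactions, entry);
    }
    foreach (var txnId in committed) // replay ONLY committed, in commit order
    {
        // A transaction that committed without logging any changes has no bucket.
        if (!transactions.TryGetValue(txnId, out var txnEntries))
            continue;
        foreach (var entry in txnEntries)
            applyEntry(entry);
    }
}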
Group, filter, replay
Read every entry — the whole log, start to finish.
committed.Add(...) — note which transactions reached a COMMIT.
Bucket(...) — pile each non-commit entry under its transaction id.
replay ONLY committed — apply the committed buckets to the index; everything else is thrown away. The log is the single source of truth.
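The listing calls a `Bucket` helper without showing it. Below is one plausible shape for it, together with a self-contained replay that can be run on a hand-written log. `LogEntry`, `OpType`, and `RecoverySketch` are hypothetical stand-ins for the engine's own types, and replaying transactions in the order their COMMIT markers appear is an assumption of this sketch.

```csharp
using System;
using System.Collections.Generic;

public enum OpType { Insert, Commit, Rollback } // hypothetical; mirrors WALOperationType

public record LogEntry(long SequenceNumber, long TransactionId, OpType OperationType, string? Key = null);

public static class RecoverySketch
{
    // One possible Bucket: append the entry to its transaction's list, creating the list on first use.
    public static void Bucket(Dictionary<long, List<LogEntry>> transactions, LogEntry entry)
    {
        if (!transactions.TryGetValue(entry.TransactionId, out var list))
        {
            list = new List<LogEntry>();
            transactions[entry.TransactionId] = list;
        }
        list.Add(entry);
    }

    // Returns the keys recovery would re-apply, in replay order.
    public static List<string> Replay(IEnumerable<LogEntry> log)
    {
        var transactions = new Dictionary<long, List<LogEntry>>();
        var committed = new List<long>(); // commit order is preserved
        foreach (var entry in log)
        {
            if (entry.OperationType == OpType.Commit) committed.Add(entry.TransactionId);
            else if (entry.OperationType == OpType.Insert) Bucket(transactions, entry);
        }

        var replayed = new List<string>();
        foreach (long txnId in committed)
        {
            if (!transactions.TryGetValue(txnId, out var bucket)) continue;
            foreach (var item in bucket) replayed.Add(item.Key!);
        }
        return replayed;
    }
}
```

Given a log where transaction 2 commits before transaction 1 and transaction 3 rolls back, `Replay` returns transaction 2's keys first, then transaction 1's, and nothing from transaction 3.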
Checkpoints keep recovery fast
If the log grew forever, recovery would take forever. A checkpoint is the engine tidying up.
Flush all dirty pages
Push every in-memory change out to disk so the data file is current.
Write a Checkpoint entry
Record which transactions were still active at this moment.
Truncate the old WAL
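Entries from transactions that finished before the checkpoint are now safely in the data file, so discard them. Entries that belong to transactions still active at the checkpoint must stay, because their changes have not been applied yet.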
Smaller WAL, faster recovery
Next crash, there's far less log to replay. Recovery time stays bounded.
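The checkpoint steps can be sketched as a function over an in-memory log. Everything here is hypothetical (`CheckpointSketch`, `Entry`, the use of transaction id 0 for the checkpoint marker), and the sketch assumes the dirty pages were already flushed before it runs. It also makes one detail explicit: entries of transactions that are still active are kept, since with deferred writes their changes have not reached the data file.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CheckpointSketch
{
    // Hypothetical entry shape: sequence number, owning transaction, and a kind label.
    public record Entry(long Sequence, long TxnId, string Kind);

    // Returns the log as it would look after a checkpoint.
    // Assumes all dirty pages are already on disk, so finished
    // transactions no longer need their entries.
    public static List<Entry> Checkpoint(List<Entry> log, ISet<long> activeTxns, long nextSequence)
    {
        // Keep only entries that belong to transactions still in flight.
        var kept = log.Where(e => activeTxns.Contains(e.TxnId)).ToList();

        // Record the checkpoint itself (txn id 0 is a placeholder for "no transaction").
        kept.Add(new Entry(nextSequence, 0, "Checkpoint"));
        return kept;
    }
}
```

After the call, recovery has to read only the surviving entries plus whatever is appended later, which is what keeps recovery time bounded.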
"Log first, apply after" is ACID's beating heart. If the commit record is on disk, recovery will replay the change. If it isn't, the change never happened. There is no third state.
Deferred writes make rollback free. This engine doesn't touch the B+ tree until after commit. So rollback has nothing to undo — it just discards the buffer. The trade-off: other transactions can't see your in-flight writes (that's the next module).
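A minimal sketch of why rollback costs nothing under deferred writes. `BufferedTransaction` is a hypothetical class, a dictionary stands in for the B+ tree, and logging is left out so that only the buffering is visible.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical deferred-write transaction: changes wait in a buffer until commit.
public class BufferedTransaction
{
    private readonly Dictionary<string, string> _store;
    private readonly List<KeyValuePair<string, string>> _buffer = new();

    public BufferedTransaction(Dictionary<string, string> store) => _store = store;

    public int PendingCount => _buffer.Count;

    // Buffer the write; the shared store is not touched.
    public void Put(string key, string value) =>
        _buffer.Add(new KeyValuePair<string, string>(key, value));

    // Apply every buffered write, in order, then empty the buffer.
    public void Commit()
    {
        foreach (var change in _buffer) _store[change.Key] = change.Value;
        _buffer.Clear();
    }

    // Nothing was applied, so there is nothing to undo.
    public void Rollback() => _buffer.Clear();
}
```

The same buffering explains the trade-off: a second reader looking at the store sees none of the pending writes until `Commit` runs.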
Up next: one transaction is tidy. But a real database runs thousands at once. How does the engine keep them from corrupting each other's work? Locks, lock ordering, and the dreaded deadlock.