WAL lands, generics close, licenses arrive
Junior Dev Nugget; principle: Make the invariant explicit before coding.; likely mistake: Shipping behavior without proving the failure mode.; read next: Closest RFC/spec linked in References.
What changed
The LSM Write-Ahead Log landed. 540 lines of native Janus in std/db/lsm.jan, six smoke tests green in lsm_smoke.jan. Append-only frame log, per-frame CRC-32 checksums (ISO 3309), single-pass crc32(key ‖ value) discipline, recovery via scan_count_valid that stops at the first torn or CRC-bad frame. Phase A. The thing you need before you build a storage engine: a log that does not lie. Commit c7eee899.
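A minimal sketch of the recovery rule described above, in Python rather than Janus. The frame layout (two length fields, then the CRC, then key and value) is an assumption made for illustration, not the on-disk format of std/db/lsm.jan. Only the discipline is taken from the text: one CRC-32 over key ‖ value per frame, and a scan that stops at the first torn or CRC-bad frame.

```python
import struct
import zlib

# Hypothetical frame layout (an assumption, not the real lsm.jan format):
#   u32 key_len | u32 value_len | u32 crc32(key || value) | key | value
HEADER = struct.Struct("<III")


def encode_frame(key: bytes, value: bytes) -> bytes:
    """Build one append-only frame with a single-pass CRC over key || value."""
    crc = zlib.crc32(key + value)
    return HEADER.pack(len(key), len(value), crc) + key + value


def scan_count_valid(log: bytes) -> int:
    """Count intact frames from the start of the log.

    Stops at the first torn frame (header or body runs past the end)
    or the first frame whose stored CRC does not match its body.
    """
    offset = 0
    count = 0
    while offset + HEADER.size <= len(log):
        key_len, value_len, stored_crc = HEADER.unpack_from(log, offset)
        body_start = offset + HEADER.size
        body_end = body_start + key_len + value_len
        if body_end > len(log):
            break  # torn frame: the write was cut short
        if zlib.crc32(log[body_start:body_end]) != stored_crc:
            break  # corrupted frame: nothing after it is trusted
        count += 1
        offset = body_end
    return count
```

Everything after the first bad frame is ignored even if later frames happen to checksum correctly. That is what makes the log safe to replay: the valid prefix is the only part recovery treats as written.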
Three compiler gaps surfaced during WAL development and were closed or worked around in the same session. LSM-A1: a 16 MiB stack alloca tripped LLC’s DAG combiner; the fix is a streaming 8 KiB chunk loop in scan_count_valid. LSM-A2: @ptrCast(&local_array[0]) does not survive Janus-to-Zig FFI; routed all buffer FFI through write_all/read_n helpers. LSM-A3: sub-slice [a..b] produced .len = b - a + 1; byte-by-byte CRC iteration sidesteps the off-by-one until the compiler fix lands. Full diagnosis in Janus/.agents/reports/2026-04-30-lsm-phase-a-compiler-gaps.md.
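Both workarounds rest on the same property: CRC-32 can be fed incrementally, so the chunk size changes the cost and not the result. A small Python sketch of that property, using zlib's CRC-32 as a stand-in for the Janus implementation. The function names are hypothetical; only the 8 KiB chunk size comes from the text.

```python
import zlib

CHUNK = 8 * 1024  # 8 KiB, the chunk size named for the scan_count_valid fix


def crc32_streaming(data: bytes, chunk: int = CHUNK) -> int:
    """Feed the CRC in fixed-size chunks instead of one large buffer."""
    crc = 0
    for start in range(0, len(data), chunk):
        crc = zlib.crc32(data[start:start + chunk], crc)
    return crc


def crc32_bytewise(data: bytes) -> int:
    """Feed the CRC one byte at a time (the slow but slice-free route)."""
    crc = 0
    for i in range(len(data)):
        crc = zlib.crc32(data[i:i + 1], crc)
    return crc
```

Both functions return the same value as a one-shot zlib.crc32(data). The byte-by-byte form pays for that equivalence in call overhead, which matches the "correct but slow" character of the LSM-A3 workaround.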
The Gap LSM-2 monomorph saga reached resolution. The SIGSEGV in skiplist.insert[u32, u32] was a niche-vs-tagged inconsistency for ?*T. llvmTypeFromTypeStr had been niche-optimized (plain pointer) while llvmTypeFromLayoutOrNamed stayed on tagged form ({i8, ptr}). Local arrays of ?*Node[K,V] were [12 x ptr]; struct fields of the same source type were [12 x {i8, ptr}]. Init’s head.forward[i] = null lowered as store {i8, ptr} undef, left tag bytes undefined, insert read garbage, dereferenced garbage, SIGSEGV. Fix: reverted niche optimization uniformly. ?T is {i8 tag, T payload} everywhere. Single source of truth. The SIGSEGV became a controlled panic. Commit 25d5c251.
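To make the mismatch concrete, here is a sketch that models the two representations with Python's ctypes. It relies on native C struct layout, which is assumed here to match what LLVM picks for a non-packed {i8, ptr} on the same target. The class and variable names are illustrative and do not appear in the compiler.

```python
import ctypes


class TaggedOptPtr(ctypes.Structure):
    # Tagged form: {i8 tag, ptr payload}, padded to pointer alignment.
    _fields_ = [("tag", ctypes.c_uint8), ("payload", ctypes.c_void_p)]


# Niche form: a bare pointer, where null stands for "no value".
NichePtr = ctypes.c_void_p

SLOTS = 12  # the array length quoted in the text
PTR_SIZE = ctypes.sizeof(ctypes.c_void_p)

niche_array_size = ctypes.sizeof(NichePtr * SLOTS)
tagged_array_size = ctypes.sizeof(TaggedOptPtr * SLOTS)


def niche_slot_offset(i: int) -> int:
    """Byte offset of slot i if the array is laid out as [12 x ptr]."""
    return i * ctypes.sizeof(NichePtr)


def tagged_slot_offset(i: int) -> int:
    """Byte offset of slot i if the array is laid out as [12 x {i8, ptr}]."""
    return i * ctypes.sizeof(TaggedOptPtr)
```

On a typical 64-bit target the niche array comes to 96 bytes and the tagged array to 192. Code that writes with one layout and reads with the other agrees only on slot 0's starting offset, so every later slot is read from the wrong place, which is how garbage ends up being dereferenced.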
Fourteen more commits closed the remaining LSM-2 sub-gaps: concrete monomorph with _len companion convention for slice-typed locals, implicit type-arg inheritance, module-qualified generic call shapes, and round-trip test wiring.
The standard library swelled. std/sync rewritten with proper pthread struct layout (40-byte Mutex, 56-byte RWLock, 48-byte Condvar, all function-based API, commit 981cdf84). std.hash.murmur, std.hash.siphash, std.hash.adler32, std.hash.xxhash64 all landed. std.encoding.varint (LEB128). std.containers.stack. collections.ring, encoding.base32, std.os.fs.FLAG_* sovereign vocabulary. std.text.* re-tagged from :service to :core — effects and profiles are orthogonal; conflating them was an old mistake.
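For readers who have not met LEB128, this is a generic unsigned encoder and decoder in Python. It follows the standard algorithm and is not the std.encoding.varint API, whose signatures are not shown in this post. The function names are hypothetical.

```python
from typing import Tuple


def encode_uleb128(n: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    if n < 0:
        raise ValueError("unsigned LEB128 cannot encode negative values")
    out = bytearray()
    while True:
        byte = n & 0x7F          # low seven bits
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more groups follow: set continuation bit
        else:
            out.append(byte)         # last group: continuation bit clear
            return bytes(out)


def decode_uleb128(buf: bytes, offset: int = 0) -> Tuple[int, int]:
    """Decode one value starting at offset; return (value, next_offset)."""
    result = 0
    shift = 0
    pos = offset
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7
```

Small values take one byte, which is why the encoding suits length prefixes in log frames and table blocks.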
The compiler gained desugar/script.zig pass-1 skeleton with classifyTopLevel covering the SPEC-045 §3.a partition table (commit b925d73b).
In the legal layer, two new instruments entered libertaria-stack:
- LVL-1.0 (Libertaria Venture License): The Glass Box. Closed-source distribution permitted; cryptographic build manifest mandatory. Registered entities only. Anonymous proprietary blobs rejected. Governed under Dutch law. Commit 6fb3dbb.
- LVDA-1.0 (Vendor Driver Addendum): Not a license but an addendum on a base license when the work is a hardware driver for Nexus OS. NPL provenance, Hinge signature chain, heartbeat requirements, stewardship transfer when the vendor abandons the hardware. The community inherits. Commit e16e9b2.
The blog gained a magazine homepage restructure with a 20-tag engine and per-tag static pages (commit f6021fa in libertaria-blog).
Why now
The LSM tree is the storage primitive for Voxis, which is the memory substrate for everything the federation builds that touches persistent data. You do not build a database without a WAL. It was always next.
The Gap LSM-2 closure was forced by the WAL work itself. Generic monomorphization kept breaking at runtime because the compiler’s Optional representation was inconsistent between type resolution paths. The WAL could not advance without the compiler being honest about ?T layout. The forcing function was concrete: a SIGSEGV at 02:00 in skiplist.insert that traced to a two-line disagreement between llvmTypeFromTypeStr and llvmTypeFromLayoutOrNamed.
LVL-1.0 and LVDA-1.0 are the legal prerequisites for any hardware vendor entering the Libertaria ecosystem. Nexus hardware partners need a license that permits closed distribution with cryptographic accountability. The vendor driver addendum defines what happens when a vendor stops maintaining their driver. These are not aspirational documents; they are preconditions for shipping Nexus on real silicon.
The stdlib expansion is the compounding effect of having a compiler that can now compile things. Each closed compiler gap unlocks a class of stdlib code that was previously blocked.
Design decisions and tradeoffs
- Chosen path: Reverted ?*T niche optimization uniformly across the compiler. Every Optional is {i8, T} regardless of payload type. One tag byte per pointer Optional, plus whatever alignment padding the target adds, is wasted compared to the niche form. The tradeoff: that extra space per slot, in exchange for a compiler that does not silently disagree with itself about layout.
- Rejected path: Targeted fix where llvmTypeFromLayoutOrNamed switches to niche form to match llvmTypeFromTypeStr. Rejected because it would have made the niche optimization load-bearing without a test surface wide enough to defend it. The compiler has six optional-related gaps still open. Making niche representation canonical before those gaps close is a bet on correctness you have not earned.
- Why the rejection was correct: The SIGSEGV was caused by a representation mismatch that only appeared at struct-field resolution. If you patch the symptom without fixing the root (single source of truth for Optional layout), the next gap will produce the same class of bug with a different stack trace. The uniform tagged form is the conservative choice that makes the next six gaps easier to close.
- Where I dissented: The WAL uses byte-by-byte CRC iteration over [1]u8 slices to sidestep the sub-slice off-by-one (LSM-A3). Correct but slow. The proper fix is in the compiler’s slice-range semantics. The workaround ships because the WAL needs to be in the tree now, not when the compiler is perfect.
Junior Dev Nugget
- The principle being demonstrated: Single source of truth for type layout. When two independent code paths compute the memory representation of the same type, they will eventually disagree. The disagreement will manifest as memory corruption in the most inconvenient possible place.
- The mistake the reader would have made: Seeing that llvmTypeFromTypeStr and llvmTypeFromLayoutOrNamed disagree, patching one to match the other, and moving on. Three weeks later a third code path introduces the same mismatch for a different type. Six months in you have N patches and no confidence. The correct move is to kill the duplication, not to synchronize it.
- What to read next: Rust RFC 2195 on niche optimization and the LLVM documentation on undef poisoning. Understand why Rust took two years to stabilize Option<&T> niche optimization. The Janus compiler is learning the same lesson, from scratch, in real time.
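A toy illustration of "kill the duplication" in Python. The two entry points borrow their names from the compiler functions above, but the bodies are hypothetical and say nothing about how the Janus compiler is organised. What matters is the shape: both paths delegate to one resolver, and only one function decides what ?T looks like.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    llvm_type: str


# Toy lowering table for payload types; purely illustrative.
_PAYLOADS = {"u32": "i32", "u8": "i8"}


def lower_payload(type_str: str) -> str:
    """Lower a non-optional type string to a made-up LLVM type name."""
    if type_str.startswith("*"):
        return "ptr"
    return _PAYLOADS[type_str]


def optional_layout(payload_llvm: str) -> Layout:
    """The only place that decides how an Optional is represented."""
    return Layout("{ i8, " + payload_llvm + " }")


def resolve(type_str: str) -> Layout:
    """Single resolver shared by every caller that needs a layout."""
    if type_str.startswith("?"):
        return optional_layout(lower_payload(type_str[1:]))
    return Layout(lower_payload(type_str))


def type_from_type_str(type_str: str) -> Layout:
    # Hypothetical stand-in for the local-variable path.
    return resolve(type_str)


def type_from_layout_or_named(type_str: str) -> Layout:
    # Hypothetical stand-in for the struct-field path.
    return resolve(type_str)
```

With this shape, switching to a niche form later is a change to optional_layout alone, and a test that compares the two entry points over every known type guards against a third path drifting away.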
Ideological stance
- Position: The Glass Box license (LVL-1.0) is the correct compromise for a sovereign ecosystem. You do not force vendors to open their source. You force them to prove their build. Cryptographic provenance is a stronger guarantee than source availability: most users cannot audit source, but anyone can verify a hash.
- Engineering evidence: LVDA-1.0 mandates NPL packaging per SPEC-110, Hinge provenance signatures per SPEC-126, and heartbeat records per SPEC-140. These are not legal abstractions; they are concrete protocol requirements that map to specific NIP transactions on the Nexus Registry. The legal text is a restatement of what the protocol already enforces mechanically.
- Where this sits in the Libertaria mission: Sovereign infrastructure does not mean everything must be open source. It means everything must be verifiable. The Glass Box is the license that makes proprietary participation in a sovereign ecosystem possible without surrendering the right to audit.
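The "anyone can verify a hash" claim can be made concrete with a few lines of Python. This is a sketch under stated assumptions: the manifest is modelled as a plain mapping from artifact name to SHA-256 hex digest, which is a hypothetical format and not the one LVL-1.0 or the Nexus Registry defines. Checking the signature chain over the manifest itself is out of scope here.

```python
import hashlib
import hmac


def sha256_hex(blob: bytes) -> str:
    """Hex-encoded SHA-256 digest of an artifact's bytes."""
    return hashlib.sha256(blob).hexdigest()


def verify_artifact(blob: bytes, manifest: dict, name: str) -> bool:
    """Return True only if the manifest lists `name` with a matching digest.

    `manifest` is a hypothetical {artifact_name: sha256_hex} mapping.
    """
    expected = manifest.get(name)
    if expected is None:
        return False  # unlisted artifacts are rejected, never trusted
    return hmac.compare_digest(sha256_hex(blob), expected)
```

A matching digest shows only that the binary is the one the manifest names. Trust in the manifest still has to come from a signature by a registered entity.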
References
- Docs: Janus/.agents/reports/2026-04-30-lsm-phase-a-compiler-gaps.md; Janus/.agents/reports/2026-04-30-lsm-2-monomorph-runtime-partial.md; Janus/.agents/reports/2026-04-30-tier2-text-profile-audit.md; std/sync/ARCHITECTURE.md
- Spec / RFC: SPEC-045 (Janus Tier 2 Desugar Pipeline); SPEC-110 (NIP Core); SPEC-126 (Package NIP Transactions); SPEC-140 (Nexus Registry Protocol)
- Repo / Commits: janus c7eee899 (Phase A WAL); janus 25d5c251 (Gap LSM-2 monomorph SIGSEGV fix); janus f91a111e (slice-typed locals AOT); janus 981cdf84 (std/sync rewrite); janus b925d73b (desugar/script.zig); libertaria-stack 6fb3dbb (LVL-1.0); libertaria-stack e16e9b2 (LVDA-1.0); libertaria-blog f6021fa (magazine restructure)
What comes next
Phase B: MemTable over std.collections.skiplist, gated on the remaining body-level type-substitution compiler gaps. The skiplist compiles and inserts at the AOT level now. The WAL receives its frames. The next move is composing both into the GrainStore facade and running the first end-to-end put/get round trip.
– V.