The Allocator Revolution
How a generation of systems languages dragged the allocator out of hiding - and why making memory a first-class parameter changes everything about how you write code.
For fifty years, the question "where does this memory come from?" had a boring answer in systems code: malloc. It was global, it was implicit, and it was invisible. You called a function, the function allocated, and you had no idea - short of reading its source - whether it touched the heap at all. The new wave of systems languages (Zig, Odin, and Hare among them) treats that invisibility as a bug, not a feature. Their answer is the same idea expressed three different ways: the allocator is data, and data should be explicit.
This is not a syntactic flourish. Making the allocator a thing you pass around - rather than a global you reach for - rewrites the economics of memory in a program. It makes arenas and pools the default tool instead of an exotic optimization, it pushes lifetime decisions up to the caller who actually knows them, and it makes "no hidden allocations" a property you can audit rather than a hope.
The problem with hidden malloc
Consider an unremarkable C function. Somewhere inside, it calls malloc. As the caller, you cannot see this. You cannot redirect it, batch it, give it a faster pool, or make it use stack storage instead. The allocation policy is welded into the callee.
/* The caller has no say in WHERE this memory comes from. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *make_greeting(const char *name) {
    size_t n = strlen(name) + 10;   /* "Hello, " + "!" + NUL need 9; one byte spare */
    char *s = malloc(n);            /* global heap. always. forever. */
    if (!s) return NULL;
    snprintf(s, n, "Hello, %s!", name);
    return s;                       /* and now WHO frees it? unclear. */
}
Two costs are hiding here. The first is the policy cost: every allocation goes through the one general-purpose heap, which must be thread-safe, must track every block individually for free, and must defend against fragmentation. The second is the ownership cost: the return value is a raw pointer with an undocumented contract. The caller has to know, by convention, that they own it and must free it exactly once.
C++ patched the second problem with RAII - destructors that run deterministically at scope exit, so ownership rides along with object lifetime. It is genuinely good, and for the ownership question it is arguably still the strongest answer in this group.
// RAII solves "who frees it" - ownership is the object's lifetime.
#include <string>
std::string make_greeting(const std::string& name) {
    return "Hello, " + name + "!"; // allocates internally; freed by ~string
} // ...but WHERE it allocates is still hidden.
But notice what RAII does not solve. std::string still reaches for the global heap behind your back. The policy question - where the bytes come from - is as hidden as ever. (C++17's std::pmr allocators are precisely an attempt to claw that back, and we will come to them.) The allocator revolution is fundamentally about the first cost, the one RAII left on the table.
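Before looking at how the newer languages handle it, here is a minimal C sketch of the simplest way to pull the policy out of the callee: the caller supplies the storage. The name make_greeting_into and its return convention are assumptions made for illustration, not an API from any library.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical variant of make_greeting: the CALLER supplies the storage,
 * so it can live on the stack, in a pool, or in static memory.
 * Returns the number of characters written (excluding the NUL), or -1 if
 * formatting failed or the buffer was too small. */
int make_greeting_into(char *buf, size_t cap, const char *name) {
    assert(name != NULL);
    int n = snprintf(buf, cap, "Hello, %s!", name);
    if (n < 0 || (size_t)n >= cap) return -1;   /* error or truncated */
    return n;
}
```

Called with a char buf[32] on the stack, this never touches the heap, and the question of who frees the result disappears because the caller already owns the buffer. The cost is that the caller has to guess a size up front, which is the gap an allocator parameter fills.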
Making the allocator a parameter: Zig
Zig's stated design principles include "no hidden control flow" and "no hidden memory allocations": if it isn't written, it doesn't happen. There is no default global allocator that library code silently reaches for. By convention, a function that needs memory takes a std.mem.Allocator as an argument. This is enforced socially and structurally rather than by the compiler: the standard library does it everywhere, so allocation is visible at the call site.
const std = @import("std");
// The allocator is the FIRST thing a function that allocates asks for.
// By convention nothing allocates without one, so the call site always sees it.
fn makeGreeting(allocator: std.mem.Allocator, name: []const u8) ![]u8 {
    // allocPrint returns an error union because allocation can fail. Returning
    // it directly propagates the error; the caller unwraps it with `try`.
    return std.fmt.allocPrint(allocator, "Hello, {s}!", .{name});
}
pub fn main() !void {
    var gpa: std.heap.DebugAllocator(.{}) = .init;
    defer _ = gpa.deinit(); // reports leaks of anything not freed
    const allocator = gpa.allocator();
    const greeting = try makeGreeting(allocator, "world");
    defer allocator.free(greeting); // the caller owns it; the caller frees it
    std.debug.print("{s}\n", .{greeting});
}
Three consequences fall straight out of this one decision:
- Allocation is auditable. A function's signature tells you whether it is expected to allocate: no Allocator parameter, no heap. This is a convention the standard library follows everywhere rather than a rule the compiler enforces (a function could still reach for a global such as std.heap.page_allocator), but it is consistent enough that reading signatures is a meaningful audit.
- Failure is explicit. Because allocation is a normal function call, "out of memory" is a normal error value (error.OutOfMemory), surfaced through Zig's error unions (![]u8) and handled with try/catch. There is no surprise abort on allocation failure buried in a library.
- The allocator is swappable for free. DebugAllocator for development (it detects leaks and double-frees; it was named GeneralPurposeAllocator before the 0.14 cycle), std.heap.page_allocator for raw OS pages, FixedBufferAllocator for zero-heap operation, an arena for batch lifetimes - all behind the same std.mem.Allocator interface, all chosen by the caller. The code under it never changes.
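To make the three consequences concrete outside Zig, here is a rough C sketch of the same shape: an allocator as a plain value (a context pointer plus function pointers) that the callee receives as a parameter. The Allocator struct, its field names, and the two policies are hypothetical illustrations, not Zig's actual std.mem.Allocator layout.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical interface: a context pointer plus two function pointers. */
typedef struct Allocator {
    void *ctx;
    void *(*alloc_fn)(void *ctx, size_t size);
    void  (*free_fn)(void *ctx, void *ptr);
} Allocator;

/* Policy 1: forward to the C heap. */
void *heap_alloc(void *ctx, size_t size) { (void)ctx; return malloc(size); }
void  heap_free(void *ctx, void *ptr)    { (void)ctx; free(ptr); }
const Allocator heap_allocator = { NULL, heap_alloc, heap_free };

/* Policy 2: always fail, to exercise out-of-memory paths in tests. */
void *failing_alloc(void *ctx, size_t size) { (void)ctx; (void)size; return NULL; }
void  failing_free(void *ctx, void *ptr)    { (void)ctx; (void)ptr; }
const Allocator failing_allocator = { NULL, failing_alloc, failing_free };

/* The signature advertises the allocation; failure is an ordinary NULL return. */
char *make_greeting(const Allocator *a, const char *name) {
    assert(a != NULL && name != NULL);
    size_t n = strlen("Hello, !") + strlen(name) + 1;   /* exact size incl. NUL */
    char *s = a->alloc_fn(a->ctx, n);
    if (s == NULL) return NULL;
    snprintf(s, n, "Hello, %s!", name);
    return s;
}
```

Passing &failing_allocator makes the out-of-memory path testable without exhausting real memory, and make_greeting itself is unchanged between the two policies. That is the swappability the list describes, reduced to its mechanics.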
Zig deliberately has no RAII and no destructors ("no hidden control flow"). Cleanup is scheduled explicitly with defer and errdefer, which run at scope exit (the latter only on the error path). The trade is honest: you give up automatic destruction to gain the guarantee that nothing runs implicitly.
Threading it through the context: Odin
Odin reaches the same destination by a different road. Threading an allocator argument through every function by hand, as Zig does, is explicit but verbose. Odin's answer is an implicit, per-scope context value - a hidden parameter passed to every Odin-convention procedure - that carries a context.allocator (and a separate context.temp_allocator, a logger, and more).
package main
import "core:fmt"
import "core:mem"
// No allocator parameter in the signature: new/make/append/aprint all
// quietly use context.allocator. But it is EXPLICIT when you want it.
make_greeting :: proc(name: string) -> string {
    return fmt.aprintf("Hello, %s!", name) // allocates via context.allocator
}
main :: proc() {
    greeting := make_greeting("world")
    defer delete(greeting) // freed via the same allocator
    fmt.println(greeting)
    // The revolution in one move: redirect EVERY allocation in a block by
    // swapping the context allocator. No call sites change.
    arena: mem.Arena
    backing := make([]byte, 1 << 16)
    defer delete(backing)
    mem.arena_init(&arena, backing)
    {
        context.allocator = mem.arena_allocator(&arena)
        g := make_greeting("arena") // now bump-allocated from `arena`
        fmt.println(g)
        // no delete(g): the whole arena is reclaimed below
    }
    mem.arena_free_all(&arena) // free everything at once
}
The clever part is that context.allocator propagates down the call tree. Set it at the top of a subsystem and every new, make, append, and fmt.aprintf underneath - including in library code you did not write - routes through your allocator, without a single call-site change. Where Zig says "name the allocator at every call," Odin says "name it once for a whole region of the program." Both make the policy explicit; they differ on the granularity at which you state it.
And when you do want to be surgical, the explicit form is right there: new(int, my_allocator), make([]u8, n, my_allocator). The context is a convenient default, not a cage.
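The mechanism is easier to see with the implicit parameter made visible. The following C sketch approximates the idea with a thread-local "current allocator" that library code consults and that a caller can swap for a scope. It is an analogy under stated assumptions, not how Odin implements context: the names context_allocator, make_counter, and make_counter_with are hypothetical, and Odin passes the context as a hidden argument rather than through a thread-local.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct Allocator {
    void *ctx;
    void *(*alloc_fn)(void *ctx, size_t size);
} Allocator;

/* Default policy: the C heap. */
void *heap_alloc(void *ctx, size_t size) { (void)ctx; return malloc(size); }
const Allocator heap_allocator = { NULL, heap_alloc };

/* Alternative policy: bump-allocate from a caller-owned buffer.
 * Offsets are aligned to 16 relative to `base`, so `base` itself is assumed
 * to be 16-byte aligned. */
typedef struct { unsigned char *base; size_t cap, used; } Bump;
void *bump_alloc(void *ctx, size_t size) {
    Bump *b = ctx;
    size_t off = (b->used + 15) & ~(size_t)15;
    if (off > b->cap || size > b->cap - off) return NULL;
    b->used = off + size;
    return b->base + off;
}

/* The "context": one current allocator per thread (C11 _Thread_local). */
_Thread_local const Allocator *context_allocator = &heap_allocator;

/* Library code never names an allocator; it uses whatever is current. */
int *make_counter(int start) {
    int *p = context_allocator->alloc_fn(context_allocator->ctx, sizeof *p);
    if (p != NULL) *p = start;
    return p;
}

/* Scoped swap: run make_counter with `a` as the current allocator, then restore. */
int *make_counter_with(const Allocator *a, int start) {
    assert(a != NULL);
    const Allocator *saved = context_allocator;
    context_allocator = a;
    int *p = make_counter(start);
    context_allocator = saved;
    return p;
}
```

The save/restore pair in make_counter_with is what Odin's block scoping does automatically: make_counter has no allocator in its signature, yet its memory comes from wherever the enclosing scope pointed the context.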
The minimalist's take: Hare
Hare is the most C-like of the three, and it is honest about it. Its built-in alloc and free use a single global heap - there is no first-class Allocator parameter convention baked into the language or threaded through the stdlib the way there is in Zig or Odin.
use fmt;
use strings;
// alloc/free target the global heap. The "revolution" in Hare is
// cultural and structural, not a language-level allocator interface:
// you build region allocators by hand and pass them explicitly.
export fn main() void = {
    // concat can fail to allocate (returns str | nomem), so handle it.
    // `!` asserts success and aborts on out-of-memory.
    const greeting = strings::concat("Hello, ", "world", "!")!;
    defer free(greeting); // manual, paired with the alloc
    fmt::println(greeting)!;
};
So is Hare part of the revolution at all? Yes - but on its own minimalist terms. Hare's contribution is to make the primitives clean and the runtime tiny: alloc(value) constructs-in-place and yields a typed *T, slices ([]u8) carry their length so buffer sizes travel with the data, and defer keeps the matching free next to the allocation. There are no hidden allocations because there is almost no runtime to hide them in - what you write is very close to what runs.
Crucially, Hare's slices make hand-rolled arenas trivial, and the idiom of passing an arena explicitly is exactly what its community reaches for. The language does not mandate an allocator interface; it makes building and passing one a matter of a dozen or so lines:
// A bump arena over one owned slice - explicit, passed by pointer.
type arena = struct {
    base: []u8, // backing buffer (we own it)
    used: size, // bump offset: the next free byte
};
fn arena_init(cap: size) arena = arena {
    base = alloc([0u8...], cap)!, // alloc returns []u8 | nomem; `!` aborts on OOM
    used = 0,
};
fn arena_alloc(a: *arena, n: size) *[*]u8 = {
    const off = (a.used + 7) & ~7z; // round the offset up to a multiple of 8
    assert(off + n <= len(a.base), "arena exhausted");
    const p = &a.base[off]: *[*]u8;
    a.used = off + n; // bump: O(1), no bookkeeping
    return p;
};
fn arena_free(a: *arena) void = free(a.base); // reclaim ALL at once
Hare proves the revolution is as much about philosophy (explicit lifetimes, no GC, no hidden costs, pass the region you mean) as it is about a specific Allocator type. The discipline travels even where the language feature does not.
Why arenas change the math
Once the allocator is something you hold, the most powerful tool it unlocks is the arena (also called a region or bump allocator), and its cousin the pool. This is the part that genuinely rewires how you write systems code.
A general-purpose heap has to support the worst case: allocations and frees in any order, interleaved, from any thread, for any lifetime. That generality is expensive - every block is individually tracked so it can be individually reclaimed. An arena throws all of that away by exploiting a fact that is true far more often than malloc's design assumes: a whole batch of allocations shares one lifetime.
An arena owns one big block. Allocation is "round up for alignment, return the current offset, advance the offset" - a handful of instructions, no locking, no metadata, no free list. You never free individual objects. When the batch is done, you reset the offset (or free the one backing block) and everything is reclaimed in a single operation.
/* A bump (arena) allocator: the whole idea in ~10 lines. */
#include <stddef.h>
#include <stdint.h>
typedef struct { uint8_t *base; size_t cap, used; } Arena;
/* `align` must be a power of two. */
void *arena_alloc(Arena *a, size_t size, size_t align) {
    size_t off = (a->used + (align - 1)) & ~(align - 1);   /* round up */
    if (off > a->cap || size > a->cap - off) return NULL;  /* arena full (overflow-safe) */
    void *p = a->base + off;
    a->used = off + size;                                  /* just move a pointer */
    return p;
}
void arena_reset(Arena *a) { a->used = 0; }                /* free EVERYTHING, instantly */
The payoff is threefold, and each point is a memory-management win:
- Speed. Allocation is O(1) with a tiny constant and no synchronization. Reclamation of an entire arena is also O(1) - a single pointer reset - instead of N calls to free.
- No per-object leaks or mismatched frees for same-lifetime data. You cannot forget to free object #4,217 if you never free objects individually. The whole class of "matched malloc/free" bugs evaporates for arena-managed data. The one thing you must get right is the arena's lifetime (a pointer that outlives its arena still dangles) - and there is exactly one of those to reason about instead of thousands.
- Locality. Sequentially bumped allocations are contiguous in memory, so walking them is cache-friendly in a way a fragmented general heap can never promise.
The mental shift is from per-object lifetime management to per-phase lifetime management. Parsing a request? Arena per request, reset when the response is sent. Building a frame in a game? Arena per frame, reset at vsync. Loading a level? Arena per level, free when it unloads. You stop asking "when does this object die?" and start asking "when does this phase end?" - which is usually a question you can actually answer.
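The per-phase pattern can be sketched in C as a loop that reuses one arena and resets it at the phase boundary. The block restates a minimal arena so it stands alone; handle_request and serve_all are hypothetical names, and the "request" is just a string that gets copied and upper-cased into scratch memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct { uint8_t *base; size_t cap, used; } Arena;

void *arena_alloc(Arena *a, size_t size, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);   /* power of two */
    size_t off = (a->used + (align - 1)) & ~(align - 1);
    if (off > a->cap || size > a->cap - off) return NULL;
    a->used = off + size;
    return a->base + off;
}

void arena_reset(Arena *a) { a->used = 0; }

/* Hypothetical handler: copies and upper-cases the request into scratch
 * memory. Returns the request length, or -1 if the arena is exhausted. */
int handle_request(Arena *scratch, const char *req) {
    size_t n = strlen(req);
    char *copy = arena_alloc(scratch, n + 1, 1);
    if (copy == NULL) return -1;
    for (size_t i = 0; i <= n; i++) {
        char c = req[i];
        copy[i] = (c >= 'a' && c <= 'z') ? (char)(c - 'a' + 'A') : c;
    }
    return (int)n;
}

/* One arena, many requests: the phase boundary is the reset that ends each
 * iteration. Returns the high-water mark of any single request. */
size_t serve_all(Arena *scratch, const char *const *reqs, size_t count) {
    size_t peak = 0;
    for (size_t i = 0; i < count; i++) {
        if (handle_request(scratch, reqs[i]) < 0) break;
        if (scratch->used > peak) peak = scratch->used;
        arena_reset(scratch);   /* everything from this request is gone */
    }
    return peak;
}
```

Nothing inside handle_request frees anything, and nothing needs to: the only lifetime decision in the program is the arena_reset call in the loop. The peak value is also a cheap way to size the backing buffer for the worst request seen.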
This is why Odin ships a per-thread context.temp_allocator: a scratch arena for "this lives until I say otherwise (typically end of frame/turn)," reset wholesale and free of any individual delete. It is the bump allocator promoted to a language convention.
// Odin's temp allocator: a built-in scratch arena for short-lived junk.
text := fmt.aprintf("scratch %d", 42, allocator = context.temp_allocator)
fmt.println(text)
// ...no delete. Reclaimed in one shot:
free_all(context.temp_allocator) // typically called once per frame to reset the arena
Pools, the fixed-size cousin
When your objects are all the same size but have individual lifetimes (think: nodes that are created and destroyed independently), an arena's "free everything at once" rule is too coarse. The pool allocator is the answer: carve a slab into fixed-size slots, keep a free list of available slots, and allocate/free by pushing and popping that list - still O(1), still no fragmentation (every slot is interchangeable), still cache-friendly. Zig ships this directly as std.heap.MemoryPool(T).
const std = @import("std");
pub fn main() !void {
    var gpa: std.heap.DebugAllocator(.{}) = .init;
    defer _ = gpa.deinit();
    // A pool of fixed-size Node slots, backed by the DebugAllocator.
    const Node = struct { value: i32, next: ?*@This() };
    var pool = std.heap.MemoryPool(Node).init(gpa.allocator());
    defer pool.deinit(); // frees the whole pool at once
    const a = try pool.create(); // pop a free slot - O(1)
    const b = try pool.create();
    a.* = .{ .value = 1, .next = b };
    b.* = .{ .value = 2, .next = null };
    pool.destroy(b); // push the slot back - O(1), no syscall
    std.debug.print("{d}\n", .{a.value});
}
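The free-list mechanism described above is small enough to sketch directly in C. This is an illustrative fixed-capacity pool, not a transcription of Zig's MemoryPool: the Pool and Slot types, the POOL_SLOTS capacity, and the function names are all assumptions chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Node { int value; struct Node *next; } Node;

/* A free slot stores the link to the next free slot in its own bytes,
 * so the free list costs no extra memory. */
typedef union Slot { Node node; union Slot *next_free; } Slot;

#define POOL_SLOTS 4   /* assumed capacity for this sketch */

typedef struct {
    Slot slots[POOL_SLOTS];
    Slot *free_list;
} Pool;

/* Thread every slot onto the free list, lowest index first. */
void pool_init(Pool *p) {
    p->free_list = NULL;
    for (size_t i = POOL_SLOTS; i-- > 0; ) {
        p->slots[i].next_free = p->free_list;
        p->free_list = &p->slots[i];
    }
}

/* Pop a slot: O(1). Returns NULL when the pool is exhausted. */
Node *pool_create(Pool *p) {
    Slot *s = p->free_list;
    if (s == NULL) return NULL;
    p->free_list = s->next_free;
    return &s->node;
}

/* Push a slot back: O(1). `n` must have come from this pool. */
void pool_destroy(Pool *p, Node *n) {
    assert(n != NULL);
    Slot *s = (Slot *)n;   /* a union and its members share one address */
    s->next_free = p->free_list;
    p->free_list = s;
}
```

Because every slot is the same size, a destroyed slot is immediately reusable by the next pool_create, which is why a pool cannot fragment. A production pool would typically grow by chaining further slabs instead of returning NULL at a fixed capacity.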
C++ catching up: std::pmr
The older languages were not blind to any of this; they just paid the late-mover tax. C++17 added std::pmr (polymorphic memory resources) precisely to retrofit explicit, runtime-swappable allocators onto the standard containers - the closest C++ comes to the Zig/Odin model. A std::pmr::monotonic_buffer_resource is an arena; pass it to a std::pmr::vector and every allocation routes through it.
#include <array>
#include <cstddef>          // std::byte
#include <memory_resource>
#include <vector>
void parse_into(std::pmr::vector<int>& out); // allocator travels with the type
int main() {
    std::array<std::byte, 8192> buffer{}; // stack storage, no heap
    std::pmr::monotonic_buffer_resource arena{buffer.data(), buffer.size()};
    std::pmr::vector<int> v{&arena}; // every push_back bump-allocates
    for (int i = 0; i < 100; ++i) v.push_back(i * i);
    // No per-element free. `arena` reclaims it all when it dies (here: stack).
}
It works, and it is genuinely useful. But it is opt-in per type (std::pmr::vector, not plain vector) and bolted on rather than pervasive - the default std::vector still uses the global new. The contrast is the whole point: in Zig the explicit allocator is the only option; in C++ it is the road less traveled.
The other end of the spectrum: HolyC and Forth
Not every systems language joined the revolution, and looking at the holdouts sharpens what the revolution actually is.
HolyC - Terry A. Davis's C dialect, the native language and shell of his single-developer operating system TempleOS - is firmly in the classic camp, with one fascinating twist. Its primitives are MAlloc() and Free() (note the capitalization), and like C they offer no allocator abstraction. But the twist is structural: TempleOS gives every task its own heap, and when a task dies, its entire heap is reclaimed automatically. That is, accidentally, an arena - at the granularity of a whole task.
// HolyC: per-task heaps. MAlloc's 2nd argument is OPTIONAL (defaults to NULL =
// THIS task's heap), so you can name where the memory comes from - a small
// step toward an explicit allocator.
U8 *p = MAlloc(256); // raw bytes from this task's heap; NOT zeroed
*p = 42; // (use CAlloc if you want it cleared to zero)
Print("%d\n", *p);
Free(p); // Free(NULL) is a safe no-op
// Aim an allocation at ANOTHER task's heap by passing its CTask* explicitly.
// The signature is MAlloc(I64 size, CTask *mem_task=NULL); pass `adam_task`
// (the root task whose heap outlives everything) to allocate there instead:
U8 *q = MAlloc(1024, adam_task); // explicit heap target: the seed of the idea
Free(q);
That optional heap-pointer argument to MAlloc is, in miniature, the explicit-allocator idea - you can name where the memory comes from. Davis built it for a system with no memory protection at all (everything ran in 64-bit ring 0, one flat address space), where a stray pointer could corrupt the whole machine. In that world, "when the task dies its heap is gone" was a pragmatic safety valve, and the per-task heap was effectively a coarse, automatic arena long before arenas were fashionable. It is a thoughtful design that arrived at part of the same insight from an entirely different motivation.
Forth, the oldest language here, is stranger still: its core data model is a bump allocator. The dictionary's data space grows by advancing a pointer called HERE, and the word ALLOT bumps it forward. That is an arena with a different name, predating the term by decades.
\ Forth's dictionary data space IS an arena: HERE is the bump pointer,
\ ALLOT advances it. A private region with a reset is a few lines.
CREATE ARENA 1024 ALLOT \ reserve 1024 bytes (we own them)
VARIABLE AP 0 AP ! \ arena offset: bytes used so far
: ARENA-ALLOC ( n -- addr ) \ bump-allocate n bytes
AP @ OVER + 1024 > ABORT" arena full"
ARENA AP @ + SWAP AP +! ; \ addr = base+off ; off += n
: ARENA-RESET ( -- ) 0 AP ! ; \ reclaim EVERYTHING by rewinding the offset
Forth also exposes MARKER, which records the dictionary state so you can later reclaim everything allocated after it - a region reset defined by the language standard itself (in its Core Extension word set). The dynamic ALLOCATE/FREE word set (C-style malloc/free) is an optional add-on; the native, always-present model is the bump pointer. Forth had the mechanism; what the new languages added was the abstraction - the ability to name, pass, and swap an allocator as a value.
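The "reclaim everything allocated after this point" idea carries over to any bump allocator as a saved offset. A minimal C sketch, assuming a byte-granular arena with no alignment handling for brevity; arena_mark and arena_rewind are hypothetical names loosely modeled on what MARKER does, not a port of it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint8_t *base; size_t cap, used; } Arena;
typedef struct { size_t used; } ArenaMark;   /* a saved bump offset */

/* Byte-granular bump allocation (alignment omitted to keep the sketch short). */
void *arena_alloc(Arena *a, size_t size) {
    if (size > a->cap - a->used) return NULL;
    void *p = a->base + a->used;
    a->used += size;
    return p;
}

/* Record the current position... */
ArenaMark arena_mark(const Arena *a) { return (ArenaMark){ a->used }; }

/* ...and later discard everything allocated after it, in O(1). */
void arena_rewind(Arena *a, ArenaMark m) {
    assert(m.used <= a->used);   /* marks must be rewound in LIFO order */
    a->used = m.used;
}
```

Allocations made before the mark survive the rewind; allocations made after it are reclaimed together. This gives nested phases (a scratch computation inside a request, say) without needing a second arena.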
What actually changes in how you write code
Strip away the syntax and the revolution comes down to a single inverted default. Under hidden malloc, allocation policy is decided by the callee, deep in the call tree, with no knowledge of how its results will be used. Under explicit allocators, allocation policy is decided by the caller - the one place in the program that actually knows the lifetime, the thread, the performance budget, and the right strategy.
That inversion produces a recognizably different style of systems code:
- Functions advertise their costs. An allocator parameter (or a documented context.allocator dependence) is a function telling you, in its signature, "I touch the heap." Memory behavior stops being a black box you have to read source to understand.
- Lifetime is designed top-down, in phases. You stop sprinkling free across thousands of call sites and start drawing arena boundaries around phases - per request, per frame, per level, per parse. One lifetime to reason about, not thousands of pairings to get right.
- Allocation strategy becomes a tuning knob, not a rewrite. Swap a DebugAllocator for an arena, or a global heap for a stack buffer, and the code underneath is untouched. Want to prove a hot path never heap-allocates? Hand it a FixedBufferAllocator and watch it fail loudly if it tries.
- Testing gets teeth. Run your whole test suite under a leak-detecting allocator (Zig's DebugAllocator, or a tracking allocator in Odin) and every unfreed byte is a test failure with a stack trace - because allocation went through an interface you control.
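The testing point is the easiest to demonstrate. Below is a rough C sketch of a tracking allocator that wraps malloc/free and counts live blocks, so a test can assert that the code under test freed everything it allocated. The Allocator shape, the Tracker counters, and sum_squares are hypothetical; real leak detectors such as Zig's DebugAllocator also record stack traces, which this sketch does not attempt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct Allocator {
    void *ctx;
    void *(*alloc_fn)(void *ctx, size_t size);
    void  (*free_fn)(void *ctx, void *ptr);
} Allocator;

/* Tracking policy: forwards to malloc/free and counts blocks. */
typedef struct { size_t live, total; } Tracker;

void *tracking_alloc(void *ctx, size_t size) {
    Tracker *t = ctx;
    void *p = malloc(size);
    if (p != NULL) { t->live++; t->total++; }
    return p;
}

void tracking_free(void *ctx, void *ptr) {
    Tracker *t = ctx;
    if (ptr == NULL) return;
    assert(t->live > 0);   /* more frees than allocations is a bug */
    t->live--;
    free(ptr);
}

/* Code under test: uses a temporary buffer and must release it.
 * Expects n > 0; returns -1 if the allocation fails. */
int sum_squares(const Allocator *a, int n) {
    int *tmp = a->alloc_fn(a->ctx, (size_t)n * sizeof *tmp);
    if (tmp == NULL) return -1;
    int sum = 0;
    for (int i = 0; i < n; i++) { tmp[i] = i * i; sum += tmp[i]; }
    a->free_fn(a->ctx, tmp);
    return sum;
}
```

After a call, live == 0 means nothing leaked, and total tells you how many allocations the call made. Asserting that total did not change across a call is a blunt but effective way to check that a hot path stays off the heap.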
The trade-off is real and worth stating plainly: you give up the convenience of "just call malloc" and the automation of RAII, and you take on the burden of deciding, everywhere, where memory comes from and when it dies. The revolution's bet is that for systems software - where that decision is the whole job - making it explicit, visible, and swappable is not a burden at all. It is the point.