C: the foundation everything sits on
Why a 1972 portable assembler became the lingua franca of computing - its flat machine model, its raw pointers, its tiny standard library, and the enduring quirks every language after it had to answer.
Every other language in this collection is, in some sense, an argument with C. C++ keeps C's machine model and adds destructors to tame it. Zig, Hare, and Odin keep the manual heap but drag the allocator into the open. Forth predates C and goes the other way entirely; HolyC is C reimagined for one machine with no protection at all. To understand any of them as a memory story, you have to start where they started: with C, and with the model of the machine that C quietly installed in everyone's head.
That model is the point of this article. Not C's syntax - its picture of memory. C took the byte-addressed, flat-address-space, pointer-everywhere reality of the PDP-11 and froze it into a portable language, and that picture turned out to be durable enough to outlive the PDP-11 by fifty years. When you write malloc in C, new in C++, alloc in Hare, or MAlloc in HolyC, you are reaching into the same conceptual heap C drew. The lingua franca is not just a language everyone can call; it is a mental model everyone shares.
Why C, and not something else
C did not win because it was the best-designed language of 1972. It won because of a single decision made around 1973: Dennis Ritchie and Ken Thompson rewrote the Unix kernel in it. An operating system written in a high-level language was a near-heresy at the time - kernels were assembly, full stop, because only assembly was fast enough and close enough to the hardware. C was the first language credible as a "portable assembler": abstract enough to move between machines, thin enough that you could still see the metal through it.
The mechanism of that thinness is what matters here. C makes a promise no garbage-collected language can make: the abstract machine maps predictably onto the real one. A struct is its fields laid out in declaration order (with padding you can reason about). An array is N elements back to back. A pointer is, on every machine anyone cares about, an address. There is no object header, no GC write barrier, no boxing, no hidden indirection. What you wrote is, to a first approximation, what runs.
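The layout claims above can be checked from inside the language. The following is a minimal sketch, not a definitive test: the struct and both function names are hypothetical, chosen only for illustration, and the amount of padding reported is implementation-defined.

```c
#include <stddef.h>   /* offsetof, size_t */

/* Hypothetical struct for illustration: a char followed by an int
   usually forces padding so that the int is suitably aligned. */
struct sample {
    char tag;
    int  value;
    char flag;
};

/* Returns 1 if the members sit at increasing offsets in declaration
   order, which the C standard guarantees for struct members. */
int fields_in_declaration_order(void) {
    return offsetof(struct sample, tag) == 0
        && offsetof(struct sample, tag) < offsetof(struct sample, value)
        && offsetof(struct sample, value) < offsetof(struct sample, flag);
}

/* Bytes of padding the compiler inserted. The exact number is
   implementation-defined; it is never negative. */
size_t padding_bytes(void) {
    return sizeof(struct sample)
         - (sizeof(char) + sizeof(int) + sizeof(char));
}
```

Calling padding_bytes() on a given target shows how much space alignment costs for this particular field order; reordering the fields so the two chars sit together typically shrinks it.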
That predictability is the reason C became the substrate. Operating systems, device drivers, language runtimes (CPython and Ruby's reference interpreter are written in C; the HotSpot JVM in C++, which shares the same machine model), databases, embedded firmware - all the software that has to know exactly where its bytes are - converged on C because C is the language that lets you say exactly where your bytes are. And once enough of the world's foundational code was C, the C ABI - the calling convention, the struct layout rules, the unmangled symbol names that C++ exposes through extern "C" - became the universal interop layer. Every language here speaks it. Hare, Zig, and Odin can all call libc directly; C++ is a near-superset; even Forth systems bind to C libraries. The lingua franca status is self-reinforcing: C is the language other languages have to talk to, so it never goes away.
The machine model: a flat array of bytes
Here is the entire C memory model in one sentence: memory is one big array of bytes, and a pointer is an index into it. Everything else - structs, arrays, the heap, the stack - is a way of carving up that one array.
#include <stdio.h>
#include <stdint.h>
int main(void) {
    int x = 0x01020304;
    /* A pointer is an address; a byte pointer lets us read the raw bytes. */
    unsigned char *p = (unsigned char *)&x;
    /* Walk the object byte by byte. The order printed reveals endianness:
       little-endian machines store the low byte first (04 03 02 01). */
    for (size_t i = 0; i < sizeof x; i++)
        printf("%02x ", p[i]);
    printf("\n");
    return 0;
}
This is the move that makes C C. You can take the address of any object, reinterpret it as bytes, and walk it. The language does not hide the representation; it hands it to you. That is enormous power and the source of half of C's danger, and every language after it had to decide how much of that power to keep.
C draws three storage regions on top of the flat array, and the distinction between them is the first thing a systems programmer internalizes:
- Static storage - globals and static locals. Allocated once at program start, alive for the whole run, addresses fixed at link time.
- Automatic storage - ordinary local variables. They live on the stack: created when a function is entered, destroyed when it returns, for free, no bookkeeping. This is the cheapest memory in the language.
- Allocated (dynamic) storage - the heap, reached through malloc and friends. This is the only memory whose lifetime you control, byte for byte, and therefore the only memory you can leak, double-free, or use after freeing.
#include <stdlib.h>
int g;                          /* static storage: lives the whole program */
void f(void) {
    int local = 1;              /* automatic: born here, dies at the `}` */
    static int counter;         /* static storage, function-local name */
    int *heap = malloc(16);     /* dynamic: YOU decide when it dies */
    /* ... */
    free(heap);                 /* ...and if you forget this line, it leaks */
}
The whole drama of manual memory management lives in that third bullet. The stack frees itself; static storage never needs freeing; the heap is the one place where the question "who frees this, and when?" has no built-in answer. Every language in this collection is, fundamentally, a different answer to that one question - and they all inherited the question from C.
Pointers: the load-bearing abstraction
A pointer is C's central idea and its most dangerous one. It is "the address of something," and from that one concept C derives arrays, strings, dynamic data structures, function callbacks, and the entire heap interface. But a C pointer is a thin abstraction: it is an address and a static type, and almost nothing else. It does not know whether the thing it points to is still alive. It does not know how many elements follow it. It does not know if it is null until you check. The discipline of C is the discipline of supplying, in your head and in your conventions, all the information the pointer itself does not carry.
Pointer arithmetic and the array/pointer duality
The deepest C quirk - the one that confuses every newcomer and that every successor language deliberately fixed - is that arrays are not pointers, but they decay into pointers at the slightest provocation.
#include <stdio.h>
void print_all(int *a, size_t n) {  /* `a` is a pointer; the length is GONE */
    for (size_t i = 0; i < n; i++)
        printf("%d ", a[i]);        /* a[i] is literally *(a + i) */
    printf("\n");
}
int main(void) {
    int nums[4] = {10, 20, 30, 40};
    printf("sizeof nums = %zu\n", sizeof nums); /* the WHOLE array: 16 with a 4-byte int */
    print_all(nums, 4);                         /* `nums` decays to &nums[0] */
    /* Inside print_all, sizeof a would be the size of a pointer
       (8 on a typical 64-bit target), not 16.
       The length did not survive the call. You must pass it by hand. */
    return 0;
}
a[i] is defined as *(a + i). The subscript operator is pure sugar over pointer arithmetic - which is why the famous oddity arr[3] == 3[arr] is true: both mean *(arr + 3). Pointer arithmetic is scaled by the pointed-to type (a + 1 advances by sizeof(int), not one byte), so the same machine model is reached through array notation or pointer notation interchangeably.
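The scaling rule and the subscript identity are both easy to observe directly. This is a small sketch; the two function names are hypothetical and exist only to make the claims checkable.

```c
#include <stddef.h>   /* size_t */

/* Byte distance between a and a + 1. Pointer arithmetic is scaled by
   the pointed-to type, so this returns sizeof(int), not 1. Forming
   a + 1 is valid for any array of at least one element. */
size_t stride_in_bytes(const int *a) {
    return (size_t)((const char *)(a + 1) - (const char *)a);
}

/* a[i], *(a + i), and i[a] are three spellings of the same access.
   The caller must ensure i is within the array. */
int all_spellings_agree(const int *a, size_t i) {
    return a[i] == *(a + i) && a[i] == i[a];
}
```

With int nums[4] = {10, 20, 30, 40}, stride_in_bytes(nums) equals sizeof(int) and all_spellings_agree(nums, 3) is nonzero.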
And here is the consequence that defines so much of C's bug surface: when an array is passed to a function, its length is not. The array decays to a bare pointer, sizeof inside the callee measures the pointer, and the count has to travel as a separate argument by convention. Nothing checks that the convention is honored. This single design fact - pointers that don't carry their length - is the direct ancestor of the buffer overflow, and it is precisely what slices fix in the languages that came later.
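What a slice adds can be approximated by hand in C itself: bundle the pointer with its count and check the index before every access. The sketch below is one possible shape, not a standard facility; struct int_slice, slice_get, and slice_sum are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical "fat pointer": the length travels with the data. */
struct int_slice {
    const int *data;
    size_t     len;
};

/* Bounds-checked read: returns false instead of reading out of range. */
bool slice_get(struct int_slice s, size_t i, int *out) {
    if (i >= s.len)
        return false;
    *out = s.data[i];
    return true;
}

/* Sums every element; no separate length argument is needed. */
long slice_sum(struct int_slice s) {
    long total = 0;
    for (size_t i = 0; i < s.len; i++)
        total += s.data[i];
    return total;
}
```

The difference from a real slice type is that nothing forces callers to go through slice_get; the raw data pointer is still there, and the convention is still only a convention.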
The null pointer and "no value here"
C represents "no object" with the null pointer, classically the macro NULL. Dereferencing it is undefined behavior - usually a crash, sometimes worse - and forgetting to check for it is a perennial bug. C23 finally added the typed nullptr constant, but the underlying idea is the same one every language inherited: a pointer can mean "nothing," and you must check before you trust it.
char *dup = malloc(len);
if (dup == NULL) {   /* malloc CAN fail; the contract is on you */
    /* out of memory - handle it, don't dereference */
    return -1;
}
The standard library: small on purpose, sharp on purpose
C's standard library is famously minimal. There are no built-in dynamic arrays, no hash maps, no strings worth the name - just a handful of headers covering I/O (<stdio.h>), memory and conversions (<stdlib.h>), raw memory operations (<string.h>), math, and a few more. This is a feature, not a gap: a tiny library is a tiny runtime, and a tiny runtime is what lets C boot a kernel where no library exists yet.
For our purposes the library's beating heart is four functions in <stdlib.h> - the entire manual-heap interface that the rest of computing borrowed:
#include <stdlib.h>
#include <string.h>
void *malloc(size_t size); /* size bytes, UNINITIALIZED */
void *calloc(size_t n, size_t size); /* n*size bytes, ZEROED, overflow-checked */
void *realloc(void *p, size_t newsize); /* grow/shrink; MAY move the block */
void free(void *p); /* release; free(NULL) is a safe no-op */
Three subtleties in this interface have bitten every C programmer and shaped every successor:
- malloc does not zero memory. The bytes you get are whatever was there before, and reading them before writing yields indeterminate values (and can be undefined behavior). calloc zeros (and, in modern implementations, checks n * size for overflow, which naive malloc(n * size) does not).
- realloc may move the block. It can return a different pointer, copying your data to a new location and invalidating every old pointer into the buffer. The idiom p = realloc(p, n); is itself a classic leak bug: if realloc fails it returns NULL without freeing the old block, so you must catch the result in a temporary first.
- free does not null your pointer. After free(p), p still holds the old address - now a dangling pointer. Use it and you have use-after-free, one of the most exploited bugs in software. The language will not stop you; it does not even know the memory is dead.
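The realloc and free subtleties above are usually handled with two small conventions. The helpers below are a sketch of those conventions under stated assumptions: grow_ints and free_and_null are hypothetical names, and treating n == 0 as an error is a choice made here because realloc with a zero size is not portable.

```c
#include <stdint.h>   /* SIZE_MAX */
#include <stdlib.h>   /* realloc, free */

/* Hypothetical helper: resize *buf to hold n ints. Returns 0 on
   success, -1 on failure. On failure the old block is left intact
   and is still owned by the caller, so nothing leaks. */
int grow_ints(int **buf, size_t n) {
    if (n == 0 || n > SIZE_MAX / sizeof **buf)  /* n * size would overflow */
        return -1;
    int *tmp = realloc(*buf, n * sizeof **buf); /* catch result in a temporary */
    if (tmp == NULL)
        return -1;
    *buf = tmp;                                 /* commit only on success */
    return 0;
}

/* Hypothetical helper: free the block and clear the caller's pointer.
   This clears only this one copy; other aliases still dangle. */
void free_and_null(int **buf) {
    free(*buf);
    *buf = NULL;
}
```

Nulling after free turns a later accidental dereference into a null dereference, which typically crashes immediately rather than silently reading reclaimed memory. It is a mitigation, not a fix: any second pointer to the same block is untouched.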
The other library quirk worth naming is errno: many functions report failure by setting what looks like a global integer (thread-local in modern implementations) that you have to check separately, a pre-exceptions, pre-error-union convention that survives because the ABI froze around it. And C strings are not a type at all - they are "a char * pointing at bytes terminated by a \0," which means every string operation is a pointer walk looking for a zero, and forgetting the terminator (or the room for it) is its own family of overruns. <string.h>'s memcpy, memmove, and memset are the raw tools, and the distinction between memcpy (overlapping regions are UB) and memmove (overlap is safe) is exactly the kind of sharp edge the small library leaves for you to know about.
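Both conventions are easier to see in use. The sketch below shows the errno protocol around strtol and an overlapping copy that needs memmove; parse_long and shift_right_one are hypothetical helper names, and the rejection of trailing characters is a policy chosen for this example.

```c
#include <errno.h>    /* errno, ERANGE */
#include <stdlib.h>   /* strtol */
#include <string.h>   /* memmove */

/* Hypothetical helper: parse a base-10 long. Returns 0 and stores the
   value on success, -1 on range error, empty input, or trailing junk. */
int parse_long(const char *s, long *out) {
    char *end;
    errno = 0;                      /* clear first: success does not reset it */
    long v = strtol(s, &end, 10);
    if (errno == ERANGE)            /* value does not fit in a long */
        return -1;
    if (end == s || *end != '\0')   /* no digits, or characters left over */
        return -1;
    *out = v;
    return 0;
}

/* Shift len bytes one position to the right inside the same buffer.
   Source and destination overlap, so memmove is required; memcpy here
   would be undefined behavior. buf must have room for len + 1 bytes. */
void shift_right_one(char *buf, size_t len) {
    memmove(buf + 1, buf, len);
}
```

Given char buf[8] = "abc", calling shift_right_one(buf, 4) moves the three letters and the terminator, leaving the string "aabc".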
The enduring quirk: undefined behavior
If pointers are C's central idea, undefined behavior (UB) is its central bargain. The standard divides "things the program might do wrong" into categories, and the harshest is UB: constructs for which the standard imposes no requirements whatsoever. Out-of-bounds access, use-after-free, signed integer overflow, reading uninitialized memory, dereferencing null, data races - all undefined. The compiler is permitted to assume they never happen, and to optimize on that assumption.
#include <limits.h>
int will_overflow(int x) {
    /* Signed overflow is UB, so the compiler may ASSUME x + 1 > x always.
       It can legally delete the overflow check below as "unreachable." */
    if (x + 1 < x) return -1;   /* may be optimized away entirely */
    return x + 1;
}
int dangling(void) {
    int *p;
    {
        int local = 42;
        p = &local;             /* p points into a frame about to vanish */
    }                           /* `local` is dead here */
    return *p;                  /* UB: use of a dangling pointer */
}
This is the bargain that bought C its speed: by not mandating bounds checks, overflow checks, or initialization, C lets the compiler generate code as tight as hand-written assembly. The cost is that a single mistake doesn't produce a clean error - it produces a program whose behavior the standard refuses to define, which in practice means crashes, silent corruption, or security holes that surface months later. UB is why C needs sanitizers (ASan, UBSan), static analyzers, and valgrind: the language deliberately moved safety out of the runtime, so the tooling has to put it back. Every "safer C" in this collection is, at bottom, an attempt to keep C's performance while shrinking C's UB surface.
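The overflow check that the compiler may delete has a well-defined counterpart: test against the limit before doing the arithmetic, so the overflowing expression is never evaluated. A minimal sketch, with checked_increment and checked_add as hypothetical names:

```c
#include <limits.h>    /* INT_MAX, INT_MIN */
#include <stdbool.h>

/* Stores x + 1 in *out and returns true, or returns false if the
   increment would overflow. The comparison happens before the add,
   so no undefined arithmetic is ever performed. */
bool checked_increment(int x, int *out) {
    if (x == INT_MAX)
        return false;
    *out = x + 1;
    return true;
}

/* The same idea for general addition: each guard rearranges the
   inequality so that its own subtraction cannot overflow. */
bool checked_add(int a, int b, int *out) {
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *out = a + b;
    return true;
}
```

Because every expression here is defined for all inputs, the compiler has no license to remove the checks. Some compilers also offer overflow-checking builtins, and C23 adds <stdckdint.h>, but the comparison form works in any C version.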
A grab-bag of C's other enduring quirks, each of which a later language consciously reversed:
- Integer promotions and implicit conversions. Small types get promoted to int in expressions; signed and unsigned mix silently; narrowing happens without a warning by default. Weak typing is fast and surprising.
- restrict (C99). A promise to the compiler that two pointers don't alias, so it can optimize as if they're independent. Break the promise and you get UB. It exists because arbitrary aliasing - any pointer might point at any object - otherwise blocks optimization.
- volatile. Tells the compiler a variable may change outside the program's control (memory-mapped hardware, a signal handler), so reads and writes can't be optimized away. Essential for embedded work, widely misunderstood elsewhere.
- The preprocessor. #include is literal text inclusion; macros are textual substitution with no type awareness. It is a separate, weaker language stapled on top, and a fertile source of subtle bugs.
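Two of these quirks fit in a few lines each. The sketch below illustrates the conversion rules and the macro-substitution hazard; all function and macro names are hypothetical, and the narrowed result assumes the usual 8-bit unsigned char.

```c
/* Usual arithmetic conversions: the int is converted to unsigned
   before the comparison, so -1 becomes UINT_MAX and the test is
   false. Returns 0. */
int minus_one_is_less_than_one(void) {
    int s = -1;
    unsigned int u = 1u;
    return s < u;
}

/* Integer promotion: both operands are promoted to int before the
   add, so the sum is computed at int width. */
int promoted_sum(unsigned char a, unsigned char b) {
    return a + b;
}

/* Narrowing back to unsigned char reduces the value modulo 256
   (assuming 8-bit chars). */
unsigned char narrowed_sum(unsigned char a, unsigned char b) {
    return (unsigned char)(a + b);
}

/* Textual substitution ignores precedence unless the macro
   parenthesizes its argument and its result. */
#define SQUARE_UNSAFE(x) x * x
#define SQUARE(x) ((x) * (x))

int unsafe_square_of_1_plus_2(void) { return SQUARE_UNSAFE(1 + 2); } /* 1 + 2 * 1 + 2 */
int safe_square_of_1_plus_2(void)   { return SQUARE(1 + 2); }        /* (1 + 2) * (1 + 2) */
```

Compilers can warn about the signed/unsigned comparison when asked, which is the tooling putting back a check the language leaves out. Even the parenthesized SQUARE still evaluates its argument twice, so SQUARE(i++) remains a bug.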
How the others answered C
C set the questions. The rest of this collection is the answer sheet, and reading the answers backwards is the clearest way to see what C actually is.
C++ keeps C's machine model almost entirely - the same flat memory, the same pointers, the same malloc if you want it - and adds the destructor. With RAII, a resource's lifetime rides on an object's scope, so the heap allocation a std::unique_ptr owns is freed deterministically when the pointer goes out of scope. C's unanswered "who frees this?" gets a structural answer: the object that owns it, automatically, at scope exit.
#include <memory>
#include <vector>
void example() {
    auto p = std::make_unique<int[]>(16);  // heap, owned by `p`
    std::vector<int> v;                    // a real dynamic array, length carried
    v.push_back(1);
    // No free, no delete: ~unique_ptr and ~vector run at the closing brace.
}                                          // deterministic, no GC
Zig keeps the manual heap but makes the allocator an explicit argument and replaces destructors with defer. C's invisible malloc becomes a visible parameter; C's "remember to free" becomes a defer sitting right next to the allocation.
const std = @import("std");
pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();            // reports leaks at shutdown
    const a = gpa.allocator();
    const buf = try a.alloc(u8, 16);   // allocation can fail: `try`
    defer a.free(buf);                 // cleanup lives beside the alloc
    buf[0] = 42;
}
Hare is the most deliberately C-like of the moderns: a single global heap reached by built-in alloc/free, a tiny runtime, no GC. Its fix for C's biggest pointer wound is the slice - a pointer bundled with its length, so the size travels with the data and the "array decays, length is lost" bug simply cannot happen.
use fmt;
export fn main() void = {
    let buf: []u8 = alloc([0u8...], 16)!;  // a slice: pointer + length together
    defer free(buf);                       // paired cleanup, C-style discipline
    buf[0] = 42;
    fmt::printfln("len carries with the data: {}", len(buf))!;
};
Odin keeps manual, GC-free memory too, but threads an implicit context.allocator through every procedure, so you can redirect all allocation in a region to an arena without touching call sites - making C's per-object free into per-phase bulk reclamation.
package main
import "core:fmt"
main :: proc() {
    buf := make([]u8, 16)   // uses context.allocator
    defer delete(buf)       // freed via the same allocator
    buf[0] = 42
    fmt.println(len(buf))   // length carried, like Hare's slices
}
Forth predates C and answers nothing - it asks its own questions. Its native memory model is a bump allocator: a pointer called HERE that ALLOT advances. Dynamic ALLOCATE/FREE (C-style heap) is an optional add-on; the always-present model is the arena that C itself has never standardized.
\ Forth's data space is a bump allocator; ALLOT advances HERE.
CREATE BUF 16 ALLOT \ reserve 16 bytes in the dictionary
42 BUF C! \ store one byte
BUF C@ . \ fetch and print it
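For readers who think in C, the HERE/ALLOT model can be sketched as a bump allocator over a fixed buffer. This is an illustration of the idea, not of any Forth system's internals: data_space, here, bump_alloc, and bump_reset are hypothetical names, the 1024-byte size is arbitrary, and the code assumes C11 for _Alignas and max_align_t. Unlike ALLOT, it rounds each request up so that returned pointers are suitably aligned for any C object.

```c
#include <stddef.h>   /* size_t, max_align_t */

/* Hypothetical fixed-size data space, aligned for any object type. */
static _Alignas(max_align_t) unsigned char data_space[1024];
static size_t here = 0;   /* offset of the next free byte, like HERE */

/* Reserve n bytes and return their address, or NULL when the space is
   exhausted. `here` and the buffer size are both multiples of the
   alignment, so rounding n up can never run past the end. */
void *bump_alloc(size_t n) {
    const size_t align = _Alignof(max_align_t);
    size_t space = sizeof data_space - here;
    if (n > space)
        return NULL;
    void *p = &data_space[here];
    here += (n + align - 1) / align * align;   /* advance, like ALLOT */
    return p;
}

/* There is no per-object free: the only release is all at once. */
void bump_reset(void) {
    here = 0;
}
```

Allocation is one comparison and one addition, and cleanup is a single assignment, which is the trade the arena style makes: individual lifetimes are given up in exchange for bulk reclamation.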
HolyC - Terry A. Davis's dialect, the native language and shell of his single-developer operating system TempleOS - is C reimagined for a machine with no memory protection at all. Everything ran in 64-bit ring 0 in one flat address space, the purest possible expression of C's "memory is one array of bytes" model, with every guardrail removed. Its MAlloc/Free mirror malloc/free closely: MAlloc returns uninitialized bytes just as malloc does, and the zeroing variant is the aptly named CAlloc, HolyC's calloc. Its one thoughtful twist is that memory comes from a per-task heap that is reclaimed automatically when the task dies: a coarse, automatic arena at the granularity of a whole task, arrived at from an entirely different motivation than the modern allocator crowd.
// HolyC: MAlloc is uninitialized like C's malloc; CAlloc zeroes (like calloc).
// Free(NULL) is a safe no-op. Memory comes from the task's heap, reclaimed
// wholesale when the task ends.
U8 *p = CAlloc(16); // 16 ZEROED bytes from this task's heap (MAlloc wouldn't zero)
p[0] = 42;
Print("%d\n", p[0]);
Free(p); // explicit, but task death would reclaim it anyway
The foundation that stayed
The remarkable thing about C is not that it has lasted, but why. It is not a beautiful language by modern lights - the array decay, the null terminator, the global errno, the undefined behavior minefield are real wounds, and every successor here is partly defined by which wound it set out to heal. C lasted because it did one thing first and definitively: it gave programmers a portable, predictable, almost-transparent picture of the machine's memory, and then it got out of the way.
That picture - flat bytes, raw pointers, three storage regions, a heap you own outright, and a contract you must keep yourself - is the shared vocabulary of every language in this collection. C++ refines it with destructors, Zig and Hare and Odin sharpen its tools and surface its costs, Forth predates it, HolyC strips it to ring 0. But they are all speaking dialects of the same underlying language about memory, and that language is C's. The foundation everything sits on is not C the syntax. It is C's model of where your bytes live - and fifty years on, we are all still building on it.