Concurrency & Threads
The same job across all seven systems languages: spawn a thread (or two), have it compute a sum, then join and read the result. With no garbage collector and a shared address space, the real question is how data crosses the thread boundary: almost always as an explicit pointer into memory that outlives the thread and is freed only after join. Compare C pthreads, C++ std::thread (RAII + join-or-terminate), Zig std.Thread (stack size set explicitly via SpawnConfig), Odin core:thread, and Hare's bare-bones clone(2) (its stdlib still ships no thread module), then note that HolyC is cooperatively multitasked (Spawn/Yield, no preemption) and Forth has no threads at all without an extension.
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
/* The thread function takes/returns void*; to hand data in or results
 * out you pass POINTERS, so the heap (or stack of a still-live frame)
 * is the shared channel. Here each worker owns one heap-allocated Work. */
typedef struct { long n, sum; } Work;
static void *worker(void *arg) {
    Work *w = arg;  /* the void* we passed to pthread_create */
    w->sum = 0;
    for (long i = 1; i <= w->n; i++)  /* sum 1..n */
        w->sum += i;
    return NULL;  /* could also return a heap pointer */
}
int main(void) {
    pthread_t t1, t2;
    Work *a = malloc(sizeof *a);  /* heap so both threads see live storage */
    Work *b = malloc(sizeof *b);
    if (!a || !b) { free(a); free(b); return 1; }
    a->n = 1000; b->n = 2000;
    if (pthread_create(&t1, NULL, worker, a) != 0) {  /* spawn; 'a' is shared with t1 */
        free(a); free(b); return 1;
    }
    if (pthread_create(&t2, NULL, worker, b) != 0) {
        pthread_join(t1, NULL);  /* t1 still uses 'a'; wait before freeing */
        free(a); free(b); return 1;
    }
    pthread_join(t1, NULL);  /* block until t1 finishes... */
    pthread_join(t2, NULL);  /* ...then t2; now results are visible */
    printf("%ld %ld\n", a->sum, b->sum);  /* 500500 2001000 */
    free(a);  /* main owns the buffers; free after join */
    free(b);
    return 0;
}
POSIX threads share the process address space, so data crosses the boundary through void* pointers: here each worker gets a heap Work* that stays alive across the spawn. pthread_join blocks until the thread exits and establishes the happens-before edge that makes the writes visible, so main only frees the buffers after joining. Link with -pthread.
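The worker's comment notes that it could also return a heap pointer. A minimal sketch of that variant follows, assuming the hypothetical names sum_worker and sum_in_thread (neither comes from the original example): the worker mallocs its own result, and pthread_join hands the pointer back through its second argument, so ownership passes to whoever joins.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical worker: n arrives packed into the void* argument, and the
 * result leaves as a heap-allocated long (NULL if malloc fails). */
static void *sum_worker(void *arg) {
    long n = (long)(intptr_t)arg;
    long *out = malloc(sizeof *out);
    if (!out)
        return NULL;
    *out = 0;
    for (long i = 1; i <= n; i++)   /* sum 1..n */
        *out += i;
    return out;                     /* ownership passes to the joiner */
}

/* Hypothetical helper: run sum_worker on its own thread and collect the
 * result via pthread_join's second argument. Returns 0 on success. */
int sum_in_thread(long n, long *result) {
    assert(result != NULL);
    pthread_t t;
    void *ret = NULL;
    if (pthread_create(&t, NULL, sum_worker, (void *)(intptr_t)n) != 0)
        return -1;
    if (pthread_join(t, &ret) != 0) /* ret receives the worker's return value */
        return -1;
    if (!ret)
        return -1;                  /* worker could not allocate */
    *result = *(long *)ret;
    free(ret);                      /* the joiner frees what the worker allocated */
    return 0;
}
```

Calling sum_in_thread(1000, &r) should leave 500500 in r. The trade-off against the shared Work struct is that the worker, not main, decides where the result lives, so the joiner has to remember to free it.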
#include <iostream>
#include <memory>
#include <thread>
// std::thread runs any callable; arguments are COPIED into the thread
// unless you opt into std::ref. Each lambda copies a raw pointer to a
// unique_ptr's payload and writes through it -- ownership stays in main.
long sum_to(long n) {  // pure compute, returned by value
    long s = 0;
    for (long i = 1; i <= n; ++i) s += i;
    return s;
}
int main() {
    auto r1 = std::make_unique<long>(0);  // results live on the heap, owned by main
    auto r2 = std::make_unique<long>(0);
    // Spawn two threads; each lambda init-captures the raw pointer by value.
    std::thread t1([p = r1.get()] { *p = sum_to(1000); });
    std::thread t2([p = r2.get()] { *p = sum_to(2000); });
    t1.join();  // must join (or detach) before ~thread,
    t2.join();  // else terminate(); join syncs the writes
    std::cout << *r1 << ' ' << *r2 << '\n';  // 500500 2001000
    return 0;  // unique_ptrs free the longs (RAII)
}
std::thread takes any callable and copies its arguments by default, so to share storage you pass a pointer/std::ref; here the lambdas write through raw pointers borrowed from unique_ptrs that main still owns (RAII frees them). A std::thread must be join()ed or detach()ed before its destructor runs or the program calls std::terminate(); join() also synchronizes the worker's writes. (C++20's std::jthread joins automatically in its destructor.)
// HolyC/TempleOS is cooperatively multitasked: you spawn a "task" with
// Spawn(fp, data), and it runs until it yields. There is no
// preemption and no memory protection, so all tasks share ring-0 memory
// directly -- the data pointer IS the shared channel, no copy is made.
class Work { I64 n, sum; };
U0 Worker(Work *w)  // a task entry point: takes the data pointer
{
    I64 i, s = 0;
    for (i = 1; i <= w->n; i++)  // sum 1..n
        s += i;
    w->sum = s;  // write the result straight into shared memory
}
Work *a = MAlloc(sizeof(Work));  // heap shared with the spawned task
a->n = 1000; a->sum = 0;
CTask *t = Spawn(&Worker, a);  // create + start a task running Worker(a)
while (TaskValidate(t))  // no real join: poll until the task dies
    Yield;  // cooperatively give it CPU time
Print("%d\n", a->sum);  // 500500
Free(a);
TempleOS has no pthread-style preemptive threads: it is cooperatively scheduled, so you Spawn a task and it must Yield for others to run. With no memory protection every task shares ring-0 memory, so the MAlloc'd Work pointer is passed straight through with no copy and written in place. There is no true join; the idiom is to poll TaskValidate (or use a shared flag) until the task exits, then Free the buffer.
const std = @import("std");
// std.Thread.spawn takes a config, a function, and a tuple of args that
// are passed by value -- to mutate shared state you pass a pointer. The
// thread's stack size is set explicitly through SpawnConfig.
fn worker(n: u64, out: *u64) void {
    var s: u64 = 0;
    var i: u64 = 1;
    while (i <= n) : (i += 1) s += i;  // sum 1..n
    out.* = s;  // write result through the pointer
}
pub fn main() !void {
    var r1: u64 = 0;
    var r2: u64 = 0;
    // spawn returns !std.Thread; args is a tuple matching worker's params.
    const t1 = try std.Thread.spawn(.{}, worker, .{ 1000, &r1 });
    const t2 = try std.Thread.spawn(.{}, worker, .{ 2000, &r2 });
    t1.join();  // blocks until the thread returns; no error to handle
    t2.join();
    std.debug.print("{d} {d}\n", .{ r1, r2 });  // 500500 2001000
}
std.Thread.spawn(config, func, args_tuple) returns !std.Thread; the args tuple is passed by value, so shared mutable state crosses as an explicit pointer (&r1) into storage in main's stack frame, which stays live because main joins before returning. The worker's stack size is set through SpawnConfig, and join() blocks until the thread returns and makes its writes visible. (std.Thread.Mutex guards data that is touched concurrently.)
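For comparison with the pthreads example, here is a rough C sketch of what guarding concurrently touched data looks like: two workers add into one shared total under a pthread_mutex_t. The names Shared, Job, add_range and locked_total are illustrative assumptions, not part of any library, and locking on every addition is deliberately naive to keep the contended access visible.

```c
#include <assert.h>
#include <pthread.h>

typedef struct {
    pthread_mutex_t lock;   /* guards total */
    long total;
} Shared;

typedef struct {
    Shared *shared;         /* both jobs point at the same Shared */
    long n;
} Job;

/* Adds 1..n into the shared total, taking the lock for each update. */
static void *add_range(void *arg) {
    Job *job = arg;
    for (long i = 1; i <= job->n; i++) {
        pthread_mutex_lock(&job->shared->lock);
        job->shared->total += i;
        pthread_mutex_unlock(&job->shared->lock);
    }
    return NULL;
}

/* Hypothetical helper: returns sum(1..n1) + sum(1..n2), or -1 on failure. */
long locked_total(long n1, long n2) {
    assert(n1 >= 0 && n2 >= 0);
    Shared s = { .total = 0 };
    if (pthread_mutex_init(&s.lock, NULL) != 0)
        return -1;
    Job j1 = { &s, n1 }, j2 = { &s, n2 };
    pthread_t t1, t2;
    long result = -1;
    if (pthread_create(&t1, NULL, add_range, &j1) == 0) {
        int second_ok = pthread_create(&t2, NULL, add_range, &j2) == 0;
        pthread_join(t1, NULL);         /* always wait for t1 before s dies */
        if (second_ok) {
            pthread_join(t2, NULL);
            result = s.total;           /* safe to read: both workers joined */
        }
    }
    pthread_mutex_destroy(&s.lock);
    return result;
}
```

With the inputs used throughout this page, locked_total(1000, 2000) should come to 2501500. In real code each worker would more likely accumulate locally and take the lock once at the end.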
use fmt;
// Hare's standard library has NO threading module yet -- there is no
// pthread-style create/join wrapper. The only primitive is the raw Linux
// clone(2) syscall in sys::linux:
//
//     use sys::linux;
//     // share the address space (CLONE_VM) so memory crosses the boundary;
//     // the child continues inline from the call (fork-style, not an fp entry):
//     let stack = alloc([0u8...], 64 * 1024)!;  // manual child stack you own
//     match (sys::linux::clone(&stack[len(stack)], flags, null, &ctid, 0)) {
//     case let pid: int => /* parent: futex-wait on ctid to "join" */;
//     case void => /* child: do work, then sys::linux::exit(0) */;
//     case let e: errno => /* handle */;
//     };
//
// That is verbose and Linux-only, so the honest minimal answer is the
// single-task computation -- the shared-memory idiom is still a pointer
// into storage main owns.
type work = struct { n: u64, sum: u64 };
fn worker(w: *work) void = {
    let s = 0u64;
    for (let i = 1u64; i <= w.n; i += 1) {
        s += i;  // sum 1..n
    };
    w.sum = s;  // write result through the pointer
};
export fn main() void = {
    let a = alloc(work { n = 1000, sum = 0 })!;  // heap, owned by main
    defer free(a);  // reclaimed after use
    worker(a);  // (would be the cloned child)
    fmt::printfln("{}", a.sum)!;  // 500500
};
Unlike the others, Hare's standard library currently ships no thread module -- there is no unix::thread, no create/join, not even a mutex. The only mechanism is the raw Linux clone(2) syscall exposed as sys::linux::clone(stack, flags, parent_tid, child_tid, tls), which (like fork) returns the new tid to the parent and void to the child, which continues inline; you share state by passing CLONE_VM and "join" by futex-waiting on the tid the kernel clears via CLONE_CHILD_CLEARTID. Because that is verbose and Linux-only, the honest minimal snippet runs the computation as a single task: the cross-boundary idiom is still an alloc'd pointer that main owns and frees with defer free(a). Note that alloc(...)! asserts success and aborts on nomem; the ? operator would propagate it instead.
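To make the clone(2) description more concrete, here is a hedged, Linux-only C sketch of the same idea using glibc's clone() wrapper. It differs from the raw syscall described above in two ways: glibc's wrapper starts the child at a function pointer instead of continuing inline, and this sketch "joins" with waitpid (enabled by passing SIGCHLD) instead of a futex wait on CLONE_CHILD_CLEARTID. The names child_main and sum_via_clone are hypothetical.

```c
#define _GNU_SOURCE          /* must precede the includes to expose clone() */
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

typedef struct { long n, sum; } Work;

/* Child entry point: pure computation, writes through the shared pointer. */
static int child_main(void *arg) {
    Work *w = arg;
    long s = 0;
    for (long i = 1; i <= w->n; i++)    /* sum 1..n */
        s += i;
    w->sum = s;                         /* visible to the parent: same address space */
    return 0;                           /* becomes the child's exit status */
}

/* Hypothetical helper: run child_main in a CLONE_VM child on a stack we own. */
int sum_via_clone(Work *w) {
    assert(w != NULL);
    const size_t stack_size = 64 * 1024;
    char *stack = malloc(stack_size);   /* manual child stack, owned by the parent */
    if (!stack)
        return -1;
    /* Stacks grow down on the usual Linux targets, so pass the TOP of the buffer. */
    pid_t pid = clone(child_main, stack + stack_size, CLONE_VM | SIGCHLD, w);
    if (pid == -1) {
        free(stack);
        return -1;
    }
    int status = 0;
    pid_t got = waitpid(pid, &status, 0);   /* the "join" */
    free(stack);                            /* child has exited; stack is unused */
    if (got != pid || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return 0;
}
```

After a successful sum_via_clone on a Work with n = 1000, w.sum should hold 500500. The child shares the parent's thread-local state here, so it should stick to plain computation and avoid most libc calls.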
package main
import "core:fmt"
import "core:thread"
// core:thread wraps OS threads. A ^thread.Thread carries user_index/data
// fields, but the idiomatic way to share is a pointer in a struct you own
// (memory routes through the implicit context.allocator).
Work :: struct { n: int, sum: int }
worker :: proc(t: ^thread.Thread) {
    w := cast(^Work)t.data  // the pointer we stashed before starting
    s := 0
    for i in 1..=w.n do s += i  // sum 1..n
    w.sum = s  // write result into shared memory
}
main :: proc() {
    a := new(Work); a.n = 1000  // heap via context.allocator
    b := new(Work); b.n = 2000
    defer free(a)  // main owns the buffers
    defer free(b)
    t1 := thread.create(worker); t1.data = a  // create paused...
    t2 := thread.create(worker); t2.data = b
    defer thread.destroy(t1)
    defer thread.destroy(t2)
    thread.start(t1); thread.start(t2)  // ...then start
    thread.join(t1); thread.join(t2)  // block until both finish
    fmt.println(a.sum, b.sum)  // 500500 2001000
}
Odin's core:thread creates a ^thread.Thread (initially paused), into which you stash a data pointer to a struct allocated via the implicit context.allocator; start runs it and join blocks until it exits. Sharing is explicit pointers -- new(Work) is owned by main, freed with defer free, and the worker writes through cast(^Work)t.data. thread.destroy reclaims the thread object itself after the join.
\ Standard ANS/Forth has NO portable threading model -- the language is
\ deliberately tiny and single-task. Concurrency is an environmental
\ extension: cooperative MULTITASKER words (PAUSE/ACTIVATE) on classic
\ systems like Gforth/SwiftForth, or raw OS threads via an FFI. Below is
\ the closest idiom: a cooperative task that yields, sharing one cell.
VARIABLE result \ a single shared cell (the "channel")
: sum-to ( n -- )  \ sum 1..n into 'result'
  0 swap  ( 0 n )
  1+ 1 ?DO I + LOOP  \ accumulate 1..n
  result ! ;  \ store the total into the shared variable
\ On a multitasker you would: build a task, ACTIVATE it on sum-to,
\ then PAUSE in a loop until it signals done -- cooperative, not
\ preemptive, so the worker must PAUSE to let others run. With no such
\ extension this simply runs inline (one task), which is the honest answer:
1000 sum-to
result @ . \ 500500
Standard Forth has no built-in threads -- it is single-task by design -- so concurrency is always an extension: a cooperative MULTITASKER (PAUSE/ACTIVATE/STOP, as in Gforth or SwiftForth) or OS threads bolted on via an FFI. The closest portable idiom is a shared VARIABLE used as the channel plus a cooperative task that must PAUSE to yield; with no extension present the word simply runs inline as the single task, which is shown here. There is no join -- you spin on a shared done-flag instead.
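The done-flag idiom mentioned for HolyC and Forth can be sketched in C, this time with preemptive threads and a C11 atomic flag. This is only an illustration under assumptions: Task, task_body and sum_with_done_flag are hypothetical names, and sched_yield stands in for the cooperative Yield/PAUSE. The release store paired with the acquire load is what makes the worker's write to sum visible once the flag reads true.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    long n, sum;
    atomic_bool done;       /* the shared "finished" signal */
} Task;

static void *task_body(void *arg) {
    Task *t = arg;
    long s = 0;
    for (long i = 1; i <= t->n; i++)    /* sum 1..n */
        s += i;
    t->sum = s;
    /* Publish the result: release pairs with the acquire load in the poller. */
    atomic_store_explicit(&t->done, true, memory_order_release);
    return NULL;
}

/* Hypothetical helper: spawn the task, then poll the flag instead of
 * relying on join to learn that the result is ready. Returns 0 on success. */
int sum_with_done_flag(long n, long *result) {
    assert(result != NULL);
    Task t = { .n = n, .sum = 0 };
    atomic_init(&t.done, false);
    pthread_t th;
    if (pthread_create(&th, NULL, task_body, &t) != 0)
        return -1;
    while (!atomic_load_explicit(&t.done, memory_order_acquire))
        sched_yield();                  /* give the worker CPU time */
    *result = t.sum;                    /* safe: flag observed with acquire */
    pthread_join(th, NULL);             /* only reclaims the thread before t goes out of scope */
    return 0;
}
```

Here the join is housekeeping only; the result is already read by the time it runs. On a cooperative system the polling loop is the whole synchronization story, which is why the worker must yield for it to make progress.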