Preamble
This post executes the workload from A Language-Agnostic Concurrent Workload for 2025 Comparisons in Gleam on the BEAM: thousands of processes, bounded mailboxes, supervised restarts, and the same throughput and percentile targets expected of the Rust harness in Rust and Tokio: The Same Concurrent Workload in Type-Safe Threads. The goal is not a trophy number; it is to see how preemptive scheduling and per-process heaps behave when some jobs sleep (simulated I/O) while others stay CPU-hot.
What to expect (and why)
The BEAM’s reduction budget yields fairness: a long-running process cannot starve the system indefinitely because the scheduler preempts it. When jobs mix sleep and compute, tail latency often looks healthier than cooperative-only runtimes where one greedy task stalls others—until something blocks a scheduler thread (see July on NIF pitfalls).
Throughput may trade against per-process memory overhead. Millions of tiny processes are possible, but heap churn and message copying still cost money. Resident set and GC pressure are logged alongside jobs per second.
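One low-ceremony way to watch that cost from an attached shell is erlang:memory/1; the keys below are standard ERTS categories, and diffing two snapshots across the steady-state window gives a rough picture of churn:

```erlang
%% Coarse memory snapshot: total bytes, process heaps, refc binaries, ETS.
%% Take one snapshot before and one after the steady-state window.
erlang:memory([total, processes, binary, ets]).
```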
Instrumentation
Where possible, telemetry is captured in the spirit of observer: scheduler utilization, queue depths, and message queue lengths for straggler processes. If a mailbox grows without bound, that is a design bug masquerading as performance tuning.
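As a rough stand-in for observer when only a remote shell is available, a comprehension over erlang:processes/0 can surface the worst mailboxes. This is a point-in-time sketch: processes may die mid-scan, which the inner generator pattern tolerates.

```erlang
%% Top five processes by message queue length, worst first.
%% The {message_queue_len, N} match filters out processes that died
%% between erlang:processes/0 and the process_info call (undefined).
lists:sublist(
  lists:reverse(
    lists:keysort(2,
      [{P, N} || P <- erlang:processes(),
                 {message_queue_len, N} <-
                     [erlang:process_info(P, message_queue_len)]])),
  5).
```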
Reproducibility
Hardware class, OTP release, Gleam version, random seeds, and queue bounds are recorded beside the results. Rust and Tokio: The Same Concurrent Workload in Type-Safe Threads mirrors those constants so Rust versus Gleam on the Same Bench: What the Numbers Suggest is defensible.
Debugging under load
When p99 spikes, deliberately reducing schedulers_online in a test environment amplifies contention; painful, but instructive. Also scan for logging storms and for large message payloads that inflate copying costs.
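On a test node (never production), the scheduler squeeze looks like this in an attached shell; erlang:system_flag/2 returns the previous value, so it can be restored afterwards:

```erlang
%% Shrink the scheduler pool to amplify run-queue contention.
Old = erlang:system_flag(schedulers_online, 2),
%% ... drive the workload and capture percentiles here ...
erlang:system_flag(schedulers_online, Old).
```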
Runnable introspection (Erlang shell)
These calls turn “p99 weird” into evidence during March runs:
%% Scheduler topology
erlang:system_info(schedulers_online).
erlang:system_info(logical_processors_available).
%% A straggler process: message queue length + reductions consumed
Pid = pid(0, 42, 0). %% shell-only helper; replace with a real pid from logs
erlang:process_info(Pid, message_queue_len).
erlang:process_info(Pid, reductions).
%% Run-queue depth is easiest to read from observer/telemetry; raw statistics/2 keys differ by OTP—check your release’s docs before scripting this in CI.
How to use the numbers: if message_queue_len climbs without bound for collectors, your backpressure story failed—fix the design before blaming the scheduler. If reductions per job are huge, a few processes are probably CPU-heavy; consider sharding the work or improving reduction hygiene in hot loops (smaller slices, more frequent yields).
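One design-level fix, sketched here in plain Erlang rather than the Gleam harness (module and function names are illustrative): make producers wait for an ack, so the collector's mailbox depth is bounded by the number of concurrent producers rather than growing without limit.

```erlang
-module(acked_collector).
-export([start/0, submit/2, loop/1]).

%% Collector with synchronous hand-off: a producer blocks in submit/2
%% until its job is consumed, so mailbox depth stays bounded by the
%% number of in-flight producers.
start() ->
    spawn(?MODULE, loop, [0]).

submit(Collector, Job) ->
    Ref = make_ref(),
    Collector ! {job, self(), Ref, Job},
    receive
        {ack, Ref} -> ok
    after 5000 ->
        {error, timeout}   %% explicit backpressure signal, not a silent pile-up
    end.

loop(Count) ->
    receive
        {job, From, Ref, _Job} ->
            %% ... process _Job here ...
            From ! {ack, Ref},
            loop(Count + 1)
    end.
```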
Reproducible benchmark checklist
- Record OTP version, Gleam compiler version, ERTS build, kernel HZ, and whether the CPU governor is `performance`.
- Export `observer` screenshots or `etop` CSV for the steady-state window only.
- Save raw JSONL metrics (January schema) alongside gnuplot/dat files used for charts.
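A minimal way to snapshot the BEAM-side constants beside the results (the file name is an assumption; kernel HZ and the CPU governor still come from the OS):

```erlang
%% Write the runtime constants that must match across reruns.
{ok, F} = file:open("bench_env.txt", [write]),
io:format(F, "otp_release=~s~n", [erlang:system_info(otp_release)]),
io:format(F, "erts_version=~s~n", [erlang:system_info(version)]),
io:format(F, "schedulers_online=~p~n", [erlang:system_info(schedulers_online)]),
ok = file:close(F).
```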
Conclusion
This post is the BEAM evidence pass. Rust and Tokio: The Same Concurrent Workload in Type-Safe Threads reimplements the same harness with Tokio channels and tasks; Rust versus Gleam on the Same Bench: What the Numbers Suggest narrates side-by-side trade-offs without pretending one chart crowns a universal winner.