refreshed by cargo run --release --example bench_runner

What does each tier cost?

ZynML microbenchmarks measured through every tier of the Zyntax runtime: the BC interpreter alone, the BC interpreter after the HIR optimization pipeline (run_interp_safe_opts), the Cranelift JIT after tier-up, and the full ladder (Cranelift → LLVM) when built with the llvm-backend feature. Each kernel matches its rayzor / HaxeBenchmarks counterpart parameter for parameter: Mandelbrot is 875 × 500 at max_iter 1000 (reference checksum 112 798 515); n-body is the five-body solar system over 20 × 500 000 = 10 M advance steps, returning Std.int(energy · 1 000 000); fib is the naive recursive fib(40) = 102 334 155. The page is meant to expose where the runtime needs work; source lives at crates/zynml/examples.
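For orientation, the fib kernel's pinned result can be reproduced outside the runtime. The following is a minimal Rust sketch of a naive recursive Fibonacci, not the ZynML source under crates/zynml/examples; the function name and the u64 return type are assumptions made for illustration.

```rust
/// Naive doubly-recursive Fibonacci with no memoization.
/// Illustrative stand-in for the benchmark kernel, not the ZynML source.
fn fib(n: u32) -> u64 {
    if n < 2 {
        n as u64
    } else {
        fib(n - 1) + fib(n - 2)
    }
}

fn main() {
    // Pinned result quoted in the text for fib(40).
    const EXPECTED: u64 = 102_334_155;
    let got = fib(40);
    assert_eq!(got, EXPECTED, "fib(40) does not match the pinned result");
    println!("fib(40) = {got}");
}
```

The exponential call tree is the point of the kernel: it stresses call overhead rather than arithmetic, so it tends to separate an interpreter from native code clearly.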

Median of 9 · 3 warmup runs discarded
Wall-clock · compile + execute timed separately
Result pinned · output checked every run
Tiered · BC interp → opt → Cranelift JIT → LLVM
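The measurement protocol summarized above can be sketched in a few lines of Rust. This is an assumed shape, not the bench_runner implementation: the names measure and median are hypothetical, the kernel is a stand-in closure, and only the execute phase is timed here (a compile phase would get its own Instant pair).

```rust
use std::time::{Duration, Instant};

/// Median of the samples (upper-middle element when the count is even).
fn median(mut samples: Vec<Duration>) -> Duration {
    assert!(!samples.is_empty(), "need at least one sample");
    samples.sort();
    samples[samples.len() / 2]
}

/// Run `kernel` `warmup` times untimed, then `runs` times timed,
/// checking the output against `expected` on every run.
fn measure<F: FnMut() -> i64>(
    mut kernel: F,
    expected: i64,
    warmup: usize,
    runs: usize,
) -> Duration {
    for _ in 0..warmup {
        assert_eq!(kernel(), expected, "warmup output mismatch");
    }
    let mut samples = Vec::with_capacity(runs);
    for _ in 0..runs {
        let start = Instant::now();
        let out = kernel();
        samples.push(start.elapsed()); // wall-clock time of this run only
        assert_eq!(out, expected, "output drifted between runs");
    }
    median(samples)
}

fn main() {
    // Stand-in kernel: sum of 1..=1000, whose result is 500_500.
    let kernel = || (1..=1_000_i64).sum::<i64>();
    let t = measure(kernel, 500_500, 3, 9);
    println!("median of 9 runs: {t:?}");
}
```

Taking the median rather than the mean keeps a single slow run (a scheduler hiccup, for example) from shifting the reported figure.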


Shorter bars are better. Legend: compile, execute.

Each kernel runs through three tiers (a fourth, LLVM, is added when built with the llvm-backend feature): the BC interpreter on raw HIR, the BC interpreter after the optimization pipeline (const-folding, CSE, load-CSE, leaf inlining, LICM, loop and reduction vectorization, CFG simplification, alloca→malloc promotion, drop-site insertion), and the Cranelift JIT after tier-up. Tiered mode cold-starts on the interpreter; the asynchronous Cranelift compile lands during the kernel's first few iterations, and the timed call dispatches through native code from then on. Output is verified to match across tiers before each run is recorded.
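The cold-start-then-tier-up behaviour described above can be illustrated with a small Rust sketch. It is a toy model under stated assumptions, not the Zyntax runtime's dispatch code: TieredFn, Tier, tier_up, and the two sum functions are hypothetical names, and the background "compile" merely publishes a ready-made function pointer.

```rust
use std::sync::{Arc, OnceLock};
use std::thread;

type Kernel = fn(u64) -> u64;

#[derive(Debug, PartialEq, Clone, Copy)]
enum Tier {
    Interp,
    Native,
}

/// Hypothetical dispatch slot: empty until a compiled entry point lands.
struct TieredFn {
    interp: Kernel,
    native: Arc<OnceLock<Kernel>>,
}

impl TieredFn {
    fn new(interp: Kernel) -> Self {
        Self { interp, native: Arc::new(OnceLock::new()) }
    }

    /// Dispatch through native code if it has been installed,
    /// otherwise fall through to the interpreter path.
    fn call(&self, x: u64) -> (Tier, u64) {
        match self.native.get() {
            Some(f) => (Tier::Native, f(x)),
            None => (Tier::Interp, (self.interp)(x)),
        }
    }

    /// Start a background "compile" that publishes the native entry point.
    fn tier_up(&self, compiled: Kernel) -> thread::JoinHandle<()> {
        let slot = Arc::clone(&self.native);
        thread::spawn(move || {
            // A real JIT would compile here; this stand-in only publishes.
            let _ = slot.set(compiled);
        })
    }
}

// Stand-ins for the slow (interpreted) and fast (compiled) versions.
fn sum_interp(n: u64) -> u64 {
    (0..=n).fold(0, |acc, i| acc + i)
}
fn sum_native(n: u64) -> u64 {
    n * (n + 1) / 2
}

fn main() {
    let f = TieredFn::new(sum_interp);
    let cold = f.call(100); // no native code yet: interpreter path
    f.tier_up(sum_native).join().unwrap();
    let warm = f.call(100); // native entry point is installed
    // Both tiers must agree on the result before timings mean anything.
    assert_eq!(cold.1, warm.1);
    println!("{cold:?} -> {warm:?}");
}
```

In this sketch the caller joins the compile thread to make the handover deterministic; in an asynchronous setup like the one described, calls would keep using the interpreter until the slot fills.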

Tier 2 (LLVM) caveat on Linux x86_64. The zyntax-tiered-llvm column compiles HIR through LLVM to a PIC object, links it via the system linker into a shared object, and dlopens the result for tier-up. The path is exercised end-to-end on macOS aarch64 in our local runs; on the Linux x86_64 CI runner the LLVM install currently fails softly and falls back to Cranelift on the bench kernels, so tiered-llvm numbers there track Cranelift to the millisecond. The cause (linker invocation, object layout, or dlopen resolution differing from macOS) is under active investigation. When the two tiers' numbers diverge in the chart, that is real LLVM-tier execution; when they match, the runtime took the Cranelift floor on that runner.
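The soft fallback can be pictured as follows. This Rust sketch is an assumption about the general shape only; install_tier2, Backend, and the error string are hypothetical and do not come from the Zyntax source.

```rust
type Kernel = fn(u64) -> u64;

#[derive(Debug, PartialEq, Clone, Copy)]
enum Backend {
    Cranelift,
    Llvm,
}

/// Choose the entry point used after tier-up. If the LLVM install failed,
/// keep the Cranelift entry point instead of aborting the run.
fn install_tier2(llvm: Result<Kernel, String>, cranelift: Kernel) -> (Backend, Kernel) {
    match llvm {
        Ok(f) => (Backend::Llvm, f),
        Err(reason) => {
            eprintln!("LLVM install failed, staying on Cranelift: {reason}");
            (Backend::Cranelift, cranelift)
        }
    }
}

// Stand-in for a compiled kernel.
fn double(x: u64) -> u64 {
    x * 2
}

fn main() {
    // Simulate the failing path with a placeholder error.
    let llvm: Result<Kernel, String> = Err("stand-in error".to_string());
    let (backend, f) = install_tier2(llvm, double);
    println!("{backend:?}: {}", f(21));
}
```

With a fallback of this kind the benchmark still completes and still verifies its output, which is why the only visible symptom is two columns with matching timings.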