Blitz is a high-performance, lock-free work-stealing parallel runtime for Zig, inspired by Rust’s Rayon library. It provides fork-join parallelism, parallel iterators, and efficient parallel sorting.
```zig
const blitz = @import("blitz");

pub fn main() !void {
    var data = [_]i64{ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

    // Parallel sum - automatically parallelized
    const sum = blitz.iter(i64, &data).sum();

    // Parallel search with early exit
    const found = blitz.iter(i64, &data).findAny(struct {
        fn pred(x: i64) bool {
            return x > 5;
        }
    }.pred);

    // Parallel transform in-place
    blitz.iterMut(i64, &data).mapInPlace(struct {
        fn double(x: i64) i64 {
            return x * 2;
        }
    }.double);

    // Fork-join for divide-and-conquer
    // (computeLeft, computeRight, left_data, and right_data are placeholders)
    const result = blitz.join(.{
        .left = .{ computeLeft, left_data },
        .right = .{ computeRight, right_data },
    });

    // Zig rejects unused locals, so discard the results explicitly.
    _ = .{ sum, found, result };
}
```

On equivalent benchmarks, Blitz runs 1.3x to 2.6x faster than Rust’s Rayon:
| Operation | Blitz | Rayon | Speedup |
|---|---|---|---|
| join() fork-join (depth 20) | 0.54 ms | 0.71 ms | 1.31x |
| iter().sum() (100M i64) | 3.1 ms | 8.2 ms | 2.6x |
| sortAsc() (10M i64) | 89 ms | 119 ms | 1.34x |
Benchmarks on Apple M2 Pro, 10 cores
Parallel Iterators
Rayon-style composable iterators: sum(), min(), max(), findAny(), any(), all(), reduce(), and more. Automatic parallelization with early-exit support.
Fork-Join
Efficient divide-and-conquer with join(). Supports heterogeneous return types and up to 8 parallel tasks. Perfect for recursive algorithms.
Work Stealing
Lock-free Chase-Lev deque with Rayon’s sleep/wake protocol. Optimal load balancing with minimal contention and smart thread sleeping.
Parallel Sorting
Pattern-defeating quicksort (PDQSort) with automatic parallelization. 10x faster than std.mem.sort on large arrays.
The iterator API provides the most ergonomic way to parallelize data processing:

```zig
const data: []const i64 = &.{ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

// Aggregations
const sum = blitz.iter(i64, data).sum(); // 55
const min = blitz.iter(i64, data).min(); // ?i64 = 1
const max = blitz.iter(i64, data).max(); // ?i64 = 10

// Search with early exit
const found = blitz.iter(i64, data).findAny(isNegative); // Fast, any match
const first = blitz.iter(i64, data).findFirst(isNegative); // Deterministic

// Predicates (short-circuit)
const hasNeg = blitz.iter(i64, data).any(isNegative); // Stops on first match
const allPos = blitz.iter(i64, data).all(isPositive); // Stops on first fail

// Mutation needs mutable storage, so use a separate array, not the const slice
var buf = [_]i64{ 1, 2, 3, 4, 5 };
blitz.iterMut(i64, &buf).mapInPlace(double); // Transform in-place
blitz.iterMut(i64, &buf).fill(0); // Parallel memset
```
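The iterator snippets pass named callbacks (`isNegative`, `isPositive`, `double`, `multiply`) without defining them. The following is a minimal sketch of what they might look like. The single-argument signatures follow the inline `fn pred(x: i64) bool` and `fn double(x: i64) i64` shown in the quick-start example; the two-argument shape of `multiply` and the sequential `reduceSeq` reference are assumptions made for illustration and are not part of Blitz.

```zig
const std = @import("std");

// Hypothetical predicates: one element in, bool out.
fn isNegative(x: i64) bool {
    return x < 0;
}

fn isPositive(x: i64) bool {
    return x > 0;
}

// Hypothetical element transform for mapInPlace.
fn double(x: i64) i64 {
    return x * 2;
}

// Hypothetical combiner for reduce (two-argument shape is assumed).
// Multiplication is associative, so partial products can be merged
// in any grouping without changing the result.
fn multiply(a: i64, b: i64) i64 {
    return a * b;
}

// Sequential reference for what reduce(identity, combine) computes.
// This is an illustration only, not Blitz code.
fn reduceSeq(items: []const i64, identity: i64, combine: *const fn (i64, i64) i64) i64 {
    var acc = identity;
    for (items) |x| acc = combine(acc, x);
    return acc;
}

test "helpers agree with the values in the snippet comments" {
    const data = [_]i64{ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
    try std.testing.expectEqual(@as(i64, 3628800), reduceSeq(&data, 1, multiply));
    try std.testing.expect(!isNegative(data[0]));
    try std.testing.expect(isPositive(data[9]));
    try std.testing.expectEqual(@as(i64, 20), double(data[9]));
}
```

The identity value passed to a reduction should be neutral for the combiner (1 for multiplication, 0 for addition), since a parallel reduction may start several partial accumulators from it.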
```zig
// Custom reduction
const product = blitz.iter(i64, data).reduce(1, multiply); // 3628800
```

For divide-and-conquer algorithms and independent parallel tasks, use join():
```zig
// Two parallel tasks with different return types
const result = blitz.join(.{
    .count = .{ countItems, items }, // Returns usize
    .total = .{ sumValues, values }, // Returns i64
});
// Access: result.count, result.total
```
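The join examples in this section call `countItems`, `sumValues`, and `fibSequential` without showing them. Below is a hedged sketch of plausible definitions. The argument types (`[]const u8`, `[]const i64`) are hypothetical, chosen only so the return types match the comments in the snippet, and the test shows the sequential equivalent of the two-task join rather than Blitz itself.

```zig
const std = @import("std");

// Hypothetical task functions: each takes the single argument paired with
// it in `.{ function, argument }`.
fn countItems(items: []const u8) usize {
    return items.len;
}

fn sumValues(values: []const i64) i64 {
    var total: i64 = 0;
    for (values) |v| total += v;
    return total;
}

// Hypothetical sequential fallback for the recursive Fibonacci example.
// Iterative, O(n); valid for n <= 92, beyond which the u64 arithmetic
// overflows.
fn fibSequential(n: u64) u64 {
    var a: u64 = 0; // fib(i)
    var b: u64 = 1; // fib(i + 1)
    var i: u64 = 0;
    while (i < n) : (i += 1) {
        const next = a + b;
        a = b;
        b = next;
    }
    return a;
}

test "sequential equivalent of the two-task join" {
    const items = "abc";
    const values = [_]i64{ 1, 2, 3 };
    // Same field names as the join call, computed one after the other.
    const result = .{
        .count = countItems(items),
        .total = sumValues(&values),
    };
    try std.testing.expectEqual(@as(usize, 3), result.count);
    try std.testing.expectEqual(@as(i64, 6), result.total);
    try std.testing.expectEqual(@as(u64, 55), fibSequential(10));
}
```

A sequential cutoff matters because forking has a fixed cost per task: below some problem size the fallback is cheaper than splitting further. The right threshold depends on the workload and is best found by measurement.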
```zig
// Recursive parallel fibonacci
fn parallelFib(n: u64) u64 {
    if (n < 20) return fibSequential(n); // Sequential threshold

    const r = blitz.join(.{
        .a = .{ parallelFib, n - 1 },
        .b = .{ parallelFib, n - 2 },
    });
    return r.a + r.b;
}
```

High-performance parallel PDQSort:
```zig
var numbers = [_]i64{ 5, 2, 8, 1, 9, 3, 7, 4, 6 };

blitz.sortAsc(i64, &numbers); // Ascending
blitz.sortDesc(i64, &numbers); // Descending
blitz.sort(i64, &numbers, lessThanFn); // Custom comparator
```
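The custom-comparator call takes a `lessThanFn`, and the struct example sorts a `Person` by age, but neither is defined in the text. The sketch below shows one possible shape for each. The two-argument comparator signature is an assumption (Blitz may expect something different), `Person` and `byAge` are hypothetical, and the test uses the standard library's sequential `std.mem.sort` only to demonstrate the intended ordering.

```zig
const std = @import("std");

// Hypothetical comparator: returns true when `a` should come before `b`.
// The two-argument shape is assumed, not confirmed by the Blitz docs.
fn lessThanFn(a: i64, b: i64) bool {
    return a < b;
}

// Hypothetical record type for the sort-by-key example.
const Person = struct {
    name: []const u8,
    age: u32,
};

// Key extractor with the same shape as the inline `fn key` in the example.
fn ageKey(p: Person) u32 {
    return p.age;
}

// Adapter for std.mem.sort, which passes a context as the first argument.
fn byAge(_: void, a: Person, b: Person) bool {
    return ageKey(a) < ageKey(b);
}

test "key extractor yields the expected order" {
    var people = [_]Person{
        .{ .name = "Ada", .age = 36 },
        .{ .name = "Bo", .age = 19 },
        .{ .name = "Cy", .age = 27 },
    };
    // Sequential reference sort from the standard library.
    std.mem.sort(Person, &people, {}, byAge);
    try std.testing.expectEqualStrings("Bo", people[0].name);
    try std.testing.expectEqualStrings("Ada", people[2].name);
    try std.testing.expect(lessThanFn(1, 2));
}
```

Sorting by a cheap integer key, as here, keeps comparisons inexpensive; for keys that are costly to compute, a full comparator may be the better fit.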
```zig
// Sort structs by key
blitz.sortByKey(Person, u32, &people, struct {
    fn key(p: Person) u32 {
        return p.age;
    }
}.key);
```

For fine-grained control over parallelism, use parallelFor() and parallelReduce():
```zig
// Parallel for with context
blitz.parallelFor(n, Context, ctx, bodyFn);
blitz.parallelForWithGrain(n, Context, ctx, bodyFn, grain_size);
```
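The `parallelFor` calls take a context value, a body function, and optionally a grain size, none of which are spelled out in the text. The sketch below illustrates one plausible contract under stated assumptions: the body receives the context and a half-open index range `[start, end)`, and the grain size is the maximum number of indices per chunk. `Context`, `bodyFn`, and `forEachChunk` are hypothetical, and `forEachChunk` is a sequential stand-in, not how Blitz schedules work.

```zig
const std = @import("std");

// Hypothetical context: whatever the loop body needs, passed by value.
const Context = struct {
    data: []i64,
};

// Assumed body shape: process indices in the half-open range [start, end).
fn bodyFn(ctx: Context, start: usize, end: usize) void {
    for (ctx.data[start..end]) |*v| v.* *= 2;
}

// Sequential stand-in that splits 0..n into chunks of at most `grain`
// indices and calls the body once per chunk. A parallel runtime would
// hand such chunks to different workers instead of running them in order.
// Returns the number of chunks so the effect of `grain` is observable.
fn forEachChunk(
    n: usize,
    ctx: Context,
    body: *const fn (Context, usize, usize) void,
    grain: usize,
) usize {
    std.debug.assert(grain > 0);
    var chunks: usize = 0;
    var start: usize = 0;
    while (start < n) : (start += grain) {
        const end = @min(start + grain, n);
        body(ctx, start, end);
        chunks += 1;
    }
    return chunks;
}

test "grain size controls the number of chunks" {
    var data = [_]i64{ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
    const ctx = Context{ .data = &data };
    // 10 indices with grain 4 gives [0,4), [4,8), [8,10).
    const chunks = forEachChunk(data.len, ctx, bodyFn, 4);
    try std.testing.expectEqual(@as(usize, 3), chunks);
    try std.testing.expectEqual(@as(i64, 2), data[0]);
    try std.testing.expectEqual(@as(i64, 20), data[9]);
}
```

A larger grain means fewer, bigger chunks and less scheduling overhead; a smaller grain gives the scheduler more pieces to balance across workers. Which side wins depends on how expensive each iteration is.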
```zig
// Parallel map-reduce
const result = blitz.parallelReduce(T, n, identity, Context, ctx, mapFn, combineFn);
```

```
┌─────────────────────────────────────────────────────────────┐
│                          User Code                          │
├─────────────────────────────────────────────────────────────┤
│  iter().sum()  │  join(.{...})  │  sortAsc()  │ parallelFor │
├─────────────────────────────────────────────────────────────┤
│                    Work-Stealing Runtime                    │
│  ┌─────────┐  ┌─────────┐  ┌─────────┐  ┌─────────┐         │
│  │Worker 0 │  │Worker 1 │  │Worker 2 │  │Worker N │         │
│  │┌───────┐│  │┌───────┐│  │┌───────┐│  │┌───────┐│         │
│  ││ Deque ││  ││ Deque ││  ││ Deque ││  ││ Deque ││         │
│  │└───────┘│  │└───────┘│  │└───────┘│  │└───────┘│         │
│  └────┬────┘  └────┬────┘  └────┬────┘  └────┬────┘         │
│       │            │            │            │              │
│       └────────────┴─────┬──────┴────────────┘              │
│                          │                                  │
│              ┌───────────┴───────────┐                      │
│              │  Sleep/Wake Manager   │                      │
│              │    (JEC Protocol)     │                      │
│              └───────────────────────┘                      │
└─────────────────────────────────────────────────────────────┘
```

| Use Case | Recommendation |
|---|---|
| Data processing (sum, filter, transform) | blitz.iter() / blitz.iterMut() |
| Recursive divide-and-conquer | blitz.join() |
| Sorting large arrays | blitz.sortAsc() / blitz.sort() |
| Fine-grained parallel loops | blitz.parallelFor() |
| Map-reduce patterns | blitz.parallelReduce() |