Core API Reference
This document provides comprehensive documentation for all Blitz APIs, including function signatures, parameters, return types, examples, limitations, and best practices.
Initialization
Blitz uses a global thread pool that must be initialized before use.
init() !void
Initialize the thread pool with default settings (CPU count - 1 workers).

    try blitz.init();
    defer blitz.deinit();

Errors: Returns error.AlreadyInitialized if called twice, or propagates thread creation errors.
initWithConfig(config: ThreadPoolConfig) !void
Initialize with custom configuration.

    try blitz.initWithConfig(.{
        .background_worker_count = 8,
    });
    defer blitz.deinit();

Parameters:
| Field | Type | Default | Description |
|---|---|---|---|
| background_worker_count | ?usize | null (auto) | Number of worker threads. null = CPU count - 1 |
deinit() void
Shut down the thread pool and release resources.

    blitz.deinit();

Notes:
- Waits for all pending work to complete
- Safe to call multiple times
- Must be called before program exit to avoid resource leaks
isInitialized() bool
Check if the thread pool is initialized.

    if (!blitz.isInitialized()) {
        try blitz.init();
    }

numWorkers() u32
Get the number of worker threads.

    const workers = blitz.numWorkers();
    std.debug.print("Using {} workers\n", .{workers});

Iterator API
The iterator API is the recommended way to use Blitz. It provides composable, chainable operations that automatically parallelize when beneficial.
Creating Iterators
iter(T, slice) ParallelSliceIterator(T)
Create an immutable parallel iterator over a slice.

    const data = [_]i64{ 1, 2, 3, 4, 5 };
    const it = blitz.iter(i64, &data);
    const sum = it.sum(); // 15

iterMut(T, slice) ParallelMutSliceIterator(T)
Create a mutable parallel iterator for in-place operations.

    var data = [_]i64{ 1, 2, 3, 4, 5 };
    blitz.iterMut(i64, &data).mapInPlace(double);
    // data is now [2, 4, 6, 8, 10]

range(start, end) RangeIterator
Create a parallel iterator over an index range.

    // Sum of 0 + 1 + 2 + ... + 99
    const sum = blitz.range(0, 100).sum(i64, identity);

    // Execute function for each index
    blitz.range(0, 1000).forEach(processIndex);

Aggregation Methods
.sum() T
Compute the sum of all elements in parallel.

    const data = [_]i64{ 1, 2, 3, 4, 5 };
    const total = blitz.iter(i64, &data).sum(); // 15

.min() ?T
Find the minimum element in parallel.

    const data = [_]i64{ 5, 2, 8, 1, 9 };
    if (blitz.iter(i64, &data).min()) |m| {
        std.debug.print("Min: {}\n", .{m}); // Min: 1
    }

.max() ?T
Find the maximum element in parallel.

    const data = [_]i64{ 5, 2, 8, 1, 9 };
    if (blitz.iter(i64, &data).max()) |m| {
        std.debug.print("Max: {}\n", .{m}); // Max: 9
    }

.reduce(identity, combine) T
Perform a custom parallel reduction.

    const data = [_]i64{ 1, 2, 3, 4, 5 };

    // Product of all elements
    const product = blitz.iter(i64, &data).reduce(1, struct {
        fn mul(a: i64, b: i64) i64 {
            return a * b;
        }
    }.mul); // 120

Requirements:
- combine must be associative: combine(a, combine(b, c)) == combine(combine(a, b), c)
- identity must be the identity element: combine(identity, x) == x
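The associativity requirement matters because a parallel reduction combines partial results from independently processed pieces of the input. The sketch below is a sequential model using only the standard library, not Blitz's implementation. `fold` and `splitReduce` are hypothetical helpers that reduce two halves separately and then combine them.

```zig
const std = @import("std");

fn add(a: i64, b: i64) i64 {
    return a + b;
}

fn sub(a: i64, b: i64) i64 {
    return a - b;
}

/// Left fold over `items`, starting from `identity` (hypothetical helper).
fn fold(items: []const i64, identity: i64, comptime combine: fn (i64, i64) i64) i64 {
    var acc = identity;
    for (items) |x| acc = combine(acc, x);
    return acc;
}

/// Reduce the two halves independently, then combine the partial results.
/// This models what a parallel reduction does when the input is split in two.
fn splitReduce(items: []const i64, split: usize, identity: i64, comptime combine: fn (i64, i64) i64) i64 {
    const left = fold(items[0..split], identity, combine);
    const right = fold(items[split..], identity, combine);
    return combine(left, right);
}

pub fn main() void {
    const data = [_]i64{ 1, 2, 3, 4, 5 };
    // Addition is associative: every split point gives the same answer.
    std.debug.print("add: {d} {d}\n", .{ splitReduce(&data, 1, 0, add), splitReduce(&data, 3, 0, add) });
    // Subtraction is not: the answer depends on where the data was split.
    std.debug.print("sub: {d} {d}\n", .{ splitReduce(&data, 1, 0, sub), splitReduce(&data, 3, 0, sub) });
}
```

With addition both split points yield 15. With subtraction they yield 13 and 3, so a non-associative combine would make the result depend on how the work happened to be divided.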
Search Methods
All search methods support early exit: they stop processing as soon as the result is determined.
.findAny(predicate) ?T
Find any element matching the predicate. This is the fastest search method, but it is non-deterministic: which matching element is returned may vary between runs.

    const data = [_]i64{ 1, 2, 3, 4, 5, 6 };

    const even = blitz.iter(i64, &data).findAny(struct {
        fn isEven(x: i64) bool {
            return @mod(x, 2) == 0;
        }
    }.isEven);

.findFirst(predicate) ?FindResult(T)
Find the leftmost element matching the predicate.

    if (blitz.iter(i64, &data).findFirst(isEven)) |result| {
        std.debug.print("First even: {} at index {}\n", .{ result.value, result.index });
    }

.any(predicate) bool
Check if any element matches the predicate. Supports early exit.

    const has_even = blitz.iter(i64, &data).any(isEven);

.all(predicate) bool
Check if all elements match the predicate. Supports early exit.

    const all_positive = blitz.iter(i64, &data).all(isPositive);

Mutation Methods
These methods require iterMut (mutable iterator).
.mapInPlace(transform) void
Transform each element in place.

    var data = [_]i64{ 1, 2, 3, 4, 5 };
    blitz.iterMut(i64, &data).mapInPlace(struct {
        fn double(x: i64) i64 {
            return x * 2;
        }
    }.double);
    // data is now [2, 4, 6, 8, 10]

.fill(value) void
Set all elements to a value.

    var data = [_]i64{ 1, 2, 3, 4, 5 };
    blitz.iterMut(i64, &data).fill(0);
    // data is now [0, 0, 0, 0, 0]

Fork-Join API
For divide-and-conquer algorithms and parallel task execution.
join(tasks) Result
Execute multiple tasks in parallel and collect results.

    const result = blitz.join(.{
        .user = .{ fetchUserById, user_id },
        .posts = .{ fetchPostsByUser, user_id },
    });
    // result.user, result.posts

Parameters: Anonymous struct where each field is either:
- A function pointer (no arguments)
- A tuple .{ function, arg1, arg2, ... } (up to 10 arguments)
Returns: Struct with same field names, containing each task’s return value.
Parallel Algorithms
Sorting
All sort operations are in-place and use parallel PDQSort (Pattern-Defeating Quicksort).
sortAsc(T, data) void
Sort a slice in ascending order.

    var data = [_]i64{ 5, 2, 8, 1, 9, 3 };
    blitz.sortAsc(i64, &data);
    // data is now [1, 2, 3, 5, 8, 9]

sortDesc(T, data) void
Sort a slice in descending order.
sort(T, data, lessThan) void
Sort with a custom comparator.

    var data = [_]i64{ 5, -2, 8, -1, 9 };

    // Sort by absolute value
    blitz.sort(i64, &data, struct {
        fn byAbs(a: i64, b: i64) bool {
            return @abs(a) < @abs(b);
        }
    }.byAbs);

sortByKey(T, K, data, keyFn) void
Sort by extracting a comparable key from each element.

    const Person = struct { name: []const u8, age: u32 };
    var people: []Person = ...;

    // people is already a slice, so it is passed directly (no &)
    blitz.sortByKey(Person, u32, people, struct {
        fn getAge(p: Person) u32 {
            return p.age;
        }
    }.getAge);

sortByCachedKey(T, K, allocator, data, keyFn) !void
Two-phase sort: compute keys in parallel, then sort by cached keys.
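The two-phase idea can be illustrated with a sequential, standard-library-only sketch. This is not Blitz's implementation: `sortByCachedKeySketch`, `Keyed`, and `getAge` are hypothetical names, and the key loop marked as phase 1 is the part a parallel version would distribute across workers. The allocator is needed for the temporary key buffer, which is presumably why the Blitz function takes one and can fail.

```zig
const std = @import("std");

const Person = struct { name: []const u8, age: u32 };

/// Pairs a precomputed key with the index of the element it came from.
const Keyed = struct { key: u32, index: usize };

fn keyLess(_: void, a: Keyed, b: Keyed) bool {
    return a.key < b.key;
}

fn getAge(p: Person) u32 {
    return p.age;
}

/// Sequential sketch of a cached-key sort (hypothetical helper).
fn sortByCachedKeySketch(
    allocator: std.mem.Allocator,
    items: []Person,
    comptime keyFn: fn (Person) u32,
) !void {
    const keyed = try allocator.alloc(Keyed, items.len);
    defer allocator.free(keyed);

    // Phase 1: compute each key exactly once.
    for (items, 0..) |p, i| keyed[i] = .{ .key = keyFn(p), .index = i };

    // Phase 2: sort the small (key, index) pairs instead of calling keyFn per comparison.
    std.mem.sort(Keyed, keyed, {}, keyLess);

    // Apply the resulting permutation through a temporary copy.
    const copy = try allocator.dupe(Person, items);
    defer allocator.free(copy);
    for (keyed, 0..) |k, i| items[i] = copy[k.index];
}

pub fn main() !void {
    var people = [_]Person{
        .{ .name = "Ada", .age = 36 },
        .{ .name = "Linus", .age = 21 },
        .{ .name = "Grace", .age = 45 },
    };
    try sortByCachedKeySketch(std.heap.page_allocator, &people, getAge);
    for (people) |p| std.debug.print("{s} {d}\n", .{ p.name, p.age });
}
```

Caching pays off when the key function is expensive relative to a comparison, since a comparison sort otherwise evaluates the key function many times per element.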
Low-Level API
For fine-grained control over parallelism.
parallelFor(n, Context, ctx, body) void
Execute a function in parallel over a range.

    const Context = struct {
        data: []f64,
        multiplier: f64,
    };

    blitz.parallelFor(data.len, Context, .{
        .data = data,
        .multiplier = 2.0,
    }, struct {
        fn body(ctx: Context, start: usize, end: usize) void {
            for (ctx.data[start..end]) |*x| {
                x.* *= ctx.multiplier;
            }
        }
    }.body);

parallelForWithGrain(n, Context, ctx, body, grain) void
Execute a function in parallel over a range with explicit grain size control.
Use this when you need fine-grained control over parallelization granularity. For most cases, prefer parallelFor(), which auto-tunes based on data size.
Parameters:
- n: Range size [0, n)
- Context: Type holding shared data
- ctx: Context instance
- body: fn(Context, start: usize, end: usize) void
- grain: Minimum elements per chunk

    // Process with small grain for expensive operations
    const Context = struct {
        input: []const f64,
        output: []f64,
    };

    blitz.parallelForWithGrain(
        data.len,
        Context,
        .{ .input = input, .output = output },
        struct {
            fn body(ctx: Context, start: usize, end: usize) void {
                for (ctx.input[start..end], ctx.output[start..end]) |in, *out| {
                    out.* = expensiveComputation(in);
                }
            }
        }.body,
        100, // Small grain = more parallelism for expensive ops
    );

    // Large grain for cheap operations (reduces overhead)
    blitz.parallelForWithGrain(n, void, {}, struct {
        fn body(_: void, start: usize, end: usize) void {
            for (start..end) |i| {
                cheapOperation(i);
            }
        }
    }.body, 10000); // Large grain = less overhead for cheap ops

Grain size guidelines:
| Operation Cost | Recommended Grain |
|---|---|
| Expensive (>1μs/element) | 64-256 |
| Medium (~100ns/element) | 256-1024 |
| Cheap (<10ns/element) | 4096-10000 |
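The table trades scheduling overhead against available parallelism. The following back-of-the-envelope sketch is not Blitz's scheduler. `maxChunks` is a hypothetical helper that assumes only what the parameter description states: no chunk is smaller than the grain, so a range of n elements can be split into at most n / grain chunks.

```zig
const std = @import("std");

/// Upper bound on the number of chunks when no chunk may hold fewer than
/// `grain` elements (hypothetical helper; a range always yields at least one chunk).
fn maxChunks(n: usize, grain: usize) usize {
    std.debug.assert(grain > 0);
    return @max(@as(usize, 1), n / grain);
}

pub fn main() void {
    const n: usize = 1_000_000;
    const grains = [_]usize{ 100, 1024, 10_000 };
    for (grains) |g| {
        std.debug.print("grain {d} -> at most {d} chunks\n", .{ g, maxChunks(n, g) });
    }
}
```

For one million elements this gives at most 10000, 976, and 100 chunks respectively. Many small chunks keep every worker busy when each element is expensive, while a few large chunks keep per-chunk overhead negligible when each element is cheap.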
parallelReduce(T, n, identity, Context, ctx, map, combine) T
Parallel map-reduce with full control over mapping and combining.
Maps each index to a value using map(ctx, index), then combines all values using the associative combine(a, b) function in a divide-and-conquer pattern.
Parameters:
- T: Result type
- n: Range size [0, n)
- identity: Identity value for combine (e.g., 0 for sum, 1 for product)
- Context: Type holding shared data
- ctx: Context instance
- map: fn(Context, usize) T - Extract/compute value at index
- combine: fn(T, T) T - Associative binary operation
Requirements:
- combine must be associative: combine(a, combine(b, c)) == combine(combine(a, b), c)
- identity must satisfy: combine(identity, x) == x
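The divide-and-conquer pattern described above can be modeled sequentially with the standard library alone. This is a sketch under stated assumptions rather than Blitz's implementation: `reduceRange` is a hypothetical function, it is specialized to i64 slices for brevity, and the two recursive calls are where a parallel runtime could hand one half to another worker.

```zig
const std = @import("std");

/// Sequential model of a divide-and-conquer map-reduce over [start, end).
/// Ranges of at most `grain` elements are folded directly; larger ranges are halved.
fn reduceRange(
    data: []const i64,
    start: usize,
    end: usize,
    grain: usize,
    identity: i64,
    comptime map: fn ([]const i64, usize) i64,
    comptime combine: fn (i64, i64) i64,
) i64 {
    std.debug.assert(grain >= 1);
    if (end - start <= grain) {
        var acc = identity;
        for (start..end) |i| acc = combine(acc, map(data, i));
        return acc;
    }
    const mid = start + (end - start) / 2;
    // In a parallel runtime these two calls could run on different workers.
    const left = reduceRange(data, start, mid, grain, identity, map, combine);
    const right = reduceRange(data, mid, end, grain, identity, map, combine);
    return combine(left, right);
}

fn square(d: []const i64, i: usize) i64 {
    return d[i] * d[i];
}

fn add(a: i64, b: i64) i64 {
    return a + b;
}

pub fn main() void {
    const data = [_]i64{ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
    const total = reduceRange(&data, 0, data.len, 2, 0, square, add);
    std.debug.print("sum of squares: {d}\n", .{total});
}
```

The identity value is used once per leaf range, which is why it must leave any value unchanged under combine. The result here is 385 for any grain of at least 1, because addition is associative.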
    // Sum of squares
    const data = [_]i64{ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

    const sum_of_squares = blitz.parallelReduce(
        i64, // Result type
        data.len, // Range size
        0, // Identity for addition
        []const i64, // Context type
        &data, // Context value
        struct {
            fn map(d: []const i64, i: usize) i64 {
                return d[i] * d[i]; // Square each element
            }
        }.map,
        struct {
            fn combine(a: i64, b: i64) i64 {
                return a + b; // Sum the squares
            }
        }.combine,
    );
    // sum_of_squares = 385

    // Dot product of two vectors
    const DotCtx = struct { a: []const f64, b: []const f64 };

    const dot_product = blitz.parallelReduce(
        f64,
        a.len,
        0.0,
        DotCtx,
        .{ .a = a, .b = b },
        struct {
            fn map(ctx: DotCtx, i: usize) f64 {
                return ctx.a[i] * ctx.b[i];
            }
        }.map,
        struct {
            fn combine(x: f64, y: f64) f64 {
                return x + y;
            }
        }.combine,
    );

    // Find maximum value
    const max_val = blitz.parallelReduce(
        i64,
        data.len,
        std.math.minInt(i64), // Identity for max
        []const i64,
        &data,
        struct {
            fn map(d: []const i64, i: usize) i64 {
                return d[i];
            }
        }.map,
        struct {
            fn combine(a: i64, b: i64) i64 {
                return @max(a, b);
            }
        }.combine,
    );

    // Count elements matching predicate
    const count = blitz.parallelReduce(
        usize,
        data.len,
        0,
        []const i64,
        &data,
        struct {
            fn map(d: []const i64, i: usize) usize {
                return if (d[i] > 5) 1 else 0;
            }
        }.map,
        struct {
            fn combine(a: usize, b: usize) usize {
                return a + b;
            }
        }.combine,
    );

parallelReduceWithGrain(T, n, identity, Context, ctx, map, combine, grain) T
Same as parallelReduce but with explicit grain size control.

    // Use smaller grain for expensive map operations
    const result = blitz.parallelReduceWithGrain(
        f64,
        n,
        0.0,
        Context,
        ctx,
        expensiveMap,
        add,
        256, // Smaller grain for expensive operations
    );

Collection & Scatter API
For parallel materialization patterns (inspired by Polars).
parallelCollect(T, U, input, output, Context, ctx, mapFn) void
Parallel map that collects results into an output slice.

    var input: [1000]i32 = undefined;
    var output: [1000]i64 = undefined;

    blitz.parallelCollect(i32, i64, &input, &output, void, {}, struct {
        fn map(_: void, x: i32) i64 {
            return @as(i64, x) * 2;
        }
    }.map);

Requirements: output.len must equal input.len.
parallelMapInPlace(T, data, Context, ctx, mapFn) void
Transform elements in-place in parallel.

    blitz.parallelMapInPlace(f64, data, void, {}, struct {
        fn transform(_: void, x: f64) f64 {
            return @sqrt(x);
        }
    }.transform);

parallelFlatten(T, slices, output) void
Flatten nested slices into a single output slice in parallel.

    const slices = [_][]const u32{ &.{ 1, 2, 3 }, &.{ 4, 5 }, &.{ 6, 7, 8, 9 } };
    var output: [9]u32 = undefined;
    blitz.parallelFlatten(u32, &slices, &output);
    // output = [1, 2, 3, 4, 5, 6, 7, 8, 9]

parallelScatter(T, values, indices, output) void
Scatter values to output using index mapping.

    const values = [_]u32{ 100, 200, 300 };
    const indices = [_]usize{ 5, 0, 3 };
    var output: [10]u32 = undefined;
    blitz.parallelScatter(u32, &values, &indices, &output);
    // output[0]=200, output[3]=300, output[5]=100

Error-Safe API
All tasks complete before errors propagate. See Error Handling for details.
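The "complete first, then propagate" behavior can be modeled with a small sequential sketch. This is an illustration only, not Blitz's implementation: `runAll`, `Task`, `TaskError`, and the `completed` counter are hypothetical, and the sketch assumes the first error encountered is the one reported, which the text above does not specify.

```zig
const std = @import("std");

const TaskError = error{Failed};
const Task = *const fn () TaskError!void;

/// Counts how many tasks ran to completion (single-threaded sketch only).
var completed: usize = 0;

fn ok() TaskError!void {
    completed += 1;
}

fn fails() TaskError!void {
    completed += 1;
    return error.Failed;
}

/// Run every task, remember the first error, and report it only after
/// all tasks have finished (hypothetical helper).
fn runAll(tasks: []const Task) TaskError!void {
    var first_error: ?TaskError = null;
    for (tasks) |task| {
        task() catch |err| {
            if (first_error == null) first_error = err;
        };
    }
    if (first_error) |err| return err;
}

pub fn main() void {
    const tasks = [_]Task{ &ok, &fails, &ok };
    runAll(&tasks) catch |err| {
        std.debug.print("error {s} after {d} tasks completed\n", .{ @errorName(err), completed });
    };
}
```

The third task still runs even though the second one failed, so no task is left half-finished or still touching shared data when the caller starts handling the error.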
tryJoin(tasks) E!Result
Execute error-returning tasks in parallel with error safety.

    const result = try blitz.tryJoin(.{
        .user = fetchUser, // returns !User
        .posts = fetchPosts, // returns ![]Post
    });
    // result.user, result.posts

tryForEach(n, E, Context, ctx, bodyFn) E!void
Parallel iteration with error handling.

    try blitz.tryForEach(data.len, ParseError, Context, ctx, struct {
        fn body(c: Context, start: usize, end: usize) ParseError!void {
            for (start..end) |i| {
                try processItem(c.data[i]);
            }
        }
    }.body);

tryReduce(T, E, n, identity, Context, ctx, mapFn, combineFn) E!T
Parallel reduction with error handling.

    const total = try blitz.tryReduce(
        i64,
        ParseError,
        data.len,
        0,
        Context,
        ctx,
        struct {
            fn map(c: Context, i: usize) ParseError!i64 {
                return try parse(c.data[i]);
            }
        }.map,
        struct {
            fn combine(a: i64, b: i64) i64 {
                return a + b;
            }
        }.combine,
    );

Scope & Broadcast API
For dynamic task spawning. See Scope & Broadcast for details.
scope(func) void
Execute a scope function. Tasks spawned within the scope run in parallel when the scope exits.

    blitz.scope(struct {
        fn run(s: *blitz.Scope) void {
            s.spawn(task1);
            s.spawn(task2);
            s.spawn(task3);
        }
    }.run);

broadcast(func) void
Execute a function on all worker threads.

    blitz.broadcast(struct {
        fn run(worker_index: usize) void {
            // Runs once on each worker thread
            initThreadLocal(worker_index);
        }
    }.run);

getStats() PoolStats
Get pool statistics for debugging.

    const stats = blitz.getStats();
    std.debug.print("Executed: {}, Stolen: {}\n", .{ stats.executed, stats.stolen });

Configuration
Grain Size
The grain size controls the minimum work unit.
setGrainSize(size: usize) void
Set the global grain size.

    blitz.setGrainSize(1024);

getGrainSize() usize
Get the current grain size.
defaultGrainSize() usize
Get the default grain size (65536).
You can also use the constant blitz.DEFAULT_GRAIN_SIZE (65536).
Guidelines:
| Operation Cost | Recommended Grain |
|---|---|
| Trivial (add, compare) | 10000-65536 |
| Light (simple math) | 1024-10000 |
| Medium (string ops) | 256-1024 |
| Heavy (I/O, allocation) | 64-256 |
Thread Safety
| Component | Thread Safety |
|---|---|
| iter(), iterMut() | Create from any thread |
| Iterator methods | Execute on worker threads |
| join() | Safe from any thread |
| Input slices | Must not be modified during operation |
| Output of iterMut | Safe after operation completes |
Important: Do not modify input data while a parallel operation is running.