v1.0.0-zig0.15.2

Choosing the Right API

Blitz offers several parallel APIs, each optimized for different patterns. This guide helps you pick the right one.

Start here: “I want to…”

| I want to… | Use | Example |
| --- | --- | --- |
| Process every element in an array | parallelFor | Apply transform to each pixel |
| Sum / min / max an array | iter().sum() / .min() / .max() | Total revenue, highest score |
| Search for an element | iter().findAny() | Find first negative value |
| Check a condition on all elements | iter().any() / .all() | Any NaN? All positive? |
| Transform array to new array | parallelCollect | Convert i32[] to f64[] |
| Transform array in place | parallelMapInPlace or iterMut().mapInPlace() | Normalize pixel values |
| Run 2-8 independent tasks | join | Compute stats + histogram |
| Run 2-64 dynamic tasks | scope + spawn | Load multiple files |
| Run tasks that may fail | tryJoin / tryForEach / tryReduce | Parse + validate data |
| Sort an array | sortAsc / sortDesc / sort | Sort by timestamp |
| Merge many slices into one | parallelFlatten | Combine per-thread results |
| Write to scattered positions | parallelScatter | Build hash table |
| Run code on every thread | broadcast | Initialize thread-local state |
| Fire-and-forget background work | spawn | Write audit log |
| Aggregate with custom logic | parallelReduce | Weighted average, dot product |

Data Parallelism: iter vs parallelFor vs parallelCollect

These three APIs all process array elements in parallel, but they serve different purposes:

const data = [_]i64{ 1, 2, 3, 4, 5 };

iter() — Built-in aggregations and searches

const sum = blitz.iter(i64, &data).sum(); // 15
const min = blitz.iter(i64, &data).min(); // 1
const found = blitz.iter(i64, &data).findAny(pred);

Best for: Standard aggregations (sum, min, max, count), searches (findAny, findFirst), and predicates (any, all). Zero boilerplate.
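
The predicate-based checks follow the same shape; a minimal sketch, assuming .any() accepts a plain predicate function just as findAny does above:

```zig
// Sketch: check whether any element is negative.
// Assumes .any() takes a fn (i64) bool predicate, mirroring findAny.
const has_negative = blitz.iter(i64, &data).any(struct {
    fn isNeg(x: i64) bool {
        return x < 0;
    }
}.isNeg);
```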

parallelFor — Custom chunk processing

blitz.parallelFor(data.len, Context, ctx, struct {
    fn body(c: Context, start: usize, end: usize) void {
        for (c.data[start..end]) |*v| {
            v.* = expensiveTransform(v.*);
        }
    }
}.body);

Best for: Side effects, in-place mutation, or when you need the chunk boundaries (start, end) for efficient batch processing.

parallelCollect — Map to a new array

blitz.parallelCollect(i64, f64, &input, &output, void, {}, struct {
    fn transform(_: void, x: i64) f64 {
        return @as(f64, @floatFromInt(x)) * 0.1;
    }
}.transform);

Best for: Transforming an array of type T into an array of type U, producing a new output array.

Summary:

| Feature | iter() | parallelFor | parallelCollect |
| --- | --- | --- | --- |
| Aggregation (sum, min) | Built-in | Manual | No |
| Search (find, any, all) | Built-in | Manual | No |
| In-place mutation | iterMut | Yes | No |
| Map T -> U | No | Manual | Built-in |
| Chunk access (start, end) | No | Yes | No |
| Custom grain size | No | WithGrain | WithGrain |
| Boilerplate | Low | Medium | Medium |
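
The in-place mutation row (iterMut) can be sketched as follows, assuming mapInPlace takes a T -> T transform function:

```zig
// Sketch: double every element in place.
// Assumes blitz.iterMut(T, slice).mapInPlace(fn (T) T) — an assumption
// based on the iterMut().mapInPlace() name used elsewhere in this guide.
var values = [_]i64{ 1, 2, 3, 4, 5 };
blitz.iterMut(i64, &values).mapInPlace(struct {
    fn double(x: i64) i64 {
        return x * 2;
    }
}.double);
```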

Task Parallelism: join vs scope vs spawn

These APIs run independent tasks in parallel, differing in flexibility and guarantees:

join — Fixed tasks with results

const result = blitz.join(.{
    .a = .{ computeA, arg_a },
    .b = .{ computeB, arg_b },
});
// result.a, result.b

Best for: 2-8 tasks with known types and return values. Supports heterogeneous return types. Works with recursive divide-and-conquer.
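
Because join can be called from within a task, divide-and-conquer falls out naturally. A sketch, assuming each task entry is a (function, argument) pair as in the example above; the 4096 cutoff is an illustrative threshold, not a library constant:

```zig
// Sketch: recursive parallel sum via nested join calls.
fn parallelSum(data: []const i64) i64 {
    // Below the cutoff, recursing costs more than it saves: sum serially.
    if (data.len <= 4096) {
        var total: i64 = 0;
        for (data) |v| total += v;
        return total;
    }
    // Split in half and sum both halves in parallel.
    const mid = data.len / 2;
    const halves = blitz.join(.{
        .lo = .{ parallelSum, data[0..mid] },
        .hi = .{ parallelSum, data[mid..] },
    });
    return halves.lo + halves.hi;
}
```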

scope + spawn — Dynamic task set

blitz.scope(struct {
    fn run(s: *blitz.Scope) void {
        s.spawn(task1);
        s.spawn(task2);
        if (condition) s.spawn(task3);
    }
}.run);

Best for: Variable number of tasks decided at runtime. Tasks are void (no return values). Maximum 64 tasks.

spawn — Fire-and-forget

blitz.spawn(struct {
    fn run() void { writeLog(); }
}.run);
// Returns immediately

Best for: Background work where the caller does not need the result.

Summary:

| Feature | join | scope | spawn |
| --- | --- | --- | --- |
| Max tasks | 8 | 64 | 1 |
| Return values | Yes (per task) | No | No |
| Wait for completion | Yes | Yes (at scope exit) | No |
| Dynamic task count | No (comptime) | Yes (runtime) | N/A |
| Recursive | Yes | No | No |
| Error variants | tryJoin | No | No |

Aggregation: iter() vs parallelReduce

iter() aggregations are best for standard operations:

const sum = blitz.iter(i64, data).sum();
const count = blitz.iter(i64, data).count();

parallelReduce is best for custom reductions:

// Dot product (no built-in for this)
const dot = blitz.parallelReduce(
    f64, a.len, 0.0,
    DotCtx, .{ .a = a, .b = b },
    struct {
        fn map(ctx: DotCtx, i: usize) f64 {
            return ctx.a[i] * ctx.b[i];
        }
    }.map,
    struct {
        fn add(x: f64, y: f64) f64 { return x + y; }
    }.add,
);

| Feature | iter() | parallelReduce |
| --- | --- | --- |
| sum, min, max, count | Built-in | Manual |
| Custom reduction | reduce(identity, combineFn) | Full control |
| Map + reduce | Limited | Built-in (map + combine) |
| Error handling | No | tryReduce |
| Grain control | No | WithGrain |
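
The reduce(identity, combineFn) row can be sketched as follows, assuming the combine function is passed as a two-argument function, as in the parallelReduce example above:

```zig
// Sketch: product of all elements with a custom combiner.
// 1 is the identity element for multiplication.
const product = blitz.iter(i64, &data).reduce(1, struct {
    fn mul(x: i64, y: i64) i64 {
        return x * y;
    }
}.mul);
```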

Error-Safe Variants

Every core API has an error-safe variant:

| Infallible | Fallible | Guarantee |
| --- | --- | --- |
| join | tryJoin | All tasks complete before the error propagates |
| parallelFor | tryForEach | All chunks complete before the error propagates |
| parallelReduce | tryReduce | All reductions complete before the error propagates |

Use try* variants when your parallel body can return a Zig error. See Error Handling for details.
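
A minimal sketch of a fallible body, assuming tryForEach mirrors parallelFor's chunk signature but with an error-union return (an assumption; check the Error Handling page for the exact signature):

```zig
// Sketch: validate chunks in parallel, propagating the first error found.
// Assumes tryForEach(len, Ctx, ctx, body) where body returns !void.
const Ctx = struct { data: []const f64 };

try blitz.tryForEach(data.len, Ctx, .{ .data = data }, struct {
    fn body(c: Ctx, start: usize, end: usize) !void {
        for (c.data[start..end]) |v| {
            if (std.math.isNan(v)) return error.InvalidValue;
        }
    }
}.body);
```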

Decision Tree

Do you have an array/slice to process?
├── Yes
│   ├── Need sum/min/max/count?
│   │   └── iter().sum() / .min() / .max() / .count()
│   ├── Need to search/check?
│   │   └── iter().findAny() / .any() / .all()
│   ├── Need to map T[] -> U[]?
│   │   └── parallelCollect()
│   ├── Need to transform in place?
│   │   └── parallelMapInPlace() or iterMut().mapInPlace()
│   ├── Need to sort?
│   │   └── sortAsc() / sortDesc() / sort() / sortByKey()
│   ├── Need custom reduction?
│   │   └── parallelReduce()
│   └── Need chunk-based processing?
│       └── parallelFor()
├── Do you have independent tasks?
│   ├── 2-8 tasks with return values?
│   │   └── join()
│   ├── Dynamic number of tasks (up to 64)?
│   │   └── scope() + spawn()
│   ├── Tasks can fail?
│   │   └── tryJoin()
│   └── Fire-and-forget?
│       └── spawn()
├── Need to combine/flatten results?
│   ├── Merge slices into one?
│   │   └── parallelFlatten()
│   └── Scatter to indexed positions?
│       └── parallelScatter()
└── Need per-thread operations?
    └── broadcast()

Performance Characteristics

| API | Overhead | Scalability | Best Data Size |
| --- | --- | --- | --- |
| iter().sum() | Very low | Excellent | > 10K |
| iter().findAny() | Very low | Excellent (early exit) | Any |
| parallelFor | Low | Excellent | > 10K |
| parallelReduce | Low | Excellent | > 10K |
| parallelCollect | Low | Excellent | > 10K |
| join (2 tasks) | ~10 ns | 2x max | N/A |
| join (8 tasks) | ~50 ns | 8x max | N/A |
| scope (N tasks) | ~100 ns | Nx max | N/A |
| sort | Medium | Good | > 10K |
| parallelFlatten | Low | Good | > 100 slices |
| broadcast | Low | Linear | N/A |

Example Scenarios

// Matrix operations
const dot = blitz.parallelReduce(f64, n, 0.0, ctx_type, ctx, mapFn, addFn);
blitz.parallelFor(n, ctx_type, ctx, matmulBody);

// Statistical analysis
const result = blitz.join(.{
    .mean = .{ computeMean, data },
    .stddev = .{ computeStdDev, data },
    .median = .{ computeMedian, data },
});

// ETL pipeline
blitz.parallelCollect(RawRecord, CleanRecord, raw, clean, void, {}, cleanFn);
blitz.sortByKey(CleanRecord, u64, clean, timestampKey);
const total = blitz.iter(CleanRecord, clean).reduce(0, sumRevenue);

// Entity updates
blitz.parallelFor(entities.len, EntityCtx, ctx, updatePhysics);
blitz.parallelFor(entities.len, EntityCtx, ctx, updateAI);

// Spatial queries
const nearest = blitz.iter(Entity, entities).minByKey(f64, distToPlayer);

// Per-pixel transform
blitz.parallelMapInPlace(Pixel, pixels, BrightnessCtx, ctx, adjustBrightness);

// Histogram
const histogram = blitz.parallelReduce(
    [256]u32, pixels.len, .{0} ** 256,
    []const Pixel, pixels, histMap, histCombine,
);