b8811

Apr 16, 2026

Meta / llama.cpp / CLI / vb8811

ggml-webgpu: compute pass batching and removing profiling overhead (#21873)

Update register tiling matmul to use f32 accumulation
Fix profiling code
Fix register tiling matmul for Chrome, I'm blaming Dawn
Update batch tuning value for iOS
Compile fix
Fix use of new load function
Move to a single query set for GPU profiling
Move to batching compute passes when not profiling
Refactor build_multi
Remove iOS throttling now that we're batching compute passes
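
As an illustration of the compute pass batching item in the list above, here is a minimal C++ sketch. It is not the code from the PR. It assumes that profiling needs one compute pass per kernel (WebGPU attaches timestamp writes to pass boundaries), while a normal run can group several dispatches into one pass. MockEncoder, encode, and batch_size are hypothetical names invented for this example, and the encoder only counts work instead of talking to a GPU.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a GPU command encoder; it only counts calls.
struct MockEncoder {
    std::size_t passes_begun = 0;
    std::size_t dispatches   = 0;

    void begin_pass() { ++passes_begun; }
    void dispatch(int /*kernel_id*/) { ++dispatches; }
    void end_pass() {}
};

// Encode all kernels. When profiling, open one pass per kernel so each
// kernel could be timed separately; otherwise put up to batch_size
// dispatches into each pass. A batch_size of 0 is treated as 1.
inline void encode(MockEncoder & enc, const std::vector<int> & kernels,
                   bool profiling, std::size_t batch_size) {
    const std::size_t per_pass =
        (profiling || batch_size == 0) ? 1 : batch_size;

    for (std::size_t i = 0; i < kernels.size(); i += per_pass) {
        enc.begin_pass();
        const std::size_t end = std::min(i + per_pass, kernels.size());
        for (std::size_t j = i; j < end; ++j) {
            enc.dispatch(kernels[j]);
        }
        enc.end_pass();
    }
}
```

In this sketch, 10 kernels with a batch size of 4 open 3 passes when not profiling and 10 passes when profiling. The number of dispatches is the same in both cases.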

