b8811

Apr 16, 2026
Meta/llama.cpp · CLI · vb8811

ggml-webgpu: compute pass batching and removing profiling overhead (#21873)

  • Update register tiling matmul to use f32 accumulation

  • Fix profiling code

  • Fix register tiling matmul for Chrome, I'm blaming Dawn

  • Update batch tuning value for iOS

  • Compile fix

  • Fix use of new load function

  • Move to a single query set for GPU profiling

  • Move to batching compute passes when not profiling

  • Refactor build_multi

  • Remove iOS throttling now that we're batching compute passes
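
The batching change above can be pictured with a small mock. This is only a sketch under stated assumptions, not the ggml-webgpu code: `MockEncoder`, `encode_kernels`, and the `batch_size` parameter are hypothetical names invented for illustration, and nothing here calls the real Dawn or WebGPU API. It assumes that profiling needs one compute pass per kernel (WebGPU attaches timestamp writes to a pass), while the non-profiling path can put several dispatches into one pass to reduce per-pass overhead.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a GPU command encoder. It only counts what a real
// encoder would be asked to do; none of these names come from ggml-webgpu or Dawn.
struct MockEncoder {
    int passes_begun    = 0;  // number of compute passes opened
    int dispatches      = 0;  // number of kernel dispatches recorded
    int timestamp_pairs = 0;  // passes that requested begin/end timestamps

    void begin_pass(bool with_timestamps) {
        ++passes_begun;
        if (with_timestamps) {
            ++timestamp_pairs;
        }
    }
    void dispatch(const std::string & /*kernel_name*/) { ++dispatches; }
    void end_pass() {}
};

// Encode a list of kernels.
// - profiling == true:  one pass per kernel, so each kernel can be timed alone.
// - profiling == false: up to batch_size kernels share a single pass.
inline void encode_kernels(MockEncoder & enc,
                           const std::vector<std::string> & kernels,
                           bool profiling,
                           std::size_t batch_size) {
    // Guard against batch_size == 0 so the loop always makes progress.
    const std::size_t per_pass = profiling ? 1 : std::max<std::size_t>(batch_size, 1);

    for (std::size_t i = 0; i < kernels.size(); i += per_pass) {
        enc.begin_pass(profiling);
        const std::size_t end = std::min(i + per_pass, kernels.size());
        for (std::size_t j = i; j < end; ++j) {
            enc.dispatch(kernels[j]);
        }
        enc.end_pass();
    }
}
```

With five kernels and a hypothetical `batch_size` of 2, the non-profiling path opens three passes instead of five, while the profiling path still opens one pass per kernel. The actual batch size and pass structure used by the backend are not stated in these notes.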
