b8189

Mar 2, 2026
Meta/llama.cpp · CLI · vb8189

ggml webgpu: Clean up per-thread parameter buffer pool and job submission logic (#19772)

  • Allow webgpu_buf_pool to resize if needed, and replace inflight_threads with num_kernels for submission

  • Run clang-format

  • Keep track of num batched kernels that have not been submitted yet

  • Run clang-format

  • Increase buf pool max size

  • Increase param buf pool init size

  • Remove webgpu buf pool resizing

  • Merge with master

  • Add buffer pool growth

  • Move buffer pool growth outside of lock

  • Reduce max pool size to 32

  • Run clang-format

  • Only resize param buf pool
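
The pool changes listed above (growth on demand, growth moved outside of the lock, and a capped maximum size) can be illustrated with a minimal sketch. This is not the ggml WebGPU backend code: ParamBuf, ParamBufPool, and all member names are hypothetical, and a plain byte vector stands in for a GPU buffer. It only shows the general pattern of reserving a slot under the mutex and creating the new buffer after releasing it.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical stand-in for a GPU parameter buffer (assumption: the real
// backend would hold WebGPU buffer handles instead of host bytes).
struct ParamBuf {
    std::vector<unsigned char> bytes;
    explicit ParamBuf(size_t size) : bytes(size) {}
};

// Hypothetical pool: starts with init_count buffers, grows on demand up to max_count.
class ParamBufPool {
  public:
    ParamBufPool(size_t init_count, size_t max_count, size_t buf_size)
        : max_count_(max_count), buf_size_(buf_size), total_(init_count) {
        assert(init_count <= max_count);
        for (size_t i = 0; i < init_count; ++i) {
            free_.push_back(std::make_unique<ParamBuf>(buf_size));
        }
    }

    // Returns a free buffer. If none is free and the cap is not reached, the
    // pool grows; otherwise the caller blocks until a buffer is released.
    std::unique_ptr<ParamBuf> alloc() {
        std::unique_lock<std::mutex> lock(mutex_);
        if (free_.empty() && total_ < max_count_) {
            ++total_;       // reserve the slot while still holding the lock
            lock.unlock();  // create the buffer without blocking other threads
            return std::make_unique<ParamBuf>(buf_size_);
        }
        cv_.wait(lock, [this] { return !free_.empty(); });
        std::unique_ptr<ParamBuf> buf = std::move(free_.back());
        free_.pop_back();
        return buf;
    }

    // Returns a buffer to the pool and wakes one waiting thread.
    void release(std::unique_ptr<ParamBuf> buf) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            free_.push_back(std::move(buf));
        }
        cv_.notify_one();
    }

    size_t total() {
        std::lock_guard<std::mutex> lock(mutex_);
        return total_;
    }

    size_t available() {
        std::lock_guard<std::mutex> lock(mutex_);
        return free_.size();
    }

  private:
    std::mutex                             mutex_;
    std::condition_variable                cv_;
    std::vector<std::unique_ptr<ParamBuf>> free_;
    const size_t                           max_count_;
    const size_t                           buf_size_;
    size_t                                 total_;  // buffers created so far
};
```

A pool built as ParamBufPool pool(8, 32, 256) would mirror the cap of 32 mentioned in the list; the initial count of 8 and the buffer size of 256 bytes are illustrative assumptions only. Incrementing the counter before unlocking keeps two threads from both growing past the cap, while the potentially slow buffer creation happens without holding the mutex.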
