
b8857

Apr 20, 2026
Meta / llama.cpp / CLI / vb8857

ggml-webgpu: updated matrix-vector multiplication (#21738)

  • Merged properly, but q3_k and q5_k are slow with u32 indexing
  • Start on new mat-vec
  • New format float paths working
  • Working q4_0
  • Work on remaining legacy q-types
  • Port k-quants to new mat-vec
  • Remove old shader
  • Remove old constants, format
  • Remove accidental file
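
For readers unfamiliar with what a quantized mat-vec computes, here is a minimal CPU-side sketch for q4_0, one of the types listed above. It is not the WebGPU shader from this change. It assumes the usual ggml q4_0 block layout (32 weights per block, one scale, 4-bit values stored with an offset of 8, low nibbles holding the first 16 weights and high nibbles the last 16). The function names are hypothetical, and the scale is kept as a plain Python float rather than fp16.

```python
QK4_0 = 32  # weights per q4_0 block (assumed ggml layout)


def quantize_q4_0(values):
    """Pack 32 floats into one block: (scale, 16 bytes of packed nibbles)."""
    assert len(values) == QK4_0
    # The element with the largest magnitude is mapped to the quantized value -8.
    extreme = max(values, key=abs)
    d = extreme / -8.0
    inv = 1.0 / d if d != 0.0 else 0.0
    q = [min(15, int(v * inv + 8.5)) for v in values]
    # Byte j holds weight j in its low nibble and weight j + 16 in its high nibble.
    packed = bytes(q[j] | (q[j + 16] << 4) for j in range(16))
    return d, packed


def dequantize_q4_0(block):
    """Recover the 32 approximate floats stored in one block."""
    d, packed = block
    low = [((b & 0x0F) - 8) * d for b in packed]
    high = [((b >> 4) - 8) * d for b in packed]
    return low + high


def matvec_q4_0(rows, x):
    """Multiply a quantized matrix by a dense vector.

    rows: list of matrix rows, each a list of q4_0 blocks.
    x:    dense float vector whose length is a multiple of 32.
    """
    out = []
    for blocks in rows:
        acc = 0.0
        for bi, (d, packed) in enumerate(blocks):
            base = bi * QK4_0
            partial = 0.0
            for j, b in enumerate(packed):
                partial += ((b & 0x0F) - 8) * x[base + j]
                partial += ((b >> 4) - 8) * x[base + j + 16]
            # The scale is applied once per block, not once per weight.
            acc += d * partial
        out.append(acc)
    return out
```

Applying the scale once per block instead of dequantizing every weight first is the main saving this layout allows. A GPU version would typically distribute the blocks of a row across threads and reduce the partial sums.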


Co-authored-by: Neha Abbas <nehaabbas@ReeseLevines-MacBook-Pro.local>
Co-authored-by: Reese Levine <reeselevine1@gmail.com>
