b7809

Jan 22, 2026
Meta/llama.cpp | CLI | vb7809

opencl: enable the general fp mm for non-cont input and as a fallback for specialized kqv kernel for adreno (#18970)

  • opencl: add copy_to_contiguous and utilize mm kernels

  • opencl: only copy to cont for f32 and f16 tensors

  • opencl: use cont mm for fallback when dst is large

  • opencl: use nb local to copy-to-cont

  • opencl: use local offset as well
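
The copy-to-contiguous step named in the bullets above can be illustrated on the host side. This is a minimal sketch in plain C, not the OpenCL kernel from this change. It assumes that `nb` means ggml-style per-dimension byte strides and that the offset is a byte offset into the source buffer. The function name `copy_to_contiguous_f32` and its signature are hypothetical, chosen only for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not the llama.cpp API): gather a 4-D f32 tensor view,
 * described by element counts ne[] and byte strides nb[], into a densely
 * packed destination buffer. ne[0] is the fastest-varying dimension. */
static void copy_to_contiguous_f32(const void *src, size_t offset,
                                   const int64_t ne[4], const size_t nb[4],
                                   float *dst) {
    /* Assumption: offset is in bytes from the start of the source buffer. */
    const char *base = (const char *)src + offset;
    size_t k = 0; /* running index into the packed destination */

    for (int64_t i3 = 0; i3 < ne[3]; i3++) {
        for (int64_t i2 = 0; i2 < ne[2]; i2++) {
            for (int64_t i1 = 0; i1 < ne[1]; i1++) {
                for (int64_t i0 = 0; i0 < ne[0]; i0++) {
                    /* Byte address of element (i0, i1, i2, i3) in the view. */
                    const char *p = base + (size_t)i0 * nb[0] + (size_t)i1 * nb[1]
                                         + (size_t)i2 * nb[2] + (size_t)i3 * nb[3];
                    float v;
                    memcpy(&v, p, sizeof v); /* avoids alignment assumptions */
                    dst[k++] = v;
                }
            }
        }
    }
}
```

Once the data is packed this way, a matrix-multiply routine that expects dense rows can consume it without knowing about the original strides, which is the general idea behind copying before reusing an existing mm kernel.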
