
b7812

Jan 22, 2026
Meta/llama.cpp  CLI  vb7812

mla : make the V tensor a view of K (#18986)

  • mla : pass V as a view of K to the FA op

  • cuda : adjust mla logic to new layout

  • kv-cache : fix rope shift

  • tests : remove comment

  • cuda : fix reusable_cutoff

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
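
The idea of "V as a view of K" can be illustrated with a small standalone sketch. It assumes that each cached K row holds a shared latent part followed by a K-only positional part, so that V can alias the leading columns of the same buffer instead of being stored separately. The sizes, the column order, and the names `View2D` and `KCache` are hypothetical and chosen for illustration; this is not the ggml or llama.cpp API, and the actual layout used by this release may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sizes for illustration only (not taken from the release notes).
constexpr std::size_t N_LATENT = 4; // latent part, assumed shared by K and V
constexpr std::size_t N_ROPE   = 2; // positional part, assumed used by K only
constexpr std::size_t K_ROW    = N_LATENT + N_ROPE;

// Non-owning 2D view: `rows` rows of `cols` floats, consecutive rows `stride` floats apart.
struct View2D {
    float *     data;
    std::size_t rows;
    std::size_t cols;
    std::size_t stride;

    float & at(std::size_t r, std::size_t c) const { return data[r*stride + c]; }
};

// Hypothetical K cache that owns the memory: one row per cached token.
struct KCache {
    std::vector<float> buf;
    std::size_t        n_tokens;

    explicit KCache(std::size_t n) : buf(n*K_ROW, 0.0f), n_tokens(n) {}

    // Full K rows: latent part followed by positional part.
    View2D k() { return { buf.data(), n_tokens, K_ROW, K_ROW }; }

    // V has no storage of its own: it is the first N_LATENT columns of each K row,
    // read with the same row stride as K (assumed column order).
    View2D v() { return { buf.data(), n_tokens, N_LATENT, K_ROW }; }
};
```

Because both views alias one buffer, a write through `k()` to a latent column is visible through `v()`, and no separate V allocation is needed. Any consumer of V has to honor the K row stride rather than assume tightly packed rows.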


