b9055

May 7, 2026
Meta/llama.cpp · CLI · vb9055

model: Add Mimo v2.5 model support (#22493)

  • add mimo-v2.5 support

  • mimo-v2.5: fix modify_tensors row split

  • mimo-v2.5: forgot add_attn_value_scale plumbing

  • mimo-v2.5: fix TP dequant to detect TP rows

  • mimo-v2.5: fix TP iteration to be descending

  • mimo-v2.5: fix comment

  • mimo-v2.5: retain fused qkv

  • mimo-v2.5: missed the attn_value scale during merge

  • mimo-v2.5: fused QKV needs contiguous for scaling attention value

  • mimo-v2.5: move speech_embeddings. to TextModel filter_tensors

  • Update src/llama-hparams.h

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

  • Update src/models/mimo2.cpp

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

  • Update src/models/mimo2.cpp

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

  • Update convert_hf_to_gguf.py

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

  • Update convert_hf_to_gguf.py

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

  • Update src/models/mimo2.cpp

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

  • mimo-v2.5: include MTP weights in gguf

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

Platforms: macOS/iOS, Linux, Android, Windows, openEuler