b9310

May 25, 2026
Meta/llama.cpp · CLI · vb9310

server: fix checkpoints creation (#22929)

  • common : add common_chat_split_by_role

  • cont : fix spans to reach end of message

  • server: fix checkpoints creation

      • extract message_spans from chat templates
      • find the prompt token position before the latest user message
      • split prompt batching at that position
      • create a context checkpoint before the latest user input
      • avoid periodic mid-prompt checkpoints when that position is known
      • handle multimodal prompts when mapping text/template positions to server prompt tokens
      • add --checkpoint-min-step to control minimum spacing between checkpoints

  • cont : clean-up

  • Support autoparser detection for message barriers

  • server: fix message span delimiter and update docs


Co-authored-by: Alde Rojas <hello@alde.dev>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: Piotr Wilkin <piotr.wilkin@syndatis.com>
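
The checkpoint changes listed in this commit can be illustrated with a small, self-contained sketch. It is not the server's code. The struct `message_span` and the functions `last_user_pos`, `batch_ends`, and `should_checkpoint` are hypothetical names invented for this example, as are the parameters `min_step` and `every_n`. The sketch assumes a message span is a half-open token range tagged with a role, and that the minimum spacing also applies to the checkpoint placed before the latest user message. The actual behaviour of --checkpoint-min-step may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a message span: the token range [begin, end)
// that one chat message occupies in the rendered prompt.
struct message_span {
    std::string role;
    size_t      begin;
    size_t      end;
};

// Token position where the latest user message starts, if there is one.
std::optional<size_t> last_user_pos(const std::vector<message_span> & spans) {
    for (auto it = spans.rbegin(); it != spans.rend(); ++it) {
        if (it->role == "user") {
            return it->begin;
        }
    }
    return std::nullopt;
}

// Split [n_past, n_prompt) into batches of at most n_batch tokens and return
// the end position of each batch. If split_pos falls strictly inside a batch,
// that batch is cut short so that a batch boundary lands exactly on split_pos.
std::vector<size_t> batch_ends(size_t n_past, size_t n_prompt, size_t n_batch,
                               std::optional<size_t> split_pos) {
    assert(n_batch > 0);
    std::vector<size_t> ends;
    size_t pos = n_past;
    while (pos < n_prompt) {
        size_t end = std::min(pos + n_batch, n_prompt);
        if (split_pos && pos < *split_pos && *split_pos < end) {
            end = *split_pos;
        }
        ends.push_back(end);
        pos = end;
    }
    return ends;
}

// Decide whether to create a checkpoint at batch end `pos`.
// Assumed policy: never checkpoint closer than min_step tokens to the previous
// checkpoint; when the user-message boundary is known, checkpoint only there;
// otherwise fall back to one checkpoint every every_n tokens.
bool should_checkpoint(size_t pos, size_t last_ckpt, size_t min_step,
                       size_t every_n, std::optional<size_t> split_pos) {
    assert(pos >= last_ckpt);
    if (pos - last_ckpt < min_step) {
        return false;
    }
    if (split_pos) {
        return pos == *split_pos;
    }
    return every_n > 0 && pos - last_ckpt >= every_n;
}
```

With spans for a system, user, assistant, and second user message, `last_user_pos` yields the start of the second user message. Passing that position to `batch_ends` forces a batch to end there, so the context state can be saved just before the newest user input is processed instead of at an arbitrary point inside the prompt.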
