b8053

Feb 14, 2026
Meta/llama.cpp · CLI · vb8053

models : optimize qwen3next graph (#19375)

  • models : optimizing qwen3next graph

  • cont

  • wip

  • wip

  • wip

  • wip

  • wip

  • wip

  • wip

  • wip

  • wip

  • wip

  • cont : remove redundant q, g chunking

  • minor

  • minor

  • avoid passing masks around

  • avoid concats during chunking

  • naming + shapes

  • update names and use prefix to disable CUDA graphs

macOS/iOS:

Linux:

Windows:

openEuler: