b7703

Jan 11, 2026

Meta/llama.cpp · CLI · vb7703

model: try to improve Qwen3 Next (#18683)

qwen3next: simplify qkvz projection
use ggml_swiglu_split
revert swiglu_split, but remove redundant repeat()
fix missing reshape
rm 2 redundant transposes
move mul_mat(k,q) to outside of chunking
rm redundant cont
improve g_cs_chunk
add comments about no cont
use std::pair instead of ggml_concat
vectorize key_gdiff calculation
rm unused tensor
avoid ggml_concat inside loop
bring back ggml_concat as it may not work on other backends
nits

macOS/iOS:

Linux:

Windows:

openEuler:
