b7432
[!WARNING] Release Format Update: Linux releases will soon use .tar.gz archives instead of .zip. Please make the necessary changes to your deployment scripts.
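For deployment scripts that need to keep working across the switch, one option is to choose the extractor by file extension rather than assuming `.zip`. The following is a minimal sketch, not part of the release tooling: the function name `extract_release` and the archive file names in the usage note are hypothetical, and it assumes the script already has the downloaded archive on disk.

```python
import tarfile
import zipfile
from pathlib import Path


def extract_release(archive: Path, dest: Path) -> None:
    """Unpack a release archive, accepting both .zip and .tar.gz.

    Hypothetical helper for illustration; the real asset names may differ.
    """
    name = archive.name.lower()
    is_tar = name.endswith((".tar.gz", ".tgz"))
    is_zip = name.endswith(".zip")
    if not (is_tar or is_zip):
        raise ValueError(f"unsupported archive format: {archive.name}")

    dest.mkdir(parents=True, exist_ok=True)
    if is_tar:
        with tarfile.open(archive, mode="r:gz") as tar:
            # Use the safer "data" extraction filter where the running
            # Python version provides it; fall back otherwise.
            if hasattr(tarfile, "data_filter"):
                tar.extractall(dest, filter="data")
            else:
                tar.extractall(dest)
    else:
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(dest)
```

A script could then call `extract_release(Path("release.tar.gz"), Path("llama"))` or `extract_release(Path("release.zip"), Path("llama"))` without caring which format a given build ships in (both file names are placeholders).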
Optimization: Qwen3 next autoregressive pass (#17996)
- It's Qwen3 Next, the lean mean token generation machine!
- Apply patches from thread
- Remove recurrent version, only keep chunked and autoregressive
- Remove unnecessary conts and asserts
- Remove more extra conts and asserts
- Cleanup masking
macOS/iOS:
Linux:
Windows:
- Windows x64 (CPU)
- Windows arm64 (CPU)
- Windows x64 (CUDA 12)
- Windows x64 (CUDA 13)
- Windows x64 (Vulkan)
- Windows x64 (SYCL)
- Windows x64 (HIP)
openEuler: