b8238
llama: end-to-end tests (#19802)
- tests: add end-to-end tests per model architecture
- fixup for rebase
- fix use-after-free in llama-model-loader.cpp
- fix CI
- fix WebGPU
- fix CI
- disable CI for macOS-latest-cmake-arm64
- use expert_weights_scale only if != 0.0f
- comments
macOS/iOS:
Linux:
Windows:
- Windows x64 (CPU)
- Windows arm64 (CPU)
- Windows x64 (CUDA 12) - CUDA 12.4 DLLs
- Windows x64 (CUDA 13) - CUDA 13.1 DLLs
- Windows x64 (Vulkan)
- Windows x64 (SYCL)
- Windows x64 (HIP)
openEuler: