b8116
quantize : add --dry-run option (#19526)
clean slate for branch
use 6 characters for tensor dims
add --dry-run to llama-quantize
use 6 characters for tensor dims (cont.)
no need to re-calculate ggml_nbytes for tensor
fix indent
show model and quant BPW when quant completes
add example to --help
new function tensor_requires_imatrix, add courtesy warning about imatrix
missing func, move imatrix flag set
logic error
fixup tensor_requires_imatrix
add missing GGML_TYPEs
simplify and rename tensor_type_requires_imatrix
simplify for style
add back Q2_K edge case for imatrix
guard ftype imatrix warning
comment ref #12557
remove per @compilade
remove unused params parameter
move bool dry_run per GG
move bool dry_run per GG
Update src/llama-quant.cpp
Co-authored-by: Sigbjørn Skjæret sigbjorn.skjaeret@scala.com
Update src/llama-quant.cpp
Co-authored-by: Sigbjørn Skjæret sigbjorn.skjaeret@scala.com
Update src/llama-quant.cpp
Co-authored-by: Sigbjørn Skjæret sigbjorn.skjaeret@scala.com
Co-authored-by: Sigbjørn Skjæret sigbjorn.skjaeret@scala.com
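One of the commits above reports the model and quant BPW (bits per weight) when quantization completes. As a rough illustration of that figure, here is a minimal sketch that assumes BPW is total stored bits divided by total weight count. The `TensorSize` struct and `bits_per_weight` function are hypothetical names for this example and are not taken from src/llama-quant.cpp.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-tensor summary: element count and storage size.
struct TensorSize {
    int64_t n_elements; // number of weights in the tensor
    size_t  n_bytes;    // size of the tensor data in bytes
};

// Assumed definition: bits per weight = total bits stored / total weights.
double bits_per_weight(const std::vector<TensorSize> & tensors) {
    int64_t total_elements = 0;
    size_t  total_bytes    = 0;
    for (const auto & t : tensors) {
        total_elements += t.n_elements;
        total_bytes    += t.n_bytes;
    }
    if (total_elements == 0) {
        return 0.0; // empty input: avoid division by zero
    }
    return 8.0 * (double) total_bytes / (double) total_elements;
}
```

For instance, a tensor that stores 32 weights in 34 bytes (the Q8_0 block layout) comes out at 8.5 bits per weight under this definition.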
macOS/iOS:
Linux:
Windows:
- Windows x64 (CPU)
- Windows arm64 (CPU)
- Windows x64 (CUDA 12) - CUDA 12.4 DLLs
- Windows x64 (CUDA 13) - CUDA 13.1 DLLs
- Windows x64 (Vulkan)
- Windows x64 (SYCL)
- Windows x64 (HIP)
openEuler: