b7705
debug : include LLAMA_POOLING_TYPE_UNSPECIFIED in pooling check (#18692)
- debug : include LLAMA_POOLING_TYPE_UNSPECIFIED in pooling check
This commit updates the pooling check in the debug example so that it covers LLAMA_POOLING_TYPE_UNSPECIFIED in addition to LLAMA_POOLING_TYPE_NONE.
- debug : normalize both pooled and token embeddings
This commit updates debug.cpp to normalize embeddings for both pooled and non-pooled outputs. For pooled embeddings, normalization is applied to the single vector, and for non-pooled embeddings, normalization is applied to each token embedding vector individually.
The motivation is to allow non-pooled embeddings to be normalized, which was not possible previously.
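
A minimal sketch of the behavior described in the two commits above. It is not the code from debug.cpp. The `pooling` enum is a hypothetical stand-in for the llama.cpp pooling types, and the helper names (`is_token_level`, `l2_normalize`, `normalize_embeddings`) are made up for illustration. The sketch also assumes that the normalization is Euclidean (L2), that the pooling check is what decides whether the output is treated as per-token, and that token embeddings are stored contiguously with `n_embd` floats per token.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the pooling type enum (not the values from llama.h).
enum class pooling { unspecified, none, mean };

// Assumption: both "none" and "unspecified" mean the output is one
// embedding per token rather than a single pooled vector.
static bool is_token_level(pooling p) {
    return p == pooling::none || p == pooling::unspecified;
}

// L2-normalize one vector of length n in place.
// A zero vector is left unchanged to avoid dividing by zero.
static void l2_normalize(float * v, std::size_t n) {
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += static_cast<double>(v[i]) * v[i];
    }
    const double norm = std::sqrt(sum);
    if (norm == 0.0) {
        return;
    }
    for (std::size_t i = 0; i < n; ++i) {
        v[i] = static_cast<float>(v[i] / norm);
    }
}

// Pooled output: normalize the single vector at the start of `data`.
// Non-pooled output: normalize each n_embd-sized token vector individually.
static void normalize_embeddings(std::vector<float> & data, std::size_t n_embd, pooling p) {
    if (n_embd == 0 || data.size() < n_embd) {
        return;
    }
    const std::size_t n_vec = is_token_level(p) ? data.size() / n_embd : 1;
    for (std::size_t k = 0; k < n_vec; ++k) {
        l2_normalize(data.data() + k * n_embd, n_embd);
    }
}
```

With this layout, a buffer holding two 2-dimensional token embeddings is normalized row by row when the pooling type is `none` or `unspecified`, while a pooled buffer is normalized as one vector.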
macOS/iOS:
Linux:
Windows:
- Windows x64 (CPU)
- Windows arm64 (CPU)
- Windows x64 (CUDA 12) - CUDA 12.4 DLLs
- Windows x64 (CUDA 13) - CUDA 13.1 DLLs
- Windows x64 (Vulkan)
- Windows x64 (SYCL)
- Windows x64 (HIP)
openEuler: