b7513

Dec 22, 2025
Meta/llama.cpp CLI vb7513

ggml-hexagon: gelu optimization (#18151)

  • feat: working gelu with src0 put on vtcm

  • feat: gelu ping-pong for both in and out

  • fix: fix compile error

  • break: distinguish dma ddr->vtcm and vtcm->ddr operation

  • fix: fix dma queue size

  • break: update dma api to either pop src or dst ptr

  • fix: fix activation vtcm allocation issue for src1 when swapped

  • refactor: ping-pong gelu logic to avoid unnecessary if else

  • dma: improved queue interface and prefetch handling

  • gelu: fix N+2 block prefetch


Co-authored-by: Max Krasnyansky <maxk@qti.qualcomm.com>
