b8400

Mar 17, 2026
Meta/llama.cpp · CLI · vb8400

hexagon: add neg, exp, sigmoid, softplus ops, cont, repeat ops (#20701)

Add the element-wise unary ops (neg, exp, sigmoid, softplus) needed by Qwen 3.5's DeltaNet linear attention layers, along with CONT and REPEAT. The unary ops follow the existing unary-ops pattern with VTCM DMA double-buffering.

  • neg: negate via scale by -1.0
  • exp: uses existing hvx_exp_f32 HVX intrinsics
  • sigmoid: uses existing hvx_sigmoid_f32_aa HVX intrinsics
  • softplus: log(1 + exp(x)) scalar fallback
  • CONT reuses the existing CPY infrastructure since making a tensor contiguous is equivalent to a same-type copy.
  • REPEAT implements tiled memory copy with multi-threaded execution via the worker pool, supporting f32 and f16 types. The kernel parallelizes across output rows and uses memcpy for each tile.

Co-authored-by: Max Krasnyansky <maxk@qti.qualcomm.com>
