b7748
llama : add adaptive-p sampler (#17927)
initial commit for branch
simplify constants
add params to struct common_params_sampling, add reference to PR
explicitly clamp min_target and max_target to [0.0, 1.0]
add args, rename queue_size -> window_size
improved comments
minor
remove old unused code from algorithm
minor
add power law case to common_sampler_init, add sampler name mappings
clarify behaviour when window_size = 0
add missing enums
remove target_range param, make target == 1 no-op, cleanup code
oops, straggler
add missing parameters in server-task.cpp
copy from author
ref: https://gist.github.com/MrJackSpade/9be99c7efbba7b95a41377e123b7b069
remove old debug log, style nit
fix compiler warning, add commented-out logging per token
re-write + change parameters + simplify
oops forgot args.cpp
fix leftover window_size
add missing values to common_params_sampling::print()
with logging
does this fix it?
no, but does this?
update default decay
optimize
fix bad merge
my git skills are lacking
silence missing initializer for member
update default decay to 0.9
fix logging
format (double)
add power law to the new samplers vector
log sampler init values
improve logging messages in llama_sampler_power_law
remove extraneous logging
simplify target computation
last commit with debug logging!
remove debug logging, explicitly clamp params at init
add use_power_law flag + logic, minor cleanup
update power-law -> adaptive-p
fix cold start EMA
ctx->weighted_sum is now initialized and reset to target / (1.0f - clamped_decay)
ctx->total_weight is now initialized and reset to 1.0f / (1.0f - clamped_decay)
this fixes a "cold start" problem with the moving average
update SHARPNESS constant to 10.0f
minor style fixes
no functional changes
minor style fixes cont.
update llama_sampler_adaptive_p_i for backend sampling (ref: #17004)
separate into apply + accept functions
pending_token_idx: switch from llama_token to int32
functionally identical (llama.h has typedef int32_t llama_token;),
but it's more correct now
don't transform logits <= -1e9f
fix masking in backend top-p, min-p
address review comments
typo in comments RND -> RNG
add docs
add recommended values in completion docs
address PR feedback
remove trailing whitespace (for CI editorconfig)
add adaptive-p to common_sampler_types_from_chars
macOS/iOS:
Linux:
Windows:
- Windows x64 (CPU)
- Windows arm64 (CPU)
- Windows x64 (CUDA 12) - CUDA 12.4 DLLs
- Windows x64 (CUDA 13) - CUDA 13.1 DLLs
- Windows x64 (Vulkan)
- Windows x64 (SYCL)
- Windows x64 (HIP)
openEuler: