Pull requests: ggml-org/llama.cpp
cuda : enable CUDA graphs for MMID 1 <= BS <= 4
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#19645
opened Feb 15, 2026 by
ggerganov
1 task
common : fix Step-3.5-Flash format detection and thinking support
testing
Everything test related
#19635
opened Feb 15, 2026 by
jesseposner
[server] save generated text for the /slots endpoint (for LLAMA_SERVER_SLOTS_DEBUG=1)
examples
server
#19622
opened Feb 14, 2026 by
matteoserva
Fix gpt-oss tool calling: pass tool args and tool responses as json
#19620
opened Feb 14, 2026 by
matteoserva
opencl: optimize mean and sum_row kernels
ggml
changes relating to the ggml tensor library for machine learning
OpenCL
Issues specific to the OpenCL backend
fix: GLM 4.5 streaming tool-call parsing + grammar error handling
examples
server
testing
Everything test related
#19612
opened Feb 14, 2026 by
Gunther-Schulz
Add support for Tiny Aya Models
python
python script changes
#19611
opened Feb 14, 2026 by
saurabhdash2512
ggml: ggml-cpu: force-no-lto-for-cpu-feats
ggml
changes relating to the ggml tensor library for machine learning
#19609
opened Feb 13, 2026 by
talhaHavadar
llama : add group feature to split-mode to minimize GPU spread for running a model
examples
server
#19608
opened Feb 13, 2026 by
dan-and
metal: use mul_mv_ext for large n on non-simdgroup_mm GPUs
Apple Metal
https://en.wikipedia.org/wiki/Metal_(API)
ggml
changes relating to the ggml tensor library for machine learning
#19600
opened Feb 13, 2026 by
ai-janitor
3 of 4 tasks
models : deduplicate delta-net graphs for Qwen family
model
Model specific
#19597
opened Feb 13, 2026 by
ggerganov
2 tasks
mtmd : chat : Fix extra \n between text and media marker
examples
server
#19595
opened Feb 13, 2026 by
tdakhran
Add a build target to generate ROCm artifacts using ROCm 7.11
devops
improvements to build systems and github actions
#19594
opened Feb 13, 2026 by
superm1
docs : explicit about banning accounts that violates policy
#19593
opened Feb 13, 2026 by
ngxson
Adjust workaround for ROCWMMA_FATTN/GFX9 to only newer ROCm versions
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#19591
opened Feb 13, 2026 by
superm1
WASM Relaxed SIMD Enhancement
ggml
changes relating to the ggml tensor library for machine learning
#19590
opened Feb 13, 2026 by
JeremyCEY
hexagon : fix build release (#19444)
build
Compilation issues
#19587
opened Feb 13, 2026 by
mengshengwu
server: add Anthropic-compatible cache_read_input_tokens to usage metrics
examples
python
python script changes
server
#19572
opened Feb 12, 2026 by
wrapss
1 task done
[CMake] Enable test-chat out of tree build
testing
Everything test related
#19558
opened Feb 12, 2026 by
jplehr
webui: add lazy markdown loading for improved performance
examples
server
#19557
opened Feb 12, 2026 by
shem-aleph
1 task done
templates : fix double-escaping in gpt-oss tool call arguments and responses
#19553
opened Feb 12, 2026 by
yurekami
3 tasks
make ggml_is_view as API
ggml
changes relating to the ggml tensor library for machine learning
#19539
opened Feb 12, 2026 by
foldl