ggml-org / llama.cpp Public
Fork 14.9k
Star 95.1k

Pull requests: ggml-org/llama.cpp

Labels 90 Milestones 0

716 Open 8,954 Closed

Pull requests list

cuda : enable CUDA graphs for MMID 1 <= BS <= 4 ggml

changes relating to the ggml tensor library for machine learning

Nvidia GPU

Issues specific to Nvidia GPUs

#19645 opened Feb 15, 2026 by ggerganov

1 task

graph : fix KQ mask, lora, cvec reuse checks

#19644 opened Feb 15, 2026 by ggerganov

common : fix Step-3.5-Flash format detection and thinking support testing

Everything test related

#19635 opened Feb 15, 2026 by jesseposner

Vulkan Scalar Flash Attention Refactor ggml

changes relating to the ggml tensor library for machine learning

Vulkan

Issues specific to the Vulkan backend

#19625 opened Feb 14, 2026 by 0cc4m • Draft

[server] save generated text for the /slots endpoint (for LLAMA_SERVER_SLOTS_DEBUG=1) examples server

#19622 opened Feb 14, 2026 by matteoserva

Fix gpt-oss tool calling: pass tool args and tool responses as json

#19620 opened Feb 14, 2026 by matteoserva

[WIP] refactor llama-quant.cpp examples

#19616 opened Feb 14, 2026 by ddh0 • Draft

opencl: optimize mean and sum_row kernels ggml

changes relating to the ggml tensor library for machine learning

OpenCL

Issues specific to the OpenCL backend

#19614 opened Feb 14, 2026 by shaofeiqi • Draft

fix: GLM 4.5 streaming tool-call parsing + grammar error handling examples server testing

Everything test related

#19612 opened Feb 14, 2026 by Gunther-Schulz

Add support for Tiny Aya Models python

python script changes

#19611 opened Feb 14, 2026 by saurabhdash2512

ggml: ggml-cpu: force-no-lto-for-cpu-feats ggml

changes relating to the ggml tensor library for machine learning

#19609 opened Feb 13, 2026 by talhaHavadar

llama : add group feature to split-mode to minimize GPU spread for running a model examples server

#19608 opened Feb 13, 2026 by dan-and

metal: use mul_mv_ext for large n on non-simdgroup_mm GPUs Apple Metal

https://en.wikipedia.org/wiki/Metal_(API)

ggml

changes relating to the ggml tensor library for machine learning

#19600 opened Feb 13, 2026 by ai-janitor

3 of 4 tasks

models : deduplicate delta-net graphs for Qwen family model

Model specific

#19597 opened Feb 13, 2026 by ggerganov

2 tasks

mtmd : chat : Fix extra \n between text and media marker examples server

#19595 opened Feb 13, 2026 by tdakhran

Add a build target to generate ROCm artifacts using ROCm 7.11 devops

improvements to build systems and github actions

#19594 opened Feb 13, 2026 by superm1

docs : explicit about banning accounts that violates policy

#19593 opened Feb 13, 2026 by ngxson

Adjust workaround for ROCWMMA_FATTN/GFX9 to only newer ROCm versions ggml

changes relating to the ggml tensor library for machine learning

Nvidia GPU

Issues specific to Nvidia GPUs

#19591 opened Feb 13, 2026 by superm1

WASM Relaxed SIMD Enhancement ggml

changes relating to the ggml tensor library for machine learning

#19590 opened Feb 13, 2026 by JeremyCEY

hexagon : fix build release (#19444) build

Compilation issues

#19587 opened Feb 13, 2026 by mengshengwu

server: add Anthropic-compatible cache_read_input_tokens to usage metrics examples python

python script changes

server

#19572 opened Feb 12, 2026 by wrapss

1 task done

[CMake] Enable test-chat out of tree build testing

Everything test related

#19558 opened Feb 12, 2026 by jplehr

webui: add lazy markdown loading for improved performance examples server

#19557 opened Feb 12, 2026 by shem-aleph

1 task done

templates : fix double-escaping in gpt-oss tool call arguments and responses

#19553 opened Feb 12, 2026 by yurekami

3 tasks

make ggml_is_view as API ggml

changes relating to the ggml tensor library for machine learning

#19539 opened Feb 12, 2026 by foldl
