
Pull requests: ggml-org/llama.cpp

Pull requests list

readme: Fix typo in InternVL model name
#13440 opened May 10, 2025 by 99991 · Labels: documentation
CUDA: fix crash with partial offloading of MoE
#13439 opened May 10, 2025 by JohannesGaessler · Labels: ggml, Nvidia GPU
CUDA: fix race conditions in FlashAttention kernels
#13438 opened May 10, 2025 by JohannesGaessler · Labels: ggml, Nvidia GPU
tools : fix invalid free()
#13436 opened May 10, 2025 by aumfer · Labels: examples, server
CUDA: faster Deepseek FA, add Turing support
#13435 opened May 10, 2025 by JohannesGaessler · Labels: ggml, Nvidia GPU
Webui dynamic config
#13429 opened May 10, 2025 by ServeurpersoCom · Labels: examples, server
vocab : add ByteDance-Seed/Seed-Coder
#13423 opened May 10, 2025 by CISC · Labels: python
sycl: enable dpcpp nightly builds with oneMKL and oneDNN
#13406 opened May 9, 2025 by AD2605 · Labels: ggml, SYCL
Add --disable-op-offload to improve -ot pp perf in MoE models like llama4 400B
#13386 opened May 8, 2025 by hjc4869 · Labels: examples, ggml, testing
arg : add model catalog
#13385 opened May 8, 2025 by ngxson · Draft
sycl: simplify bin_bcast_kernel
#13383 opened May 8, 2025 by AD2605 · Labels: ggml, SYCL
musa: restore MUSA graph settings in CMakeLists.txt
#13382 opened May 8, 2025 by yeahdongcn · Draft · Labels: ggml, Nvidia GPU
gguf-py: Optimize GGUFReader read-only mode performance
#13378 opened May 8, 2025 by Isotr0py · Labels: python
llama: Fix typos in multiple files
#13369 opened May 8, 2025 by co63oc · Labels: ggml, Kompute, python, SYCL
CUDA: update build CTK version to 12.8
#13360 opened May 7, 2025 by thevishalagarwal · Labels: devops, ggml, Nvidia GPU
SYCL: Fix test-backend-ops crashes with SYCL-Graph
#13357 opened May 7, 2025 by EwanC · Draft · Labels: ggml, SYCL
common: add default reranker presets
#13352 opened May 7, 2025 by ca1yz
python : bump transformers version
#13351 opened May 7, 2025 by ngxson · Labels: python
Add mistral-chat-7b preset for llama-server
#13348 opened May 7, 2025 by vahedshaik · Labels: examples
Support Sp token Function Call Token Implementation
#13339 opened May 6, 2025 by glide-the