The easiest way is to set a breakpoint here and wait for the assistant message:
https://github.com/ggml-org/llama.cpp/blob/0cf6725e9f9a164c39f7a87214d60342f7f946d8/tools/main/main.cpp#L270
I noticed this char before; I always just assumed it was a spurious prompt print (since most templates end with >), but I see now that it's repeating the last processed token of the template.
Name and Version
version: 5327 (27ebfca)
built with cc (Debian 12.2.0-14) 12.2.0 for x86_64-linux-gnu
Operating systems
Linux
GGML backends
CUDA
Hardware
NVIDIA
Models
all
Problem description & steps to reproduce
After the user prompt is provided, the code enters this branch:
llama.cpp/tools/main/main.cpp, line 716 (commit 0cf6725)
No new tokens are generated.
However, the following code assumes that a new token was generated and inserts it into the assistant response:
llama.cpp/tools/main/main.cpp, line 824 (commit 0cf6725)
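To make the failure mode concrete, here is a minimal standalone sketch. The names `embd` and `assistant_ss` mirror those used in main.cpp, but the code below is an illustrative stand-in, not the actual implementation: since the interactive branch samples nothing new, the buffer that the assistant-message loop reads from still holds the last template token, and that token gets appended to the reply.

```cpp
// Simplified stand-in for the flow in tools/main/main.cpp (not the real code):
// after the user prompt is consumed in conversation mode, no new token is
// sampled, but the display/record buffer still contains the last token that
// was processed from the chat template. The loop that builds the assistant
// message assumes everything in that buffer is freshly generated and appends
// it, so the template's trailing piece (often ">") leaks into the reply.
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

int main() {
    // pretend this is what is left over after processing the chat template
    std::vector<std::string> embd = { ">" };   // last processed template piece
    std::ostringstream assistant_ss;           // accumulates the assistant reply

    // interactive branch: user input handled, nothing new generated,
    // but embd is not cleared before the recording loop runs
    for (const std::string & piece : embd) {
        assistant_ss << piece;                 // stale token ends up in the reply
    }

    std::cout << "assistant reply starts with: '" << assistant_ss.str() << "'\n";
}
```

A fix along the lines of skipping the recording step (or clearing the buffer) when no new token was actually produced would avoid the stray character, though the exact change belongs in the branch referenced above.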
First Bad Commit
No response
Relevant log output