Eval bug: llama-cli, spurious token added to assistant response #13402

matteoserva opened this issue May 9, 2025 · 1 comment
Labels
bug Something isn't working

Comments

@matteoserva
Contributor

Name and Version

version: 5327 (27ebfca)
built with cc (Debian 12.2.0-14) 12.2.0 for x86_64-linux-gnu

Operating systems

Linux

GGML backends

CUDA

Hardware

NVIDIA

Models

all

Problem description & steps to reproduce

After the user prompt is provided, the code enters this branch:

LOG_DBG("embd_inp.size(): %d, n_consumed: %d\n", (int) embd_inp.size(), n_consumed);

No new token is generated in this branch.

However, the following code assumes that a new token was sampled and appends it to the assistant response:

assistant_ss << common_token_to_piece(ctx, id, false);
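To illustrate the failure mode, here is a minimal standalone sketch (not llama.cpp's actual code; `collect_buggy`, `collect_fixed`, and their signatures are hypothetical). When the prompt/template has just been evaluated, `id` still holds the template's last token (often `>`), so appending it unconditionally leaks that stale token into the assistant buffer:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical sketch of the bug: `id` is initialized from the last
// processed template token and appended even though sampling has not
// produced a new token yet. Assumes `prompt` is non-empty.
std::string collect_buggy(const std::vector<std::string> &prompt,
                          const std::vector<std::string> &sampled) {
    std::ostringstream assistant_ss;
    std::string id = prompt.back();   // stale: last template token, e.g. ">"
    assistant_ss << id;               // spurious append before any sampling
    for (const auto &t : sampled)
        assistant_ss << t;
    return assistant_ss.str();
}

// Guarded shape: only tokens actually sampled reach the assistant buffer.
std::string collect_fixed(const std::vector<std::string> &prompt,
                          const std::vector<std::string> &sampled) {
    (void)prompt;                     // prompt tokens are never echoed
    std::ostringstream assistant_ss;
    for (const auto &t : sampled)
        assistant_ss << t;
    return assistant_ss.str();
}
```

With `prompt = {"<s>", ">"}` and `sampled = {"Hello"}`, the buggy variant yields `">Hello"` while the guarded variant yields `"Hello"`, matching the spurious leading character reported here.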

First Bad Commit

No response

Relevant log output

The easiest way to reproduce is to set a breakpoint here and wait for the assistant message:

https://github.com/ggml-org/llama.cpp/blob/0cf6725e9f9a164c39f7a87214d60342f7f946d8/tools/main/main.cpp#L270
@CISC CISC added bug Something isn't working and removed bug-unconfirmed labels May 9, 2025
@CISC
Collaborator

CISC commented May 9, 2025

I noticed this char before; I always just assumed it was a spurious prompt print (since most templates end with >), but I see now that it's repeating the last processed token of the template.
