I'm experiencing a token count discrepancy when using the Qwen2.5-VL model.
I uploaded 5 images at 2048 x 1365 resolution with the setting max_pixels=16384, and the endpoint reported a total of around 71,000 tokens. However, when I preprocess the same images locally with the Qwen2.5-VL processor, the total token count is only around 18,000 tokens.
This large difference is causing inference failures on hosted endpoints due to token limits.
Could you please help clarify:
1. Why is the token count so much higher via the endpoint?
2. Is there a recommended way to align the endpoint token count with local processing?
3. Is there any way to reduce token usage on multi-image inputs?
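For reference, here is a back-of-the-envelope sketch of how image resolution maps to visual tokens. It assumes Qwen2.5-VL's documented patch size of 14 with a 2x2 patch merge (so one visual token per 28x28 pixel block) and a smart-resize step that clamps total area into [min_pixels, max_pixels]; the default bounds below are assumptions taken from the Qwen2-VL processor defaults, not measured from the endpoint.

```python
import math

# One visual token per 28x28 pixel block: patch_size (14) * merge_size (2).
FACTOR = 28

def estimate_visual_tokens(height, width, min_pixels=56 * 56, max_pixels=1280 * 28 * 28):
    """Rough estimate of visual tokens for one image, mimicking smart-resize."""
    # Round each side to the nearest multiple of 28.
    h_bar = max(FACTOR, round(height / FACTOR) * FACTOR)
    w_bar = max(FACTOR, round(width / FACTOR) * FACTOR)
    if h_bar * w_bar > max_pixels:
        # Scale down so the total area fits under max_pixels.
        beta = math.sqrt((height * width) / max_pixels)
        h_bar = math.floor(height / beta / FACTOR) * FACTOR
        w_bar = math.floor(width / beta / FACTOR) * FACTOR
    elif h_bar * w_bar < min_pixels:
        # Scale up so the total area reaches min_pixels.
        beta = math.sqrt(min_pixels / (height * width))
        h_bar = math.ceil(height * beta / FACTOR) * FACTOR
        w_bar = math.ceil(width * beta / FACTOR) * FACTOR
    return (h_bar // FACTOR) * (w_bar // FACTOR)

# Example: a 280x280 image maps to a 10x10 grid of tokens.
print(estimate_visual_tokens(280, 280))  # 100
```

Under these assumptions the per-image token count is bounded by max_pixels / 784, so a large gap between endpoint and local counts suggests the two sides are applying different min_pixels/max_pixels (or no resizing at all) to the same images.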
System Info
Thanks in advance!
Information
Tasks
Reproduction
from openai import OpenAI

# Client pointed at the hosted TGI endpoint; base_url and api_key are placeholders.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="-")

image_block = {
    "type": "image_url",
    "image_url": {
        "url": "https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-VL/assets/demo.jpeg"
    },
}

chat_completion = client.chat.completions.create(
    model="tgi",
    messages=[
        {
            "role": "user",
            # The same demo image is sent 5 times, followed by the text prompt.
            "content": [image_block] * 5
            + [{"type": "text", "text": "Describe this image in one sentence."}],
        }
    ],
    top_p=None,
    temperature=None,
    max_tokens=150,
    stream=True,
    seed=None,
    stop=None,
    frequency_penalty=None,
    presence_penalty=None,
)

# Consume the stream.
for chunk in chat_completion:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
Expected behavior
The total token count should be about 18,000, matching local preprocessing.
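On reducing multi-image token usage (question 3), one client-side lever is to cap max_pixels per image: under the assumption of one visual token per 28x28 pixel block, a token budget translates directly into a pixel budget. The helper name below is hypothetical; the min_pixels/max_pixels knobs themselves are documented on the Qwen2.5-VL processor.

```python
# Hypothetical helper: pick a max_pixels value that caps each image at
# token_budget visual tokens, assuming one token per 28x28 pixel block
# (Qwen2.5-VL's patch_size=14 with a 2x2 merge).
def max_pixels_for_budget(token_budget):
    return token_budget * 28 * 28

# e.g. cap each image at 512 visual tokens:
print(max_pixels_for_budget(512))  # 401408
```

The resulting value could then be passed as max_pixels when constructing the processor (or via qwen_vl_utils) so that five images cost at most 5 x token_budget visual tokens.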