unexpected error when creating modelInstanceState: [json.exception.out_of_range.403] key 'name' not found #467

Open
@Godlovecui

System Info

8 x RTX 4090, 24 GB
tensorrt_llm version: 0.11.0.dev2024051400

Who can help?

@t

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

export HF_LLAMA_MODEL=/network/model/Meta-Llama-3-8B
export ENGINE_PATH=/network/engine/engine_outputs_llama3_8B
python3 tools/fill_template.py -i llama_ifb/preprocessing/config.pbtxt tokenizer_dir:${HF_LLAMA_MODEL},triton_max_batch_size:64,preprocessing_instance_count:1
python3 tools/fill_template.py -i llama_ifb/postprocessing/config.pbtxt tokenizer_dir:${HF_LLAMA_MODEL},triton_max_batch_size:64,postprocessing_instance_count:1
python3 tools/fill_template.py -i llama_ifb/tensorrt_llm_bls/config.pbtxt triton_max_batch_size:64,decoupled_mode:False,bls_instance_count:1,accumulate_tokens:False
python3 tools/fill_template.py -i llama_ifb/ensemble/config.pbtxt triton_max_batch_size:64
python3 tools/fill_template.py -i llama_ifb/tensorrt_llm/config.pbtxt triton_backend:tensorrtllm,triton_max_batch_size:64,decoupled_mode:False,max_beam_width:1,engine_dir:${ENGINE_PATH},max_tokens_in_paged_kv_cache:2560,max_attention_window_size:2560,kv_cache_free_gpu_mem_fraction:0.5,exclude_input_in_output:True,enable_kv_cache_reuse:False,batching_strategy:inflight_fused_batching,max_queue_delay_microseconds:0

pip install SentencePiece
python3 scripts/launch_triton_server.py --world_size 8 --model_repo=llama_ifb/
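
One thing that may be worth ruling out before launching: whether every `${...}` placeholder in the generated `config.pbtxt` files was actually substituted. The sketch below is only a diagnostic aid, not part of the official tooling. It assumes the templates use `${name}`-style placeholders and that each model lives at `<model_repo>/<model>/config.pbtxt`; the helper names `unfilled_placeholders` and `scan_model_repo` are hypothetical. Leftover placeholders are not necessarily the cause of the error reported here.

```python
import re
import sys
from pathlib import Path

# Assumption: template placeholders look like ${some_name}.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def unfilled_placeholders(text):
    """Return the sorted, de-duplicated placeholder names still present in text."""
    return sorted(set(PLACEHOLDER.findall(text)))


def scan_model_repo(model_repo):
    """Map each <model>/config.pbtxt under model_repo to its leftover placeholders."""
    leftovers = {}
    for path in sorted(Path(model_repo).glob("*/config.pbtxt")):
        names = unfilled_placeholders(path.read_text())
        if names:
            leftovers[str(path)] = names
    return leftovers


if __name__ == "__main__":
    # Usage: python3 check_placeholders.py llama_ifb/
    repo = sys.argv[1] if len(sys.argv) > 1 else "llama_ifb/"
    for config_file, names in scan_model_repo(repo).items():
        print(f"{config_file}: {', '.join(names)}")
```

If the script prints nothing, no `${...}` placeholders remain in the model repository.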

Expected behavior

The Triton server should start without errors.

Actual behavior

When I deploy Llama 3 (Meta-Llama-3-8B) on 8 x RTX 4090, the server raises the error below:

[image: screenshot of the error]

How can I fix it? Thanks.
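
Since the message in the title is a JSON lookup failure for the key `name`, it may help to see which keys the engine's JSON config actually contains. The following is a minimal sketch under stated assumptions: that the engine directory holds a file named `config.json`, and that this is the file being parsed when the error is raised (neither is confirmed in this issue). The helpers `find_key_paths` and `report` are hypothetical names introduced for illustration.

```python
import json
import os
from pathlib import Path


def find_key_paths(node, key, prefix=""):
    """Return the dotted path of every occurrence of key in nested JSON data."""
    paths = []
    if isinstance(node, dict):
        for k, v in node.items():
            path = f"{prefix}.{k}" if prefix else k
            if k == key:
                paths.append(path)
            paths.extend(find_key_paths(v, key, path))
    elif isinstance(node, list):
        for i, v in enumerate(node):
            paths.extend(find_key_paths(v, key, f"{prefix}[{i}]"))
    return paths


def report(engine_dir, key="name"):
    """Return (sorted top-level keys, paths where key occurs) for the engine config."""
    config_path = Path(engine_dir) / "config.json"  # assumed file name
    config = json.loads(config_path.read_text())
    return sorted(config), find_key_paths(config, key)


if __name__ == "__main__":
    # ENGINE_PATH is the variable exported in the reproduction steps.
    top_level, hits = report(os.environ["ENGINE_PATH"])
    print("top-level keys:", top_level)
    print("paths containing 'name':", hits or "none")
```

If `name` does not appear anywhere, one possible explanation (an assumption, not something established here) is that the engine was built with a different TensorRT-LLM version than the one the backend expects, so comparing the two versions would be a reasonable next step.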

Additional notes

None

Metadata

Labels

bug: Something isn't working
triaged: Issue has been triaged by maintainers
