Misc. bug: error in remote conversion for the new ServiceNow Nemotron 15B model #13354

Open
pwilkin opened this issue May 7, 2025 · 0 comments

pwilkin commented May 7, 2025

Name and Version

ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
Device 0: NVIDIA GeForce RTX 3080, compute capability 8.6, VMM: yes
version: 5289 (15a28ec)
built with cc (Ubuntu 14.2.0-4ubuntu2) 14.2.0 for x86_64-linux-gnu

Operating systems

Linux

Which llama.cpp modules do you know to be affected?

Python/Bash scripts

Command line

convert_hf_to_gguf.py --remote ServiceNow-AI/Apriel-Nemotron-15b-Thinker

Problem description & steps to reproduce

I'm not sure whether this is a problem with the converter, Hugging Face, or the model itself, but I'm leaving it here for reference. Local conversion (after downloading the model with huggingface_hub) works fine.

INFO:hf-to-gguf:Using remote model with HuggingFace id: ServiceNow-AI/Apriel-Nemotron-15b-Thinker
Traceback (most recent call last):
  File "/devel/tools/llama.cpp/convert_hf_to_gguf.py", line 6104, in <module>
    main()
  File "/devel/tools/llama.cpp/convert_hf_to_gguf.py", line 6098, in main
    model_instance.write()
  File "/devel/tools/llama.cpp/convert_hf_to_gguf.py", line 401, in write
    self.prepare_tensors()
  File "/devel/tools/llama.cpp/convert_hf_to_gguf.py", line 1933, in prepare_tensors
    super().prepare_tensors()
  File "/devel/tools/llama.cpp/convert_hf_to_gguf.py", line 259, in prepare_tensors
    for name, data_torch in chain(self.generate_extra_tensors(), self.get_tensors()):
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/devel/tools/llama.cpp/convert_hf_to_gguf.py", line 107, in get_remote_tensors
    remote_tensors = gguf.utility.SafetensorRemote.get_list_tensors_hf_model(remote_hf_model_id)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/devel/tools/llama.cpp/gguf-py/gguf/utility.py", line 134, in get_list_tensors_hf_model
    index_json = json.loads(index_str)
                 ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
               ^^^^^^^^^^^^^^^^^^^^^^
json.decoder.JSONDecodeError: Expecting ',' delimiter: line 1 column 35495 (char 35494)
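For reference, the same class of error can be reproduced offline with a truncated JSON document: Python's decoder reports the character offset at which parsing failed, which suggests the remotely fetched `model.safetensors.index.json` was cut off or otherwise malformed around char 35494. A minimal sketch (the payload below is a hypothetical, hand-truncated index, not the actual server response):

```python
import json

# Hypothetical, hand-truncated safetensors index -- not the real Hub
# response, just an illustration of the same failure mode.
truncated = '{"metadata": {"total_size": 123}, "weight_map": {"w": "model-00001.safetensors"'

try:
    json.loads(truncated)
except json.JSONDecodeError as e:
    # e.pos is the 0-based character offset where decoding failed,
    # corresponding to the "(char 35494)" in the traceback above.
    print(f"{e.msg} at char {e.pos}")
```

Inspecting `e.pos` against the length of the fetched string would show whether the index was truncated mid-transfer or the server returned something other than valid JSON (e.g. an HTML error page).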