Error in load_labels: UnicodeDecodeError: 'ascii' codec can't decode byte 0xe1 in position 1: ordinal not in range(128) #13310

Open
swevrywhere opened this issue Apr 28, 2025 · 1 comment
Labels
bug Something isn't working

Comments

@swevrywhere

Describe the bug

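Initializing the French (fr) text normalizer fails with a UnicodeDecodeError: load_labels reads data/whitelist.tsv with the ASCII codec and stops at the non-ASCII byte 0xe1.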

Steps/Code to reproduce bug

I'm seeing the following error when the French text normalizer is initialized:

  File "/usr/local/lib/python3.10/dist-packages/nemo_text_processing/text_normalization/normalize.py", line 172, in __init__
    self.tagger = ClassifyFst(
  File "/usr/local/lib/python3.10/dist-packages/nemo_text_processing/text_normalization/fr/taggers/tokenize_and_classify.py", line 85, in __init__
    self.whitelist = WhiteListFst(input_case=input_case, deterministic=deterministic, input_file=whitelist)
  File "/usr/local/lib/python3.10/dist-packages/nemo_text_processing/text_normalization/fr/taggers/whitelist.py", line 44, in __init__
    graph = _get_whitelist_graph(input_case, get_abs_path("data/whitelist.tsv"))
  File "/usr/local/lib/python3.10/dist-packages/nemo_text_processing/text_normalization/fr/taggers/whitelist.py", line 38, in _get_whitelist_graph
    whitelist = load_labels(file)
  File "/usr/local/lib/python3.10/dist-packages/nemo_text_processing/text_normalization/fr/utils.py", line 41, in load_labels
    labels = list(csv.reader(label_tsv, delimiter="\t"))
  File "/usr/lib/python3.10/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe1 in position 1: ordinal not in range(128)

This issue is similar to #3452.
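
For reference, the same class of error can be reproduced outside the package with a minimal sketch like the one below. It assumes the label file contains non-ASCII characters and ends up being decoded as ASCII, which is what the last traceback frame suggests. The helper `load_labels_utf8` and the sample rows are hypothetical; they are not the package's code and not necessarily the change made in the pull request. Passing an explicit `encoding="utf-8"` to `open()` is one common way to make the read independent of the environment's default encoding.

```python
import csv
import os
import tempfile


def load_labels_utf8(path):
    """Hypothetical variant of load_labels that pins the encoding to UTF-8."""
    with open(path, encoding="utf-8", newline="") as label_tsv:
        return list(csv.reader(label_tsv, delimiter="\t"))


# Build a small TSV with a non-ASCII entry (sample data, not the real whitelist).
with tempfile.NamedTemporaryFile(
    "w", encoding="utf-8", newline="", suffix=".tsv", delete=False
) as tmp:
    tmp.write("M.\tmonsieur\n")
    tmp.write("n°\tnuméro\n")
    tsv_path = tmp.name

# Decoding the same file as ASCII raises UnicodeDecodeError, as in the traceback.
try:
    with open(tsv_path, encoding="ascii", newline="") as label_tsv:
        list(csv.reader(label_tsv, delimiter="\t"))
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# Decoding as UTF-8 reads every row.
labels = load_labels_utf8(tsv_path)
os.remove(tsv_path)
```

Whether the installed package opens the file without an explicit encoding is an assumption here; the traceback only shows that the ASCII codec was in effect when the file was read.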

Expected behavior

The whitelist file should load without an error.

Environment overview (please complete the following information)

Installed with pip.

Additional context

I already have a proposed fix. I'll be submitting a pull request shortly.

@swevrywhere
Author

Pull request with the fix here: NVIDIA/NeMo-text-processing#272
