This repository was archived by the owner on Dec 19, 2020. It is now read-only.

Add test for unicode branch names. #245

Open: 1 commit to be merged into base branch master


richardxia
Member

This adds a test case that demonstrates a bug when the following conditions line up:

  1. You have a Git repo with a branch containing a non-ASCII character
  2. Your shell's locale has a non-UTF-8 encoding
  3. Your Python distribution defaults to a non-UTF-8 encoding when the locale environment variables (LC_ALL, LC_CTYPE, LANG) do not select a UTF-8 locale

More concretely, I can deterministically reproduce this on Ubuntu 16.04 using the distribution-provided Python 3. However, I cannot reproduce it on my MacBook, because Apple's Python 3 defaults to UTF-8.

You can check whether your Python installation is capable of reproducing the problem by looking at the output of the following command:

$ LANG=C python3 -c 'import locale; print(locale.getpreferredencoding(False))'

On Ubuntu 16.04, it returns ANSI_X3.4-1968 (the ANSI name for ASCII) for me. On my MacBook, it returns UTF-8.
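
For anyone who wants to run the same check from a script, here is a minimal sketch. The helper name `preferred_encoding_under` is hypothetical and not part of wit. It clears LC_ALL and LC_CTYPE so that LANG actually takes effect, then asks a child interpreter for its preferred encoding. The result depends on the platform and Python build; newer Pythons may coerce the C locale to UTF-8, so this may not print ASCII everywhere.

```python
import os
import subprocess
import sys


def preferred_encoding_under(lang="C"):
    """Return locale.getpreferredencoding(False) as seen by a child Python
    started with LANG=lang. Hypothetical helper, for illustration only."""
    env = dict(os.environ)
    # LC_ALL and LC_CTYPE take precedence over LANG, so drop them first.
    env.pop("LC_ALL", None)
    env.pop("LC_CTYPE", None)
    env["LANG"] = lang
    proc = subprocess.run(
        [sys.executable, "-c",
         "import locale; print(locale.getpreferredencoding(False))"],
        env=env,
        stdout=subprocess.PIPE,
        check=True,
    )
    # Encoding names are plain ASCII, so decoding the bytes this way is safe.
    return proc.stdout.decode("ascii").strip()


if __name__ == "__main__":
    print(preferred_encoding_under("C"))
```

If this prints something other than UTF-8 (for example ANSI_X3.4-1968), the installation should be able to reproduce the failure.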

The full traceback is:

  File "/usr/lib/python3.5/runpy.py", line 184, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/opt/sifive/wit/lib/wit/__main__.py", line 11, in <module>
    main()
  File "/opt/sifive/wit/lib/wit/main.py", line 65, in main
    create(args)
  File "/opt/sifive/wit/lib/wit/main.py", line 126, in create
    update(ws, args)
  File "/opt/sifive/wit/lib/wit/main.py", line 308, in update
    ws.checkout(packages)
  File "/opt/sifive/wit/lib/wit/workspace.py", line 201, in checkout
    package.checkout(self.root)
  File "/opt/sifive/wit/lib/wit/package.py", line 149, in checkout
    self.repo.checkout(self.revision)
  File "/opt/sifive/wit/lib/wit/gitrepo.py", line 248, in checkout
    proc_ref = self._git_command("show-ref")
  File "/opt/sifive/wit/lib/wit/gitrepo.py", line 293, in _git_command
    cwd=cwd)
  File "/usr/lib/python3.5/subprocess.py", line 695, in run
    stdout, stderr = process.communicate(input, timeout=timeout)
  File "/usr/lib/python3.5/subprocess.py", line 1072, in communicate
    stdout, stderr = self._communicate(input, endtime, timeout)
  File "/usr/lib/python3.5/subprocess.py", line 1754, in _communicate
    self.stdout.encoding)
  File "/usr/lib/python3.5/subprocess.py", line 976, in _translate_newlines
    data = data.decode(encoding)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 113387: ordinal not in range(128)

The issue is that we set universal_newlines=True in our subprocess.run() calls, which makes subprocess decode the child's output using the encoding reported by locale.getpreferredencoding(False). If Git prints a non-ASCII character, such as a branch name in the show-ref output, Python running in an ASCII locale raises UnicodeDecodeError while decoding the bytes into a str.
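
To make the failure mode concrete, here is a small sketch that does not need Git or a particular locale. It assumes Git emits ref names as UTF-8 bytes. The sample ref name and both helper names are made up for illustration. The first function mimics the decode step that fails in the traceback. The second shows one possible workaround, not necessarily the fix this project would adopt: capture raw bytes and decode them explicitly instead of relying on the locale.

```python
import subprocess
import sys

# Bytes similar to what Git might print for a ref with a non-ASCII character.
# U+2713 encodes to e2 9c 93 in UTF-8; the ref name itself is an assumption.
RAW = "refs/heads/test-\u2713\n".encode("utf-8")


def decode_like_ascii_locale(data):
    """Mimic text-mode decoding when the locale encoding is ASCII."""
    return data.decode("ascii")  # raises UnicodeDecodeError on byte 0xe2


def run_and_decode(cmd, cwd=None):
    """Hypothetical helper: capture bytes, then decode explicitly as UTF-8.

    errors="replace" keeps the call from raising on invalid byte sequences.
    """
    proc = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, cwd=cwd
    )
    return proc.returncode, proc.stdout.decode("utf-8", errors="replace")


if __name__ == "__main__":
    try:
        decode_like_ascii_locale(RAW)
    except UnicodeDecodeError as exc:
        print("ASCII decode failed:", exc)
    print(RAW.decode("utf-8"), end="")
```

Decoding explicitly makes the result independent of LANG and LC_ALL, at the cost of assuming the subprocess output really is UTF-8.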

@jackkoenig
Collaborator

Nice test branch name 🙂
