wrong name/email parsing when email is empty #833

Closed
@ghost

Description

In the file

https://github.com/gitpython-developers/GitPython/blob/master/git/util.py

lines 537 and 538 read:


name_only_regex = re.compile(r'<(.+)>')
name_email_regex = re.compile(r'(.*) <(.+?)>')

Since the email group (.+?) requires one or more characters, neither pattern matches an empty <>, so the string

"name <>"
is parsed as
name="name <>" email=None

The correct parse is name="name" email="".
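The mismatch can be reproduced with a short sketch. The two patterns are the ones quoted above; parse_actor is a hypothetical stand-in for the library's parsing and assumes the fallback order implied by the reported result (name and email first, then the <...>-only form, then the whole string as the name).

```python
import re

# The two patterns quoted in this issue.
name_only_regex = re.compile(r'<(.+)>')
name_email_regex = re.compile(r'(.*) <(.+?)>')


def parse_actor(text):
    """Hypothetical helper: return (name, email) using an assumed fallback order."""
    m = name_email_regex.search(text)
    if m:
        return m.group(1), m.group(2)
    m = name_only_regex.search(text)
    if m:
        return m.group(1), None
    # Neither pattern matched: the whole string is treated as the name.
    return text, None


old_result = parse_actor("name <>")
print(old_result)
```

Because both patterns demand at least one character between the angle brackets, neither matches "name <>" and the helper falls through to the last branch.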

This made my data pipeline produce wrong results. Changing those two lines to


name_only_regex = re.compile(r'<(.*)>')
name_email_regex = re.compile(r'(.*) <(.*?)>')

fixes the problem.
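A quick check of the proposed patterns, as a minimal sketch. The names with a fixed_ prefix and the helper parse_actor_fixed are hypothetical, and the fallback order is an assumption, not a copy of the library's code.

```python
import re

# Proposed patterns: the email group may now be empty.
fixed_name_only_regex = re.compile(r'<(.*)>')
fixed_name_email_regex = re.compile(r'(.*) <(.*?)>')


def parse_actor_fixed(text):
    """Hypothetical helper: return (name, email) using the proposed patterns."""
    m = fixed_name_email_regex.search(text)
    if m:
        return m.group(1), m.group(2)
    m = fixed_name_only_regex.search(text)
    if m:
        return m.group(1), None
    return text, None


new_result = parse_actor_fixed("name <>")
print(new_result)
```

With (.*?) the empty <> is accepted, so the name and the empty email are separated as expected, while a non-empty address still parses as before.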

For those like me who need something working right now, a workaround is to call git directly:
import os

# path (repository directory) and sha1 (commit id) must already be defined;
# [0:-1] drops the trailing newline from git's output.
author_name = os.popen(f"cd '{path}'; git --no-pager show -s --format='%aN' {sha1}").read()[0:-1]
author_email = os.popen(f"cd '{path}'; git --no-pager show -s --format='%aE' {sha1}").read()[0:-1]
commit_name = os.popen(f"cd '{path}'; git --no-pager show -s --format='%cN' {sha1}").read()[0:-1]
commit_email = os.popen(f"cd '{path}'; git --no-pager show -s --format='%cE' {sha1}").read()[0:-1]
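The same workaround can be sketched with subprocess, which avoids shell quoting of the path and fetches all four fields in one git call. This is only an illustration: FORMAT, FIELDS, parse_identities and read_identities are hypothetical names, and the sketch assumes git is on PATH and Python 3.7 or later.

```python
import subprocess

# NUL-separated fields: author name, author email, committer name, committer email.
FORMAT = "%aN%x00%aE%x00%cN%x00%cE"
FIELDS = ("author_name", "author_email", "commit_name", "commit_email")


def parse_identities(raw):
    """Split the raw output of git show into a dict keyed by FIELDS."""
    values = raw.rstrip("\n").split("\x00")
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


def read_identities(path, sha1):
    """Run git inside the repository at path and parse the identities of sha1."""
    result = subprocess.run(
        ["git", "-C", path, "--no-pager", "show", "-s", f"--format={FORMAT}", sha1],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if git fails
    )
    return parse_identities(result.stdout)
```

Splitting on NUL rather than on whitespace keeps an empty email as an empty string, which is the case this issue is about.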
