Git checkout of large repositories is very slow #87

Closed
@kaidokert

Description

Windows Build Number

10.0.18363.0

Processor Architecture

AMD64

Memory

200 GB

Storage Type, free / capacity

SSD 200GB/ 1TB

Relevant apps installed

git version 2.31.1.windows.1

Traces collected via Feedback Hub

N/A

Issue description

Checking out large repos, even from a local mirror, is slow compared to Linux / Mac.

Even more importantly, so is switching branches / tags.

Steps to reproduce

Let's download a well-known sample repo, about 23 GB:

git clone --mirror https://github.com/chromium/chromium.git chromium-mirror

Now let's check out a source tree from the local mirror:

PowerShell, NTFS drive:

PS> Measure-Command { git clone chromium-mirror chromium }
Cloning into 'chromium'...
done.
Updating files: 100% (362149/362149), done.

TotalMinutes      : 12.370800555

Just over 12 minutes.

On Linux, ext4, similar hardware:

$ time git clone chromium-mirror chrome
Cloning into 'chrome'...
done.
Updating files: 100% (362149/362149), done.
real	0m23.937s

About 24 seconds.

Now, let's check out a somewhat older tag.

PowerShell:

PS> Measure-Command { git checkout 60.0.3072.0 }
Updating files: 100% (539180/539180), done.

TotalMinutes      : 15.750022275

15 minutes to switch a tag.

On Linux:

$ time git checkout 60.0.3072.0
Updating files: 100% (539180/539180), done.
Note: switching to '60.0.3072.0'.
..
real	0m22.045s

Again, about 22 seconds

Finally, let's delete these experiment directories:

PowerShell:

PS> Measure-Command { rm -r chromium }
...
TotalMinutes      : 7.40425179833333

7 minutes

Linux:

$ time rm -rf chrome
real	0m5.580s

5 seconds
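
The steps above use two different timers (Measure-Command in PowerShell, time on Linux). A minimal sketch of a single wall-clock timer that runs unchanged on both systems might look like the following. The helper name time_command is hypothetical and not part of the original report, and it only measures elapsed time of the child process, nothing more.

```python
import subprocess
import time


def time_command(argv, cwd=None):
    """Run argv to completion and return the elapsed wall-clock seconds.

    Hypothetical helper for illustration; raises CalledProcessError
    if the command exits with a non-zero status.
    """
    start = time.perf_counter()
    subprocess.run(argv, cwd=cwd, check=True)
    return time.perf_counter() - start


# Example calls, mirroring the steps in the report (not executed here):
#   time_command(["git", "clone", "chromium-mirror", "chromium"])
#   time_command(["git", "checkout", "60.0.3072.0"], cwd="chromium")
```

Using the same timer on both platforms removes one variable from the comparison, although with differences this large the choice of timer is unlikely to matter.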

Expected Behavior

I would expect checkout speed on similar disks to be at least within the same order of magnitude.

Actual Behavior

The Git operations in this example are about 30-40x slower (clone: about 742 s vs 24 s; tag checkout: about 945 s vs 22 s), on almost identical hardware.
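
For reference, the slowdown factors can be worked out directly from the timings reported in the steps to reproduce. The short sketch below converts the PowerShell TotalMinutes values to seconds and divides by the Linux "real" times; the dictionary keys are just labels chosen for illustration.

```python
# Timings copied from the report: Windows values are TotalMinutes from
# Measure-Command, Linux values are the "real" seconds from `time`.
timings = {
    "git clone (local mirror)": (12.370800555 * 60, 23.937),
    "git checkout 60.0.3072.0": (15.750022275 * 60, 22.045),
    "rm -r": (7.40425179833333 * 60, 5.580),
}


def slowdown(windows_seconds, linux_seconds):
    """Return how many times longer the Windows run took."""
    return windows_seconds / linux_seconds


ratios = {name: slowdown(w, l) for name, (w, l) in timings.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.0f}x slower on Windows")
```

This gives roughly 31x for the clone, 43x for the tag checkout, and 80x for the recursive delete.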

Of course, the problem doesn't seem inherent to Git. It looks like a general I/O problem when working with large directory trees containing many files, e.g. the Node node_modules issues (#21) and others (#17, #27), as evidenced by the fact that rm -r took over 60x longer.
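
One way to check that the slowdown comes from the filesystem rather than from Git is to time the creation and deletion of many small files without Git involved at all. The following is a rough sketch under stated assumptions: the function name bench_small_files, the directory layout, and the default counts and file size are all arbitrary choices made for illustration, and the access pattern only loosely resembles what a Git checkout does.

```python
import os
import shutil
import time


def bench_small_files(root, n_dirs=100, files_per_dir=100, size=1024):
    """Create n_dirs * files_per_dir small files under root, then delete them.

    Returns (create_seconds, delete_seconds). Hypothetical benchmark;
    the defaults are illustrative, not tuned to match the Chromium tree.
    """
    payload = b"x" * size
    tree = os.path.join(root, "tree")

    start = time.perf_counter()
    for d in range(n_dirs):
        dir_path = os.path.join(tree, f"dir{d:05d}")
        os.makedirs(dir_path)
        for f in range(files_per_dir):
            file_path = os.path.join(dir_path, f"file{f:05d}.txt")
            with open(file_path, "wb") as fh:
                fh.write(payload)
    create_seconds = time.perf_counter() - start

    start = time.perf_counter()
    shutil.rmtree(tree)
    delete_seconds = time.perf_counter() - start

    return create_seconds, delete_seconds
```

Running this with root pointing at a directory on the same NTFS drive, and again on the ext4 machine, should show whether plain file creation and deletion differ by a similar factor. Results will also depend on factors the sketch does not control, such as caching and any software that scans files on access.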
