Clone huge 16 GB Git repo with Eclipse Neon


Is there any way to clone a huge Git repository (16+ GB) using the Git integration (EGit) of the latest Eclipse Neon?

I'm cloning over HTTP.

At first I ran into timeouts, so I increased the Remote connection timeout to 1800 seconds in the Eclipse configuration.

After that, the clone almost completes, but at the very end it always fails with a "Premature EOF" error.

I have also increased http.postBuffer to 524288000 (as many users suggested on StackOverflow), but this did not help much.
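For reference, here is a minimal sketch of how that setting can be written and read back from the command line. It uses a scratch config file (an assumption for the demo, so the real ~/.gitconfig stays untouched). As far as the Git documentation describes it, http.postBuffer sizes the buffer for data the client POSTs to the server, so it would be expected to matter more for pushes than for the pack data received during a clone.

```shell
set -e

# Scratch config file, used here only so the demo does not modify ~/.gitconfig.
cfg=$(mktemp)

# Equivalent real-world command: git config --global http.postBuffer 524288000
git config --file "$cfg" http.postBuffer 524288000

# Read the value back to confirm it was stored.
post_buffer=$(git config --file "$cfg" --get http.postBuffer)
echo "http.postBuffer = $post_buffer"
```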

I also tried cloning the master branch only, but again, I was stuck with the same error message.

Is EGit not capable of handling such a big repo over HTTP?

4

There are 4 best solutions below

0
On BEST ANSWER

Eventually, I ended up cloning the repository using an SSH connection.

This works fine, even from within Eclipse (using EGit).

I had to create an SSH key in the Eclipse preferences, since PuTTY's PPK format is not compatible with Eclipse. After that, I managed to clone the entire repository.
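If you would rather create the key outside Eclipse, a rough command-line equivalent could look like the sketch below. The key directory, file name, comment, and empty passphrase are assumptions chosen for the demo. For real use you would normally write the key to ~/.ssh and protect it with a passphrase.

```shell
set -e

keydir=$(mktemp -d)             # stand-in for ~/.ssh (demo assumption)
keyfile="$keydir/id_rsa_egit"   # hypothetical key file name

# Generate an RSA key pair in OpenSSH format (not PPK).
# -N "" sets an empty passphrase, which is only acceptable for a demo.
ssh-keygen -q -t rsa -b 4096 -N "" -C "egit-clone" -f "$keyfile"

# The public half is what gets registered with the Git server.
cat "$keyfile.pub"
```

The private key file can then be listed under the SSH2 preferences in Eclipse so that EGit picks it up.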

Seems like HTTP is not suited to download a chunk of 16+ GB. :)

1
On

Do you really have a code project that's 16GB? That's pretty crazy, man!

I think the least painful way to go about this is to open your shell and just type git clone http://my-url/project.git. Then see if you can make the repository somewhat smaller.

1
On

Depending on what you want to do with the repo, a shallow clone may be the solution (it won't bring the full Git history): https://www.perforce.com/blog/141218/git-beyond-basics-using-shallow-clones
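To illustrate what a shallow clone does, here is a small self-contained sketch. It builds a throwaway local repository with three commits to stand in for the real remote (all paths and names are made up for the demo), clones it with --depth 1, and then fetches the remaining history afterwards.

```shell
set -e

# Throwaway repository standing in for the real remote (demo assumption).
work=$(mktemp -d)
git init -q "$work/origin"
cd "$work/origin"
git config user.email "demo@example.com"
git config user.name "Demo"
for i in 1 2 3; do
    echo "revision $i" > file.txt
    git add file.txt
    git commit -q -m "commit $i"
done

# Shallow clone: fetch only the most recent commit.
# A file:// URL is used because --depth is ignored for plain local paths.
git clone -q --depth 1 "file://$work/origin" "$work/shallow"

cd "$work/shallow"
shallow_count=$(git rev-list --count HEAD)   # commits visible in the shallow clone

# If the full history is needed later, it can be fetched on demand.
git fetch -q --unshallow
full_count=$(git rev-list --count HEAD)      # commits visible after unshallowing

echo "shallow: $shallow_count, full: $full_count"
```

Against a real server the same idea is git clone --depth 1 followed by the repository URL. Whether that is enough to avoid the error from the question depends on how much of the 16 GB is history rather than the current checkout.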

Also, for such a big repo, consider using Git LFS in the future: https://git-lfs.github.com/

Finally, I've seen many Git repos that became huge because they contained files that were never supposed to be stored in Git (executables, binaries, videos, audio, and so on). If something like that happened by mistake, you can remove it from the history using filter-branch. Check this SO answer: How to remove/delete a large file from commit history in Git repository? or this GitHub article: https://help.github.com/articles/remove-sensitive-data/
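As a rough sketch of the filter-branch approach, the script below creates a throwaway repository, commits a binary "by mistake", and then rewrites the history so the file no longer appears in any commit. The repository and the file name big.bin are made up for the demo. On a real project, try this on a fresh clone first, since it rewrites commit hashes.

```shell
set -e

# Throwaway repository for the demo (hypothetical content).
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git config user.email "demo@example.com"
git config user.name "Demo"

echo "int main(void) { return 0; }" > main.c
git add main.c
git commit -q -m "add source"

# A file that should never have been committed.
printf 'pretend this is a huge binary' > big.bin
git add big.bin
git commit -q -m "add binary by mistake"

echo "/* more code */" >> main.c
git commit -q -am "more source"

# Rewrite every commit on every ref, dropping big.bin from each tree.
# --prune-empty discards commits that become empty after the removal.
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -f \
    --index-filter 'git rm -q --cached --ignore-unmatch big.bin' \
    --prune-empty -- --all

remaining=$(git rev-list --count HEAD -- big.bin)   # commits on HEAD touching big.bin
commits=$(git rev-list --count HEAD)                # commits left on HEAD
echo "commits touching big.bin: $remaining, commits on HEAD: $commits"
```

The old objects stay on disk until the refs/original backup and the reflogs are gone and git gc has run, and everyone else has to re-clone or reset onto the rewritten history after a force push.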

EDIT:

Microsoft has been developing GVFS, which may be a solution in the near future (I think it's still not ready, but I haven't tested it).

0
On

The only Git-related way to clone such a huge Git repo would be through the recent (February 2017) GVFS (Git Virtual File System).

As tweeted, for a 270GB repo:

“The Windows codebase has over 3.5M files. With GVFS (Git Virtual File System), cloning now takes a few minutes instead of 12+ hours.”

See github.com/Microsoft/GVFS.
GVFS is based on a fork of Git: github.com/Microsoft/git.
And based on a protocol whose specifications are described here.

This is not yet supported by EGit, or even regular Git for now.