No project of significant size that I've ever seen has retained its initial structure. Restructuring projects is a fact of life, but unfortunately Git doesn't make it easy.
Fundamentally this stems from the way Git works: it stores history as a succession of snapshots and keeps no separate metadata about renames or moves. Of course this is part of what makes Git fast and efficient, but it comes at the expense of making some common operations more difficult for users. Git really is a perfect 21st century illustration of the classic "Worse Is Better" paradigm of successful software 😀
Previously I discussed how to split a Git project apart into separate repositories.
Now I'm going to discuss how to do the opposite and merge separate repositories into one. On the face of it, this would seem a simpler task as Git has powerful support for merging...
Let's take the opposite example to my splitting apart article - say you have a main Git repo (ProjA) and a second repo (ProjB) in a subdirectory, i.e. ProjA/ProjB. You want to merge ProjB into ProjA and end up with a single master repo, ProjA, which retains all the history of both projects. The ProjB files will stay in ProjA/ProjB.
Step 1: Temporarily Move ProjB
First of all we need to move the ProjB repo out of the ProjA tree, so that Git will be able to overwrite ProjB when we merge the repos:
$ cd ProjA
$ mv ProjB <new location>
Step 2: Remove ProjB from .gitignore
You probably have ProjB in the .gitignore file for ProjA. That needs to be removed so you can work on ProjB after the merge.
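If you are not sure whether ProjB is still being ignored, git check-ignore can tell you. Here is a minimal sketch in a throwaway repo; the temporary directory and the single-line .gitignore are assumptions for illustration, and in a real project you would just edit .gitignore by hand.

```shell
set -e
demo=$(mktemp -d)            # throwaway location, an assumption for this demo
cd "$demo"
git init -q ProjA
cd ProjA
mkdir ProjB
echo 'ProjB' > .gitignore    # stand-in for the existing ignore entry

# While the entry exists, -v prints the .gitignore rule that matches.
git check-ignore -v ProjB

# Drop the entry, then ask again; check-ignore exits non-zero when nothing matches.
grep -v -x 'ProjB' .gitignore > .gitignore.new || true
mv .gitignore.new .gitignore
if git check-ignore -q ProjB; then
    ignored=yes
else
    ignored=no
fi
echo "ProjB ignored: $ignored"
```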
Step 3: Move ProjB files to ProjB/ProjB
If we just merge ProjB into ProjA, as in Step 4 below, all the ProjB files will end up in the root of ProjA. That's not what we want - we want them to go into the ProjB subdirectory after the merge. You would also likely have merge conflicts with common files like .gitignore.
Unfortunately this is the step where we see all the unpleasantness of Git 💩 - if we just make a ProjB subdirectory and git mv all the files to it (as described in my earlier post on Git renaming), history is only partially retained. git log --follow allows you to see the history of the moved files, but git diff, bisect, etc. can't find the revisions. You can still diff ProjB commits from the ProjB log, just not for an individual ProjB file. Future Git versions may fix this. If you are not bothered by these issues, proceed to Step 4.
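To see the partial history for yourself, here is a small sketch of the plain git mv approach in a throwaway repo. The file name notes.txt, the commit messages and the demo identity are made up for illustration.

```shell
set -e
# Demo identity so commits work on a machine with no Git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

demo=$(mktemp -d)
cd "$demo"
git init -q ProjB
cd ProjB

echo one > notes.txt
git add notes.txt
git commit -q -m "Add notes"
echo two >> notes.txt
git commit -q -am "Extend notes"

# The simple approach: move the file into a ProjB subdirectory in one new commit.
mkdir ProjB
git mv notes.txt ProjB/notes.txt
git commit -q -m "Move into ProjB/"

# A plain path-limited log only knows about the new path...
plain=$(git log --oneline -- ProjB/notes.txt | wc -l | tr -d ' ')
# ...while --follow traces the file back across the rename.
followed=$(git log --oneline --follow -- ProjB/notes.txt | wc -l | tr -d ' ')
echo "plain log: $plain commit(s), with --follow: $followed commit(s)"
```

The path-limited log sees only the move commit, while --follow also reaches the two earlier commits made under the old path.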
However, to fix it properly we need to edit the commit history of ProjB to make it appear that the files have always been in the ProjB subdirectory. Be warned that this is a destructive operation, so make sure you have a backup! There are also many ways to do this in Git, and I recommend avoiding methods that involve using sed on the names of files - it's really, really easy to get that wrong. I prefer a more obviously correct method like this:
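$ cd ProjB
$ git filter-branch --prune-empty --tree-filter '
    # Runs once per commit, inside a checkout of that commit
    if [ ! -e ProjB ]; then
        mkdir -p ProjB
        # Move every top-level entry of this commit into ProjB/
        git ls-tree --name-only "$GIT_COMMIT" | xargs -I {} mv {} ProjB
    fi'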
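Before merging, it is worth confirming that the rewrite did what you expect. The sketch below runs the same kind of filter on a throwaway two-commit repo and then inspects the result; the file names, the demo identity and the temporary directory are assumptions, and FILTER_BRANCH_SQUELCH_WARNING is only set to skip the advisory pause that newer Git versions print.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1

demo=$(mktemp -d)
cd "$demo"
git init -q ProjB
cd ProjB
echo a > a.txt
git add a.txt
git commit -q -m "Add a"
echo b > b.txt
git add b.txt
git commit -q -m "Add b"

git filter-branch --prune-empty --tree-filter '
    if [ ! -e ProjB ]; then
        mkdir -p ProjB
        git ls-tree --name-only "$GIT_COMMIT" | xargs -I {} mv {} ProjB
    fi' > /dev/null

# The tip should now contain a single top-level entry, ProjB.
top_level=$(git ls-tree --name-only HEAD)
# The very first commit should already show its files under ProjB/.
root=$(git rev-list --max-parents=0 HEAD)
root_files=$(git ls-tree -r --name-only "$root")
echo "top level: $top_level"
echo "root commit files: $root_files"

# filter-branch keeps the pre-rewrite refs under refs/original/ as a safety net.
git for-each-ref refs/original/
```

The refs/original/ copy is handy for a quick comparison, but it lives in the same repo, so it is no substitute for a real backup.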
Step 4: Merge ProjB into ProjA
From here it's pretty straightforward: we just merge ProjB into ProjA.
Note that --allow-unrelated-histories is required (since Git 2.9) so that Git will merge histories that don't share a common ancestor.
$ cd ProjA
$ git remote add ProjB <ProjB location>
$ git fetch ProjB
$ git merge --allow-unrelated-histories ProjB/master
$ git remote remove ProjB
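or just the following (on recent Git versions you may also need to add --no-rebase, as pull can refuse to guess how to reconcile diverged branches):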
$ git pull --allow-unrelated-histories <ProjB location> master
Note that I'm only illustrating merging the master branch - if you have other branches these will have to be merged separately.
Step 5: (Optional) Remove ProjB Repo
After checking that the ProjA file structure and history are all good, you can remove the old ProjB repo.
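What "all good" looks like will depend on your project, but here is a rough sketch of the kind of checks I mean, run against a throwaway pair of repos. The ProjB-moved directory stands in for ProjB after Step 3; all file names, commit messages and the demo identity are assumptions.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

demo=$(mktemp -d)
cd "$demo"

# Stand-in for ProjB once its files live under ProjB/.
git init -q ProjB-moved
cd ProjB-moved
git symbolic-ref HEAD refs/heads/master   # make sure the branch is called master
mkdir ProjB
echo lib > ProjB/lib.txt
git add ProjB
git commit -q -m "ProjB: add lib"
cd ..

git init -q ProjA
cd ProjA
git symbolic-ref HEAD refs/heads/master
echo app > app.txt
git add app.txt
git commit -q -m "ProjA: add app"

# Step 4, with --no-edit so the demo never opens an editor.
git remote add ProjB ../ProjB-moved
git fetch -q ProjB
git merge -q --no-edit --allow-unrelated-histories ProjB/master
git remote remove ProjB

# Check the file layout and that both histories are reachable.
git ls-files
git log --oneline --graph
total=$(git rev-list --count HEAD)             # both roots plus the merge
projb_only=$(git rev-list --count HEAD -- ProjB)
echo "total commits: $total, commits touching ProjB/: $projb_only"
```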
If things are not good, which I must admit they were not the first time I did this, you can reset to just before the merge and try again.
One of the things I do like about Git 😀 is the ease of undoing (and redoing) changes. To undo, find the commit hash just before the merge with:
$ git reflog | head
Then rewind to the good point:
$ git reset --hard <commit hash>
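If the merge was the last thing you did, there is a shortcut: git merge records the previous tip in ORIG_HEAD, so you can reset to that without looking up a hash. A minimal sketch on throwaway repos follows (names, files and identity are assumptions). Bear in mind that other commands, including reset itself, also overwrite ORIG_HEAD, so this only works straight after the merge.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

demo=$(mktemp -d)
cd "$demo"

# Stand-in for the rewritten ProjB.
git init -q ProjB-moved
(
    cd ProjB-moved
    mkdir ProjB
    echo lib > ProjB/lib.txt
    git add ProjB
    git commit -q -m "ProjB: add lib"
)

git init -q ProjA
cd ProjA
echo app > app.txt
git add app.txt
git commit -q -m "ProjA: add app"
before=$(git rev-parse HEAD)

# Merge the other history, then decide we do not like the result.
git fetch -q ../ProjB-moved HEAD
git merge -q --no-edit --allow-unrelated-histories FETCH_HEAD

# ORIG_HEAD still points at the pre-merge tip, so this undoes the merge.
git reset -q --hard ORIG_HEAD
after=$(git rev-parse HEAD)
echo "before: $before"
echo "after:  $after"
```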