Migrating part of repository in SVN

Phew, I finally did it. Migrating part of an svn-repository to a new repository while changing the destination paths isn’t as trivial as the svn manual implies. The workflow according to the book should look like this:

  1. Dump the repository while filtering with svndumpfilter
  2. Edit the dumpfile to change paths
  3. Load the dumpfile into the new repository

While this is basically what I did, it wasn’t as straightforward as it sounds…

Dumping and filtering the old repository isn’t that hard. The fact that you have to dump the whole repository and just snatch the relevant parts makes it annoying and slow, but it’s still doable:

svnadmin dump /group/home/svnrepo/ | svndumpfilter include --drop-empty-revs --renumber-revs /projects/family/clockwork > clockwork.svndump

So, now we have the relevant portions of the old repository neatly in a dumpfile, let’s start editing. I wanted to move the project to a new repository in /group/work/svnrepo/ into a parent directory projects/department/clockwork/client. Ok, that’s easy. Just open the dumpfile in an editor and search-replace family with department and clockwork with clockwork/client. So far so good.

After checking and double checking that all the paths were right, it was now time for the third part: importing (this is going really smoothly, right?).
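One way to do that checking without scrolling through the whole file is to list only the path headers of the dump. This is a minimal sketch: `list_dump_paths` and `sample_dump` are hypothetical names, and the sample text is a made-up stand-in for a real dumpfile.

```shell
# Hypothetical helper: print the distinct path headers found in a dump stream.
# grep -a makes grep treat the (partly binary) dump as text.
list_dump_paths() {
    LC_ALL=C grep -a -E '^Node-(copyfrom-)?path: ' | LC_ALL=C sort -u
}

# Made-up stand-in for a real dumpfile, just to show the output shape.
# The copyfrom line still points at the old "family" path on purpose.
sample_dump() {
    printf '%s\n' \
        'Node-path: projects/department/clockwork/client' \
        'Node-kind: dir' \
        '' \
        'Node-path: projects/department/clockwork/client/main.c' \
        'Node-kind: file' \
        'Node-copyfrom-path: projects/family/clockwork/main.c'
}

sample_dump | list_dump_paths
```

On a real file this would be `list_dump_paths < clockwork.svndump`. Any leftover old path, like the copy source in the sample, shows up in the list right away.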

Here is where the troubles started piling up. If the repository contains binary files, svn calculates an md5 checksum of each one and adds it to the dump. Apparently, when editing the text part of the dumpfile, something in the binary data gets screwed up as well, changing the md5 sum of the binary file. This results in svn throwing the following error at you:

svnadmin load /group/work/svnrepo/ < clockwork.svndump

svnadmin: Checksum mismatch, file '/projects/family/clockwork/clockwork.jpg':
expected: 7d72863dab994058bbca4622d54fe21d
actual: 0bf7ea2bc06df62039375bb2cb3ffbd2

Crap. So, what to do? Of course I tried dumping the repo again, using another editor to edit the file, filtering the dumpfile through sed, checking the validity of the repository, etc etc. Nothing worked. Even Google didn’t want to give me any answers, so I was getting quite desperate.
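One plausible explanation (an assumption, the cause was never confirmed here) is that a global search-replace, or an editor that re-encodes the file, also touches bytes inside file content, and any changed byte changes the md5 sum. The sketch below shows the effect on a made-up piece of content; `md5_of` is a hypothetical helper.

```shell
# Hypothetical helper: md5 of stdin, using whichever tool is installed
# (md5sum from GNU coreutils, or md5 -q on BSD/macOS).
md5_of() {
    if command -v md5sum >/dev/null 2>&1; then
        md5sum | cut -d' ' -f1
    else
        md5 -q
    fi
}

# Made-up stand-in for file content stored inside the dump.
original='photo of the /family/ clockwork'

# The same global substitution used for the paths also rewrites this content.
edited=$(printf '%s' "$original" | sed 's/\/family/\/department/g')

sum_original=$(printf '%s' "$original" | md5_of)
sum_edited=$(printf '%s' "$edited" | md5_of)

echo "before: $sum_original"
echo "after:  $sum_edited"
```

The two sums differ, which is the same kind of mismatch svnadmin load reports when the stored checksum no longer matches the content.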

As a final trick, I tried to filter the stream coming from svnadmin dump directly through sed before redirecting the output to a file. There was quite a lot of sed’ing so the command grew into a monster:

svnadmin dump /group/home/svnrepo/ | svndumpfilter include --drop-empty-revs --renumber-revs /projects/family/clockwork | sed 's/\/family/\/department/g' | sed 's/\/clockwork\//\/clockwork\/client\//g' > clockwork.svndump

Now, after this seemingly equivalent operation to that of streaming the dumpfile saved on disk through sed, the svnadmin load suddenly worked!

svnadmin load /group/work/svnrepo/ < clockwork.svndump

Amazing… I don’t know why this works like this, but I’m happy that it does. Actually I would be even more happy if it had worked the way we all think it would…
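A more defensive variant, in the spirit of what several commenters below suggest, is to rewrite only the path headers and leave everything else alone. This is a minimal sketch: `rewrite_node_paths` is a hypothetical name, the sample input is made up, and a content line that happens to start with `Node-path: ` would still be rewritten.

```shell
# Hypothetical helper: rewrite the old project path only on Node-path and
# Node-copyfrom-path header lines. LC_ALL=C makes sed work on raw bytes, so
# binary content does not trip over the locale's character encoding.
rewrite_node_paths() {
    LC_ALL=C sed -E 's#^(Node-(copyfrom-)?path: )projects/family/clockwork(/|$)#\1projects/department/clockwork/client\3#'
}

# Made-up dump fragment: one header line and one line of file content
# that mentions the old path.
printf '%s\n' \
    'Node-path: projects/family/clockwork/readme.txt' \
    'Node-kind: file' \
    '' \
    'See projects/family/clockwork for the family photos.' \
    | rewrite_node_paths
```

Only the header line changes. The content line keeps its old path, so its length and checksum stay valid.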

But it didn’t stop here. Nooo. While I was comfortably assured that I had solved the deepest mysteries of svn, I was actually in line for a new ride on the horror train.

The next project I was about to migrate was actually an immigrant in a sense. It had started off in /projects/rolex, but after an svn copy operation it had moved to /projects/other/rolex. Now this was a true problem for svndumpfilter while filtering the dumpfile.

Naturally, since the project at the time of the dump resides in /projects/other/rolex, this is what we want to dump. Like this:

svnadmin dump /group/home/svnrepo/ | svndumpfilter include --drop-empty-revs --renumber-revs /projects/other/rolex > rolex.svndump

Svn wasn’t at all happy with this. For me rolex’ past was long forgotten, svn on the other hand had an elephant’s memory:

svndumpfilter: Invalid copy source path '/projects/rolex'

Now what’s this all about? Some googling again didn’t reveal much (other than some patch proposals for svndumpfilter). Hmm, apparently svn doesn’t like to dump a directory that has moved in from somewhere else without also dumping the originating folder. So I tried to include the source folder as well:

svnadmin dump /group/home/svnrepo/ | svndumpfilter include --drop-empty-revs --renumber-revs /projects/other/rolex /projects/rolex > rolex.svndump

This actually worked! Now, remembering the sed-pipe-disaster from the last project, the final command looked like this:

svnadmin dump /group/home/svnrepo/ | svndumpfilter include --drop-empty-revs --renumber-revs /projects/other/rolex /projects/rolex | sed 's/\/rolex/\/server/g' | sed 's/\/other\//\/clockwork\//g' | sed 's/projects\///g' > rolex.svndump

Now loading the dumpfile into the new repository worked like a charm:

svnadmin load /group/work/svnrepo/ --parent-dir projects < rolex.svndump
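The "Invalid copy source path" error from earlier can be anticipated by scanning the unfiltered dump for copies that land inside the wanted directory but come from outside it. This is a minimal sketch: `copy_sources_outside` is a hypothetical helper, the sample records are made up, and some awk implementations may not cope well with the binary parts of a real dump.

```shell
# Hypothetical helper: read a dump stream on stdin and print every copy source
# that lies outside PREFIX while its destination node lies inside PREFIX.
copy_sources_outside() {
    prefix=$1
    LC_ALL=C awk -v prefix="$prefix" '
        # True if path p is PREFIX itself or something below it.
        function under(p) { return p == prefix || index(p, prefix "/") == 1 }
        /^Node-path: / {
            inside = under(substr($0, length("Node-path: ") + 1))
        }
        /^Node-copyfrom-path: / {
            src = substr($0, length("Node-copyfrom-path: ") + 1)
            if (inside && !under(src)) print src
        }
    ' | LC_ALL=C sort -u
}

# Made-up node records: one copy into the project, one unrelated copy.
printf '%s\n' \
    'Node-path: projects/other/rolex' \
    'Node-copyfrom-path: projects/rolex' \
    '' \
    'Node-path: projects/unrelated/copy' \
    'Node-copyfrom-path: projects/unrelated/orig' \
    | copy_sources_outside projects/other/rolex
```

For the sample this prints `projects/rolex`, the extra directory that had to be included. On the real repository the input would come from `svnadmin dump /group/home/svnrepo/`.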

And now, I’m going to have a well deserved cup of coffee!




22 responses to “Migrating part of repository in SVN”

  1. svnadmin load has a --parent-dir argument you can use to put the loaded paths under a specific directory in the new repo. I find it easiest to use that and then, if necessary, do an svn move to get things into the right place in the new repo.

    But you just lucked out on this one. I’m trying to deal with a much more complex situation: a repository that has had a lot of copies done all over it, and has IP that I need to get rid of before I can import what code is mine into my repo.

    It’s pretty inconvenient that svndumpfilter can do only includes or only excludes in a pass, but not both. It’s even more inconvenient that if you do multiple passes where svndumpfilter doesn’t fail, it can still produce a dump that can’t be loaded. I may just have to give up most of my change history on this one, which is fairly frustrating.

  2. Darshak

    Thanks a ton for posting this. This was exactly what i was looking for and you really really saved my day.

  3. Bjornar

    I also tried using sed to change file paths in an SVN dump, and ran into the same import error. I initially tried your suggestion to pipe through sed during the original dump command rather than on the saved file, and it didn’t work for me.

    After a little further investigation I realized there is a ‘-b’ flag for sed to treat the input stream as binary. When I added this flag svnadmin load worked perfectly, even after running sed on the saved dump file!

  4. Maki

    Don’t do string substitutions with “sed 's/\/family/\/department/g'”!
    Because you are also modifying actual content: binary data, as you discovered, but also text, for example if someone writes an essay about families.

    Just search for “Node-path: family” or “Copyfrom-path: family” and replace this. See SVN book chapter 5 “repository administration” for more details.

  5. I have been visiting this site a lot lately, so i thought it is a good idea to show my appreciation with a comment.

    Jim Mirkalami

  6. Thank you! I’m happy that my random thoughts are of use to someone.

  7. Tim

    HA. you’re hilarious. “svn had an elephants memory.” great hacking, and thanks for posting bud.

  8. I’ve used this tip twice now, much thanks!

    Maki brings up a good point with specific searching for svn node paths. I was bitten by this for about half an hour, each time getting svnadmin: Dumpstream data appears to be malformed. Turns out my sed query found other unrelated items and replaced them as well.

    When I switched to searching for Node-path:, Copyfrom-path: and Node-copyfrom-path:, all was well.

  9. Pingback: Never Tell… Richard Eibrand » Blog Archive » svn - moving part of repository to another repository

  10. Kranthi Polsani

    Hi Guys,
    Try imagining the repository structure by the following
    While getting the dump of the repo I want the dump to be such that, if it is loaded into a new repository, it has the directories “trunk, tags, and branches” at the root.
    And I don’t want the directory “Circuits” in the new repository.
    I modified the dump file, changing the “Node-path” and “Node-copyfrom-path”, but when I tried to load the dump file it said “Checksum error”. It was expecting some checksum value for the file but the actual checksum was different.
    Then I thought of changing the content of the dump file before it was written.
    Below is the script I wrote:
    $ svnadmin dump C:/svn_repository/Hardware | svndumpfilter include /Circuits/trunk | sed -e "s/Node-path: Circuits\/trunk/Node-path: trunk/g" | sed -e "s/Node-copyfrom-path: Circuits\/trunk/Node-copyfrom-path: trunk/g" > hardwarecircui
    Even then i was not able to solve the checksum error issue.
    I would appreciate any help. Hope to hear from you soon.

  11. Benjamin Ortuzar

    This works pretty nicely for me:

    cat /backup/svn/svn.dump | svndumpfilter --drop-empty-revs --renumber-revs include frank | sed 's/Node-path: FOLDERNAME\//Node-path: /g' | sed 's/Node-copyfrom-path: FOLDERNAME\//Node-path: /g' | gzip -9 > FOLDERNAME.dump.gz

  12. Benjamin Ortuzar

    There was an error on my last post, this is correct:

    cat /backup/svn/svn.dump | svndumpfilter --drop-empty-revs --renumber-revs include FOLDERNAME | sed 's/Node-path: FOLDERNAME\//Node-path: /g' | sed 's/Node-copyfrom-path: FOLDERNAME\//Node-path: /g' | gzip -9 > FOLDERNAME.dump.gz

  13. One last typo in there, Ben. The last sed should be 's/Node-copyfrom-path: FOLDERNAME\//Node-copyfrom-path: /g' which makes it:
    cat /backup/svn/svn.dump | svndumpfilter --drop-empty-revs --renumber-revs include FOLDERNAME | sed 's/Node-path: FOLDERNAME\//Node-path: /g' | sed 's/Node-copyfrom-path: FOLDERNAME\//Node-copyfrom-path: /g' | gzip -9 > FOLDERNAME.dump.gz

  14. This is probably the most helpful post about svndumpfilter I have come across on the entire internet. Now I got my repository properly dumped and loaded one directory UP.
    Thanks !

  15. Pingback: Subversion repository part migration | Andrejs Cainikovs blog

  16. Igg

    Thanks, you make my daynight )))

  17. Ketan

    Thanks a ton to all, you saved my day/night too!! Best post all around the world to hack repository migration. Thanks once again

  18. sree


    I need to migrate part of an svn directory to a new one. Can you please give the command?
    I get the following exception
    D:test>svnadmin dump D:Repositorytestdocuments | svndumpfilter include “Deployment Area” > testnew.dmp
    * Dumped revision 6287.
    Revision 6287 committed as 6287.
    svnadmin: Malformed representation header
    svndumpfilter: Premature end of content data in dumpstream

    and also second method

    D:skyassit>svnadmin dump D:Repositorytestdocuments | svndumpfilter include “/test/trunk/test/Deployment Area” > newtest.dmp
    * Dumped revision 3037.
    Revision 3037 committed as 3037.
    * Dumped revision 3038.
    svndumpfilter: Invalid copy source path ‘/test/trunk/BackUp/DeploymentArea/

    svnadmin: Can’t write to stream: The pipe is being closed.

  20. Pingback: svnでリポジトリを他のリポジトリの配下に入れたくなった (I wanted to put a repository under another repository with svn) | L2TP
