Friday, January 28, 2011

How to copy a Confluence space to another instance of Confluence

Confluence is a professional enterprise wiki, and I was recently tasked with transferring content from one Confluence instance to another.
The problem: I didn't have site admin rights on either instance, so the natural XML export/import path was closed and I had to find another solution.

Content in Confluence is organized in so-called spaces. Think of them as topics maintained by lists of users with varying degrees of permissions (admin, read, write, export and so on), each space consisting of a set of pages. The task was to transfer a number of spaces from one instance to the other. At first glance this is not difficult (just transfer the raw wiki markup text); at second glance it gets challenging once you think about attachments, comments etc. The biggest caveat, though, was that the two instances of Confluence used different access mechanisms (one used the corporate LDAP, the other an access list of its own), i.e. usernames and passwords were different.
Why is that a caveat?
Because there is a SOAP-based command line interface (CSOAP) built on Confluence's remote API which provides a copySpace action for copying a space within the same instance or to a different one, but in its current revision it requires the same username and password on both the source and the target server.

The solution: I enhanced the CSOAP source code to get the required functionality, namely a different username and password on the target server. For this I introduced two new arguments, targetUser and targetPassword. Here's what I did:
  • Downloaded and unpacked the CSOAP package (version 1.5) (scroll to where it says Download JAR)
  • Downloaded and unpacked the source code (it's in Java) using the distribution->confluence-cli-1.5.0.source.zip file.
  • Modified the ConfluenceClient.java file and compiled a new confluence-cli-1.5.jar file (compiling and creating the new jar file was a little trickier than it sounds)
  • Replaced the existing confluence-cli-1.5.jar in the downloaded package (in directory release)
  • Ran the following command to copy a space:
    java -jar release/confluence-cli-1.5.0.jar --verbose --server https://confluence1.foo.com --user 1234 --password XXXXXX --action copySpace --targetServer https://confluence2.foo.com --targetUser bar@foo.com --targetPassword XXXXXX --space SPACE --newSpace SPACEnew --copyAttachments --copyComments --copyLabels
    (assume that 1234 is my account on the first and bar@foo.com my account on the second Confluence instance)
I ran this command once for each space that had to be transferred.
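With more than a handful of spaces, typing the command by hand gets tedious. The following is a minimal sketch of how the argument lists could be generated, one per space. It only uses the options shown in the command above; the helper names build_copy_space_command and build_all are made up for this illustration, and --targetUser/--targetPassword are understood only by the patched jar, not by the stock CLI.

```python
from typing import Dict, List, Optional


def build_copy_space_command(
    jar: str,
    server: str,
    user: str,
    password: str,
    target_server: str,
    space: str,
    new_space: str,
    target_user: Optional[str] = None,
    target_password: Optional[str] = None,
) -> List[str]:
    """Return the argument list for one copySpace run (nothing is executed here)."""
    cmd = [
        "java", "-jar", jar, "--verbose",
        "--server", server, "--user", user, "--password", password,
        "--action", "copySpace",
        "--targetServer", target_server,
    ]
    # Assumption: these two options exist only in the patched jar described above.
    # If no target user is given, they are left out, as with the unmodified CLI.
    if target_user is not None:
        cmd += ["--targetUser", target_user,
                "--targetPassword", target_password or ""]
    cmd += [
        "--space", space, "--newSpace", new_space,
        "--copyAttachments", "--copyComments", "--copyLabels",
    ]
    return cmd


def build_all(spaces: Dict[str, str], **common: str) -> List[List[str]]:
    """Build one command per (source space key -> new space key) pair."""
    return [
        build_copy_space_command(space=src, new_space=dst, **common)
        for src, dst in spaces.items()
    ]
```

Each resulting list can be handed to subprocess.run(cmd, check=True), so that a failing space stops the batch instead of being silently skipped.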
As one can see, comments, attachments and labels are copied over as well.
The code copies a space page by page. The new pages are all created by the same user, and all comments likewise appear to have been written by that user. To keep a little history in the new instance, I added a short note at the top of each page saying which user originally created it in the first Confluence instance.
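To make the effect on authorship concrete, here is a small in-memory sketch of such a page-by-page copy. It is not the CSOAP code: the Page and Comment classes, the function names and the exact wording of the note are assumptions made purely for illustration, and no Confluence API is called.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Comment:
    author: str
    text: str


@dataclass
class Page:
    title: str
    content: str
    creator: str
    comments: List[Comment] = field(default_factory=list)


def copy_page(page: Page, target_user: str) -> Page:
    """Copy one page as target_user and record the original creator in a note."""
    # Hypothetical note format; the real wording is up to whoever does the copy.
    note = f"Originally created by {page.creator} on the source instance.\n\n"
    return Page(
        title=page.title,
        content=note + page.content,
        creator=target_user,  # the copying account becomes the page creator
        comments=[
            # Comment text survives, but authorship collapses to the copying account.
            Comment(author=target_user, text=c.text)
            for c in page.comments
        ],
    )


def copy_space(pages: List[Page], target_user: str) -> List[Page]:
    """Copy every page of a space, one page at a time."""
    return [copy_page(p, target_user) for p in pages]
```

The note at the top is the only place where the original creator survives, which is why it is worth adding during the copy rather than afterwards.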

Prerequisites for this whole effort:
sufficient access rights on both Confluence instances to extract pages and to create spaces, and a working, reasonably recent JDK. I did my development work on Ubuntu with JDK 1.6 and used the resulting jar file on another of our internal servers (on Solaris) sitting closer to both wiki sites in order to speed up the transfer.

As a good web citizen I created an issue and provided my changed code to the author. (I made a couple of further enhancements, but they weren't production-worthy yet, so they are not in the code, and I was dragged into other things later on.)

Note: another caveat was that the two Confluence instances had different sets of plugins installed (plugins are additional functions which can improve the usability of a wiki big time), i.e. if a page author relied heavily on a particular plugin in instance 1, those pages will be partially broken in the new instance. That was beyond my task and area of influence, though (and it turned out to be no issue for the spaces in question).
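I did not need it for my spaces, but a rough check is possible before copying. Plugins typically show up in page content as macros, written in wiki markup as {name} or {name:parameters}. The sketch below lists the macro names a page uses that are not in a list of names known to work on the target. The function names and the macro "fancymacro" are invented for the example, the list of available macros has to be put together by hand, and the pattern is a heuristic that will also match escaped braces.

```python
import re
from typing import Iterable, Set

# Heuristic: {name} or {name:param=value}; a closing tag just repeats {name}.
MACRO_PATTERN = re.compile(r"\{([a-zA-Z][\w-]*)(?::[^}]*)?\}")


def macros_used(markup: str) -> Set[str]:
    """Return the lower-cased names of all macros found in the wiki markup."""
    return {m.group(1).lower() for m in MACRO_PATTERN.finditer(markup)}


def missing_macros(markup: str, available: Iterable[str]) -> Set[str]:
    """Return macros used in the markup that are not in the available list."""
    return macros_used(markup) - {name.lower() for name in available}


if __name__ == "__main__":
    page = "{toc}\n{fancymacro:title=Demo}\n{code}print(1){code}"
    # "fancymacro" is a made-up name standing in for a plugin missing on the target.
    print(sorted(missing_macros(page, ["toc", "code"])))
```

Pages for which this reports anything are the ones worth opening by hand on the new instance after the copy.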
