15 October 2017
This post serves as documentation of the steps to migrate the remaining CVS repositories of the Firebird project from SourceForge to GitHub.
Some steps for the migration are inspired by (or plainly copied from) https://sourceforge.net/p/forge/documentation/CVS/
The most important step, the migration itself, is explicitly not taken from the SourceForge documentation, as this turned out to be lossy (several branches were not included in the migration for unclear reasons).
The migration will be done on Windows 10 using Windows Subsystem for Linux with Ubuntu, but these instructions should also work on a 'real' Linux install.
Install the following packages using sudo apt-get install:
cvs (to be able to process the CVS repository)
rcs (for parsing the log files using rlog)
git (obviously)
make (to 'install' cvs2svn)
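Before continuing, it can help to confirm that the commands from these packages are actually on the PATH. The following is a minimal sketch, not part of the original procedure; the function name missing_tools is made up for illustration.

```shell
# Print the name of every given command that cannot be found on PATH.
# (missing_tools is a hypothetical helper name, not an existing tool.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# The commands this post relies on; rlog is provided by the rcs package.
missing_tools cvs rlog git make
```

If this prints nothing, all four commands are available; any name it prints still needs to be installed.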
In a suitable working directory, install the latest development version of cvs2svn (the last released version might run into problems with multi-line commit messages).
# in ~/repomigration
git clone https://github.com/mhagger/cvs2svn.git
cd cvs2svn
sudo make install
Contrary to its name, cvs2svn also provides a conversion tool called cvs2git.
Initial retrieval of CVS repository:
# in ~/repomigration
mkdir cvsrepo
rsync -av firebird.cvs.sourceforge.net::cvsroot/firebird/* ~/repomigration/cvsrepo
Subsequent updates can be retrieved using just:
# in ~/repomigration/cvsrepo
rsync -av firebird.cvs.sourceforge.net::cvsroot/firebird/* ~/repomigration/cvsrepo
Git identifies committers by name and email address, while CVS uses just a username. We will first need to obtain all usernames from the repository, and then associate email addresses (and, if possible, real names) with them.
For this migration we will associate the original SourceForge usernames with their username@users.sourceforge.net email address. If users want to associate these commits with their GitHub user account, they will need to add this email address to their GitHub account (as a secondary address).
Get usernames from CVS logs:
# let find pass the ,v files to rlog directly, so paths with spaces are handled
find ~/repomigration/cvsrepo -name '*,v' -exec rlog {} + |
  sed -nr 's/^date:.* author: ([^;]+).*/\1/p' |
  sort -u >~/repomigration/cvs-author-names
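To see what the sed expression does, here is a small sketch that runs it on a made-up fragment of rlog output. The usernames and revisions are invented for illustration, and -E is used instead of -r (the two are equivalent in GNU sed).

```shell
# Extract the author field from rlog revision lines read on stdin,
# and reduce the result to a sorted list of unique usernames.
extract_authors() {
    sed -nE 's/^date:.* author: ([^;]+).*/\1/p' | sort -u
}

# Hypothetical sample of rlog output; names and dates are invented.
sample_rlog='revision 1.3
date: 2004/03/12 11:00:00;  author: alice;  state: Exp;  lines: +1 -1
Another fix
----------------------------
revision 1.2
date: 2004/03/11 10:22:31;  author: bob;  state: Exp;  lines: +2 -1
Fix typo
----------------------------
revision 1.1
date: 2004/03/10 09:00:00;  author: alice;  state: Exp;
Initial revision'

printf '%s\n' "$sample_rlog" | extract_authors
```

Only the lines starting with date: are matched, and the text between author: and the next semicolon is kept, so each username ends up in the list once.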
Remove the user root if present, and then transform the usernames to email addresses and add name information from the SourceForge profile, in a format that can be applied in the cvs2git options file:
for uname in `cat ~/repomigration/cvs-author-names`; do
json=`curl https://sourceforge.net/rest/u/$uname/profile`
fname=`echo "$json" | sed -nr 's/\{"username": "[^"]+", "name": "([^"]+)".*/\1/p'`
echo " '$uname' : ('$fname', '$uname@users.sourceforge.net'),"
done >~/repomigration/authors.txt
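The two steps inside this loop can be tried without network access. The sketch below applies the same sed expression to a made-up profile document and formats the resulting entry. The JSON sample only mimics the shape the sed expression expects, the helper names are hypothetical, and falling back to the username when no name is found is an assumption of this sketch rather than something the loop above does.

```shell
# Pull the "name" field out of a profile JSON document read on stdin.
profile_name() {
    sed -nE 's/\{"username": "[^"]+", "name": "([^"]+)".*/\1/p'
}

# Format one author_transform entry.
# Assumption: when no name was found, the username is used as the name.
author_entry() {
    uname=$1
    fname=${2:-$1}
    printf " '%s' : ('%s', '%s@users.sourceforge.net'),\n" "$uname" "$fname" "$uname"
}

# Hypothetical profile document, shaped after the sed expression above.
sample_json='{"username": "alice", "name": "Alice Example", "status": "active"}'

author_entry alice "$(printf '%s\n' "$sample_json" | profile_name)"
author_entry bob ""
```

Note that a name containing a single quote would produce an invalid Python string in the options file, so such entries need to be escaped by hand.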
Review authors.txt and make changes where necessary (eg some users may have indicated they want their commits associated with another email address, real names may be missing from the SourceForge profile, etc).
Before conversion, make sure the local copy of the repository is up to date (using rsync). In the description below, I assume we are migrating the OdbcJdbc module to the git project firebird-odbc-driver.
The cvs2git tool can only do per module conversions. To make migration easier, it is advisable to use an options file, as documented on http://cvs2svn.tigris.org/cvs2svn.html and http://cvs2svn.tigris.org/cvs2git.html.
For our purpose we took a copy of cvs2git-example.options from the cvs2svn folder created when installing the tools, and made the following modifications. Most of these changes can be used for the conversion of all modules, but some settings are per module (or may need some tuning per module).
1. (optional) Set ctx.tmpdir to a name specific to the module being converted (eg r'/home/mark/repomigration/cvs2git-OdbcJdbc')
2. Copy the contents of authors.txt to the author_transform list.
3. (optional) Change the entry 'cvs2git' : 'cvs2git <admin@example.com>' to the domain of your project (in our case I changed the email address to firebird@firebirdsql.org)
4. In run_options.set_project replace r'test-data/main-cvsrepos' with the path to the module in the repository copy (eg r'/home/mark/repomigration/cvsrepo/OdbcJdbc')
5. In ctx.cvs_log_decoder uncomment 'latin1' (and maybe 'utf-8') and fallback_encoding='ascii' (especially if you receive warnings about log parsing)
6. (optional) Change ctx.symbol_info_filename from None to (for example) 'symbol-info.txt'; this may help in analyzing and fixing problems with name conflicts between tags and branches
7. (optional) Enable changeset_database.use_mmap_for_cvs_item_to_changeset_table (but read the warning in the options file!)
8. (optional) If you are missing branches, comment out ExcludeTrivialImportBranchRule(). As an example, the OdbcJdbc module had a branch that was equal to the original initial commit of the CVS repository, and was therefore excluded by this heuristic
9. Download http://www.apache.org/dev/svn-eol-style.txt and http://svn.apache.org/repos/asf/httpd/httpd/trunk/docs/conf/mime.types and make changes if necessary. Then uncomment the AutoPropsPropertySetter (and related lines) and point it to svn-eol-style.txt, uncomment the MimeMapper and point it to mime.types, and uncomment the EOLStyleFromMimeTypeSetter
10. Add from cvs2svn_lib.svn_run_options import SVNEOLFixPropertySetter to the import list at the start, and at the end of ctx.file_property_setters.extend add SVNEOLFixPropertySetter(), to normalize line endings (test carefully if you really want to do this)
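Since the project path is the one setting that always differs per module, that single edit can be scripted when converting several modules. The sketch below only covers the set_project path; all other changes still have to be made by hand. The helper name make_module_options and the stand-in options file are hypothetical, and the module path must not contain the characters | or &, which have a special meaning in the sed replacement.

```shell
# Write a per-module options file by replacing the example project path.
# Arguments: example options file, path to the module, output file.
# (make_module_options is a hypothetical helper name.)
make_module_options() {
    sed "s|test-data/main-cvsrepos|$2|" "$1" > "$3"
}

# Demonstration on a stand-in file that mimics only the relevant line
# of the example options file.
demo_dir=$(mktemp -d)
printf "run_options.set_project(\n    r'test-data/main-cvsrepos',\n    )\n" \
    > "$demo_dir/example.options"

make_module_options "$demo_dir/example.options" \
    /home/mark/repomigration/cvsrepo/OdbcJdbc \
    "$demo_dir/firebird-odbc-driver.options"

cat "$demo_dir/firebird-odbc-driver.options"
```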
Be sure to read through the options file documentation, there are some settings you might want to tune further (eg the settings in The settings of step 9 have no effect if |
To convert (replace firebird-odbc-driver.options with your options file):
cvs2git --options=firebird-odbc-driver.options
Conversion can take a while.
Then perform (replace firebird-odbc-driver and cvs2git-OdbcJdbc with your specific names):
git init firebird-odbc-driver.git
cd firebird-odbc-driver.git
cat ../cvs2git-OdbcJdbc/git-blob.dat ../cvs2git-OdbcJdbc/git-dump.dat | git fast-import
# might fail if this branch doesn't exist
git branch -D TAG.FIXUP
python ~/repomigration/cvs2svn/contrib/git-move-refs.py
# delete branches prefixed `unlabeled-` (old branches that had their name deleted from CVS)
# -r (GNU xargs) skips the delete when no such branches exist
git branch --list 'unlabeled-*' | xargs -r git branch -D
git gc --prune=now
git repack -a -d -f --depth=50 --window=250
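Because the conversion can take a while and may stop early, it can be worth checking that both output files exist before feeding them to git fast-import. This is a minimal sketch and not part of the original procedure; the helper name require_nonempty is hypothetical, and treating an empty file as a failed conversion is an assumption.

```shell
# Succeed only if every given file exists and is non-empty;
# report the ones that are not on stderr.
# (require_nonempty is a hypothetical helper name.)
require_nonempty() {
    status=0
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            printf 'missing or empty: %s\n' "$f" >&2
            status=1
        fi
    done
    return "$status"
}

# Intended use (not run here), with the paths used in this post:
# require_nonempty ../cvs2git-OdbcJdbc/git-blob.dat ../cvs2git-OdbcJdbc/git-dump.dat &&
#     cat ../cvs2git-OdbcJdbc/git-blob.dat ../cvs2git-OdbcJdbc/git-dump.dat | git fast-import
```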
Deleting the |
Verify the contents of the repository (replace firebird-odbc-driver and cvs2git-OdbcJdbc with your specific names):
mkdir /tmp/compare-firebird/
python ~/repomigration/cvs2svn/contrib/verify-cvs2svn.py \
--git \
~/repomigration/cvsrepo/OdbcJdbc/ \
~/repomigration/firebird-odbc-driver.git/ \
--tmp=/tmp/compare-firebird/ \
--diff
Create an empty repository on GitHub (replace PROJECT and REPOSITORY with the right values):
git remote add origin git@github.com:PROJECT/REPOSITORY.git
git config branch.master.remote origin
git config branch.master.merge refs/heads/master
git push origin --mirror
Make sure your SSH key for GitHub is loaded.
Add the line NetfraRemote.lib = svn:eol-style=CRLF to svn-eol-style.txt in an attempt to fix up an incorrect binary file.
After migration and publication to GitHub, check out on Windows and perform the following steps.
In branch master and B_Release do:
Create .gitignore with content:
lib/*
!lib/_readme_libs.txt
tools/*
!tools/_readme_tools.txt
!tools/get_tools_linux.sh
dist/
inter/
And then do
git add .
git commit -m "Add .gitignore"
git push
This will ignore the files and folders populated for the build process.
Create .gitattributes with content:
* text=auto
*.xml text
*.xsl text
*.docbook text
*.css text
*.bat text eol=crlf
*.sh text
*.bmp binary
*.gif binary
*.ico binary
*.jar binary
*.jpg binary
*.jpeg binary
*.png binary
And then do
git read-tree --empty
git add .
git commit -m "Add .gitattributes and update affected files"
git push