In a previous blog post I explained how you can substantially improve the performance of git on Windows updating the underlying SSH implementation. This performance improvement is very worthwhile in a standard Unix-style git setup where access to the git repository is done using ssh as the transport layer. For a regular development workstation, this update works fine as long as you keep remembering that you need to check and possibly update the ssh binaries after every git update.
I’ve since run into a couple of other issues that are connected to using OpenSSH on Windows, especially in the context of a Jenkins CI system.
Accessing multiple git repositories via OpenSSH can cause problems on Windows
I’ve seen this a lot on a Jenkins system I administer.
When Jenkins is executing a longer-running git operation like a clone or large update, it can also check for updates on another project. During the check, you’ll suddenly see an “unrecognised host” message pop up on the console you’re running Jenkins from and it’s asking you to confirm the host fingerprint/key for the git server it uses all the time. What’s happening behind the scenes is that the first ssh process is locking .ssh/known_hosts and the second ssh process suddenly can’t check the host key due to the lock.
This problem occurs if you’re using OpenSSH on Windows to access your git server. PuTTY/Pageant is the recommended setup but I personally prefer using OpenSSH because if it is working, it’s seamless the same way it works on a Unix machine. OK, the real reason is that I tend to forget to start pageant and load its keys but we don’t need to talk about that here.
One workaround that is being suggested for this issue is to turn off the key check and make /dev/null “storage” for known_hosts. I don’t personally like that approach much as it feels wrong to me – why add security by insisting on using ssh as a transport and then turn off said security, which results in a somewhat performance challenged git on Windows with not much in the way of security?
Another workaround improves performance, gets rid of the parallel access issue and isn’t much less safe.
Use http/https transport for git on Windows
Yes, I know that git is “supposed” to use ssh, but using http/https access on Windows just works better. I’m using the two interchangeably even though my general preference would be to just use https. If you have to access the server over the public Internet and it contains confidential information, I’d probably still use ssh, but I’d also question why you’re not accessing it over a VPN tunnel. But I digress.
The big advantages of using http for git on Windows is that it works better than ssh simply by virtue of not being a “foreign object” in the world of Windows. There is also the bonus that clones and large updates tend to be faster even compared to a git installation with updated OpenSSH binaries. As an aside, when I tested the OpenSSH version that is shipped with git for Windows against PuTTY/Pageant, the speeds are roughly the same so you’ll be seeing the performance improvements no matter which ssh transport you use.
As a bonus, it also gets rid of the problematic race condition that is triggered by the locking of known_hosts.
It’s not all roses though as it’ll require some additional setup on behalf of your git admin. Especially if you use a tool like gitolite for access control, the fact that you end up with two paths in and out of your repository (ssh and http) means that you essentially have to manage two types of access control as the http transport needs its own set of access control. Even with the additional setup cost, in my experience offering both access methods is worth it if you’re dealing with repositories that are a few hundred megabytes in size or even gigabytes in size. It still takes a fair amount of time to shovel an large unbundled git repo across the wire this way, but you’ll be drinking less coffee while waiting for it to finish.