Improve git performance on Windows without patching your git install

I’ve blogged about improving the performance of Git on Windows in the past and rightly labelled the suggested solution as a bad hack because it requires you to manually replace binaries that are part of the installation. For people who tend to use DVCSs from the command line, manually replacing binaries is unlikely to be a big deal but it’s clunky and should really be a wakeup call for some people to include a newer base system.

By now there is a much easier way to get the same performance improvement and this is to use Git for Windows instead of the default Windows git client from git-scm.com. Not only does the Git for Windows installer include the newer openssl and openssh binaries that I suggested dropping into the git installation directory in my original post, it is also a much newer version of git.

For me, installing the Git for Windows client kills a couple of birds with one stone.

First, it addresses a large part of my complaint that Windows is a second class citizen to the Git developers. Using git on Windows is still a tad clunkier than using it in its native environment (i.e. the Unix world) but a dedicated project to improve the official command line client goes a long way to address this issue. Plus, the client is much more up to date compared to the official client from git-scm.com.

Second, addressing the performance issues that the official client has is a big deal, at least to those of us who need to work with git repositories in the multi-gigabyte size class. With repositories of that size, it does make a difference if your clone performance suddenly is an order of magnitude faster. In my case it also finally allows me to use these large git repositories with Mercurial’s hg-git plugin, which simply was not possible before.

I’ve not tried to verify if the newer openssh and openssl binaries address the issue I described in Making git work better on Windows. My assumption is that it’s not the case as I saw the same behaviour with the manually updated binaries. For use with a CI system like Jenkins I still recommend using http access to the repository.

Adding TLS support to Emacs 24.5 on Windows

The Windows build of Emacs 24.5 doesn’t ship with SSL and TLS support out of the box. Normally that’s not that much of a problem until you are trying to access marmalade-repo or have org2blog talk to your own blog via SSL/TLS.

Adding SSL and TLS support to the Windows builds of Emacs is easy. SSL/TLS support in the official Emacs build for Windows isn’t enabled because it doesn’t ship with the necessary support libraries, but you can get pre-built binaries from the ezwinports project on Sourceforge. Installation is simple – grab the desired binaries (I used gnutls, but there’s also an older openssl build available) and extract them into the root directory of your Emacs install. The directory layout is the same and mimics the standard Unix directory layout so everything ends up in the correct place.

After the next restart of Emacs, a quick

M-: (gnutls-available-p)

should result in ‘t’, showing you that Emacs has found the gnutls binaries. All of a sudden, org2blog can talk to my blog again and I’m finally set up the same way I am on the other OSs I use.

ezwinports has a whole bunch of other useful libraries available as well, like libpng, so check it out.

One caveat – the ezwinports libraries are 32 bit libraries so they work fine with the official 32 bit build of Emacs for Windows, but you need to look into alternatives if you use a 64 bit build.

Making git work better on Windows

In a previous blog post I explained how you can substantially improve the performance of git on Windows by updating the underlying SSH implementation. This performance improvement is very worthwhile in a standard Unix-style git setup where access to the git repository is done using ssh as the transport layer. For a regular development workstation, this update works fine as long as you remember that you need to check and possibly update the ssh binaries after every git update.

I’ve since run into a couple of other issues that are connected to using OpenSSH on Windows, especially in the context of a Jenkins CI system.

Accessing multiple git repositories via OpenSSH can cause problems on Windows

I’ve seen this a lot on a Jenkins system I administer.

When Jenkins is executing a longer-running git operation like a clone or large update, it can also check for updates on another project. During the check, you’ll suddenly see an “unrecognised host” message pop up on the console you’re running Jenkins from and it’s asking you to confirm the host fingerprint/key for the git server it uses all the time. What’s happening behind the scenes is that the first ssh process is locking .ssh/known_hosts and the second ssh process suddenly can’t check the host key due to the lock.

This problem occurs if you’re using OpenSSH on Windows to access your git server. PuTTY/Pageant is the recommended setup but I personally prefer using OpenSSH because when it works, it’s as seamless as it is on a Unix machine. OK, the real reason is that I tend to forget to start Pageant and load its keys but we don’t need to talk about that here.

One workaround that is being suggested for this issue is to turn off the key check and make /dev/null “storage” for known_hosts. I don’t personally like that approach much as it feels wrong to me – why add security by insisting on using ssh as a transport and then turn off said security, which results in a somewhat performance challenged git on Windows with not much in the way of security?

Another workaround improves performance, gets rid of the parallel access issue and isn’t much less safe.

Use http/https transport for git on Windows

Yes, I know that git is “supposed” to use ssh, but using http/https access on Windows just works better. I’m using the two interchangeably even though my general preference would be to just use https. If you have to access the server over the public Internet and it contains confidential information, I’d probably still use ssh, but I’d also question why you’re not accessing it over a VPN tunnel. But I digress.

The big advantage of using http for git on Windows is that it works better than ssh simply by virtue of not being a “foreign object” in the world of Windows. There is also the bonus that clones and large updates tend to be faster even compared to a git installation with updated OpenSSH binaries. As an aside, when I tested the OpenSSH version that is shipped with git for Windows against PuTTY/Pageant, the speeds were roughly the same so you’ll be seeing the performance improvements no matter which ssh transport you use.

As a bonus, it also gets rid of the problematic race condition that is triggered by the locking of known_hosts.

It’s not all roses though as it’ll require some additional setup on the part of your git admin. Especially if you use a tool like gitolite for access control, the fact that you end up with two paths in and out of your repository (ssh and http) means that you essentially have to manage two types of access control as the http transport needs its own set of access control. Even with the additional setup cost, in my experience offering both access methods is worth it if you’re dealing with repositories that are a few hundred megabytes or even gigabytes in size. It still takes a fair amount of time to shovel a large unbundled git repo across the wire this way, but you’ll be drinking less coffee while waiting for it to finish.
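
For an existing clone, switching transports comes down to changing the remote URL. Here’s a minimal sketch of how I’d derive the https URL from an scp-style ssh one. The host name, the repository path and the helper name ssh_to_https are made up for illustration, and your server may well publish its http URLs under a different path, so check with your git admin first.

```shell
#!/bin/sh
# Hypothetical helper: rewrite an scp-style ssh URL (user@host:path)
# as an https URL. Assumes the server exposes the repository under
# the same path over https, which is not a given.
ssh_to_https() {
    # git@git.example.com:team/project.git
    #   -> https://git.example.com/team/project.git
    printf '%s\n' "$1" | sed -e 's#^[^@]*@\([^:]*\):#https://\1/#'
}
```

Inside a clone you could then run git remote set-url origin "$(ssh_to_https "$(git config --get remote.origin.url)")" and verify the result with git remote -v before the next fetch.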

I prefer ConEmu over Console2, and so should you…

OK, I admit it – I’m a dinosaur. I still use the command line a lot as I’m subscribing to the belief that I can often type faster than I can move my hand off the keyboard to the mouse, click, and move my hand back. Plus, I grew up in an era when the command line was what you got when you turned on the computer, and Windows 2.0 or GEM was a big improvement.

One of the neat features of the console emulators on both Linux and Mac OS X was and is that you can run a set of shells in a tabbed single console window. A post on Scott Hanselman’s blog put me onto Console2. That was more like it and I pretty much immediately housed my Windows shells – either cmd.exe or PowerShell – in there. Much better, but over time the pace of development slowed and the last beta release dates from 2011. It’s not like the beta is buggy or anything – in fact, in my experience it works very nicely indeed – but of course as a software engineer I like shiny new things.

Enter, via another post on Scott Hanselman’s blog, ConEmu – or ConEmu-Maximus5, to give it its full name. If Console2 is the VW Golf to the stock Windows’ console emulator’s 1200cc VW Bug, then ConEmu is the VW Phaeton to Console2’s VW Golf. It’s got a lot more features, it’s actively developed, it works well with Far Manager if you miss the Norton Commander days and it’s highly configurable. Of course, it also can handle transparent backgrounds, but so can Console2.

For me, it has one killer feature – recent versions detect which shells you have installed on your machine and offer you a selection via the green “new tab” button (the one that looks a bit like a French Pharmacy sign), with a choice of running them either as a regular user or admin user:

ConEmu with visible command line processor menu

Why is this such a big deal? Well, it’s neat if you’re using both PowerShell and cmd.exe, but for me it’s a killer feature because I like using TCC/LE, at least at home. TCC/LE is the familiar Windows command prompt at first glance but in the same way that ConEmu is a much expanded console emulator compared to the regular Windows one, TCC/LE is a much expanded command prompt that is a lot more feature rich and has a lot of sensible extensions. And because I’m such a dinosaur, I’ve actually been using its predecessors (4DOS and 4NT) way back when they were distributed as shareware on a floppy disk and you had to buy the manuals for them to get the registration code. And yes, I still have at least the 4DOS manual.

Back to console emulators, though. If I wanted to go nitpicking, both ConEmu and Console2 work less well over an RDP connection than the stock console, which is noticeable if you tend to remote into machines quite frequently. It’s not that they work badly, but Microsoft clearly spent a lot of time optimising the stock console to work well over RDP (or to have RDP work well with the stock console), so there is a bit of lag when scrolling. It doesn’t make either tool unusable but you notice it’s there.

Anyway, if you check out one new tool this week, make it ConEmu.

Running Emacs from a Windows Explorer context menu

It’s one of those days: thanks to a hard disk going south I ended up having to rebuild the system drive on one of my machines. After putting the important software back on there – “Outlook and Emacs”, as one of my colleagues calls it – I had to reapply some of the usual tweaks that make a generic developer workstation my developer workstation.

One of the changes I wanted to make was to have an “Edit in Emacs” type context menu in Windows Explorer. The only reason I was keeping another editor around was because it’s a feature I use regularly but hadn’t got around to setting up for Emacs.

StackOverflow to the rescue, as usual. I used the registry script provided in the first answer and tweaked it slightly. In contrast to a lot of people, I don’t keep Emacs running all the time but only when I’m editing something. For that reason I like my Emacs setups to either start a new instance if there is no Emacs running or use an existing instance as a server if there happens to be one running.

With my little tweaks to start an instance of Emacs even if there is no Emacs server running, this is what the updated registry script looks like:

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\*\shell]
[HKEY_CLASSES_ROOT\*\shell\openwemacs]
@="&Edit with Emacs"
[HKEY_CLASSES_ROOT\*\shell\openwemacs\command]
@="C:\\Tools\\emacs\\bin\\emacsclientw.exe -a C:\\Tools\\emacs\\bin\\runemacs.exe -n \"%1\""
[HKEY_CLASSES_ROOT\Directory\shell\openwemacs]
@="Edit &with Emacs"
[HKEY_CLASSES_ROOT\Directory\shell\openwemacs\command]
@="C:\\Tools\\emacs\\bin\\emacsclientw.exe -a C:\\Tools\\emacs\\bin\\runemacs.exe -n \"%1\""

There’s another neat little tweak in there, too – the directory C:\Tools\emacs is actually a symbolic link to the currently installed version of Emacs on this machine, so whenever I update my version of Emacs, I don’t have to redo other scripts and settings that try to use the current version of Emacs.

This might be old hat to most Unixheads, but it’s slightly unusual on Windows so I figured I’d mention that it’s possible to do something like this on Windows also.

Improving the performance of Git for Windows

Admittedly I’m not the biggest fan of git – I prefer Mercurial – but we’re using it at work and it does a good job as a DVCS. However, we’re mostly a Windows shop and the out of the box performance of Git for Windows is anything but stellar when you are using ssh as the transport for git. That’s not too much bother with most of our repos but we have a couple of fairly big ones and clone performance with those matters.

I finally got fed up with the performance after noticing that cloning a large repository from the Linux git server to a FreeBSD box was over an order of magnitude faster than cloning the same repository onto a Windows machine. So I decided to start digging around for a solution.

The clone performance left a lot to be desired using either PuTTY or the bundled OpenSSH as the SSH client. I finally settled on using OpenSSH as I find it easier to deal with multiple keys. Well, it might just be easier if you’re a Unixhead.

My search led me to this discussion, which implies that the problem lies with the version of OpenSSH and OpenSSL that comes prepackaged with Git for Windows. The version is rather out of date. Now, I had come across this discussion before and as a result attempted to build my own SSH binary that included the high performance ssh patches, but even after I got those to build using cygwin, I never managed to actually get it to work with Git for Windows. Turns out I was missing a crucial detail. It looks like the Git for Windows binaries ignore the PATH variable when they look for their OpenSSH binaries and just look in their local directory. After re-reading the above discussion, it turned out that the easiest way to get Git for Windows to recognise the new ssh binaries is to simply overwrite the ones that are bundled with the Git for Windows installer.

*** Bad hack alert ***

The simple recipe to improve the Git performance on Windows when using a git+ssh server is thus:

  • Install Git for Windows and configure it to use OpenSSH
  • Install the latest MinGW system. You only need the base system and their OpenSSH binaries. The OpenSSH and OpenSSL binaries that come with this installation are much newer than the ones you get with the Git for Windows installer.
  • Copy the new SSH binaries into your git installation directory. You will need to have local administrator rights for this as the binaries reside under Program Files (or “Program Files (x86)” if you’re on a 64 bit OS). The binaries you need to copy are msys-crypto-1.0.0.dll, msys-ssl-1.0.0.dll, ssh-add.exe, ssh-agent.exe, ssh-keygen.exe, ssh-keyscan.exe and ssh.exe
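
The copy step from the list above can be scripted from the sh prompt that comes with git. This is only a sketch: the function name and both directory arguments are placeholders that you have to fill in for your own machine, and backing up the bundled files first is my own addition rather than part of the recipe.

```shell
#!/bin/sh
# Hypothetical helper: copy the newer ssh binaries over the bundled ones.
#   $1 = directory holding the new binaries (your MinGW/MSYS bin directory)
#   $2 = directory holding the binaries bundled with git
# Both locations are assumptions; check where they live on your machine.
copy_ssh_binaries() {
    src=$1
    dst=$2
    for f in msys-crypto-1.0.0.dll msys-ssl-1.0.0.dll \
             ssh-add.exe ssh-agent.exe ssh-keygen.exe ssh-keyscan.exe ssh.exe
    do
        # Keep a one-time backup of the bundled file so the hack can be undone.
        if [ -f "$dst/$f" ] && [ ! -f "$dst/$f.orig" ]; then
            cp "$dst/$f" "$dst/$f.orig" || return 1
        fi
        cp "$src/$f" "$dst/$f" || return 1
    done
}
```

Run it from a shell that was started with administrator rights, passing the two directories as arguments, and run it again after every git update because the installer puts the old binaries back.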

After the above modifications, the clone performance on my Windows machine went from 1.5MB/s – 2.5MB/s to 10MB/s-28MB/s depending on which part of the repository it was processing. That’s obviously a major speedup for cloning, but another nice side effect is that this change will result in noticeable performance improvements for pretty much all operations that involve a remote git repository. They all feel much snappier.

With Emacs on Windows, make sure you know where your $HOME is

The GNU Emacs for Windows distribution appears to be pretty good at inferring where a reasonable place for $HOME is, straight out of the box. In my case, said reasonable place was %USERPROFILE%/AppData/Roaming which was an entirely acceptable default.

That is, until several other tools entered the picture and disagreed with Emacs. We’ve recently switched to using git at work and the git ecosystem needed to have some idea of where its home was. I’m using Git Extensions as the “regular” Windows GUI and TortoiseGit for the Windows Explorer integration, plus the awesome Posh-Git that even made me learn basic PowerShell.

All of this worked fine until I threw magit into the mix. I like being able to interact with the VCS directly from Emacs (who doesn’t?) and magit is probably the greatest VCS integration for Emacs. It worked fine as long as I kept the cheat sheet handy, but a colleague of mine pointed out that my magit commits supposedly came from a really funky user that looked very much like the computer guessed my email address rather than the user configured in my git configuration.

Turns out that both git and Emacs respect and look at the HOME environment variable. After settling on a suitable location and adjusting %HOME%, I moved the various dot files into the correct location and can now commit correctly from Emacs with the correct user details. Phew.
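
If you’re not sure which of the candidate directories your tools have been treating as home, a quick look at where the dot files ended up usually tells you. Here’s a small sketch for the sh that ships with git on Windows. The helper name and the list of file names are my assumptions, so extend the list with whatever your tools write.

```shell
#!/bin/sh
# Hypothetical helper: print the well-known dot files found in a directory.
# The names checked here are an assumption, not a complete list.
dotfiles_in() {
    dir=$1
    for f in .gitconfig .emacs .emacs.d; do
        if [ -e "$dir/$f" ]; then
            printf '%s\n' "$f"
        fi
    done
}
```

Call it once per candidate, for example dotfiles_in "$HOME" and dotfiles_in "$APPDATA", and compare the output. Whatever turns up in the wrong place is what needs moving.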

Oh, and for those commits that did have the odd username on them, the following commands came in really handy.

To change the author on the current (well, last) commit:

git commit --amend --author="corrected author"

Or if you just want to update the author on the last commit after updating HOME to point at the correct location:

git commit --amend --reset-author

How to enable (hack) git-p4 in msysgit for Windows

The default installation of msysgit (aka the official git client for Windows) is unfortunately built without Python support. There are understandable reasons as to why this is, starting with “where the heck do I find the various Python versions on Windows”. For me the problem was that I needed git-p4 to extract some code history out of a Perforce repository and guess what, git-p4 is written in Python. The only solution was to find a way to make this work, short of throwing Linux in a VM just to get a git import going.

It actually turned out to be fairly simple. The git-p4 that comes with the msysgit installation is a very basic placeholder that the main git executable runs via its shell. Getting the git-p4 plugin to work was a simple case of dropping the “real” git-p4.py from the Linux distribution into an appropriate location and then modifying git-p4 to run my local python with the appropriate command line. Just keep in mind that the shell used by msysgit is a unix shell so you need to make sure that the paths and parameters are in /bin/sh syntax and not in DOS batch syntax. Here’s my current hacked version of git-p4 that seems to do the job:


#!/bin/sh

# Hand every argument through to the real git-p4.py; "$@" keeps arguments with spaces intact.
c:/python27/python "c:/program files (x86)/Git/libexec/git-core/git-p4.py" "$@"

Improving the Emacs integration in Windows

I was trying to make Windows a little more Emacs-friendly (or was it the other way around?). First step was to enable the emacs server in my .emacs so I could make use of Emacs for quick and dirty editing tasks that require an editor better than Notepad but where the average Emacs startup time was just a little too long to make Emacs a viable alternative. A typical example would be to use Emacs as the editor for commit messages in Mercurial. A quick tweak of my global .hgrc provided me with an appropriate editor setting:

  [ui]
           ... other settings ...
  editor = C:\Emacsen\emacs-24.1\bin\emacsclientw.exe -c

Please note that there are no quotation marks around the emacsclientw command line; adding them will result in an error message rather than an Emacs frame. Guess how I found that out. I would also suggest extending the command line to include the “alternate editor” parameter -a to either start Emacs or another editor if there is no Emacs server running. Given that I tend to start Emacs right after I start the browser and the email client on most machines, this would be an unnecessarily cluttered command line for my use.

I also set up Emacs as an external editor from Visual Studio as described in this blog post, so now I can hit Tools/Edit in Emacs from Visual Studio 2012. Hooray! The only tweak I made to the emacsclientw invocation described in the blog post was to make “+$(CurLine)” the first parameter of the emacsclient invocation. That way, the Emacs cursor position is synchronised with the cursor position in Visual Studio at the time you invoked “Edit in Emacs”.

Moving to a multi-VHD Windows installation to separate work and personal data

I had been thinking about setting myself up with a way to work from home in a disconnected fashion. Most of the places I’ve worked at in the past required me to remote into the work desktop, which is a good idea if both sides have 100% uptime on their network connection and no issues with them being affected by adverse weather. Which in reality means that the connections tended to be unstable if the weather dictated that one really, really wanted to work from home on a particular day because snowfall was horizontal, for example. My current employer is more enlightened in this matter so my suggestion of locking all the necessary tools and source code inside a VM that would allow me to work from home even if the Internet connection was unavailable at either end was given the go ahead. Given that my desktop here is plenty powerful for most development tasks (it’s an older Intel Mac Pro with dual Xeons), this should be an ideal solution.

Only, with the VM software I was trying out, the virtualised disk throughput was lacking a little. The product I’m working on uses Qt and it took a day to build the commercial version of 4.7.4 inside the VM, with one of the Xeons allocated to VM duty. Oops. Some more digging pretty much confirmed that the main issue was the disk throughput or lack thereof. At this point I came across Scott Hanselman’s article on how to boot Windows off a VHD. My understanding is that Bootcamp only supports booting off a single Windows partition so this sounded ideal to me – just put a VHD with all the tools and the source code on the boot partition I already have, then boot from the VHD if I need to. Donn Felker’s blog entry on booting off a VHD on a Bootcamp’d Mac added the one missing piece of information, namely that one should ignore the warning from the Windows 7 installer that the disk (VHD) you’re about to install on isn’t supported and that there might be driver issues. Just go ahead and do it anyway.

After the installation and dropping all the tools on the VHD – I’m getting a little too familiar with the Visual Studio installer by now – Qt built pretty much in the expected time and the project itself can also be built within a reasonable amount of time. My guess is that the build is 5%-10% slower than on the work machine, but the work machine is building on an SSD and obviously hasn’t got a virtualised hard disk to deal with either. On the other hand my own machine has the benefit of 8 real cores.

Why all the effort? I don’t like mixing work projects and my own stuff, for starters. If I can lock work into a VM or at least some kind of a sandbox, there’s less of a chance of accidental cross-pollination between the two and no licensing headaches either. The latter is especially important to me as there are some software licenses that are “duplicated” in the sense that I have both a work and a personal license. And of course there’s the little detail that the work VM data can simply be destroyed by deleting the VM/VHD if it proves necessary.

Even though I did originally intend to only set up a single VHD for work purposes and keep all the personal software and data on my main disk, I’ve ended up creating a second VHD specifically for a couple of car racing simulators that I use (iRacing and rFactor). I’m not a big gamer but I do like track driving in the real world and using the simulators tends to help with familiarising yourself with a track, plus it helps in the off season, too. iRacing had a bit of a problem with the various bits of security software I have installed on my main Windows and given that I had a spare license anyway, it made sense to put it in its own “virgin Windows” sandbox. No issues since. Well, none related to the software…