The Lone C++ Coder's Blog

The continued diary of an experienced C++ programmer. Thoughts on C++ and other languages I play with, Emacs, functional, non functional and sometimes non-functioning programming.

Timo Geusch

RHEL 7 (and CentOS 7, which I used for this test) uses tuned profiles, configured in tuned.conf, to control a lot of system settings. Several of these settings affect MongoDB’s performance; some are important enough that mongod triggers startup warnings about them. The main one is transparent huge pages, a feature that generally does not work well with databases.

The MongoDB documentation already describes how to disable Transparent Huge Pages (aka THP) using tuned.conf, but mongod warns about several other settings if you run it on an out-of-the-box CentOS 7.

So what are those settings and why are they bad?

The typical set of startup warnings you get after disabling THP on a stock RHEL/CentOS 7 is twofold:

  • mongod usually warns about readahead settings being too large, which is especially important when using the WiredTiger storage engine
  • Also, mongod warns that THP defragmentation is still enabled after THP itself is turned off.

Of the two, the readahead setting is the more important one because it has a direct impact on database performance.

Why too much readahead is bad for your database

This is the case for all databases, not only MongoDB. Over time, the read pattern of almost any database approaches random. Readahead is a setting that is intended to speed up large-ish linear reads by pulling more data into the file system cache than the user asked for. With the standard, fairly large, readahead, the kernel loads the pages of data the database asked for, plus a lot of additional pages. These additional pages then take up space in the file system cache even if they are never used. The result is a lot of unnecessary I/O and file system cache page evictions. All that while your database server could do something useful, like send data to clients.

The MongoDB documentation contains a section on setting the readahead in the production notes. It explains how to set the readahead using blockdev --setra and also lists appropriate values for the readahead.

Readahead is an ephemeral setting and does not survive a reboot. Clearly this is a potential issue as the server administrator has to update the settings manually on every server restart, or use a script to do so.
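Before changing anything, it helps to see what the current readahead is. Here is a minimal sketch, assuming the standard Linux sysfs layout. One thing to watch out for is the unit: blockdev --getra and --setra work in 512-byte sectors, while the read_ahead_kb file in sysfs reports kilobytes, so the two differ by a factor of two. The function names and the optional root argument are made up for this example.

```shell
#!/bin/sh
# Convert between the two units used for readahead:
# blockdev --getra/--setra use 512-byte sectors, sysfs read_ahead_kb uses KB.
sectors_to_kb() { echo $(( $1 / 2 )); }
kb_to_sectors() { echo $(( $1 * 2 )); }

# Print "<device> <readahead in KB> <readahead in sectors>" for every block
# device found under the given sysfs root (defaults to /sys).
# The root argument only exists to make the function easy to try out.
list_readahead() {
    sysfs_root=${1:-/sys}
    for f in "$sysfs_root"/block/*/queue/read_ahead_kb; do
        [ -r "$f" ] || continue
        dev=${f#"$sysfs_root"/block/}
        dev=${dev%%/*}
        kb=$(cat "$f")
        echo "$dev $kb $(kb_to_sectors "$kb")"
    done
}
```

Calling list_readahead without arguments lists all block devices on the machine. Setting the value by hand would then be something like `sudo blockdev --setra 32 /dev/xvda`, which is exactly the step that gets lost on the next reboot.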

Setting readahead via tuned.conf

Turns out you can actually set readahead for block devices using tuned.conf, which turns it into a set-and-forget setting rather than an item on someone’s checklist. It also has the advantage of keeping several settings in the same place instead of adding more scripts.

Here’s an example of a tuned.conf based on the one used to disable THP. Please note this is specifically designed for use inside a virtual machine and thus includes virtual-guest. It disables THP and also sets readahead for the specific block device that mongod stores its data on:

    
    [main]
    include=virtual-guest
    
    [vm]
    transparent_hugepages=never
    
    [disk]
    devices=xvda*
    readahead=32
    

The relevant section here is the [disk] section, which lists a set of devices and their readahead setting. I created the above example on an AWS instance running CentOS that only had a single block device, so the device setting affects the whole system. Unfortunately the disk module that is part of tuned doesn’t appear to be that well documented, so some experimentation is in order to arrive at the correct values for the readahead setting. In this case, I chose to set the readahead slightly higher than recommended to avoid slowing down other file system reads too much. If the setting applied to a device that only contained the database’s data directory, I would set readahead to a lower value or 0.
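If you are not sure which block device the data directory lives on, df can tell you. This is a small sketch rather than a definitive recipe; the helper names are invented for this example, and /var/lib/mongo in the usage comment is an assumption about where your dbPath points.

```shell
#!/bin/sh
# Extract the "Filesystem" column from `df -P` output read on stdin.
# df -P prints one header line followed by one line per file system.
df_source() { awk 'NR == 2 { print $1 }'; }

# Print the device backing the given path, e.g. /dev/xvda1.
device_for_path() { df -P "$1" | df_source; }

# Example (the path is an assumption, use your own dbPath):
#   device_for_path /var/lib/mongo
```

A result such as /dev/xvda1 lives on the disk xvda, which is what the devices=xvda* line in the example covers.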

To use this configuration, follow the instructions on the MongoDB page that describes how to disable THP. The tl;dr version of the instructions is:

  • Create a new tuned profile by creating a directory in /etc/tuned that is named after your new profile
  • Save the example configuration in a new file with the name tuned.conf in the newly created directory
  • Activate the newly created profile by running ‘sudo tuned-adm profile’ followed by the name of your new profile
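The steps above can be sketched as a small shell function. Treat it as an illustration under a few assumptions: the profile name mongodb-tuning is hypothetical, the function name is made up, and the readahead value is passed in as a parameter so you can experiment with it. The device glob is the one from the example above and will need adjusting on other machines.

```shell
#!/bin/sh
# Create a tuned profile directory containing the configuration shown above.
# $1: tuned configuration root (normally /etc/tuned)
# $2: profile name (hypothetical example: mongodb-tuning)
# $3: readahead value for the [disk] section
create_profile() {
    profile_dir=$1/$2
    mkdir -p "$profile_dir" || return 1
    cat > "$profile_dir/tuned.conf" <<EOF
[main]
include=virtual-guest

[vm]
transparent_hugepages=never

[disk]
devices=xvda*
readahead=$3
EOF
}

# Typical use, as root:
#   create_profile /etc/tuned mongodb-tuning 32
#   tuned-adm profile mongodb-tuning
#   tuned-adm active     # shows the profile currently in effect
```

Writing the file from a function rather than by hand mostly pays off when you have to repeat the exercise on more than one server.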

Disabling THP defragmentation

As I showed above, you can disable THP and set the readahead using stock tuned modules and settings. Unfortunately, the same doesn’t apply to disabling THP defragmentation. However, there is a neat way to achieve this. A little research shows that it is possible to run a script from inside a tuned.conf file. The method to do so was kindly described by user vcarel on Server Fault.

Applying vcarel’s method to our tuned.conf file above gives us the following file:

    
    [main]
    include=virtual-guest
    
    [vm]
    transparent_hugepages=never
    
    [disk]
    devices=xvda*
    readahead=32
    
    [script]
    script=disable-defrag.sh
    

As you can see, the only change to the previous version of the configuration is the addition of the [script] section. This section triggers the execution of a script when the tuned configuration is activated. Obviously we now need the disable-defrag.sh script, too, which looks like this:

    
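    #!/bin/sh
    
    # Pull in the helper functions that ship with tuned, including process()
    . /usr/lib/tuned/functions
    
    # Called when the profile is activated
    start() {
        echo never > /sys/kernel/mm/transparent_hugepage/defrag
        return 0
    }
    
    # Called when the profile is deactivated; nothing to undo here
    stop() {
        return 0
    }
    
    # Dispatch to start() or stop() based on the argument passed in by tuned
    process "$@"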
    

A pretty simple script that sets THP defrag to ‘never’.

Applying the last version of tuned.conf shown above together with the script will turn off both mongod startup warnings regarding THP and also the warning about readahead.
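To double-check the result without restarting mongod, you can read the THP settings straight from sysfs. A minimal sketch, assuming the usual file format in which the kernel lists all possible values and puts the active one in square brackets; the function names and the optional root argument are made up for this example.

```shell
#!/bin/sh
# The THP files list every possible value and mark the active one with
# brackets, e.g. "always madvise [never]". Print only the active value.
active_value() { sed -n 's/.*\[\([^]]*\)].*/\1/p'; }

# Report the active THP settings under the given sysfs root (defaults to /sys).
check_thp() {
    thp_dir=${1:-/sys}/kernel/mm/transparent_hugepage
    for name in enabled defrag; do
        [ -r "$thp_dir/$name" ] || continue
        echo "$name=$(active_value < "$thp_dir/$name")"
    done
}
```

With the profile active, running check_thp should report never for both enabled and defrag. If defrag still shows a different value, the script probably did not run.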

Disclaimer: I work for MongoDB as a consulting engineer, but this is my personal blog. All opinions expressed herein are mine and mine alone. They don’t reflect the opinion of employers past, present or future and don’t constitute endorsements by them either.
