
Redis + systemd-nspawn

This is me revisiting something I built a few years back for a production workload, using utilities that were “not ready for production”. What could go wrong?

The goal was thousands of redis processes with some isolation. Previously there was none and everything ran as root. This isn’t the entire process, but it covers the main configuration points.

## Unmess sh
cd /bin                                                                            
sudo rm sh                                                                         
sudo ln -s bash sh                                                                 
## The usual suspects                                                              
sudo apt-get update                                                                
sudo apt-get install -y \
    build-essential \
    git-core \
    xfsprogs \
    dbus \
    debootstrap \
    curl

Here we are just making the container mount points and sources. I don’t know why logs are stored in /mnt; it wasn’t my decision.

## Build the container                                                          
sudo mkdir -p /var/lib/machines/redis                                           
sudo mkdir /opt/null                                                            
sudo debootstrap --arch=amd64 jessie /var/lib/machines/redis/                   
## Create redis user                                                            
sudo useradd -u 9000 -s /bin/false -m redis                                     
sudo useradd -u 9000 -s /bin/false -R /var/lib/machines/redis/ redis            
## Modify /mnt permissions                                                      
sudo chmod g+w /mnt                                                             
sudo chown :redis /mnt                                                          
## Modify redis ulimits                                                         
sudo tee /etc/security/limits.d/redistogo.conf <<'EOF'
redis   soft    nofile  100000
redis   hard    nofile  100000
EOF
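The pam_limits format is easy to fat-finger, so it's worth a quick spot-check before logging out. A minimal sketch against a scratch copy (the /tmp path is just for illustration):

```shell
# Write a scratch copy of the limits file and confirm the
# pam_limits fields parse as expected (domain, type, item, value):
cat > /tmp/redistogo.conf <<'EOF'
redis   soft    nofile  100000
redis   hard    nofile  100000
EOF
awk '$1 == "redis" && $3 == "nofile" { print $2, $4 }' /tmp/redistogo.conf
```

This should print `soft 100000` and `hard 100000`; anything else means a field is misaligned.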

systemd-nspawn uses D-Bus, and D-Bus caps the number of connections any one user, including root, can hold. Hitting that cap effectively killed the machine: it had to be restarted, and I was never able to recover one without a reboot. I've lost the link to the response from the developers. The limits are compiled-in defaults, but they can be overridden in a policy file.

## Configure D-Bus to not suck
sudo tee /etc/dbus-1/system.d/redisprov.conf <<'EOF'
<busconfig>
    <limit name="service_start_timeout">120000</limit>
    <limit name="auth_timeout">240000</limit>
    <limit name="pending_fd_timeout">150000</limit>
    <limit name="max_completed_connections">100000</limit>
    <limit name="max_incomplete_connections">10000</limit>
    <limit name="max_connections_per_user">100000000</limit>
    <limit name="max_pending_service_starts">10000</limit>
    <limit name="max_names_per_connection">50000</limit>
    <limit name="max_match_rules_per_connection">50000</limit>
    <limit name="max_replies_per_connection">50000</limit>
</busconfig>
EOF
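dbus-daemon will refuse a malformed policy file, so it's cheap insurance to check the XML is well-formed before restarting anything. A minimal sketch against a scratch copy with one representative limit (assumes python3 is on the host; the /tmp path is illustrative):

```shell
# Write a one-limit scratch copy and confirm it parses as XML
# before dropping the real file into /etc/dbus-1/system.d/:
cat > /tmp/redisprov-check.conf <<'EOF'
<busconfig>
  <limit name="max_connections_per_user">100000000</limit>
</busconfig>
EOF
python3 -c 'import xml.dom.minidom as m; m.parse("/tmp/redisprov-check.conf"); print("well-formed")'
```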

Now we write a few unit files that build the container. The symlink trick lets a single template launch different redis builds on different ports.

# Install Redis systemd unit files                                              
#   Note(crainte): Not certain why but a direct symlink to the service file     
#                  will not allow the dynamic version to start. We have to      
#                  do the include for it to pick up the correct variable.       
#   Examples:                                                                   
#       systemctl status [email protected]
#       systemctl start [email protected]
#       systemctl stop [email protected]
#       machinectl status redis-2.8.21-9000                                     
sudo tee /etc/systemd/system/redisprov.include <<'EOF'
.include /etc/systemd/system/redisprov.service
EOF
sudo tee /etc/systemd/system/redisprov.service <<'EOF'
[Service]
ExecStart=/usr/bin/systemd-nspawn --user=redis --keep-unit --machine=redis-%p-%i -j --directory=/var/lib/machines/redis/ --bind=/usr/local/bin/redis/%p/:/bin/ --bind=/mnt --bind=/opt/null:/sbin --bind=/home/redis/%i redis-server /home/redis/%i/rprov.conf
EOF

Now after starting two instances, e.g. "systemctl start [email protected]" and "systemctl start [email protected]", you will have two redis processes running under nspawn. As I mentioned, these are notes from a few years back. If I were revisiting this I could most likely drop the debootstrap step, since redis isn't really using the guest filesystem. But if you want to run a more full-featured application under nspawn, you might need it.

Good luck.

Ubuntu 12 + lxc + Rackspace

First let’s build an Ubuntu 12 server

cloud> servers create
Server name: lxc-host
Image ID: 125
Flavor ID: 3
   "server" : {
      "status" : "BUILD",
      "progress" : 0,
      "name" : "lxc-host",
      "imageId" : 125,
      "addresses" : {
         "private" : [ … ],
         "public" : [ … ]
      },
      "flavorId" : 3,
      "hostId" : "c395c82dd1963e52098e77819fda2b86",
      "metadata" : {},
      "id" : 20898849
   }

Make sure everything is up-to-date

root@lxc-host:~# apt-get update
root@lxc-host:~# apt-get upgrade

Install some necessary utilities for this.

root@lxc-host:~# apt-get install lxc debootstrap bridge-utils screen

Now we need to make the bridge in /etc/network/interfaces

# lxc bridge
auto lxcbr0
iface lxcbr0 inet static
        # any private subnet works here; the guests will NAT through it
        address 10.0.3.1
        netmask 255.255.255.0
        post-up echo 1 > /proc/sys/net/ipv4/ip_forward
        post-up iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
        pre-down echo 0 > /proc/sys/net/ipv4/ip_forward
        pre-down iptables -t nat -D POSTROUTING -o eth0 -j MASQUERADE
        bridge_ports none
        bridge_stp off

Activating the bridge…

root@lxc-host:~# ifup lxcbr0

Now it’s time to create the first container. The available templates should be listed here:

root@lxc-host:~# ls /usr/lib/lxc/templates/
lxc-busybox  lxc-debian  lxc-fedora  lxc-opensuse  lxc-sshd  lxc-ubuntu  lxc-ubuntu-cloud

To be special, let's make a fedora guest:

root@lxc-host:~# vim /etc/lxc/auto/fedora.conf
lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = lxcbr0
lxc.network.ipv4 =
lxc.network.name = eth0
lxc.cgroup.cpu.shares = 512
lxc.cgroup.memory.limit_in_bytes = 1024M
lxc.cgroup.memory.memsw.limit_in_bytes = 3072M
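Worth noting: memory.memsw.limit_in_bytes counts RAM plus swap together, so the 1024M/3072M pair above gives the guest up to 2 GiB of swap on top of its 1 GiB of RAM. Quick arithmetic check:

```shell
# memsw is RAM + swap, so swap headroom is the difference:
mem=$((1024 * 1024 * 1024))      # memory.limit_in_bytes = 1024M
memsw=$((3072 * 1024 * 1024))    # memory.memsw.limit_in_bytes = 3072M
echo "swap headroom: $(( (memsw - mem) / 1024 / 1024 ))M"   # -> 2048M
```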

But we might need something first

root@lxc-host:~# apt-get install yum

The actual lxc-create syntax

root@lxc-host:~# lxc-create -n fedora-guest -t fedora -f /etc/lxc/auto/fedora.conf
root@lxc-host:~# vim /var/lib/lxc/fedora-guest/rootfs/etc/sysconfig/network-scripts/ifcfg-eth0
root@lxc-host:~# screen -dmS init-fedora-guest lxc-start -n fedora-guest
root@lxc-host:~# screen -ls
There is a screen on:
	19959.init-fedora-guest	(06/06/2012 09:55:39 AM)	(Detached)

And voila! I’ll admit this fedora install seems somewhat messed up, but that’s a problem with the template itself. You can correct the install or use something else like ubuntu/ubuntu-cloud for better results. Let me know if you have questions.

iSCSI performance in the cloud

This is one of the more entertaining things I’ve done at work and, judging by the emails I received from co-workers, one of the most controversial. Someone asked if this could be done, and I figured I’d give it a shot. This will be in two parts, the first being FileIO, since I was having problems trying to get partition resizes to work in the cloud. Part two will be BlockIO once I get around to fixing the kernels.

I used CentOS 6 for all of the servers involved in this setup and all network traffic is over the internal service network.

[root@target ~]# dd if=/dev/zero of=/root/block.img bs=4M count=1250

The other iSCSI drive is a RAID 0 configuration using a much smaller block size for comparison.

[root@target1 ~]# dd if=/dev/zero of=/root/block.img bs=512K count=30720
[root@target2 ~]# dd if=/dev/zero of=/root/block.img bs=512K count=30720
[root@initiator ~]# mdadm -Cv /dev/md0 -l0 -n2 -c128 /dev/sdb1 /dev/sdc1
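For reference, the dd invocations above size the backing files like so: 4M × 1250 gives a 5000 MiB image for the single target, while 512K × 30720 gives 15360 MiB per RAID member, so the two-member stripe presents roughly 30 GiB. The arithmetic:

```shell
# Backing-file sizes implied by the dd block size x count above:
single=$((4 * 1250))     # 4 MiB blocks x 1250  -> MiB
member=$((30720 / 2))    # 512 KiB blocks x 30720 -> 30720/2 MiB
echo "single target: ${single} MiB"
echo "raid member:   ${member} MiB (x2, striped)"
```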

All of the targets are set up the same way:

Target iqn.2012-03.local.node:vsan2 
Lun 0 Path=/root/block.img,Type=fileio,IOMode=wb 
Alias vsan2 
ImmediateData Yes 
MaxConnections 2 
InitialR2T No

I’m just using hdparm for basic tests. I wasn’t really inspired to test further. For comparison here is the actual block device offered by xenserver:

[root@initiator ~]# hdparm -tT /dev/xvda
Timing cached reads: 1948 MB in 1.99 seconds = 977.19 MB/sec 
Timing buffered disk reads: 82 MB in 3.10 seconds = 26.45 MB/sec

That is actually lower than what I saw during my earlier tests, but it’s what I got while writing this post. Now to test the single-drive, 4M-block-size device:

[root@initiator ~]# hdparm -tT /dev/sda
Timing cached reads: 1696 MB in 1.99 seconds = 850.29 MB/sec 
Timing buffered disk reads: 16 MB in 3.40 seconds = 4.71 MB/sec

And finally for the RAID0 performance results:

[root@initiator ~]# hdparm -tT /dev/md0
Timing cached reads: 1968 MB in 1.99 seconds = 987.01 MB/sec 
Timing buffered disk reads: 28 MB in 3.04 seconds = 9.21 MB/sec
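Putting the buffered-read numbers side by side: the two-member stripe roughly doubles the single iSCSI target (9.21 vs 4.71 MB/sec), but both still sit well below the native xvda device at 26.45 MB/sec. The ratios:

```shell
# Throughput ratios from the hdparm buffered-read results above:
awk 'BEGIN {
    printf "raid0 vs single target: %.2fx\n", 9.21 / 4.71
    printf "native vs raid0:        %.2fx\n", 26.45 / 9.21
}'
```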

Kinda meh results, but that’s what you can expect from fileio. I’m really looking forward to the blockio tests, but the kernels on the gentoo images are 3.0+ and I need 2.6 for iscsi to work. When I get around to it I’ll finish setting that up.