Redis is an excellent key/value cache that is used across many of Shokunin's customers. While redis is a great piece of software, it is often difficult to find information about running it in production from an operational perspective. This article discusses the steps that ops teams should take before running redis in a production environment.
Receive Packet Steering (RPS) / CPU Preferences
Redis is a mostly single-threaded application. To ensure that redis is not running on the same CPUs as those handling network traffic, it is highly recommended that RPS is enabled.
To enable RPS on CPUs 0-1:
echo '3' > /sys/class/net/eth1/queues/rx-0/rps_cpus
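The value written to rps_cpus is a hexadecimal bitmask of CPU numbers, so '3' (binary 11) selects CPUs 0 and 1. The following is a minimal sketch for deriving that mask from a list of CPU numbers. The helper name cpu_mask is hypothetical and not part of any tool, and the sketch only covers machines with few enough CPUs to fit in a single mask word.

```shell
# cpu_mask: print the hex bitmask for the CPU numbers given as arguments.
# Hypothetical helper, for illustration only.
cpu_mask() {
    mask=0
    for cpu in "$@"; do
        # Set the bit that corresponds to this CPU number.
        mask=$(( mask | (1 << cpu) ))
    done
    printf '%x\n' "$mask"
}

cpu_mask 0 1    # prints 3

# As root, the result could then be written to the receive queue:
# cpu_mask 0 1 > /sys/class/net/eth1/queues/rx-0/rps_cpus
```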
Red Hat has a detailed guide on RPS.
To set the CPU affinity for redis to CPUs 2-8:
# config is set to write pid to /var/run/redis.pid
$ taskset -pc 2-8 `cat /var/run/redis.pid`
pid 8946's current affinity list: 0-8
pid 8946's new affinity list: 2-8
Below is an example of the performance boost from a stock compile on a temporary host:
| RPS Status | Get Operations/second | Set Operations/second |
Tuning the kernel network stack
To ensure that redis can handle a large number of connections in a high-performance environment, tuning the following kernel parameters is recommended.
vm.swappiness=0 # avoid swapping where possible
net.ipv4.tcp_sack=1 # enable selective acknowledgements
net.ipv4.tcp_timestamps=1 # enable TCP timestamps (RTT measurement, PAWS)
net.ipv4.tcp_window_scaling=1 # scale the network window
net.ipv4.tcp_congestion_control=cubic # better congestion algorithm
net.ipv4.tcp_syncookies=1 # enable SYN cookies
net.ipv4.tcp_tw_recycle=1 # recycle sockets quickly (breaks clients behind NAT; removed in Linux 4.12)
net.ipv4.tcp_max_syn_backlog=NUMBER # backlog setting
net.core.somaxconn=NUMBER # up the number of connections per port
net.core.rmem_max=NUMBER # up the receive buffer size
net.core.wmem_max=NUMBER # up the send buffer size
Tuning the kernel memory
Under high performance conditions we noticed the occasional blip in performance due to memory allocation. It turns out this was a known issue with transparent hugepages, which can be disabled as follows:
echo 'never' > /sys/kernel/mm/transparent_hugepage/enabled
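The file lists every available mode and marks the active one in square brackets, for example `always madvise [never]`. A value written with echo does not survive a reboot, so it can be worth checking the active mode from a startup or monitoring script. A small sketch, in which thp_mode is a hypothetical helper name:

```shell
# thp_mode: read the contents of the transparent_hugepage "enabled" file on
# stdin and print the active mode, i.e. the word inside the square brackets.
# Hypothetical helper, for illustration only.
thp_mode() {
    sed -n 's/.*\[\([a-z]*\)\].*/\1/p'
}

echo 'always madvise [never]' | thp_mode    # prints never

# On a live host:
# thp_mode < /sys/kernel/mm/transparent_hugepage/enabled
```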
Set file descriptor limits for the redis user
If you have not set the correct number of file descriptors for the redis user, you may see the following log lines:
13 Nov 07:24:14.514 # You requested maxclients of 10000 requiring at least 10032 max file descriptors.
13 Nov 07:24:14.514 # Redis can't set maximum open files to 10032 because of OS error: Operation not permitted.
13 Nov 07:24:14.514 # Current maximum open files is 1024. maxclients has been reduced to 4064 to compensate for low ulimit. If you need higher maxclients increase 'ulimit -n'.
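The log lines above show redis asking for 32 descriptors beyond maxclients (10000 clients need 10032). The sketch below computes the limit to request and compares it with a given limit. Both function names are hypothetical, and the figure of 32 is taken from this log output rather than from any guaranteed interface.

```shell
# required_fds MAXCLIENTS: descriptors redis asks for, per the log above.
required_fds() {
    echo $(( $1 + 32 ))
}

# fds_sufficient MAXCLIENTS LIMIT: succeed if LIMIT covers the requirement.
# LIMIT is a value as printed by 'ulimit -n', which may be "unlimited".
fds_sufficient() {
    case "$2" in
        unlimited) return 0 ;;
    esac
    [ "$2" -ge "$(required_fds "$1")" ]
}

required_fds 10000    # prints 10032

if fds_sufficient 10000 1024; then
    echo "limit is sufficient"
else
    echo "raise 'ulimit -n' for the redis user"
fi
```

On a live host the second argument would be "$(ulimit -n)", run as the redis user. On many Linux systems the limit itself is raised with a nofile entry in /etc/security/limits.conf or, for services started by systemd, with LimitNOFILE in the unit file. Which one applies depends on how redis is started.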
Disable saving redis to disk in redis.conf
Redis will attempt to persist the data to disk. While redis forks for this process, it still slows everything down.
Comment out the lines that start with save:
#save 900 1
#save 300 10
#save 60 10000
If you need to persist the data, run a slave and persist from there, as it will cause less of a slowdown on the master.
Set tcp-backlog in redis.conf
Newer versions of redis have their own backlog set to 511, and you will need this to be higher if you have many connections.
# TCP listen() backlog.
# In high requests-per-second environments you need a high backlog.
# Make sure to raise both the value of somaxconn and tcp_max_syn_backlog.
tcp-backlog 65536
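The reason both values need raising is that Linux silently caps the backlog passed to listen() at net.core.somaxconn, so a large tcp-backlog has no effect on its own. A minimal sketch of that check, with effective_backlog as a hypothetical helper name:

```shell
# effective_backlog TCP_BACKLOG SOMAXCONN: print the backlog the kernel will
# actually use, which is the smaller of the two values.
# Hypothetical helper, for illustration only.
effective_backlog() {
    if [ "$1" -lt "$2" ]; then
        echo "$1"
    else
        echo "$2"
    fi
}

effective_backlog 65536 128    # prints 128

# On a live host:
# effective_backlog 65536 "$(cat /proc/sys/net/core/somaxconn)"
```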
Set slave configs
# serve stale data if the sync is not complete
slave-serve-stale-data yes
# stop yourself from accidentally writing to the slave
slave-read-only yes
The default maxclients is 10000; if you have many connections you may need to go higher.
# Once the limit is reached Redis will close all the new connections sending
# an error 'max number of clients reached'.
maxclients 10000
By default redis is set to suck up all available memory on the box. We like to set maxmemory to 80% of the system memory using a facter fact. When running several instances of redis on a single machine this should be tuned down.
This setting can be changed on a running process.
# memory size in bytes
maxmemory 1288490188
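As a rough illustration of the 80% rule, the sketch below derives a maxmemory value from the total memory in kB. It reads the total from /proc/meminfo rather than from the facter fact mentioned above, and the function name maxmemory_bytes is hypothetical.

```shell
# maxmemory_bytes TOTAL_KB PERCENT: print PERCENT of TOTAL_KB, in bytes.
# Hypothetical helper, for illustration only.
maxmemory_bytes() {
    echo $(( $1 * 1024 * $2 / 100 ))
}

# 1572864 kB is 1.5 GiB; 80% of that matches the example value above.
maxmemory_bytes 1572864 80    # prints 1288490188

# On a live host, the value could be computed and applied to a running
# process with CONFIG SET:
# total_kb=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)
# redis-cli CONFIG SET maxmemory "$(maxmemory_bytes "$total_kb" 80)"
```

With several instances on one machine, the percentage passed in would be divided between them.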
Due to the mostly single-threaded nature of redis, it is often beneficial to run more than one instance per box. In this case be sure to set the CPU affinity separately for each instance.
Using Twemproxy/Nutcracker in front of several instances is a common way to spread out the keys by using consistent hashing.
The redis server will respond to the PING command when running properly:
$ redis-cli -h redis.example.com -p 6379 PING
PONG
Since we generally recommend setting the maxmemory size, it is possible to calculate the percentage of memory in use and alert based on the result:
$ redis-cli INFO |grep used_memory:
used_memory:424992
$ redis-cli config get maxmemory
1) "maxmemory"
2) "10000000"
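The two numbers above can be combined into a single percentage for alerting. This is a sketch only: info_field and mem_used_pct are hypothetical helper names, and the sample input is the output shown above plus one extra line to show that only the exact field name is matched.

```shell
# info_field NAME: read INFO output on stdin and print the value of NAME.
# INFO lines end in CRLF, so carriage returns are stripped first.
info_field() {
    tr -d '\r' | awk -F: -v key="$1" '$1 == key { print $2 }'
}

# mem_used_pct USED MAX: print used memory as an integer percentage of MAX.
# Fails without output when MAX is 0, which redis uses to mean "no limit".
mem_used_pct() {
    [ "$2" -gt 0 ] || return 1
    echo $(( $1 * 100 / $2 ))
}

used=$(printf 'used_memory:424992\r\nused_memory_human:415.03K\r\n' | info_field used_memory)
mem_used_pct "$used" 10000000    # prints 4

# On a live host:
# used=$(redis-cli INFO memory | info_field used_memory)
# max=$(redis-cli CONFIG GET maxmemory | tail -n 1)
# mem_used_pct "$used" "$max"
```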
Alert if uptime is less than you expect
$ redis-cli INFO |grep uptime_in_seconds
uptime_in_seconds:86514
Cache hit rate
This information can be calculated from the INFO command:
$ redis-cli INFO stats |grep keyspace
keyspace_hits:1920
keyspace_misses:930
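The hit rate is the number of hits divided by the total number of lookups (hits plus misses). A minimal sketch of that calculation using the counters shown above; hit_rate is a hypothetical helper name:

```shell
# hit_rate HITS MISSES: print the cache hit rate as a percentage with two
# decimal places. Prints 0.00 when there have been no lookups at all.
# Hypothetical helper, for illustration only.
hit_rate() {
    awk -v h="$1" -v m="$2" 'BEGIN {
        total = h + m
        if (total == 0) { print "0.00"; exit }
        printf "%.2f\n", h * 100 / total
    }'
}

hit_rate 1920 930    # prints 67.37
```

Both counters are cumulative, so the result is an average over the whole time since they were last reset. Graphing the change between two samples gives a better picture of the current hit rate.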
Key eviction and expiration
Eviction occurs when redis has reached its maximum memory and maxmemory-policy in redis.conf is set to something other than noeviction.
$ redis-cli INFO stats |grep evicted_keys
evicted_keys:11582
Keys in redis can be set with a time to live, which is generally a good practice:
$ redis-cli SET mykey myvalue EX 600
It is a good idea to keep an eye on the expirations to make sure redis is performing as expected:
$ redis-cli INFO stats |grep expired_keys
expired_keys:15436
We also recommend graphing the size of the keyspace, as a quick drop or spike in the number of keys is a good indicator of issues.
$ redis-cli INFO keyspace
# Keyspace
db0:keys=1075,expires=1075,avg_ttl=2110
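To graph a single number, the per-database lines can be summed into a total key count. A sketch under the assumption that each database line has the `dbN:keys=...,expires=...` shape shown above; total_keys is a hypothetical helper name:

```shell
# total_keys: read 'INFO keyspace' output on stdin and print the sum of the
# keys= counts across all databases (0 if there are none).
# Hypothetical helper, for illustration only.
total_keys() {
    tr -d '\r' | awk -F'[=,]' '
        /^db[0-9]+:keys=/ { sum += $2 }
        END { print sum + 0 }
    '
}

printf '# Keyspace\r\ndb0:keys=1075,expires=1075,avg_ttl=2110\r\n' | total_keys    # prints 1075

# On a live host:
# redis-cli INFO keyspace | total_keys
```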
The final two stats that we recommend graphing are the following, which indicate the workload placed on the redis server:
$ ~/tmp/redis-2.8.17/src/redis-cli INFO stats |egrep "^total_"
total_connections_received:70725
total_commands_processed:70723
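Both values are counters that only grow while the server is running, so a graph is usually more readable as a rate between two samples. A rough sketch of that conversion follows. The helper name per_second is hypothetical, and the second sample value and the 60 second interval are made up for illustration.

```shell
# per_second PREVIOUS CURRENT INTERVAL_SECONDS: print the average rate of
# increase of a counter between two samples, as an integer.
# If CURRENT is lower than PREVIOUS the server has restarted and the counter
# began again from zero, so CURRENT alone is used as the increase.
per_second() {
    prev=$1
    if [ "$2" -lt "$prev" ]; then
        prev=0
    fi
    echo $(( ($2 - prev) / $3 ))
}

# 70723 commands at the first sample, 76723 (illustrative) 60 seconds later.
per_second 70723 76723 60    # prints 100
```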
Redis is extremely useful in many types of environments, but can be a little daunting at first. By following these best practices, it can perform as a rock-solid caching layer to speed up your application.