We experimented with several tools to benchmark our LBaaS (HAProxy) performance, among them Tsung, iperf, and ab. After some discussion and investigation, we settled on providing two metrics to our customers for LBaaS:
- Requests/second
- Throughput
To get optimal performance out of our Apache servers and LBaaS, we applied some TCP optimizations on the Apache servers and on the hypervisor where our LBaaS resides:
# /etc/sysctl.conf
# Increase system file descriptor limit
fs.file-max = 100000
# Discourage Linux from swapping idle processes to disk (default = 60)
vm.swappiness = 10
# Increase the ephemeral port range
net.ipv4.ip_local_port_range = 10000 65000
# Increase Linux autotuning TCP buffer limits
# Set max to 16MB for 1GE and 32M (33554432) or 54M (56623104) for 10GE
# Don't set tcp_mem itself! Let the kernel scale it based on RAM.
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.core.rmem_default = 16777216
net.core.wmem_default = 16777216
net.core.optmem_max = 40960
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
# Make room for more TIME_WAIT sockets due to more clients,
# and allow them to be reused if we run out of sockets
# Also increase the max packet backlog
net.core.netdev_max_backlog = 50000
net.ipv4.tcp_max_syn_backlog = 30000
net.ipv4.tcp_max_tw_buckets = 2000000
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 1
# Disable TCP slow start on idle connections
net.ipv4.tcp_slow_start_after_idle = 0
# If your servers talk UDP, also up these limits
net.ipv4.udp_rmem_min = 8192
net.ipv4.udp_wmem_min = 8192
# Disable source routing and redirects
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.all.accept_source_route = 0
# Log packets with impossible addresses for security
net.ipv4.conf.all.log_martians = 1
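These settings take effect without a reboot. A quick way to load the file and spot-check a couple of values (run as root; keys as listed above):

```shell
# Reload everything in /etc/sysctl.conf (requires root)
sysctl -p

# Spot-check that the values took effect
sysctl -n net.core.rmem_max        # should print 16777216
sysctl -n net.ipv4.tcp_tw_reuse    # should print 1
```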
We also observed during our tests that the default HAProxy configuration limits the maximum concurrent connections on the load balancer to around 4,000. We modified haproxy.cfg to raise the open-file limit and maxconn so the load balancer can handle a large number of concurrent connections:
global
    daemon
    user nobody
    group nogroup
    ulimit-n 200000
    maxconn 50000
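As a sanity check on those two numbers: HAProxy needs roughly two file descriptors per proxied connection (one client-side socket, one server-side socket), so ulimit-n should comfortably exceed twice maxconn. A back-of-the-envelope check (the factor of 2 is the usual rule of thumb, ignoring listeners and checks):

```shell
maxconn=50000
ulimit_n=200000
# ~2 fds per proxied connection (client socket + server socket)
fds_needed=$(( 2 * maxconn ))
echo "need roughly ${fds_needed} fds, ulimit-n allows ${ulimit_n}"
[ "$fds_needed" -le "$ulimit_n" ] && echo "headroom OK"
```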
We ran a series of tests with the Apache Benchmark tool (ab) using different payload sizes: the standard "It works" index.html page (177 bytes), 1KB, 10KB, and 100KB. We also varied the number of parallel/concurrent connections while measuring performance. The following tables show the results we achieved. The test servers were 16 vCPU, 30GB large Apache2 instances.
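A run like those in the tables below can be reproduced along these lines; the hostname is a placeholder for the LBaaS VIP, and the payload files are just zero-filled blobs served by Apache:

```shell
# Generate the fixed-size payloads (1KB, 10KB, 100KB)
head -c 1024   /dev/zero > /tmp/1k.bin
head -c 10240  /dev/zero > /tmp/10k.bin
head -c 102400 /dev/zero > /tmp/100k.bin

# Copy them into the Apache docroot, then from the client box, e.g.:
#   ab -n 1000000 -c 100 http://lb.example.com/1k.bin     # HTTP
#   ab -n 1000000 -c 100 https://lb.example.com/1k.bin    # HTTPS
```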
HTTP Testing:
1) Requests/second to the Apache server (across 1 million requests and 3 iterations):

| Response size \ Parallel conns | 10 | 100 | 1K | 5K | 10K | 20K |
|---|---|---|---|---|---|---|
| 177 bytes (index) | 37459 | 35022 | 39330 | 37600 | 35237 | 28000 |
| 1K | 28000 | 34000 | 37000 | 36000 | 30833 | 26922 |
| 100K | 11031 | 10562 | 4100 | 4670 | 4433 | 3240 |
2) Requests/second through LBaaS with 2 backend servers (across 1 million requests and 3 iterations):

| Response size \ Parallel conns | 10 | 100 | 1K | 5K | 10K | 20K |
|---|---|---|---|---|---|---|
| 177 bytes (index) | 20812 | 49200 | 40354 | 37000 | 37000 | 36000 |
| 1K | 20082 | 47800 | 42000 | 35069 | 31404 | 34415 |
| 100K | 5026 | 6471 | 5682 | 4716 | 4781 | 5148 |
SSL Testing:
1) HTTPS requests/second to the Apache server (across 1 million requests and 3 iterations):

| Response size \ Parallel conns | 10 | 100 | 1K | 5K | 10K | 20K |
|---|---|---|---|---|---|---|
| 177 bytes (index) | 21878 | 24596 | 24097 | 21088 | 20095 | 19034 |
| 1K | 19055 | 22517 | 20069 | 18657 | 18498 | 17633 |
| 100K | 2022 | 2322 | 2467 | 1967 | 1944 | 1768 |
2) HTTPS requests/second through LBaaS with 2 backend servers (across 1 million requests and 3 iterations):

| Response size \ Parallel conns | 10 | 100 | 1K | 5K | 10K | 20K |
|---|---|---|---|---|---|---|
| 177 bytes (index) | 20071 | 26510 | 19210 | 20633 | 19435 | 19735 |
| 1K | 18647 | 24895 | 19849 | 19344 | 18678 | 18498 |
| 100K | 1987 | 2452 | 2322 | 1988 | 2010 | 2068 |
Throughput Testing:
With the LBaaS sitting between the client and the server, there is inevitably some impact on the maximum throughput achievable over the wire. The following are the throughputs we observed:
1) Throughput between client and server without LBaaS (measured over 30s of iperf traffic):

| Parallel streams | 1 | 10 | 100 | 1000 | 10000 |
|---|---|---|---|---|---|
| Throughput | 8.55 Gbps | 8.81 Gbps | 7.38 Gbps | 6 Gbps | 4.6 Gbps |
2) Throughput between client and server with LBaaS (measured over 30s of iperf traffic):

| Parallel streams | 1 | 10 | 100 | 1000 | 10000 |
|---|---|---|---|---|---|
| Throughput | 3.60 Gbps | 3.43 Gbps | 3.42 Gbps | 3.09 Gbps | 3.90 Gbps |
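Runs like these would be along the lines of `iperf -s` on the server and `iperf -c <lb-vip> -t 30 -P <parallel>` on the client (hostnames as placeholders). From the single-stream numbers above, the cost of inserting the LBaaS works out to roughly:

```shell
# 8.55 Gbps direct vs 3.60 Gbps through LBaaS, scaled x100 for integer math
direct=855
lbaas=360
echo "single-stream throughput drop: $(( (direct - lbaas) * 100 / direct ))%"
```

That is, inserting the unmodified LBaaS costs over half the single-stream throughput, which motivates the nbproc change below.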
We did some tweaking of the HAProxy config (haproxy.cfg) and found that increasing the number of HAProxy processes improves throughput significantly. The config change was:
global
    daemon
    user nobody
    group nogroup
    ulimit-n 200000
    maxconn 50000
    nbproc 4
Throughput between client and server with LBaaS and nbproc 4 (measured over 30s of iperf traffic):

| Parallel streams | 1 | 10 | 100 | 1000 | 10000 |
|---|---|---|---|---|---|
| Throughput | 3.23 Gbps | 7.28 Gbps | 6.81 Gbps | 6.78 Gbps | 6.33 Gbps |
Please note that this config change is not yet in the code; we would need a change in the vrouter agent to handle this new setting.