Oh Noes, my load is high! Part 2

We have seen what server load consists of, and now we will look at how to reduce it. How do we define high? Please see the Server load link here. Server load can be elevated for many reasons, and we will look at a few ways to determine where the increased load is coming from and how to deal with it.

As a rule of thumb, you should have 1Gig of RAM for each processor on the server.
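To check a box against that rule of thumb, something like the following sketch may help. The helper name mb_per_cpu is made up for this example, and the commented usage assumes a Linux server with /proc/meminfo and /proc/cpuinfo.

```shell
# Hypothetical helper (not from any package): print whole MB of RAM per processor.
# $1 = total memory in kB (the MemTotal figure), $2 = number of processors
mb_per_cpu() {
    echo $(( $1 / 1024 / $2 ))
}

# Possible usage on a Linux server (left commented out so the sketch has no side effects):
# mb_per_cpu "$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)" "$(grep -c '^processor' /proc/cpuinfo)"
```

A result well under 1024 suggests the server is short on RAM for its processor count. MemTotal usually reads a little below the installed amount, so treat a near miss as a pass.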

First, let's make sure we have the tools to investigate the problem:

htop: install it now. It will save you many headaches in the long run by letting you see where the load is coming from and deal with it accordingly.
wget -O htop-0.8.1.tar.gz "http://downloads.sourceforge.net/htop/htop-0.8.1.tar.gz?use_mirror=superb-east"
tar -xzvf htop-0.8.1.tar.gz
cd htop-0.8.1
./configure
make
make install

Let's start by determining which service is causing the load. Run htop, then press F6 to sort by memory usage, then by CPU usage and finally by time. Make a note of the top processes listed under each sort order. We will be using these momentarily.

If you see that Apache (httpd) is using the most resources, then let's fire up an SSH session to search for the culprit.
netstat -tn 2>/dev/null | grep :80
This lists every connection to the web server, but the output is large and unwieldy, so let's refine it:
netstat -tn 2>/dev/null | grep :80 | awk '{print $5}' | cut -f1 -d: | sort | uniq -c | sort -rn | head
This will give a more concise view of who is connecting to the server and how many connections they have open.
122 69.252.110.237
2 98.241.142.113
1 81.184.31.154
1 71.173.235.103
1 208.80.194.39
If you see a large number of connections from a specific IP, temporarily block the IP with APF and see if the load starts dropping. Many times I have seen this happen when someone tries to scan or download the whole site. It can also happen when a search bot (such as Googlebot) crawls the site. Run nslookup on the IP to see where it's coming from before you decide.
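If you would rather not eyeball the list, a rough sketch like this prints only the IPs at or above a connection threshold. The function name heavy_hitters and the default threshold of 20 are my own assumptions, and it only handles plain IPv4 addresses as printed by netstat -tn.

```shell
# Hypothetical helper: read `netstat -tn` output on stdin and print "count ip"
# for each remote IPv4 address with at least $1 connections to local port 80.
# The default threshold of 20 is an arbitrary assumption; tune it for your traffic.
heavy_hitters() {
    awk -v t="${1:-20}" '
        $4 ~ /:80$/ { split($5, a, ":"); count[a[1]]++ }   # $4 = local addr, $5 = remote addr
        END { for (ip in count) if (count[ip] >= t) print count[ip], ip }
    ' | sort -rn
}

# Possible usage (commented out so the sketch has no side effects):
# netstat -tn 2>/dev/null | heavy_hitters 50
```

Anything this prints is a candidate for the nslookup check before you block it.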

If the number of connections seems reasonable, let's look for the site with the most connections, as the load may be related to a PHP script:
/usr/bin/lynx -dump -width 500 http://127.0.0.1/whm-server-status | awk 'BEGIN { FS = " " } ; { print $12 }' | sed '/^$/d' | sort | uniq -c
This will give you output something like:
1 Port
1 VHost
96 g33kinfo.com
100 host.server.com
1 process
1 slot
Now let's take a look at the busiest script and see if it matches the site:
/usr/bin/lynx -dump -width 500 http://127.0.0.1/whm-server-status | grep GET | awk '{print $14}' | sort | uniq -c | sort -rn | head
will output something like
25 /info/?p=327
17 /info/?p=298
13 /info/?p=1024
10 /info/?p=327
3 /whm-server-status?auto
1 /whm-server-status
and it looks like it does. Take a look at the code behind “/info/?p=327” on that site and see what is causing the load. You may need a developer to determine what is going wrong. You can also ask your host to optimize the Apache conf file for you. There are generic configurations for the amount of RAM and number of CPUs available per server, and these should be well documented on their end if they are worth their salt.
/usr/bin/lynx -dump -width 500 http://127.0.0.1/whm-server-status
will also give more information regarding apache status.

If MySQL is causing the issue and is spawning hundreds of child processes, your queries may be malformed or loops may be present. We will not delve into the intricacies of MySQL here because it is too broad a topic to cover in so small a space. Let's start by gathering some MySQL information:
mysqladmin proc stat
output will look something like;
Uptime: 457478 Threads: 1 Questions: 165272 Slow queries: 0 Opens: 322 Flush tables: 1 Open tables: 128 Queries per second avg: 0.361
If you see a lot of slow queries, it is time to set up a slow query log to record the queries that are causing the issue.
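To pull just the slow query figure out of that status line, a small sketch along these lines should do. The function name slow_query_count is made up for this example, and it assumes the "Slow queries: N" wording shown in the sample output above.

```shell
# Hypothetical helper: read `mysqladmin status` output on stdin
# and print only the number that follows "Slow queries:".
slow_query_count() {
    sed -n 's/.*Slow queries: *\([0-9][0-9]*\).*/\1/p'
}

# Possible usage (commented out so the sketch has no side effects):
# mysqladmin status | slow_query_count
```

Running it twice a minute or so apart and comparing the two numbers shows whether slow queries are piling up right now or are left over from earlier in the uptime.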
mysqladmin processlist |wc -l
The output here tells you roughly how many connections MySQL currently has (the line count also includes a few lines of table border and header). If this number is also large, the code may be the issue. Check the current number of Apache connections and compare it to the number of MySQL connections. Let me also state here that a Firefox add-on like YSlow or Google’s Page Speed will be invaluable in determining where the issue may be.
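Since wc -l also counts the table borders and header row that mysqladmin prints, a sketch like this gives a cleaner count. The function name mysql_connection_count is my own, and it assumes the usual pipe-delimited table layout with a numeric Id in the first column.

```shell
# Hypothetical helper: read `mysqladmin processlist` output on stdin and count
# only the data rows, i.e. lines whose first column is a numeric connection Id.
mysql_connection_count() {
    awk -F'|' '$2 ~ /^ *[0-9]+ *$/ { n++ } END { print n + 0 }'
}

# Possible usage (commented out so the sketch has no side effects):
# mysqladmin processlist | mysql_connection_count
```

Compare the result with the Apache connection count from the netstat commands earlier. A MySQL count far above the Apache count hints at code that opens connections and never closes them.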
ps xa | grep mysqld
this will show the type and number of mysql processes currently running on the server.
netstat -lntpe | grep mysql
shows something like
tcp 0 0 0.0.0.0:3306 0.0.0.0:* LISTEN 100 3477227538 5559/mysqld
A possible solution may also be to tune /etc/my.cnf to better fit your server and workload. Your host should have basic my.cnf configurations to start from, and they can tweak from there.

Enable Worker as opposed to Prefork

To enable worker, uncomment line 9 in /etc/sysconfig/httpd and then restart Apache.
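If you prefer to script that change, here is a rough sketch. It assumes the commented line reads #HTTPD=/usr/sbin/httpd.worker, as I would expect on a stock CentOS install, so check your own file first. The function name enable_worker is made up for this example.

```shell
# Hypothetical helper: uncomment the worker line in the given sysconfig file.
# Assumes the line reads "#HTTPD=/usr/sbin/httpd.worker"; verify before running.
# A copy of the original is kept next to it with a .bak suffix.
enable_worker() {
    conf=$1
    cp "$conf" "$conf.bak" &&
        sed 's|^#HTTPD=/usr/sbin/httpd.worker|HTTPD=/usr/sbin/httpd.worker|' "$conf.bak" > "$conf"
}

# Possible usage as root (commented out so the sketch has no side effects):
# enable_worker /etc/sysconfig/httpd && service httpd restart
```

If the line in your file looks different, the sed pattern simply matches nothing and the file is left unchanged.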
Note: CentOS won’t let you run PHP as a DSO with worker.

Most PHP developers raise a valid concern when they explain why it's better to run PHP with MPM_prefork instead of MPM_worker. They say that without separate execution threads and separate memory segments, PHP could cause crashes. I believe this is a valid concern, especially in a test environment, but in a high traffic production situation I feel that these risks are outweighed by the benefits of running MPM_worker.

Prefork is older technology, based off of the stable Apache 1.3. Prefork is also the default MPM for Apache 2.x. It works, but in my experience it doesn’t handle large traffic loads as well as Worker. Servers that were running prefork and choking on the traffic load run more stably with Worker, and I often see an increase in the number of requests Apache can serve per second. Also, in my experience, if PHP is going to crash a server due to bad code or what have you, the type of web server matters very little. I’ve seen PHP crash a box running lighttpd because the PHP code being run on it was poor.

I feel that MPM_worker is better suited for a server that expects to receive higher levels of traffic. If you are concerned about potential memory leaks with Worker, I have two recommendations. One is to run PHP not as an Apache module, but as CGI, FastCGI or suPHP. All of these options let PHP run as its own process, as the PHP developers recommend. The other is to set the Apache MaxRequestsPerChild directive to a low, non-zero number. MaxRequestsPerChild limits how many requests an Apache child process will handle before it is killed and a new process is spawned. Setting it to a low value forces Apache to respawn processes more often, which keeps a leaking process from turning into a memory sink.
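Before changing MaxRequestsPerChild, it helps to see what it is set to now. This is a minimal sketch; the function name max_requests_per_child is made up, and the conf path in the usage line is an assumption (the usual CentOS location).

```shell
# Hypothetical helper: print every active MaxRequestsPerChild value in an Apache
# conf file. Commented-out lines are skipped because their first field starts with "#".
max_requests_per_child() {
    awk 'tolower($1) == "maxrequestsperchild" { print $2 }' "$1"
}

# Possible usage (commented out; the path is an assumption):
# max_requests_per_child /etc/httpd/conf/httpd.conf
```

A stock conf file may print more than one value, since the prefork and worker sections can each carry their own setting. A value of 0 means a child process is never recycled.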

If you are using a VPS, another instance may be causing the parent to swap by using more memory than it is allowed, and this will affect your instance and slow down the server overall.

To benchmark your server, test Apache from a remote server:
ab -n 1000 -c 5 https://g33kinfo.com/info/
from localhost:
ab -kc 10 -t 30 http://localhost/
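ab prints a long report, and the one figure worth tracking before and after a change (such as the switch from prefork to worker) is the requests per second line. A small sketch for extracting it follows. The function name requests_per_second is my own.

```shell
# Hypothetical helper: read an ab report on stdin and print the mean
# requests-per-second figure from the "Requests per second:" line.
requests_per_second() {
    awk '/^Requests per second:/ { print $4 }'
}

# Possible usage (commented out so the sketch has no side effects):
# ab -kc 10 -t 30 http://localhost/ | requests_per_second
```

Run it a few times and compare the numbers, since a single run can be skewed by whatever else the server is doing at that moment.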
More information on benchmarking your server can be found here:
http://www.cyberciti.biz/
http://www.devside.net/

Final words:
IMHO, if you are running a thousand sites on a low end production server (1-2 CPUs) with 2Gigs of RAM, please do not wonder why your load is high. Use that large protrusion on top of your neck and realize that spending 52 dollars a month and having constant headaches with a poorly performing server is worse than spending 98 dollars a month on a server where you do not need to worry about high loads!
Just my 2 cents worth…