Note: I have written a Python script to check a system and report back on any potential out-of-memory issues:

  curl -s https://raw.githubusercontent.com/LukeShirnia/out-of-memory-investigation.py/master/oom.py | python

== Step 1 ==

Did Apache hit MaxClients?

CentOS/RHEL:

  grep -i maxc /var/log/httpd/error_log

Ubuntu/Debian:

  grep -i maxc /var/log/apache2/error_log

Note: On Ubuntu/Debian the file may be named error.log rather than error_log. On Apache 2.4 the setting is called MaxRequestWorkers, so the pattern maxr may be needed instead of maxc.

== Step 2 ==

Did the server run out of memory and kill a process belonging to a vital service?

CentOS/RHEL:

  grep -Ei 'killed process|invoked oom' /var/log/messages

Ubuntu/Debian:

  grep -Ei 'killed process|invoked oom' /var/log/syslog

Summarise the dates and times the server ran out of memory (CentOS/RHEL). The first command groups by date and hour, the second by date and killed process name:

  zgrep -i 'killed process\|invoked oom' /var/log/messages* | awk '/Killed process/ {print $1, $2, $3}' | awk -F: '{print $2}' | uniq -c | sort -k2,3r
  zgrep -i 'killed process\|invoked oom' /var/log/messages* | awk '/Killed process/ {print $1, $2, $11}' | awk -F: '{print $2}' | uniq -c | sort -k2,3

Example output:

  2 Apr 25 (driveclient)
  1 Apr 27 (driveclient)
  2 Apr 29 (driveclient)
  1 May 12 (driveclient)
  1 May 2 (driveclient)
  1 May 22 (driveclient)
  1 May 5 (driveclient)
  1 May 9 (driveclient)

Debian/Ubuntu:

  zgrep -i 'killed process\|invoked oom' /var/log/syslog* | awk '/Killed process/ {print $1, $2, $3}' | awk -F: '{print $2}' | uniq -c

== Step 3 ==

Check to see if the website received traffic during the period of 'down time'. This is useful when investigating a 'site down' issue.
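When checking whether the site received traffic during the suspected downtime, a per-minute request count can make gaps easy to spot. The following is only a minimal sketch: the function name requests_per_minute is hypothetical, and it assumes uncompressed logs in the Apache common/combined format, where the fourth field looks like [27/Aug/2015:04:31:07.

```shell
#!/bin/sh
# Sketch: count requests per minute for a given date/hour prefix.
# Assumption: Apache common/combined log format, so field 4 is
# "[DD/Mon/YYYY:HH:MM:SS". The function name is hypothetical.
requests_per_minute() {
    prefix="$1"   # e.g. "27/Aug/2015:04"
    shift         # remaining arguments are uncompressed log files
    # Keep lines whose timestamp starts with the prefix, cut the
    # timestamp down to DD/Mon/YYYY:HH:MM, then count per minute.
    awk -v p="[$prefix" 'index($4, p) == 1 { print substr($4, 2, 17) }' "$@" \
        | sort | uniq -c
}

# Usage (path is an example):
#   requests_per_minute "27/Aug/2015:04" /var/log/httpd/access_log
```

Minutes that are missing from the output had no logged requests, which may indicate that the web server was not serving traffic at that time.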
The best command to check all logs is:

  cat /var/log/httpd/*access*log* | grep "27/Aug/2015:04:[3-5][0-9]" | awk '{print $1}' | sort | uniq -c | sort -rn
  cat /var/log/httpd/*access*log* | grep "22/Aug/2015:04:[3-5]" | awk '{print $1}' | sort | uniq -c | sort -n

Checking old (compressed) log files:

  zcat /var/log/httpd/*access*log*.gz | grep "22/Aug/2015:04:[3-5][0-9]" | awk '{print $1}' | sort | uniq -c | sort -n

To list each request with its IP address, time and status code:

  grep "24/Apr/2015:02:[3-5][0-9]" /var/log/httpd/access_log | awk '{print "IPaddress", $1,"Time",$4, $9}'

This will grep for the date 24/Apr/2015 between 02:30 and 02:59.

E.g:

  IPaddress x.x.x.x Time 24/Apr/2015:02:36:01 200
  IPaddress x.x.x.x Time 24/Apr/2015:02:36:11 200
  IPaddress x.x.x.x Time 24/Apr/2015:02:36:11 200
  IPaddress x.x.x.x Time 24/Apr/2015:02:36:12 200

The following summarises all of the IP addresses hitting the access logs during a certain date/time:

  grep "12/Jun/2015:12" access_log | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 20

E.g:

  492 x.x.x.1
  497 x.x.x.15
  501 x.x.x.158
  504 x.x.x.19
  517 x.x.x.201
  518 x.x.x.122

See what a specific IP address is doing (access logs) during a specific date/time:

  grep "12/Jun/2015:12" access_log | grep x.x.x.11 | tail -n 20

  x.x.x.11 - - [12/Jun/2015:12:43:05 +0500] "POST /newsite/stats/recording.php HTTP/1.1" 200 407 "-" "-"
  x.x.x.11 - - [12/Jun/2015:12:42:50 +0500] "POST /newsite/api2/index.php HTTP/1.1" 200 1238 "-" "Mozilla/5.0 (Windows NT 5.1; rv:18.0) Gecko/20100101 Firefox/18.0 REQUESTED FROM: http://www.example.com/profolio/index.php?tabs=2&section=add_property REF: http://www.example.com/profolio/index.php?tabs=2&section=add_property API KEY:
  x.x.x.11 - - [12/Jun/2015:12:43:05 +0500] "POST /newsite/api/index.php HTTP/1.1" 200 3697 "-" "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2 GTB5"
  x.x.x.11 - - [12/Jun/2015:12:43:06 +0500] "POST /newsite/api2/index.php HTTP/1.1" 200 766 "-" "Mozilla/5.0 (Windows NT 5.1; rv:18.0) Gecko/20100101 Firefox/18.0 REQUESTED FROM: http://www.example.com/profolio/includes/add_property_single.php? REF: http://www.example.com/profolio

The following shows which IPs connected to the server during a time period. If no requests were logged in that period, the server may have been down at the time:

  grep "16/Jun/2015:08:[0-2][0-9]:*" /var/log/httpd/access_log | awk '{print $1}' | sort | uniq -c | sort -nr

The following command shows all IP addresses accessing all domains on the server for a specific date/time:

  cat /var/www/vhosts/*/logs/*log | grep "20/Jul/2015:19" | awk '{print $1}' | sort | uniq -c | sort -n
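The per-IP summaries above all follow the same pattern, so they could be wrapped in a small helper that also reads rotated .gz files. This is a sketch, not a definitive tool: the function names read_logs and top_ips are hypothetical, and the Apache common/combined log format is assumed (client IP in field 1, timestamp in field 4).

```shell
#!/bin/sh
# Sketch: list the busiest client IPs for a timestamp prefix across
# plain and gzip-compressed access logs.
# Assumptions: Apache common/combined log format; hypothetical names.

# Print the contents of each log file, decompressing *.gz files.
read_logs() {
    for f in "$@"; do
        case "$f" in
            *.gz) gzip -cd -- "$f" ;;
            *)    cat -- "$f" ;;
        esac
    done
}

# top_ips PREFIX LIMIT FILE...
#   PREFIX  timestamp prefix, e.g. "20/Jul/2015:19"
#   LIMIT   number of IP addresses to show
top_ips() {
    prefix="$1"
    limit="$2"
    shift 2
    read_logs "$@" \
        | awk -v p="[$prefix" 'index($4, p) == 1 { print $1 }' \
        | sort | uniq -c | sort -rn | head -n "$limit"
}

# Usage (paths are examples):
#   top_ips "20/Jul/2015:19" 20 /var/log/httpd/*access*log*
```

Matching the prefix as a literal string with index() avoids the regex pitfalls of bracket expressions, at the cost of only supporting a fixed prefix (a date, a date and hour, or a date, hour and minute) rather than an arbitrary minute range.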