I recently needed a really, really simple remote monitoring agent to keep track of server availability. Actually, I didn’t even need a record of uptime or anything; I merely needed to be alerted if any of a few web servers had gone down unexpectedly. So, I wrote one.
I know there are lots of other solutions for this problem, many of which are free. You can do this with Nagios, for example. But I don’t have an existing Nagios installation already set up, and I didn’t want to deploy a monstrous beast of a platform of which I needed only 0.1% of the capabilities. There are also many web-based services that will do this for you, but most of these have a pretty large “check” interval, or else they cost money. I was feeling both cheap and picky, and I knew everything I needed to do could be accomplished using cron, bash, and wget. It works great, so I wanted to share what I came up with.
Here is a basic feature list:
- Requires bash, wget, and working “mail” command (sendmail, exim, postfix, etc.)
- Monitors any HTTP or HTTPS URL, checking for “200” status returned
- Checks request fulfillment time for slow responses, not just UP/DOWN
- Sends email notification on state change (UP, SLOW, or DOWN)
- Customizable To/From notification settings
- Customizable SLOW latency threshold
- Tracks previous state change and last update to avoid multiple notifications
- Uses single text file for data storage, no DB necessary
It’s pretty basic, but it works. Just save this script somewhere, make it executable, modify the HOSTS and other settings to taste, and add a cron job definition to run it every five minutes or so. No Nagios or third-party solutions required. If this is what you’re looking for, then…awesome!
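To make the UP/SLOW/DOWN idea from the feature list concrete, here is a minimal sketch of the decision on its own, fed with canned header text rather than a live request. The function names `last_status_code` and `classify` are made up for this illustration and are not part of sitemonitor.sh; the sketch also assumes that only a "200" counts as OK.

```shell
#!/bin/bash
# Hypothetical helpers (not part of sitemonitor.sh) that mirror its
# UP/SLOW/DOWN decision.

# Read "wget -S" style header text on stdin and print the last HTTP status
# code seen, or 0 if there is none. After redirects, the last code is the
# one for the final response.
last_status_code() {
    awk '/^ *HTTP\//{code=$2} END{print (code == "" ? 0 : code)}'
}

# Print UP, SLOW, or DOWN for a status code, a latency in seconds, and a
# slow threshold in seconds. Only "200" is treated as OK in this sketch.
classify() {
    local code=$1 latency=$2 threshold=$3
    if [ "$code" != "200" ]; then
        echo "DOWN"
    elif [ "$latency" -lt "$threshold" ]; then
        echo "UP"
    else
        echo "SLOW"
    fi
}

# Usage with canned headers (a redirect followed by a 200):
headers=$'  HTTP/1.1 301 Moved Permanently\n  Location: https://www.example.com/\n  HTTP/1.1 200 OK'
code=$(printf '%s\n' "$headers" | last_status_code)
classify "$code" 3 10
```

A latency equal to the threshold counts as SLOW here, matching the "less than" comparison the script uses.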
Sample cron job definition (for /etc/crontab or a file in /etc/cron.d, where the sixth field is the user to run as)
*/5 * * * * root /home/username/sitemonitor.sh
sitemonitor.sh script source
#!/bin/bash
# Simple HTTP/S availability notifications script v0.1
# March 20, 2012 by Jeff Rowberg - http://www.sectorfej.net
HOSTS=( \
"http://www.yahoo.com" \
"https://www.google.com" \
"http://www.amazon.com" \
)
NOTIFY_FROM_EMAIL="Site Monitoring <monitoring@example.com>"
NOTIFY_TO_EMAIL="Server Admin <admin@example.com>"
STATUS_FILE="/home/username/sitemonitor.status"
SLOW_THRESHOLD=10
OK_STATUSES=( "200" )
################################################################
# NO MORE USERMOD STUFF BELOW THIS, MOST LIKELY #
################################################################
# thanks to stackoverflow!
# stackoverflow.com/questions/3685970/bash-check-if-an-array-contains-a-value
function contains() {
    local n=$#
    local value=${!n}
    local i
    for ((i=1; i < $#; i++)); do
        if [ "${!i}" == "${value}" ]; then
            echo "y"
            return 0
        fi
    done
    echo "n"
    return 1
}
rm -f /tmp/sitemonitor.status.tmp
for HOST in "${HOSTS[@]}"
do
    START=$(date +%s)
    # wget -S prints the response headers on stderr; the body is discarded.
    # Keep only the last status code so a redirect (301, then 200) is
    # judged by its final response.
    RESPONSE=$(wget "$HOST" --no-check-certificate -S -q -O /dev/null 2>&1 | \
        awk '/^ +HTTP/{code=$2} END{print code}')
    END=$(date +%s)
    DIFF=$(( END - START ))
    if [ -z "$RESPONSE" ]; then
        RESPONSE="0"
    fi
    if [ "$(contains "${OK_STATUSES[@]}" "$RESPONSE")" == "y" ]; then
        if [ "$DIFF" -lt "$SLOW_THRESHOLD" ]; then
            STATUS="UP"
        else
            STATUS="SLOW"
        fi
    else
        STATUS="DOWN"
    fi
    touch "$STATUS_FILE"
    # find this host's previous record by exact match on the first field
    STATUS_LINE=$(awk -v h="$HOST" '$1 == h {print; exit}' "$STATUS_FILE")
    read -r -a STATUS_PARTS <<< "$STATUS_LINE"
    CHANGED=${STATUS_PARTS[2]}
    if [ "$STATUS" != "${STATUS_PARTS[5]}" ]; then
        #if [ -e "${STATUS_PARTS[5]}" ] || [ "$STATUS" != "UP" ]; then
        if [ -z "${STATUS_PARTS[5]}" ]; then
            STATUS_PARTS[5]="No record"
        fi
        TIME=$(date -d @"$END")
        echo "Time: $TIME" > /tmp/sitemonitor.email.tmp
        echo "Host: $HOST" >> /tmp/sitemonitor.email.tmp
        echo "Status: $STATUS" >> /tmp/sitemonitor.email.tmp
        echo "Latency: $DIFF sec" >> /tmp/sitemonitor.email.tmp
        echo "Previous status: ${STATUS_PARTS[5]}" >> /tmp/sitemonitor.email.tmp
        if [ -z "${STATUS_PARTS[2]}" ]; then
            TIME="No record"
        else
            TIME=$(date -d @"${STATUS_PARTS[2]}")
        fi
        echo "Previous change: $TIME" >> /tmp/sitemonitor.email.tmp
        # note: -a sets a header in bsd-mailx and GNU mailutils;
        # other mail clients may differ
        mail -a "From: $NOTIFY_FROM_EMAIL" \
            -s "SiteMonitor Notification: $HOST is $STATUS" \
            "$NOTIFY_TO_EMAIL" < /tmp/sitemonitor.email.tmp
        rm -f /tmp/sitemonitor.email.tmp
        #else
        # first report, but host is up, so no need to notify
        #fi
        CHANGED="$END"
    fi
    echo "$HOST $RESPONSE $CHANGED $END $DIFF $STATUS" >> /tmp/sitemonitor.status.tmp
done
mv /tmp/sitemonitor.status.tmp "$STATUS_FILE"
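Each run leaves one line per host in the status file, in the order URL, response code, time of last change, time of last check, latency, and status. The sketch below shows one way to read a record back and decide whether a notification is due. The helper names `previous_status` and `status_changed` and the throwaway demo file are assumptions for illustration, not part of the script above.

```shell
#!/bin/bash
# Hypothetical helpers for reading a status file whose lines look like:
#   URL RESPONSE CHANGED LAST_CHECK LATENCY STATUS

# Print the previously recorded status for a URL (empty if there is none).
previous_status() {
    local url=$1 file=$2 line
    local -a parts
    [ -f "$file" ] || return 0
    # exact match on the first field, so a URL never matches a longer one
    line=$(awk -v u="$url" '$1 == u {print; exit}' "$file")
    read -r -a parts <<< "$line"
    echo "${parts[5]}"
}

# Succeed (exit 0) when the new status differs from the recorded one,
# which is the condition under which the script sends an email.
status_changed() {
    local url=$1 new_status=$2 file=$3
    [ "$new_status" != "$(previous_status "$url" "$file")" ]
}

# Usage against a throwaway status file:
demo_file=$(mktemp)
echo "http://www.example.com 200 1332515120 1332523202 0 UP" > "$demo_file"
if status_changed "http://www.example.com" "DOWN" "$demo_file"; then
    echo "state changed, would notify"
fi
rm -f "$demo_file"
```

Because a host with no record has an empty previous status, its first check always counts as a change, so the first run sends one email per host.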
Sample notification email
Time: Fri Mar 23 17:20:02 UTC 2012
Host: http://www.example.com
Status: UP
Latency: 0 sec
Previous status: DOWN
Previous change: Fri Mar 23 15:05:20 UTC 2012
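If you would rather not build the message through a temporary file, the same six fields can be produced with a here-document. This is only a sketch: `build_body` is a hypothetical helper that is not in the script, and it expects the caller to pass already-formatted strings.

```shell
#!/bin/bash
# Hypothetical helper: print a notification body with the same fields, in
# the same order, as the sample email. Arguments: time, host, status,
# latency in seconds, previous status, previous change time.
build_body() {
    cat <<EOF
Time: $1
Host: $2
Status: $3
Latency: $4 sec
Previous status: $5
Previous change: $6
EOF
}

# Usage with the values from the sample email:
build_body "Fri Mar 23 17:20:02 UTC 2012" "http://www.example.com" "UP" "0" \
    "DOWN" "Fri Mar 23 15:05:20 UTC 2012"
```

The output could then be piped straight into the mail command in place of the redirect from /tmp/sitemonitor.email.tmp.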
Pretty simple stuff, but it totally gets the job done, and you can have it check on any interval and as many servers as you want! That’s hard to beat at this level of simplicity and freedom.