Let’s face it, websites and web servers can experience issues from time to time. Common issues include plugin incompatibilities, expired SSL certificates, a dreaded website hack or maybe just poor server performance. It’s important to keep an eye on sites to make sure they’re performing at their best. There are some great services out there for this task, like Pingdom, New Relic, etc., but today we’re going to do something a little different and write a custom Bash script to keep an eye on things for us.
Let’s start by coming up with a few requirements for our script. At a minimum it must:
- Check a website to see if it is working correctly
- Check the website on a schedule
- Report on the website status so that we can follow up with any issues in a timely manner
For a couple of added bonuses, we’ll add in:
- Check multiple websites
- Multiple alerts – email & text message
- Notification throttling – 1 notification per hour
- Quiet hours – so we don’t get text messages all night
- Slow site monitoring
Let me preface this article by stating that I am a PHP developer, with limited experience with shell scripts. There may be better ways to do this, so use it as a general guide.
The Basic Script
Doing a little Google searching you’ll find a plethora of Bash scripts for doing exactly what we want. We’ll start with a basic version below – I can’t find the original inspiration for this at the moment, so if you have one feel free to drop a comment.
monitor.sh
#!/bin/bash
response=$(curl -s -o /dev/null -w "%{http_code}" https://example.domain)
if [ "${response}" == "200" ]; then
echo "Website is up and working!"
else
echo "Website is down!"
fi
In this example, we’re using cURL on Ubuntu 24.04. If it’s not installed, you can install it with “sudo apt install curl”, assuming you have the correct privileges.
The logic is pretty self explanatory, but here’s a breakdown of the above cURL command:
curl # The cURL command
-s # Tells cURL to run silent
-o /dev/null # Redirects the output to /dev/null (a blackhole that discards it)
-w "%{http_code}" # Writes output following the specified format
URL # The URL we're checking
The cURL command stores the response code, and we check whether it matches a status of 200. For more detail, see MDN’s reference on HTTP response status codes and the cURL documentation on its --write-out variables.
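To see what that write-out format produces, you can run the cURL command by itself in a terminal. Against a healthy site, it prints just the status code (example.domain is a placeholder, so your result may differ):
curl -s -o /dev/null -w "%{http_code}" https://example.domain
200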
That’s a great start.
Checking On A Schedule
To run our monitor.sh script, we’ll set up a cron job on the server and run it every 10 minutes.
crontab -e
Add the path to your script to a new line in the crontab file.
*/10 * * * * /path/to/your/monitor.sh >/dev/null 2>&1
Save the crontab file and your script should now be running every 10 minutes. Stdout is redirected to a blackhole (>/dev/null), and stderr is redirected to stdout (2>&1), so both are discarded.
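As a variation, if you’d rather keep a record than discard everything, you can append both streams to a log file instead (the log path here is just an example):
*/10 * * * * /path/to/your/monitor.sh >> /var/log/monitor.log 2>&1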
Sending Alerts
So far we’re checking the website every 10 minutes, but the output isn’t being sent anywhere – so it’s pretty useless. Next, I’ll use Postfix/Mailutils to send an alert to a specified email address. I’m not going to go into the weeds on how to set that all up securely (along with SPF/DKIM/DMARC for deliverability), but the official Postfix documentation is a good starting point. It is VERY important to configure your server correctly to avoid relaying spam email. Note that you can also use a transactional email service like Mailgun or SendGrid if you’re not setting up email on your server.
After you’ve set up and configured Postfix, to install Mailutils, enter the following:
sudo apt install mailutils
This will give us access to the ‘mail’ command, which we’ll use to send alerts to a chosen email address.
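Before wiring it into the script, it’s worth a quick sanity check that delivery works at all. Something like this (with your own address substituted) should land a test message in your inbox:
echo "Test message from the monitoring server" | mail -s "Mail test" you@yourdomain.com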
Let’s modify the shell script:
monitor.sh
#!/bin/bash
response=$(curl -s -o /dev/null -w "%{http_code}" https://example.domain)
if [ "${response}" == "200" ]; then
# Everything looks good. No need for a notification.
echo "Website is up and working!"
else
# Oh crap!
echo "Website is down!"
mail -s "Site Status Alert!" -a "From: no-reply@server.yourdomain.com" you@yourdomain.com <<< "Website https://example.domain is down!"
fi
That’s it! Now your cron job will run the shell script every 10 minutes, which checks https://example.domain and emails a simple alert to you@yourdomain.com so that you can investigate. Great job!
This shouldn’t need stating – but replace https://example.domain with the site you’re checking, server.yourdomain.com with the server hostname and you@yourdomain.com with the email address to send notifications to.
Making The Script BETTER!
Alright, our simple website monitoring script is cool and all — but not particularly helpful in a real-world environment. Do you really want to be spammed every 10 minutes with notifications? Would you like to check more than one website? What if you don’t see your email until the next day?!
Let’s do better.
First, let’s create a new folder to hold a file with multiple sites to check and import it into our script.
mkdir includes
nano includes/sites
includes/sites
https://example.domain
https://app.example.domain
https://www.another.tld
Now we’ll create a function to loop through and check each of the domains. I’ll add in some comments to explain the different parts.
monitor.sh
#!/bin/bash
# Use a variable to store the current path
dir="$(dirname "$0")"
# Include sites file and map it to an array
declare -a sites
mapfile -t sites < "${dir}/includes/sites"
# Define a function to check the status of a website and echo a status
check_website() {
local url="$1"
# Check the specified URL and add a nocache parameter to get fresh results
# (declare and assign separately so $? below reflects cURL's exit code, not local's)
local response
response=$(curl -s -o /dev/null -w "%{http_code}" "${url}?nocache=1")
local curl_exit=$?
if [ "${response}" == "200" ]; then
# Everything looks good. No need for a notification.
# No longer echo a response here, or it will trigger the notifications (but you can use it for testing)
#echo "${url} is up and working!"
else
# Oh crap!
if [ "${response}" == "403" ]; then
echo "${url} responded with a 403 Forbidden Error!"
elif [ "${response}" == "404" ]; then
echo "${url} responded with a 404 Not Found Error!"
elif [ "${response}" == "495" ]; then
echo "${url} responded with a 495 Certificate Error!"
elif [ "${response}" == "500" ]; then
echo "${url} responded with a 500 Internal Server Error!"
else
echo "${url} appears to be down with a ${response} Error!"
echo "cURL exit code: ${curl_exit}"
fi
fi
}
i=0
output=""
# Loop through the sites, checking for problems
while [ ${i} -lt ${#sites[@]} ]; do
echo "Checking website ${sites[$i]}"
result="$(check_website "${sites[$i]}")"
# If the result isn't empty, add it to the output with a couple of newlines
if [[ ! -z "${result}" ]]; then
output+=$'\n'"${result}"$'\n'
fi
# Make sure to increment i
(( i++ ))
done
# Check if the output is empty
if [ -z "${output}" ]; then
# No problems were found
echo "All sites are online and running well."
else
# If output was returned, that means there's a problem
echo "${output}"
mail -s "Site Status Alert!" -a "From: no-reply@server.yourdomain.com" you@yourdomain.com <<< "${output}"
fi
Now we’re cooking with gas!
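For reference, a healthy run of the script should now look something like this (your site list and output will vary, of course):
./monitor.sh
Checking website https://example.domain
Checking website https://app.example.domain
Checking website https://www.another.tld
All sites are online and running well.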
But wait… what if you’re not constantly checking email?
Adding an Additional Notification
Here, we’ll add a second notification that goes directly to your carrier’s email-to-text address. Most US cellphone carriers have a mail-to-text gateway. AT&T uses the format 10-digit-wireless-number@mms.att.net. Verizon uses 10-digit-wireless-number@vtext.com. Look up your carrier and they’ll likely have that option.
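You can test your carrier’s gateway the same way we tested email earlier, swapping in your own number and carrier domain:
echo "Test text from the monitoring server" | mail -s "Text test" 10-digit-wireless-number@vtext.com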
What we DON’T want are tons of spam notifications, or text notifications while we’re sleeping. You can define reasonable business hours, or just block off some time in the middle of the night when notifications won’t go through to your phone. To throttle the notifications, we can create timestamped flag files, compare their modified timestamps against the current time, and only send notifications once per hour.
Let’s look into setting that up:
monitor.sh
#!/bin/bash
...
skipping ahead
...
# Check if the output is empty
if [ -z "${output}" ]; then
# No problems were found
echo "All sites are online and running well."
else
# If output was returned, that means there's a problem
echo "${output}"
# Get the current time and set the timezone you're in to make it easier to read
currenttime=$(TZ=America/New_York date '+%H:%M')
# For clarity, generate a timestamp to add to the notification
datetime=$(TZ=America/New_York date '+%m/%d/%Y %H:%M:%S')
# Set a file to compare against (make sure the tmp folder exists first)
mkdir -p "${dir}/tmp"
touch -d '-1 hour' "${dir}/tmp/limit"
# Make sure notifications are only sent once per hour
if [ "${dir}/tmp/limit" -nt "${dir}/tmp/last_email_notification" ]; then
# It's been over an hour, send the notification again
echo "Sending email notification."
mail -s "Site Status Alert!" -a "From: Server Alerts <no-reply@server.yourdomain.com>" you@yourdomain.com <<< "Report time: ${datetime}"$'\n'"${output}"
# Update the timestamp on the notification flag file
touch "${dir}/tmp/last_email_notification"
fi
# For text messages, make sure it's outside quiet hours... let's say 11:00PM-6:00AM
if [[ "$currenttime" > "06:00" ]] && [[ "$currenttime" < "23:00" ]]; then
# Make sure notifications are only sent once per hour
if [ "${dir}/tmp/limit" -nt "${dir}/tmp/last_text_notification" ]; then
echo "Sending text notification."
mail -s "Site Status Alert!" -a "From: Server Alerts <no-reply@server.yourdomain.com>" 10-digit-wireless-number@carrier.txt.com <<< "Report time: ${datetime}"$'\n'"${output}"
# Update the timestamp on the notification flag file
touch "${dir}/tmp/last_text_notification"
fi
fi
fi
Awesome! Now we’re only being notified once per hour and the text messages aren’t waking us up in the middle of the night. I added the “Report time” before the output because it can be a little confusing when you’re testing and waiting for notifications to come through. It’s best to be able to tie it directly to the time that you sent the test.
To test, you can always type:
./monitor.sh
or
/path/to/your/monitor.sh
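If you get a “Permission denied” error when running it directly, make the script executable first:
chmod +x /path/to/your/monitor.sh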
The reason I’m using the ${dir} variable is so that the script works both ways; cron runs jobs from a different working directory, so any paths inside the script need to be resolved relative to the script’s location.
Monitoring Slow Sites
Sometimes it’s not enough to know that the site is returning a 200 OK status code. Sites can load slowly due to poor coding, excessive plugins or server issues, so we can also keep an eye on the time it takes to load a site and log it when it’s taking too long.
In addition, the basic curl function can be modified with some additional connection timeout and retry parameters so that if there’s an initial communication issue, it can try again and not report false positives.
monitor.sh
#!/bin/bash
...
check_website() {
...
# --connect-timeout how long in seconds to attempt a connection
# --retry how many times to retry a connection if it fails
# --retry-delay how long in seconds to wait between retries
local response
response=$(curl --connect-timeout 20 --retry 3 --retry-delay 5 -s -o /dev/null -w "%{http_code}, %{time_total}" "${url}?nocache=1")
local curl_exit=$?
# Split the response into a result array
IFS=', ' read -r -a result <<< "$response"
if [ "${result[0]}" == "200" ]; then
# Everything looks good. No need for a notification.
#echo "${url} is up and working!"
# The site is working.. but is it slow?
# Strip the https:// off of the URL to use as the filename for this site
# to store slow entries (make sure the slow folder exists first)
mkdir -p "${dir}/slow"
local slow_log="${dir}/slow/${url#https://}"
# Check if it took longer than 4 seconds to load
if [ "$(echo "${result[1]} > 4" | bc)" -eq 1 ]; then
# Yep, it's sllooooowwww
# Put the current date as yyyy-mm-dd HH:MM:SS in the variable timestamp
printf -v timestamp '%(%Y-%m-%d %H:%M:%S)T' -1
# Make sure the slow log file exists
touch "${slow_log}"
# Append the timestamp and site load time to the slow log
echo "${timestamp} ${result[1]}" >> "${slow_log}"
# Check if it has been slow for a while by mapping the slow log to an array
# and counting how many items it has
declare -a slow
mapfile -t slow < "${slow_log}"
if [ "${#slow[@]}" -gt 5 ]; then
# With more than 5 consecutive slow loads, send the notifications
echo "$url is loading SLOW!"
# Append the log entries
echo "$(<"${slow_log}")"
# Clean up the file so it doesn't get too out of hand
rm "${slow_log}"
fi
else
# The site is not slow, so remove the slow log if it exists
if [ -f "${slow_log}" ]; then
rm "${slow_log}"
fi
fi
else
# Oh crap!
...
Note that we’ve added %{time_total} to our cURL response, so it’s no longer a single number. We split the result into an array and now result[0] is the response code and result[1] is the total time it took to load the URL. In all comparisons used within the function, ${response} will need to be changed to ${result[0]}.
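If the IFS trick is unfamiliar, here’s a tiny standalone illustration of the split (the values are made up):
response="200, 0.734521"
IFS=', ' read -r -a result <<< "$response"
echo "${result[0]}" # 200
echo "${result[1]}" # 0.734521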
When any of the sites are slow for more than 5 consecutive checks (every 10 minutes), you’ll now be notified that something is amiss.
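For reference, an accumulating slow log (e.g. slow/example.domain) ends up holding one timestamped load time per check, something like this (illustrative values):
2025-01-15 09:10:04 4.532198
2025-01-15 09:20:03 5.103442
2025-01-15 09:30:05 4.887310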
Conclusion
And there we have it! All the fun things wrapped up into a neat little script. Does it work? Yes – I’m using a similar script in production now. Could it be better? Absolutely! Think logging performance/uptime to a database and accessing through a web interface. The production version I’m using also applies an HTML header & footer to the email, sending it in a nice HTML format for easy reading. That’s a nice touch. The sky’s the limit.
I hope this was a good read. Feel free to leave a comment if there’s something I missed.
Cheers!