Let’s face it, websites and web servers can experience issues from time to time. Common issues include plugin incompatibilities, expired SSL certificates, a dreaded website hack or maybe just poor server performance. It’s important to keep an eye on sites to make sure they’re performing at their best. There are some great services out there for this task, like Pingdom, New Relic, etc., but today we’re going to do something a little different and write a custom Bash script to keep an eye on things for us.
Let’s start by coming up with a few requirements for our script. At a minimum it must:
- Check a website to see if it is working correctly
- Check the website on a schedule
- Report on the website status so that we can follow up with any issues in a timely manner
For a couple of added bonuses, we’ll add in:
- Check multiple websites
- Multiple alerts – email & text message
- Notification throttling – 1 notification per hour
- Quiet hours – so we don’t get text messages all night
- Slow site monitoring
Let me preface this article by stating that I am a PHP developer, with limited experience with shell scripts. There may be better ways to do this, so use it as a general guide.
The Basic Script
Doing a little Google searching you’ll find a plethora of Bash scripts for doing exactly what we want. We’ll start with a basic version below – I can’t find the original inspiration for this at the moment, so if you have one feel free to drop a comment.
monitor.sh
#!/bin/bash
response=$(curl -s -o /dev/null -w "%{http_code}" https://example.domain)
if [ "${response}" == "200" ]; then
    echo "Website is up and working!"
else
    echo "Website is down!"
fi
In this example, we’re using cURL on Ubuntu 24.04. If it’s not installed, you can install it with “sudo apt install curl”, assuming you have the correct privileges.
The logic is pretty self-explanatory, but here’s a breakdown of the above cURL command:
curl # The cURL command
-s # Tells cURL to run silent
-o /dev/null # Redirects the output to /dev/null (a blackhole that discards it)
-w "%{http_code}" # Writes output following the specified format
URL # The URL we're checking
The cURL command stores the response code, and we check whether it matches a status of 200. For more detail, see the documentation on HTTP response status codes and on cURL write-out variables.
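If you would rather treat any 2xx response as healthy instead of only an exact 200, the check could be wrapped in a small function. This is just a sketch: is_up is a hypothetical helper name, not part of the script above, and the "any 2xx counts as up" rule is my assumption.

```shell
#!/bin/bash
# Hypothetical helper: succeeds (exit status 0) when the HTTP status code
# passed as $1 is in the 2xx range, and fails otherwise.
is_up() {
    local code="$1"
    [[ "${code}" =~ ^2[0-9][0-9]$ ]]
}

# Try it against a few sample codes.
# curl reports 000 when it never received an HTTP response at all.
for code in 200 204 404 500 000; do
    if is_up "${code}"; then
        echo "${code}: up"
    else
        echo "${code}: down"
    fi
done
```

In the real script you would pass it the value captured from cURL, for example: if is_up "${response}"; then ... fi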
That’s a great start.
Checking On A Schedule
To run our monitor.sh script, we’ll set up a cron job on the server and run it every 10 minutes.
crontab -e
Add the path to your script to a new line in the crontab file.
*/10 * * * * /path/to/your/monitor.sh >/dev/null 2>&1
Save the crontab file and your script should now be running every 10 minutes. Standard output is redirected to a black hole (>/dev/null), and standard error is redirected to wherever standard output is going (2>&1), so it is discarded as well.
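The order of those two redirections matters, which is easy to get wrong. Here is a small sketch that shows the difference. The noisy function is a hypothetical stand-in for monitor.sh that writes one line to each stream.

```shell
#!/bin/bash
# Hypothetical stand-in for monitor.sh: one line to stdout, one to stderr
noisy() {
    echo "to stdout"
    echo "to stderr" >&2
}

# Same order as the crontab entry: stdout goes to /dev/null first,
# then stderr is pointed at wherever stdout currently goes (/dev/null)
both_discarded="$(noisy >/dev/null 2>&1)"

# Reversed order: stderr is pointed at the original stdout (captured here)
# before stdout is sent to /dev/null, so the error line survives
stderr_kept="$(noisy 2>&1 >/dev/null)"

echo "crontab order: '${both_discarded}'"
echo "reversed order: '${stderr_kept}'"
```

With the crontab order nothing is captured, while the reversed order still lets the stderr line through.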
Sending Alerts
So far we’re checking the website every 10 minutes, but the output isn’t being sent anywhere – so it’s pretty useless. Next, I’ll use Postfix/Mailutils to send an alert to a specified email address. I’m not going to go into the weeds on how to set that all up securely (along with SPF/DKIM/DMARC for deliverability) – but you can learn about setting up postfix here. It is VERY important to configure your server correctly to avoid relaying spam email. Note that you can also use a transactional email service like Mailgun or SendGrid if you’re not setting up email on your server.
After you’ve set up and configured Postfix, to install Mailutils, enter the following:
sudo apt install mailutils
This will give us access to the ‘mail’ command, which we’ll use to send alerts to a chosen email address.
Let’s modify the shell script:
monitor.sh
#!/bin/bash
response=$(curl -s -o /dev/null -w "%{http_code}" https://example.domain)
if [ "${response}" == "200" ]; then
    # Everything looks good. No need for a notification.
    echo "Website is up and working!"
else
    # Oh crap!
    echo "Website is down!"
    mail -s "Site Status Alert!" -a "From: no-reply@server.yourdomain.com" you@yourdomain.com <<< "Website https://example.domain is down!"
fi
That’s it! Now your cron job will run the shell script every 10 minutes, which checks https://example.domain and emails a simple alert to you@yourdomain.com so that you can investigate. Great job!
This shouldn’t need stating, but replace https://example.domain with the site you’re checking, server.yourdomain.com with the server hostname and you@yourdomain.com with the email address to send notifications to.
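While you’re still testing, it can be handy to see what would be sent without actually mailing anything. One possible approach is a small wrapper with a dry-run switch. Both send_alert and ALERT_DRY_RUN are hypothetical names I’m making up for this sketch, and the mail invocation simply mirrors the one used above.

```shell
#!/bin/bash
# Hypothetical wrapper around the mail command used above.
# ALERT_DRY_RUN is an assumed variable name: when it is set to 1, the message
# is printed to stdout instead of being handed to mail.
send_alert() {
    local to="$1" subject="$2" body="$3"
    if [ "${ALERT_DRY_RUN:-0}" == "1" ]; then
        printf 'To: %s\nSubject: %s\n\n%s\n' "${to}" "${subject}" "${body}"
    else
        mail -s "${subject}" -a "From: no-reply@server.yourdomain.com" "${to}" <<< "${body}"
    fi
}

# Dry run: prints the alert rather than sending it
ALERT_DRY_RUN=1
send_alert "you@yourdomain.com" "Site Status Alert!" "Website https://example.domain is down!"
```

Unset ALERT_DRY_RUN (or set it to 0) once Postfix is configured and you want real emails to go out.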
Making The Script BETTER!
Alright, our simple website monitoring script is cool and all — but not particularly helpful in a real-world environment. Do you really want to be spammed every 10 minutes with notifications? Would you like to check more than one website? What if you don’t see your email until the next day?!
Let’s do better.
First, let’s create a new folder to hold a file with multiple sites to check and import it into our script.
mkdir includes
nano includes/sites
includes/sites
https://example.domain
https://app.example.domain
https://www.another.tld
Now we’ll create a function to loop through and check each of the domains. I’ll add in some comments to explain the different parts.
monitor.sh
#!/bin/bash
# Use a variable to store the current path
dir="$(dirname "$0")"
# Include sites file and map it to an array
declare -a sites
mapfile -t sites < "${dir}/includes/sites"
# Define a function to check the status of a website and echo a status
check_website() {
    local url="$1"
    # Check the specified URL and add a nocache parameter to get fresh results
    # Declare first: "local response=$(...)" would replace curl's exit code with local's own
    local response exit_code
    response=$(curl -s -o /dev/null -w "%{http_code}" "${url}?nocache=1")
    exit_code=$?
    if [ "${response}" == "200" ]; then
        # Everything looks good. No need for a notification.
        # No longer echo a response here, or it will trigger the notifications (but you can use it for testing)
        #echo "${url} is up and working!"
        # A then-branch needs at least one command, so use the no-op ":"
        :
    else
        # Oh crap!
        if [ "${response}" == "403" ]; then
            echo "${url} responded with a 403 Forbidden Error!"
        elif [ "${response}" == "404" ]; then
            echo "${url} responded with a 404 Not Found Error!"
        elif [ "${response}" == "495" ]; then
            echo "${url} responded with a 495 Certificate Error!"
        elif [ "${response}" == "500" ]; then
            echo "${url} responded with a 500 Internal Server Error!"
        else
            echo "${url} appears to be down with a ${response} Error!"
            echo "Exit code: ${exit_code}"
        fi
    fi
}
i=0
output=""
# Loop through the sites, checking for problems
while [ ${i} -lt ${#sites[@]} ]; do
    echo "Checking website ${sites[$i]}"
    result="$(check_website "${sites[$i]}")"
    # If the result isn't empty, add it to the output with a couple of newlines
    if [[ -n "${result}" ]]; then
        output+=$'\n'"${result}"$'\n'
    fi
    # Make sure to increment i
    (( i++ ))
done
# Check if the output is empty
if [ -z "${output}" ]; then
    # No problems were found
    echo "All sites are online and running well."
else
    # If output was returned, that means there's a problem
    echo "${output}"
    mail -s "Site Status Alert!" -a "From: no-reply@server.yourdomain.com" you@yourdomain.com <<< "${output}"
fi
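One thing to watch with the sites file: mapfile takes every line as-is, so a blank line or a stray note would end up being passed to cURL as a URL. A possible way around that is a small loader that filters the file first. This is only a sketch, and load_sites is a hypothetical helper name, not something from the script above.

```shell
#!/bin/bash
# Hypothetical helper: reads a sites file (path in $1) and prints one URL per
# line, dropping blank lines, "#" comments, and surrounding whitespace.
load_sites() {
    local file="$1" line
    while IFS= read -r line || [ -n "${line}" ]; do
        # Trim leading and trailing whitespace
        line="${line#"${line%%[![:space:]]*}"}"
        line="${line%"${line##*[![:space:]]}"}"
        # Skip empty lines and comments
        [ -z "${line}" ] && continue
        [[ "${line}" == \#* ]] && continue
        echo "${line}"
    done < "${file}"
}

# Demo with a temporary file standing in for includes/sites
demo_file="$(mktemp)"
printf '%s\n' "# production" "https://example.domain" "" "  https://app.example.domain  " > "${demo_file}"
mapfile -t sites < <(load_sites "${demo_file}")
rm -f "${demo_file}"
echo "Loaded ${#sites[@]} sites: ${sites[*]}"
```

In monitor.sh the mapfile line would then read from load_sites "${dir}/includes/sites" instead of from the file directly.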
Now we’re cooking with gas!
But wait… what if you’re not constantly checking email?
Adding an Additional Notification
Here, we’ll add a second notification that will go directly to your phone through an email-to-text address. Most US cellphone carriers have mail to text. AT&T uses the format 10-digit-wireless-number@mms.att.net. Verizon uses 10-digit-wireless-number@vtext.com. Look up your carrier and they’ll likely have that option.
What we DON’T want are tons of spam notifications, or text notifications while we’re sleeping. You can define reasonable business hours, or just block off some time in the middle of the night when notifications won’t go through to your phone. For throttling, we can create flag files, compare their modified timestamps against the current time, and only send notifications once per hour.
Let’s look into setting that up:
monitor.sh
#!/bin/bash
...
skipping ahead
...
# Check if the output is empty
if [ -z "${output}" ]; then
    # No problems were found
    echo "All sites are online and running well."
else
    # If output was returned, that means there's a problem
    echo "${output}"
    # Get the current time and set the timezone you're in to make it easier to read
    currenttime=$(TZ=America/New_York date '+%H:%M')
    # For clarity, generate a timestamp to add to the notification
    datetime=$(TZ=America/New_York date '+%m/%d/%Y %H:%M:%S')
    # Set a file to compare against (create the tmp folder first if it's missing)
    mkdir -p "${dir}/tmp"
    touch -d '-1 hour' "${dir}/tmp/limit"
    # Make sure notifications are only sent once per hour
    if [ "${dir}/tmp/limit" -nt "${dir}/tmp/last_email_notification" ]; then
        # It's been over an hour, send the notification again
        echo "Sending email notification."
        mail -s "Site Status Alert!" -a "From: Server Alerts <no-reply@server.yourdomain.com>" you@yourdomain.com <<< "Report time: ${datetime}"$'\n'"${output}"
        # Update the timestamp on the notification flag file
        touch "${dir}/tmp/last_email_notification"
    fi
    # For text messages, make sure it's outside quiet hours... let's say 11:00PM-6:00AM
    if [[ "$currenttime" > "06:00" ]] && [[ "$currenttime" < "23:00" ]]; then
        # Make sure notifications are only sent once per hour
        if [ "${dir}/tmp/limit" -nt "${dir}/tmp/last_text_notification" ]; then
            echo "Sending text notification."
            mail -s "Site Status Alert!" -a "From: Server Alerts <no-reply@server.yourdomain.com>" 10-digit-wireless-number@carrier.txt.com <<< "Report time: ${datetime}"$'\n'"${output}"
            # Update the timestamp on the notification flag file
            touch "${dir}/tmp/last_text_notification"
        fi
    fi
fi
Awesome! Now we’re only being notified once per hour and the text messages aren’t waking us up in the middle of the night. I added the “Report time” before the output because it can be a little confusing when you’re testing and waiting for notifications to come through. It’s best to be able to tie it directly to the time that you sent the test.
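The once-per-hour check appears twice in the script, so it could also be pulled into a function. The sketch below swaps the limit-file comparison for plain arithmetic on the flag file’s modification time, which is a different technique from the one above. should_notify is a hypothetical name, and the sketch assumes GNU stat and GNU touch, as shipped with Ubuntu.

```shell
#!/bin/bash
# Hypothetical helper: succeeds (exit status 0) when a notification may be sent.
#   $1 = path to the flag file, $2 = minimum gap in seconds (default 3600)
# Assumes GNU stat for the "-c %Y" modification time.
should_notify() {
    local flag="$1" min_gap="${2:-3600}" now last
    # No flag file yet means nothing has been sent, so allow it
    [ -e "${flag}" ] || return 0
    now="$(date +%s)"
    last="$(stat -c %Y "${flag}")"
    [ $(( now - last )) -ge "${min_gap}" ]
}

# Demo in a throwaway directory
demo_dir="$(mktemp -d)"
flag="${demo_dir}/last_email_notification"

should_notify "${flag}" && echo "first check: send"
touch "${flag}"                      # record that a notification went out
should_notify "${flag}" || echo "second check: throttled"
touch -d '-2 hours' "${flag}"        # pretend the last one was two hours ago
should_notify "${flag}" && echo "third check: send"
rm -rf "${demo_dir}"
```

The second argument makes it easy to shorten the gap to a few seconds while testing.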
To test, you can always type:
./monitor.sh
or
/path/to/your/monitor.sh
The reason I’m using the ${dir} variable is so that it works both ways; the cron job needs absolute paths to work correctly.
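Testing the quiet-hours branch by hand would mean running the script late at night. A sketch that takes the time as an argument makes it possible to try any time of day. in_quiet_hours is a hypothetical helper, and its boundary handling is my choice: the start time counts as quiet and the end time does not, which differs slightly from the strict comparisons in the script.

```shell
#!/bin/bash
# Hypothetical helper: succeeds when the HH:MM time in $1 falls inside quiet
# hours. $2 and $3 are the start and end of the window (default 23:00-06:00).
# The window may wrap past midnight. Times must be zero-padded (e.g. 06:00).
in_quiet_hours() {
    local now="$1" start="${2:-23:00}" end="${3:-06:00}"
    if [[ "${start}" < "${end}" ]]; then
        # Window within a single day, e.g. 01:00 to 05:00
        [[ ! "${now}" < "${start}" && "${now}" < "${end}" ]]
    else
        # Window wraps past midnight, e.g. 23:00 to 06:00
        [[ ! "${now}" < "${start}" || "${now}" < "${end}" ]]
    fi
}

for t in 05:59 06:00 12:30 22:59 23:00; do
    if in_quiet_hours "${t}"; then
        echo "${t}: quiet"
    else
        echo "${t}: send"
    fi
done
```

In the script you would pass it "${currenttime}" and only send the text when it returns false.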
Monitoring Slow Sites
Sometimes it’s not enough to know if the site is returning a 200 OK status code. To fix issues with sites loading slowly, due to poor coding, excessive plugins or server issues, we can also keep an eye on the time it takes to load a site and log it if it’s taking too long.
In addition, the basic curl function can be modified with some additional connection timeout and retry parameters so that if there’s an initial communication issue, it can try again and not report false positives.
monitor.sh
#!/bin/bash
...
check_website() {
    ...
    # --connect-timeout how long in seconds to attempt a connection
    # --retry how many times to retry a connection if it fails
    # --retry-delay how long in seconds to wait between retries
    local response=$(curl --connect-timeout 20 --retry 3 --retry-delay 5 -s -o /dev/null -w "%{http_code}, %{time_total}" "${url}?nocache=1")
    # Split the response into a result array
    IFS=', ' read -r -a result <<< "$response"
    if [ "${result[0]}" == "200" ]; then
        # Everything looks good. No need for a notification.
        #echo "${url} is up and working!"
        
        # The site is working.. but is it slow?
        
        # Strip the https:// off of the URL to use as the filename for this site
        # to store slow entries
        local slow_log="${dir}/slow/${url#https://}"
        # Check if it took longer than 4 seconds to load
        if [ "`echo "${result[1]} > 4" | bc`" -eq 1 ]; then
            # Yep, it's sllooooowwww
            # Put the current date as yyyy-mm-dd HH:MM:SS in the variable timestamp
            printf -v timestamp '%(%Y-%m-%d %H:%M:%S)T' -1
            # Make sure the slow log directory and file exist
            mkdir -p "${dir}/slow"
            touch "${slow_log}"
            # Append the timestamp and site load time to the slow log
            echo "${timestamp}    ${result[1]}" >> "${slow_log}"
            # Check if it has been slow for a while by mapping the slow log to an array
            # and counting how many items it has
            declare -a slow
            mapfile -t slow < "${slow_log}"
            if [ "${#slow[@]}" -gt 5 ]; then
                # With more than 5 consecutive slow loads, send the notifications
                echo "$url is loading SLOW!"
                # Append the log entries
                echo "$(<${slow_log})"
                # Clean up the file so it doesn't get too out of hand
                rm "${slow_log}"
            fi
        else
            # The site is not slow, so remove the slow log if it exists
            if [ -f "${slow_log}" ]; then
                rm "${slow_log}"
            fi
        fi
    else
        # Oh crap!
        ...
Note that we’ve added %{time_total} to our cURL response, so it’s no longer a single number. We split the result into an array and now result[0] is the response code and result[1] is the total time it took to load the URL. In all comparisons used within the function, ${response} will need to be changed to ${result[0]}.
When any of the sites are slow for more than 5 consecutive checks (every 10 minutes), you’ll now be notified that something is amiss.
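If bc isn’t installed on your server, the fractional comparison could be done with awk instead. This is a sketch of that alternative, not what the script above uses. is_slow is a hypothetical helper name, and the sample response string is made up for the demo.

```shell
#!/bin/bash
# Hypothetical helper: succeeds when the load time in $1 (seconds, may be
# fractional) is greater than the threshold in $2 (default 4).
# awk does the floating-point comparison, since [ ] only compares integers.
is_slow() {
    local seconds="$1" threshold="${2:-4}"
    awk -v t="${seconds}" -v max="${threshold}" 'BEGIN { exit !(t > max) }'
}

# Made-up sample in the same "http_code, time_total" shape that cURL writes
response="200, 5.231874"
IFS=', ' read -r -a result <<< "${response}"

if is_slow "${result[1]}"; then
    echo "slow: ${result[1]}s"
else
    echo "fine: ${result[1]}s"
fi
```

The optional second argument lets each site have its own threshold if 4 seconds is too strict or too lenient.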
Conclusion
And there we have it! All the fun things wrapped up into a neat little script. Does it work? Yes – I’m using a similar script in production now. Could it be better? Absolutely! Think logging performance/uptime to a database and accessing through a web interface. The production version I’m using also applies an HTML header & footer to the email, sending it in a nice HTML format for easy reading. That’s a nice touch. The sky’s the limit.
I hope this was a good read. Feel free to leave a comment if there’s something I missed.
Cheers!
Update: 02/05/2025
I added some more specific error messages to elaborate on errors that would come through as HTTP error: 000 and Exit code: 0 – which wasn’t very helpful.
The latest version changes the writeout to:
"%{http_code}, %{exitcode}, %{ssl_verify_result}, %{time_total}"
More info: https://everything.curl.dev/usingcurl/verbose/writeout.html
Because there are more variables in the result array, the previous usage of ${result[1]} is updated to ${result[3]} to line up with time_total. Then in the section where we’re checking for exit codes, I’ll add in both a resolver issue check and an SSL check. This came about after debugging a locally cached resolver after moving a site… super annoying, but fixed with the command ‘sudo resolvectl flush-caches’ on Ubuntu.
Change the final ‘else’ in the errors section to check as many unique error codes as you’d like. Codes can be found at https://curl.se/docs/manpage.html#EXIT.
else
    if [ "${result[1]}" == '6' ]; then
        echo "Website $url failed to load! Could not resolve host."
    elif [ "${result[1]}" == '91' ]; then
        echo "Website $url is using an invalid SSL certificate. SSL verify result: ${result[2]}."
    else
        echo "Website $url appears to be down!"
        echo "HTTP code: ${result[0]}. Exit code: ${result[1]}."
    fi
fi 
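As the list of exit codes grows, a case statement may read more easily than a chain of elif checks. Here is a rough sketch of that idea. describe_result is a hypothetical function, the sample write-out strings are made up, and the extra codes 7 (failed to connect) and 28 (timeout) are ones I picked from the cURL exit code list as examples.

```shell
#!/bin/bash
# Hypothetical helper: takes a URL and one write-out string in the form
#   "http_code, exitcode, ssl_verify_result, time_total"
# and prints a message for failures (nothing for a 200 response).
describe_result() {
    local url="$1" response="$2"
    local -a result
    IFS=', ' read -r -a result <<< "${response}"
    if [ "${result[0]}" == "200" ]; then
        return 0
    fi
    case "${result[1]}" in
        6)  echo "Website ${url} failed to load! Could not resolve host." ;;
        7)  echo "Website ${url} failed to load! Could not connect to host." ;;
        28) echo "Website ${url} timed out." ;;
        *)  echo "Website ${url} appears to be down! HTTP code: ${result[0]}. Exit code: ${result[1]}." ;;
    esac
}

# Made-up sample responses
describe_result "https://example.domain" "000, 6, 0, 0.004512"
describe_result "https://example.domain" "503, 0, 0, 0.120034"
describe_result "https://example.domain" "200, 0, 0, 0.350120"
```

The first call reports the resolver problem, the second falls through to the generic message, and the third prints nothing because the site is fine.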
					
