Today I’m going to talk a little bit about spam submissions in a WordPress comment form. There are plenty of options out there to try, some free, some paid. Akismet is installed by default with any WordPress installation, and though it’s a great solution, it does cost $5/mo. for commercial sites (or $50/mo. for an unlimited-sites license). You may think I’m cheap, but I’m already forking out hundreds of dollars for premium plugins, along with monthly software costs for design, SEO, billing, project management and more, so I think we’ll try a custom solution for blocking spam.

For client websites, we routinely install Gravity Forms with its honeypot option and an extra plugin called Gravity Forms Zero Spam. While this greatly reduces spam from Gravity Forms submissions (though not entirely), it doesn’t help us with the standard comment forms in WordPress. It does, however, give us some ideas on what we can do for those.

This article is for developers and anyone comfortable enough to modify their theme’s functions.php file or create a custom WordPress plugin. If you’re not a developer, a few simple steps that may help are:

  • Require that all comments are manually approved under Settings > Discussion
  • Decrease the number of links a comment may contain before it is held for moderation under Settings > Discussion
  • Maintain a list in the Disallowed Comment Keys field under Settings > Discussion. Have a look at this article by Perishable Press for a good starting list.

Read on if you’d like to add some basic protections without a 3rd party plugin.

Creating Our Own Honeypot

What is a honeypot? Is it simply a container for Pooh Bear’s favorite yellow stuff? Well yes.. and no. In our context, a honeypot is like dangling something sweet in front of a spam bot, asking it to fill out a form field and then get caught in our trap!

Bots are smart though, and getting smarter every day, so we need to employ some sneaky techniques. We don’t want to just add an input with type="hidden" or an inline display:none style, because many bots skip over those fields by default. Instead, we’ll hide the fields with rules in a stylesheet: a bot might not be parsing CSS rules, but users’ browsers are.

Creating Custom Fields

We’ll start by adding some additional fields to our comment form using the comment_form_default_fields hook. Add the following to your functions.php file or custom plugin.

function custom_comment_fields( $fields ) {
	// store the originals
	$original_fields = $fields;

	// empty existing values
	$fields = [];

	// if "Comment author must fill out name and email" is selected in the Discussion Setting, it's not possible to
	// fill out a comment without $_POST['author'] or $_POST['email'] fields

	// base the new "real" field on the original author field, keeping the required attribute for users
	$fields['marginal-way'] = str_replace( array( 'author', 'comment-form-marginal-way' ), array( 'marginal-way', 'comment-form-author hp101' ), $original_fields['author'] );
	// fill in the author field with a default value to be checked later
	$fields['author'] = str_replace( array( 'value=""', 'comment-form-author' ), array( 'value="Your Name"', 'comment-form-author hp102' ) , $original_fields['author'] );

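	// base the new "real" field on the original email field, keeping the required attribute for users
	// note: this swaps every occurrence of 'email' in the markup, which can include attributes such as type="email"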
	$fields['forest-avenue'] = str_replace( array( 'email', 'comment-form-forest-avenue' ), array( 'forest-avenue', 'comment-form-email hp103' ), $original_fields['email'] );
	// use a default email address that'll pass a syntax check
	$fields['email'] = str_replace( array( 'value=""', 'comment-form-email' ), array( 'value="handle@domain.com"', 'comment-form-email hp104' ), $original_fields['email'] );

	// add an extra honeypot & label it Nickname
	$fields['commercial-street'] = '<p class="comment-form-url hp105"><label for="commercial-street">Nickname</label> <input id="commercial-street" name="commercial-street" type="text" value="" size="30" maxlength="200" /></p>';
	
	// only add a class to the default url field
	$fields['url'] = str_replace( 'comment-form-url', 'comment-form-url hp106', $original_fields['url'] );	

	return $fields;
}
add_filter( 'comment_form_default_fields', 'custom_comment_fields' );

The label and field names can be whatever you want on the new fields. We’ll be swapping the data before the form is actually submitted to the database, so keep track of the values used. At this point the form should be showing duplicate fields.

[Screenshot: duplicate fields on the WordPress comment form for blocking spam]

A couple things to note:

  • The comment form posts directly to the wp-comments-post.php file in the root of your WordPress installation. When “Comment author must fill out name and email” is enabled under Settings > Discussion, a submission with an empty ‘author’ or ‘email’ is rejected, and there are no hooks that would let us use those default names as true (empty) honeypots, so we’ll need to get a little clever about how we use them. In the example above, I’m adding some syntactically valid default values for ‘author’ and ‘email’ in order to make it past the default checks.
  • The new fields are based on the current settings for ‘author’ and ‘email’, with the name and id being updated.
  • Classes are given to each form field so that we can hide the honeypot fields later. Avoid grouping the visible & hidden fields or adding class names like ‘hide’ or ‘hidden’. Some bots might be smart enough to pick up on that.
  • I purposely put the extra honeypot “Nickname” above the URL field to break the pattern of hiding fields.
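If the two-step str_replace() calls look odd, it helps to run them on a sample string. The snippet below is only a sketch: the markup is a simplified stand-in for the default author field (an assumption, since the real markup comes from WordPress and varies by version and theme), but the replacement arrays are the ones used above. Because str_replace() applies array replacements in order, the first pass renames everything and the second pass restores the wrapper class.

```php
<?php
// Simplified stand-in for the default author field markup (an assumption for
// illustration; the real markup is generated by WordPress).
$author_field = '<p class="comment-form-author"><label for="author">Name</label> '
	. '<input id="author" name="author" type="text" value="" size="30" /></p>';

// Pass 1 turns every "author" into "marginal-way", including the wrapper class.
// Pass 2 then restores the wrapper class and appends the hp101 marker.
$real_field = str_replace(
	array( 'author', 'comment-form-marginal-way' ),
	array( 'marginal-way', 'comment-form-author hp101' ),
	$author_field
);

// The decoy keeps the original name but gets a default value and its own class.
$decoy_field = str_replace(
	array( 'value=""', 'comment-form-author' ),
	array( 'value="Your Name"', 'comment-form-author hp102' ),
	$author_field
);

echo $real_field, PHP_EOL, $decoy_field, PHP_EOL;
```

The real field ends up with name="marginal-way" inside a wrapper classed "comment-form-author hp101", while the decoy keeps name="author" and carries the "Your Name" default.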

Hiding Our Honeypot Fields

Now we’ll need to hide the fields that we don’t want the user to fill in. Because they’re hidden, users will not change our default values, but bots scraping the form and attempting to submit it will likely put in their own values. If you felt like it, you could go the extra step of adding a <span> before the ‘author’ and ‘email’ fields with a visually hidden message for screen reader users, letting them know not to change the values. Smart bots might pick up on this, or on aria-hidden="true" attributes, so there’s a certain lack of accessibility that comes with this technique.

Add the following CSS to your theme or plugin to hide the decoy Name and Email fields along with the Nickname honeypot:

/* form extras */
.comment-form .hp102,
.comment-form .hp104,
.comment-form .hp105 {
	display: none;
}

Now the comment form will look exactly like the original on the surface.

[Screenshot: WordPress comment fields with hidden honeypot values]

Checking The Fields

Now that our sneaky honeypot is in place, we need a way to check if a spammer has tripped our trap. We’ll do this by using the WordPress hook preprocess_comment. Add the following filter and function to your file.

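function check_custom_comment_fields( $data ) {
	// logged-in users never see the name/email fields, so there's nothing to check or swap
	if ( ! empty( $data['user_ID'] ) ) {
		return $data;
	}

	// the form MUST be submitted with the new fields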
	if ( !isset( $_POST['marginal-way'] ) ||
	     !isset( $_POST['forest-avenue'] ) ||
	     !isset( $_POST['commercial-street'] ) ) {
		wp_die(
			'<p>' . __( 'Whoops! Something went wrong. Make sure you\'ve filled out the required fields.' ) . '</p>',
			__( 'Comment Submission Failure' ),
			array(
				'response'  => 200, // HTTP status code; wp_die() expects an integer here
				'back_link' => true,
			)
		);
	}

	// real values
	$comment_author = sanitize_text_field( $_POST['marginal-way'] );
	$comment_author_email = sanitize_email( $_POST['forest-avenue'] );

	// validate that our new fields are present if they're required
	// this mimics the default WordPress behavior in wp-includes/comment.php and retains
	// the default language support
	if ( get_option( 'require_name_email' ) && !$data['user_ID'] ) {
		if ( $comment_author == '' || $comment_author_email == '' ) {
			wp_die(
				'<p>' . __( '<strong>Error</strong>: Please fill the required fields (name, email).' ) . '</p>',
				__( 'Comment Submission Failure' ),
				array(
					'response'  => 200, // HTTP status code; wp_die() expects an integer here
					'back_link' => true,
				)
			);
		} elseif ( !is_email( $comment_author_email ) ) {
			wp_die(
				'<p>' . __( '<strong>Error</strong>: Please enter a valid email address.' ) . '</p>',
				__( 'Comment Submission Failure' ),
				array(
					'response'  => 200, // HTTP status code; wp_die() expects an integer here
					'back_link' => true,
				)
			);
		}
	}

	// honeypot values
	$honeypot = sanitize_text_field( $_POST['commercial-street'] );
	// the decoy fields must still hold the defaults set in custom_comment_fields()
	if ( $data['comment_author'] !== 'Your Name' || $data['comment_author_email'] !== 'handle@domain.com' || $honeypot !== '' ) {
		$data['spam'] = true;
	}

	// swap the fields back to the defaults now that we've processed them
	$data['comment_author'] = $comment_author;
	$data['comment_author_email'] = $comment_author_email;

	// return the data without the honeypot values
	return $data;
}
add_filter( 'preprocess_comment', 'check_custom_comment_fields');

There are a few things going on here.

  1. We make sure that our new form fields are in fact present in the $_POST data because they contain the real information. In this example ‘marginal-way’ is the ‘author’, ‘forest-avenue’ is ‘email’ and ‘commercial-street’ is a honeypot field.
  2. Sanitize the new post values… ALWAYS SANITIZE INPUT
  3. Mimic the existing behavior of WordPress and check that our custom author and email values are valid. If not, we’ll exit with an error just like the original WordPress form.
  4. Check if either one of our original fields has changed from its default value, or if the honeypot field was filled out. If any of those are the case, we’ll add a ‘spam’ = true value to the data and pass it along to the next hook.
  5. Finally, since we don’t care about our honeypot fields anymore, we’ll replace the default data with our custom fields and allow WordPress to process them as it normally would. There are a few more filters it’ll go through before we deal with it next.
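Step 4 is the heart of the trap, so here is a minimal sketch of that decision pulled out into a plain function that can be run outside WordPress. The helper name looks_like_spam() and the default values passed to it are hypothetical; in the real filter the defaults are whatever you set in custom_comment_fields().

```php
<?php
/**
 * Hypothetical helper (not part of WordPress): decides whether a submission
 * tripped the trap, given the raw decoy values and the extra honeypot.
 */
function looks_like_spam( array $post, array $decoy_defaults ) {
	// Any decoy that no longer holds its default value was edited by a bot.
	foreach ( $decoy_defaults as $field => $default ) {
		$submitted = isset( $post[ $field ] ) ? $post[ $field ] : '';
		if ( $submitted !== $default ) {
			return true;
		}
	}
	// The extra honeypot must come back empty.
	return isset( $post['commercial-street'] ) && '' !== trim( $post['commercial-street'] );
}

// Placeholder defaults, assumed for this example only.
$defaults = array( 'author' => 'Your Name', 'email' => 'placeholder@example.com' );

$human = array(
	'author'            => 'Your Name',
	'email'             => 'placeholder@example.com',
	'marginal-way'      => 'Jane',
	'forest-avenue'     => 'jane@example.com',
	'commercial-street' => '',
);
$bot = array(
	'author'            => 'Cheap Pills',
	'email'             => 'bot@example.com',
	'marginal-way'      => 'Cheap Pills',
	'forest-avenue'     => 'bot@example.com',
	'commercial-street' => 'pills',
);

var_dump( looks_like_spam( $human, $defaults ) ); // bool(false)
var_dump( looks_like_spam( $bot, $defaults ) );   // bool(true)
```

A visitor who only ever sees the real fields leaves the decoys untouched and passes, while a bot that fills in every field it finds overwrites the defaults and gets flagged.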

Marking Comments as Spam

The last thing we’ll do is check for our ‘spam’ = true value in the pre_comment_approved hook.

function custom_comment_approval( $approved, $data ) {
	// here's where the spam classification takes place based on our previous assessment
	if ( isset( $data['spam'] ) && $data['spam'] ) {
		$approved = 'spam';
	}

	return $approved;
}
add_filter( 'pre_comment_approved', 'custom_comment_approval', 99, 2 );

If this function finds the spam value in the data, it simply sets the $approved variable to ‘spam’ and sends it along. WordPress will categorize it as spam and Bob’s your uncle!

One Step Further! – JavaScript

At this point we have a pretty sweet combo of fields to fool bots, but what if they’re even smarter than we’ve given them credit for? JavaScript to the rescue! Now, it’s very possible that some bots will be running JavaScript as well as CSS, so we’ll try and trick them a bit further. Similar to the Gravity Forms Zero Spam plugin mentioned above, we’ll implement a technique discussed by David Walsh and go one step further with a custom nonce value.

First, we’ll localize our theme’s (or plugin’s) JavaScript file with a comment_nonce value.

wp_localize_script( 'your-script', 'localized_data', array( 'comment_nonce' => wp_create_nonce('comment_nonce') ) );

Or alternately just add a nonce to the footer (since this is essentially what the localization does anyway).

function your_function() {
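    // wp_json_encode() quotes the nonce so the output is valid JavaScript
    echo '<script>/* <![CDATA[ */
 var localized_data = ' . wp_json_encode( array( 'comment_nonce' => wp_create_nonce( 'comment_nonce' ) ) ) . ';
 /* ]]> */</script>';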
}
add_action( 'wp_footer', 'your_function' );

The nonce is a short-lived token that gets generated for the visitor (WordPress nonces stay valid for up to a day by default rather than being strictly single-use), and we’ll check that it’s correct when the form is submitted. To make it harder for a bot to scrape, we won’t put it in the form markup; instead we’ll add it in a submit() handler via jQuery.

// Add an extra form value on submit of the comment form
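// typeof returns a string, so compare against 'undefined'; this assumes $ is jQuery in the current scope
if ( typeof localized_data !== 'undefined' && typeof localized_data.comment_nonce !== 'undefined' ) {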
	$('#commentform').submit( function(e) {
		$('#commentform').append( '<input type="hidden" name="nonce" value="' + localized_data.comment_nonce + '" />' );
		return;
	});
}

In this way, the hidden field is added only on submission. We’ll now need to check for it in the functions created earlier. First in the check_custom_comment_fields function, we’ll add it to the list of $_POST values required to be submitted.

function check_custom_comment_fields( $data ) {
	// the form MUST be submitted with the new fields
	if ( !isset( $_POST['marginal-way'] ) ||
		 !isset( $_POST['forest-avenue'] ) ||
		 !isset( $_POST['commercial-street'] ) ||
		 !isset( $_POST['nonce'] ) ) {
			wp_die(
				...

Secondly, right after checking our honeypot values, we’ll also check for the nonce field that was added. We only care about non-logged-in users in this case, so we’ll check if WordPress has already found the logged in user before doing this check.

	// honeypot values
	$honeypot = sanitize_text_field( $_POST['commercial-street'] );
	if ( $data['comment_author'] !== 'Your Name' || $data['comment_author_email'] !== 'handle@domain.com' || $honeypot !== '' ) {
		$data['spam'] = true;
	}

	// JavaScript injected value for non-logged-in users
	if ( !$data['user_ID'] ) {
		if ( wp_verify_nonce( $_POST['nonce'], 'comment_nonce' ) !== 1 ) { // 1 means it was created in the last 12 hours
			$data['spam'] = true;
		}
	}
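To see why the check compares against 1 specifically, it can help to model a time-windowed token by hand. What follows is a simplified, hypothetical sketch and not WordPress’s actual implementation: the function names, the secret and the HMAC construction are all assumptions. It only mirrors the documented idea that a token from the current 12-hour window verifies as 1, one from the previous window verifies as 2, and anything older fails.

```php
<?php
// Hypothetical, simplified model of a time-windowed token. The secret, the
// window length and both function names are assumptions for illustration.
const WINDOW_SECONDS = 43200; // 12 hours

function sketch_token( string $action, string $secret, int $now ) {
	// Every timestamp inside the same window maps to the same tick.
	$tick = (int) ceil( $now / WINDOW_SECONDS );
	return substr( hash_hmac( 'sha256', $tick . '|' . $action, $secret ), 0, 10 );
}

function sketch_verify( string $token, string $action, string $secret, int $now ) {
	// Token was issued in the current window.
	if ( hash_equals( sketch_token( $action, $secret, $now ), $token ) ) {
		return 1;
	}
	// Token was issued in the previous window: older, but not yet expired.
	if ( hash_equals( sketch_token( $action, $secret, $now - WINDOW_SECONDS ), $token ) ) {
		return 2;
	}
	return false;
}

$secret  = 'demo-secret';
$created = 1000 * WINDOW_SECONDS + 1; // just after a window boundary
$token   = sketch_token( 'comment_nonce', $secret, $created );

var_dump( sketch_verify( $token, 'comment_nonce', $secret, $created + 100 ) );                // int(1)
var_dump( sketch_verify( $token, 'comment_nonce', $secret, $created + WINDOW_SECONDS ) );     // int(2)
var_dump( sketch_verify( $token, 'comment_nonce', $secret, $created + 2 * WINDOW_SECONDS ) ); // bool(false)
```

Accepting only 1 is the stricter choice. If your pages are cached for long periods, a legitimate visitor could arrive with an older token, so loosening the check to accept 2 as well is a trade-off worth considering.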

Conclusion

And there we have it. We’ve modified the default WordPress comment form for some additional comment spam blocking capabilities.

Side note: we may actually eliminate a good number of spammers just by requiring additional fields. Because WordPress is so prevalent on the internet these days, there are literally millions of websites using just the default fields and spam bots could blanket sites with the same scripts. Adding additional required values would immediately block those bots. It doesn’t hurt to have extra protection however!

I hope you’ve enjoyed the article. And as always, if you need help with your custom WordPress development projects, feel free to contact us.

Cheers!
