Hacker Public Radio

HPR3296: Spam Bot Honey Pot


In this episode of Hacker Public Radio, I will describe the method I chose to combat spam bots filling out my company's contact form. About 99% of the submissions we receive are spam, which makes filtering for valid messages painful. After some research into different methods, I decided to go with the honey pot method.
The honey pot method uses an extra text input field to lure the spam bot into filling it out. There are different suggestions for how to hide this extra field from valid users, using either JavaScript or CSS. With JavaScript, the honey pot section of the form is removed from the DOM when the page loads, hiding it from your users. The argument for this method is that most bots don't execute JavaScript, so the honey pot field will not be hidden from them. I think that is a valid argument, but I didn't want to include extra JavaScript in my page, so I went with the CSS method.
There are references at the end of the show notes to a couple of the articles I read on implementing the honey pot with either JavaScript or CSS. I came away with two suggestions. First, don't hide the input by setting the CSS display property to none. Sufficiently smart bots may know to scan for this, especially if it is applied directly to the element. Second, don't give your classes names that make your intent obvious, like "anti-spam-filter". My guess is the majority of the bots out there aren't that sophisticated, but I figured it couldn't hurt to follow those suggestions.
I was already using Bootstrap CSS for our site, so I decided to use Bootstrap's "sr-only" class. This class is meant for elements that you want exposed only to screen readers. It uses a combination of absolute positioning, a width and height of 1 pixel, a negative margin, and hidden content overflow to keep the honey pot from showing up visually. I figured that if a bot was scanning CSS for classes or properties, this wouldn't trigger any warnings.
It does raise the issue of how to avoid degrading the experience of people using screen readers. I applied the aria-hidden attribute with a value of true to the label element surrounding the honey pot input field. "[this] removes that element and all of its children from the accessibility tree." So the field is now hidden both visually in the browser and from assistive technologies. Given the short end of the stick accessibility usually gets, I doubt there are any spam bots scanning for that ARIA attribute. For the minority of users who might be viewing with the classic Lynx browser, I put 'For office use' as the label text before the honey pot, hoping this would get the message across without tipping off the bot to the intended purpose of the related input field.
The other main issue with this method is the value of the name attribute used for the input fields. Some argue for giving the real fields obfuscated names like "mmxxName" instead of "name", or "sxysPhone" instead of "phone". Apparently some bots will skip fields they don't recognize, so using the more standard names for multiple honey pot fields makes it easier to determine whether a submission came from a bot. The counterargument to this naming scheme concerns the user experience: if the names of the valid fields are obfuscated, browsers won't auto-fill them. This also raises the matter of keeping your users' browsers from auto-filling the honey pot fields. This is done by setting the "autocomplete" attribute to "off" on each of your honey pot input elements.
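To make the markup described above concrete, here is a minimal Python sketch that renders one honey pot field. It is only an illustration under assumptions, not the actual form from my site: the function name render_honeypot and the field name "website" are hypothetical, and putting the "sr-only" class on the label itself is one possible arrangement.

```python
from html import escape


def render_honeypot(field_name: str, label_text: str = "For office use") -> str:
    """Return HTML for one honey pot input wrapped in a visually hidden label."""
    name = escape(field_name, quote=True)
    return (
        # sr-only hides the label visually; aria-hidden="true" removes the
        # label and the input it wraps from the accessibility tree.
        f'<label class="sr-only" aria-hidden="true">{escape(label_text)}\n'
        # autocomplete="off" asks the browser not to auto-fill the trap field.
        f'  <input type="text" name="{name}" autocomplete="off">\n'
        "</label>"
    )


# "website" is a hypothetical, standard-looking name chosen for illustration.
print(render_honeypot("website"))
```

The label text and attributes follow the description in these notes; everything else would need adapting to the template system you actually use.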
So far this spam filtering method is working nicely. I currently send any messages flagged as spam to a different email address, with the words "[Spam review]" prepended to the subject. Once I am confident there are not many false positives, I will just skip sending flagged messages. The one issue I have run into with this method shows up when using the Tab key to move through the form: because the input field is only visually hidden, it still receives focus as you tab through.
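The server-side check and routing described above could look roughly like the following Python sketch. Everything named here is an assumption for illustration: the email addresses, the "website" honey pot field, the "subject" form key, and the function names are hypothetical, and my real handler is not shown in these notes.

```python
HONEYPOT_FIELDS = ("website",)            # hypothetical trap field names
INBOX = "contact@example.com"             # hypothetical normal recipient
SPAM_REVIEW = "spam-review@example.com"   # hypothetical review recipient


def is_spam(form: dict[str, str]) -> bool:
    """A real visitor never sees the trap, so any value in it flags a bot."""
    return any(form.get(name, "").strip() for name in HONEYPOT_FIELDS)


def route(form: dict[str, str]) -> tuple[str, str]:
    """Return (recipient, subject) for one contact form submission."""
    subject = form.get("subject", "Contact form")
    if is_spam(form):
        # Flagged messages go to the review address with a marked subject.
        return SPAM_REVIEW, f"[Spam review] {subject}"
    return INBOX, subject
```

Once the false positive rate looks acceptable, the flagged branch could simply drop the message instead of returning the review address.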