Danger, Will Robinson: Not All Bots Are as Friendly as Me

I am probably dating myself in saying that one of my favorite TV shows when I was a kid was “Lost in Space,” which premiered in 1965 and ran for three seasons. For whatever reason, I was drawn into the show, its storyline, setting, and characters. As a young boy, I identified strongly with the kid, Will Robinson, and even more so with his protective friend, the Robot. From that show, I learned that some bots are benevolent and some are fiendish.

Fast forward about 55 years (and a Netflix “Lost in Space” reboot later) to the present day, and we all know that some bots are “good,” such as search engine spiders, while others can be used to launch malicious attacks in the online survey research business.

Increasingly, we hear about how artificial intelligence and robots are going to change our lives, and as market researchers we’ve been promised a whole host of benefits. However, while robots have yet to take over our jobs, they seem to have taken over the jobs of our survey respondents. We have discussed in earlier articles how to minimize traffic from these bots: the methods we’ve suggested include using reCAPTCHA, adding quality control questions to your survey, and communicating with sample providers. However, as time passes, we find that bots are getting smarter, and they may be able to bypass some or all of these measures.

But have no fear: here are four ways to identify bots in your data so you can eradicate these false respondents and ensure the quality of the data you collect for your clients.

1. Atypical open-ended responses: Open-ended questions act as a sort of Turing test for bots. Real respondents usually have no difficulty interpreting a question and answering it correctly, but bots will often give generic answers or seem to answer a slightly tangential question. More sophisticated bots can copy and paste a bit of text from within the survey or from recent news articles to make their answers seem more legitimate. As we know all too well, real respondents often like to give the absolute minimum effort required, so an unnecessarily complex open-ended response can sometimes tip you off.
2. Nonsensical demographic information: Bots coming from the same source often display patterns in their demographic information (e.g., all the “respondents” tend to give the same answers to gender, ethnicity, and income questions). Running demographic crosstabulations can be extremely useful in spotting strange, bot-heavy outlier populations. For example, if you see a sudden influx of Gen Z millionaires (or some other Census-unlikely population) in your sample, there is a good chance that you have a bot issue that warrants further investigation.
3. 10/10 on scale questions, 10/10 times? 10/10 times it’s a bot: We would love it if participants in a satisfaction survey genuinely were 10/10 (extremely satisfied) across every metric, but this pattern usually indicates bad data. Speedsters and bots alike may answer in this fashion, muddying your data. If the time to completion for these responses is impossibly low, you may want to disqualify these respondents to maintain the integrity of your data.
4. Bots strike at night, when your real participants are asleep: Picture this: you’re struggling to wrap up field by tomorrow’s deadline, and you go to bed praying for a miracle in the morning. You wake up and discover that your prayers were answered and that you got all the responses you needed! However, after experiencing this scenario firsthand, we’ve found that some things are too good to be true. Bots often strike overnight (perhaps from different time zones) while your real participants are sound asleep. A flurry of activity late at night often indicates bots.
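
For teams that clean their data programmatically, the four checks above can be sketched in a few lines of Python. This is a minimal illustration, not a production screen: the Response fields, the 120-second completion floor, and the midnight-to-5 a.m. window are all assumptions made up for this example, and repeated open-ended text is used only as a rough stand-in for copy-and-paste answers. The flags are meant to mark responses for a human to review, not to remove anyone automatically.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Response:
    # Hypothetical record layout; adapt the fields to your own survey export.
    respondent_id: str
    open_end: str
    age_band: str
    income_band: str
    scale_answers: list          # e.g. 1-10 satisfaction ratings
    seconds_to_complete: float
    submitted_at: datetime       # assumed to be in the respondents' local time


def normalize(text):
    """Lowercase and collapse whitespace so near-identical pastes match."""
    return " ".join(text.lower().split())


def demographic_crosstab(responses):
    """Count respondents per (age band, income band) cell for manual review."""
    return Counter((r.age_band, r.income_band) for r in responses)


def flag_responses(responses, min_seconds=120, night_hours=range(0, 5)):
    """Return {respondent_id: [reason, ...]} for responses worth a closer look."""
    open_end_counts = Counter(normalize(r.open_end) for r in responses)
    flags = {}
    for r in responses:
        reasons = []
        text = normalize(r.open_end)
        # The same open-ended text from several respondents suggests pasting.
        if text and open_end_counts[text] > 1:
            reasons.append("duplicate_open_end")
        # Identical answers on every scale item, completed implausibly fast.
        if len(set(r.scale_answers)) == 1 and r.seconds_to_complete < min_seconds:
            reasons.append("fast_straightline")
        # Submitted during the assumed overnight window.
        if r.submitted_at.hour in night_hours:
            reasons.append("overnight")
        if reasons:
            flags[r.respondent_id] = reasons
    return flags


# Made-up sample data, for illustration only.
sample = [
    Response("r1", "The checkout was slow on mobile.", "35-44", "$50k-$75k",
             [7, 8, 6, 9], 540, datetime(2023, 5, 2, 14, 10)),
    Response("r2", "Great product great service", "18-24", "$1M+",
             [10, 10, 10, 10], 45, datetime(2023, 5, 3, 2, 15)),
    Response("r3", "great product  GREAT service", "18-24", "$1M+",
             [10, 10, 10, 10], 38, datetime(2023, 5, 3, 2, 17)),
]

flags = flag_responses(sample)
for respondent_id, reasons in flags.items():
    print(respondent_id, reasons)
print(demographic_crosstab(sample).most_common())
```

No single flag proves that a respondent is a bot. A real night owl or a genuinely delighted customer can trip one of them, so it is the combination of flags, together with a look at the crosstab for unlikely populations, that should prompt a closer review.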

Unless you take these measures to carefully assess your data, you may be performing analyses and generating insights based on false responses. You might not even be aware of it, but it is highly likely that some of your responses come from bots. It is estimated that over half of all web traffic comes from bots (source), and a simple internet search for the term “survey bots” yields links to sites advertising their automatic survey-taking platforms. Blissfully unaware of such issues, you may even send your clients data and other deliverables that are riddled with bad responses. While it may take time and effort to clean your data, the energy expended pales in comparison to the potentially disastrous outcome of your clients discovering the bots before you do. Reputation is everything in this industry, and one such mishap could ruin years of building trust.

At Accelerant Research, we take all the steps listed above, and more, to ensure that our data are of the highest quality. We give our clients peace of mind, and they trust that our insights are based on the responses of real participants. We invite you to request a cost estimate from us as a first step. Simply give us a call (704-206-8500) or send us an email (info@accelerantresearch.com).