Block everything, or just bad bots?
Diana Kamkina • March 26, 2021
The question might seem simple: just block everything to be on the safe side, right? Well, not quite. There are a few nuances to consider.

This article explores the characteristics of good and bad bots to help you decide which bots, if any at all, should be blocked to protect your business.
Bad bots are taking over

According to the GlobalDots 2019 Bad Bot Report, almost 39% of all online traffic comes from bots: 20.4% is generated by bad bots and 17.5% by good ones. The share of bot traffic has grown dramatically over the last few years, and bad bots are growing faster than good ones. The pandemic has only made these trends more pronounced.
Which bots are good?

Typically, online resource owners face numerous bad bots: click-fraud bots that imitate human clicks, botnets for DDoS attacks, web scrapers that steal price lists and content, spy bots that collect critical business data, zombie bots that search for vulnerabilities, sneaker bots that buy up limited-edition products and tickets, farming bots found in games, spambots and many others.

Whilst most bad bots are well known across various industries, good bots aren't quite as famous. It's always the bad guys who get all the glory, huh? Let's change that and explore some good bots.

In this article, we highlight only those bot groups that actively generate online traffic and requests. Chatbots, for example, are a large group excluded for this reason.

Here are several main groups to consider:
Search engine bots (web crawlers)
Search engines build these bots to crawl websites and index pages so they can be delivered properly in search results. Along the way they check things like images, content and plagiarism / copyright. Crawlers generate about 30% of global online traffic. The most obvious example is Googlebot.
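Since a request's user-agent header is trivially forged, Google documents a reverse-then-forward DNS check to confirm that a visitor claiming to be Googlebot really is one. A minimal sketch in Python (the hostname suffixes follow Google's published guidance; the function name is ours):

```python
import socket

def is_genuine_googlebot(ip: str) -> bool:
    """Reverse-then-forward DNS check, per Google's published guidance."""
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)  # reverse DNS lookup
    except OSError:
        return False
    # Genuine Googlebot hosts live under these domains
    if not hostname.endswith((".googlebot.com", ".google.com")):
        return False
    try:
        # Forward-confirm: the hostname must resolve back to the same IP
        return ip in socket.gethostbyname_ex(hostname)[2]
    except OSError:
        return False
```

The forward confirmation matters: an attacker can control the reverse DNS record for their own IP, but not make `googlebot.com` hostnames resolve back to it.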

Media robots
They focus on accessing and saving frequently updated information such as news, weather and currency rates. Examples: Amazon Echo, Google Home, Siri, Alice and news agency bots.
Copyright bots
These bots check web content for plagiarism (articles, videos and photos used without source references) and can be found on social media and other platforms with user-generated content. A good example is YouTube's Content ID, which scans videos for copyrighted content.

Trading bots
They can work both for buyers and sellers, helping them find the best deals and offers (like Google Shopping service). Browser extensions - for marketplaces and for searching discount coupons - are also in this category.
These are just some of the biggest groups. There are also bots that react faster than any human, which is why we have gaming bots, bots for online auctions, and so on.

In addition, Variti's whitelist includes bots from banks and payment systems that help automate payments, bots that build web page previews in social networks and messengers, bots that monitor server status and service availability, bots that search for restricted content, and AdSense bots that scan pages to assign relevant advertising.
Giving way to good bots

Good bots aren't good merely because they don't commit fraud; they bring clear benefits. For instance, a crawler helps website pages get indexed faster and ranked higher in search results. Without it, customers may never find you online.

This is why a practice called whitelisting is used. We add the good bots listed above to a global whitelist, which our users can enable or disable as they wish. The whitelist grants good bots access to your online resources without inspection and allows them to perform the necessary actions.
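Conceptually, the two-tier setup boils down to a simple check. This is a hypothetical sketch (the names, structure and bot list are illustrative, not Variti's actual API):

```python
# Illustrative global whitelist of well-known good bots
GLOBAL_WHITELIST = {"Googlebot", "Bingbot", "YandexBot", "facebookexternalhit"}

def make_bot_filter(global_enabled: bool = True,
                    local_whitelist: frozenset = frozenset()):
    """Return a predicate answering: should this bot skip inspection?"""
    def allow_without_inspection(bot_name: str) -> bool:
        # The global list can be toggled per user; the local list always applies
        if global_enabled and bot_name in GLOBAL_WHITELIST:
            return True
        return bot_name in local_whitelist
    return allow_without_inspection
```

A user who disables the global list but still needs a known backup bot through would pass it in locally: `make_bot_filter(global_enabled=False, local_whitelist=frozenset({"BackupBot"}))`.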

In addition to the global whitelist, we also have a local whitelist where users can exempt specific IP addresses, locations, devices and even particular bots from blocking, for example to grant third-party services access, make database backups or collect data.
Good bots can misbehave

Good bots in attackers' hands can be dangerous. Take the two-factor authentication systems many organisations now use: one common authorisation method is to send an access code via SMS. Each SMS costs the organisation money, so bots firing off a large volume of requests can drain its budget.
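A common defence against this kind of budget-draining abuse is to rate-limit code requests per identifier. This is a toy sketch under our own assumptions, not a description of any particular product:

```python
import time
from collections import defaultdict

class SmsRateLimiter:
    """Allow at most `limit` SMS codes per phone number per `window` seconds."""

    def __init__(self, limit: int = 3, window: float = 3600.0):
        self.limit = limit
        self.window = window
        self.sent = defaultdict(list)  # phone -> timestamps of recent sends

    def allow(self, phone: str, now: float = None) -> bool:
        now = time.monotonic() if now is None else now
        # Keep only sends still inside the window
        recent = [t for t in self.sent[phone] if now - t < self.window]
        self.sent[phone] = recent
        if len(recent) >= self.limit:
            return False  # over the limit: refuse to send another SMS
        recent.append(now)
        return True
```

A sliding window like this caps the worst-case SMS spend per number while leaving legitimate users, who rarely request more than a couple of codes, unaffected.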

Another example is bulk mailing of messages with images. For security reasons, email services pre-process images via proxy servers and transcode them before delivery, to protect recipients from being tracked by IP address, device, geolocation and so on. The resulting storm of requests for static content may be mistaken for a DDoS attack. Know your mailing service's specifics and prepare for bulk mailings by getting recommendations from its support team.
Picking the bad bots out of the crowd

It's quite easy to distinguish a bot from a human if you have the right skills, experience and technology. But distinguishing a bad bot from a good one is much more complex, because bad bots use a variety of camouflage methods.

The GlobalDots 2019 Bad Bot Report found 523 different types of bot disguise. Most bad bots (55.4%) pretend to be Google Chrome, with Firefox second and the Android mobile browser third. The list also includes Safari, Internet Explorer, Safari Mobile, the Opera mobile browser, the Googlebot and Bingbot crawlers, and many others.

Identifying and segregating bots is also tricky because almost 74% of bad bots are advanced persistent bots (APBs): complex bots that combine multiple technologies and attack methods. They are the hardest to detect because they come from different subnetworks, change IP addresses randomly, and hide behind anonymous proxy servers, JavaScript and peer-to-peer networks. These sophisticated bots can automatically search for the information and vulnerabilities they need, and often mimic human behaviour convincingly.

In addition, botnet operators usually have access to settings and can modify the environment before launching attacks. It is therefore highly ineffective to identify bots by monitoring request logs alone, as some antivirus products do.

At Variti, we believe it is important to correlate data from many sources, including IP address, statistics, technical metrics, behaviour patterns and many other factors. For example, we pay attention to the peculiarities of code execution and browser extensions, since bots do not behave exactly like real browsers.

In general, identifying an advanced, sophisticated bad bot takes a combination of technical and behavioural analysis, statistics and reputation data... and of course a little bit of magic.
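To make the idea of combining signals concrete, here is a toy weighted-scoring sketch. The signal names, weights and threshold are entirely illustrative; real detection systems, Variti's included, are far more involved:

```python
# Illustrative weights for a few weak bot signals (sum to 100)
SIGNAL_WEIGHTS = {
    "ua_fingerprint_mismatch": 35,  # claimed browser doesn't match how the client runs code
    "datacenter_ip": 25,            # IP belongs to a hosting provider, not a consumer ISP
    "no_interaction_events": 20,    # no human-like mouse/scroll activity observed
    "abnormal_request_rate": 20,    # far above typical per-session request rates
}

def bot_score(signals: set) -> int:
    """Sum the weights of every signal that fired (0-100)."""
    return sum(w for name, w in SIGNAL_WEIGHTS.items() if name in signals)

def is_likely_bot(signals: set, threshold: int = 50) -> bool:
    return bot_score(signals) >= threshold
```

The point of weighting is that no single signal is decisive: a data-centre IP alone might be a corporate proxy, but a data-centre IP plus a mismatched fingerprint tips the balance.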
So should I block all, or just bad bots?

In short, we strongly suggest not blocking everything: block only those pesky bad bots.

As explored in this article, good bots bring numerous benefits and should not be blocked. However, we have also seen how sophisticated bad bots can be, which makes them very difficult to block without expert help. Work with a solution that blocks all bad bots whilst ensuring that genuine human and good bot traffic gets through smoothly, like us here at Variti, for example.

You might also assume that small one- or two-page websites are of no interest to cyber criminals and are therefore safe. Quite the opposite: statistics show that bad bots account for 22.9% of traffic on small websites, versus 17.9% on large online resources. So even a small business should have a risk management strategy and block the bad bots.
To understand more about how automated attacks can affect your business and how Variti solutions can help you combat them, get in touch today.