4 Effective Technical Methods to Spam Proof Your Google Analytics Data

Ray Wang

To gain actionable insights from your Google Analytics, you must first ensure that your data is clean and not skewed by bots and spammers. If your Google Analytics data is affected by bots and spammers, website traffic and other vital data will be inflated and will not provide an accurate reflection of your website performance.

To spam proof your Google Analytics data, here are four technical methods you should use:

1. Block All Known Bots

Google Analytics has a built-in feature that blocks website traffic from all known bots. The list of known bots comes from the IAB/ABC International Spiders and Bots List. The members of the Spiders & Robots Policy Board are technology experts from Google, Microsoft, Adobe and other leading global technology companies.

To use Google Analytics’ built-in feature, go to the Admin section and click on View Settings. Under Bot Filtering, you will see a check box labeled Exclude all hits from known bots and spiders. Check it and save the change, and Google Analytics will exclude traffic from known bots in that view.

2. Block Language Spam

Another way to identify and block spam traffic is to go to the Language section in your Google Analytics account and block languages that are spammy.

Here’s an example of a spammy language:

[Screenshot: example of a spammy language entry in Google Analytics]

The easiest way to identify spammy languages is to look for values that contain spaces, because legitimate languages are displayed as short codes using letters, dashes and/or numbers. Example: en-us.

To block traffic from spammy languages, create a filter using this regular expression: \s+ (This expression matches one or more whitespace characters, so the filter blocks traffic from all languages that contain a space.)

To create the filter, go to your admin section, click on Filter and click on +ADD FILTER.

[Screenshot: adding a new filter for spammy languages in Google Analytics]

Then, enter a Filter Name, click on Custom, choose Exclude, select Language Settings under Filter Field, and enter \s+ in the Filter Pattern field. Once you have taken these steps, click on Save.
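
To see what the whitespace pattern \s+ would catch before saving the filter, you can try it against a few language values. The following is a minimal Python sketch, not part of Google Analytics itself. It assumes the filter matches anywhere in the field, and the sample values (including the spammy one) are hypothetical.

```python
import re

# One or more whitespace characters anywhere in the language value.
LANGUAGE_SPAM = re.compile(r"\s+")

def is_spammy_language(value: str) -> bool:
    """Return True if the language value contains whitespace."""
    return LANGUAGE_SPAM.search(value) is not None

# Hypothetical values as they might appear in the Language report.
samples = ["en-us", "fr", "zh-cn", "Visit example.com now", "es-419"]

# Keep only the values the filter would exclude.
spam = [value for value in samples if is_spammy_language(value)]
print(spam)
```

Only the value that contains spaces is flagged. Normal language codes such as en-us or es-419 pass through.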

[Screenshot: language spam filter settings in Google Analytics]

3. Block Crawler Spam

Crawlers sometimes act like browsers: they visit your website through “external websites”, and Google Analytics records their traffic as referral traffic. The crawlers’ objective is to get you to visit their webpages, since you may visit those websites when you see them in your Google Analytics account.

It is difficult to block all crawler spam traffic because neither you nor other webmasters can maintain a comprehensive list of spammy referral websites. You can, however, block traffic from websites whose domains contain keywords that are common among spammy websites.

To block such sites, use the following regular expression:

(videos|buttons)-for-your|share-?button|buttons-for(-your)?-website|semalt|ranksonic|timer4web|anticrawler|dailyrank|sitevaluation|forum69|profit.xyz|checkpagerank|keywords-monitoring|kings-analytics|responsive-test|fix-website-|top10-way

For more expressions you can use to block crawler spam, visit: https://carloseo.com/removing-google-analytics-spam/.

To use this regular expression to block crawler spam, add a new filter, select the Custom filter type, select Campaign Source as the Filter Field and enter the expression above in the Filter Pattern field.
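
Before applying the filter, it can help to check the expression against a few source values. The Python sketch below does that, with the expression copied verbatim from this article. It assumes the filter matches anywhere in the Campaign Source value and ignores case, and the list of sources is hypothetical.

```python
import re

# The crawler-spam expression from the article, split across lines for readability.
CRAWLER_SPAM = re.compile(
    r"(videos|buttons)-for-your|share-?button|buttons-for(-your)?-website|"
    r"semalt|ranksonic|timer4web|anticrawler|dailyrank|sitevaluation|forum69|"
    r"profit.xyz|checkpagerank|keywords-monitoring|kings-analytics|"
    r"responsive-test|fix-website-|top10-way",
    re.IGNORECASE,  # assumption: the filter is not case sensitive
)

def is_crawler_spam(source: str) -> bool:
    """Return True if the source contains one of the spam keywords."""
    return CRAWLER_SPAM.search(source) is not None

# Hypothetical Campaign Source values.
sources = [
    "google",
    "semalt.semalt.com",
    "buttons-for-website.com",
    "newsletter",
    "free-share-buttons.com",
]

flagged = [source for source in sources if is_crawler_spam(source)]
print(flagged)
```

Checking the expression this way shows which sources it would exclude and confirms that legitimate sources such as google are left alone.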

[Screenshot: crawler spam filter settings in Google Analytics]

4. Exclude Unknown Hostname

A hostname is the unique name given to a computer connected to the Internet. For example, my hostname is www.rwdigital.ca. The hostname consists of two parts. The first part is the local name, which is www. The second part is the domain name which is rwdigital.ca.

In your Google Analytics account, you can see website traffic based on hostname by going to Audience > Technology > Network.

[Screenshot: hostname report in Google Analytics]

You should only see your own hostnames in this section, such as www.rwdigital.ca or rwdigital.ca, and you should not see external domains such as www.xyz.co or www.seo-hacker.in. If you see external domains, it means that other websites have generated random Google Analytics codes, one of them matches your Google Analytics code, and that code has been added to their websites. If other websites have implemented your Google Analytics code on their websites, your Google Analytics data will be skewed.

To prevent the skew, create a filter like the ones above, but choose Include as the filter type, select Hostname as the Filter Field and enter your hostname expression in the Filter Pattern field, so that only hits from your own hostnames are kept.
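
As an illustration of what a hostname expression might look like, here is a short Python sketch. The pattern ^(www\.)?rwdigital\.ca$ is an assumption built from the example hostnames in this article, not a setting taken from a real account, and you would replace it with your own domain. The sketch treats the filter as an include rule: only matching hostnames are kept.

```python
import re

# Hypothetical include pattern: the bare domain with an optional "www." prefix.
VALID_HOSTNAME = re.compile(r"^(www\.)?rwdigital\.ca$", re.IGNORECASE)

def is_valid_hostname(hostname: str) -> bool:
    """Return True if the hostname is one of the site's own hostnames."""
    return VALID_HOSTNAME.search(hostname) is not None

# Hostnames as they might appear in the report, including the article's spam examples.
hostnames = [
    "www.rwdigital.ca",
    "rwdigital.ca",
    "www.xyz.co",
    "www.seo-hacker.in",
    "(not set)",
]

# Only hits from these hostnames would remain after the filter.
kept = [hostname for hostname in hostnames if is_valid_hostname(hostname)]
print(kept)
```

Anchoring the pattern with ^ and $ and escaping the dots keeps look-alike domains from slipping through. If your site also runs on other subdomains, the pattern needs to include them too.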

By using the four technical spam-proofing methods above, you can keep your data clean and accurate and use it to produce actionable insights!

If you want an audit of your Google Analytics to ensure that it is spam proof, please email us at raywang@rwdigital.ca or submit a form below.
