
With so many bots, who are you marketing to?

  • RR

    Roy Rosenfeld

    over 5 years ago #

    I've spent the last 2 years helping marketers clean out bots as the VP of Product at DoubleVerify, a VC-backed ad-tech company that authenticates advertising transactions. I recently left to start my own venture and was amazed to see how bots affected our experiments with Facebook ads. In fact, when we cleaned the bots out, our true conversion rates almost doubled and it was much easier to 1) understand experiments and 2) better optimize ads & budget. More details below, and happy to answer any questions or share some of the code we used to solve this issue.

    We were using both Google Analytics & Mixpanel, and that didn't really help to begin with. Yes, GA might be able to filter out some bots if you're using ‘Exclude all hits from known bots and spiders’ as @dylan mentioned, but keep in mind that's just the "known" ones, which is a *very partial* list the IAB curates. Here are some of the additional steps we took:

    1 - Look at the user agent string. Anything that contains crawler/bot can be filtered out (a rough sketch of this follows the list).
    2 - Look at "easy to track" behaviours on-site. @dylan mentioned 0 engagement which is a great start.
    3 - [this made the biggest difference for us] add some code to filter more sophisticated bots.
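
    Since I offered to share code, here's a minimal sketch of step 1 in TypeScript. The substrings checked below are illustrative assumptions, not the exact list we used:

    ```ts
    // Step 1: drop anything that self-identifies as a crawler/bot in its user agent.
    // The pattern list is an assumption for illustration only.
    const BOT_UA_PATTERNS = [/bot/i, /crawler/i, /spider/i, /headless/i];

    function looksLikeBotUserAgent(userAgent: string): boolean {
      return BOT_UA_PATTERNS.some((pattern) => pattern.test(userAgent));
    }

    // Example: skip conversion tracking for obvious crawlers.
    if (typeof navigator !== "undefined" && looksLikeBotUserAgent(navigator.userAgent)) {
      console.log("Likely bot - excluding from tracking");
    }
    ```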

    #3 is important so I'll dig a little bit more into that: bots exist not just to scrape content, but also to generate fake engagement on social networks and create fake impressions on shady websites. We call them "Fraudulent Bots" and it's relevant because, in order to look like real users, those bots also click on ads and end up on your landing pages, from Facebook and elsewhere (they also affect re-targeting).
    Because it's a cat-and-mouse game between bots and bot-hunters, bots got more sophisticated over time (the bot operators are motivated by the huge amounts of money they can make if they evade detection). This sophistication shows up as bots that don't bounce right off your landing page but instead actually interact with the site, or at least stay on it for a while. They might scroll, or even click certain elements on the page, which makes them even harder to detect.

    What you can do to identify those will depend on how sophisticated you wish to get and how much time you want to spend on this. Because we had the methodology to do this from our previous experience we implemented the following measurements: Is the browser tab active? How much time was the user on the page? How far down was the page scrolled? How much time was the mouse moving? We also added invisible links and measured if they were clicked.
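
    To make those measurements concrete, here's a rough sketch of how that kind of instrumentation can look in the browser (TypeScript). The names, intervals, and trap-link approach are illustrative placeholders, not our production code:

    ```ts
    // A rough sketch of the on-page measurements described above.
    interface EngagementSignals {
      msVisible: number;          // time the tab was in the foreground
      msOnPage: number;           // total time since page load
      maxScrollDepth: number;     // deepest scroll position, as a fraction of page height
      msMouseActive: number;      // accumulated time the mouse was moving
      clickedHiddenLink: boolean; // bots often "click" links humans can't see
    }

    const signals: EngagementSignals = {
      msVisible: 0,
      msOnPage: 0,
      maxScrollDepth: 0,
      msMouseActive: 0,
      clickedHiddenLink: false,
    };

    const start = Date.now();
    let lastMouseMove = 0;

    // Is the browser tab active? How long has the user been on the page?
    setInterval(() => {
      signals.msOnPage = Date.now() - start;
      if (document.visibilityState === "visible") signals.msVisible += 1000;
    }, 1000);

    // How far down was the page scrolled?
    document.addEventListener("scroll", () => {
      const depth =
        (window.scrollY + window.innerHeight) / document.documentElement.scrollHeight;
      signals.maxScrollDepth = Math.max(signals.maxScrollDepth, depth);
    });

    // How much time was the mouse moving? Gaps are capped so pauses don't count.
    document.addEventListener("mousemove", () => {
      const now = Date.now();
      if (lastMouseMove > 0) signals.msMouseActive += Math.min(now - lastMouseMove, 100);
      lastMouseMove = now;
    });

    // An invisible link that real users should never click.
    const trap = document.createElement("a");
    trap.href = "#";
    trap.style.position = "absolute";
    trap.style.left = "-9999px";
    trap.addEventListener("click", () => { signals.clickedHiddenLink = true; });
    document.body.appendChild(trap);
    ```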

    We added custom events going to Mixpanel for all of those measurements and then easily created a new type of user we called "human" based on the data we saw. At that point we were finally able to experiment & optimize according to how real users engaged with our site.
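
    And here's a sketch of wiring those signals into Mixpanel, building on the signals object from the sketch above. The event name, property names, and thresholds are placeholders; the init/track/register calls are the standard mixpanel-browser API:

    ```ts
    import mixpanel from "mixpanel-browser";

    // Same shape as the signals object from the previous sketch.
    interface EngagementSignals {
      msVisible: number;
      msOnPage: number;
      maxScrollDepth: number;
      msMouseActive: number;
      clickedHiddenLink: boolean;
    }

    mixpanel.init("YOUR_PROJECT_TOKEN"); // placeholder token

    function reportEngagement(s: EngagementSignals): void {
      // Send the raw measurements as a custom event.
      mixpanel.track("engagement_snapshot", { ...s });

      // Crude, illustrative "human" heuristic - not our actual thresholds.
      const isHuman =
        s.msVisible > 5000 &&
        (s.maxScrollDepth > 0.2 || s.msMouseActive > 1000) &&
        !s.clickedHiddenLink;

      // Attach the flag as a super property so later events can be segmented on it.
      mixpanel.register({ visitor_type: isHuman ? "human" : "suspected_bot" });
    }
    ```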

    • DL

      Dylan La Com

      over 5 years ago #

      Thanks for sharing @rosroy. Were you worried about filtering out false positives? Also, I would love to hear what the gating factors were for being counted as human in your Mixpanel setup. Were they static factors across your whole site, or dynamic?

      • RR

        Roy Rosenfeld

        over 5 years ago #

        Good points there, @dylan. To weed out false positives (as much as possible) and figure out the right engagement threshold for the "bot or not" decision, we did the following:
        1 - We constructed a control group of humans. We had a lightbox show up for some users to check if they're human (i.e. click the frog image, with a frog, cat, and dog to choose from; a rough sketch of this follows the list). We also had actual users who signed up, whom we knew were human.
        2 - We constructed a control group of bots. We used the same lightbox mentioned earlier but looked for the reverse behavior (or no behavior at all while the visitor stayed on the page for a while), and also identified clicks on invisible links.
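
        Here's a rough sketch of that lightbox check in TypeScript. The image paths, copy, and the 60-second timeout are illustrative placeholders, not the original implementation:

        ```ts
        // Lightbox used to label visitors into "human" and "bot" control groups.
        function showHumanCheck(onResult: (label: "human" | "bot") => void): void {
          const box = document.createElement("div");
          box.textContent = "Click the frog";

          for (const animal of ["frog", "cat", "dog"]) {
            const img = document.createElement("img");
            img.src = `/images/${animal}.png`; // hypothetical asset paths
            img.addEventListener("click", () => {
              // Picking the frog goes into the human control group;
              // anything else (including automated clicks) goes into the bot group.
              onResult(animal === "frog" ? "human" : "bot");
              box.remove();
            });
            box.appendChild(img);
          }

          document.body.appendChild(box);

          // No interaction at all while the visitor "stays" on the page for a while
          // also feeds the bot control group.
          setTimeout(() => {
            if (document.body.contains(box)) onResult("bot");
          }, 60000);
        }
        ```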

        We then looked at a wide variety of engagement stats which I mentioned earlier - and the distinctions between the groups were big enough for us to set thresholds without much concern for false positives.

        We only had a landing page at the time, so I can't attest to different behaviors on different pages, but generally speaking bots are likely to exhibit similar characteristics after landing on your site, regardless of the page.

  • MA

    Matt Ackerson

    over 5 years ago #

    Anyone know how much of that bot traffic Google Analytics filters out?

    • SE

      Sean Ellis

      over 5 years ago #

      I don't think GA does a very good job automatically filtering the bot traffic. We've manually set up a bunch of filters on GA to try to stop counting the bot traffic for GrowthHackers.com. We were probably hitting about 30% of our uniques being bots. @dylan any additional context on this?

      • MH

        Mark Hayes

        over 5 years ago #

        Agreed with @sean on this, I am having to filter out a ton of bot traffic at the moment in GA for one client.

    • DL

      Dylan La Com

      over 5 years ago #

      Sure! @sean @matt_petovera

      For us, the only bot traffic I've seen evade GA's built-in filters and significantly impact visits can be identified by looking for visits under the 'c' language. You'll see these visits are: 100% unique visitors, 100% direct, 100% bounce rate, and 0 seconds time on site. It's pretty obvious it's bot traffic once you see those metrics.

      GA has an option in the admin settings called 'Exclude all hits from known bots and spiders'. This actually does a decent job of filtering out bot traffic, but I can't say exactly how effective it is. Even so, the 'c' language visits still need to be filtered out. You can read more about doing that here http://radianweb.com.au/blog/the-importance-of-removing-c-language-results-from-google-analytics.html
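
      If you pull GA data programmatically rather than through view filters, you can exclude those visits in the query itself. A minimal sketch against the v3 Core Reporting API, with a placeholder view ID and access token:

      ```ts
      // Query GA sessions while excluding the suspicious 'c' language visits.
      const VIEW_ID = "ga:VIEW_ID";                 // placeholder
      const ACCESS_TOKEN = "YOUR_OAUTH_ACCESS_TOKEN"; // placeholder

      const params = new URLSearchParams({
        ids: VIEW_ID,
        "start-date": "30daysAgo",
        "end-date": "today",
        metrics: "ga:sessions,ga:bounceRate,ga:avgSessionDuration",
        dimensions: "ga:language",
        filters: "ga:language!=c", // drop the 'c' language bot traffic
      });

      fetch(`https://www.googleapis.com/analytics/v3/data/ga?${params}`, {
        headers: { Authorization: `Bearer ${ACCESS_TOKEN}` },
      })
        .then((res) => res.json())
        .then((data) => console.log(data.rows));
      ```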

      If you want to go bot-hunting in GA for any smaller perpetrators, I'd start by looking at visits that have a 100% bounce rate or 0 seconds time on site. And if you find any, please report back! :)

    • DL

      Dylan La Com

      over 5 years ago #

      Sorry about the @ mention - seems not to work with underscores. Filing a bug report for that.

    • GG

      Gail Gardner

      over 5 years ago #

      It could have improved, but when I first installed WordFence on my blog, started blocking fake Google bots, comment spammers, hackers, and scrapers, and then password-protected my login page, the traffic Google counted dropped by 80%.

    • GG

      Gail Gardner

      over 4 years ago #

      I know I'm late to this party - I was searching for the new bot traffic report and found this one instead. I can tell you that when we put a password pop-up on my site and tightened up WordFence settings, my traffic dropped 80% in Google Analytics. So I'd say it does a really poor job of filtering out bots.

  • TM

    Taylor Miles

    over 5 years ago #

    I have had great experience with Cloudflare.com combined with the non-default 'Exclude all hits from known bots and spiders' setting in GA. It would be an interesting test to compare before-and-after numbers with both Cloudflare and the GA bot filter.

  • DL

    David Leonhardt

    over 5 years ago #

    In anticipation of The Matrix!
