If you are using Advanced Analytics, Vanilla collects traffic statistics for your site using a service called Keen.io. The rules for what Vanilla Analytics counts are very simple:
- Any requested page that responds with an HTTP 200 status code is counted.
- The initiator of the page load can be any user or guest.
- The initiator can even be a bot.
You might be comparing your Vanilla Analytics statistics with, for example, Google Analytics and finding some discrepancies between the two. Google Analytics can be configured in multiple ways to return different data. We have no control over that. Most likely Google filters out crawlers, most notably their own.
As stated, Vanilla's Analytics tool does not filter robots. It is easy to underestimate how much traffic comes from crawlers like Googlebot. Here is a typical picture of a 15-day period for one site, where we query all the traffic in our access logs excluding web crawlers:
And here is the same period but showing ONLY crawlers:
Together that makes almost one million hits, of which almost a quarter million come from bots. The majority of those bots come from Google and so would be excluded from Google Analytics.
In this example you can see that crawler traffic is fairly consistent. It is not always like that. There are often periods of intense crawler activity followed by periods of relatively little crawler traffic.
Google Analytics
There are other factors that can skew Google Analytics statistics. From the Wikipedia article on Google Analytics:
[There are] many ad filtering programs and extensions such as Firefox's Enhanced Tracking Protection, the browser extension NoScript and the mobile phone app Disconnect Mobile can block the Google Analytics Tracking Code. This prevents some traffic and users from being tracked and leads to holes in the collected data.
Users with privacy concerns can delete or block tracking cookies which would affect Google's ability to collect accurate data.
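Using the Vanilla Analytics API
You can get more granular data from Vanilla Analytics through our API. Send a POST request to see how much of your traffic comes from a given browser or crawler. The following request counts the page views recorded for Googlebot over a two-week period: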
curl --location --request POST 'https://forum.yoursite.com/api/v2/analytics/query' \
  --header 'Authorization: Bearer your-access-token' \
  --header 'Content-Type: application/json' \
  --data-raw '{
    "collection": "page",
    "start": "2020-05-01T12:00:00.000Z",
    "end": "2020-05-15T12:00:00.000Z",
    "filters": [
      {
        "prop": "userAgentParsed.browser.family",
        "op": "eq",
        "val": "Googlebot"
      }
    ],
    "type": "count"
  }'
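If you want to repeat this query for several crawlers and see what share of your traffic each one represents, a small shell helper can save some typing. The sketch below is only an illustration: the function names build_count_query and percent_of are hypothetical, the request body simply mirrors the fields used in the example above, and the total hit count is something you supply yourself.

```shell
# Sketch only: helper names are hypothetical and not part of Vanilla's tooling.

# build_count_query START END FAMILY
# Prints a JSON body for a "count" query on the "page" collection, filtered
# by browser family. Uses the same fields as the curl example in this article.
# Note: FAMILY is inserted as-is, so it must not contain quotes or backslashes.
build_count_query() {
  start="$1"
  end="$2"
  family="$3"
  printf '{"collection":"page","start":"%s","end":"%s","filters":[{"prop":"userAgentParsed.browser.family","op":"eq","val":"%s"}],"type":"count"}\n' \
    "$start" "$end" "$family"
}

# percent_of PART TOTAL
# Prints PART as a percentage of TOTAL with one decimal place (0.0 if TOTAL is 0).
percent_of() {
  awk -v part="$1" -v total="$2" \
    'BEGIN { printf "%.1f\n", ((total > 0) ? 100 * part / total : 0) }'
}

# Example use (not executed here; the URL and token are the placeholders
# from the curl example above):
#   curl --request POST 'https://forum.yoursite.com/api/v2/analytics/query' \
#     --header 'Authorization: Bearer your-access-token' \
#     --header 'Content-Type: application/json' \
#     --data-raw "$(build_count_query 2020-05-01T12:00:00.000Z 2020-05-15T12:00:00.000Z Googlebot)"
```

With round numbers similar to the example earlier in this article, percent_of 250000 1000000 prints 25.0, meaning roughly a quarter of all hits came from crawlers.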