How does Insites pick the pages it tests?
When running a report, the Insites spider picks the pages it downloads using our proprietary prioritisation algorithm. The exact way this works is complicated and commercially sensitive. However, we can share some high-level details to help you understand why it picks the pages it does.
The spider is “seeded” with the homepage provided to trigger the audit, as well as the URLs from the XML sitemap (if there is one).
Each time a page is downloaded, the URLs found on that page (for example in links) are added to the list of potential pages to check.
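To make the seeding and discovery steps concrete, here is a minimal Python sketch. It is not the Insites implementation: the function names (seed_frontier, add_discovered_urls), the LinkCollector class and the example.com URLs are all hypothetical, and it assumes that links are plain <a href> attributes and that the sitemap URLs have already been extracted.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags, in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def seed_frontier(homepage, sitemap_urls=()):
    """Start the candidate list with the homepage plus any sitemap URLs."""
    frontier = [homepage]
    for url in sitemap_urls:
        if url not in frontier:
            frontier.append(url)
    return frontier


def add_discovered_urls(frontier, page_url, html):
    """Add links found on a downloaded page to the candidate list."""
    collector = LinkCollector()
    collector.feed(html)
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        if absolute not in frontier:
            frontier.append(absolute)
    return frontier


# Hypothetical usage: seed, then expand with links from the homepage.
frontier = seed_frontier("https://example.com/", ["https://example.com/about"])
add_discovered_urls(
    frontier,
    "https://example.com/",
    '<a href="/contact">Contact</a><a href="/about">About</a>',
)
```

In this sketch the candidate list only grows; deciding which candidate to download next is a separate step.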
Factors used in prioritisation
When Insites picks the next page to check, here are some of the factors it considers:
- How did we find the URL? (sitemap, link from another page, etc.)
- If we found the URL on another page, how far up the page did we find it?
- How often have we seen that URL on the pages we already downloaded?
- Is the page a “priority page”? (see below)
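One way to picture how factors like these could combine is a simple weighted score. The Python sketch below is purely illustrative: the real algorithm is not public, and the weights, the Candidate fields and the score and pick_next functions are all assumptions made up for this example.

```python
from dataclasses import dataclass

# Hypothetical weights, chosen only for illustration.
SOURCE_WEIGHT = {"homepage": 3.0, "sitemap": 2.0, "link": 1.0}


@dataclass
class Candidate:
    url: str
    source: str                # how the URL was found: "homepage", "sitemap" or "link"
    position: float = 1.0      # 0.0 = top of the page it was found on, 1.0 = bottom
    times_seen: int = 1        # number of downloaded pages that mentioned this URL
    is_priority: bool = False  # looks like a "priority page"


def score(candidate):
    """Combine the factors into a single number; higher means check sooner."""
    total = SOURCE_WEIGHT.get(candidate.source, 0.0)
    total += 1.0 - candidate.position             # links higher up the page score more
    total += 0.5 * min(candidate.times_seen, 10)  # frequently seen URLs score more, capped
    if candidate.is_priority:
        total += 2.0
    return total


def pick_next(candidates):
    """Return the candidate with the highest score."""
    return max(candidates, key=score)


# Hypothetical usage: a link seen often and near the top of a page wins here.
candidates = [
    Candidate("https://example.com/footer-link", "link", position=0.9),
    Candidate("https://example.com/pricing", "link", position=0.1, times_seen=4),
    Candidate("https://example.com/archive", "sitemap"),
]
next_page = pick_next(candidates)
```

A linear score is only one possible design; it is used here because it makes the effect of each factor easy to see.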
To produce a “fair” report, the spider will try to check at least one page that appears to be each of the following “priority page” types:
- A blog post
- A contact page
- A product page
- A “service” page
Note that these are only biases – if there are only five pages on the site and one of them is a terms page, the terms page will still be checked as part of the audit.
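The idea of a bias rather than a hard rule can be sketched as follows. This is an assumption-laden illustration, not how Insites actually classifies pages: the URL patterns, the guess_page_type and choose_pages functions and the page budget are hypothetical, and a real classifier would likely look at page content as well as the URL.

```python
import re

# Hypothetical URL patterns for each priority page type.
PAGE_TYPE_PATTERNS = {
    "blog": re.compile(r"/(blog|news|articles?)/", re.I),
    "contact": re.compile(r"/contact", re.I),
    "product": re.compile(r"/(products?|shop|store)/", re.I),
    "service": re.compile(r"/services?", re.I),
}


def guess_page_type(url):
    """Return the first priority type whose pattern matches, else None."""
    for page_type, pattern in PAGE_TYPE_PATTERNS.items():
        if pattern.search(url):
            return page_type
    return None


def choose_pages(urls, budget):
    """Pick up to `budget` URLs, favouring one page of each priority type."""
    chosen = []
    seen_types = set()
    for url in urls:  # first pass: one page per priority type
        page_type = guess_page_type(url)
        if page_type and page_type not in seen_types and len(chosen) < budget:
            chosen.append(url)
            seen_types.add(page_type)
    for url in urls:  # second pass: fill remaining slots in the original order
        if len(chosen) >= budget:
            break
        if url not in chosen:
            chosen.append(url)
    return chosen


site = [
    "https://example.com/",
    "https://example.com/terms",
    "https://example.com/about",
    "https://example.com/blog/hello",
    "https://example.com/contact",
]
small_budget = choose_pages(site, 3)  # blog and contact pages are favoured
full_budget = choose_pages(site, 5)   # every page, including terms, is checked
```

With a budget of three, the blog and contact pages are chosen ahead of the terms page; with a budget that covers the whole site, the terms page is included as well.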