The following table lists the reasons why Sitemap Creator may skip copying a given URL.

| Code | Description | Resolution |
| --- | --- | --- |
| Above Root | The URL is outside the path of the starting URL | Change the starting URL, or enable the Crawl above the root URL option |
| Below Max Depth | The URL depth exceeds the user-defined maximum | Change the Limit crawl depth setting |
| Distance Too Large | The URL is too far from the starting URL | Change the Limit distance from root URL setting |
| Excluded By Rule | The URL is excluded by a user-defined rule | Disable, remove or edit the appropriate rule |
| External | The URL is on a different domain to the starting URL | Either change the crawl mode, or enable the Download all resources setting |
| Failed | The URL failed to download | |
| File Too Large | The content is larger than the user-defined maximum | Reconfigure the Maximum file size setting |
| File Too Small | The content is smaller than the user-defined minimum | Reconfigure the Minimum file size setting |
| Forbidden | The server returned a 403 response | Provide appropriate credentials (see note 1) |
| Invalid URL | A data URI could not be parsed | Raise a support ticket with Cyotek |
| Not Acceptable | The server returned a 406 response | |
| Not Found | The server returned a 404 response | |
| Not Modified | The server returned a 304 response | No resolution required. The resource matches previously downloaded content |
| Redirect | The server returned a response in the range 300 - 399 | No resolution required. Sitemap Creator will automatically follow the redirect |
| Too Many Files | The number of files downloaded is larger than the user-defined maximum | Reconfigure the Maximum number of files setting |
| Too Many Redirects | The server returned a redirect response multiple times | |
| Trust Failure | A certificate could not be validated | Reconfiguring the SSL/TLS settings may allow copying to continue |
| Unsupported Protocol | The URL protocol is not supported by Sitemap Creator | No resolution. Sitemap Creator currently supports the http, https and data protocols |
| Ignored During Scan (see note 2) | The URL represents a file that can't be scanned for more links, and the current operation is an analysis rather than a copy | The file will be downloaded when performing a site copy, unless excluded for other reasons |
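
To make a few of these codes more concrete, the following Python sketch shows one way the URL-based and response-based reasons could be derived. It is an illustration only and does not reflect how Sitemap Creator is implemented. The function names `classify_url` and `classify_status` are hypothetical, as are the assumptions that "different domain" means a different host name, that the root path is the directory of the starting URL, and that a 304 response is reported as Not Modified rather than as a generic Redirect.

```python
from typing import Optional
from urllib.parse import urlsplit

# Protocols the table lists as supported.
SUPPORTED_SCHEMES = {"http", "https", "data"}


def classify_url(url: str, start_url: str) -> Optional[str]:
    """Return a skip code for `url` relative to `start_url`, or None.

    Hypothetical helper: only covers Unsupported Protocol, External
    and Above Root, using simplified assumptions about each rule.
    """
    target = urlsplit(url)
    start = urlsplit(start_url)

    scheme = target.scheme.lower()
    if scheme not in SUPPORTED_SCHEMES:
        return "Unsupported Protocol"
    if scheme == "data":
        # Data URIs carry their own content; host and path rules do not apply.
        return None

    # Assumption: "different domain" is treated as a different host name.
    if target.hostname != start.hostname:
        return "External"

    # Assumption: the root is the directory containing the starting URL.
    root = start.path.rsplit("/", 1)[0] + "/"
    if not (target.path or "/").startswith(root):
        return "Above Root"

    return None


def classify_status(status: int) -> Optional[str]:
    """Map an HTTP status code to a skip code from the table, or None."""
    if status == 304:
        # Assumption: checked before the general 300 - 399 range.
        return "Not Modified"
    if 300 <= status <= 399:
        return "Redirect"
    return {
        403: "Forbidden",
        404: "Not Found",
        406: "Not Acceptable",
    }.get(status)
```

For example, with a starting URL of `https://example.com/docs/index.html`, a link to `https://example.com/about.html` falls outside the `/docs/` path and would be classified as Above Root under these assumptions.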

Notes

  1. A web server may return a 403 response when it detects a non-standard user agent. Reconfiguring the user agent to mimic a web browser can sometimes resolve this (see the related KB article).
  2. This status is always returned when non-HTML content is encountered while scanning (rather than downloading) a website.
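
The user agent behaviour described in note 1 can be checked outside Sitemap Creator. The sketch below, which uses only Python's standard `urllib.request` module, builds a request carrying a browser-like `User-Agent` header. The `BROWSER_UA` string and the `build_request` helper are illustrative assumptions, not values taken from Sitemap Creator.

```python
import urllib.request

# Illustrative browser-like user agent string (an assumption, not a
# value used by Sitemap Creator).
BROWSER_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/120.0 Safari/537.36"
)


def build_request(url: str, user_agent: str = BROWSER_UA) -> urllib.request.Request:
    """Create a request that identifies itself with the given user agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


# To send the request, pass it to urllib.request.urlopen(); a server that
# still rejects it raises urllib.error.HTTPError with code 403.
request = build_request("https://example.com/")
```

If the same URL returns 403 with the default user agent but succeeds with a browser-like one, the server is probably filtering on the user agent rather than requiring credentials.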
