The following table outlines the reasons why WebCopy may skip copying a given URL.
Code | Description | Resolution |
---|---|---|
Above Root | The URL is outside the path of the starting URL | Change the starting URL, or enable the Crawl above the root URL option |
Below Max Depth | The URL depth exceeds the user-defined maximum | Change the Limit crawl depth setting |
Distance Too Large | The URL is too far from the starting URL | Change the Limit distance from root URL setting |
Excluded By Rule | The URL is excluded by a user defined rule | Disable, remove or edit the appropriate rule |
External | The URL is on a different domain to the starting URL | Either change the crawl mode, or enable the Download all resources setting |
Failed | The URL failed to download | |
File Too Large | The content is larger than the user-defined maximum | Reconfigure the Maximum file size setting |
File Too Small | The content is smaller than the user-defined minimum | Reconfigure the Minimum file size setting |
Forbidden | The server returned a 403 response | Provide appropriate credentials [^1] |
Invalid URL | A data URI could not be parsed | Raise a support ticket with Cyotek |
Not Acceptable | The server returned a 406 response | |
Not Found | The server returned a 404 response | |
Not Modified | The server returned a 304 response | No resolution required. The resource matches previously downloaded content |
Redirect | The server returned a response in the range 300 - 399 | No resolution required. WebCopy will automatically follow the redirect |
Too Many Files | The number of files downloaded is larger than the user-defined maximum | Reconfigure the Maximum number of files setting |
Too Many Redirects | The server returned a redirect response multiple times | |
Trust Failure | A certificate could not be validated | Reconfiguring SSL/TLS settings may allow copying to continue |
Unsupported Protocol | The URL protocol is not supported by WebCopy | No resolution. WebCopy currently supports the http, https and data protocols |
Ignored During Scan [^2] | The URL represents a file that can't be scanned for more links, and the current operation is an analysis rather than a copy | The file will be downloaded when performing a site copy, unless excluded for other reasons |
[^1]: Sometimes a web server may return a 403 response when it detects a non-standard user agent. Reconfiguring the user agent to mimic a web browser can sometimes resolve this, as illustrated in the sketch below; see the related KB article for more information.
[^2]: This status is always returned when doing a scan (not a download) of a website and non-HTML content is encountered.
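
For the Forbidden case above, the following is a minimal diagnostic sketch, separate from WebCopy itself, that compares the status code a server returns for a URL with the default client user agent versus a browser-like one. The URL and the User-Agent string are placeholder assumptions; substitute the URL that WebCopy skipped and whatever agent string you intend to configure.

```python
# Minimal diagnostic sketch (not part of WebCopy): compare the status code a
# server returns for a URL using the default urllib user agent versus a
# browser-like one. URL and BROWSER_UA are placeholder values.
from urllib import request
from urllib.error import HTTPError

URL = "https://example.com/some/page"  # placeholder: the URL WebCopy skipped

# Placeholder browser-like User-Agent string; any mainstream browser value works.
BROWSER_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36"
)


def status_for(url, user_agent=None):
    """Return the HTTP status code for a GET request, optionally sending a custom User-Agent."""
    headers = {"User-Agent": user_agent} if user_agent else {}
    req = request.Request(url, headers=headers)
    try:
        with request.urlopen(req, timeout=10) as response:
            return response.status
    except HTTPError as err:
        # 4xx/5xx responses raise HTTPError; the status code is still available.
        return err.code


if __name__ == "__main__":
    print("Default user agent:     ", status_for(URL))
    print("Browser-like user agent:", status_for(URL, BROWSER_UA))
```

If the browser-like request succeeds where the default one returns 403, reconfiguring the user agent in WebCopy, as described in the first note above, is likely to resolve the skip.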