Specifying how a website is crawled

Because it is impractical for WebCopy to crawl the entire Internet, crawling is limited by default to the domain of the crawl URI. This can be changed as required.

From the Project Properties dialog, select the General category
Select a mode from the Crawl Mode group
Optionally, check the Download all resources option to always download linked resources

Option	Notes
Site Only	Only crawls URIs that match the host name specified in the crawl URI
Sub domains	Includes any sub domains of the host URI
Sibling domains	Includes both sub domains and sibling domains of the host URI
Everything	Crawls any HTTP or HTTPS URI detected

The Everything option is not recommended. Use it only on sites that are self-contained, or where rules explicitly exclude addresses. Using this option may cause WebCopy to become unstable.
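
To make the difference between the four crawl modes concrete, the following Python sketch shows one way such a scope check could be expressed. It is not WebCopy's implementation: the function names `in_scope` and `parent_domain`, the mode strings, and the interpretation of "sibling domain" as "any host sharing the crawl host's parent domain" are all assumptions made for illustration. The parent domain is found by naively dropping the left-most label, so hosts under multi-part suffixes such as co.uk are not handled correctly.

```python
from urllib.parse import urlsplit


def parent_domain(host: str) -> str:
    """Naively drop the left-most label, e.g. 'www.example.com' -> 'example.com'.

    Hypothetical helper: a host with only two labels is returned unchanged.
    """
    _, _, rest = host.partition(".")
    return rest if "." in rest else host


def in_scope(crawl_uri: str, candidate: str, mode: str) -> bool:
    """Return True if `candidate` would be crawled under the given mode.

    `mode` is one of the illustrative strings 'site_only', 'subdomains',
    'sibling_domains' or 'everything' (assumed names, not WebCopy settings).
    """
    cand = urlsplit(candidate)
    # Every mode in the table only deals with HTTP or HTTPS URIs.
    if cand.scheme not in ("http", "https"):
        return False

    root = (urlsplit(crawl_uri).hostname or "").lower()
    host = (cand.hostname or "").lower()

    if mode == "everything":
        return True
    if host == root:
        # The crawl host itself is in scope for all remaining modes.
        return True
    if mode == "site_only":
        return False

    is_subdomain = host.endswith("." + root)
    if mode == "subdomains":
        return is_subdomain
    if mode == "sibling_domains":
        # Assumption: a sibling shares the crawl host's parent domain.
        parent = parent_domain(root)
        return is_subdomain or host == parent or host.endswith("." + parent)

    raise ValueError(f"unknown mode: {mode!r}")
```

Under these assumptions, with a crawl URI of https://www.example.com/, a link to https://blog.example.com/ is rejected by the Site Only and Sub domains modes but accepted by Sibling domains, while a link to https://other.org/ is accepted only by Everything.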

Downloading all resources

While you may not wish to crawl external sites, it is still possible to download any files directly linked from the site you are crawling. When the Download all resources option is set, WebCopy will automatically download any external file, as long as the reported content type is not text/html. The downloaded file will not be crawled, allowing easy downloading of linked images, sounds and other files.
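
The rule described above can be sketched as a small decision function. This is only an illustration of the stated behaviour, not WebCopy's code: the name `should_download_external` and its parameters are hypothetical, and treating a missing content type as "not text/html" is an assumption the source does not confirm.

```python
from typing import Optional


def should_download_external(content_type: Optional[str], download_all: bool) -> bool:
    """Decide whether an out-of-scope (external) file should be downloaded.

    `content_type` is the Content-Type value reported by the server, which
    may carry parameters such as '; charset=utf-8'.
    """
    if not download_all:
        # Without the option, external files are left alone.
        return False
    # Keep only the media type, ignoring parameters and letter case.
    media_type = (content_type or "").split(";", 1)[0].strip().lower()
    # Assumption: an absent content type counts as "not text/html".
    return media_type != "text/html"
```

For example, an external link reported as image/png would be downloaded when the option is set, whereas one reported as text/html; charset=utf-8 would not, because downloading it would amount to copying a page from a site outside the crawl.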

See Also

Customizing Projects

Advanced Project Customization