By default, WebCopy will only scan the primary host you specify, for example http://example.com.
If you need to copy non-HTML resources from other domains (for example a CDN), this is normally handled automatically by the Download all resources option. WebCopy can also automatically crawl HTML located on sub domains and sibling domains.
Important
Some project settings are ignored when crawling additional domains, for example crawling above the root URL.
Automatically crawling sub or sibling domains
- From the Project Properties dialogue, select the General category
- Select a mode from the Crawl Mode group
| Option | Notes |
| --- | --- |
| Site Only | Only crawls URLs that match the host name specified in the crawl URL |
| Sub domains | Includes any sub domains of the host URL |
| Sibling domains | Includes both sub domains and sibling domains of the host URL |
| Everything | Crawls any discovered HTTP or HTTPS URL unless excluded via other settings |
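WebCopy is a GUI application and does not expose this behaviour as code, but the following Python sketch illustrates how the four crawl modes might scope a discovered URL against the base host. The names used here (`CrawlMode`, `is_in_scope`, `registered_domain`) are hypothetical and for illustration only; they are not part of WebCopy.

```python
from enum import Enum
from urllib.parse import urlsplit


class CrawlMode(Enum):
    SITE_ONLY = 1        # host must match the crawl URL exactly
    SUB_DOMAINS = 2      # host may be a sub domain of the crawl URL
    SIBLING_DOMAINS = 3  # host may be a sub or sibling domain
    EVERYTHING = 4       # any HTTP/HTTPS host is in scope


def registered_domain(host: str) -> str:
    # Naive "parent domain" guess: the last two labels (example.com).
    # Real public-suffix handling (e.g. co.uk) would need a suffix list.
    return ".".join(host.split(".")[-2:])


def is_in_scope(base_url: str, candidate_url: str, mode: CrawlMode) -> bool:
    base_host = urlsplit(base_url).hostname or ""
    candidate = urlsplit(candidate_url)
    if candidate.scheme not in ("http", "https") or not candidate.hostname:
        return False
    host = candidate.hostname
    if mode is CrawlMode.EVERYTHING:
        return True
    if mode is CrawlMode.SITE_ONLY:
        return host == base_host
    if mode is CrawlMode.SUB_DOMAINS:
        return host == base_host or host.endswith("." + base_host)
    # SIBLING_DOMAINS: any host sharing the same parent domain
    return registered_domain(host) == registered_domain(base_host)


# With a base of http://www.example.com:
#   SITE_ONLY       -> only www.example.com
#   SUB_DOMAINS     -> www.example.com, static.www.example.com, ...
#   SIBLING_DOMAINS -> also cdn.example.com, blog.example.com, ...
#   EVERYTHING      -> any http(s) URL
print(is_in_scope("http://www.example.com", "http://cdn.example.com/a.css",
                  CrawlMode.SIBLING_DOMAINS))  # True
```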
Important
Regardless of the crawl mode selected above, if the Download all resources option is checked, WebCopy will still query resources on other domains and download any non-HTML content, unless the URL is excluded by custom rules.
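As a rough sketch of the behaviour described in the note above (reusing the hypothetical `is_in_scope` helper from the earlier sketch, and taking the rule check and HTML detection as caller-supplied functions rather than WebCopy's actual internals), the decision to fetch a given URL might look like this:

```python
def should_fetch(base_url, url, mode, download_all_resources,
                 is_excluded_by_rules, looks_like_html):
    # URLs excluded by custom rules are never fetched.
    if is_excluded_by_rules(url):
        return False
    # In-scope URLs are crawled normally.
    if is_in_scope(base_url, url, mode):
        return True
    # Out-of-scope URLs: only non-HTML resources are downloaded,
    # and only when "Download all resources" is enabled.
    return download_all_resources and not looks_like_html(url)


# For example, a CSS file hosted on an unrelated CDN is still fetched
# when "Download all resources" is enabled, even in Site Only mode:
# should_fetch("http://www.example.com", "http://cdn.other.net/site.css",
#              CrawlMode.SITE_ONLY, True,
#              is_excluded_by_rules=lambda u: False,
#              looks_like_html=lambda u: u.endswith((".htm", ".html")))
# -> True
```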
Important
The Everything option is not recommended; it should only be used on sites that are self-contained, or where rules explicitly exclude addresses. Using this option may cause WebCopy to become unstable.
See Also
Configuring the Crawler
Working with local files
- Extracting inline data
- Remapping extensions
- Remapping local files
- Updating local time stamps
- Using query string parameters in local filenames
Controlling the crawl
- Content types
- Crawling multiple URLs
- Crawling outside the base URL
- Downloading all resources
- Including additional domains
- Limiting downloads by file count
- Limiting downloads by size
- Limiting scans by depth
- Limiting scans by distance
- Scanning data attributes
- Setting speed limits
- Working with Rules
JavaScript
Security
- Crawling private areas
- Manually logging into a website
- TLS/SSL certificate options
- Working with Forms
- Working with Passwords
Modifying URLs
Creating a site map
Advanced
- Aborting the crawl using HTTP status codes
- Cookies
- Defining custom headers
- HEAD vs GET for preliminary requests
- HTTP Compression
- Origin reports
- Redirects
- Saving link data in a Crawler Project
- Setting the web page language
- Specifying a User Agent
- Specifying accepted content types
- Using Keep-Alive