WebCopy attempts to download every document it can find on a given website. Supported documents, such as HTML pages or style sheets, are also scanned to detect additional resources, such as images, videos, and file downloads. The tables below describe the default rules that WebCopy uses to scan content.

HTML Scanning Rules

These rules apply to any document with a content type of text/html.

| Element | Attribute | Notes |
|---------|-----------|-------|
| (Any) | href | The href attribute of the base element is not directly scanned. |
| (Any) | src | |
| img | srcset | |
| source | srcset | Only if source is a child of picture. |
| meta | content | Only meta elements containing an http-equiv attribute with the value refresh will be scanned. |
| object | data | |
| object | codebase | |
| (Any) | style | Content is parsed according to the CSS Scanning Rules in the next section. |
| style | n/a | Content is parsed according to the CSS Scanning Rules in the next section. |
| param | movie | Only if param is a child of object. |
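
To make the rules in the table above more concrete, here is a minimal Python sketch of a scanner that applies most of them. It is an illustration only and not WebCopy's implementation: the names `LinkScanner`, `scan_html`, and `VOID_ELEMENTS` are hypothetical, srcset values are split naively on commas, and the style and param rows are left out for brevity.

```python
import re
from html.parser import HTMLParser

# Elements that never have an end tag, so they must not be pushed on the stack.
VOID_ELEMENTS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
                 "link", "meta", "param", "source", "track", "wbr"}


class LinkScanner(HTMLParser):
    """Hypothetical scanner that collects candidate URLs from an HTML document."""

    def __init__(self):
        super().__init__()
        self.urls = []
        self._stack = []  # currently open elements, used for parent checks

    def handle_starttag(self, tag, attrs):
        a = {name: value for name, value in attrs if value is not None}
        parent = self._stack[-1] if self._stack else None

        # (Any) href, except on the base element; (Any) src
        if tag != "base" and "href" in a:
            self.urls.append(a["href"])
        if "src" in a:
            self.urls.append(a["src"])

        # srcset on img, or on source when it is a child of picture
        if "srcset" in a and (tag == "img" or (tag == "source" and parent == "picture")):
            for candidate in a["srcset"].split(","):  # naive split, see lead-in
                parts = candidate.split()
                if parts:
                    self.urls.append(parts[0])

        # meta content, only for http-equiv="refresh"
        if tag == "meta" and a.get("http-equiv", "").lower() == "refresh":
            match = re.search(r"url\s*=\s*['\"]?([^'\";]+)", a.get("content", ""), re.I)
            if match:
                self.urls.append(match.group(1).strip())

        # object data and object codebase
        if tag == "object":
            for name in ("data", "codebase"):
                if name in a:
                    self.urls.append(a[name])

        if tag not in VOID_ELEMENTS:
            self._stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open element, tolerating unclosed children.
        if tag in self._stack:
            while self._stack.pop() != tag:
                pass


def scan_html(text):
    """Return the URLs found in `text`, in document order."""
    scanner = LinkScanner()
    scanner.feed(text)
    return scanner.urls
```

Calling `scan_html(page_source)` returns the raw attribute values; resolving them against the page address (and against any base element) would be a separate step.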

In addition to the above rules, you can configure your own using either simple attributes or more complex XPath expressions.

CSS Scanning Rules

These rules apply to any document with a content type of text/css. Note that content within CSS comments (/* ... */) is currently ignored.

| Directive / Selector | Value | Notes |
|----------------------|-------|-------|
| @import | (Any) | Supports URLs wrapped in url() or just a standalone URL. |
| (Any) | url() | Any property which uses the url() syntax will be scanned. The inner value can be wrapped in single quotes, double quotes or unquoted. |
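
As a rough illustration of the CSS rules above, the following Python sketch strips comments and then collects @import targets and url() values using regular expressions. It is not WebCopy's implementation: `scan_css` and the pattern names are hypothetical, and the regex-based approach is a simplification (for example, it does not handle escaped characters or comment markers inside quoted strings).

```python
import re

# CSS comments are ignored, so they are removed before scanning.
COMMENT = re.compile(r"/\*.*?\*/", re.S)

# url(...) whose inner value is double-quoted, single-quoted, or unquoted.
URL_FUNC = re.compile(r"""url\(\s*(?:"([^"]+)"|'([^']+)'|([^)'"\s]+))\s*\)""", re.I)

# @import followed by a standalone quoted URL; the url() form is caught by URL_FUNC.
IMPORT_BARE = re.compile(r"""@import\s+(?:"([^"]+)"|'([^']+)')""", re.I)


def scan_css(text):
    """Return the URLs referenced by a style sheet, in document order."""
    text = COMMENT.sub(" ", text)
    matches = [m for pattern in (IMPORT_BARE, URL_FUNC) for m in pattern.finditer(text)]
    matches.sort(key=lambda m: m.start())
    # Exactly one capture group is set per match; pick that one.
    return [next(g for g in m.groups() if g is not None) for m in matches]
```

The same function could be applied to the content of a style element or a style attribute, since the HTML rules defer to the CSS rules for both.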

Additional content types can be supported via plugins.