WebCopy will attempt to download any document it can find on a given website. Supported documents, such as HTML pages or style sheets, are also scanned to detect additional resources, such as images, videos, and file downloads. The tables below describe the default rules that WebCopy uses to scan content.

HTML Scanning Rules

These rules apply to any document with a content type of text/html.

Element    Attribute    Notes
(Any)      href         The href attribute of the base element is not directly scanned.
(Any)      src
(Any)      style        Content is parsed according to the CSS Scanning Rules in the next section.
img        srcset
meta       content      Only meta elements containing an http-equiv attribute with the value refresh will be scanned.
object     codebase
object     data
param      movie        Only if param is a child of object.
source     srcset       Only if source is a child of picture.
style      n/a          Content is parsed according to the CSS Scanning Rules in the next section.
video      poster
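
As an illustration only, the HTML rules above can be approximated with Python's standard html.parser module. This is a minimal sketch, not WebCopy's actual implementation: the names LinkScanner and srcset_urls are hypothetical, the param rule is assumed to mean <param name="movie" value="...">, srcset values are split naively on commas, and style attributes and style elements are left out for brevity.

```python
from html.parser import HTMLParser
import re

# Elements that never have a closing tag, so they are not pushed on the stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "param", "source", "track", "wbr"}
# Pulls the target out of a refresh value such as "5; url=next.html".
REFRESH_URL = re.compile(r"url\s*=\s*['\"]?([^'\";]+)", re.IGNORECASE)


def srcset_urls(value):
    # Simplification: split candidates on commas, keep the URL before any descriptor.
    return [part.split()[0] for part in value.split(",") if part.strip()]


class LinkScanner(HTMLParser):
    """Hypothetical scanner that collects URLs according to the table above."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._open = []  # names of currently open elements, innermost last

    def handle_starttag(self, tag, attrs):
        a = {name: value for name, value in attrs if value}
        parent = self._open[-1] if self._open else None
        if tag != "base" and "href" in a:        # (Any) href, except base
            self.links.append(a["href"])
        if "src" in a:                           # (Any) src
            self.links.append(a["src"])
        if "srcset" in a and (tag == "img" or (tag == "source" and parent == "picture")):
            self.links.extend(srcset_urls(a["srcset"]))
        if tag == "meta" and a.get("http-equiv", "").lower() == "refresh":
            match = REFRESH_URL.search(a.get("content", ""))
            if match:
                self.links.append(match.group(1).strip())
        if tag == "object":
            self.links.extend(a[n] for n in ("codebase", "data") if n in a)
        # Assumption: the "movie" rule refers to <param name="movie" value="...">.
        if (tag == "param" and parent == "object"
                and a.get("name", "").lower() == "movie" and "value" in a):
            self.links.append(a["value"])
        if tag == "video" and "poster" in a:
            self.links.append(a["poster"])
        if tag not in VOID:
            self._open.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open element; ignore stray end tags.
        if tag in self._open:
            while self._open.pop() != tag:
                pass
```

To use the sketch, create a LinkScanner, pass the page text to its feed() method, and read the collected values from its links attribute. The values are returned exactly as written in the page, so relative URLs would still need to be resolved against the document (or base element) address.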

In addition to the above rules, you can configure your own using either simple attributes or more complex XPath expressions.

CSS Scanning Rules

These rules apply to any document with a content type of text/css. Note that content within CSS comments (/* ... */) is currently ignored.

Directive / Selector    Value    Notes
@import                 (Any)    Supports URLs wrapped in url() or just a standalone URL.
(Any)                   url()    Any property which uses the url() syntax will be scanned. The inner value can be wrapped in single quotes, double quotes, or left unquoted.
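
As an illustration only, the CSS rules above can be approximated with two regular expressions in Python. This is a minimal sketch, not WebCopy's actual parser: the function name scan_css is hypothetical, a "standalone URL" is assumed to be a quoted string following @import, and comments are stripped naively (a /* sequence inside a quoted string would be mishandled).

```python
import re

# CSS comments, including multi-line ones; their content is ignored.
COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)

# Either an @import followed by a quoted string, or any url(...) value
# written with single quotes, double quotes, or no quotes.
URL_PATTERN = re.compile(
    r"""@import\s+(?:'([^']*)'|"([^"]*)")"""
    r"""|url\(\s*(?:'([^']*)'|"([^"]*)"|([^'")\s]+))\s*\)""",
    re.IGNORECASE,
)


def scan_css(text):
    """Return the URLs found in a style sheet, in document order."""
    text = COMMENT.sub("", text)
    found = []
    for match in URL_PATTERN.finditer(text):
        # Exactly one capture group is set for each match.
        url = next((g for g in match.groups() if g is not None), "")
        if url:
            found.append(url)
    return found
```

For example, a style sheet containing @import "a.css"; and background: url('c.png'); would yield both values, while a url() that appears only inside a comment would be skipped.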

Additional content types can be supported via plugins.

© 2010-2024 Cyotek Ltd. All Rights Reserved.
Documentation version 1.10 (buildref #186.15944), last modified 2024-08-18. Generated 2024-08-18 08:01 using Cyotek HelpWrite Professional version 6.20.0