Each time a URL is to be processed, WebCopy performs a number of different checks, including rule execution. However, rule execution is currently "quirky" and may not behave as expected: rules are processed as two independent groups, although the UI currently shows only a single list.
Important
The behaviour of content type based rules may change in future versions of WebCopy to reduce confusion.
- WebCopy enumerates each rule
- If the rule is disabled, it is ignored
- If the rule is configured to test against a content type, it is ignored
- If the rule has a URI expression and the expression does not match the URI being processed, it is ignored
- If the rule is a match, the outcome will be set to the rule condition (e.g. excluded or included)
- If the rule is a match and the Stop Processing flag is configured for the rule, no further rules are processed
- If the rule is not a match, or the Stop Processing flag is not configured, the next rule is tested
The last matched status is then used to determine if the URI should be included or excluded.
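The first pass above can be sketched in Python as follows. This is only an illustration of the listed steps, not WebCopy's actual implementation: the `Rule` class, its field names, and `evaluate_uri_rules` are hypothetical, and treating the URI expression as a regular expression tested with `re.search` is an assumption.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rule:
    """Hypothetical stand-in for a WebCopy rule; not the real data model."""
    exclude: bool                       # outcome applied when the rule matches
    pattern: Optional[str] = None       # URI expression, if any (assumed regex)
    content_type: Optional[str] = None  # set only for content type rules
    enabled: bool = True
    stop_processing: bool = False


def evaluate_uri_rules(uri: str, rules: List[Rule], excluded: bool = False) -> bool:
    """Return True if the URI is excluded after the first pass."""
    for rule in rules:
        if not rule.enabled:
            continue  # disabled rules are ignored
        if rule.content_type is not None:
            continue  # content type rules are ignored in this pass
        if rule.pattern is not None and re.search(rule.pattern, uri) is None:
            continue  # URI expression does not match
        excluded = rule.exclude  # a later match overwrites an earlier one
        if rule.stop_processing:
            break  # Stop Processing: no further rules are tested
    return excluded
```

For instance, with an exclude rule for `/private/` followed by an include rule for `/private/public`, a URL matching both ends up included, because the later match overwrites the earlier one. Setting Stop Processing on the first rule would leave the URL excluded instead.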
If the URL was not excluded by the first pass above (stage 1), a HEAD or GET request will be made to get the headers for additional processing. At this point, rules are processed again as follows:
- WebCopy enumerates each rule
- If the rule is disabled, it is ignored
- If the rule is configured to test against anything other than a content type, it is ignored
- If the rule has a URI expression and the expression does not match the URI being processed, it is ignored
- If the rule is a match for the content type of the URL being processed, the outcome will be set to the rule condition (e.g. excluded or included)
- If the rule is a match and the Stop Processing flag is configured for the rule, no further rules are processed
- If the rule is not a match, or the Stop Processing flag is not configured, the next rule is tested
The last matched status is then used to determine if the URI should be included or excluded.
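The content type pass can be sketched in the same way. Again, this is a hedged illustration rather than WebCopy's code: the `Rule` class and `evaluate_content_type_rules` are hypothetical names, the URI expression is assumed to be a regular expression, and comparing content types case-insensitively after dropping parameters such as `charset` is an assumption about how the match is made.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rule:
    """Hypothetical stand-in for a WebCopy rule; not the real data model."""
    exclude: bool                       # outcome applied when the rule matches
    pattern: Optional[str] = None       # URI expression, if any (assumed regex)
    content_type: Optional[str] = None  # set only for content type rules
    enabled: bool = True
    stop_processing: bool = False


def evaluate_content_type_rules(uri: str, content_type: str,
                                rules: List[Rule], excluded: bool = False) -> bool:
    """Return True if the URI is excluded after the content type pass."""
    # Assumption: compare only the media type, e.g. "text/html; charset=utf-8"
    # is reduced to "text/html" before testing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    for rule in rules:
        if not rule.enabled:
            continue  # disabled rules are ignored
        if rule.content_type is None:
            continue  # only content type rules take part in this pass
        if rule.pattern is not None and re.search(rule.pattern, uri) is None:
            continue  # URI expression does not match
        if rule.content_type.strip().lower() != media_type:
            continue  # content type does not match
        excluded = rule.exclude  # a later match overwrites an earlier one
        if rule.stop_processing:
            break  # Stop Processing: no further rules are tested
    return excluded
```

Because each pass skips the other group's rules entirely, a Stop Processing flag on a URI rule has no effect on content type rules, and vice versa. This is one way the single list shown in the UI can behave differently from what its order suggests.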