The third tutorial covers rules. Rules allow you to control which parts of a website are downloaded and which are skipped.
This tutorial assumes you have followed the steps in the first tutorial.
When the copy has finished, the Skipped table will show that all URLs containing .gif
were skipped. A yellow icon indicates that the file was skipped due to a rule.
If you copy the website now, you'll get the same results as before. However, the rule is now a little more robust: instead of blindly ignoring any URL containing .gif, it will only ignore URLs in which .gif is followed by the end of the URL, a bookmark, or a query string, such as:
http://somewhere.com/test.gif
http://somewhere.com/test.gif#bookmark
http://somewhere.com/test.gif?value1=a
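To see why the stricter form is safer, here is a minimal Python sketch that compares a loose "contains .gif" rule with a stricter one. Both patterns, the `is_skipped` helper, and the `test.gifts.html` URL are assumptions made for illustration; they are not the expressions entered in the tutorial. WebCopy uses its own regular expression engine, so treat this only as an approximation of how such a rule behaves.

```python
import re

# Assumed loose rule: matches ".gif" anywhere in the URL.
LOOSE_RULE = re.compile(r"\.gif")

# Assumed stricter rule: ".gif" must be followed by the end of the URL,
# a bookmark (#) or a query string (?).
STRICT_RULE = re.compile(r"\.gif($|[#?])")


def is_skipped(url: str, rule: re.Pattern) -> bool:
    """Return True if the rule matches anywhere in the URL."""
    return rule.search(url) is not None


urls = [
    "http://somewhere.com/test.gif",
    "http://somewhere.com/test.gif#bookmark",
    "http://somewhere.com/test.gif?value1=a",
    "http://somewhere.com/test.gifts.html",  # hypothetical non-image URL
]

for url in urls:
    print(url, is_skipped(url, LOOSE_RULE), is_skipped(url, STRICT_RULE))
```

Both rules match the three example URLs, but only the loose rule matches the hypothetical `test.gifts.html` page, which is not a GIF image at all.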
By entering regular expressions as rules, you have powerful control over what content is downloaded and what content is skipped. WebCopy includes a regular expression editor to help build and test rule expressions.
For another example of how to use rules to control the crawl, see the how to only copy images example topic.