Editing an existing rule via the Rules Editor
- Select Rules from the Project menu, or press Control+R to display the Rules Editor
- Select a rule from the list
- Edit the rule as desired
- Click OK to save the changes and close the dialogue
Editing an existing rule via the Rules List
- Context click the rule you wish to modify and select Edit
- Edit the rule as desired
- Click OK to save the changes and close the dialogue
Important
If your expression includes any of the `^`, `[`, `.`, `$`, `{`, `*`, `(`, `\`, `+`, `)`, `|`, `?`, `<`, `>` characters and you want them to be processed as plain text, you need to "escape" the character by preceding it with a backslash. For example, if your expression was `application/epub+zip`, this would need to be written as `application/epub\+zip`, otherwise the `+` character would have a special meaning and no matches would be made. Similarly, if the expression was `example.com`, this should be written as `example\.com`, as `.` means "any character", which could lead to unexpected matches.
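The escaping rule above can be illustrated with a short Python sketch. The helper name `escape_expression` is hypothetical and not part of the application, and Python's `re` module is used only as a stand-in for the application's own expression engine, which may differ in detail.

```python
import re

# The characters listed above as needing a backslash to be treated as plain text.
SPECIAL_CHARACTERS = set("^[.${*(\\+)|?<>")


def escape_expression(text: str) -> str:
    """Precede each listed special character with a backslash (hypothetical helper)."""
    return "".join("\\" + ch if ch in SPECIAL_CHARACTERS else ch for ch in text)


content_type = "application/epub+zip"

# Unescaped, "b+" means "one or more b", so the literal text is not found.
unescaped_match = re.search(content_type, content_type)

# Escaped, the "+" is treated as a plain character and the text is found.
escaped_match = re.search(escape_expression(content_type), content_type)

# Unescaped, "." matches any character, so an unrelated string is matched.
loose_match = re.search("example.com", "exampleXcom")
```

Here `unescaped_match` is `None`, `escaped_match` is a match object, and `loose_match` shows the kind of unexpected match the unescaped `.` can produce.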
Compare Options
This table outlines the different compare options available. The example match is based on the following sample address:
http://www.example.com/folder/products?sort=name&order=asc
Option | Description | Example |
---|---|---|
Authority | The URL domain | www.example.com |
Authority, Path, and Query String | The domain, path and query string of the URL | www.example.com/folder/products?sort=name&order=asc |
Content Type | The detected content type of the URL | n/a |
Entire URL | The complete URL | http://www.example.com/folder/products?sort=name&order=asc |
Path | The path of the URL, including file names if applicable | folder/products |
Path and Query String | The path and query string of the URL | folder/products?sort=name&order=asc |
Query String | The query string of the URL | sort=name&order=asc |
Operand | Description |
---|---|
Matches | Specifies the rule will be processed if the given input matches the rule expression |
Does Not Match | Specifies the rule will be processed if the given input does not match the rule expression |
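As a rough illustration of the two tables above, the following Python sketch derives the value each compare option would test from the sample address, then applies an operand. The function names `compare_value` and `rule_applies` are hypothetical. Two assumptions are made: the leading slash is dropped from the path to mirror the examples in the table, and the expression is allowed to match anywhere in the value. Content Type is omitted because it cannot be derived from the URL alone.

```python
import re
from urllib.parse import urlsplit

SAMPLE_URL = "http://www.example.com/folder/products?sort=name&order=asc"


def compare_value(url: str, option: str) -> str:
    """Return the part of the URL that a compare option would test (sketch)."""
    parts = urlsplit(url)
    # Assumption: the table shows paths without a leading slash.
    path = parts.path.lstrip("/")
    path_and_query = f"{path}?{parts.query}" if parts.query else path
    values = {
        "Authority": parts.netloc,
        "Authority, Path, and Query String": f"{parts.netloc}/{path_and_query}",
        "Entire URL": url,
        "Path": path,
        "Path and Query String": path_and_query,
        "Query String": parts.query,
    }
    return values[option]


def rule_applies(value: str, expression: str, operand: str = "Matches") -> bool:
    """Apply the Matches / Does Not Match operand to a value (sketch)."""
    # Assumption: the expression may match anywhere within the value.
    matched = re.search(expression, value) is not None
    return matched if operand == "Matches" else not matched


authority = compare_value(SAMPLE_URL, "Authority")
applies = rule_applies(authority, r"example\.com", "Matches")
```

With the sample address, `authority` is `www.example.com` and `applies` is `True`. Switching the operand to `Does Not Match` inverts the result.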
Rule Options
Option | Description |
---|---|
Enable this rule | Specifies if the rule is enabled or not. Disabled rules will be ignored |
Exclude | Specifies that the URL should be excluded |
Include | Specifies that the URL should be included. This allows you to have a wider rule to exclude content, and then a narrower rule to include specific content. |
Crawl Content | Specifies that although the URL is excluded, its contents should still be scanned (applies to HTML documents only). This means that although a permanent copy of the URL is not downloaded, a temporary copy is still made in order to scan for additional URLs to crawl. |
Don't Crawl Content | Specifies that although the URL is included, its contents should not be scanned (applies to HTML documents only). This means that while a permanent copy of the URL is created, it will not be scanned for additional URLs to crawl. |
Stop processing more rules | By default all rules are processed sequentially. You can use this flag to control this process; if set and the rule is matched, no further rules will be processed |
Download Priority | Allows the download priority for URLs matching the rule to be changed. High priority means the URL will be downloaded immediately, while Low means the URL will be downloaded when all other URLs have been processed.¹ |
¹ The Download Priority option is only supported for rules that match against a URL; it is ignored for rules matching against content types.
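The interaction between Exclude, Include, "Enable this rule" and "Stop processing more rules" can be sketched as below. This is an interpretation of the descriptions above, not the application's actual implementation: it assumes that rules are evaluated in order, that each matching enabled rule overrides the decision made by earlier ones, and that URLs are included unless a rule says otherwise. The `Rule` class and `is_excluded` function are hypothetical names.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    expression: str
    exclude: bool                  # True = Exclude, False = Include
    enabled: bool = True           # "Enable this rule"
    stop_processing: bool = False  # "Stop processing more rules"


def is_excluded(url: str, rules: list[Rule]) -> bool:
    """Process rules sequentially; assumes a later match overrides an earlier one."""
    excluded = False  # Assumption: URLs are included by default.
    for rule in rules:
        if not rule.enabled:
            continue  # Disabled rules are ignored.
        if re.search(rule.expression, url) is None:
            continue
        excluded = rule.exclude
        if rule.stop_processing:
            break  # No further rules are processed once this rule matches.
    return excluded


# A wider rule excludes a section, then a narrower rule re-includes part of it.
rules = [
    Rule(r"/products", exclude=True),
    Rule(r"/products/featured", exclude=False),
]
listing_excluded = is_excluded("http://www.example.com/products/list", rules)
featured_excluded = is_excluded("http://www.example.com/products/featured", rules)
```

Under these assumptions `listing_excluded` is `True` and `featured_excluded` is `False`. Setting `stop_processing=True` on the first rule would prevent the narrower Include rule from ever being reached.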