If there are sections of your site that have no inbound links, but which you still want to be crawled, you can specify additional seed URLs for the crawl process. The list of additional URLs can be entered directly in the project, or read from an external file.
To customise additional root URLs
- From the Project Properties dialog, select the Additional URLs category
Defining the URL list within the Sitemap Creator project
- Select the Load additional URLs from list option
- Enter the URLs to process, one per line
Tip
Blank lines, or lines beginning with # or ;, will be ignored
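For example, the following list defines two additional seed URLs along with comment and blank lines that the crawler will skip (the example.com addresses are purely illustrative):

    # orphaned pages with no inbound links
    https://www.example.com/orphaned-page.html

    ; legacy section
    https://www.example.com/archive/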
Storing the URL list within an external text file
- Select the Load additional URLs from file option
- Enter or select the name of the external file
Note
The filename must be fully qualified, i.e. it must include the full path, not just the file name
Tip
Blank lines in the external file, or lines beginning with # or ;, will be ignored
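The external file uses the same one-URL-per-line format as the inline list. For example, a file such as C:\Projects\additional-urls.txt (the path and addresses are illustrative) might contain:

    # additional seed URLs for the crawl
    https://www.example.com/orphans/
    ; promotional landing pages
    https://www.example.com/promo/spring.html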
See Also
Configuring the Crawler
Working with local files
- Extracting inline data
- Remapping extensions
- Remapping local files
- Updating local time stamps
- Using query string parameters in local filenames
Controlling the crawl
- Content types
- Crawling above the root URL
- Including additional domains
- Including sub and sibling domains
- Limiting downloads by file count
- Limiting downloads by size
- Limiting scans by depth
- Limiting scans by distance
- Scanning data attributes
- Setting speed limits
- Working with Rules
JavaScript
Security
- Crawling private areas
- Manually logging into a website
- TLS/SSL certificate options
- Working with Forms
- Working with Passwords
Modifying URLs
Advanced
- Aborting the crawl using HTTP status codes
- Cookies
- Defining custom headers
- Following redirects
- HEAD vs GET for preliminary requests
- HTTP Compression
- Modifying page titles
- Origin reports
- Overwriting read only files
- Saving link data in a Crawler Project
- Setting the web page language
- Specifying a User Agent
- Specifying accepted content types
- Using Keep-Alive