Many files on a remote web server, such as .php or .aspx, may not behave normally when opened from the local file system. WebCopy can handle this by changing the extensions of downloaded content to match their MIME type. For example, .php and .aspx pages would be renamed to .html.
By default, WebCopy will only remap HTML files (those with the content type text/html). All other files will use the original extension of the URL.
Option | Description |
---|---|
Never | No extension remapping is performed. This option is not recommended |
Always | WebCopy will always try to remap extensions |
Only for HTML | (Default) Only URLs with a content type of text/html will be remapped |
Only if no extension present | URLs without an extension will be remapped; those with an extension will not be modified |
Always, except for the specified exclusions | All URLs will be remapped, except those with a content type of application/octet-stream or those matching a user-defined extension or content type exclusion |
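The modes in the table can be read as a simple decision rule. The following Python sketch is only an illustration of that rule, not WebCopy's actual implementation; the names `RemapMode`, `has_extension`, `should_remap` and the `is_excluded` callback are hypothetical, as is the way the content type parameters are stripped.

```python
import posixpath
from enum import Enum
from urllib.parse import urlsplit


class RemapMode(Enum):
    # Hypothetical names mirroring the options in the table above.
    NEVER = "Never"
    ALWAYS = "Always"
    ONLY_HTML = "Only for HTML"
    ONLY_IF_NO_EXTENSION = "Only if no extension present"
    ALWAYS_EXCEPT_EXCLUSIONS = "Always, except for the specified exclusions"


def has_extension(url):
    """Return True if the path part of the URL ends in a file extension."""
    return posixpath.splitext(urlsplit(url).path)[1] != ""


def should_remap(mode, url, content_type, is_excluded=lambda url, media_type: False):
    """Decide whether a URL's extension would be remapped under a given mode.

    is_excluded is an assumed callback standing in for user-defined exclusions.
    """
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()

    if mode is RemapMode.NEVER:
        return False
    if mode is RemapMode.ALWAYS:
        return True
    if mode is RemapMode.ONLY_HTML:
        return media_type == "text/html"
    if mode is RemapMode.ONLY_IF_NO_EXTENSION:
        return not has_extension(url)
    # ALWAYS_EXCEPT_EXCLUSIONS: application/octet-stream is always excluded.
    if media_type == "application/octet-stream":
        return False
    return not is_excluded(url, media_type)
```

For instance, under the default mode a call such as `should_remap(RemapMode.ONLY_HTML, "https://example.com/index.php", "text/html")` returns `True`, while the same URL served as `image/png` would be left alone.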
Changing the extension remap mode
- From the Project Properties dialogue, select the Local Files category
- In the Remap file extensions by content type group, select an appropriate option
Specifying exclusions
- From the Project Properties dialogue, select the Local Files category
- In the Remap file extensions by content type group, enter exclusions into the Types to exclude field
Tip
Click Select Types to display a dialogue box for selecting content types, either from those detected in the site to be copied or from a global database.
Tip
- You can either enter file extensions, such as png, into this field, or content types such as image/png
- This field supports wildcards
Important
When the mode is Always, except for the specified exclusions, the application/octet-stream exclusion is implicit and cannot be disabled.
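As a rough illustration of how exclusions of this kind could be evaluated, the sketch below checks a URL and content type against a list of patterns. It rests on several assumptions that this page does not confirm: that wildcards follow shell-style rules (here via Python's `fnmatch`), that an entry containing `/` is a content type and anything else is a file extension, and that matching is case-insensitive. The function name `is_excluded` is hypothetical.

```python
import posixpath
from fnmatch import fnmatchcase
from urllib.parse import urlsplit

# Per the note above, this exclusion is implicit and cannot be disabled.
IMPLICIT_EXCLUSION = "application/octet-stream"


def is_excluded(url, content_type, patterns):
    """Return True if the URL or its content type matches an exclusion.

    patterns is a list of entries such as "png" or "image/*".
    Wildcard semantics are assumed to be shell-style; WebCopy may differ.
    """
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == IMPLICIT_EXCLUSION:
        return True

    # Extension of the URL path without the leading dot, e.g. "png".
    extension = posixpath.splitext(urlsplit(url).path)[1].lstrip(".").lower()

    for pattern in patterns:
        pattern = pattern.strip().lower()
        if "/" in pattern:
            # Treated as a content type pattern, e.g. "image/*".
            if fnmatchcase(media_type, pattern):
                return True
        elif extension and fnmatchcase(extension, pattern.lstrip(".")):
            # Treated as a file extension pattern, e.g. "png" or "jp*".
            return True
    return False
```

With the patterns `["image/*"]`, a URL such as `https://example.com/logo.php` served as `image/png` would be excluded by content type, whereas an entry of `png` would only match URLs whose path ends in `.png`.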
Preserving the original extension
When remapping extensions, WebCopy can keep the original extension. For example, if picture.php had a content type of image/png, the local filename would be picture.php.png. To enable or disable this feature:
- From the Project Properties dialogue, select the Local Files category
- In the Remap file extensions by content type group, check or uncheck the Keep original extension field
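The effect of the Keep original extension option on the local filename can be sketched as follows. This is a minimal illustration, not WebCopy's code: the `EXTENSIONS` table, the `remap_filename` function, and the choice to leave a filename alone when it already carries the target extension are all assumptions made for the example.

```python
import posixpath

# Hypothetical, deliberately tiny content type table for the example.
EXTENSIONS = {
    "text/html": ".html",
    "image/png": ".png",
}


def remap_filename(filename, content_type, keep_original=False):
    """Return the local filename after remapping its extension."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    new_ext = EXTENSIONS.get(media_type)
    if new_ext is None:
        # Unknown content type: keep the name from the URL.
        return filename

    stem, old_ext = posixpath.splitext(filename)
    if old_ext.lower() == new_ext:
        # Assumption: nothing to do when the extension already matches.
        return filename
    if keep_original:
        # Append the new extension after the original one.
        return filename + new_ext
    # Replace the original extension.
    return stem + new_ext


print(remap_filename("picture.php", "image/png", keep_original=True))  # picture.php.png
print(remap_filename("picture.php", "image/png"))                      # picture.png
```

Keeping the original extension makes it easier to trace a local file back to its source URL, at the cost of longer filenames.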