WebCopy can create report files that identify the original location of each downloaded resource.

No origin reports

To disable origin report generation

  1. From the Project Properties dialog, select the Advanced category
  2. Set the Origin report field to None
  3. Ensure the Add to source HTML option is not checked

Embedded origin reports

To embed the origin in the source document

  1. From the Project Properties dialog, select the Advanced category
  2. Set the Origin report field to None. Alternatively, to create file-based reports in addition to embedding, select one of the other options
  3. Check the Add to source HTML option

Note

Currently, only HTML documents support embedded origins.

Creating one origin report per URL

To create one origin report for each unique URL

  1. From the Project Properties dialog, select the Advanced category
  2. Set the Origin report field to Create a single file for each URL
  3. Optionally, check the Add to source HTML option to include embedded origin reports where applicable, in addition to the file-based report

When a file is downloaded, the origin will be written to a file with the same name as the local file name, but with a .origin.txt suffix. For example, the origin report for index.html would be index.html.origin.txt.

Each report includes the remote URL, the fully qualified local file name, and the content type of the resource.
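The naming rule for per-URL reports can be sketched in a few lines of Python. This is only an illustration for scripts that post-process a finished copy; it is not part of WebCopy. The function names origin_report_path and files_missing_reports are hypothetical, and the sketch assumes each report sits in the same folder as the file it describes.

```python
from pathlib import Path

ORIGIN_SUFFIX = ".origin.txt"  # suffix described in the text above


def origin_report_path(local_file: Path) -> Path:
    """Return the expected per-URL origin report path for a downloaded file."""
    # The suffix is appended to the full file name, so the original
    # extension is kept: index.html -> index.html.origin.txt
    return local_file.with_name(local_file.name + ORIGIN_SUFFIX)


def files_missing_reports(save_folder: Path) -> list[Path]:
    """List files under save_folder that have no matching origin report.

    Assumption: reports are stored next to the files they describe.
    """
    missing = []
    for path in sorted(save_folder.rglob("*")):
        # Skip folders and the report files themselves.
        if not path.is_file() or path.name.endswith(ORIGIN_SUFFIX):
            continue
        if not origin_report_path(path).exists():
            missing.append(path)
    return missing
```

For example, origin_report_path(Path("index.html")) yields index.html.origin.txt, matching the example above.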

Creating one origin report for the project

To create one origin report that covers every URL in the project

  1. From the Project Properties dialog, select the Advanced category
  2. Set the Origin report field to Create a single report for the entire project
  3. Optionally, check the Add to source HTML option to include embedded origin reports where applicable, in addition to the file-based report

After the site has been downloaded, an origin report containing all processed URLs will be written to a file named webcopy-origin.txt, located in the save folder.

Each entry in the report includes the remote URL, the fully qualified local file name, and the content type of the resource.
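A script that runs after the download can locate the project-wide report from the save folder alone. The following Python sketch is illustrative only: the function names are hypothetical, and UTF-8 is an assumed encoding, since the report's encoding and exact layout are not specified here. For that reason the text is returned unparsed.

```python
from pathlib import Path
from typing import Optional

PROJECT_REPORT_NAME = "webcopy-origin.txt"  # file name given in the text above


def project_report_path(save_folder: Path) -> Path:
    """Return the expected location of the project-wide origin report."""
    return save_folder / PROJECT_REPORT_NAME


def read_project_report(save_folder: Path) -> Optional[str]:
    """Return the raw report text, or None if no report has been written."""
    report = project_report_path(save_folder)
    if not report.is_file():
        return None
    # Assumption: UTF-8. Undecodable bytes are replaced rather than raising.
    return report.read_text(encoding="utf-8", errors="replace")
```

A return value of None indicates that the copy has not finished or that the project-wide option was not selected.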
