Working with JavaScript-enabled websites

Note

Currently the only supported browser engine is Internet Explorer. Support for Chromium and Gecko will be added in a future update.

WebCopy now includes limited support for crawling websites that are constructed using JavaScript. An external browser engine loads and renders each page, and the existing WebCopy spider then crawls the result.
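To illustrate why a browser engine is needed, the following minimal Python sketch scans only the static markup of a page for links. It is not how WebCopy is implemented; the `LinkCollector` class, the `static_links` function and the sample page are hypothetical and exist only for this example. The link that the script would add at run time is never found, because the script is never executed.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags present in the static markup."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical page: one link is in the markup, one is created by script.
PAGE = """
<html><body>
<a href="/about.html">About</a>
<div id="menu"></div>
<script>
  var a = document.createElement("a");
  a.href = "/products.html";
  document.getElementById("menu").appendChild(a);
</script>
</body></html>
"""


def static_links(html):
    """Return the links visible without executing any JavaScript."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


print(static_links(PAGE))  # prints ['/about.html']
```

A crawler that reads only the markup would miss /products.html in this example, whereas a browser engine that runs the script would see it in the rendered page.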

Important

This feature is currently experimental; it may not be feature complete and may contain bugs. Please contact Support if you have any feedback on this feature.

The only browser engine currently supported is Internet Explorer
Website crawling may be substantially slower
Custom user agents are not supported
Websites will be able to track you and use system cookies
Malicious script served by compromised servers or rogue adverts will execute
The content that WebCopy downloads may be different if the website uses browser sniffing to reduce functionality for IE users
Multi-threaded crawling cannot be used
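The browser sniffing limitation can be pictured with a small Python sketch. This is an assumption about how a website might behave, not a description of any particular site or of WebCopy itself; the function names and page names are hypothetical. Because custom user agents are not supported, the website sees the Internet Explorer user agent and may serve reduced content.

```python
def looks_like_internet_explorer(user_agent):
    """Return True if the User-Agent contains tokens commonly used to spot IE."""
    ua = user_agent.lower()
    return "msie " in ua or "trident/" in ua


def choose_page(user_agent):
    """Hypothetical server-side choice between a reduced and a full page."""
    if looks_like_internet_explorer(user_agent):
        return "basic.html"
    return "full.html"


# Example user agent strings, used here for illustration only.
IE11_UA = "Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko"
OTHER_UA = "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0"

print(choose_page(IE11_UA))   # prints basic.html
print(choose_page(OTHER_UA))  # prints full.html
```

Sniffing can also happen in client-side script rather than on the server; the effect on the downloaded content is similar.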

Unsupported features

User interactions are not supported, for example clicking nodes in a dynamic tree or scrolling a page that uses infinite scroll. There are no plans to add support for performing user interactions dynamically.

Note that this feature still does not provide privileged access to a web server. It is not possible to download the raw source code of a website or its back-end databases unless the website specifically allows this.

Future feature support

As noted, this is currently experimental. Future versions of WebCopy will include options for choosing between specific versions of Chromium or Gecko.

Enabling JavaScript support

To crawl websites with JavaScript execution enabled:

From the Project Properties dialogue, select the Web Browser option group.
Check the Use web browser option.

