The following are changes currently implemented in development builds of Cyotek WebCopy. Please note that these changes are not final and may be removed prior to the next stable release. If you have any comments about these forthcoming changes, please contact us.

Nightly builds featuring these changes are available from the Downloads page.

Because WebCopy now determines whether or not to process a given URL differently, WebCopy 1.9 may behave differently from previous versions. Please report any inconsistencies to us!

Added

  • It is now possible to read additional URLs to scan from a text file [#281] (User Manual)
  • Added no-directories, max-redirect and header arguments (User Manual)
  • Added proxy, proxy-user and proxy-password arguments [#337] (User Manual)
  • Added input-file argument [#282] (User Manual)
  • Added Redirects To column to the results list
  • Added Local File column to the files list
  • Added Local File, Redirects To, Depth and Distance columns to links lists
  • List views now display a configuration menu when context clicking a column header
  • The GUI client now supports many of the same command line arguments as the CLI [#403] (User Manual)
  • Added a new extension remap mode, Only HTML [#365]. This new option will change the extensions of downloaded files only if the content type is text/html; all other files will be left as-is (User Manual). This setting is now also the default for new WebCopy projects
  • Added a new validator to try to detect unsupported websites [#407]
  • Added new URL normalisation options for forcing HTTPS [#383] (User Manual)
  • Added new URL normalisation option for ignoring case [#202] (User Manual)
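To illustrate what the two URL normalisation options above mean in practice, here is a minimal Python sketch. It is not WebCopy's implementation; the function name `normalise_url` and its `force_https` and `ignore_case` parameters are hypothetical names chosen for this example, and the exact rules WebCopy applies may differ.

```python
from urllib.parse import urlsplit, urlunsplit


def normalise_url(url, force_https=False, ignore_case=False):
    """Return a normalised form of url (illustrative sketch only)."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    # Host names are case-insensitive. This sketch assumes the URL has
    # no user:password part, which would be case-sensitive.
    netloc = parts.netloc.lower()
    path = parts.path
    query = parts.query

    if force_https and scheme == "http":
        scheme = "https"
        # An explicit default HTTP port makes no sense once upgraded.
        if netloc.endswith(":80"):
            netloc = netloc[:-3]

    if ignore_case:
        # Treat /About and /about as the same resource.
        path = path.lower()
        query = query.lower()

    return urlunsplit((scheme, netloc, path, query, parts.fragment))


print(normalise_url("http://Example.com/Page.HTML", force_https=True))
print(normalise_url("https://example.com/About?Q=1", ignore_case=True))
```

With case left significant, `/About` and `/about` would be crawled as two separate URLs; enabling the ignore-case behaviour folds them into one.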

Changed

  • Adding multiple URLs to scan is now easier using a free text field [#282]
  • Command line tools now report unknown parameters
  • Major reworking of internal decision making logic [#242]
  • New WebCopy projects will default to enabling Brotli compression support
  • New WebCopy projects will default to saving headers
  • The sitemap tree now limits the number of child elements to a maximum of 100 by default [#402]. This setting can be changed in the application options
  • Documentation updates
  • Rule Tester dialogue now includes rule components
  • Reworked setting validation
  • Rule expressions are now validated before crawling a site
  • WebCopy no longer treats URLs as case-insensitive for new projects
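The idea of validating rule expressions before a crawl starts can be sketched as follows. This is an assumption-laden illustration in Python: `validate_rules` is a hypothetical helper, the patterns are made up, and Python's regular expression dialect is not identical to the one WebCopy uses.

```python
import re


def validate_rules(patterns):
    """Compile every pattern up front and collect any failures.

    Returns a list of (index, pattern, message) tuples; an empty
    list means every expression compiled successfully.
    """
    errors = []
    for index, pattern in enumerate(patterns):
        try:
            re.compile(pattern)
        except re.error as exc:
            errors.append((index, pattern, str(exc)))
    return errors


# Hypothetical rule set: the second expression has an unclosed group.
problems = validate_rules([r"\.pdf$", r"(unclosed"])
for index, pattern, message in problems:
    print(f"rule {index}: {pattern!r} is invalid: {message}")
```

Checking every expression before the first request is made means a malformed rule is reported immediately rather than surfacing part-way through a long crawl.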

Removed

  • The Link Checker (GUI and CLI), URI Tester and XPath Tester tools have been removed from distribution due to lack of use

Fixed

  • Only the last argument error was displayed when running command line tools
  • WebCopy will now retry URLs that fail with "The server committed a protocol violation" exceptions
  • If using the default user agent, WebCopy will now try a default browser agent if a 401 response is returned when validating the URL [#382]
  • When issuing a 401 challenge dialogue, WebCopy could include additional header information in the description
  • The Move Down button was incorrectly enabled when adding a new password entry, causing a crash if clicked [#394]
  • Fixed a pair of conditions that could cause site map generation to nest the same tree until it crashed [#391]. This should also resolve a different crash that could occur when generating a site diagram [#397]
  • Cookie editor now does a better job of validating entered values
  • Invalid cookies should no longer cause a crash [#396]
  • WebCopy would sometimes remove file extensions that weren't really extensions [#327]
  • Various performance improvements, both major and minor [#399, #404]
  • Last modified date is now read from meta tags if available [#405]
  • Cancelling a crawl should now abort any in-progress downloads
  • Fixed an issue where reading from a hybrid stream returned null bytes up to the stream capacity after exhausting existing data
  • The rule editor could allow you to select conflicting options
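The hybrid stream fix above belongs to a general class of bug: a read that is bounded by the buffer's capacity instead of by the amount of data actually written. The internals of WebCopy's hybrid stream are not described here, so the following Python sketch uses a hypothetical `GrowableBuffer` class purely to show the correct clamping.

```python
class GrowableBuffer:
    """Byte buffer whose capacity may exceed the data written to it."""

    def __init__(self, capacity=16):
        self._data = bytearray(capacity)  # zero-filled backing store
        self._length = 0                  # bytes actually written
        self._position = 0                # current read offset

    def write(self, chunk):
        end = self._length + len(chunk)
        if end > len(self._data):
            # Grow by at least the shortfall, doubling where possible.
            grow_by = max(end - len(self._data), len(self._data))
            self._data.extend(bytes(grow_by))
        self._data[self._length:end] = chunk
        self._length = end

    def read(self, count):
        # Clamp to the written length, NOT the capacity. Clamping to
        # len(self._data) instead would hand back trailing null bytes.
        available = self._length - self._position
        count = max(0, min(count, available))
        start = self._position
        self._position += count
        return bytes(self._data[start:start + count])


buf = GrowableBuffer(capacity=8)
buf.write(b"abc")
print(buf.read(8))  # only the three written bytes
print(buf.read(8))  # nothing left to read
```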

The following changes are a sneak peek of features that are currently being experimented with in separate branches from the core product. Because of this, they are not available in nightly builds (and there is no guarantee they will be merged into the core branch).

Added

  • Crawl state persistence (pause, resume) [#165]

Changed

  • Crawling now uses breadth first when prioritising enqueued items
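As a rough illustration of breadth-first prioritisation, the Python sketch below visits every URL at one link depth before moving to the next. It is not the experimental branch's code; `crawl_order` and the in-memory `links` mapping are hypothetical stand-ins for a real crawler's queue and link extraction.

```python
from collections import deque


def crawl_order(start, links):
    """Return the order in which URLs are visited, breadth first.

    links maps each URL to the list of URLs found on that page.
    """
    queue = deque([start])
    seen = {start}
    order = []
    while queue:
        url = queue.popleft()  # oldest item first: FIFO gives breadth first
        order.append(url)
        for target in links.get(url, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order


site = {"/": ["/a", "/b"], "/a": ["/a/1"], "/b": ["/b/1"]}
print(crawl_order("/", site))
```

Taking items from the same end they were added to (a stack) would give depth-first order instead, following one branch to its end before returning to its siblings.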
