Example: Copying only images
In the previous tutorial we described how to define rules. This example builds on that and shows how to use rules to crawl an entire website while saving only its images.
To make an image-only copy of a website, we need to configure several rules.
| Expression | Options |
| --- | --- |
| `.*` | Exclude, Crawl Content |
| `\.png` | Include, Stop Processing |
| `\.gif` | Include, Stop Processing |
| `\.jpg` | Include, Stop Processing |
The first rule instructs WebCopy not to download any files to the save folder, but to still crawl HTML files. This is done by using the expression `.*` to match all URLs, together with the rule options Exclude and Crawl Content.
Each subsequent rule adds a regular expression matching a specific image extension, for example `\.png`, and uses the Include option to override the previous rule and cause the file to be downloaded. Once a match is made there is no need to continue checking rules, so the Stop Processing option is also set. Alternatively, you could use a single rule that matches multiple extensions, for example `\.(?:png|gif|jpg)`.
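If you want to sanity-check an expression before adding it as a rule, you can test it against sample URLs. The following is a minimal sketch in Python; the URLs are made up for illustration, and the exact matching semantics WebCopy uses (anchoring, case sensitivity) may differ, so treat this only as a way to experiment with the pattern itself:

```python
import re

# The single combined rule from above: match URLs containing an image extension.
# IGNORECASE is an assumption here; check your rule settings for case handling.
image_rule = re.compile(r"\.(?:png|gif|jpg)", re.IGNORECASE)

for url in [
    "https://example.com/index.html",
    "https://example.com/images/logo.png",
    "https://example.com/photos/cat.JPG",
]:
    verdict = "include" if image_rule.search(url) else "exclude"
    print(f"{verdict}: {url}")
```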
With these rules in place, when you copy a website WebCopy will scan all HTML files but download to the save folder only those files matching the specified extensions.
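To make the evaluation order concrete, here is a hedged sketch of how an ordered rule list like the one above might be applied to each discovered URL. This is written in Python purely for illustration and is not WebCopy's actual implementation; in particular, the default behaviour when no rule matches, and how Crawl Content interacts with Include, are assumptions:

```python
import re
from dataclasses import dataclass

@dataclass
class Rule:
    expression: str                 # regular expression tested against the URL
    include: bool                   # Include (True) vs Exclude (False)
    crawl_content: bool = False     # still parse HTML for links even if excluded
    stop_processing: bool = False   # stop evaluating further rules on a match

# The rule set from the table above.
rules = [
    Rule(r".*", include=False, crawl_content=True),
    Rule(r"\.png", include=True, stop_processing=True),
    Rule(r"\.gif", include=True, stop_processing=True),
    Rule(r"\.jpg", include=True, stop_processing=True),
]

def evaluate(url: str) -> tuple[bool, bool]:
    """Return (download, crawl) by applying rules in order.

    Later matches override earlier ones, which is how the image rules
    override the catch-all exclude; Stop Processing ends the scan early.
    """
    download, crawl = True, True  # assumed defaults when no rule matches
    for rule in rules:
        if re.search(rule.expression, url):
            download = rule.include
            crawl = rule.crawl_content or rule.include
            if rule.stop_processing:
                break
    return download, crawl

print(evaluate("https://example.com/page.html"))  # (False, True): crawled, not saved
print(evaluate("https://example.com/img/a.png"))  # (True, True): downloaded
```

The key design point this sketch illustrates is that rule order matters: the broad exclude must come first so that the narrower include rules can override it.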
Tip
Add new rules with other extensions to copy different types of file, for example `zip`, `exe` or `msi` to download programs.
See Also
Project settings
Tutorials and examples