Commit 86a0bd2d authored by ale

Mention trickle as a possible bandwidth limiter

Since such bandwidth limiting is not provided by crawl directly, tell
users there is another solution. Once/if crawl implements that on its
own, that notice could be removed.
parent 19863316
Pipeline #1177 passed with stage in 14 seconds
@@ -3,8 +3,11 @@ A very simple crawler
 
 This tool can crawl a bunch of URLs for HTML content, and save the
 results in a nice WARC file. It has little control over its traffic,
-save for a limit on concurrent outbound requests. Its main purpose is
-to quickly and efficiently save websites for archival purposes.
+save for a limit on concurrent outbound requests. An external tool
+like `trickle` can be used to limit bandwidth.
+
+Its main purpose is to quickly and efficiently save websites for
+archival purposes.
 
 The *crawl* tool saves its state in a database, so it can be safely
 interrupted and restarted without issues.
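
The README text added here names `trickle` but does not show an invocation. A minimal shell sketch of wrapping a command in `trickle` could look like the following. The helper name `run_limited`, the rate values, and the commented `crawl` invocation are assumptions for illustration, not part of this commit. Note that `trickle` works by preloading a shared library, so it only shapes traffic for dynamically linked programs that go through libc's socket calls; whether that holds for a given `crawl` build is worth verifying.

```shell
#!/bin/sh
# Sketch: run a command under trickle when it is installed.
# run_limited is a hypothetical helper; rates are in KB/s.
run_limited() {
    down_kbs="$1"
    up_kbs="$2"
    shift 2
    if command -v trickle >/dev/null 2>&1; then
        # -s: standalone mode (no trickled daemon needed)
        # -d / -u: download / upload limits in KB/s
        trickle -s -d "$down_kbs" -u "$up_kbs" "$@"
    else
        # Fall back to running the command unthrottled.
        echo "trickle not found; running without a bandwidth limit" >&2
        "$@"
    fi
}

# Example use (assumed invocation and URL, shown for illustration only):
# run_limited 200 50 crawl http://example.com/
```

Keeping the limiter in a wrapper leaves the crawler itself unchanged, which fits the commit message's point that the notice can be dropped if crawl gains its own bandwidth limit.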