Moocher
Web content scraper
Installation
$ npm install --save moocher # or yarn add moocher
Usage
new Moocher(urls, options);
urls
{String|Array} A single URL string or an array of URLs to scrape content from.
options
{Object} (optional) The configuration object.
  limit
  {Number} (optional) The number of concurrent requests to make while scraping. Defaults to undefined, which does not enforce a concurrency limit (all requests run in parallel).
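To make the effect of limit concrete, here is a minimal sketch of a generic promise pool. It is not Moocher's internal code; runWithLimit and fakeFetch are hypothetical names used only for illustration, and the timer stands in for a real HTTP request.

```javascript
// Illustrative only: a generic promise pool, not Moocher's implementation.
// `runWithLimit` and `fakeFetch` are hypothetical names.
async function runWithLimit(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;

  // Each lane claims the next unprocessed index until the list is exhausted.
  async function lane() {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i], i);
    }
  }

  // An undefined limit means no cap: one lane per item, all in parallel.
  const lanes = Math.min(limit === undefined ? items.length : limit, items.length);
  await Promise.all(Array.from({ length: lanes }, () => lane()));
  return results;
}

// Stand-in for a request; it records how many calls overlap in time.
let active = 0;
let peak = 0;
const fakeFetch = async (url) => {
  active++;
  peak = Math.max(peak, active);
  await new Promise((resolve) => setTimeout(resolve, 10));
  active--;
  return `body of ${url}`;
};

// With a limit of 2, no more than two fakeFetch calls are in flight at once.
const demo = runWithLimit(['a', 'b', 'c', 'd', 'e'], 2, fakeFetch)
  .then((bodies) => ({ bodies, peak }));
```

Passing undefined as the limit in this sketch starts every task at once, which mirrors the default behavior described above.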
API
Moocher emits the following events:
"mooch"
: Emitted for each response. The callback receives the following arguments:
  $
  : The cheerio-loaded document, so you can use jQuery-style methods on the response document.
  url
  : The original URL passed to Moocher.
  response
  : The full response object.
"error"
: Emitted when a single request fails.
"complete"
: Emitted when the moocher is done mooching.
Example
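const Moocher = require('moocher'); // assumes the package exports the Moocher constructor

const mooch = new Moocher([
  'https://url-1.com',
  'http://url-2.com',
  'http://url-3.com',
  'https://url-4.com',
  'http://url-5.com'
], { limit: 2 }); // allow only 2 concurrent requests

mooch
  .on('mooch', ($, url, response) => console.log(url, $('title').text())) // emitted for each web page mooched
  .on('error', (err) => console.error(err)) // emitted if any request fails
  .on('complete', () => console.log('done')) // emitted when all urls have been mooched
  .start(); // start mooching!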
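If you would rather await one result than react to individual events, the three events listed in the API section can be gathered into a Promise. The sketch below is one possible approach, not part of Moocher. collectTitles and StubMoocher are hypothetical names; the stub makes no HTTP requests and fakes the cheerio $ function so the snippet runs without any dependencies. It also assumes that "error" handlers receive an Error object and that on() is chainable, as with a Node EventEmitter.

```javascript
const { EventEmitter } = require('node:events');

// Hypothetical helper: resolves once "complete" fires, with everything seen so far.
function collectTitles(moocher) {
  return new Promise((resolve) => {
    const titles = {};
    const errors = [];
    moocher
      .on('mooch', ($, url) => { titles[url] = $('title').text(); })
      .on('error', (err) => errors.push(err)) // assumption: receives an Error
      .on('complete', () => resolve({ titles, errors }))
      .start();
  });
}

// Hypothetical stand-in for a Moocher instance. It emits the documented
// events synchronously and performs no network requests.
class StubMoocher extends EventEmitter {
  constructor(pages) {
    super();
    this.pages = pages; // maps url -> page title, or an Error to simulate a failure
  }

  start() {
    for (const [url, page] of Object.entries(this.pages)) {
      if (page instanceof Error) {
        this.emit('error', page);
        continue;
      }
      // Fake `$`: only answers the 'title' selector, unlike real cheerio.
      const $ = (selector) => ({ text: () => (selector === 'title' ? page : '') });
      this.emit('mooch', $, url, { statusCode: 200 });
    }
    this.emit('complete');
  }
}

const collected = collectTitles(new StubMoocher({
  'https://url-1.com': 'One',
  'http://url-2.com': new Error('request failed'),
}));
```

Here a failed request does not reject the Promise. Failures are collected alongside the successes, which matches the description of "error" as a per-request event.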