hcrawler

A hierarchical web crawler with concurrency control and server-side jQuery support

npm install hcrawler

HCrawler

A hierarchical crawler with concurrency control. It provides a jQuery-style DOM facility for extracting data from web sites.

Quick Example

crawler.run(

  // array of starting hrefs
  href_array,

  // one parse function per level of the hierarchy
  [
    parse_href,
    parse_info
  ],

  // callback function
  function (results) {
    save_csv('info.csv');
  },

  // breadth first strategy
  'breadth'
);
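
To make the idea of "one parse function per level" plus a concurrency limit more concrete, here is a minimal, dependency-free sketch of a breadth-first, level-by-level crawl. It is not hcrawler's implementation and does not use its API: mapLimit, crawlBreadth, fetchPage, fakeSite, and the (body, href) parser signature are all hypothetical names and shapes chosen for illustration, and an in-memory object stands in for a real web site.

```javascript
// Sketch only: NOT hcrawler's implementation. All names are hypothetical.

// Run `worker` over `items`, keeping at most `limit` calls in flight.
async function mapLimit(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;
  async function runner() {
    while (next < items.length) {
      const i = next++; // claim an index before awaiting
      results[i] = await worker(items[i]);
    }
  }
  const runners = [];
  for (let k = 0; k < Math.min(limit, items.length); k++) {
    runners.push(runner());
  }
  await Promise.all(runners);
  return results; // same order as `items`
}

// Breadth-first crawl: parsers[d] handles every page at depth d.
// Assumption: each parser except the last returns an array of hrefs for
// the next level; the last parser returns an array of result records.
async function crawlBreadth(hrefs, parsers, fetchPage, limit) {
  let current = hrefs;
  for (const parse of parsers) {
    const perPage = await mapLimit(current, limit, async (href) => {
      const body = await fetchPage(href);
      return parse(body, href);
    });
    current = perPage.flat(); // merge per-page arrays into one level
  }
  return current;
}

// In-memory stand-in for a web site, so the sketch runs without a network.
const fakeSite = {
  '/list': 'a,b',
  '/item/a': 'name=Alpha',
  '/item/b': 'name=Beta',
};
const fetchPage = async (href) => fakeSite[href];

// Hypothetical parsers with an assumed (body, href) signature.
const parse_href = (body) => body.split(',').map((id) => '/item/' + id);
const parse_info = (body, href) => [{ href, name: body.split('=')[1] }];

crawlBreadth(['/list'], [parse_href, parse_info], fetchPage, 2)
  .then((results) => console.log(results));
```

In this sketch each level is fetched completely before the next one starts, which is what makes it breadth first; a real parser would typically select elements from the fetched HTML with cheerio rather than split strings.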

How to

See vessel_crawler.js for a detailed example.

Requirements

async, cheerio
