console-crawler

0.2.0 • Public • Published

Console Crawler

A Node app that crawls a given website.

npm install -g console-crawler;
console-crawler http://en.wikipedia.org/ --legs=8
console-crawler http://en.wikipedia.org/ --legs=2 --phantom

Quick Set-Up for dev

  1. This is a Node app, so you'll need node/npm to run it.
  2. Clone the repo.
  3. Install the dependencies with npm install.
  4. Fire up the crawler.

Or, Copy-Paste


git clone https://github.com/robcolburn/console-crawler;
cd console-crawler;
npm install;
./console-crawler.js http://en.wikipedia.org/ --legs=8;

Notes

  1. On a Mac, you'll likely need the Xcode Command Line Tools installed.

  2. If you'd like to use PhantomJS, you'll need to download and install it separately, since it has its own binary.

  3. If you need to target a different "Host", you may just need to edit your hosts file. For instance, say I wanted to hit 5.5.5.5, but with the host example.com, which isn't ready to go live just yet. I might add the following to my hosts file:

5.5.5.5 example.com
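
To double-check that an entry like the one above is in place before crawling, something along these lines may help. It is a minimal sketch, not part of console-crawler: `lookupInHosts` is a hypothetical helper name, and it only handles the common `<ip> <hostname> [aliases...]` line format with `#` comments.

```javascript
// Hypothetical helper (not part of console-crawler): return the IP address
// that hosts-file text maps a hostname to, or null if there is no entry.
function lookupInHosts(hostsText, hostname) {
  const wanted = hostname.toLowerCase();
  for (const rawLine of hostsText.split(/\r?\n/)) {
    // Drop comments (everything after '#') and surrounding whitespace.
    const line = rawLine.split('#')[0].trim();
    if (line === '') continue;
    // Assumed line format: <ip> <hostname> [aliases...]
    const [ip, ...names] = line.split(/\s+/);
    if (names.some((name) => name.toLowerCase() === wanted)) return ip;
  }
  return null;
}

// Sample hosts-file content, including the override from the note above.
const sample = [
  '127.0.0.1 localhost',
  '# staging override',
  '5.5.5.5 example.com',
].join('\n');

console.log(lookupInHosts(sample, 'example.com')); // 5.5.5.5
```

To check the real file on macOS or Linux, pass in `require('fs').readFileSync('/etc/hosts', 'utf8')` instead of the sample text.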

License

none

Collaborators

  • robcolburn