Ruby Web Crawler
Posted in ruby on Apr 19th, 2007
Here's a web crawler I wrote a while back. It's pretty simple, but it does the job. If you need something more, you might try rdig or Nutch.
You can run this as a stand-alone script and just pass in the URL to crawl as an argument.
require 'net/http'
require 'uri'

class SiteCrawler
  def initialize(url)
    @site_uri = URI.parse(url)
[...]
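The listing above is cut off, so the rest of the class isn't shown here. To give a feel for the core step any crawler like this needs, here is a minimal sketch of pulling links out of a page and keeping only the ones on the same host. This is not the original code: `same_site_links` is a hypothetical helper made up for illustration, and the regex-based `href` matching is a simplifying assumption (a real HTML parser would be more robust).

```ruby
require 'uri'

# Hypothetical helper, NOT part of the original SiteCrawler listing.
# Returns the absolute http(s) URLs on the same host as base_url that
# appear in href attributes of the given HTML string.
def same_site_links(base_url, html)
  base = URI.parse(base_url)

  # Assumption: a simple regex is good enough to find href values.
  # Fragment-only links ("#top") are skipped and trailing fragments dropped.
  hrefs = html.scan(/href\s*=\s*["']([^"'#]+)/i).flatten

  uris = hrefs.map do |href|
    begin
      URI.join(base_url, href.strip) # resolve relative links against the page URL
    rescue URI::Error
      nil # ignore hrefs that are not valid URIs
    end
  end

  uris.compact
      .select { |uri| uri.is_a?(URI::HTTP) && uri.host == base.host }
      .map(&:to_s)
      .uniq
end

html = '<a href="/about">About</a> <a href="http://other.org/x">Elsewhere</a>'
puts same_site_links('http://example.com/index.html', html)
# prints http://example.com/about
```

A crawler would call something like this on each fetched page, push the results onto a queue, and skip any URL it has already visited.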
Scott Nedderman is the founder of Netphase.com, a consulting practice that specializes in building web applications for Internet startups. He is also a vocalist, plays guitar and penny whistle, occasionally performs in musicals, enjoys camping and is a homeschooling father of 6.



