
Monthly Archive for April, 2007

Ruby Web Crawler

Here's a web crawler I wrote a while back. It's pretty simple, but it does the job. If you need something more, you might try rdig or Nutch.
You can run this as a stand-alone script and just pass in the URL to crawl as an argument.

require 'net/http'
require 'uri'
class SiteCrawler
  def initialize(url)
    @site_uri = URI.parse(url)
  [...]
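
The rest of the class is cut off in this excerpt, so here is a rough sketch of one step a crawler like this usually needs: pulling links out of a page and keeping only the ones on the same site. This is not the original code. The helper name `same_site_links` and the regex-based link extraction are assumptions made for illustration, and a real crawler would probably want a proper HTML parser.

```ruby
require 'uri'

# Hypothetical helper (not from the original post): return the absolute URLs
# of links in an HTML string that point at the same host as base_url.
def same_site_links(html, base_url)
  base_host = URI.parse(base_url).host

  # Naive href extraction; the fragment part (#...) is left out of the capture.
  hrefs = html.scan(/href\s*=\s*["']([^"'#]+)/i).flatten

  links = hrefs.map do |href|
    begin
      # Resolve relative links such as "/about" against the base URL.
      uri = URI.join(base_url, href.strip)
    rescue URI::Error
      next # skip anything that is not a parseable URI
    end
    # Keep only http(s) links on the same host.
    uri.to_s if uri.host == base_host && %w[http https].include?(uri.scheme)
  end

  links.compact.uniq
end
```

A crawl loop would fetch each page (for example with `Net::HTTP.get`), pass the body through a helper like this, and queue any links it has not visited yet.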
