skinnymuch + crawlers   10

"All your social networks, blogs and feeds in one convenient place. Read, like, share, crosspost, track, analyze, monitor and collaborate."
social_networks  social_media  crawlers  web_2.0  aggregation  not_sure  text_extraction  web_crawling  web_app 
may 2013 by skinnymuch
URL Search
"Enter a domain to find the location of files in the corpus that have pages from that URL. The output will be an alphabetically ordered list and a JSON file that can be downloaded"
search_engines  open_data  Data  crawlers  web_index  open_source  bootstrap_layout 
may 2013 by skinnymuch
CommonCrawl
"non-profit foundation dedicated to providing an open repository of web crawl data that can be accessed and analyzed by everyone"
open_data  Google  crawlers  text  Data  open_source  spiders  web_index 
may 2013 by skinnymuch
Anemone - Ruby Web-Spider Framework
"Anemone is a Ruby library that makes it quick and painless to write programs that spider a website. It provides a simple DSL for performing actions on every page of a site, skipping certain URLs, and calculating the shortest path to a given page on a site.
The multi-threaded design makes Anemone fast. The API makes it simple. And the expressiveness of Ruby makes it powerful."
ruby  crawlers  mechanize  automation 
march 2013 by skinnymuch
