Scraping   11499

Google Rival Yelp Claims Search Giant Broke Promise Made to Regulators - WSJ
In 2011, the FTC launched an investigation into Google’s business practices, including pulling content from other sites to augment its own services. Yelp had complained about its reviews being used in Google’s local-business listings. Yelp said it immediately opted out of Google scraping, one of Google’s concessions that led to the FTC closing its investigation.

Then, last month, Yelp said it noticed Google using its images after a North Carolina gym contacted the company because a photo from a...
google  yelp  ftc  scraping 
3 days ago by rstephens
GitHub - emadehsan/thal: Getting started with Puppeteer and Chrome Headless for Web Scraping
Puppeteer is the official tool for Chrome Headless by the Google Chrome team. Since the official announcement of Chrome Headless, many of the industry-standard libraries for automated testing have been discontinued by their maintainers, including PhantomJS.
Archive  programming  web  scraping 
3 days ago by chrisweiss
Show HN: Getting started with Puppeteer and Chrome Headless for Web Scraping | Hacker News
Correct me if I'm wrong, but if I'm not mistaken, Selenium IDE has been discontinued due to a lack of maintainers, and that has little if any relation to Chrome Headless.
Archive  programming  web  scraping 
3 days ago by chrisweiss
NikolaiT/GoogleScraper: A Python module to scrape several search engines (like Google, Yandex, Bing, Duckduckgo, Baidu and others) by using proxies (socks4/5, http proxy) and with many different IP's, including asynchronous networking support (very fast).
GoogleScraper - A Python module to scrape several search engines (like Google, Yandex, Bing, Duckduckgo, Baidu and others) using proxies (socks4/5, http proxy) and many different IPs, with asynchronous networking support (very fast).
google  search  python  library  scraping 
5 days ago by victorfuertes
Mastering Python Web Scraping: Get Your Data Back – Hacker Noon
Do you ever find yourself in a situation where you need to get information out of a website that conveniently doesn’t have an export option? This happened to a client of mine who desperately needed…
web  webdev  testing  data  scraping  python  splinter 
6 days ago by lenards
Page.REST - An HTTP API to extract OpenGraph, oEmbed or any other content from any public web page as JSON.
An HTTP API to extract OpenGraph, oEmbed or any other content from any public web page as JSON.
json  rest  api  parser  scraping  web 
7 days ago by e2b
Import.io | Extract data from the web
The world's leading web data extraction platform for businesses and individuals.
scraping 
8 days ago by roolio
Fucking Search Engines Scraper
Fses is a Python library to scrape URLs from search queries. Good for power Google dorking from the command line.
python  search  scraping  Pen_Testing  Reconnaissance  OpenSource  doxxing 
9 days ago by aiefel
