jm + google   98

Ryanair drops out of top Google flight search results after website overhaul | Business | theguardian.com
They've done the classic website-redesign screwup -- omitted redirects from the old URLs.
Sam Silverwood-Cope, director of Intelligent Positioning, said: "They've ignored the legacy of the old Ryanair.com. It's quite startling. They are doing it just before their busiest time of the year." A change in [URLs] without proper redirects means many results found by Google now simply return error pages, he added. "Unless redirects get put in pretty soon, the position is going to get worse and worse."
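For the record, the missing piece is small. A minimal sketch of an old-to-new redirect map, assuming made-up paths (none of these are real Ryanair URLs) and a hypothetical `respond` helper rather than any particular web framework:

```python
# Hypothetical mapping from legacy paths to their replacements.
# In practice this would be generated from the old site's URL inventory.
REDIRECTS = {
    "/old/flights.html": "/flights",
    "/old/timetable.html": "/timetables",
}

# Hypothetical set of paths served by the redesigned site.
CURRENT_PAGES = {"/flights", "/timetables"}


def respond(path):
    """Return (status, location) for a request path."""
    if path in CURRENT_PAGES:
        return 200, None
    if path in REDIRECTS:
        # 301 tells crawlers the move is permanent, so the old URL's
        # search ranking can carry over to the new one.
        return 301, REDIRECTS[path]
    # Without a redirect entry, legacy links land on an error page.
    return 404, None
```

Old URLs indexed by Google would then get a 301 to the new location instead of a 404.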
ryanair  inept  fail  funny  via:christinebohan  web  google  search  redirects 
6 days ago by jm
Beefcake
A sane Google Protocol Buffers library for Ruby. It's all about being Buf; ProtoBuf.
protobuf  google  protocol-buffers  ruby  coding  libraries  gems  open-source 
8 days ago by jm
Google's Open Bidder stack moving from Jetty to Netty
Open Bidder traditionally used Jetty as an embedded webserver, for the critical tasks of accepting connections, processing HTTP requests, managing service threads, etc. Jetty is a robust, but traditional stack that carries the weight and tradeoffs of Servlet’s 15-year-old design. For a maximum performance RTB agent that must combine very large request concurrency with very low latencies, and often benefit also from low-level control over the transport, memory management and other issues, a different webserver stack was required. Open Bidder now supports Netty, an asynchronous, event-driven, high-performance webserver stack.

For existing code, the most important impact is that Netty is not compatible with the Servlet API. Its own internal APIs are often too low-level, not to mention proprietary to Netty; so Open Bidder v0.5 introduces some new, stack-neutral APIs for things like HTTP requests and responses, cookies, request handlers, and even simple HTML templating based on Mustache. These APIs will work with both Netty and Jetty. This means you don’t need to change any code to switch between Jetty and Netty; on the other hand, it also means that existing code written for Open Bidder 0.4 may need some changes even if you plan to keep using Jetty.

[....] Netty's superior efficiency is very significant; it supports 50% more traffic in the same hardware, and it maintains a perfect latency distribution even at the peak of its supported load.


This doc is noteworthy on a couple of grounds:

1. the use of Netty in a public API/library, and the additional layer in place to add a friendlier API on top of that. I hope they might consider releasing that part as OSS at some point.

2. I also find it interesting that their API uses protobufs to marshal the message, and they plan in a future release to serialize those to JSON documents -- that makes a lot of sense.
apis  google  protobufs  json  documents  interoperability  netty  jetty  servlets  performance  java 
15 days ago by jm
How Gmail Happened: The Inside Story of Its Launch 10 Years Ago Today
the inside story of the great work done by Paul Buchheit, Kevin Fox, and Sanjeev Singh to reinvent email in 2004
history  gmail  email  smtp  mua  paul-buchheit  kevin-fox  launches  google  web 
23 days ago by jm
UPDATED: Google begged Steve Jobs for permission to hire engineers for its new Paris office. Guess what happened next… | PandoDaily
in a field as critical and competitive as smartphones, Google’s R&D strategy was being dictated, not by the company’s board, or by its shareholders, but by a desire not to anger the CEO of a rival company.


This is utterly bananas and anti-competitive. (via Des Traynor)
via:destraynor  wage-fixing  apple  google  tech  paris  r-and-d  steve-jobs  jean-marie-hullot  france  competition  poaching  assholes 
24 days ago by jm
DNS results now being manipulated in Turkey
Deep-packet inspection and rewriting on DNS packets for Google and OpenDNS servers. VPNs and DNSSEC up next!
turkey  twitter  dpi  dns  opendns  google  networking  filtering  surveillance  proxying  packets  udp 
24 days ago by jm
Adrian Cockcroft's Cloud Outage Reports Collection
The detailed summaries of outages from cloud vendors are comprehensive and the response to each highlights many lessons in how to build robust distributed systems. For outages that significantly affected Netflix, the Netflix techblog report gives insight into how to effectively build reliable services on top of AWS. [....] I plan to collect reports here over time, and welcome links to other write-ups of outages and how to survive them.
outages  post-mortems  documentation  ops  aws  ec2  amazon  google  dropbox  microsoft  azure  incident-response 
4 weeks ago by jm
Why Google Flu Trends Can't Track the Flu (Yet)
It's admittedly hard for outsiders to analyze Google Flu Trends, because the company doesn't make public the specific search terms it uses as raw data, or the particular algorithm it uses to convert the frequency of these terms into flu assessments. But the researchers did their best to infer the terms by using Google Correlate, a service that allows you to look at the rates of particular search terms over time. When the researchers did this for a variety of flu-related queries over the past few years, they found that a couple key searches (those for flu treatments, and those asking how to differentiate the flu from the cold) tracked more closely with Google Flu Trends' estimates than with actual flu rates, especially when Google overestimated the prevalence of the ailment. These particular searches, it seems, could be a huge part of the inaccuracy problem.

There's another good reason to suspect this might be the case. In 2011, as part of one of its regular search algorithm tweaks, Google began recommending related search terms for many queries (including listing a search for flu treatments after someone Googled many flu-related terms) and in 2012, the company began providing potential diagnoses in response to symptoms in searches (including listing both "flu" and "cold" after a search that included the phrase "sore throat," for instance, perhaps prompting a user to search for how to distinguish between the two). These tweaks, the researchers argue, likely artificially drove up the rates of the searches they identified as responsible for Google's overestimates.


via Boing Boing
google  flu  trends  feedback  side-effects  colds  health  google-flu-trends 
5 weeks ago by jm
Corporate Tax 2014: Irish Government's "flawed premise" on Apple's avoidance
According to our calculation about €40bn or over 40% of Irish services exports of €90bn in 2012 and related national output, resulted from global tax avoidance schemes.

It is true that Ireland gains little from tax cheating but at some point, the US tax system will be reformed and a territorial system where companies are only liable in the US on US profits, would only be viable if there was a disincentive to shift profits to non-tax or low tax countries. The risk for Ireland is that a minimum foreign tax would be introduced that would be greater than the Irish headline rate of 12.5%.

It's also likely that US investment in Ireland would not have been jeopardized if Irish politicians had not been so eager as supplicants to doff the cap. Nevertheless today it would be taboo to admit the reality of participation in massive tax avoidance and the Captain Renaults of Merrion Street will continue with their version of the Dance of the Seven Veils.
apple  tax  double-irish  tax-avoidance  google  investment  itax  tax-evasion  ireland 
6 weeks ago by jm
Sacked Google worker says staff ratings fixed to fit template
Allegations of fixing to fit the stack-ranking curve: 'someone at Google always had to get a low score “of 2.9”, so the unit could match the bell curve. She said senior staff “calibrated” the ratings supplied by line managers to ensure conformity with the template and these calibrations could reduce a line manager’s assessment of an employee, in effect giving them the poisoned score of less than three.'
stack-ranking  google  ireland  employment  work  bell-curve  statistics  eric-schmidt 
6 weeks ago by jm
"Dapper, a Large-Scale Distributed Systems Tracing Infrastructure" [PDF]
Google paper describing the infrastructure they've built for cross-service request tracing (i.e. "tracer requests"). Features: few code changes required (since they've built it into the internal protobuf libs), low performance impact, sampling, deployment across the ~entire production fleet, output visibility in minutes, and has been live in production for over 2 years. Excellent read.
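A rough sketch of the core idea, trace context propagated from caller to callee with the sampling decision made once at the root. The class and function names here are my own invention for illustration, not Dapper's API:

```python
import random
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TraceContext:
    """Hypothetical per-request context passed along with each RPC."""
    trace_id: str                   # shared by every span in one request tree
    span_id: str                    # identifies this unit of work
    parent_span_id: Optional[str]   # None for the root span
    sampled: bool                   # decided once, at the root


def start_trace(sample_rate, rng=random.random):
    """Create a root context; only a fraction of requests are sampled."""
    return TraceContext(
        trace_id=uuid.uuid4().hex,
        span_id=uuid.uuid4().hex[:16],
        parent_span_id=None,
        sampled=rng() < sample_rate,
    )


def child_span(parent):
    """Create the context a callee would receive from its caller."""
    return TraceContext(
        trace_id=parent.trace_id,
        span_id=uuid.uuid4().hex[:16],
        parent_span_id=parent.span_id,
        sampled=parent.sampled,  # children inherit the root's decision
    )
```

Because the context rides along inside the RPC layer, application code would rarely need to touch it, which is presumably why so few code changes are required.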
dapper  tracing  http  services  soa  google  papers  request-tracing  tracers  protobuf  devops 
6 weeks ago by jm
Traffic Graph – Google Transparency Report
this is cool. Google are exposing an aggregated 'all services' hit count time-series graph, broken down by country, as part of their Transparency Report pages
transparency  filtering  web  google  http  graphs  monitoring  syria 
8 weeks ago by jm
A sampling profiler for your daily browsing - Google Groups
via Ilya Grigorik: Chrome Canary now has a built-in, always-on, zero-overhead code profiler. I want this in my server-side JVMs!
chrome  tracing  debugging  performance  profiling  google  sampling-profiler  javascript  blink  v8 
january 2014 by jm
Google Fonts recently switched to using Zopfli
Google Fonts recently switched to using new Zopfli compression algorithm:  the fonts are ~6% smaller on average, and in some cases up to 15% smaller! [...]
What's Zopfli? It's an algorithm that was developed by the compression team at Google that delivers ~3-8% bytesize improvement when compared to gzip with maximum compression. This byte saving comes at the price of a much higher encoding cost, but the good news is, fonts are static files and decompression speed is exactly the same. Google Fonts pays the compression cost once and every client gets the benefit of a smaller download. If you’re curious to learn more about Zopfli: http://bit.ly/Y8DEL4
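The "pay once at encode time, decode as usual" trade-off can be illustrated with the standard library. Zopfli itself isn't in Python's stdlib, so this sketch uses zlib's fastest and slowest levels as a stand-in; the sample payload is made up:

```python
import zlib


def compress_fast_and_slow(data):
    """Compress the same bytes cheaply (level 1) and expensively (level 9)."""
    fast = zlib.compress(data, 1)
    small = zlib.compress(data, 9)
    return fast, small


# Made-up, highly repetitive payload standing in for a static font file.
font_like = b"glyph table entry; " * 2000
fast, small = compress_fast_and_slow(font_like)

# Both streams are plain deflate: the same decompressor reads either one,
# so clients need no changes whichever encoder effort the server chose.
assert zlib.decompress(fast) == zlib.decompress(small) == font_like
```

Zopfli pushes the same idea further: much more encoder work, same decoder.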
zopfli  compression  gzip  fonts  google  speed  optimization 
january 2014 by jm
Jeff Dean - Taming Latency Variability and Scaling Deep Learning [talk]
'what Jeff Dean and team have been up to at Google'. Reducing request latency in a network SOA architecture using backup requests, etc., via Ilya Grigorik
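A minimal sketch of the backup-request trick, assuming two interchangeable replicas exposed as plain callables (the `hedged_call` name and the delay parameter are mine, not from the talk):

```python
import concurrent.futures as cf


def hedged_call(primary, backup, hedge_delay_s):
    """Call primary; if it is still running after hedge_delay_s,
    also call backup and return whichever finishes first."""
    pool = cf.ThreadPoolExecutor(max_workers=2)
    try:
        futures = [pool.submit(primary)]
        done, _ = cf.wait(futures, timeout=hedge_delay_s)
        if not done:
            # Primary is slow: send the backup request to another replica.
            futures.append(pool.submit(backup))
            done, _ = cf.wait(futures, return_when=cf.FIRST_COMPLETED)
        return next(iter(done)).result()
    finally:
        # Don't block on the loser; a real system would also cancel it.
        pool.shutdown(wait=False)
```

Setting the delay near the tail of the normal latency distribution means only a small fraction of requests trigger a backup, so the extra load stays modest.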
youtube  talks  google  low-latency  soa  architecture  distcomp  jeff-dean  networking 
november 2013 by jm
Mike Hearn - Google+ - The packet capture shown in these new NSA slides shows…
The packet capture shown in these new NSA slides shows internal database replication traffic for the anti-hacking system I worked on for over two years. Specifically, it shows a database recording a user login.


This kind of confirms my theory that the majority of interesting traffic for the NSA/GCHQ MUSCULAR sniffing system would have been inter-DC replication. Was, since it sounds like that stuff's all changing now to use end-to-end crypto...
google  crypto  security  muscular  nsa  gchq  mike-hearn  replication  sniffing  spying  surveillance 
november 2013 by jm
Bruce Schneier On The Feudal Internet And How To Fight It
This is very well-put.
In its early days, there was a lot of talk about the "natural laws of the Internet" and how it would empower the masses, upend traditional power blocks, and spread freedom throughout the world. The international nature of the Internet made a mockery of national laws. Anonymity was easy. Censorship was impossible. Police were clueless about cybercrime. And bigger changes were inevitable. Digital cash would undermine national sovereignty. Citizen journalism would undermine the media, corporate PR, and political parties. Easy copying would destroy the traditional movie and music industries. Web marketing would allow even the smallest companies to compete against corporate giants. It really would be a new world order.
Unfortunately, as we know, that's not how it worked out. Instead, we have seen the rise of the feudal Internet:
Feudal security consolidates power in the hands of the few. These companies [like Google, Apple, Microsoft, Facebook etc.] act in their own self-interest. They use their relationship with us to increase their profits, sometimes at our expense. They act arbitrarily. They make mistakes. They're deliberately changing social norms. Medieval feudalism gave the lords vast powers over the landless peasants; we’re seeing the same thing on the Internet.
bruce-schneier  politics  internet  feudal-internet  google  apple  microsoft  facebook  government 
october 2013 by jm
Google: Our Robot Cars Are Better Drivers Than Puny Humans | MIT Technology Review
One of those analyses showed that when a human was behind the wheel, Google’s cars accelerated and braked significantly more sharply than they did when piloting themselves. Another showed that the cars’ software was much better at maintaining a safe distance from the vehicle ahead than the human drivers were. “We’re spending less time in near-collision states,” said Urmson. “Our car is driving more smoothly and more safely than our trained professional drivers.”
google  cars  driving  safety  roads  humans  robots  automation 
october 2013 by jm
Is Google building a hulking floating data center in SF Bay?
Looks pretty persuasive, especially considering they hold a patent on the design
google  data-centers  bay-area  ships  containers  shipping  sea  wave-power  treasure-island 
october 2013 by jm
The New York Review of Bots - @TwoHeadlines: Comedy, Tragedy, Chicago Bears
What is near-future late-capitalist dystopian fiction but a world where there is no discernible difference between corporations, nations, sports teams, brands, and celebrities? Adam was partly right in our original email thread. @TwoHeadlines is not generating jokes about current events. It is generating jokes about the future: a very specific future dictated by what a Google algorithm believes is important about humans and our affairs.
google-news  google  algorithms  word-frequency  twitter  twoheadlines  bots  news  emergent  jokes 
october 2013 by jm
"High Performance Browser Networking", by Ilya Grigorik, read online for free
Wow, this looks excellent. A must-read for people working on systems with high-volume, low-latency phone-to-server communications -- and free!
How prepared are you to build fast and efficient web applications? This eloquent book provides what every web developer should know about the network, from fundamental limitations that affect performance to major innovations for building even more powerful browser applications—including HTTP 2.0 and XHR improvements, Server-Sent Events (SSE), WebSocket, and WebRTC.

Author Ilya Grigorik, a web performance engineer at Google, demonstrates performance optimization best practices for TCP, UDP, and TLS protocols, and explains unique wireless and mobile network optimization requirements. You’ll then dive into performance characteristics of technologies such as HTTP 2.0, client-side network scripting with XHR, real-time streaming with SSE and WebSocket, and P2P communication with WebRTC.

Deliver optimal TCP, UDP, and TLS performance;
Optimize network performance over 3G/4G mobile networks;
Develop fast and energy-efficient mobile applications;
Address bottlenecks in HTTP 1.x and other browser protocols;
Plan for and deliver the best HTTP 2.0 performance;
Enable efficient real-time streaming in the browser;
Create efficient peer-to-peer videoconferencing and low-latency applications with real-time WebRTC transports


Via Eoin Brazil.
book  browser  networking  performance  phones  mobile  3g  4g  hsdpa  http  udp  tls  ssl  latency  webrtc  websockets  ebooks  via:eoin-brazil  google  http2  sse  xhr  ilya-grigorik 
october 2013 by jm
Google swaps out MySQL, moves to MariaDB
When we asked Sallner to quantify the scale of the migration he said, "They're moving it all. Everything they have. All of the MySQL servers are moving to MariaDB, as far as I understand."

By moving to MariaDB, Google can free itself of any dependence on technology dictated by Oracle – a company whose motivations are unclear, and whose track record for working with the wider technology community is dicey, to say the least. Oracle has controlled MySQL since its acquisition of Sun in 2010, and the key InnoDB storage engine since it got ahold of Innobase in 2005.

[...] We asked Cole why Google would shift from MySQL to MariaDB, and what the key technical differences between the systems were. "From my perspective, they're more or less equivalent other than if you look at specific features and how they implement them," Cole said, speaking in a personal capacity and not on behalf of Google. "Ideologically there are lots of differences."


So -- AWS, when will RDS offer MariaDB as an option?
google  mysql  mariadb  sql  open-source  licensing  databases  storage  innodb  oracle 
september 2013 by jm
How To Buffer Full YouTube Videos Before Playing
summary - turn off DASH (Dynamic Adaptive Streaming over HTTP) using a userscript.
chrome  youtube  google  video  dash  mpeg  streaming 
september 2013 by jm
_MillWheel: Fault-Tolerant Stream Processing at Internet Scale_ [paper, pdf]
from VLDB 2013:

MillWheel is a framework for building low-latency data-processing applications that is widely used at Google. Users specify a directed computation graph and application code for individual nodes, and the system manages persistent state and the continuous flow of records, all within the envelope of the framework’s fault-tolerance guarantees.

This paper describes MillWheel’s programming model as well as its implementation. The case study of a continuous anomaly detector in use at Google serves to motivate how many of MillWheel’s features are used. MillWheel’s programming model provides a notion of logical time, making it simple to write time-based aggregations. MillWheel was designed from the outset with fault tolerance and scalability in mind. In practice, we find that MillWheel’s unique combination of scalability, fault tolerance, and a versatile programming model lends itself to a wide variety of problems at Google.
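To make the "logical time" point concrete, here is a toy sketch of a time-based aggregation keyed on each record's own timestamp rather than its arrival order. This is not MillWheel's API; the function name and record shape are assumptions for illustration:

```python
from collections import defaultdict


def window_counts(records, window_s):
    """Count records per (key, window), where the window is derived from
    each record's event timestamp, so out-of-order arrival doesn't matter."""
    counts = defaultdict(int)
    for key, event_time_s in records:
        window_start = (event_time_s // window_s) * window_s
        counts[(key, window_start)] += 1
    return dict(counts)


# Records arrive out of order: the t=59 event shows up after the t=61 one,
# but still lands in the first 60-second window.
arrivals = [("q", 3), ("q", 61), ("q", 59), ("r", 5)]
print(window_counts(arrivals, 60))
```

The hard part, which the framework handles and this sketch ignores, is knowing when a window is complete and keeping the counts durable across failures.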
millwheel  google  data-processing  cep  low-latency  fault-tolerance  scalability  papers  event-processing  stream-processing 
august 2013 by jm
Information on Google App Engine's recent US datacenter relocations - Google Groups
or, really, 'why we had some glitches and outages recently'. A few interesting tidbits about GAE innards though (via Bill De hOra)
gae  google  app-engine  outages  ops  paxos  eventual-consistency  replication  storage  hrd 
august 2013 by jm
al3x/sovereign
'Sovereign is a set of Ansible playbooks that you can use to build and maintain' your own GMail/Google calendar/etc. on a VPS. Some up-to-date hosting tips, basically
sovereign  gmail  google  vps  ansible  al3x  hosting 
august 2013 by jm
The NSA Is Commandeering the Internet - Bruce Schneier
You, an executive in one of those companies, can fight. You'll probably lose, but you need to take the stand. And you might win. It's time we called the government's actions what they really are: commandeering. Commandeering is a practice we're used to in wartime, where commercial ships are taken for military use, or production lines are converted to military production. But now it's happening in peacetime. Vast swaths of the Internet are being commandeered to support this surveillance state.

If this is happening to your company, do what you can to isolate the actions. Do you have employees with security clearances who can't tell you what they're doing? Cut off all automatic lines of communication with them, and make sure that only specific, required, authorized acts are being taken on behalf of government. Only then can you look your customers and the public in the face and say that you don't know what is going on -- that your company has been commandeered.
nsa  america  politics  privacy  data-protection  data-retention  law  google  microsoft  security  bruce-schneier 
august 2013 by jm
ICO’s Tame Investigation Of Google Street View Data Slurping
“People will yet again be asking whether Google has been let off without the kind of full and rigorous investigation that you would expect after this kind of incident,” Nick Pickles, director of the Big Brother Watch, told TechWeekEurope. “Let’s not forget that information was collected without permission from thousands of people’s Wi-Fi networks, in a way that if an individual had done so they would almost certainly have been prosecuted. It seems strange that ICO [the UK's Data Protection regulatory agency] did not want to inspect the [datacenter] cages housing the data, while it is also troubling that Google’s assurances were taken at face value, despite this not being the first incident where consumers have seen their privacy violated by the company.”
privacy  google  ico  regulation  data-protection  snooping  wifi  sniffing  network-traffic  street-view 
july 2013 by jm
packetdrill - network stack testing tool
[Google's] packetdrill scripting tool enables quick, precise tests for entire TCP/UDP/IPv4/IPv6 network stacks, from the system call layer down to the NIC hardware. packetdrill currently works on Linux, FreeBSD, OpenBSD, and NetBSD. It can test network stack behavior over physical NICs on a LAN, or on a single machine using a tun virtual network device.
testing  networking  tun  google  linux  papers  tcp  ip  udp  freebsd  openbsd  netbsd 
july 2013 by jm
Google Cloud Messaging for Android
GCM is a service that allows you to send data from your server to your users' Android-powered device, and also to receive messages from devices on the same connection. The GCM service handles all aspects of queueing of messages and delivery to the target Android application running on the target device. GCM is completely free no matter how big your messaging needs are, and there are no quotas.
gcm  messaging  android  google  push 
july 2013 by jm
Google Translate of "Lorem ipsum"
The perils of unsupervised machine learning... here's what GTranslate reckons "lorem ipsum" translates to:
We will be sure to post a comment. Add tomato sauce, no tank or a traditional or online. Until outdoor environment, and not just any competition, reduce overall pain. Cisco Security, they set up in the throat develop the market beds of Cura; Employment silently churn-class by our union, very beginner himenaeos. Monday gate information. How long before any meaningful development. Until mandatory functional requirements to developers. But across the country in the spotlight in the notebook. The show was shot. Funny lion always feasible, innovative policies hatred assured. Information that is no corporate Japan
lorem-ipsum  boilerplate  machine-learning  translation  google  translate  probabilistic  tomato-sauce  cisco  funny 
june 2013 by jm
stuff Google has learned from their hiring data
A. On the hiring side, we found that [interview] brainteasers are a complete waste of time. How many golf balls can you fit into an airplane? How many gas stations in Manhattan? A complete waste of time. They don’t predict anything. They serve primarily to make the interviewer feel smart.

Instead, what works well are structured behavioral interviews, where you have a consistent rubric for how you assess people, rather than having each interviewer just make stuff up. Behavioral interviewing also works — where you’re not giving someone a hypothetical, but you’re starting with a question like, “Give me an example of a time when you solved an analytically difficult problem.” The interesting thing about the behavioral interview is that when you ask somebody to speak to their own experience, and you drill into that, you get two kinds of information. One is you get to see how they actually interacted in a real-world situation, and the valuable “meta” information you get about the candidate is a sense of what they consider to be difficult.

This makes sense, and matches what I learned in Amazon. Bad news for Microsoft though! (Correction: Adam Shostack got in touch to note that MS haven't done this for 10+ years either.)

Also, I like this:

A. One of the things we’ve seen from all our data crunching is that G.P.A.’s are worthless as a criteria for hiring, and test scores are worthless — no correlation at all except for brand-new college grads, where there’s a slight correlation. Google famously used to ask everyone for a transcript and G.P.A.’s and test scores, but we don’t anymore, unless you’re just a few years out of school. We found that they don’t predict anything. What’s interesting is the proportion of people without any college education at Google has increased over time as well. So we have teams where you have 14 percent of the team made up of people who’ve never gone to college.
google  hiring  interviewing  interviews  brainteasers  gpa  microsoft  star  amazon 
june 2013 by jm
Schneier on Security: Blowback from the NSA Surveillance
Unintended consequences on US-focused governance of the internet and cloud computing:
Writing about the new Internet nationalism, I talked about the ITU meeting in Dubai last fall, and the attempt of some countries to wrest control of the Internet from the US. That movement just got a huge PR boost. Now, when countries like Russia and Iran say the US is simply too untrustworthy to manage the Internet, no one will be able to argue. We can't fight for Internet freedom around the world, then turn around and destroy it back home. Even if we don't see the contradiction, the rest of the world does.
internet  freedom  cloud-computing  amazon  google  hosting  usa  us-politics  prism  nsa  surveillance 
june 2013 by jm
Don’t Overuse Mocks
hooray, sanity from the Google Testing blog. this has been a major cause of pain in the past, dealing with tricky rewrites of mock-heavy unit test code
mocking  testing  tests  google  mocks  unit-testing 
may 2013 by jm
Hermetic Servers
'What is a Hermetic Server? The short definition would be a “server in a box”. If you can start up the entire server on a single machine that has no network connection AND the server works as expected, you have a hermetic server! This is a special case of the more general “hermetic” concept which applies to an isolated system not necessarily on a single machine.

Why is it useful to have a hermetic server? Because if your entire [system under test] is composed of hermetic servers, it could all be started on a single machine for testing; no network connection necessary! The single machine could be a physical or virtual machine.'

These also qualify as "fakes", using the terminology Martin Fowler suggests at http://martinfowler.com/bliki/TestDouble.html , I think
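A small sketch of the "server in a box" idea using only the Python standard library: the server binds to loopback on an OS-assigned port, so the test needs no external network. The handler and function names are hypothetical:

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class PingHandler(BaseHTTPRequestHandler):
    """Trivial stand-in for the real server under test."""

    def do_GET(self):
        body = b"pong"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep test output quiet


def fetch_from_hermetic_server():
    """Start the server on 127.0.0.1, issue one request, shut it down."""
    server = HTTPServer(("127.0.0.1", 0), PingHandler)  # port 0: OS picks one
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        conn.request("GET", "/ping")
        resp = conn.getresponse()
        result = (resp.status, resp.read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()
```

Anything the real server depends on (databases, other services) would need the same treatment for the whole system under test to stay hermetic.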
google  testing  hermetic-servers  test  test-doubles  unit-testing 
may 2013 by jm
Hollywood Studios [attempt to censor] Pirate Bay Documentary
Probably not deliberate, but pretty damn inept.
Over the past weeks several movie studios have been trying to suppress the availability of TPB-AFK [the Pirate Bay documentary] by asking Google to remove links to the documentary from its search engine. The links are carefully hidden in standard DMCA takedown notices for popular movies and TV-shows.

The silent attacks come from multiple Hollywood sources including Viacom, Paramount, Fox and Lionsgate and are being sent out by multiple anti-piracy outfits. Fox, with help from six-strikes monitoring company Dtecnet, asked Google to remove a link to TPB-AFK on Mechodownload. Paramount did the same with a link on the Warez.ag forums. Viacom sent at least two takedown requests targeting links to the Pirate Bay documentary on Mrworldpremiere and Rapidmoviez. Finally, Lionsgate jumped in by asking Google to remove a copy of TPB-AFK from a popular Pirate Bay proxy.
funny  inept  hollywood  lionsgate  fox  viacom  paramount  dtecnet  tpb-afk  piratebay  piracy  copyright  movies  google 
may 2013 by jm
Breaking the 1000 ms Time to Glass Mobile Barrier [slides]
Great presentation from Google on HTML5 CSS+JS render speed, 3G/4G network latency, etc. (via John G)
google  slides  3g  4g  lte  networking  telcos  telecom  css  js  html5  web  via:jg  mobile 
april 2013 by jm
google-http-java-client
Written by Google, this library is a flexible, efficient, and powerful Java client library for accessing any resource on the web via HTTP. It features a pluggable HTTP transport abstraction that allows any low-level library to be used, such as java.net.HttpURLConnection, Apache HTTP Client, or URL Fetch on Google App Engine. It also features efficient JSON and XML data models for parsing and serialization of HTTP response and request content. The JSON and XML libraries are also fully pluggable, including support for Jackson and Android's GSON libraries for JSON.


Not quite as simple an API as Python's requests, sadly, but still an improvement on the verbose Apache HttpComponents API. Good support for unit testing via a built-in mock-response class. Still in beta.
google  beta  software  http  libraries  json  xml  transports  protocols 
april 2013 by jm
Google Drive SDK
realtime collaboration API. nifty! but can it collaborate on a per-app shared doc, or does it require that the app user auth to Google and access their own docs?
collaboration  api  realtime  google  javascript 
march 2013 by jm
By the numbers: How Google Compute Engine stacks up to Amazon EC2
Scalr's thoughts on Google's EC2 competitor.
with Google Compute Engine, AWS has a formidable new competitor in the public cloud space, and we’ll likely be moving some of Scalr’s production workloads from our hybrid aws-rackspace-softlayer setup to it when it leaves beta. There’s a strong technical case for migrating heavy workloads to GCE, and I’ll be grabbing popcorn to eagerly watch as the battle unfolds between the giants.
gce  cloud  ec2  amazon  aws  google  scalr 
march 2013 by jm
Opinion: The Internet is a surveillance state
Bruce Schneier op-ed on CNN.com.
So, we're done. Welcome to a world where Google knows exactly what sort of porn you all like, and more about your interests than your spouse does. Welcome to a world where your cell phone company knows exactly where you are all the time. Welcome to the end of private conversations, because increasingly your conversations are conducted by e-mail, text, or social networking sites.
And welcome to a world where all of this, and everything else that you do or is done on a computer, is saved, correlated, studied, passed around from company to company without your knowledge or consent; and where the government accesses it at will without a warrant.
Welcome to an Internet without privacy, and we've ended up here with hardly a fight.
freedom  surveillance  legal  privacy  internet  bruce-schneier  web  google  facebook 
march 2013 by jm
Jeff Dean's list of "Numbers Everyone Should Know"
from a 2007 Google all-hands, the list of typical latency timings ranging from an L1 cache reference (0.5 nanoseconds) to a CA->NL->CA IP round trip (150 milliseconds).
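A quick bit of arithmetic on the two endpoints quoted above shows how wide that range is (the constant and function names are mine):

```python
NS_PER_MS = 1_000_000

L1_CACHE_REFERENCE_NS = 0.5    # from the list
CA_NL_CA_ROUND_TRIP_MS = 150   # from the list


def l1_refs_per_round_trip():
    """How many L1 cache references fit into one CA->NL->CA round trip."""
    return (CA_NL_CA_ROUND_TRIP_MS * NS_PER_MS) / L1_CACHE_REFERENCE_NS


print(f"{l1_refs_per_round_trip():,.0f}")  # 300,000,000
```

So one intercontinental round trip costs about as much as 300 million L1 cache hits, which is the whole point of the list.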
performance  latencies  google  jeff-dean  timing  caches  speed  network  zippy  disks  via:kellabyte 
march 2013 by jm
Compress data more densely with Zopfli - Google Developers Blog
New compressor from Google, gzip/zip-compatible, slower but slightly smaller results
compression  gzip  zip  deflate  google 
march 2013 by jm
C++ B-Tree
a new C++ template library from Google which implements an in-memory B-Tree container type, suitable for use as a drop-in replacement for std::map, set, multimap and multiset. Lower memory use, and reportedly faster due to better cache-friendliness
c++  google  data-structures  containers  b-trees  stl  map  set  open-source 
february 2013 by jm
Fox DMCA Takedowns Order Google to Remove Fox DMCA Takedowns
Chilling Effects is set up to stop the ‘chilling effects’ of Internet censorship. Google sees this as a good thing and sends takedown requests it receives to be added to the database. Fox sends takedown requests to Google for pages which the company says contain links to material it holds the copyright to. Those pages include those on Chilling Effects which show which links Fox wants taken down. Google delists the Chilling Effects pages from its search engine, thus completing the circle and defeating the very reason Chilling Effects was set up for in the first place.
chilling-effects  copyright  internet  legal  dmca  google  law 
january 2013 by jm
Authentication is machine learning
This may be the most insightful writing about authentication in years:
From my brief time at Google, my internship at Yahoo!, and conversations with other companies doing web authentication at scale, I’ve observed that as authentication systems develop they gradually merge with other abuse-fighting systems dealing with various forms of spam (email, account creation, link, etc.) and phishing. Authentication eventually loses its binary nature and becomes a fuzzy classification problem.

This is not a new observation. It’s generally accepted for banking authentication and some researchers like Dinei Florêncio and Cormac Herley have made it for web passwords. Still, much of the security research community thinks of password authentication in a binary way [..]. Spam and phishing provide insightful examples: technical solutions (like Hashcash, DKIM signing, or EV certificates), have generally failed but in practice machine learning has greatly reduced these problems. The theory has largely held up that with enough data we can train reasonably effective classifiers to solve seemingly intractable problems.


(via Tony Finch.)
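
A minimal sketch of what "authentication as a fuzzy classification problem" could look like, assuming a hand-weighted score in place of a trained classifier. The signals, weights and thresholds are all invented for illustration and are not taken from the article.

```python
from dataclasses import dataclass


@dataclass
class LoginAttempt:
    # Hypothetical signals; a real system would use many more.
    password_correct: bool
    known_device: bool
    usual_country: bool
    recent_failed_attempts: int


def risk_score(attempt: LoginAttempt) -> float:
    """Return a score in [0, 1]; higher means more suspicious.

    The weights are invented; in practice they would be learned from data.
    """
    if not attempt.password_correct:
        return 1.0
    score = 0.0
    if not attempt.known_device:
        score += 0.4
    if not attempt.usual_country:
        score += 0.4
    score += min(attempt.recent_failed_attempts, 4) * 0.05
    return min(score, 1.0)


def decide(attempt: LoginAttempt, allow_below: float = 0.3,
           deny_above: float = 0.7) -> str:
    """Map the score to three outcomes instead of a binary yes/no."""
    score = risk_score(attempt)
    if score < allow_below:
        return "allow"
    if score > deny_above:
        return "deny"
    return "challenge"  # e.g. ask for a second factor
```

The point is the middle outcome: a correct password from an unfamiliar device is neither accepted nor rejected outright, it gets an extra challenge.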
passwords  authentication  big-data  machine-learning  google  abuse  antispam  dkim  via:fanf 
december 2012 by jm
GMail partial outage - Dec 10 2012 incident report [PDF]
TL;DR: a bad load balancer change was deployed globally, causing the impact. 21 minute time to detection. Single-location rollout is now on the cards
gmail  google  coe  incidents  postmortems  outages 
december 2012 by jm
Weathering the Unexpected - ACM Queue
Failures happen, and resilience drills help organizations prepare for them.


Good write-up on Google's DiRT (Disaster Recovery Test) procedures, clearly based on Amazon's Gameday exercises. ;) See also http://queue.acm.org/detail.cfm?id=2371297 for a moderated discussion including Jesse Robbins and John Allspaw
game-day  tests  disaster-recovery  dirt  exercises  history  amazon  google  etsy  resilience  acm 
september 2012 by jm
Spanner: Google's Globally-Distributed Database [PDF]

Abstract: Spanner is Google's scalable, multi-version, globally-distributed, and synchronously-replicated database. It is the first system to distribute data at global scale and support externally-consistent distributed transactions. This paper describes how Spanner is structured, its feature set, the rationale underlying various design decisions, and a novel time API that exposes clock uncertainty. This API and its implementation are critical to supporting external consistency and a variety of powerful features: non-blocking reads in the past, lock-free read-only transactions, and atomic schema changes, across all of Spanner.

To appear in:
OSDI'12: Tenth Symposium on Operating System Design and Implementation, Hollywood, CA, October, 2012.
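
A rough sketch of how a time API that exposes clock uncertainty can be used: the clock returns an interval instead of a single timestamp, and a writer waits out the uncertainty before treating a commit timestamp as safely in the past. The class and function names here are hypothetical, and the clock is a deterministic fake, not Spanner's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TTInterval:
    earliest: float  # true time is assumed to be no earlier than this
    latest: float    # ... and no later than this


class FakeTrueTime:
    """Deterministic stand-in clock with a fixed uncertainty bound (epsilon)."""

    def __init__(self, start: float, epsilon: float):
        self._t = start
        self.epsilon = epsilon

    def now(self) -> TTInterval:
        return TTInterval(self._t - self.epsilon, self._t + self.epsilon)

    def advance(self, seconds: float) -> None:
        self._t += seconds


def commit_with_wait(clock: FakeTrueTime, tick: float = 0.001):
    """Pick a commit timestamp, then wait until it is definitely in the past.

    Returns (commit_timestamp, simulated_seconds_waited).
    """
    commit_ts = clock.now().latest
    waited = 0.0
    while clock.now().earliest <= commit_ts:
        clock.advance(tick)  # a real system would sleep here
        waited += tick
    return commit_ts, waited
```

With an uncertainty of 5 ms either side, the simulated wait comes out at a little over 10 ms, i.e. about twice the uncertainty bound, which is why keeping that bound small matters.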
database  distributed  google  papers  toread  pdf  scalability  distcomp  transactions  cap  consistency 
september 2012 by jm
New UK Conservative Party Co-Chair Grant Shapps Founded Google Spamming Business
Wow. Scummy stuff.

Shapps founded HowToCorp in 2005, a site that, among other products, pitches the TrafficPaymaster software. The software apparently “scrapes” or copies content from all over the web, from RSS feeds to even sets of search results, to automatically generate pages that probably make little sense to the human visitor but which may pick up some traffic from Google and, in turn, generate clicks on Google AdSense or other ads.


Google are not happy:


On Sunday sources at Google confirmed TrafficPaymaster was in “violation” of its policies and that its search engine’s algorithms had been equipped to drop the ranking of any webpages created using HowToCorp’s software. Officially, Google said it does not comment on individual cases.

“We have strict policies in place to ensure web users are presented with useful ads when browsing sites in our content network and to ensure our advertisers reach an engaged audience. If we are alerted to a site which breaks our AdSense policies, we will review it and can remove it from our network.”
grant-shapps  uk  politics  tories  spammers  spamming  spinning  adsense  google  spam  trafficpaymaster 
september 2012 by jm
NASA's Mars Rover Crashed Into a DMCA Takedown
An hour or so after Curiosity’s 1.31 a.m. EST landing in Gale Crater, I noticed that the space agency’s main YouTube channel had posted a 13-minute excerpt of the stream. Its title was in an uncharacteristic but completely justified all caps: “NASA LANDS CAR-SIZE ROVER BESIDE MARTIAN MOUNTAIN.”

When I returned to the page ten minutes later, [...] the video was gone, replaced with an alien message: “This video contains content from Scripps Local News, who has blocked it on copyright grounds. Sorry about that.” That is to say, a NASA-made public domain video posted on NASA’s official YouTube channel, documenting the landing of a $2.5 billion Mars rover mission paid for with public taxpayer money, was blocked by YouTube because of a copyright claim by a private news service.
dmca  google  fail  nasa  copyright  false-positives  scripps  youtube  video  mars 
august 2012 by jm
Practical machine learning tricks from the KDD 2011 best industry paper
Wow, this is a fantastic paper. It's a Google paper on detecting scam/spam ads using machine learning -- but not just that, it's how to build out such a classifier to production scale, and make it operationally resilient, and, indeed, operable.

I've come across a few of these ideas before, and I'm happy to say I might have reinvented a few (particularly around the feature space), but all of them together make extremely good sense. If I wind up working on large-scale classification again, this is the first paper I'll go back to. Great info! (via Toby diPasquale.)
classification  via:codeslinger  training  machine-learning  google  ops  kdd  best-practices  anti-spam  classifiers  ensemble  map-reduce 
july 2012 by jm
'You are shrunk to the height of a nickel and thrown into a blender. Your mass is reduced so that your density is the same as usual. The blades start moving in 60 seconds. What do you do?'
Brilliant responses to this stereotypically-annoying Google interview question:

"Since being shrunk down like this is impossible, I can only assume this is happening inside a dream or nightmare of some kind. I sit down and meditate, summoning up my Siddartha/Neo like mental powers and realise that there is no blender, and that this terrible dream was created by the ego of a sadistic Google employee. As the kundalini fire races up my spine, and my spirit is liberated, I open my third eye and bathe said Google employee in the light of love. I forgive him, for he knows not what he does."
funny  interviewing  google  blenders  reddit 
july 2012 by jm
SSTable and Log Structured Storage: LevelDB
good writeup of LevelDB's native storage formats; the Sorted String Table (SSTable), Log Structured Merge Trees, and Snappy compression
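
For a feel of the core idea, here is a toy in-memory stand-in for a sorted string table: keys are kept sorted, lookups are binary searches, and a newer table is merged over an older one in the way a log-structured merge tree compacts its files. This is only a sketch; it does not reflect LevelDB's actual on-disk format, and all names are invented.

```python
import bisect


class SortedStringTable:
    """Immutable, in-memory stand-in for an on-disk sorted string table."""

    def __init__(self, items):
        pairs = sorted(items.items())
        self._keys = [k for k, _ in pairs]
        self._values = [v for _, v in pairs]

    def get(self, key):
        """Binary-search for a key; return None if it is absent."""
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._values[i]
        return None

    def scan(self, start, end):
        """Return (key, value) pairs with start <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        return list(zip(self._keys[lo:hi], self._values[lo:hi]))

    def items(self):
        return list(zip(self._keys, self._values))


def merge_tables(newer, older):
    """Compaction-style merge: on duplicate keys the newer table wins."""
    combined = dict(older.items())
    combined.update(newer.items())
    return SortedStringTable(combined)
```

Because every table is sorted, range scans are cheap and merging two tables is a sequential pass, which fits the sequential-I/O strengths the writeup describes.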
leveldb  nosql  data  storage  disk  persistence  google 
july 2012 by jm
_Building High-level Features Using Large Scale Unsupervised Learning_ [paper, PDF]
"We consider the problem of building high-level, class-specific feature detectors from only unlabeled data. For example, is it possible to learn a face detector using only unlabeled images? To answer this, we train a 9-layered locally connected sparse autoencoder with pooling and local contrast normalization on a large dataset of images (the model has 1 billion connections, the dataset has 10 million 200x200 pixel images downloaded from the Internet). We train this network using model parallelism and asynchronous SGD on a cluster with 1,000 machines (16,000 cores) for three days. Contrary to what appears to be a widely-held intuition, our experimental results reveal that it is possible to train a face detector without having to label images as containing a face or not. Control experiments show that this feature detector is robust not only to translation but also to scaling and out-of-plane rotation. We also find that the same network is sensitive to other high-level concepts such as cat faces and human bodies. Starting with these learned features, we trained our network to obtain 15.8% accuracy in recognizing 20,000 object categories from ImageNet, a leap of 70% relative improvement over the previous state-of-the-art."
algorithms  machine-learning  neural-networks  sgd  labelling  training  unlabelled-learning  google  research  papers  pdf 
june 2012 by jm
Danish Police Censor Google, Facebook and 8,000 Other Sites by Accident | TorrentFreak
'Lundberg said that his organization was sorry for the mistake and has now adopted a new system whereby blocked sites have to now be approved by two employees instead of one, although why that was not the case already for such a serious process is up for debate. The other question is how at the flick of a switch do 8,000 sites suddenly get added to a blacklist – for whatever reason – without any kind of oversight. Denmark’s IT-Political Association is critical and has called for ISPs to cease cooperation with the voluntary scheme which operates without any kind of judicial review. “Today’s story shows that the police are not able to secure against manual errors that could escalate into something that actually works as a ‘kill switch’ for the Internet,” the group said in a statement.'
censorship  denmark  internet  filtering  review  google  facebook  blocking 
march 2012 by jm
YouTube Identifies Birdsong As Copyrighted Music - Slashdot
'So I asked some questions, and it appears that the birds singing in the background of my video are Rumblefish's exclusive intellectual property.' Major problems with how YouTube is now policing IP infringement, it seems.
birdsong  absurd  google  fail  youtube  rumblefish  copyfight 
february 2012 by jm
Google App Engine Price Hike Stuns Developers - Platform as a Service - Informationweek
'Now that Google has begun offering App Engine users a way to calculate the new rate and compare it with the old rate, developers are realizing their bills will rise, by a factor of 10 or 100 or more in some cases, when the pricing change takes effect in a few months.' - ouch
google  gae  appengine  costs  pricing  paas 
september 2011 by jm
LevelDB Benchmarks
nice results, particularly for sequential ops. will be a Riak backend vs InnoDB
leveldb  riak  databases  files  disk  google  storage  benchmarks 
july 2011 by jm
The first Irish case on defamation via autocomplete
Google Instant has picked up people searching for 'Ballymascanlon hotel receivership' and is now offering this as an autocomplete option -- cue defamation lawsuit. Defamation via machine learning
machine-learning  defamation  google  google-instant  search  ballymascanlon  hotels  autocomplete  law-enforcement 
june 2011 by jm
GTA4 Google Map
wow, very impressive -- as far as I can tell, it really _is_ using GMaps infrastructure to some degree
google-maps  google  maps  gta4  grand-theft-auto  via:nelson  games 
june 2011 by jm
Rumor: Google “Disgusted” With Record Labels
'Once again, Warner is the fly in the ointment, the same company that praises Spotify one day, renews their licenses for the rest of the world and then the next day doesn’t want to license them in the US.'
google  music  cloud  licensing  music-industry  record-labels  warner-music  streaming  from delicious
april 2011 by jm
What Larry Page really needs to do to return Google to its startup roots
massively detailed critique of Google's corporate culture -- lots of internals exposed
google  management  culture  aws  corporate-culture  gossip  from delicious
march 2011 by jm
snappy - A fast compressor/decompressor
'On a single core of a Core i7 processor in 64-bit mode, it compresses at about 250 MB/sec or more and decompresses at about 500 MB/sec or more. (These numbers are for the slowest inputs in our benchmark suite; others are much faster.) In our tests, Snappy usually is faster than algorithms in the same class (e.g. LZO, LZF, FastLZ, QuickLZ, etc.) while achieving comparable compression ratios.'  Apache-licensed, from Google
snappy  google  compression  speed  from delicious
march 2011 by jm
Contracts for Java
'Preconditions, postconditions, and invariants are added as Java boolean expressions inside annotations.'  nice
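
The same idea can be approximated in Python with decorators. This is an analogy only, not the Contracts for Java API: `requires` and `ensures` are hypothetical helper names, and the contracts are plain boolean functions instead of expressions inside annotations.

```python
import functools
import math


def requires(predicate, message="precondition failed"):
    """Check a boolean condition on the arguments before the call."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if not predicate(*args, **kwargs):
                raise AssertionError(message)
            return func(*args, **kwargs)
        return wrapper
    return decorator


def ensures(predicate, message="postcondition failed"):
    """Check a boolean condition on the return value after the call."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            if not predicate(result):
                raise AssertionError(message)
            return result
        return wrapper
    return decorator


@requires(lambda x: x >= 0, "x must be non-negative")
@ensures(lambda result: result >= 0, "result must be non-negative")
def integer_sqrt(x: int) -> int:
    """Largest integer whose square does not exceed x."""
    return math.isqrt(x)
```

Calling `integer_sqrt(-1)` fails on the precondition before the function body runs, so the contract violation is reported at the caller's mistake and not somewhere deeper in the implementation.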
java  google  coding  open-source  contracts  eiffel  preconditions  invariants  annotations  from delicious
february 2011 by jm
Chromium Blog: HTML Video Codec Support in Chrome
'we are supporting the WebM (VP8) and Theora video codecs, and will consider adding support for other high-quality open codecs in the future. Though H.264 plays an important role in video, as our goal is to enable open innovation, support for the codec will be removed and our resources directed towards completely open codec technologies.'
google  chrome  video  webm  h264  open-source  swpats  from delicious
january 2011 by jm
Jon Rafman
fantastic collection of Google Street View gems
jon-rafman  google  photography  tumblr  art  street-view  funny  from delicious
december 2010 by jm
good investigation into an Android WebKit exploit
already fixed in Froyo, but still -- interesting write-up from Sophos. good to see Google have chosen to separate all apps into individual uids, too
froyo  google  apps  phones  smartphones  android  webkit  exploits  security  from delicious
november 2010 by jm
http://www.2600.com/googleblacklist/
extensive. the NSFW words that Google Instant won't search for (via Waxy)
nsfw  censorship  filtering  google  keywords  search  blacklist  google-instant  from delicious
september 2010 by jm
Fried Androids? :: The Future of the Internet — And How to Stop It
scary stuff. East Texas patent-troll court has ruled that EchoStar must remotely disable customers' DVRs due to patent infringement, which they are (thankfully) refusing to do and are now held in contempt for $200M -- the blog suggests this could happen due to the Google-Oracle suit, to Android phones
google  via:tieguy  law  east-texas  dvr  remote-disabling  internet  oracle  swpats  from delicious
august 2010 by jm
Overclocking SSL
techie details from Adam Langley on how Google's been improving TLS/SSL, with lots of good tips. they switched in January to HTTPS for all Gmail users by default, without any additional machines or hardware
certificates  encryption  google  https  latency  speed  ssl  tcp  tls  web  performance  from delicious
july 2010 by jm
Network Advertising Initiative: Opt-Out of Behavioural Advertising
'developed for the express purpose of allowing consumers to "opt out" of the behavioral advertising delivered by our member companies' -- opt out of the top 50 or so ad programs with a couple of clicks, via Jordan Sissel. great stuff
ads  advertising  browser  cookies  via:jordansissel  google  marketing  opt-out  privacy  tracking  web  behavioral  from delicious
june 2010 by jm
SEO Is Mostly Quack Science
'There is no hypothesis being tested here. It's just graphs, and misleading graphs at that. The sad part is, SEOMoz is as close as the SEO industry comes to real science. They may be presenting specious results in hopes of looking like they know what they're talking about, but at least they are collecting some sort of data. Everything else in the field is either anecdotal hocus-pocus or a decree from Matt Cutts. When you hire an SEO consultant, what you are really paying for is domain experience in the not-failing-at-web-design field.'
seo  ted-dziuba  rants  science  seomoz  quality  correlation  statistics  google  from delicious
june 2010 by jm
WebM
open audio/video for the web, from Google; VP8 video codec, Ogg for audio, and a subset of Matroska as the container format. still a patents minefield, though, I'd guess
codec  foss  google  open-source  patents  audio  video  vp8  webm  standards  mozilla  open  web  from delicious
may 2010 by jm
How to get Google Voice working in Ireland
hacky, but I'm very tempted -- GV looks nifty and there's no indication they're bothering to roll it out on this side of the pond
google  google-voice  phone  ireland  hacks  skype  from delicious
march 2010 by jm