jm + golang   8

pachyderm
'Containerized Data Analytics':
There are two bold new ideas in Pachyderm:

Containers as the core processing primitive
Version Control for data

These ideas lead directly to a system that's much more powerful, flexible and easy to use.

To process data, you simply create a containerized program which reads and writes to the local filesystem. You can use any tools you want because it's all just going in a container! Pachyderm will take your container and inject data into it. We'll then automatically replicate your container, showing each copy a different chunk of data. With this technique, Pachyderm can scale any code you write to process up to petabytes of data (Example: distributed grep).

Pachyderm also version controls all data using a commit-based distributed filesystem (PFS), similar to what git does with code. Version control for data has far reaching consequences in a distributed filesystem. You get the full history of your data, can track changes and diffs, collaborate with teammates, and if anything goes wrong you can revert the entire cluster with one click!

Version control is also very synergistic with our containerized processing engine. Pachyderm understands how your data changes and thus, as new data is ingested, can run your workload on the diff of the data rather than the whole thing. This means that there's no difference between a batched job and a streaming job, the same code will work for both!
analytics  data  containers  golang  pachyderm  tools  data-science  docker  version-control 
4 weeks ago by jm
How and why the leap second affected Cloudflare DNS
The root cause of the bug that affected our DNS service was the belief that time cannot go backwards. In our case, some code assumed that the difference between two times would always be, at worst, zero. RRDNS is written in Go and uses Go’s time.Now() function to get the time. Unfortunately, this function does not guarantee monotonicity. Go currently doesn’t offer a monotonic time source.


So the clock went "backwards", s1 - s2 returned < 0, and the code couldn't handle it (because it's a little known and infrequent failure case).

Part of the root cause here is cultural -- Google has solved the leap-second problem internally through leap smearing, and Go seems to be fundamentally a Google product at heart.

The easiest fix in general in the "outside world" is to use "ntpd -x" to do a form of smearing. It looks like AWS are leap smearing internally (https://aws.amazon.com/blogs/aws/look-before-you-leap-the-coming-leap-second-and-aws/), but it is a shame they aren't making this a standard part of services running on top of AWS and a feature of the AWS NTP fleet.
ntp  time  leap-seconds  fail  cloudflare  rrdns  go  golang  dns  leap-smearing  ntpd  aws 
11 weeks ago by jm
DHCPLB: An open source load balancer | Engineering Blog | Facebook Code
nice infra hacks from an intern at Facebook Dublin working with @pallotron
facebook  golang  go  intern  dublin  dhcp  dhcplb  load-balancers 
september 2016 by jm
youtube/doorman
Doorman is a solution for Global Distributed Client Side Rate Limiting. Clients that talk to a shared resource (such as a database, a gRPC service, a RESTful API, or whatever) can use Doorman to voluntarily limit their use (usually in requests per second) of the resource. Doorman is written in Go and uses gRPC as its communication protocol. For some high-availability features it needs a distributed lock manager. We currently support etcd, but it should be relatively simple to make it use Zookeeper instead.


From google -- very interesting to see they're releasing this as open source, and it doesn't rely on G-internal services
distributed  distcomp  locking  youtube  golang  doorman  rate-limiting  rate-limits  limits  grpc  etcd 
july 2016 by jm
Go best practices, six years in
from Peter Bourgon. Looks like a good list of what to do and what to avoid
go  golang  best-practices  coding  guidelines 
may 2016 by jm
The Three Go Landmines
'There are three easy to make mistakes in go. I present them here in the way they are often found in the wild, not in the way that is easiest to understand. All three of these mistakes have been made in Kubernetes code, getting past code review at least once each that I know of.'
k8s  go  golang  errors  coding  bugs 
march 2016 by jm
go-kit
Dropwizard for Go, basically:
a distributed programming toolkit for building microservices in large organizations. We solve common problems in distributed systems, so you can focus on your business logic.
microservices  go  golang  http  libraries  open-source  rpc  circuit-breakers 
january 2016 by jm

Copy this bookmark:



description:


tags: