Pinterest Architecture Update - 18 Million Visitors, 10x Growth, 12 Employees, 410 TB of Data
4 days ago
"18 million visitors in March, a 50% increase from the previous month, with very little IT infrastructure"
aws
4 days ago
Observations on Errors, Corrections, & Trust of Dependent Systems
4 weeks ago
"Each vertical team knows their component well but nobody understands the interactions of all the components. The two solutions are 1) well-defined and well-documented interfaces between components, be they hardware or software, and 2) and very experienced, highly-skilled engineer(s) on the team focusing on understanding inter-component interaction and overall system operation, especially in fault modes."
consistency
ha
monitoring
4 weeks ago
Architecture Without an End State
4 weeks ago
Michael Nygard outlines 8 rules for dealing with complex systems: Embrace Plurality, Contextualize Downstream, Beware Grandiosity, Decentralize, Isolate Failure Domains, Data Outlives Applications, Applications Outlive Integrations, Increase Discoverability.
distributedgeneral
business
ha
data
4 weeks ago
CS61A, Spring 2012 Online Textbook
9 weeks ago
Derived from the classic textbook Structure and Interpretation of Computer Programs ("SICP") -- s/Scheme/Python
python
functional
9 weeks ago
People Make Poor Monitors for Computers
12 weeks ago
Notable comment: "The logical conclusion is that if a system cannot be 100% automated, then it may be best to deliberately automate less than is technically possible. If human operators are required to routinely solve less than critical problems, they will be better equipped to solve the very rare critical problems."
automation
devgeneral
monitoring
12 weeks ago
The DevOps Transformation - YouTube
12 weeks ago
Highly recommended talk. Keynote Address at the 25th Large Installation System Administration Conference (LISA '11), by Ben Rockwood, Joyent.
automation
configuration
devgeneral
12 weeks ago
Scala -> Bytecode
february 2012
"There are rules how Scala code is compiled to JVM-bytecode. Because of potential name clashes the generated code is not always intuitive to understand but if the rules are known it is possible to get access to the compiled Scala code within Java."
scala
jvm
february 2012
PUT or POST: The REST of the Story « Open Sourcery
january 2012
"Difference between PUT and POST in REST. Essentially, PUTs are idempotent whereas POSTs aren’t."
rest
http
january 2012
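The idempotence distinction in the PUT or POST entry above can be illustrated with a toy in-memory store. This is only a sketch: `NoteStore` and its methods are hypothetical names, no real HTTP is involved, and it assumes the common convention that POST lets the server assign a new id on each call.

```python
import itertools


class NoteStore:
    """Toy in-memory resource collection (hypothetical, no HTTP involved)."""

    def __init__(self):
        self.notes = {}
        self._ids = itertools.count(1)

    def put(self, note_id, body):
        # PUT: the client names the resource, so repeating the call
        # leaves the store in the same state as calling it once.
        self.notes[note_id] = body
        return note_id

    def post(self, body):
        # POST: the server picks a fresh id on every call, so a retry
        # creates a second resource.
        note_id = next(self._ids)
        self.notes[note_id] = body
        return note_id


store = NoteStore()
store.put("todo", "buy milk")
store.put("todo", "buy milk")   # retry: still one resource
put_count = len(store.notes)

store.post("buy milk")
store.post("buy milk")          # retry: a duplicate is created
post_count = len(store.notes) - put_count
```

Here `put_count` stays at 1 while `post_count` reaches 2, which is why a client can safely retry a PUT after a timeout but needs more care with a POST.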
corruptmemory/herding-cats - GitHub
january 2012
Scala library for working with Apache ZooKeeper. It uses Scalaz (6.x) extensively, in particular Scalaz Promises, so that you do not have to structure your code as a series of callbacks (i.e., inversion of control). The result is a much cleaner way to work with ZooKeeper.
scala
configuration
consistency
january 2012
Welcome to the Jungle
january 2012
Henceforth, a single compute-intensive application will need to harness different kinds of cores, in immense numbers, to get its job done.
The free lunch is over. Now welcome to the hardware jungle
concurrency
january 2012
Netflix/curator - GitHub
january 2012
ZooKeeper client wrapper and rich ZooKeeper framework - from Netflix
configuration
consistency
concurrency
january 2012
Fork Yeah! The Rise and Development of illumos
december 2011
If you are into Unix of any kind (or open source, or cloud for that matter), you should watch this one. It gets really good 30 minutes in.
cloudgeneral
io
illumos
joyent
virtualization
december 2011
Scaldi
december 2011
"Goal of the project is to provide more standard and easy way to make dependency injection in Scala projects consuming power of the Scala language. With Scaldi you can define your application modules in pure Scala without any annotations or XML."
scala
configuration
december 2011
A Security Analysis of Amazon’s Elastic Compute Cloud Service
december 2011
"In this paper, we explored the general security risks associated with virtual server images from the public catalogs of cloud service providers. We investigated in detail the security problems of public images that are available on the Amazon EC2 service. Our findings demonstrate that both users and providers of public AMIs may be vulnerable to security risks such as unauthorized access, malware infections, and the loss of sensitive information. The Amazon Web Services Security Team has acknowledged our findings, and has already taken steps to address the security risks we have identified."
cloudgeneral
security
aws
december 2011
The Netflix Tech Blog: Making the Netflix API More Resilient
december 2011
Here are some of the key principles that informed our thinking as we set out to make the API more resilient.
1. A failure in a service dependency should not break the user experience for members
2. The API should automatically take corrective action when one of its service dependencies fails
3. The API should be able to show us what’s happening right now, in addition to what was happening 15-30 minutes ago, yesterday, last week, etc.
ha
messages
monitoring
visualization
december 2011
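The first two principles in the Netflix entry above (failures should not break the user experience, and the API should take corrective action on its own) are often implemented with a fallback plus a circuit breaker. The sketch below shows that general pattern, not Netflix's actual code. `CircuitBreaker`, `flaky_ratings`, and `default_ratings` are hypothetical names, and the breaker is simplified: it never closes again once it has opened.

```python
class CircuitBreaker:
    """Simplified breaker: opens after max_failures consecutive errors."""

    def __init__(self, max_failures=3):
        self.max_failures = max_failures
        self.failures = 0  # consecutive failures, also usable as a live metric

    @property
    def is_open(self):
        return self.failures >= self.max_failures

    def call(self, dependency, fallback):
        if self.is_open:
            # Corrective action: stop calling the failing dependency.
            return fallback()
        try:
            result = dependency()
        except Exception:
            self.failures += 1
            # The caller still gets a usable (degraded) answer.
            return fallback()
        self.failures = 0
        return result


calls = {"n": 0}


def flaky_ratings():
    # Hypothetical dependency that is always down.
    calls["n"] += 1
    raise ConnectionError("ratings service down")


def default_ratings():
    # Hypothetical degraded response.
    return []


breaker = CircuitBreaker(max_failures=2)
results = [breaker.call(flaky_ratings, default_ratings) for _ in range(5)]
```

All five requests return the fallback value, but the failing dependency is only invoked twice before the breaker opens. A production breaker would also retry the dependency after a cool-down period.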
Scala Fresh is alive
december 2011
Regarding Scala's binary compatibility going forward in 2.9.x and later.
scala
december 2011
Apache Kafka Design
december 2011
There is a small number of major design decisions that make Kafka different from most other messaging systems:
1. Kafka is designed for persistent messages as the common case
2. Throughput rather than features are the primary design constraint
3. State about what has been consumed is maintained as part of the consumer not the server
4. Kafka is explicitly distributed. It is assumed that producers, brokers, and consumers are all spread over multiple machines.
messages
concurrency
ha
december 2011
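Point 3 of the Kafka entry above (consumption state lives with the consumer, not the server) can be sketched with a toy append-only log. This is not Kafka's API: `Log`, `Consumer`, and `poll` are hypothetical names chosen for illustration, and everything runs in one process.

```python
class Log:
    """Append-only message log that keeps no per-consumer state."""

    def __init__(self):
        self._messages = []

    def append(self, message):
        self._messages.append(message)

    def read(self, offset, max_messages=10):
        # The caller says where to start; the log does not remember it.
        return self._messages[offset:offset + max_messages]


class Consumer:
    """Each consumer tracks its own position in the log."""

    def __init__(self, log):
        self.log = log
        self.offset = 0

    def poll(self, max_messages=10):
        batch = self.log.read(self.offset, max_messages)
        self.offset += len(batch)
        return batch


log = Log()
for message in ["a", "b", "c"]:
    log.append(message)

fast, slow = Consumer(log), Consumer(log)
fast_batch = fast.poll()        # reads everything
slow_first = slow.poll(1)       # reads at its own pace
slow_rest = slow.poll()

fast.offset = 0                 # rewinding is just resetting a number
replayed = fast.poll(2)
```

Because the log holds no consumer bookkeeping, consumers can read at different speeds or replay old messages without the server doing any extra work.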
guava-libraries - Guava: Google Core Libraries for Java 1.5+ - Google Project Hosting
december 2011
"The Guava project contains several of Google's core libraries that we rely on in our Java-based projects: collections, caching, primitives support, concurrency libraries, common annotations, string processing, I/O, and so forth."
google
jvm
december 2011
The Trouble with Erlang Concurrency
december 2011
(Tim Fox is the primary author of https://github.com/purplefox/vert.x)
concurrency
performance
december 2011
Cloud DNS: How to Speed Up Your Cloud Apps at No Extra Charge « Joyeur
november 2011
"leverage the global DNS system to efficiently and accurately route your users to the right global location"
ha
dns
cloudgeneral
networks
performance
november 2011
purplefox/vert.x
october 2011
The next generation polyglot asynchronous application framework
concurrency
io
jvm
performance
october 2011
Single Writer Principle
october 2011
(At the heart of one of our new designs; we use ZooKeeper to elect the writer.)
concurrency
october 2011
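A minimal in-process sketch of the Single Writer Principle from the entry above: many threads submit requests, but only one thread ever mutates the shared state, so no lock on the state is needed. The function and variable names are hypothetical, and electing the writer with ZooKeeper (as mentioned in the note) is not shown here.

```python
import queue
import threading


def run_single_writer(num_producers=4, increments_each=1000):
    requests = queue.Queue()
    state = {"counter": 0}  # mutated only by the writer thread

    def writer():
        # The single writer drains the queue and applies each change.
        while True:
            delta = requests.get()
            if delta is None:  # sentinel: no more requests
                break
            state["counter"] += delta

    def producer():
        # Producers never touch the state; they only enqueue requests.
        for _ in range(increments_each):
            requests.put(1)

    writer_thread = threading.Thread(target=writer)
    writer_thread.start()

    producers = [threading.Thread(target=producer) for _ in range(num_producers)]
    for p in producers:
        p.start()
    for p in producers:
        p.join()

    requests.put(None)
    writer_thread.join()
    return state["counter"]


total = run_single_writer()
```

With the default arguments `total` is 4000, since every queued increment is applied exactly once by the same thread.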
ZooKeeper - A Reliable, Scalable Distributed Coordination System
september 2011
High Scalability writeup from 2008
configuration
concurrency
september 2011
I come to use clouds, not to build them... (Netflix)
august 2011
Adrian Cockcroft (Netflix)
cloudgeneral
aws
august 2011
HTML5 Rocks - How Browsers Work: Behind the Scenes of Modern Web Browsers
august 2011
Comprehensive primer on the internal operations of WebKit and Gecko
internet
august 2011
ZooKeeper: Wait-free coordination for Internet-scale systems
august 2011
In this paper, we describe ZooKeeper, a service for coordinating processes of distributed applications. Since ZooKeeper is part of critical infrastructure, ZooKeeper aims to provide a simple and high performance kernel for building more complex coordination primitives at the client. It incorporates elements from group messaging, shared registers, and distributed lock services in a replicated, centralized service. The interface exposed by ZooKeeper has the wait-free aspects of shared registers with an event-driven mechanism similar to cache invalidations of distributed file systems to provide a simple, yet powerful coordination service.
The ZooKeeper interface enables a high-performance service implementation. In addition to the wait-free property, ZooKeeper provides a per client guarantee of FIFO execution of requests and linearizability for all requests that change the ZooKeeper state. These design decisions enable the implementation of a high performance processing pipeline with read requests being satisfied by local servers. We show for the target workloads, 2:1 to 100:1 read to write ratio, that ZooKeeper can handle tens to hundreds of thousands of transactions per second. This performance allows ZooKeeper to be used extensively by client applications.
concurrency
messages
performance
configuration
networks
august 2011
SQLAlchemy and You
july 2011
Excellent article introducing the main differences between SQLAlchemy and the Django ORM
python
data
july 2011
Replication, atomicity and order in distributed systems
july 2011
"The goal of this post (and future posts on this topic) is to help the reader develop a basic toolkit they could use to reason about distributed systems."
concurrency
messages
cloudgeneral
july 2011
Soft Switching Fails at Scale
july 2011
"The use of software switching in the hypervisor has some good points but, in my view they are heavily outweighed by the bad."
performance
networks
july 2011
Akka and the Java Memory Model
july 2011
Discusses how the Typesafe Stack, and Akka in particular, approaches shared memory in concurrent applications.
concurrency
scala
jvm
july 2011
Comments on “The Good, the Bad, and the Ugly of REST APIs”
june 2011
William Vambenepe - @vambenepe
rest
june 2011
CS 525: Advanced Distributed Systems
june 2011
Nice list of papers and topics
concurrency
data
hadoop
hpccloud
nosql
performance
ha
messages
june 2011