Monitoring the Status of Your EBS Volumes
7 days ago by jm
Page in the AWS docs describing their derived metrics and how they are computed -- these are visible in the AWS Management Console, and alarmable, but not viewable in the CloudWatch UI. grr. (page-joshea!)
ebs
aws
monitoring
metrics
ops
documentation
cloudwatch
7 days ago by jm
AWS forum post on interpreting iostat output for EBS
8 days ago by jm
Great post from AndrewC@EBS on interpreting iostat output on EBS volumes -- from 2009, but still looks reasonable enough
iostat
ebs
disks
hardware
aws
ops
8 days ago by jm
Measuring & Optimizing I/O Performance
8 days ago by jm
Another good writeup on iostat and EBS, from Ilya Grigorik
io
optimization
sysadmin
performance
iostat
ebs
aws
ops
8 days ago by jm
ec2-consistent-snapshot
Handy!
ubuntu
ec2
aws
linux
ebs
snapshots
ops
tools
alestic
8 days ago by jm
This program creates an EBS snapshot for an Amazon EC2 EBS volume. To
help ensure consistent data in the snapshot, it tries to flush and
freeze the filesystem(s) first as well as flushing and locking the
database, if applicable.
Filesystems can be frozen during the snapshot. Prior to Linux kernel
2.6.29, XFS must be used for freezing support. While frozen, a
filesystem will be consistent on disk and all writes will block.
There are a number of timeouts to reduce the risk of interfering with
the normal database operation while improving the chances of getting a
consistent snapshot.
If you have multiple EBS volumes in a RAID configuration, you can
specify all of the volume ids on the command line and it will create
snapshots for each while the filesystem and database are locked. Note
that it is your responsibility to keep track of the resulting snapshot
ids and to figure out how to put these back together when you need to
restore the RAID setup.
8 days ago by jm
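A rough sketch of the freeze / snapshot / unfreeze ordering described in the ec2-consistent-snapshot notes above. This is not the tool's actual implementation: `fsfreeze` (from util-linux) is used as a stand-in freeze command, and `run` and `create_snapshot` are hypothetical callables you would supply yourself (e.g. a subprocess wrapper and a thin wrapper around your EC2 client). Database locking and timeouts are left out.

```python
from contextlib import contextmanager


@contextmanager
def frozen_filesystem(mount_point, run):
    """Freeze mount_point for the duration of the block, then always thaw it."""
    run(["fsfreeze", "--freeze", mount_point])
    try:
        yield
    finally:
        # Unfreeze even if a snapshot call fails, so writes do not stay blocked.
        run(["fsfreeze", "--unfreeze", mount_point])


def snapshot_volumes(mount_point, volume_ids, run, create_snapshot):
    """Snapshot every volume while the filesystem is frozen.

    Returns a {volume_id: snapshot_id} mapping, since keeping track of
    which snapshot belongs to which RAID member is left to the caller.
    """
    with frozen_filesystem(mount_point, run):
        return {vid: create_snapshot(vid) for vid in volume_ids}
```

Passing the command runner and the snapshot call in as arguments keeps the ordering logic testable without touching a real filesystem or AWS account.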
Understanding Elastic Block Store Availability and Performance [slides]
20 days ago by jm
fantastic in-depth presentation on EBS usage; lots of good advice here if you're using EBS volumes with/without PIOPS
piops
ebs
performance
aws
ec2
ops
storage
amazon
presentations
20 days ago by jm
Under the Covers of DynamoDB
4 weeks ago by jm
mostly a DynamoDB puff-piece from last week's Amazon Cloud Connect, but contains some good real-world figures for a 20-billion-GUID deduping table use-case at the end. ($4,150 per month, to cut to the chase)
dynamodb
aws
figures
costs
architecture
ec2
dedupe
cloud-connect
slides
4 weeks ago by jm
Latency's Worst Nightmare: Performance Tuning Tips and Tricks [slides]
4 weeks ago by jm
the basics of running a service stack (web, app servers, data stores) on AWS. some good benchmark figures in the final slides
benchmarks
aws
ec2
ebs
piops
services
scaling
scalability
presentations
4 weeks ago by jm
High Scalability - Scaling Pinterest - From 0 to 10s of Billions of Page Views a Month in Two Years
5 weeks ago by jm
wow, Pinterest have a pretty hardcore architecture. Sharding to the max. This is scary stuff for me:
a [Cassandra-style] Cluster Management Algorithm is a SPOF. If there’s a bug it impacts every node. This took them down 4 times.
yeah, so, eek ;)
clustering
sharding
architecture
aws
scalability
scaling
pinterest
via:matt-sergeant
redis
mysql
memcached
5 weeks ago by jm
High Performance MongoDB Clusters with Amazon EBS Provisioned IOPS
6 weeks ago by jm
yeah yeah, Mongo. bookmarking for the good data on EBS+PIOPS
ebs
piops
aws
performance
tips
ops
ec2
mongodb
presentations
6 weeks ago by jm
By the numbers: How Google Compute Engine stacks up to Amazon EC2
9 weeks ago by jm
Scalr's thoughts on Google's EC2 competitor.
gce
cloud
ec2
amazon
aws
google
scalr
with Google Compute Engine, AWS has a formidable new competitor in the public cloud space, and we’ll likely be moving some of Scalr’s production workloads from our hybrid aws-rackspace-softlayer setup to it when it leaves beta. There’s a strong technical case for migrating heavy workloads to GCE, and I’ll be grabbing popcorn to eagerly watch as the battle unfolds between the giants.
9 weeks ago by jm
Sift Science says it can sniff out cyber fraud — before it gets expensive
9 weeks ago by jm
Great idea for a startup. This stuff is complex, right in the heart of every company's ordering pipeline, and I can see a lot of customers for this
sift-science
anti-fraud
fraud
b2b
b2c
ecommerce
startups
aws
9 weeks ago by jm
Denominator: A Multi-Vendor Interface for DNS
11 weeks ago by jm
the latest good stuff from Netflix.
dns
netflix
java
tools
ops
route53
aws
ultradns
dynect
Denominator is a portable Java library for manipulating DNS clouds. Denominator has pluggable back-ends, initially including AWS Route53, Neustar Ultra, DynECT, and a mock for testing. We also ship a command line version so it's easy for anyone to try it out.
The reason we built Denominator is that we are working on multi-region failover and traffic sharing patterns to provide higher availability for the streaming service during regional outages caused by our own bugs and AWS issues. To do this we need to directly control the DNS configuration that routes users to each region and each zone. When we looked at the features and vendors in this space we found that we were already using AWS Route53, which has a nice API but is missing some advanced features; Neustar UltraDNS, which has a SOAP based API; and DynECT, which has a REST API that uses a quite different pseudo-transactional model. We couldn’t find a Java based API that grouped together common set of capabilities that we are interested in, so we created one. The idea is that any feature that is supported by more than one vendor API is the highest common denominator, and that functionality can be switched between vendors as needed, or in the event of a DNS vendor outage.
11 weeks ago by jm
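The "supported by more than one vendor" rule quoted in the Denominator entry above can be sketched as a simple count over per-vendor feature sets. This is only an illustration of the idea, not Denominator's code (which is Java); the function name, vendor names, and feature names are all made-up placeholders.

```python
from collections import Counter


def common_features(vendor_features, min_vendors=2):
    """Return the features offered by at least min_vendors vendors.

    vendor_features maps a vendor name to the set of features it supports.
    """
    counts = Counter(
        feature
        for features in vendor_features.values()
        for feature in set(features)  # count each feature once per vendor
    )
    return {feature for feature, n in counts.items() if n >= min_vendors}


# Placeholder data: not a statement about any real DNS vendor.
example = {
    "vendor_a": {"basic_records", "weighted_routing"},
    "vendor_b": {"basic_records", "geo_routing"},
    "vendor_c": {"basic_records", "geo_routing", "custom_thing"},
}
portable = common_features(example)
```

Anything that ends up in `portable` is a feature you could, in principle, switch between vendors; anything left out is vendor-specific.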
Ironfan
january 2013 by jm
'an expressive toolset for constructing scalable, resilient [service] architectures. It works in the cloud, in the data center, and on your laptop, and it makes your system diagram visible and inevitable. Inevitable systems coordinate automatically to interconnect, removing the hassle of manual configuration of connection points (and the associated danger of human error).' Looks like a pretty neat cluster deployment tool; driven from a single configuration file, using Chef, integrating closely with AWS and providing many useful additional features
chef
deployment
clusters
knife
services
aws
ec2
ops
ironfan
demo
january 2013 by jm
James Hamilton - Failures at Scale & How to Ride Through Them - AWS re:Invent 2012 - Cpn208
december 2012 by jm
mostly an update of his classic USENIX paper, but pretty cool to come across a mention of a network monitoring system we've built on page 21 ;)
amazon
james-hamilton
reliabilty
slides
aws
december 2012 by jm
How Team Obama’s tech efficiency left Romney IT in dust | Ars Technica
november 2012 by jm
The web-app dev and ops best practices used by the Obama campaign's tech team. Some key tools: Puppet, EC2, Asgard, Cacti, Opsview, StatsD, Graphite, Seyren, Route53, Loggly, etc.
obama
campaigns
tools
ops
asgard
ec2
aws
route53
november 2012 by jm
Amazon Web Services Blog: Amazon S3 Performance Tips & Tricks
march 2012 by jm
Doug Grismore provides a very useful S3 performance tip: monotonically increasing keys will hurt performance. He also describes a clean-enough way to avoid the problem
s3
performance
aws
march 2012 by jm
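One common way to avoid monotonically increasing S3 keys, as mentioned in the entry above, is to put a short hash-derived prefix in front of each key so that consecutive keys no longer sort next to each other. A minimal sketch, which may or may not match the exact scheme in the linked post; `spread_key` and its `prefix_len` parameter are names invented here.

```python
import hashlib


def spread_key(key, prefix_len=4):
    """Return key with a short, deterministic hash prefix in front of it.

    The prefix is derived from the key itself, so the same input always
    maps to the same stored key and can be recomputed at read time.
    """
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_len]}/{key}"


stored = spread_key("2012-03-01/log-0001")
```

The trade-off is that listing keys in their original order gets harder, since the prefix scatters them across the keyspace.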
Cloud Architecture Tutorial - Platform Component Architecture (2of3)
march 2012 by jm
Amazing stuff from Adrian Cockcroft at last week's QCon. Faceted object model, lots of Cassandra automation
cassandra
api
design
oo
object-model
java
adrian-cockroft
slides
qcon
scaling
aws
netflix
march 2012 by jm
Cloudsmith Stack Hammer
february 2012 by jm
something Chris Horn sent on -- using Puppet to build stacks and deploy to AWS using a simple point-and-click interface. looks cool
github
ec2
aws
puppet
stacks
cloudsmith
stack-hammer
via:chorn
february 2012 by jm
Benchmarking Cassandra Scalability on AWS - Over a million writes per second
november 2011 by jm
Netflix's benchmarks -- impressively detailed. '48, 96, 144 and 288 instances', across 3 EC2 AZs in us-east, successfully scaling linearly
ec2
aws
cassandra
scaling
benchmarks
netflix
performance
november 2011 by jm
Amazon hiring embedded OS developers
october 2011 by jm
hey, I know a few of those! 'I need more help on a project I’m driving at Amazon where we continue to make big changes in our datacenter network to improve customer experience and drive down costs while, at the same time, deploying more gear into production each day than all of Amazon.com used back in 2000. It’s an exciting time and we have big changes happening in networking. If you enjoy and have experience in operating systems, networking protocol stacks, or embedded systems and you would like to work on one of the biggest networks in the world, [get in touch].' -- James Hamilton
james-hamilton
aws
jobs
amazon
networking
embedded
october 2011 by jm
Building with Legos
august 2011 by jm
Netflix tech blog on how they deploy their services. Notably, they avoid the Puppet/Chef approach, citing these reasons: 'One is that it eliminates a number of dependencies in the production environment: a master control server, package repository and client scripts on the servers, network permissions to talk to all of these. Another is that it guarantees that what we test in the test environment is the EXACT same thing that is deployed in production; there is very little chance of configuration or other creep/bit rot. Finally, it means that there is no way for people to change or install things in the production environment (this may seem like a really harsh restriction, but if you can build a new AMI fast enough it doesn't really make a difference).'
devops
cloud
aws
netflix
puppet
chef
deployment
august 2011 by jm
Amazon EC2 outage: summary and lessons learned
april 2011 by jm
Rightscale CTO on last week's outage; pretty detailed, good round-up of useful commentary from around the web, too
ebs
ec2
aws
cloud
availability
slas
rightscale
amazon
april 2011 by jm
What Larry Page really needs to do to return Google to its startup roots
march 2011 by jm
massively detailed critique of Google's corporate culture -- lots of internals exposed
google
management
culture
aws
corporate-culture
gossip
from delicious
march 2011 by jm
Quora’s Technology Examined
february 2011 by jm
Python, Nginx, Tornado for COMET stuff, MySQL as a data store, memcached, Thrift, haproxy, AWS, Pylons. fantastic, very detailed post (via Nelson)
quora
python
nginx
tornado
comet
mysql
memcached
thrift
haproxy
aws
pylons
via:nelson
from delicious
february 2011 by jm
Netflix: Dev and Ops internals
november 2010 by jm
extensive details on the innards of Netflix's move to AWS, from the legendary Adrian Cockcroft
adrian-cockcroft
aws
netflix
ops
cloud
from delicious
november 2010 by jm