jm + aws   27

Monitoring the Status of Your EBS Volumes
Page in the AWS docs which describes their derived metrics and how they are computed -- these are visible in the AWS Management Console, and alarmable, but not viewable in the Cloudwatch UI. grr. (page-joshea!)
ebs  aws  monitoring  metrics  ops  documentation  cloudwatch 
8 days ago by jm
AWS forum post on interpreting iostat output for EBS
Great post from AndrewC@EBS on interpreting iostat output on EBS volumes -- from 2009, but still looks reasonable enough
iostat  ebs  disks  hardware  aws  ops 
9 days ago by jm
Measuring & Optimizing I/O Performance
Another good writeup on iostat and EBS, from Ilya Grigorik
io  optimization  sysadmin  performance  iostat  ebs  aws  ops 
9 days ago by jm
ec2-consistent-snapshot
This program creates an EBS snapshot for an Amazon EC2 EBS volume. To
help ensure consistent data in the snapshot, it tries to flush and
freeze the filesystem(s) first as well as flushing and locking the
database, if applicable.

Filesystems can be frozen during the snapshot. Prior to Linux kernel
2.6.29, XFS must be used for freezing support. While frozen, a
filesystem will be consistent on disk and all writes will block.

There are a number of timeouts to reduce the risk of interfering with
the normal database operation while improving the chances of getting a
consistent snapshot.

If you have multiple EBS volumes in a RAID configuration, you can
specify all of the volume ids on the command line and it will create
snapshots for each while the filesystem and database are locked. Note
that it is your responsibility to keep track of the resulting snapshot
ids and to figure out how to put these back together when you need to
restore the RAID setup.


Handy!
ubuntu  ec2  aws  linux  ebs  snapshots  ops  tools  alestic 
9 days ago by jm
Understanding Elastic Block Store Availability and Performance [slides]
fantastic in-depth presentation on EBS usage; lots of good advice here if you're using EBS volumes with/without PIOPS
piops  ebs  performance  aws  ec2  ops  storage  amazon  presentations 
21 days ago by jm
Under the Covers of DynamoDB
mostly a DynamoDB puff-piece from last week's Amazon Cloud Connect, but contains some good real-world figures for a 20-billion-GUID deduping table use-case at end. ($4,150 per month, to cut to the chase)
dynamodb  aws  figures  costs  architecture  ec2  dedupe  cloud-connect  slides 
4 weeks ago by jm
Latency's Worst Nightmare: Performance Tuning Tips and Tricks [slides]
the basics of running a service stack (web, app servers, data stores) on AWS. some good benchmark figures in the final slides
benchmarks  aws  ec2  ebs  piops  services  scaling  scalability  presentations 
5 weeks ago by jm
High Scalability - Scaling Pinterest - From 0 to 10s of Billions of Page Views a Month in Two Years
wow, Pinterest have a pretty hardcore architecture. Sharding to the max. This is scary stuff for me:
a [Cassandra-style] Cluster Management Algorithm is a SPOF. If there’s a bug it impacts every node. This took them down 4 times.


yeah, so, eek ;)
clustering  sharding  architecture  aws  scalability  scaling  pinterest  via:matt-sergeant  redis  mysql  memcached 
5 weeks ago by jm
High Performance MongoDB Clusters with Amazon EBS Provisioned IOPS
yeah yeah, Mongo. bookmarking for the good data on EBS+PIOPS
ebs  piops  aws  performance  tips  ops  ec2  mongodb  presentations 
7 weeks ago by jm
By the numbers: How Google Compute Engine stacks up to Amazon EC2
Scalr's thoughts on Google's EC2 competitor.
with Google Compute Engine, AWS has a formidable new competitor in the public cloud space, and we’ll likely be moving some of Scalr’s production workloads from our hybrid aws-rackspace-softlayer setup to it when it leaves beta. There’s a strong technical case for migrating heavy workloads to GCE, and I’ll be grabbing popcorn to eagerly watch as the battle unfolds between the giants.
gce  cloud  ec2  amazon  aws  google  scalr 
9 weeks ago by jm
Sift Science says it can sniff out cyber fraud — before it gets expensive
Great idea for a startup. This stuff is complex, right in the heart of every company's ordering pipeline, and I can see a lot of customers for this
sift-science  anti-fraud  fraud  b2b  b2c  ecommerce  startups  aws 
9 weeks ago by jm
Denominator: A Multi-Vendor Interface for DNS
the latest good stuff from Netflix.

Denominator is a portable Java library for manipulating DNS clouds. Denominator has pluggable back-ends, initially including AWS Route53, Neustar Ultra, DynECT, and a mock for testing. We also ship a command line version so it's easy for anyone to try it out.
The reason we built Denominator is that we are working on multi-region failover and traffic sharing patterns to provide higher availability for the streaming service during regional outages caused by our own bugs and AWS issues. To do this we need to directly control the DNS configuration that routes users to each region and each zone. When we looked at the features and vendors in this space we found that we were already using AWS Route53, which has a nice API but is missing some advanced features; Neustar UltraDNS, which has a SOAP based API; and DynECT, which has a REST API that uses a quite different pseudo-transactional model. We couldn’t find a Java based API that grouped together common set of capabilities that we are interested in, so we created one. The idea is that any feature that is supported by more than one vendor API is the highest common denominator, and that functionality can be switched between vendors as needed, or in the event of a DNS vendor outage.
dns  netflix  java  tools  ops  route53  aws  ultradns  dynect 
12 weeks ago by jm
Ironfan
'an expressive toolset for constructing scalable, resilient [service] architectures. It works in the cloud, in the data center, and on your laptop, and it makes your system diagram visible and inevitable. Inevitable systems coordinate automatically to interconnect, removing the hassle of manual configuration of connection points (and the associated danger of human error).' Looks like a pretty neat cluster deployment tool; driven from a single configuration file, using Chef, integrating closely with AWS and providing many useful additional features
chef  deployment  clusters  knife  services  aws  ec2  ops  ironfan  demo 
january 2013 by jm
AWS Advent 2012
'an annual exploration of Amazon Web Services.' Some great hacks here
aws  amazon  advent  sysadmin  s3  ec2  chef  puppet  ops 
december 2012 by jm
James Hamilton - Failures at Scale & How to Ride Through Them - AWS re:Invent 2012 - Cpn208
mostly an update of his classic USENIX paper, but pretty cool to come across a mention of a network monitoring system we've built on page 21 ;)
amazon  james-hamilton  reliabilty  slides  aws 
december 2012 by jm
How Team Obama’s tech efficiency left Romney IT in dust | Ars Technica
The web-app dev and ops best practices used by the Obama campaign's tech team. Some key tools: Puppet, EC2, Asgard, Cacti, Opsview, StatsD, Graphite, Seyren, Route53, Loggly, etc.
obama  campaigns  tools  ops  asgard  ec2  aws  route53 
november 2012 by jm
Amazon Web Services Blog: Amazon S3 Performance Tips & Tricks
Doug Grismore provides a very useful S3 performance tip; monotonically increasing keys will hurt performance, and describes a clean-enough way to avoid the problem
s3  performance  aws 
march 2012 by jm
Cloud Architecture Tutorial - Platform Component Architecture (2of3)
Amazing stuff from Adrian Cockroft at last week's QCon. Faceted object model, lots of Cassandra automation
cassandra  api  design  oo  object-model  java  adrian-cockroft  slides  qcon  scaling  aws  netflix 
march 2012 by jm
Cloudsmith Stack Hammer
something Chris Horn sent on -- using Puppet to build stacks and deploy to AWS using a simple point-and-click interface. looks cool
github  ec2  aws  puppet  stacks  cloudsmith  stack-hammer  via:chorn 
february 2012 by jm
Benchmarking Cassandra Scalability on AWS - Over a million writes per second
NetFlix' benchmarks -- impressively detailed. '48, 96, 144 and 288 instances', across 3 EC2 AZs in us-east, successfully scaling linearly
ec2  aws  cassandra  scaling  benchmarks  netflix  performance 
november 2011 by jm
Amazon hiring embedded OS developers
hey, I know a few of those! 'I need more help on a project I’m driving at Amazon where we continue to make big changes in our datacenter network to improve customer experience and drive down costs while, at the same time, deploying more gear into production each day than all of Amazon.com used back in 2000. It’s an exciting time and we have big changes happening in networking. If you enjoy and have experience in operating systems, networking protocol stacks, or embedded systems and you would like to work on one of the biggest networks in the world, [get in touch].' -- James Hamilton
james-hamilton  aws  jobs  amazon  networking  embedded 
october 2011 by jm
Building with Legos
Netflix tech blog on how they deploy their services. Notably, they avoid the Puppet/Chef approach, citing these reasons: 'One is that it eliminates a number of dependencies in the production environment: a master control server, package repository and client scripts on the servers, network permissions to talk to all of these. Another is that it guarantees that what we test in the test environment is the EXACT same thing that is deployed in production; there is very little chance of configuration or other creep/bit rot. Finally, it means that there is no way for people to change or install things in the production environment (this may seem like a really harsh restriction, but if you can build a new AMI fast enough it doesn't really make a difference).'
devops  cloud  aws  netflix  puppet  chef  deployment 
august 2011 by jm
Amazon EC2 outage: summary and lessons learned
Rightscale CTO on last week's outage; pretty detailed, good round-up of useful commentary from around the web, too
ebs  ec2  aws  cloud  availability  slas  rightscale  amazon 
april 2011 by jm
What Larry Page really needs to do to return Google to its startup roots
massively detailed critique of Google's corporate culture -- lots of internals exposed
google  management  culture  aws  corporate-culture  gossip  from delicious
march 2011 by jm
Quora’s Technology Examined
Python, Nginx, Tornado for COMET stuff, MySQL as a data store, memcached, Thrift, haproxy, AWS, Pylons.  fantastic, very detailed post (via Nelson)
quora  python  nginx  tornado  comet  mysql  memcached  thrift  haproxy  aws  pylons  via:nelson  from delicious
february 2011 by jm
Netflix: Dev and Ops internals
extensive details on the innards of Netflix' move to AWS, from the legendary Adrian Cockcroft
adrian-cockcroft  aws  netflix  ops  cloud  from delicious
november 2010 by jm

related tags

adrian-cockcroft  adrian-cockroft  advent  alestic  amazon  anti-fraud  api  architecture  asgard  availability  aws  b2b  b2c  benchmarks  campaigns  cassandra  chef  cloud  cloud-connect  cloudsmith  cloudwatch  clustering  clusters  comet  corporate-culture  costs  culture  dedupe  demo  deployment  design  devops  disks  dns  documentation  dynamodb  dynect  ebs  ec2  ecommerce  embedded  figures  fraud  gce  github  google  gossip  haproxy  hardware  io  iostat  ironfan  james-hamilton  java  jobs  knife  linux  management  memcached  metrics  mongodb  monitoring  mysql  netflix  networking  nginx  obama  object-model  oo  ops  optimization  outages  performance  pinterest  piops  presentations  puppet  pylons  python  qcon  quora  redis  reliabilty  rightscale  route53  s3  scalability  scaling  scalr  services  sharding  sift-science  slas  slides  smugmug  snapshots  stack-hammer  stacks  startups  storage  sysadmin  thrift  tips  tools  tornado  ubuntu  ultradns  via:chorn  via:matt-sergeant  via:nelson 

Copy this bookmark:



description:


tags: