jm + ec2   110

cristim/autospotting
'Easy to use tool that automatically replaces some or even all on-demand AutoScaling group members with similar or larger identically configured spot instances in order to generate significant cost savings on AWS EC2, behaving much like an AutoScaling-backed spot fleet.'
asg  autoscaling  ec2  aws  spot-fleet  spot-instances  cost-saving  scaling 
12 weeks ago by jm
GitHub - jorgebastida/awslogs: AWS CloudWatch logs for Humans™
This feature alone is a bit of a killer app:
$ awslogs get /var/log/syslog ip-10-1.* --start='2h ago' | grep ERROR


Nice.
cli  logging  aws  cloudwatch  logs  awslogs  ec2 
august 2017 by jm
EBS gp2 I/O BurstBalance exhaustion
when EBS volumes in EC2 exhaust their "burst" allocation, things go awry very quickly
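The gp2 burst-bucket arithmetic makes the cliff easy to predict. A rough model using the published gp2 figures (5.4M I/O credits, baseline of 3 IOPS/GiB with a 100 IOPS floor, 3,000 IOPS burst ceiling); the helper itself is just an illustration:

```python
# Back-of-the-envelope model of the gp2 I/O credit bucket, using the
# published numbers: 5.4M credits, baseline 3 IOPS/GiB (min 100),
# burst ceiling 3000 IOPS. Illustrative only -- not an AWS API.

BUCKET = 5_400_000      # max I/O credits
BURST_IOPS = 3000

def baseline_iops(size_gib):
    return max(100, 3 * size_gib)

def seconds_until_exhausted(size_gib, demand_iops):
    """How long a sustained demand can run before BurstBalance hits zero."""
    base = baseline_iops(size_gib)
    drain = min(demand_iops, BURST_IOPS) - base   # net credits spent per second
    if drain <= 0:
        return float("inf")   # demand is covered by the baseline forever
    return BUCKET / drain

# A 100 GiB volume under full 3000-IOPS load exhausts its burst in ~33 min.
print(seconds_until_exhausted(100, 3000) / 60)
```

A 1 TiB volume, by contrast, has a 3,000 IOPS baseline and never drains at all, which is why the problem bites small volumes hardest.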
performance  aws  ebs  ec2  burst-balance  ops  debugging 
july 2017 by jm
awslabs/aws-ec2rescue-linux
Amazon Web Services Elastic Compute Cloud (EC2) Rescue for Linux is a python-based tool that allows for the automatic diagnosis of common problems found on EC2 Linux instances.


Most of the modules appear to be log-greppers looking for common kernel issues.
ec2  aws  kernel  linux  ec2rl  ops 
july 2017 by jm
Top 5 ways to improve your AWS EC2 performance
A couple of bits of excellent advice from Datadog (although this may be a slightly old post, from Oct 2016):

1. Unpredictable EBS disk I/O performance. Note that gp2 volumes do not appear to need as much warmup or priming as before.

2. EC2 Instance ECU Mismatch and Stolen CPU. advice: use bigger instances

The other 3 ways are a little obvious by comparison, but worth bookmarking for those two anyway.
ops  ec2  performance  datadog  aws  ebs  stolen-cpu  virtualization  metrics  tips 
july 2017 by jm
Spotting a million dollars in your AWS account · Segment Blog
You can easily split your spend by AWS service per month and call it a day. Ten thousand dollars of EC2, one thousand to S3, five hundred dollars to network traffic, etc. But what’s still missing is a synthesis of which products and engineering teams are dominating your costs. 

Then, add in the fact that you may have hundreds of instances and millions of containers that come and go. Soon, what started as a simple analysis problem has quickly become unimaginably complex. 

In this follow-up post, we’d like to share details on the toolkit we used. Our hope is to offer up a few ideas to help you analyze your AWS spend, no matter whether you’re running only a handful of instances, or tens of thousands.

segment  money  costs  billing  aws  ec2  ecs  ops 
may 2017 by jm
cristim/autospotting: Pay up to 10 times less on EC2 by automatically replacing on-demand AutoScaling group members with similar or larger identically configured spot instances.
A simple and easy to use tool designed to significantly lower your Amazon AWS costs by automating the use of the spot market.

Once enabled on an existing on-demand AutoScaling group, it launches an EC2 spot instance that is cheaper, at least as large and configured identically to your current on-demand instances. As soon as the new instance is ready, it is added to the group and an on-demand instance is detached from the group and terminated.

It continuously applies this process, gradually replacing any on-demand instances with spot instances until the group only consists of spot instances, but it can also be configured to keep some on-demand instances running.
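The selection step it describes boils down to a filter-and-min; a toy sketch with invented instance specs and spot prices:

```python
# Sketch of the core selection logic autospotting describes: among spot
# candidates at least as large as the on-demand instance, take the cheapest.
# Instance specs and prices here are made up for illustration.

def pick_spot_replacement(on_demand, candidates):
    """on_demand/candidates: dicts with 'vcpu', 'mem_gib', 'spot_price'."""
    eligible = [c for c in candidates
                if c["vcpu"] >= on_demand["vcpu"]
                and c["mem_gib"] >= on_demand["mem_gib"]]
    return min(eligible, key=lambda c: c["spot_price"], default=None)

current = {"vcpu": 2, "mem_gib": 8}
candidates = [
    {"name": "m4.large",  "vcpu": 2, "mem_gib": 8,  "spot_price": 0.031},
    {"name": "m4.xlarge", "vcpu": 4, "mem_gib": 16, "spot_price": 0.028},
    {"name": "t2.small",  "vcpu": 1, "mem_gib": 2,  "spot_price": 0.007},
]
print(pick_spot_replacement(current, candidates)["name"])  # → m4.xlarge
```

Note the larger m4.xlarge wins here: spot pricing is volatile enough that a bigger instance type can momentarily be the cheapest eligible one, which is exactly the "similar or larger" behaviour the README promises.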
aws  golang  ec2  autoscaling  asg  spot-instances  ops 
may 2017 by jm
Ubuntu on AWS gets serious performance boost with AWS-tuned kernel
interesting -- faster boots, CPU throttling resolved on t2.micros, other nice stuff
aws  ubuntu  ec2  kernel  linux  ops 
april 2017 by jm
Amazon EC2 Container Service Plugin - Jenkins - Jenkins Wiki
neat, relatively new plugin to use ECS as a autoscaling node fleet in Jenkins
ec2  ecs  aws  jenkins  docker  plugins 
february 2017 by jm
Julia Evans reverse engineers Skyliner.io
simple usage of Docker, blue/green deploys, and AWS ALBs
docker  alb  aws  ec2  blue-green-deploys  deployment  ops  tools  skyliner  via:jgilbert 
november 2016 by jm
Auto Scaling for EC2 Spot Fleets
'we are enhancing the Spot Fleet model with the addition of Auto Scaling. You can now arrange to scale your fleet up and down based on an Amazon CloudWatch metric. The metric can originate from an AWS service such as EC2, Amazon EC2 Container Service, or Amazon Simple Queue Service (SQS). Alternatively, your application can publish a custom metric and you can use it to drive the automated scaling.'
asg  auto-scaling  ec2  spot-fleets  ops  scaling 
september 2016 by jm
Running Docker on AWS from the ground up
Advantages/disadvantages section right at the bottom is good.
ECS, believe it or not, is one of the simplest Schedulers out there. Most of the other alternatives I’ve tried offer all sorts of fancy bells & whistles, but they are either significantly more complicated to understand (lots of new concepts), take too much effort to set up (lots of new technologies to install and run), are too magical (and therefore impossible to debug), or some combination of all three. That said, ECS also leaves a lot to be desired.
aws  docker  ecs  ec2  schedulers 
april 2016 by jm
Yeobot - CloudNative
'The shared SQL command line for AWS'. it's #chatopsy!
chatops  yeobot  bots  cloudnative  ec2  aws  slack 
march 2016 by jm
VPC NAT gateways : transactional uniqueness at scale
colmmacc introducing the VPC NAT gateway product from AWS, in a guest post on James Hamilton's blog no less!:
you can think of it as a new “even bigger” [NAT] box, but under the hood NAT gateways are different. The connections are managed by a fault-tolerant co-operation of devices in the VPC network fabric. Each new connection is assigned a port in a robust and transactional way, while also being replicated across an extensible set of multiple devices. In other words: the NAT gateway is internally horizontally scalable and resilient.
amazon  ec2  nat  networking  aws  colmmacc 
january 2016 by jm
2016 Wish List for AWS?
good thread of AWS' shortcomings -- so many services still don't handle VPC for instance
vpc  aws  ec2  ops  wishlist 
december 2015 by jm
Why We Chose Kubernetes Over ECS
3 months ago when we, at nanit.com, came to evaluate which Docker orchestration framework to use, we gave ECS the first priority. We were already familiar with AWS services, and since we already had our whole infrastructure there, it was the default choice. After testing the service for a while we had the feeling it was not mature enough and missing some key features we needed (more on that later), so we went to test another orchestration framework: Kubernetes. We were glad to discover that Kubernetes is far more comprehensive and had almost all the features we required. For us, Kubernetes won ECS on ECS’s home court, which is AWS.
kubernetes  ecs  docker  containers  aws  ec2  ops 
december 2015 by jm
Amazon ECS CLI Tutorial - Amazon EC2 Container Service
super-basic ECS tutorial, using a docker-compose.yml to create a new ECS-managed service fleet
ecs  cli  linux  aws  ec2  hosting  docker  tutorials 
october 2015 by jm
Hologram
Hologram exposes an imitation of the EC2 instance metadata service on developer workstations that supports the [IAM Roles] temporary credentials workflow. It is accessible via the same HTTP endpoint to calling SDKs, so your code can use the same process in both development and production. The keys that Hologram provisions are temporary, so EC2 access can be centrally controlled without direct administrative access to developer workstations.
iam  roles  ec2  authorization  aws  adroll  open-source  cli  osx  coding  dev 
october 2015 by jm
AWS re:Invent 2015 Video & Slide Presentation Links with Easy Index
Andrew Spyker's roundup:
my quick index of all re:Invent sessions.  Please wait for a few days and I'll keep running the tool to fill in the index.  It usually takes Amazon a few weeks to fully upload all the videos and slideshares.


Pretty definitive, full text descriptions of all sessions (and there are an awful lot of 'em).
aws  reinvent  andrew-spyker  scraping  slides  presentations  ec2  video 
october 2015 by jm
Rebuilding Our Infrastructure with Docker, ECS, and Terraform
Good writeup of current best practices for a production AWS architecture
aws  ops  docker  ecs  ec2  prod  terraform  segment  via:marc 
october 2015 by jm
EC2 Spot Blocks for Defined-Duration Workloads
you can now launch Spot instances that will run continuously for a finite duration (1 to 6 hours). Pricing is based on the requested duration and the available capacity, and is typically 30% to 45% less than On-Demand.
ec2  aws  spot-instances  spot  pricing  time 
october 2015 by jm
Summary of the Amazon DynamoDB Service Disruption and Related Impacts in the US-East Region
Painful to read, but: tl;dr: monitoring oversight, followed by a transient network glitch triggering IPC timeouts, which increased load due to lack of circuit breakers, creating a cascading failure
aws  postmortem  outages  dynamodb  ec2  post-mortems  circuit-breakers  monitoring 
september 2015 by jm
How We Use AWS Lambda for Rapidly Intensifying Workloads · CloudSploit
impressive -- pretty much the entire workload is run from Lambda here
lambda  aws  ec2  autoscaling  cloudsploit 
september 2015 by jm
Spot Bid Advisor
analyzes Spot price history to help you determine a bid price that suits your needs.
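One plausible version of that analysis: bid above a high percentile of recent prices, so that only rare spikes outbid you. The price history below is invented:

```python
# Turn Spot price history into a bid by taking a high percentile of recent
# prices -- conceptually what the advisor does. Prices are illustrative.
import math

def percentile(values, p):
    """Nearest-rank percentile (p in 0..100) of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

history = [0.013, 0.012, 0.014, 0.013, 0.045,
           0.013, 0.012, 0.014, 0.013, 0.012]
bid = percentile(history, 90)   # survive all but the top decile of spikes
print(bid)  # → 0.014
```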
ec2  aws  spot  spot-instances  history 
september 2015 by jm
Amazon EC2 2015 Benchmark: Testing Speeds Between AWS EC2 and S3 Regions
Here we are again, a year later, and still no bloody percentiles! Just amateurish averaging. This is not how you measure anything, ffs. Still, better than nothing I suppose
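The complaint is concrete: a mean hides exactly the tail that a percentile exposes. With made-up transfer latencies in milliseconds:

```python
# Nine fast transfers and one slow one: the average looks mediocre-but-fine,
# while the percentiles show most transfers were fast and one was awful.
samples = [100] * 9 + [2000]
ordered = sorted(samples)

print(sum(samples) / len(samples))   # mean → 290.0
print(ordered[len(ordered) // 2])    # p50  → 100
print(ordered[-1])                   # max  → 2000
```

Reporting only the 290 ms "average" misrepresents both the typical case and the worst case, which is the post's methodological sin.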
fail  latency  measurement  aws  ec2  percentiles  s3 
august 2015 by jm
Revised and much faster, run your own high-end cloud gaming service on EC2!
a g2.2xlarge provides decent Windows GPU performance over the internet, at about $0.53 per hour
gaming  games  ec2  amazon  aws  cloud  windows  hacks 
july 2015 by jm
VPC Flow Logs
we are introducing Flow Logs for the Amazon Virtual Private Cloud.  Once enabled for a particular VPC, VPC subnet, or Elastic Network Interface (ENI), relevant network traffic will be logged to CloudWatch Logs for storage and analysis by your own applications or third-party tools.

You can create alarms that will fire if certain types of traffic are detected; you can also create metrics to help you to identify trends and patterns. The information captured includes information about allowed and denied traffic (based on security group and network ACL rules). It also includes source and destination IP addresses, ports, the IANA protocol number, packet and byte counts, a time interval during which the flow was observed, and an action (ACCEPT or REJECT).
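A flow log record is a single space-separated line, so parsing one is trivial. The field order below is the documented default format; the sample record is illustrative:

```python
# Parse a VPC Flow Log record (default 14-field format) into a dict.

FIELDS = ("version account_id interface_id srcaddr dstaddr srcport dstport "
          "protocol packets bytes start end action log_status").split()

def parse_flow_record(line):
    rec = dict(zip(FIELDS, line.split()))
    for k in ("srcport", "dstport", "protocol", "packets", "bytes",
              "start", "end"):
        rec[k] = int(rec[k])
    return rec

rec = parse_flow_record(
    "2 123456789010 eni-abc123de 172.31.16.139 172.31.16.21 "
    "20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK")
print(rec["dstport"], rec["action"])  # → 22 ACCEPT
```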
ec2  aws  vpc  logging  tracing  ops  flow-logs  network  tcpdump  packets  packet-capture 
june 2015 by jm
Leveraging AWS to Build a Scalable Data Pipeline
Nice detailed description of an auto-scaled SQS worker pool
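The scaling rule for such a pool is usually simple arithmetic on queue depth; a sketch with invented throughput numbers:

```python
# Size an SQS worker pool from queue backlog and measured per-worker
# throughput. The figures are hypothetical; the clamp mirrors an ASG's
# min/max size settings.
import math

def desired_workers(backlog, msgs_per_worker_per_min, drain_minutes,
                    min_workers=1, max_workers=50):
    """Workers needed to drain `backlog` messages within `drain_minutes`."""
    need = math.ceil(backlog / (msgs_per_worker_per_min * drain_minutes))
    return max(min_workers, min(max_workers, need))

# 9000 queued messages, 60 msgs/worker/min, drain within 10 minutes:
print(desired_workers(9000, 60, 10))  # → 15
```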
sqs  aws  ec2  auto-scaling  asg  worker-pools  architecture  scalability 
june 2015 by jm
awslabs/aws-lambda-redshift-loader
Load data into Redshift from S3 buckets using a pre-canned Lambda function. Looks like it may be a good example of production-quality Lambda
lambda  aws  ec2  redshift  s3  loaders  etl  pipeline 
may 2015 by jm
Cluster-Based Architectures Using Docker and Amazon EC2 Container Service
In this post, we’re going to take a deeper dive into the architectural concepts underlying cluster computing using container management frameworks such as ECS. We will show how these frameworks effectively abstract the low-level resources such as CPU, memory, and storage, allowing for highly efficient usage of the nodes in a compute cluster. Building on some of the concepts detailed in the earlier posts, we will discover why containers are such a good fit for this type of abstraction, and how the Amazon EC2 Container Service fits into the larger ecosystem of cluster management frameworks.
docker  aws  ecs  ec2  ops  hosting  containers  mesos  clusters 
april 2015 by jm
Amazon EC2 Container Service team AmA
a few answers here. Mostly people pointing out shortcomings and the team asking them to start a thread on their forum though :(
ec2  ecs  docker  aws  ops  ama  reddit 
april 2015 by jm
Run your own high-end cloud gaming service on EC2
Using Steam streaming and EC2 g2.2xlarge spot instances -- 'comes out to around $0.52/hr'. That's pretty compelling IMO
aws  ec2  gaming  games  graphics  spot-instances  hacks  windows  steam 
april 2015 by jm
Microservices and elastic resource pools with Amazon EC2 Container Service
interesting approach to working around ECS' shortcomings -- bit specific to Hailo's microservices arch and IPC mechanism though.

aside: I like their version numbering scheme: ISO-8601, YYYYMMDDHHMMSS. keep it simple!
versioning  microservices  hailo  aws  ec2  ecs  docker  containers  scheduling  allocation  deployment  provisioning  qos 
april 2015 by jm
AWS Lambda Event-Driven Architecture With Amazon SNS
Any message posted to an SNS topic can trigger the execution of custom code you have written, but you don’t have to maintain any infrastructure to keep that code available to listen for those events and you don’t have to pay for any infrastructure when the code is not being run. This is, in my opinion, the first time that Amazon can truly say that AWS Lambda is event-driven, as we now have a central, independent, event management system (SNS) where any authorized entity can trigger the event (post a message to a topic) and any authorized AWS Lambda function can listen for the event, and neither has to know about the other.
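The handler side of this wiring is small. The event shape (`Records[].Sns.Message`) is the documented SNS-to-Lambda format; the message body and its `order_id` field are invented for illustration:

```python
# Minimal sketch of a Lambda handler for SNS-triggered events.
import json

def handler(event, context):
    """Process each SNS message delivered in the triggering event."""
    results = []
    for record in event["Records"]:
        msg = json.loads(record["Sns"]["Message"])  # message body is a string
        results.append(msg["order_id"])             # hypothetical field
    return results

# Exercise it locally with a hand-built event of the same shape:
fake_event = {"Records": [{"Sns": {"Message": json.dumps({"order_id": 42})}}]}
print(handler(fake_event, None))  # → [42]
```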
aws  ec2  lambda  sns  events  cep  event-processing  coding  cloud  hacks  eric-hammond 
april 2015 by jm
(SEC307) Building a DDoS-Resilient Architecture with AWS
good slides on a "web application firewall" proxy service, deployable as an auto-scaling EC2 unit
ec2  aws  ddos  security  resilience  slides  reinvent  firewalls  http  elb 
april 2015 by jm
Can Spark Streaming survive Chaos Monkey?
good empirical results on Spark's resilience to network/host outages in EC2
ec2  aws  emr  spark  resilience  ha  fault-tolerance  chaos-monkey  netflix 
march 2015 by jm
500 Mbps upload to S3
the following guidelines maximize bandwidth usage:
Optimizing the sizes of the file parts, whether they are part of a large file or an entire small file; Optimizing the number of parts transferred concurrently.
Tuning these two parameters achieves the best possible transfer speeds to [S3].
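Those two parameters are constrained by S3's multipart limits (5 MiB minimum part size, at most 10,000 parts per upload). A small helper, purely illustrative, for picking a legal part size:

```python
# Choose a part size that respects S3's multipart limits:
# parts must be >= 5 MiB (except the last) and there can be at most 10,000.
import math

MIN_PART = 5 * 1024**2        # 5 MiB
MAX_PARTS = 10_000

def choose_part_size(object_bytes, target_part=8 * 1024**2):
    """Smallest legal part size >= target that keeps the count <= 10,000."""
    by_count = math.ceil(object_bytes / MAX_PARTS)
    return max(MIN_PART, target_part, by_count)

# A 1 TiB object cannot use 8 MiB parts (that would need >10k of them):
print(choose_part_size(1024**4) // 1024**2)  # → 104 (MiB, roughly)
```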
s3  uploads  dataman  aws  ec2  performance 
march 2015 by jm
What Color Is Your Xen?
What a mess.
What's faster: PV, HVM, HVM with PV drivers, PVHVM, or PVH? Cloud computing providers using Xen can offer different virtualization "modes", based on paravirtualization (PV), hardware virtual machine (HVM), or a hybrid of them. As a customer, you may be required to choose one of these. So, which one?
ec2  linux  performance  aws  ops  pv  hvm  xen  virtualization 
february 2015 by jm
Azul Zing on Ubuntu on AWS Marketplace
hmmm, very interesting -- the super-low-latency Zing JVM is available as a commercial EC2 instance type, priced at less than the EC2 instance itself
zing  azul  latency  performance  ec2  aws 
february 2015 by jm
AWS Tips I Wish I'd Known Before I Started
Some good advice and guidelines (although some are just silly).
aws  ops  tips  advice  ec2  s3 
january 2015 by jm
EC2 Container Service Hands On
Sounds like a good start, but this isn't great:
There is no native integration with Autoscaling or ELBs.
ec2  containers  docker  ecs  ops 
december 2014 by jm
AWS re:Invent 2014 Video & Slide Presentation Links
Nice work by Andrew Spyker -- this should be an official feature of the re:Invent website, really
reinvent  aws  conferences  talks  slides  ec2  s3  ops  presentations 
november 2014 by jm
How I created two images with the same MD5 hash
I found that I was able to run the algorithm in about 10 hours on an AWS large GPU instance bringing it in at about $0.65 plus tax.


Bottom line: MD5 is feasibly attackable by pretty much anyone now.
crypto  images  md5  security  hashing  collisions  ec2  via:hn 
november 2014 by jm
Zookeeper: not so great as a highly-available service registry
Turns out ZK isn't a good choice as a service discovery system, if you want to be able to use that service discovery system while partitioned from the rest of the ZK cluster:
I went into one of the instances and quickly did an iptables DROP on all packets coming from the other two instances.  This would simulate an availability zone continuing to function, but that zone losing network connectivity to the other availability zones.  What I saw was that the two other instances noticed the first server “going away”, but they continued to function as they still saw a majority (66%).  More interestingly the first instance noticed the other two servers “going away”, dropping the ensemble availability to 33%.  This caused the first server to stop serving requests to clients (not only writes, but also reads).


So: within that offline AZ, service discovery *reads* (as well as writes) stopped working due to a lack of ZK quorum. This is quite a feasible outage scenario for EC2, by the way, since (at least when I was working there) the network links between AZs, and the links with the external internet, were not 100% overlapping.

In other words, if you want a highly-available service discovery system in the face of network partitions, you want an AP service discovery system, rather than a CP one -- and ZK is a CP system.

Another risk, noted on the Netflix Eureka mailing list at https://groups.google.com/d/msg/eureka_netflix/LXKWoD14RFY/tA9UnerrBHUJ :

ZooKeeper, while tolerant against single node failures, doesn't react well to long partitioning events. For us, it's vastly more important that we maintain an available registry than a necessarily consistent registry. If us-east-1d sees 23 nodes, and us-east-1c sees 22 nodes for a little bit, that's OK with us.


I guess this means that a long partition can trigger SESSION_EXPIRED state, resulting in ZK client libraries requiring a restart/reconnect to fix. I'm not entirely clear what happens to the ZK cluster itself in this scenario though.

Finally, Pinterest ran into other issues relying on ZK for service discovery and registration, described at http://engineering.pinterest.com/post/77933733851/zookeeper-resilience-at-pinterest ; sounds like this was mainly around load and the "thundering herd" overload problem. Their workaround was to decouple ZK availability from their services' availability, by building a Smartstack-style sidecar daemon on each host which tracked/cached ZK data.
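Stripped of detail, the outage behaviour above is just majority arithmetic:

```python
# A CP quorum system (like ZK) serves requests only while a node can see a
# strict majority of the ensemble -- the isolated minority goes dark.

def serves_requests(visible_nodes, ensemble_size):
    return visible_nodes > ensemble_size // 2

# 3-node ensemble, one AZ partitioned away from the other two:
print(serves_requests(2, 3))  # majority side keeps working → True
print(serves_requests(1, 3))  # isolated side stops serving → False
```

Note that adding nodes doesn't help a cleanly partitioned AZ: with one zone of a balanced 3-AZ deployment cut off, the stranded third is always a minority.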
zookeeper  service-discovery  ops  ha  cap  ap  cp  service-registry  availability  ec2  aws  network  partitions  eureka  smartstack  pinterest 
november 2014 by jm
Load testing Apache Kafka on AWS
This is a very solid benchmarking post, examining Kafka in good detail. Nicely done. Bottom line:
I basically spend 2/3 of my work time torture testing and operationalizing distributed systems in production. There's some that I'm not so pleased with (posts pending in draft forever) and some that have attributes that I really love. Kafka is one of those systems that I pretty much enjoy every bit of, and the fact that it performs predictably well is only a symptom of the reason and not the reason itself: the authors really know what they're doing. Nothing about this software is an accident. Performance, everything in this post, is only a fraction of what's important to me and what matters when you run these systems for real. Kafka represents everything I think good distributed systems are about: that thorough and explicit design decisions win.
testing  aws  kafka  ec2  load-testing  benchmarks  performance 
october 2014 by jm
Zonify
'a set of command line tools for managing Route53 DNS for an AWS infrastructure. It intelligently uses tags and other metadata to automatically create the associated DNS records.'
zonify  aws  dns  ec2  route53  ops 
october 2014 by jm
Avoiding Chef-Suck with Auto Scaling Groups - forty9ten
Some common problems which arise using Chef with ASGs in EC2, and how these guys avoided them -- they stopped using Chef for service provisioning, and instead baked AMIs when a new version was released. ASGs using pre-baked AMIs definitely work well, so this makes good sense IMO.
infrastructure  chef  ops  asg  auto-scaling  ec2  provisioning  deployment 
september 2014 by jm
Using spot instances
Excellent post on all of the ins and outs of EC2 spot instance usage
ec2  aws  spot-instances  pricing  cloud  auto-scaling  ops 
september 2014 by jm
On-Demand Jenkins Slaves With Amazon EC2
This is very likely where we'll be going for our acceptance tests in Swrve
testing  jenkins  ec2  spot-instances  scalability  auto-scaling  ops  build 
august 2014 by jm
AWS Speed Test: What are the Fastest EC2 and S3 Regions?
My god, this test is awful -- this is how NOT to test networked infrastructure. (1) testing from a single EC2 instance in each region; (2) uploading to a single test bucket for each test; (3) results don't include min/max or percentiles, just an averaged measurement for each test. FAIL
fail  testing  networking  performance  ec2  aws  s3  internet 
august 2014 by jm
The Network is Reliable - ACM Queue
Peter Bailis and Kyle Kingsbury accumulate a comprehensive, informal survey of real-world network failures observed in production. I remember that April 2011 EBS outage...
ec2  aws  networking  outages  partitions  jepsen  pbailis  aphyr  acm-queue  acm  survey  ops 
july 2014 by jm
Netflix/ribbon
a client side IPC library that is battle-tested in cloud. It provides the following features:

Load balancing;
Fault tolerance;
Multiple protocol (HTTP, TCP, UDP) support in an asynchronous and reactive model;
Caching and batching.

I like the integration of Eureka and Hystrix in particular, although I would really like to read more about Eureka's approach to availability during network partitions and CAP.

https://groups.google.com/d/msg/eureka_netflix/LXKWoD14RFY/-5nElGl1OQ0J has some interesting discussion on the topic. It actually sounds like the Eureka approach is more correct than using ZK: 'Eureka is available. ZooKeeper, while tolerant against single node failures, doesn't react well to long partitioning events. For us, it's vastly more important that we maintain an available registry than a necessarily consistent registry. If us-east-1d sees 23 nodes, and us-east-1c sees 22 nodes for a little bit, that's OK with us.'

See also http://ispyker.blogspot.ie/2013/12/zookeeper-as-cloud-native-service.html which corroborates this:

I went into one of the instances and quickly did an iptables DROP on all packets coming from the other two instances. This would simulate an availability zone continuing to function, but that zone losing network connectivity to the other availability zones. What I saw was that the two other instances noticed the first server “going away”, but they continued to function as they still saw a majority (66%). More interestingly the first instance noticed the other two servers “going away”, dropping the ensemble availability to 33%. This caused the first server to stop serving requests to clients (not only writes, but also reads). [...]

To me this seems like a concern, as network partitions should be considered an event that should be survived. In this case (with this specific configuration of zookeeper) no new clients in that availability zone would be able to register themselves with consumers within the same availability zone. Adding more zookeeper instances to the ensemble wouldn’t help considering a balanced deployment as in this case the availability would always be majority (66%) and non-majority (33%).
netflix  ribbon  availability  libraries  java  hystrix  eureka  aws  ec2  load-balancing  networking  http  tcp  architecture  clients  ipc 
july 2014 by jm
New AWS Web Services region: eu-central-1 (soon)
Iiiinteresting. Sounds like new anti-NSA-snooping privacy laws will be driving a lot of new mini-regions in AWS. Hope Amazon have their new-region-standup process a little more streamlined by now than when I was there ;)
aws  germany  privacy  ec2  eu-central-1  nsa  snooping 
july 2014 by jm
New Low Cost EC2 Instances with Burstable Performance
Oh, very neat. New micro, small, and medium-class instances with burstable CPU scaling:
The T2 instances are built around a processing allocation model that provides you a generous, assured baseline amount of processing power coupled with the ability to automatically and transparently scale up to a full core when you need more compute power. Your ability to burst is based on the concept of "CPU Credits" that you accumulate during quiet periods and spend when things get busy. You can provision an instance of modest size and cost and still have more than adequate compute power in reserve to handle peak demands for compute power.
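The credit accounting can be sketched in a few lines, using t2.micro's published figures (6 credits earned per hour, 144 credit cap, 10% baseline); the simulation itself is only an illustration:

```python
# Toy model of T2 CPU credits: one credit = one vCPU-minute at 100%, so
# running 1% above baseline for an hour spends 0.6 credits.

def simulate_t2(earn_per_hour, max_credits, load_pct_by_hour, baseline_pct=10):
    """Credit balance at the end of each hour of the given CPU load."""
    balance, history = float(max_credits), []
    for load in load_pct_by_hour:
        burst = max(0, load - baseline_pct)   # CPU % above baseline
        spend = burst * 3 / 5                 # 1% for an hour = 0.6 credits
        balance = min(max_credits, max(0.0, balance + earn_per_hour - spend))
        history.append(balance)
    return history

# t2.micro figures: quiet hour, two hours of full burst, quiet hour.
print(simulate_t2(6, 144, [5, 100, 100, 5]))  # → [144.0, 96.0, 48.0, 54.0]
```

At full tilt a t2.micro burns its full bucket in under three hours, then gets throttled back to the 10% baseline, which is the behaviour the "CPU Credits" model trades away for the low price.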
ec2  aws  hosting  cpu  scaling  burst  load  instances 
july 2014 by jm
Building a Smarter Application Stack - DevOps Ireland
This sounds like a very interesting Dublin meetup -- Engine Yard on thursday night:
This month, we'll have Tomas Doran from Yelp talking about Docker, service discovery, and deployments. 'There are many advantages to a container based, microservices architecture - however, as always, there is no silver bullet. Any serious deployment will involve multiple host machines, and will have a pressing need to migrate containers between hosts at some point. In such a dynamic world hard coding IP addresses, or even host names is not a viable solution. This talk will take a journey through how Yelp has solved the discovery problems using Airbnb’s SmartStack to dynamically discover service dependencies, and how this is helping unify our architecture, from traditional metal to EC2 ‘immutable’ SOA images, to Docker containers.'
meetups  talks  dublin  deployment  smartstack  ec2  docker  yelp  service-discovery 
june 2014 by jm
Amazon EC2 Service Limits Report Now Available
'designed to make it easier for you to view and manage your limits for Amazon EC2 by providing the latest information on service limits and links to quickly request limit increases. EC2 Service Limits Report displays all your service limit information in one place to help you avoid encountering limits on future EC2, EBS, Auto Scaling, and VPC usage.'
aws  ec2  vpc  ebs  autoscaling  limits  ops 
june 2014 by jm
Code Spaces data and backups deleted by hackers
Rather scary story of an extortionist wiping out a company's AWS-based infrastructure. Turns out S3 supports MFA-required deletion as a feature, though, which would help against that.
ops  security  extortion  aws  ec2  s3  code-spaces  delete  mfa  two-factor-authentication  authentication  infrastructure 
june 2014 by jm
Use of Formal Methods at Amazon Web Services
Chris Newcombe, Marc Brooker, et al. writing about their experience using formal specification and model-checking languages (TLA+) in production in AWS:

The success with DynamoDB gave us enough evidence to present TLA+ to the broader engineering community at Amazon. This raised a challenge; how to convey the purpose and benefits of formal methods to an audience of software engineers? Engineers think in terms of debugging rather than ‘verification’, so we called the presentation “Debugging Designs”.

Continuing that metaphor, we have found that software engineers more readily grasp the concept and practical value of TLA+ if we dub it 'Exhaustively-testable pseudo-code'.

We initially avoid the words ‘formal’, ‘verification’, and ‘proof’, due to the widespread view that formal methods are impractical. We also initially avoid mentioning what the acronym ‘TLA’ stands for, as doing so would give an incorrect impression of complexity.


More slides at http://tla2012.loria.fr/contributed/newcombe-slides.pdf ; proggit discussion at http://www.reddit.com/r/programming/comments/277fbh/use_of_formal_methods_at_amazon_web_services/
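The 'exhaustively-testable pseudo-code' framing has a tiny analogue in ordinary code: breadth-first exploration of every reachable state of a model, checking an invariant in each. The toy model below (two clients doing a non-atomic read-then-increment) is mine, not Amazon's; exhaustive exploration finds the classic lost-update interleaving that spot-testing easily misses:

```python
# Minimal explicit-state model checker: BFS over all reachable states.
from collections import deque

def explore(initial, next_states, invariant):
    """Return a violating state, or None if the invariant holds everywhere."""
    seen, frontier = {initial}, deque([initial])
    while frontier:
        state = frontier.popleft()
        if not invariant(state):
            return state
        for nxt in next_states(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return None

# State: (counter, pc1, local1, pc2, local2); each client reads the shared
# counter into a local, then writes local+1 back -- non-atomically.
def next_states(s):
    counter, pc1, l1, pc2, l2 = s
    out = []
    if pc1 == "read":  out.append((counter, "write", counter, pc2, l2))
    if pc1 == "write": out.append((l1 + 1, "done", l1, pc2, l2))
    if pc2 == "read":  out.append((counter, pc1, l1, "write", counter))
    if pc2 == "write": out.append((l2 + 1, pc1, l1, "done", l2))
    return out

def invariant(s):
    counter, pc1, _, pc2, _ = s
    # "no lost update": when both clients finish, the counter must be 2
    return not (pc1 == pc2 == "done") or counter == 2

bad = explore((0, "read", 0, "read", 0), next_states, invariant)
print(bad is not None)  # → True: an interleaving loses an update
```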
formal-methods  model-checking  tla  tla+  programming  distsys  distcomp  ebs  s3  dynamodb  aws  ec2  marc-brooker  chris-newcombe 
june 2014 by jm
AWS SDK for Java Client Configuration
turns out the AWS SDK has lots of tuning knobs: region selection, socket buffer sizes, and debug logging (including wire logging).
aws  sdk  java  logging  ec2  s3  dynamodb  sockets  tuning 
june 2014 by jm
moto
Mock Boto: 'a library that allows your python tests to easily mock out the boto library.' Supports S3, Autoscaling, EC2, DynamoDB, ELB, Route53, SES, SQS, and STS currently, and even supports a standalone server mode, to act as a mock service for non-Python clients. Excellent!

(via Conor McDermottroe)
python  aws  testing  mocks  mocking  system-tests  unit-tests  coding  ec2  s3 
may 2014 by jm
Docker Plugin for Jenkins
The aim of the docker plugin is to be able to use a docker host to dynamically provision a slave, run a single build, then tear-down that slave. Optionally, the container can be committed, so that (for example) manual QA could be performed by the container being imported into a local docker provider, and run from there.


The holy grail of Jenkins/Docker integration. How cool is that...
jenkins  docker  ops  testing  ec2  hosting  scaling  elastic-scaling  system-testing 
may 2014 by jm
AWS Case Study: Hailo
Ubuntu, C*, HAProxy, MySQL, RDS, multiple AWS regions.
hailo  cassandra  ubuntu  mysql  rds  aws  ec2  haproxy  architecture 
april 2014 by jm
AWS Elastic Beanstalk for Docker
This is pretty amazing. nice work, Beanstalk team. not sure how well it integrates with the rest of AWS though
aws  amazon  docker  ec2  beanstalk  ops  containers  linux 
april 2014 by jm
Using AWS in the context of Australian Privacy Considerations
interesting new white paper from Amazon regarding recent strengthening of the Aussie privacy laws, particularly w.r.t. geographic location of data and access by overseas law enforcement agencies...
amazon  aws  security  law  privacy  data-protection  ec2  s3  nsa  gchq  five-eyes 
april 2014 by jm
Adrian Cockroft's Cloud Outage Reports Collection
The detailed summaries of outages from cloud vendors are comprehensive and the response to each highlights many lessons in how to build robust distributed systems. For outages that significantly affected Netflix, the Netflix techblog report gives insight into how to effectively build reliable services on top of AWS. [....] I plan to collect reports here over time, and welcome links to other write-ups of outages and how to survive them.
outages  post-mortems  documentation  ops  aws  ec2  amazon  google  dropbox  microsoft  azure  incident-response 
march 2014 by jm