jm + cloud   26

AWS Greengrass
AWS Greengrass is software that lets you run local compute, messaging & data caching for connected devices in a secure way. With AWS Greengrass, connected devices can run AWS Lambda functions, keep device data in sync, and communicate with other devices securely – even when not connected to the Internet. Using AWS Lambda, Greengrass ensures your IoT devices can respond quickly to local events, operate with intermittent connections, and minimize the cost of transmitting IoT data to the cloud.

AWS Greengrass seamlessly extends AWS to devices so they can act locally on the data they generate, while still using the cloud for management, analytics, and durable storage. With Greengrass, you can use familiar languages and programming models to create and test your device software in the cloud, and then deploy it to your devices. AWS Greengrass can be programmed to filter device data and only transmit necessary information back to the cloud. AWS Greengrass authenticates and encrypts device data at all points of connection using AWS IoT’s security and access management capabilities. This way data is never exchanged between devices when they communicate with each other and the cloud without proven identity.
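A rough sketch of the "filter device data and only transmit necessary information" idea, in plain Python rather than the Greengrass SDK. The function name, the threshold and the sample readings are all made up for illustration; a real deployment would presumably run something like this inside a local Lambda function.

```python
def filter_readings(readings, threshold):
    """Keep only readings that moved at least `threshold` away from the
    last value that was kept (a simple deadband filter).

    Hypothetical helper: not part of any AWS API.
    """
    kept = []
    last_sent = None
    for value in readings:
        # Always send the first reading; after that, only send real changes.
        if last_sent is None or abs(value - last_sent) >= threshold:
            kept.append(value)
            last_sent = value
    return kept


# Six local temperature samples, only three of which would go to the cloud.
samples = [20.0, 20.1, 20.2, 21.5, 21.6, 19.0]
to_upload = filter_readings(samples, threshold=1.0)
```

The device still sees every sample locally; only the ones that represent a meaningful change would be queued for upload.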
aws  cloud  iot  lambda  devices  offline  synchronization  architecture 
april 2017 by jm
GitLab.com Database Incident - 2017/01/31
Horrible, horrible postmortem doc. This is the kicker:
So in other words, out of 5 backup/replication techniques deployed none are working reliably or set up in the first place.


Reddit comments: https://www.reddit.com/r/linux/comments/5rd9em/gitlab_is_down_notes_on_the_incident_and_why_you/
devops  backups  cloud  outage  incidents  postmortem  gitlab 
february 2017 by jm
Cloudy Gamer: Playing Overwatch on Azure's new monster GPU instances
pretty amazing. full 60FPS, 2560x1600, everything on Epic quality, streaming from Azure, for $2 per hour
gaming  azure  games  cloud  gpu  overwatch  streaming 
october 2016 by jm
Google Intrusion Detection Problems
'We have lost access to multiple critical data stores because Google has an automated threat detection system that is incapable of handling false positives.'
google  security  cloud  false-positives  intrusion-detection  automation  fail 
august 2016 by jm
Google Cloud Status
Ouch, multi-region outage:
At 14:50 Pacific Time on April 11th, our engineers removed an unused GCE IP block from our network configuration, and instructed Google’s automated systems to propagate the new configuration across our network. By itself, this sort of change was harmless and had been performed previously without incident. However, on this occasion our network configuration management software detected an inconsistency in the newly supplied configuration. The inconsistency was triggered by a timing quirk in the IP block removal - the IP block had been removed from one configuration file, but this change had not yet propagated to a second configuration file also used in network configuration management. In attempting to resolve this inconsistency the network management software is designed to ‘fail safe’ and revert to its current configuration rather than proceeding with the new configuration. However, in this instance a previously-unseen software bug was triggered, and instead of retaining the previous known good configuration, the management software instead removed all GCE IP blocks from the new configuration and began to push this new, incomplete configuration to the network.

One of our core principles at Google is ‘defense in depth’, and Google’s networking systems have a number of safeguards to prevent them from propagating incorrect or invalid configurations in the event of an upstream failure or bug. These safeguards include a canary step where the configuration is deployed at a single site and that site is verified to still be working correctly, and a progressive rollout which makes changes to only a fraction of sites at a time, so that a novel failure can be caught at an early stage before it becomes widespread. In this event, the canary step correctly identified that the new configuration was unsafe. Crucially however, a second software bug in the management software did not propagate the canary step’s conclusion back to the push process, and thus the push system concluded that the new configuration was valid and began its progressive rollout.
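A minimal sketch of the canary-then-progressive-rollout safeguard described above, assuming hypothetical `deploy` and `verify` callables (none of these names come from Google's tooling). The point of interest is the second bug in the postmortem: here the canary's verdict reaches the push loop as an exception, so it cannot be silently dropped.

```python
class RolloutAborted(Exception):
    """Raised when a site fails verification; nothing further is pushed."""


def progressive_rollout(config, sites, deploy, verify, batch_size=2):
    """Push `config` to one canary site, then to the rest in small batches.

    `deploy(site, config)` and `verify(site)` are caller-supplied,
    hypothetical hooks. Returns the list of sites that were updated.
    """
    updated = []
    # The first batch is the single canary site; the rest go out in groups.
    batches = [sites[:1]] + [
        sites[i:i + batch_size] for i in range(1, len(sites), batch_size)
    ]
    for batch in batches:
        for site in batch:
            deploy(site, config)
            updated.append(site)
        failed = [site for site in batch if not verify(site)]
        if failed:
            # Stop here: later batches never receive the bad config.
            raise RolloutAborted(
                f"verification failed at {failed}; "
                f"stopped after {len(updated)} site(s)"
            )
    return updated
```

With a config that fails verification, only the canary site is ever touched, which is what the real system was meant to guarantee.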
multi-region  outages  google  ops  postmortems  gce  cloud  ip  networking  cascading-failures  bugs 
april 2016 by jm
s3git
git for Cloud Storage. Create distributed, decentralized and versioned repositories that scale infinitely to 100s of millions of files and PBs of storage. Huge repos can be cloned on your local SSD for making changes, committing and pushing back. Oh yeah, and it dedupes too due to BLAKE2 Tree hashing. http://s3git.org
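To illustrate why hashing gives you deduplication for free, here is a toy content-addressed store using Python's standard `hashlib.blake2b`. It is an assumption-laden sketch: s3git uses BLAKE2 tree hashing, whereas this hashes fixed-size chunks independently, and `ChunkStore` and `CHUNK_SIZE` are invented names.

```python
import hashlib

CHUNK_SIZE = 4  # absurdly small, purely so the example is easy to follow


class ChunkStore:
    """Toy content-addressed store: each distinct chunk is kept once."""

    def __init__(self):
        self.chunks = {}  # hex digest -> chunk bytes

    def put(self, data):
        """Store `data` and return the list of chunk digests describing it."""
        keys = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            key = hashlib.blake2b(chunk).hexdigest()
            # Identical chunks hash to the same key, so they are stored once.
            self.chunks.setdefault(key, chunk)
            keys.append(key)
        return keys

    def get(self, keys):
        """Reassemble the original bytes from a list of chunk digests."""
        return b"".join(self.chunks[key] for key in keys)


store = ChunkStore()
keys = store.put(b"aaaabbbbaaaa")  # three chunks, two of them identical
```

After the `put`, `keys` has three entries but `store.chunks` holds only two, and `store.get(keys)` returns the original bytes.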
git  ops  storage  cloud  s3  disk  aws  version-control  blake2 
april 2016 by jm
Google Cloud Platform HTTP/HTTPS Load Balancing
GCE's LB product is pretty nice -- HTTP/2 support, and a built-in URL mapping feature (presumably modelled on how Google approach that problem internally, as I understand it). I'm hoping AWS are taking notes for the next generation of ELB, if that ever happens.
elb  gce  google  load-balancing  http  https  spdy  http2  urls  request-routing  ops  architecture  cloud 
october 2015 by jm
Google Cloud Shell
your command line environment in the [Google] Cloud. This feature enables you to connect to a shell environment on a virtual machine, pre-loaded with the tools you need to easily run commands to develop, deploy and manage your projects. Currently, Cloud Shell is an f1-micro Google Compute Engine machine that exposes a Debian-based development environment. You are also assigned 5 GB of standard persistent disk space as the home disk so you can store files between sessions.


It's also free. This is a great idea -- handy both for beginners getting to grips with GoogCloud and for experts looking for a quick dev env to hack with. I wish AWS had something similar.
google  cloud  shell  google-cloud  gcs  gce  cli  tools 
october 2015 by jm
Revised and much faster, run your own high-end cloud gaming service on EC2!
a g2.2xlarge provides decent Windows GPU performance over the internet, at about $0.53 per hour
gaming  games  ec2  amazon  aws  cloud  windows  hacks 
july 2015 by jm
Bigcommerce Status Page blasts IBM Softlayer Object Storage service
This is pretty heavy stuff:
Bigcommerce engineers have been very pro-active in working with our storage provider, IBM Softlayer, in finding solutions. Unfortunately, it takes two parties to come to a solution. In this case, IBM Softlayer intentionally let their Object Storage cluster fall into disrepair and chose not to scale it. This has impacted Bigcommerce, IBM and many other Softlayer customers. Our engineers placed too much trust in IBM Softlayer and that's on us. However, the catastrophic failures to see metrics and rapidly scale capacity, the decisions to let hard drives sit at 90% utilization for weeks and months, the cascading failures of an undersized cluster of 52 nodes for the busiest data center in their business speaks to IBM Softlayer’s lack of concern for their customers. We found this out 3 days ago.


(via Oisin)
softlayer  bigcommerce  outages  shambles  ibm  fail  object-storage  storage  iaas  cloud 
april 2015 by jm
AWS Lambda Event-Driven Architecture With Amazon SNS
Any message posted to an SNS topic can trigger the execution of custom code you have written, but you don’t have to maintain any infrastructure to keep that code available to listen for those events and you don’t have to pay for any infrastructure when the code is not being run. This is, in my opinion, the first time that Amazon can truly say that AWS Lambda is event-driven, as we now have a central, independent, event management system (SNS) where any authorized entity can trigger the event (post a message to a topic) and any authorized AWS Lambda function can listen for the event, and neither has to know about the other.
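For reference, a minimal sketch of what the receiving side can look like in Python. The `Records` / `Sns` / `TopicArn` / `Message` fields are the standard shape of an SNS-triggered Lambda event; the `process` helper and the sample topic ARN are hypothetical.

```python
import json


def process(topic_arn, message):
    """Hypothetical application logic: decode JSON if possible."""
    try:
        payload = json.loads(message)
    except ValueError:
        # Not JSON: pass the raw string through instead of failing.
        payload = {"raw": message}
    return {"topic": topic_arn, "payload": payload}


def handler(event, context):
    """Lambda entry point, invoked once per SNS delivery."""
    results = []
    for record in event.get("Records", []):
        sns = record["Sns"]
        results.append(process(sns["TopicArn"], sns["Message"]))
    return results


# Local smoke test with a hand-built event; no AWS account needed.
sample_event = {
    "Records": [
        {
            "Sns": {
                "TopicArn": "arn:aws:sns:us-east-1:123456789012:example",
                "Message": '{"id": 7}',
            }
        }
    ]
}
result = handler(sample_event, None)
```

Neither side knows about the other: the publisher only needs the topic, and the function only sees the event it is handed.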
aws  ec2  lambda  sns  events  cep  event-processing  coding  cloud  hacks  eric-hammond 
april 2015 by jm
2015-02-19 GCE outage
40 minutes of multi-zone network outage for the majority of instances.

'The internal software system which programs GCE’s virtual network for VM egress traffic stopped issuing updated routing information. The cause of this interruption is still under active investigation. Cached route information provided a defense in depth against missing updates, but GCE VM egress traffic started to be dropped as the cached routes expired.'

I wonder if Google Pimms fired the alarms for this ;)
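The failure mode in that quote (cached routes mask a dead updater, then expire) is easy to reproduce in a toy model. This is only a sketch under my own assumptions: `RouteCache`, the 60-second TTL and the addresses are invented, and time is passed in explicitly to keep it deterministic.

```python
class RouteCache:
    """Toy route cache: entries are usable until their TTL runs out."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.routes = {}  # destination -> (next_hop, expires_at)

    def update(self, destination, next_hop, now):
        """Record a fresh route, as the control plane would on each push."""
        self.routes[destination] = (next_hop, now + self.ttl)

    def lookup(self, destination, now):
        """Return the next hop, or None if the route is missing or expired."""
        entry = self.routes.get(destination)
        if entry is None or now >= entry[1]:
            return None  # no usable route: the packet would be dropped
        return entry[0]


cache = RouteCache(ttl_seconds=60)
cache.update("10.0.0.0/8", "gw1", now=0)   # last update before the outage
during = cache.lookup("10.0.0.0/8", now=30)  # still served from cache
after = cache.lookup("10.0.0.0/8", now=60)   # TTL elapsed, traffic dropped
```

The cache buys exactly one TTL of grace, which is why the outage only became visible some time after the updates stopped.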
google  outages  gce  networking  routing  pimms  multi-az  cloud 
february 2015 by jm
Ind.ie Pulse
Syncthing is becoming Ind.ie Pulse. Pulse replaces proprietary sync and cloud services with something open, trustworthy and decentralised. Your data is your data alone and you deserve to choose where it is stored, if it is shared with some third party, and how it's transmitted over the Internet.
syncing  storage  cloud  dropbox  utilities  gpl  decentralization 
october 2014 by jm
Using spot instances
Excellent post on all of the ins and outs of EC2 spot instance usage
ec2  aws  spot-instances  pricing  cloud  auto-scaling  ops 
september 2014 by jm
BBC News - Microsoft 'must release' data held on Dublin server
Messy. I can't see this lasting beyond an appeal.
Law enforcement efforts would be seriously impeded and the burden on the government would be substantial if they had to co-ordinate with foreign governments to obtain this sort of information from internet service providers such as Microsoft and Google, Judge Francis said.

In a blog post, Microsoft's deputy general counsel, David Howard, said: "A US prosecutor cannot obtain a US warrant to search someone's home located in another country, just as another country's prosecutor cannot obtain a court order in her home country to conduct a search in the United States.

"We think the same rules should apply in the online world, but the government disagrees."
microsoft  regions  law  us-law  privacy  google  cloud  international-law  surveillance 
april 2014 by jm
Docker
'the Linux container engine'. I totally misunderstood what Docker was -- this is cool.
Heterogeneous payloads: Any combination of binaries, libraries, configuration files, scripts, virtualenvs, jars, gems, tarballs, you name it. No more juggling between domain-specific tools. Docker can deploy and run them all.

Any server: Docker can run on any x64 machine with a modern linux kernel - whether it's a laptop, a bare metal server or a VM. This makes it perfect for multi-cloud deployments.

Isolation: Docker isolates processes from each other and from the underlying host, using lightweight containers.

Repeatability: Because each container is isolated in its own filesystem, they behave the same regardless of where, when, and alongside what they run.
lxc  containers  virtualization  cloud  ops  linux  docker  deployment 
july 2013 by jm
BitTorrent’s Secure Dropbox Alternative Goes Public
As kragen says, 'a decentralized way to sync a folder of large files, using BitTorrent instead of an untrustworthy central server'. Windows, OSX, and Linux supported
bittorrent  dropbox  cloud  storage  filesharing  sharing  sync  synchronization 
april 2013 by jm
By the numbers: How Google Compute Engine stacks up to Amazon EC2
Scalr's thoughts on Google's EC2 competitor.
with Google Compute Engine, AWS has a formidable new competitor in the public cloud space, and we’ll likely be moving some of Scalr’s production workloads from our hybrid aws-rackspace-softlayer setup to it when it leaves beta. There’s a strong technical case for migrating heavy workloads to GCE, and I’ll be grabbing popcorn to eagerly watch as the battle unfolds between the giants.
gce  cloud  ec2  amazon  aws  google  scalr 
march 2013 by jm
Joyent Services Back After 8 Day Outage
Lest we forget. I think it was 10 days in total once everything was resolved
joyent  outages  bingodisk  strongspace  cloud  solaris  zfs 
july 2012 by jm
Building with Legos
Netflix tech blog on how they deploy their services. Notably, they avoid the Puppet/Chef approach, citing these reasons: 'One is that it eliminates a number of dependencies in the production environment: a master control server, package repository and client scripts on the servers, network permissions to talk to all of these. Another is that it guarantees that what we test in the test environment is the EXACT same thing that is deployed in production; there is very little chance of configuration or other creep/bit rot. Finally, it means that there is no way for people to change or install things in the production environment (this may seem like a really harsh restriction, but if you can build a new AMI fast enough it doesn't really make a difference).'
devops  cloud  aws  netflix  puppet  chef  deployment 
august 2011 by jm
Amazon EC2 outage: summary and lessons learned
Rightscale CTO on last week's outage; pretty detailed, good round-up of useful commentary from around the web, too
ebs  ec2  aws  cloud  availability  slas  rightscale  amazon 
april 2011 by jm
Rumor: Google “Disgusted” With Record Labels
'Once again, Warner is the fly in the ointment, the same company that praises Spotify one day, renews their licenses for the rest of the world and then the next day doesn’t want to license them in the US.'
google  music  cloud  licensing  music-industry  record-labels  warner-music  streaming  from delicious
april 2011 by jm
Netflix: Dev and Ops internals
extensive details on the innards of Netflix' move to AWS, from the legendary Adrian Cockcroft
adrian-cockcroft  aws  netflix  ops  cloud  from delicious
november 2010 by jm
First logging-as-a-service tool for the cloud wins NovaUCD award - siliconrepublic.com
first, eh? not sure about that. still, good going for Irish startup JLizard, logging in the cloud seems to be hot
logging  metrics  analysis  cloud  ireland  startups  novaucd  from delicious
november 2010 by jm
Loggly
'Logging as a Service' - a cloud-based logging service
logging  loggly  cloud  logs  data  metrics  from delicious
november 2010 by jm

