jm + network (33)

Jepsen: Hazelcast 3.8.3
Aphyr's Jepsen review of Hazelcast's CAP behaviour, and Hazelcast doesn't come out of it well. See also https://twitter.com/MarcJBrooker/status/917437286639329280 for more musings from Marc Brooker on the topic ("PA/EC is a confusing and dangerous behaviour for many cases").
jepsen  aphyr  testing  hazelcast  cap-theorem  reliability  partitions  network  pacelc  marc-brooker 
6 weeks ago by jm
NetSpot
'FREE WiFi Site Survey Software for MAC OS X & Windows'.
Sadly, reviews from pals report that it is 'shite' :(
osx  wifi  network  survey  netspot  networking  ops  dataviz  wireless 
april 2017 by jm
The revenge of the listening sockets
More adventures in debugging the Linux kernel:
You can't have a very large number of bound TCP sockets, and we learned that the hard way. We learned a bit about the Linux networking stack: the fact that LHTABLE is fixed size and is hashed by destination port only. Once again we showed a couple of powerful SystemTap scripts.
ops  linux  networking  tcp  network  lhtable  kernel 
april 2016 by jm
About Microservices, Containers and their Underestimated Impact on Network Performance
Shock horror: Docker SDN layers have terrible performance. There's still a pretty lousy perf impact from basic Docker containerization alone, presumably measured without "--net=host" (which is apparently vital).
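For reference, the flag in question bypasses Docker's bridge/NAT plumbing entirely; a minimal sketch (image and container names are hypothetical):

  # default bridged networking: traffic traverses a veth pair, docker0 and NAT
  docker run -d --name svc-bridged my-service-image
  # host networking: the container shares the host's network stack directly,
  # skipping the bridge/NAT path responsible for much of the overhead
  docker run -d --net=host --name svc-host my-service-image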
docker  performance  network  containers  sdn  ops  networking  microservices 
january 2016 by jm
toxy
toxy is a fully programmatic and hackable HTTP proxy to simulate server failure scenarios and unexpected network conditions. It was mainly designed for fuzzing/evil testing purposes; toxy becomes particularly useful for exercising the fault-tolerance and resiliency capabilities of a system, especially in service-oriented architectures, where toxy may act as an intermediate proxy among services.

toxy allows you to plug in poisons, optionally filtered by rules, which essentially can intercept and alter the HTTP flow as you need, performing multiple evil actions in the middle of that process, such as limiting the bandwidth, delaying TCP packets, injecting network jitter latency or replying with a custom error or status code.
toxy  proxies  proxy  http  mitm  node.js  soa  network  failures  latency  slowdown  jitter  bandwidth  tcp 
august 2015 by jm
VPC Flow Logs
we are introducing Flow Logs for the Amazon Virtual Private Cloud.  Once enabled for a particular VPC, VPC subnet, or Elastic Network Interface (ENI), relevant network traffic will be logged to CloudWatch Logs for storage and analysis by your own applications or third-party tools.

You can create alarms that will fire if certain types of traffic are detected; you can also create metrics to help you to identify trends and patterns. The information captured includes information about allowed and denied traffic (based on security group and network ACL rules). It also includes source and destination IP addresses, ports, the IANA protocol number, packet and byte counts, a time interval during which the flow was observed, and an action (ACCEPT or REJECT).
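For concreteness, enabling it from the CLI and a sample of what gets logged look roughly like this (all IDs and the IAM role name are hypothetical; field order follows the default flow log record format):

  aws ec2 create-flow-logs \
    --resource-type VPC --resource-ids vpc-0a1b2c3d \
    --traffic-type ALL \
    --log-group-name my-vpc-flow-logs \
    --deliver-logs-permission-arn arn:aws:iam::123456789012:role/flow-logs-role

  # version account-id interface-id srcaddr dstaddr srcport dstport
  # protocol packets bytes start end action log-status
  # e.g.: 2 123456789012 eni-0abc123 10.0.0.5 10.0.1.7 49152 443 6 20 4249 1433188800 1433188860 ACCEPT OK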
ec2  aws  vpc  logging  tracing  ops  flow-logs  network  tcpdump  packets  packet-capture 
june 2015 by jm
SolarCapture Packet Capture Software
Interesting product line -- I didn't know this existed, but it makes good sense as a "network flight recorder". Big in finance.
SolarCapture is a powerful packet capture product family that can transform every server into a precision network monitoring device, increasing network visibility, network instrumentation, and performance analysis. SolarCapture products optimize network monitoring and security while eliminating the need for specialized appliances, expensive adapters relying on exotic protocols, proprietary hardware, and dedicated networking equipment.


See also Corvil (based in Dublin!): 'I'm using a Corvil at the moment and it's awesome -- nanosecond-precision latency measurements on the wire.'

(via mechanical sympathy list)
corvil  timing  metrics  measurement  latency  network  solarcapture  packet-capture  financial  performance  security  network-monitoring 
may 2015 by jm
Zookeeper: not so great as a highly-available service registry
Turns out ZK isn't a good choice as a service discovery system if you want to be able to use that service discovery system while partitioned from the rest of the ZK cluster:
I went into one of the instances and quickly did an iptables DROP on all packets coming from the other two instances.  This would simulate an availability zone continuing to function, but that zone losing network connectivity to the other availability zones.  What I saw was that the two other instances noticed the first server “going away”, but they continued to function as they still saw a majority (66%).  More interestingly the first instance noticed the other two servers “going away”, dropping the ensemble availability to 33%.  This caused the first server to stop serving requests to clients (not only writes, but also reads).
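That iptables trick is easy to replicate when testing your own setup; a minimal sketch, with hypothetical peer IPs:

  # on the instance to be isolated: silently drop traffic from the other
  # two ensemble members, simulating an AZ losing connectivity
  iptables -A INPUT -s 10.0.1.11 -j DROP
  iptables -A INPUT -s 10.0.1.12 -j DROP
  # undo afterwards with -D in place of -A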


So: within that offline AZ, service discovery *reads* (as well as writes) stopped working due to a lack of ZK quorum. This is quite a feasible outage scenario for EC2, by the way, since (at least when I was working there) the network links between AZs, and the links with the external internet, were not 100% overlapping.

In other words, if you want a highly-available service discovery system in the face of network partitions, you want an AP service discovery system rather than a CP one -- and ZK is a CP system.

Another risk, noted on the Netflix Eureka mailing list at https://groups.google.com/d/msg/eureka_netflix/LXKWoD14RFY/tA9UnerrBHUJ :

ZooKeeper, while tolerant against single node failures, doesn't react well to long partitioning events. For us, it's vastly more important that we maintain an available registry than a necessarily consistent registry. If us-east-1d sees 23 nodes, and us-east-1c sees 22 nodes for a little bit, that's OK with us.


I guess this means that a long partition can trigger SESSION_EXPIRED state, resulting in ZK client libraries requiring a restart/reconnect to fix. I'm not entirely clear what happens to the ZK cluster itself in this scenario though.

Finally, Pinterest ran into other issues relying on ZK for service discovery and registration, described at http://engineering.pinterest.com/post/77933733851/zookeeper-resilience-at-pinterest ; sounds like this was mainly around load and the "thundering herd" overload problem. Their workaround was to decouple ZK availability from their services' availability, by building a Smartstack-style sidecar daemon on each host which tracked/cached ZK data.
zookeeper  service-discovery  ops  ha  cap  ap  cp  service-registry  availability  ec2  aws  network  partitions  eureka  smartstack  pinterest 
november 2014 by jm
"Perspectives On The CAP Theorem" [pdf]
"We cannot achieve [CAP theorem] consistency and availability in a partition-prone network."
papers  cap  distcomp  cap-theorem  consistency  availability  partitions  network  reliability 
september 2014 by jm
Boundary's new server monitoring free offering
'High resolution, 1 second intervals for all metrics; Fluid analytics, drag any graph to any point in time; Smart alarms to cut down on false positives; Embedded graphs and customizable dashboards; Up to 10 servers for free'

Pre-registration is open now. Could be interesting, although the limit of 10 machines is pretty small for any production usage.
boundary  monitoring  network  ops  metrics  alarms  tcp  ip  netstat 
july 2014 by jm
Call me maybe: Elasticsearch
Wow, these are terrible results. From the sounds of it, ES just cannot deal with realistic outage scenarios and is liable to suffer catastrophic damage in reasonably-common partitions.
If you are an Elasticsearch user (as I am): good luck. Some people actually advocate using Elasticsearch as a primary data store; I think this is somewhat less than advisable at present. If you can, store your data in a safer database, and feed it into Elasticsearch gradually. Have processes in place that continually traverse the system of record, so you can recover from ES data loss automatically.
elasticsearch  ops  storage  databases  jepsen  partition  network  outages  reliability 
june 2014 by jm
Call me maybe: RabbitMQ
We used Knossos and Jepsen to prove the obvious: RabbitMQ is not a lock service. That investigation led to a discovery hinted at by the documentation: in the presence of partitions, RabbitMQ clustering will not only deliver duplicate messages, but will also drop huge volumes of acknowledged messages on the floor. This is not a new result, but it may be surprising if you haven't read the docs closely -- especially if you interpreted the phrase "chooses Consistency and Partition Tolerance" to mean, well, either of those things.
rabbitmq  network  partitions  failure  cap-theorem  consistency  ops  reliability  distcomp  jepsen 
june 2014 by jm
Building a Global, Highly Available Service Discovery Infrastructure with ZooKeeper
This is the written version of a presentation [Camille Fournier] made at the ZooKeeper Users Meetup at Strata/Hadoop World in October, 2012 (slides available here). This writeup expects some knowledge of ZooKeeper.


Good advice from one of the ZK committers.
zookeeper  service-discovery  architecture  distcomp  camille-fournier  availability  wan  network 
may 2014 by jm
Pickles & Spores: Improving Support for Distributed Programming in Scala
'Spores are "small units of possibly mobile functional behavior". They're a closure-like abstraction meant for use in distributed or concurrent environments. Spores provide a guarantee that the environment is effectively immutable, and safe to ship over the wire. Spores aim to give library authors some confidence in exposing functions (or, rather, spores) in public APIs for safe consumption in a distributed or concurrent environment.

The first part of the talk covers a simpler variant of spores as they are proposed for inclusion in Scala 2.11. The second part of the talk briefly introduces a current research project ongoing at EPFL which leverages Scala's type system to provide type constraints that give authors finer-grained control over spore capturing semantics. What's more, these type constraints can be composed during spore composition, so library authors are effectively able to propagate expert knowledge via these composable constraints.

The last part of the talk briefly covers Scala/Pickling, a fast, new, open serialization framework.'
pickling  scala  presentations  spores  closures  fp  immutability  coding  distributed  distcomp  serialization  formats  network 
april 2014 by jm
ZooKeeper Resilience at Pinterest
Essentially decoupling the client services from ZK using a local daemon on each client host; very similar to Airbnb's Smartstack. This is a bit of an indictment of ZK's usability, though.
ops  architecture  clustering  network  partitions  cap  reliability  smartstack  airbnb  pinterest  zookeeper 
march 2014 by jm
Blockade
'Testing applications under slow or flaky network conditions can be difficult and time consuming. Blockade aims to make that easier. A config file defines a number of docker containers and a command line tool makes introducing controlled network problems simple.'

Open-source release from Dell's Cloud Manager team (ex-Enstratius), inspired by aphyr's Jepsen. Simulates packet loss using "tc netem", so no ability to e.g. drop packets on certain flows or certain ports. Still, looks very usable -- great stuff.
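The raw netem equivalent, if you want to reproduce a flaky link by hand on one host (interface and numbers are illustrative):

  # add 100ms +/- 20ms of delay and 1% packet loss to everything on eth0
  tc qdisc add dev eth0 root netem delay 100ms 20ms loss 1%
  # restore normal behaviour
  tc qdisc del dev eth0 root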
testing  docker  networking  distributed  distcomp  enstratius  jepsen  network  outages  partitions  cap  via:lusis 
february 2014 by jm
Rhizome | Occupy.here: A tiny, self-contained darknet
Occupy.here began two years ago as an experiment for the encampment at Zuccotti Park. It was a wifi router hacked to run OpenWrt Linux (an operating system mostly used for computer networking) and a small "captive portal" website. When users joined the wifi network and attempted to load any URL, they were redirected to http://occupy.here. The web software offered up a simple BBS-style message board providing its users with a space to share messages and files.


Nifty project from Dan Phiffer.
occupy.here  openwrt  hacking  wifi  network  community 
october 2013 by jm
IrelandOffline broadband availability map
Marking the locations of broadband options in your area, along with VDSL cabinets, local exchanges, wireless ISP coverage, and the landing sites of submarine cables (presumably from submarinecablemap.com data).
irelandoffline  cables  network  internet  ireland  coverage  wisps  vdsl  broadband 
august 2013 by jm
SSL/TLS overhead
'The TLS handshake has multiple variations, but let’s pick the most common one – anonymous client and authenticated server (the connections browsers use most of the time).' Works out to 4 packets, in addition to the TCP handshake's 3, and about 6.5k bytes on average.
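You can watch the handshake messages (and their sizes) go by yourself with something like:

  # -msg prints each TLS handshake message as it happens;
  # </dev/null closes the connection once the handshake completes
  openssl s_client -connect www.example.com:443 -msg </dev/null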
network  tls  ssl  performance  latency  speed  networking  internet  security  packets  tcp  handshake 
june 2013 by jm
the infamous 2008 S3 single-bit-corruption outage
Neat, I didn't realise this was publicly visible. A single corrupted bit infected the S3 gossip network, taking down the whole S3 service in (iirc) one region:
We've now determined that message corruption was the cause of the server-to-server communication problems. More specifically, we found that there were a handful of messages on Sunday morning that had a single bit corrupted such that the message was still intelligible, but the system state information was incorrect. We use MD5 checksums throughout the system, for example, to prevent, detect, and recover from corruption that can occur during receipt, storage, and retrieval of customers' objects. However, we didn't have the same protection in place to detect whether [gossip state] had been corrupted. As a result, when the corruption occurred, we didn't detect it and it spread throughout the system causing the symptoms described above. We hadn't encountered server-to-server communication issues of this scale before and, as a result, it took some time during the event to diagnose and recover from it.

During our post-mortem analysis we've spent quite a bit of time evaluating what happened, how quickly we were able to respond and recover, and what we could do to prevent other unusual circumstances like this from having system-wide impacts. Here are the actions that we're taking: (a) we've deployed several changes to Amazon S3 that significantly reduce the amount of time required to completely restore system-wide state and restart customer request processing; (b) we've deployed a change to how Amazon S3 gossips about failed servers that reduces the amount of gossip and helps prevent the behavior we experienced on Sunday; (c) we've added additional monitoring and alarming of gossip rates and failures; and, (d) we're adding checksums to proactively detect corruption of system state messages so we can log any such messages and then reject them.


This is why you checksum all the things ;)
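The fix in miniature, for any state message you act on (filenames hypothetical; any strong-enough digest will do):

  # compute a checksum when producing the message...
  md5sum gossip-state.bin > gossip-state.md5
  # ...and verify before consuming it, rejecting on mismatch
  md5sum --check gossip-state.md5 || echo "corrupt message, rejecting"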
s3  aws  post-mortems  network  outages  failures  corruption  grey-failures  amazon  gossip 
june 2013 by jm
Call me maybe: Carly Rae Jepsen and the perils of network partitions
Kyle "aphyr" Kingsbury expands on his slides demonstrating the real-world failure scenarios that arise during some kinds of partitions (specifically, the TCP-hang, no clear routing failure, network partition scenario). Great set of blog posts clarifying CAP
distributed  network  databases  cap  nosql  redis  mongodb  postgresql  riak  crdt  aphyr 
may 2013 by jm
CAP Confusion: Problems with ‘partition tolerance’
Another good clarification about CAP which resurfaced during last week's discussion:
So what causes partitions? Two things, really. The first is obvious – a network failure, for example due to a faulty switch, can cause the network to partition. The other is less obvious, but fits with the definition [...]: machine failures, either hard or soft. In an asynchronous network, i.e. one where processing a message could take unbounded time, it is impossible to distinguish between machine failures and lost messages. Therefore a single machine failure partitions it from the rest of the network. A correlated failure of several machines partitions them all from the network. Not being able to receive a message is the same as the network not delivering it. In the face of sufficiently many machine failures, it is still impossible to maintain availability and consistency, not because two writes may go to separate partitions, but because the failure of an entire ‘quorum’ of servers may render some recent writes unreadable.

(sorry, catching up on old interesting things posted last week...)
failure  scalability  network  partitions  cap  quorum  distributed-databases  fault-tolerance 
may 2013 by jm
Alex Feinberg's response to Damien Katz' anti-Dynamoish/pro-Couchbase blog post
Insightful response, worth bookmarking. (the original post is at http://damienkatz.net/2013/05/dynamo_sure_works_hard.html ).
while you are saving on read traffic (online reads only go to the master), you are now decreasing availability (contrary to your stated goal), and increasing system complexity.
You also do hurt performance by requiring all writes and reads to be serialized through a single node: unless you plan to have a leader election whenever the node fails to meet a read SLA (which is going to result in a disaster -- I am speaking from personal experience), you will have to accept that you're bottlenecked by a single node. With a Dynamo-style quorum (for either reads or writes), a single straggler will not reduce whole-cluster latency.
The core point of Dynamo is low latency, availability and handling of all kinds of partitions: whether clean partitions (long term single node failures), transient failures (garbage collection pauses, slow disks, network blips, etc...), or even more complex dependent failures.
The reality, of course, is that availability is neither the sole nor the principal concern of every system. It's perfectly fine to trade off availability for other goals -- you just need to be aware of that trade-off.
cap  distributed-databases  databases  quorum  availability  scalability  damien-katz  alex-feinberg  partitions  network  dynamo  riak  voldemort  couchbase 
may 2013 by jm
“Call Me Maybe: Carly Rae Jepsen and the Perils of Network Partitions”
Aphyr's epic RICON talk, exploring distributed-database failure modes through music. And what a lot of fail there is!

Bottom line: CRDTs win
crdts  data-structures  storage  ricon  aphyr  failures  network  partitions  puns  slides 
may 2013 by jm
TCP Tune
These notes are intended to help users and system administrators maximize TCP/IP performance on their computer systems. They summarize all of the end-system (computer system) network tuning issues including a tutorial on TCP tuning, easy configuration checks for non-experts, and a repository of operating system specific instructions for getting the best possible network performance on these platforms.


Some tips for maximizing HPC network performance for the intra-DC case; recommended by the LinkedIn Kafka operations page.
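The usual Linux knobs from guides like this look roughly as follows (values illustrative, not recommendations -- size them to your own bandwidth-delay product):

  # raise the ceilings on socket buffer sizes...
  sysctl -w net.core.rmem_max=16777216
  sysctl -w net.core.wmem_max=16777216
  # ...and let TCP autotuning grow into them (min / default / max, bytes)
  sysctl -w net.ipv4.tcp_rmem="4096 87380 16777216"
  sysctl -w net.ipv4.tcp_wmem="4096 65536 16777216"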
tuning  network  tcp  sysadmin  performance  ops  kafka  ec2 
april 2013 by jm
Jeff Dean's list of "Numbers Everyone Should Know"
From a 2007 Google all-hands: the list of typical latency timings, ranging from an L1 cache reference (0.5 nanoseconds) to a CA->NL->CA IP round trip (150 milliseconds).
performance  latencies  google  jeff-dean  timing  caches  speed  network  zippy  disks  via:kellabyte 
march 2013 by jm
Passively Monitoring Network Round-Trip Times - Boundary
'how Boundary uses [TCP timestamps] to calculate round-trip times (RTTs) between any two hosts by passively monitoring TCP traffic flows, i.e., without actively launching ICMP echo requests (pings). The post is primarily an overview of this one aspect of TCP monitoring, but it also outlines the mechanism we are using, and demonstrates its correctness.'
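Relatedly, the Linux kernel keeps its own smoothed per-socket RTT estimates, readable without sending a single probe:

  # per-socket TCP internals; look for "rtt:<srtt>/<rttvar>" in the output
  ss -ti state established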
tcp  boundary  monitoring  network  ip  passive-monitoring  rtt  timestamping 
february 2013 by jm
Fatcache
from Twitter -- 'a cache for your big data. Even though memory is a thousand times faster than SSD, network-connected SSD-backed memory makes sense if we design the system in a way that network latencies dominate over the SSD latencies by a large factor. To understand why network-connected SSD makes sense, it is important to understand the role distributed memory plays in large-scale web architecture. In recent years, terabyte-scale, distributed, in-memory caches have become a fundamental building block of any web architecture. In-memory indexes, hash tables, key-value stores and caches are increasingly incorporated for scaling throughput and reducing latency of persistent storage systems. However, power consumption, operational complexity and single node DRAM cost make horizontally scaling this architecture challenging. The current cost of DRAM per server increases dramatically beyond approximately 150 GB, and power cost scales similarly as DRAM density increases. Fatcache extends a volatile, in-memory cache by incorporating SSD-backed storage.'
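Handily, fatcache speaks the memcached ASCII protocol (or so I understand), so quick shell tests work against it like any memcached:

  # store a 3-byte value under "foo", then read it back
  printf 'set foo 0 0 3\r\nbar\r\nget foo\r\nquit\r\n' | nc localhost 11211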
twitter  ssd  cache  caching  memcached  memcache  memory  network  storage 
february 2013 by jm
Tunlr
'uses DNS witchcraft to allow you to access US/UK-only audio and video services like Hulu.com, BBC iPlayer, etc. without using a VPN or Web proxy.' According to http://superuser.com/questions/461316/how-does-tunlr-work , it proxies the initial connection setup and geo-auth, then mangles the stream address to stream directly, not via proxy. Sounds pretty useful.
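You can spot this class of trick from the outside by comparing answers from a normal resolver with the unblocker's DNS (the unblocker address below is a placeholder):

  # a normal resolver returns the geo-fenced service's real/CDN addresses...
  dig +short www.hulu.com @8.8.8.8
  # ...while the unblocker's resolver answers with its own proxy, but only
  # for the handful of hostnames involved in the geo-auth step
  dig +short www.hulu.com @<unblocker-dns-ip>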
proxy  network  vpn  dns  tunnel  content  video  audio  iplayer  bbc  hulu  streaming  geo-restriction 
january 2013 by jm
Why upgrading your Linux Kernel will make your customers much happier
Tweaking TCP slow start on the HTTP server side (newer kernels raise the initial congestion window) decreased internet round-trip page load time by 21% in this case; comments suggest an "ip route" command can achieve the same without a kernel upgrade.
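The "ip route" variant usually cited is raising initcwnd on the default route; a sketch with a hypothetical gateway and device (kernels from 2.6.39 on default to initcwnd 10 anyway):

  ip route change default via 192.168.1.1 dev eth0 initcwnd 10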
tcp  performance  linux  network  web  http  rtt  slow-start  via:jacob 
march 2012 by jm
Mallory: Transparent TCP and UDP Proxy – Intrepidus Group - Insight
'a transparent TCP and UDP proxy. It can be used to get at those hard-to-intercept network streams, assess those tricky mobile web applications, or maybe just pull a prank on your friend.' Basically: cause wifi clients to associate with an Ubuntu host, then sniff their packets.
proxy  security  network  sniffing  transparent-proxies  mobile  reverse-engineering  from delicious
april 2011 by jm
NeoRouter
Establishes an encrypted, private overlay "virtual LAN" for a small set of machines; like Hamachi, except it supports Macs, Linux, and a range of WRT54G firmware; can run off a USB stick.
firewall  hamachi  network  openwrt  remote  router  security  vpn  desktop-sharing  neorouter  tomato  from delicious
july 2010 by jm

