github/orchestrator: MySQL replication topology management and HA
MySQL replication topology management and HA.
mysql  replication  golang  failover  database  admin  administration 
november 2018 by pinterb
October 21 post-incident analysis | The GitHub Blog
A network outage caused a split-brain scenario, and their failover system allowed writes to occur in both
regional databases. As a result, once the outage was repaired, it was impossible to reconcile the writes in an automated fashion.

Embarrassingly, this exact scenario was called out in their previous blog post about their Raft-based failover system:

"In a data center isolation scenario, and assuming a master is in the isolated DC, apps in that DC are still able to write to the master. This may result in state inconsistency once network is brought back up. We are working to mitigate this split-brain by implementing a reliable STONITH from within the very isolated DC. As before, some time will pass before bringing down the master, and there could be a short period of split-brain. The operational cost of avoiding split-brains altogether is very high."

Failover is hard.
github  fail  outages  failover  replication  consensus  ops 
october 2018 by jm
What's New in Failover Clustering in Windows Server | Microsoft Docs
In Windows Server 2012 R2, if the cluster is configured to use dynamic quorum (the default), the witness vote is also dynamically adjusted based on the number of voting nodes in current cluster membership. If there are an odd number of votes, the quorum witness does not have a vo
cluster  failover  sql  availabilitygroups  reference  Microsoft  sqltact 
september 2018 by wda
4G WAN: Bonding vs Load Balancing
Difference between bonding and load balancing
router  internet  isp  failover 
september 2018 by mjs

