mpm + replication   21

Designing an Efficient Replicated Log Store with Consensus Protocol
A highly available, high-performance message logging system is a critical building block for various use cases that require global ordering, especially deterministic distributed transactions. To achieve availability, we maintain multiple replicas that hold the same payloads in exactly the same order. This introduces challenging issues, such as keeping replicas consistent after a failure while minimizing performance degradation. Consensus protocols based on replicated state machines are the most suitable candidates to fulfill those requirements, but the double-write problem and differing logging granularity make it hard to keep the system efficient. This paper suggests a novel way to build a replicated log store on top of the Raft consensus protocol, aiming to provide the same level of consistency and fault tolerance without sacrificing the throughput of the system.
consensus  consistency  replication  storage  database 
4 weeks ago by mpm
On mixing eventual and strong consistency: Bayou revisited
In this paper we study the properties of eventually consistent distributed systems that feature arbitrarily complex semantics and mix eventual and strong consistency. These systems execute requests in a highly-available, weakly-consistent fashion, but also enable stronger guarantees through additional inter-replica synchronization mechanisms that require the ability to solve distributed consensus. We use the seminal Bayou system as a case study, and then generalize our findings to a whole class of systems. We show dubious and unintuitive behaviour exhibited by those systems and provide a theoretical framework for reasoning about their correctness. We also state an impossibility result that formally proves the inherent limitation of such systems, namely temporary operation reordering, which admits interim disagreement between replicas on the relative order in which the client requests were executed.
consistency  replication 
9 weeks ago by mpm
Keep The Data Where You Use It
The de facto method of keeping the data close to the users is full replication. Many fully replicated systems, however, still have a single region responsible for orchestrating the writes, making the data available locally only for reads and not for updates.
data  database  replication 
december 2018 by mpm
Saturn: a Distributed Metadata Service for Causal Consistency
This paper presents the design, implementation, and evaluation of Saturn, a metadata service for geo-replicated systems. Saturn can be used in combination with several distributed and replicated data services to ensure that remote operations are made visible in an order that respects causality, a requirement central to many consistency criteria.
causal  consistency  replication 
april 2017 by mpm
Timestamps for Partial Replication
Maintaining causal consistency in distributed shared memory systems using vector timestamps has received a lot of attention from both theoretical and practical perspectives. However, most of the previous literature focuses on full replication, where each data item is stored in all replicas, which may not be scalable due to the increasing amount of data. In this report, we investigate how to achieve causal consistency in partially replicated systems, where each replica may store a different set of data. We propose an algorithm that tracks causal dependencies via vector timestamps in a client-server model for partial replication. The cost of our algorithm in terms of timestamp size varies as a function of the manner in which the replicas share data and the set of replicas accessed by each client. We also establish a connection between our algorithm and the previous work on full replication.
time  replication 
april 2017 by mpm
Efficient and Modular Consensus-Free Reconfiguration for Fault-Tolerant Storage
Quorum systems are useful tools for implementing consistent and available storage in the presence of failures. These systems usually comprise a static set of servers that provide a fault-tolerant read/write register accessed by a set of clients. We consider a dynamic variant of these systems and propose FreeStore, a set of fault-tolerant protocols that emulates a register in dynamic asynchronous systems in which processes are able to join/leave the servers set during the execution. These protoco...
consensus  replication  storage  fault-tolerance 
july 2016 by mpm
machi
Our goal is a robust & reliable, distributed, highly available(*), large file store based upon write-once registers, append-only files, Chain Replication, and client-server style architecture
storage  replication 
november 2015 by mpm
Compare Cost and Performance of Replication and Erasure Coding
Data storage systems are more reliable than their individual components. In order to build highly reliable systems out of less reliable parts, systems introduce redundancy. In replicated systems, objects are simply copied several times with each copy residing on a different physical device. While such an approach is simple and direct, more elaborate approaches such as erasure coding can achieve equivalent levels of data protection while using less redundancy. This report examines the trade-offs ...
replication  data 
august 2014 by mpm
An Efficient Read Dominant Data Replication Protocol under Serial Isolation using Quorum Consensus Approach
In distributed systems, data replication provides better availability, higher read capacity, improved access efficiency and lower bandwidth requirements in the system. In this paper, we propose a significantly more efficient approach to data replication for serial isolation by using the newly proposed Circular quorum systems. This paper has three major contributions. First, we propose Circular quorum systems, which generalize various existing quorum systems, such as Read-one-write-all (ROWA) quorum systems, Majority quorum systems, Grid quorum systems, Diamond quorum systems, D-Space quorum systems, Multi-dimensional-grid quorum systems and Generalized-grid quorum systems. Second, Circular quorum systems not only generalize but also improve on the performance of existing quorum systems of their category. Third, we propose a highly available Circular quorum consensus protocol for data replication under the serial isolation level that uses a suitable Circular quorum system for read-dominant scenarios.
consensus  performance  replication 
july 2014 by mpm
Optimistic Parallel State-Machine Replication
State-machine replication, a fundamental approach to fault tolerance, requires replicas to execute commands deterministically, which usually results in sequential execution of commands. Sequential execution limits performance and underuses servers, which are increasingly parallel (i.e., multicore). To narrow the gap between state-machine replication requirements and the characteristics of modern servers, researchers have recently come up with alternative execution models. This paper surveys existing approaches to parallel state-machine replication and proposes a novel optimistic protocol that inherits the scalable features of previous techniques. Using a replicated B+-tree service, we demonstrate in the paper that our protocol outperforms the most efficient techniques by a factor of 2.4.
replication  performance  consensus 
june 2014 by mpm
Making Operation-based CRDTs Operation-based
Conflict-free Replicated Datatypes (CRDT) can simplify the design of eventually consistent systems. They can be classified into state-based or operation-based. Operation-based designs have the potential for allowing very compact solutions in both the sent messages and the object state size. Unfortunately, the current approaches are still far from this objective. In this paper, we introduce a new `pure' operation-based framework that makes the design and the implementation of these CRDTs simpler and more efficient. We show how to leverage the meta-data of the messaging middleware to design very compact CRDTs, while only disseminating operation names and their optional arguments.
crdt  consistency  replication 
march 2014 by mpm
Copysets and Chainsets: A Better Way to Replicate
The traditional technique for performing such partitioning and replication is to randomly assign data to replicas. Although such random assignment is relatively easy to implement, it suffers from a fatal drawback: as cluster size grows, it becomes almost guaranteed that a failure of a small percentage of the cluster will lead to permanent data loss.
database  replication  availability 
february 2014 by mpm
Replicated Data Types: Specification, Verification, Optimality
Geographically distributed systems often rely on replicated eventually consistent data stores to achieve availability and performance. To resolve conflicting updates at different replicas, researchers and practitioners have proposed specialized consistency protocols, called replicated data types, that implement objects such as registers, counters, sets or lists. Reasoning about replicated data types has however not been on par with comparable work on abstract data types and concurrent data types, lacking specifications, correctness proofs, and optimality results.

To fill in this gap, we propose a framework for specifying replicated data types using relations over events and verifying their implementations using replication-aware simulations. We apply it to 7 existing implementations of 4 data types with nontrivial conflict-resolution strategies and optimizations (last-writer-wins register, counter, multi-value register and observed-remove set). We also present a novel technique for obtaining lower bounds on the worst-case space overhead of data type implementations and use it to prove optimality of 4 implementations. Finally, we show how to specify consistency of replicated stores with multiple objects axiomatically, in analogy to prior work on weak memory models. Overall, our work provides foundational reasoning tools to support research on replicated eventually consistent stores.
crdt  consistency  replication 
january 2014 by mpm
Replicant
Replicant is a tool for creating replicated state machines
replication 
december 2013 by mpm
On Barriers and the Gap between Active and Passive Replication
Active replication is commonly built on top of the atomic broadcast primitive. Passive replication, which has been recently used in the popular ZooKeeper coordination system, can be naturally built on top of the primary-order atomic broadcast primitive. Passive replication differs from active replication in that it requires processes to cross a barrier before they become primaries and start broadcasting messages. In this paper, we propose a barrier function tau that explains and encapsulates the...
replication  protocol 
october 2013 by mpm
Calvin: Fast Distributed Transactions for Partitioned Database Systems
By replicating transaction inputs rather than effects, Calvin is also able to support multiple consistency levels, including Paxos-based strong consistency across geographically distant replicas, at no cost to transactional throughput.
consistency  database  replication 
july 2012 by mpm
Chain Replication in Theory and in Practice
This paper is a case study of the implementation of the chain replication protocol in a distributed key-value store called Hibari. In theory, the chain replication algorithm is quite simple and should be straightforward to implement correctly. In practice, however, there were many implementation details that had effects both profound and subtle.
availability  replication 
july 2012 by mpm
Chain Replication for Supporting High Throughput and Availability
Chain replication is a new approach to coordinating clusters of fail-stop storage servers. The approach is intended for supporting large-scale storage services that exhibit high throughput and availability without sacrificing strong consistency guarantees.
distributed  availability  replication 
july 2012 by mpm
Thoughts about Multi-Master Replication of Tree-Structured Data
ideas around independent updates to trees of data (e.g. XML or XAML documents)
datastructure  database  replication 
august 2007 by mpm
Globule
Globule is a third-party module for the Apache Web server that allows a given server to replicate its documents to other Globule servers
replication  web 
april 2007 by mpm
