mpm + concurrency   241

EBtree - Design for a Scheduler and Use (Almost) Everywhere
Andjelko Iharos explores the goals, design and the choices behind the implementations of EBtree, and how they produce a very fast and versatile data storage for many of HAProxys advanced features. EBtree is a different take on the ubiquitous tree data structure, and has been helping HAProxy, a high performance open source software load balancer, to keep up with demands for over a decade.
datastructure  concurrency 
8 days ago by mpm
WaitFreeHierarchy
The wait-free hierarchy hrm classifies AsynchronousSharedMemory object types T by consensus number, where a type T has consensus number n if with objects of type T and atomic registers (all initialized to appropriate values) it is possible to solve wait-free consensus (i.e., agreement, validity, wait-free termination) for n processes but not for n 1 processes. The consensus number of any type is at least 1, since 1-process consensus requires no interaction, and may range up to ∞ for particularly powerful objects.
concurrency  non-blocking 
26 days ago by mpm
OneFile
OneFile is a Software Transactional Memory (STM) meant to make it easy to implement lock-free and wait-free data structures
concurrency  non-blocking 
5 weeks ago by mpm
Arachne: Core-Aware Thread Management
Arachne is a new user-level implementation of threads that provides both low latency and high throughput for applications with extremely short-lived threads (only a few microseconds). Arachne is core-aware: each application determines how many cores it needs, based on its load; it always knows exactly which cores it has been allocated, and it controls the placement of its threads on those cores. A central core arbiter allocates cores between applications. Adding Arachne to memcached improved SLO-compliant throughput by 37%, reduced tail latency by more than 10x, and allowed memcached to coexist with background applications with almost no performance impact. Adding Arachne to the RAMCloud storage system increased its write throughput by more than 2.5x. The Arachne threading library is optimized to minimize cache misses; it can initiate a new user thread on a different core (with load balancing) in 320 ns. Arachne is implemented entirely at user level on Linux; no kernel modifications are needed
concurrency  memory 
6 weeks ago by mpm
Black-box Concurrent Data Structures for NUMA Architectures
High-performance servers are non-uniform memory access (NUMA) machines. To fully leverage these machines, programmers need efficient concurrent data structures that are aware of the NUMA performance artifacts. We propose Node Replication (NR), a black-box approach to obtaining such data structures. NR takes an arbitrary sequential data structure and automatically transforms it into a NUMA-aware concurrent data structure satisfying linearizability. Using NR requires no expertise in concurrent data structure design, and the result is free of concurrency bugs. NR draws ideas from two disciplines: shared-memory algorithms and distributed systems. Briefly, NR implements a NUMA-aware shared log, and then uses the log to replicate data structures consistently across NUMA nodes. NR is best suited for contended data structures, where it can outperform lock-free algorithms by 3.1x, and lock-based solutions by 30x. To show the benefits of NR to a real application, we apply NR to the data structures of Redis, an in-memory storage system. The result outperforms other methods by up to 14x. The cost of NR is additional memory for its log and replicas.
concurrency  datastructure  memory 
6 weeks ago by mpm
Understanding Real-World Concurrency Bugs in Go
In this paper, we perform the first systematic study onconcurrency bugs in real Go programs. We studied six pop-ular Go software including Docker, Kubernetes, and gRPC.We analyzed 171 concurrency bugs in total, with more thanhalf of them caused by non-traditional, Go-specific problems.Apart from root causes of these bugs, we also studied theirfixes, performed experiments to reproduce them, and eval-uated them with two publicly-available Go bug detectors.Overall, our study provides a better understanding on Go’sconcurrency models and can guide future researchers andpractitioners in writing better, more reliable Go softwareand in developing debugging and diagnosis tools for Go.
concurrency  go 
6 weeks ago by mpm
Design of a Modern Cache
Caching is a common approach for improving performance, yet most implementations use strictly classical techniques. In this article we will explore the modern methods used by Caffeine, an open-source Java caching library, that yield high hit rates and excellent concurrency.
caching  concurrency  memory  scalability 
7 weeks ago by mpm
How to do distributed locking
The algorithm claims to implement fault-tolerant distributed locks (or rather, leases) on top of Redis, and the page asks for feedback from people who are into distributed systems. The algorithm instinctively set off some alarm bells in the back of my mind, so I spent a bit of time thinking about it and writing up these notes.
consistency  concurrency 
april 2019 by mpm
The 5-year journey to bring restartable sequences to Linux
With this design, the optimistic case (where the thread isn't interrupted) is extremely fast because expensive locks and atomic instructions can be entirely avoided
linux  concurrency 
february 2019 by mpm
Evolution of the x86 context switch in Linux
Trace the context switch through the Linux kernel from the earliest (0.01) to the most recent LTS release (4.14.67) -- with special emphasis on the first and last versions
linux  concurrency 
february 2019 by mpm
Toward non-blocking asynchronous I/O
The Linux asynchronous I/O (AIO) layer tends to have many critics and few defenders, but most people at least expect it to actually be asynchronous. In truth, an AIO operation can block in the kernel for a number of reasons, making AIO difficult to use in situations where the calling thread truly cannot afford to block.
linux  io  concurrency 
october 2018 by mpm
The Battle of the Schedulers: FreeBSD ULE vs. Linux CFS
This paper analyzes the impact on application performance of the design and implementation choices made in two widely used open-source schedulers: ULE, the default FreeBSD scheduler, and CFS, the default Linux scheduler.
scheduling  concurrency  linux  bsd 
october 2018 by mpm
libaco
A blazing fast and lightweight C asymmetric coroutine library
concurrency 
july 2018 by mpm
Stamp-it: A more Thread-efficient, Concurrent Memory Reclamation Scheme in the C++ Memory Model
We present Stamp-it, a new, concurrent, lock-less memory reclamation scheme with amortized, constant-time (thread-count independent) reclamation overhead. Stamp-it has been implemented and proved correct in the C++ memory model using as weak memory-consistency assumptions as possible. We have likewise (re)implemented six other comparable reclamation schemes. We give a detailed performance comparison, showing that Stamp-it performs favorably (sometimes better, at least as good as) than most of these other schemes while being able to reclaim free memory nodes earlier.
c++  memory  concurrency 
july 2018 by mpm
MySQL 8.0: New Lock free, scalable WAL design
The Write Ahead Log (WAL) is one of the most important components of a database. All the changes to data files are logged in the WAL (called the redo log in InnoDB). This allows to postpone the moment when the modified pages are flushed to disk, still protecting from data losses.
database  concurrency  storage  non-blocking 
july 2018 by mpm
Performance Under Load
Enforcing concurrency limits is nothing new; the hard part is figuring out this limit in a large dynamic distributed system where concurrency and latency characteristics are constantly changing. The main purpose of our solution is to dynamically identify this concurrency limit.
concurrency  availability 
march 2018 by mpm
Efficient Single Writer Concurrency
In this paper we consider single writer multiple reader concurrency - any number of transactions can atomically access shared data structures, but only one thread can atomically update the data. Limiting ourselves to a single writer allows us to achieve strong properties, some of which are not achievable with multiple writers. In particular, we guarantee strict serializability, no aborts, wait-freedom with strong bounds on step complexity, and precise garbage collection. Our approach is based on a variant of multiversion concurrency control. The user code accesses shared data in a purely functional manner. This allows each read to get a snapshot of the current database. The writer simply swaps in a new root to commit a new version. For garbage collection, we define the version maintenance problem and present efficient algorithms that solve it. This framework allows for very low overhead for maintaining and collecting multiple versions. The single-writer setup is particularly applicable to search indices, online graph analysis, and hybrid transactional/analytical processing (HTAP) databases, in which the bulk of the work is done by transactions that analyze the data. We have implemented the approach in C++ based on a parallel library PAM on balanced binary trees. We present results for two use cases: concurrent range-sum queries and a search index on a corpus. Both support adding new elements to the database. Experiments show that in both cases there is very little overhead when adding a single writer running in the background, while the queries can gradually get newly updated versions. Also, using our lock-free algorithm for version maintenance, the maximum live versions is bounded by the number of working transactions at the same time.
concurrency  functional 
march 2018 by mpm
uC++
The μC++ project extends C++ with new constructs providing advanced control-flow including light-weight concurrency on shared-memory uni- and multi-processor computers running UNIX and Linux operating systems. μC++ accomplishes this by providing new kinds of classes: coroutines, which have independent execution states; tasks, which have their own threads; and monitors, which allow for safe communication among tasks.
c++  concurrency  coroutines 
march 2018 by mpm
Cache-tries: concurrent lock-free hash tries with constant-time operations
Concurrent non-blocking hash tries have good cache locality, and horizontally scalable operations. However, operations on most existing concurrent hash tries run in O(log n) time. In this paper, we show that the concurrent hash trie operations can run in expected constant time. We present a novel lock-free concurrent hash trie design that exerts less pressure on the memory allocator. This hash trie is augmented with a quiescently consistent cache, which permits the basic operations to run in expected O(1) time. We show a statistical analysis for the constant-time bound, which, to the best of our knowledge, is the first such proof for hash tries. We also prove the safety, lock-freedom and linearizability properties. On typical workloads, our implementation demonstrates up to 5X performance improvements with respect to the previous hash trie variants.
concurrency  datastructure 
march 2018 by mpm
RacerD
RacerD statically analyzes Java code to detect potential concurrency bugs. This analysis does not attempt to prove the absence of concurrency issues, rather, it searches for a high-confidence class of data races
java  concurrency 
december 2017 by mpm
What every systems programmer should know about lockless concurrency
Seasoned programmers are familiar with concurrency building blocks like mutexes, semaphores, and condition variables. But what makes them work? How do we write concurrent code when we can’t use them, like when we’re working below the operating system in an embedded environment, or when we can’t block due to hard time constraints? And since your system transforms your code into things you didn’t write, running in orders you never asked for, how do multithreaded programs work at all? Concurrency—especially on modern hardware—is a complicated and unintuitive topic, but let’s try to cover some fundamentals.
concurrency  non-blocking 
november 2017 by mpm
A Practical Multi-Word Compare-and-Swap Operation
Work on non-blocking data structures has proposed extending processor designs with a compare-and-swap primitive, CAS2, which acts on two arbitrary memory locations. Experience suggested that current operations, typically single-word compare-and-swap (CAS1), are not expressive enough to be used alone in an efficient manner. In this paper we build CAS2 from CAS1 and, in fact, build an arbitrary multi-word compare-and-swap (CASN). Our design requires only the primitives available on contemporary systems, reserves a small and constant amount of space in each word updated (either 0 or 2 bits) and permits nonoverlapping updates to occur concurrently. This provides compelling evidence that current primitives are not only universal in the theoretical sense introduced by Herlihy, but are also universal in their use as foundations for practical algorithms. This provides a straightforward mechanism for deploying many of the interesting non-blocking data structures presented in the literature that have previously required CAS2.
concurrency  non-blocking 
october 2017 by mpm
The Theory and Practice of Concurrency
Since C.A.R. Hoare’s text Communicating Sequential Processes was published in 1985, his notation has been extensively used for teaching and applying concurrency theory. This book is intended to provide a comprehensive text on CSP from the perspective that 12 more years of research and experience have brought.
csp  concurrency  book 
august 2017 by mpm
Concurrent and Real Time Systems: the CSP approach
This book provides an introduction to Communicating Sequential Processes (CSP) and its use as a formal method for concurrency
csp  concurrency  book 
august 2017 by mpm
Locking Timestamps Versus Locking Objects
We present multiversion timestamp locking (MVTL), a new genre of multiversion concurrency control algorithms for serializable transactions. The key idea behind MVTL is simple and novel: lock individual time points instead of locking objects or versions. After showing what a generic MVTL algorithm looks like, we demonstrate MVTL’s expressiveness: we present several simple MVTL algorithms that address limitations of current multiversion schemes, by committing transactions that previous schemes would abort, by avoiding the problem of serial aborts and ghost aborts, and by offering a way to prioritize transactions that should not be aborted.
concurrency  time 
july 2017 by mpm
What is SKIP LOCKED for in PostgreSQL 9.5?
The main utility of SKIP LOCKED is for building simple, reliable and efficient concurrent work queues
messaging  concurrency  database 
july 2017 by mpm
Memory Barriers: a Hardware View for Software Hackers
memory barriers are a necessary evil that is required to enable good performance and scalability, an evil that stems from the fact that CPUs are orders of magnitude faster than are both the interconnects between them and the memory they are attempting to access.
memory  concurrency 
july 2017 by mpm
growing fibers
there is a way to build powerful control abstractions that do compose. This article covers the first half of composing a concurrency facility out of a set of more basic primitives
concurrency  coroutines 
june 2017 by mpm
Rseq
Rseq is a userspace take on the proposed kernel restartable sequences API, and provides and mechanism to perform efficient per-cpu operations.
concurrency 
june 2017 by mpm
Concurrency
This library provides high level constructs for implementing algorithms that eases the use of multiple CPU cores while minimizing the contention.
c++  concurrency 
june 2017 by mpm
comdb2
Comdb2 is a clustered RDBMS built on Optimistic Concurrency Control techniques. It provides multiple isolation levels, including Snapshot and Serializable Isolation. Read/Write transactions run on any node, with the client library transparently negotiating connections to lowest cost (latency) node which is available. The client library provides transparent reconnect.
concurrency  database 
june 2017 by mpm
Obscure Synchronization Primitives
Every other synchronization primitive can be built from mutexes and condition variables, so those two are technically all that we require. However, in the search for performance, sometimes it is necessary to investigate how other primitives that might better-fit a given task may be created
concurrency  linux 
may 2017 by mpm
Eventcount (gate) proposal
I've implemented a sketch of eventcount for TBB. It can be used as replacement for Gate in current TBB scheduler, and/or it can be used as blocking/signalling logic in concurrent_queue, and/or it can be exposed as public API, and/or on top of it portable condition variable can be implemented and exposed as public API
concurrency 
april 2017 by mpm
BwTree
BwTree is a general purpose, concurrent and lock-free B+-Tree index. It allows for multiple threads modifying and/or querying the tree concurrently without corrupting the tree or giving inconsistent results. However, BwTree only guarantees atomicity of operations, and is not a concurrency control agent. If both atomicity and isolation is required, an extra concurrency control layer must be implemented
datastructure  non-blocking  concurrency 
march 2017 by mpm
RaftLib
RaftLib is a way to write sequential blocks of code that can be executed in a distributed parallel manner without all the fuss, boilerplate, and hassle of standard parallel code
concurrency  c++ 
march 2017 by mpm
Using JDK 9 Memory Order Modes
This guide is mainly intended for expert programmers familiar with Java concurrency, but unfamiliar with the memory order modes available in JDK 9 provided by VarHandles. Mostly, it focuses on how to think about modes when developing parallel software
java  concurrency  memory 
march 2017 by mpm
libbreakr
Catches use of non re-entrant glibc functions
concurrency  testing 
march 2017 by mpm
Shooting yourself in the head with threads
I though I would put together a list of things I have seen over the years of some obvious and some not so obvious issues when using threads on Linux and c or c++. At lot of these issues will also apply to systems based around POSIX as well.
linux  unix  c++  concurrency  time 
march 2017 by mpm
elle
Elle is a collection of libraries, written in modern C++ (C++14). It contains a rich set of highly reusable concepts, algorithms, API wrappers, ... Elle is split into different smaller specialized libraries to provide elegant ways to approach coroutine, networking, formatting, serialization, logging, RPCs, etc.
concurrency  networking  c++ 
february 2017 by mpm
q
A platform-independent promise library for C++, implementing asynchronous continuations.
c++  concurrency 
february 2017 by mpm
Nonblocking Concurrent Data Structures with Condition Synchronization
We use the term dual data structure to describe a concurrent object implementation that may hold both data and reservations (registered requests). By reasoning separately about a request, its successful follow-up, and the period in-between, we obtain meaningful definitions of nonblocking dual data structures. As concrete examples, we present lock-free dualstacks and dualqueues, and experimentally compare their performance with that of lock-based and nonblocking alternatives.
concurrency  datastructure  non-blocking 
february 2017 by mpm
Exponential Backoff And Jitter
The return on implementation complexity of using jittered backoff is huge, and it should be considered a standard approach for remote clients.
concurrency 
february 2017 by mpm
ugh: http server with libev and coroutines
Ugh is HTTP server, for which it's very simple to write module, which goes to different backends simultaneously, then does something with their response, then goes again to some other backends et cetera
http  proxy  concurrency 
february 2017 by mpm
uThreads
A concurrent user-level thread library implemented in C++
concurrency  c++ 
december 2016 by mpm
Handling Overload
In this text, I'm going to dig through the common overload patterns that can be encountered in Erlang with the regular workarounds available today for your systems. Do note, however, that the overall criticism of flow-control remain true in other concurrent platforms
erlang  concurrency 
november 2016 by mpm
immer
immer is a library of persistent and immutable data structures written in C++.
c++  datastructure  concurrency 
november 2016 by mpm
CRTurn Queue
The first MPMC memory-unbounded wait-free queue with memory reclamation
concurrency  c++  non-blocking 
november 2016 by mpm
URCU ReadersVersion in C++
On the previous post we showed a new Userspace RCU (URCU) algorithm that can be implemented with just a atomics and the Java/C11/C++1x memory model. Here is the implementation in C++
concurrency  c++ 
september 2016 by mpm
Towards a unified theory of Operational Transformation and CRDT
There are basically two opposing camps: Operational Transformation, which dates back to the 1989 Grove system and is the basis of most collaborative editors today, and the much newer Conflict-free Replicated Data Types (CRDT) approach. However, I’ve come to realize that there is a common theory which unifies at least a subclass of OT and CRDT’s. I believe the two camps have potentially much to learn from each other.
crdt  concurrency 
august 2016 by mpm
memory_order_consume
To get a understanding of the release-consume ordering, it's a good idea to compare it with the release-acquire ordering
c++  concurrency 
july 2016 by mpm
Encapsulation of asynchronous behaviour in distributed system of scripted autotests
This is an example how to create continuation-like monadic behaviour in C++11 (encapsulate async execution) which is used for specific example: writing asynchronous scripts for system of distributed autotests
c++  concurrency  testing 
july 2016 by mpm
Close Encounters of The Java Memory Model Kind
In this post, we will try to follow up on particular misunderstandings about Java Memory Model, hopefully on the practical examples
java  memory  concurrency 
june 2016 by mpm
RaftLib: C++ Stream Parallel Processing
RaftLib is a C++ Library for enabling stream/data-flow parallel computation. Using simple right shift operators (just like the C++ streams that you would use for string manipulation), you can link parallel compute kernels together. With RaftLib, we do away with explicit use of pthreads, std::thread, OpenMP, or any other parallel "threading" library
c++  concurrency  dataflow 
may 2016 by mpm
Locking in WebKit
Back in August 2015 we replaced all spinlocks and OS-provided mutexes in WebKit with the new WTF::Lock (WTF stands for Web Template Framework). We also replaced all OS-provided condition variables with WTF::Condition. These new primitives have some cool properties
concurrency  c++ 
may 2016 by mpm
Project Reactor
Reactor is a second-generation Reactive library for building non-blocking applications on the JVM based on the Reactive Streams Specification
concurrency  java 
april 2016 by mpm
timeout.c: Tickless Hierarchical Timing Wheel
timeout.c implements hierarchical timing wheels as described in "Hashed and Hierarchical Timing Wheels: Data Structures for the Efficient Implementation of a Timer Facility" by George Varghese and Tony Lauck, ACM 089791-242-X/87/0011/0025, pp. 25-38. This data structure implements timers in O(1) worst-case time for all operations including insertion, deletion, and expiration
c++  concurrency 
march 2016 by mpm
Why CSP matters
CSP is a mathematical language for representing processes that talk to each other. It allows you to model problems and then make assertions about those models. Modeling can be useful, as it helps you think about the problem, and assertions are grand. If you can make it work
concurrency 
march 2016 by mpm
windmill
Windmill is a Java 8 library for building high-performance applications using the "Thread-per-Core" or "Shared-Nothing" architecture
java  concurrency 
february 2016 by mpm
libcuckoo
libcuckoo provides a high-performance, compact hash table that allows multiple concurrent reader and writer threads
c++  concurrency  datastructure 
november 2015 by mpm
Seastar
Seastar is an advanced C++ framework for high-performance server applications on modern hardware.
c++  concurrency 
november 2015 by mpm
curio
Curio is a modern library for performing reliable concurrent I/O using Python coroutines and the explicit async/await syntax introduced in Python 3.5. Its programming model is based on cooperative multitasking and common system programming abstractions such as threads, sockets, files, subprocesses, locks, and queues. Under the covers, it is based on a task queuing system that is small, fast, and powerful
python  concurrency  coroutines 
november 2015 by mpm
scal
High-performance multicore-scalable data structures and benchmarks
c++  concurrency  performance  testing 
october 2015 by mpm
scalloc
The problem of concurrent memory allocation is to find the right balance between temporal and spatial performance and scalability across a large range of workloads. Our contributions to address this problem are: uniform treatment of small and big objects through the idea of virtual spans, efficiently and effectively reclaiming unused memory through fast and scalable global data structures, and constant-time (modulo synchronization) allocation and deallocation operations that trade off memory reuse and spatial locality without being subject to false sharing
memory  concurrency  performance 
october 2015 by mpm
Left-Right: A Concurrency Control Technique with Wait-Free Population Oblivious Reads
The easiest way to explain it is, to say it is a kind of Reader-Writer Lock that is Wait-Free for Readers, but in fact, it is much more than that.

Just in case you missed it the first time, this means it does not block Readers... or put in another way, a Writer and multiple Readers can simultaneous execute, and you don't have to worry about atomicity, or memory management, or invariants... and it's wait-free for Readers!
concurrency 
october 2015 by mpm
NUMA-Aware Reader-Writer Locks
Non-Uniform Memory Access (NUMA) architectures are gaining importance in mainstream computing systems due to the rapid growth of multi-core multi-chip machines. Extracting the best possible performance from these new machines will require us to revisit the design of the concurrent algorithms and synchronization primitives which form the building blocks of many of today’s applications. This paper revisits one such critical synchronization primitive – the reader-writer lock. We present what is, to the best of our knowledge, the first family of reader-writer lock algorithms tailored to NUMA architectures. We present several variations which trade fairness between readers and writers for higher concurrency among readers and better back-to-back batching of writers from the same NUMA node. Our algorithms leverage the lock cohorting technique to manage synchronization between writers in a NUMA-friendly fashion, binary flags to coordinate readers and writers, and simple distributed reader counter implementations to enable NUMA-friendly concurrency among readers. The end result is a collection of surprisingly simple NUMA-aware algorithms that outperform the state-of-theart reader-writer locks by up to a factor of 10 in our microbenchmark experiments. To evaluate our algorithms in a realistic setting we also present performance results of the kccachetest benchmark of the Kyoto-Cabinet distribution, an open-source database which makes heavy use of pthread reader-writer locks. Our locks boost the performance of kccachetest by up to 40 % over the best prior alternatives
concurrency  performance 
october 2015 by mpm
SuperMalloc
A Super Fast Multithreaded malloc() for 64-bit Machines
memory  concurrency 
september 2015 by mpm
folly/MPMCQueue.h
MPMCQueue<T> is a high-performance bounded concurrent queue that supports multiple producers, multiple consumers, and optional blocking.
c++  concurrency 
september 2015 by mpm
libmill
Libmill is a library that introduces Go-style concurrency to C
concurrency  coroutines  go 
august 2015 by mpm
CppMem: Interactive C/C++ memory model
Visualize C++ memory_order constraints
c++  concurrency  memory 
july 2015 by mpm
The Adaptive Priority Queue with Elimination and Combining
Priority queues are fundamental abstract data structures, often used to manage limited resources in parallel programming. Several proposed parallel priority queue implementations are based on skiplists, harnessing the potential for parallelism of the add() operations. In addition, methods such as Flat Combining have been proposed to reduce contention by batching together multiple operations to be executed by a single thread. While this technique can decrease lock-switching overhead and the number of pointer changes required by the removeMin() operations in the priority queue, it can also create a sequential bottleneck and limit parallelism, especially for non-conflicting add() operations. In this paper, we describe a novel priority queue design, harnessing the scalability of parallel insertions in conjunction with the efficiency of batched removals. Moreover, we present a new elimination algorithm suitable for a priority queue, which further increases concurrency on balanced workloads with similar numbers of add() and removeMin() operations. We implement and evaluate our design using a variety of techniques including locking, atomic operations, hardware transactional memory, as well as employing adaptive heuristics given the workload.
concurrency  datastructure 
april 2015 by mpm
ASCYLIB
ASCYLIB is a concurrent-search data-structure (CSDS) library. It contains over 30 implementations of linked lists, hash tables, skip lists, and binary search trees (BST). ASCYLIB contains sequential, lock-based, and lock-free implementations for each data structure.
concurrency  datastructure 
april 2015 by mpm
Comparative Performance of Memory Reclamation Strategies for Lock-free and Concurrently-readable Data Structures
We compare the costs of three memory reclamation strategies: quiescent-state-based reclamation, epoch-based reclamation, and safe memory reclamation. Our experiments show that changing the workload or execution environment can change which of these schemes is the most efficient.
memory  concurrency  non-blocking 
march 2015 by mpm
cpp11-on-multicore
Various synchronization primitives for multithreaded applications in C++11
c++  concurrency 
march 2015 by mpm
Fast Bounded-Concurrency Hash Tables
This article introduces a general technique for achieving single-writer many-reader lock-free and wait-free hash tables at low to negligible cost. The resulting hash table requires no barriers (fences) or locked instructions on architectures such as x86/x86-64. Read operations are lock-free and write operations are wait-free.
concurrency  non-blocking 
march 2015 by mpm
Exponential Backoff And Jitter
The solution isn’t to remove backoff. It’s to add jitter. Initially, jitter may appear to be a counter-intuitive idea: trying to improve the performance of a system by adding randomness. The time series above makes a great case for jitter – we want to spread out the spikes to an approximately constant rate.
performance  concurrency 
march 2015 by mpm
libmtev
a toolkit for building high-performance servers
performance  networking  concurrency 
february 2015 by mpm
JCTools
Java Concurrency Tools for the JVM. This project aims to offer some concurrent data structures currently missing from the JDK:
concurrency  java  data-structures  non-blocking 
january 2015 by mpm
Thin Locks: Featherweight Synchronization for Java
Language-supported synchronization is a source of serious performance problems in many Java programs. Even singlethreaded applications may spend up to half their time performing useless synchronization due to the thread-safe nature of the Java libraries. We solve this performance problem with a new algorithm that allows lock and unlock operations to be performed with only a few machine instructions in the most common cases. Our locks only require a partial word per object, and were implemented without increasing object size. We present measurements from our implementation in the JDK 1.1.2 for AIX, demonstrating speedups of up to a factor of 5 in micro-benchmarks and up to a factor of 1.7 in real programs. 1 Introduction Monitors [5] are a language-level construct for providing mutually exclusive access to shared data structures in a multithreaded environment. However, the overhead required by the necessary locking has generally restricted their use to relatively "heavy-weight" object
concurrency 
january 2015 by mpm
« earlier      
per page:    204080120160

Copy this bookmark:



description:


tags: