jm + threading   12

"Understanding Real-World Concurrency Bugs in Go" (paper)
'Go advocates for the usage of message passing as the means of inter-thread communication and provides several new concurrency mechanisms and libraries to ease multi-threading programming. It is important to understand the implication of these new proposals and the comparison of message passing and shared memory synchronization in terms of program errors, or bugs. Unfortunately, as far as we know, there has been no study on Go’s concurrency bugs. In this paper, we perform the first systematic study on concurrency bugs in real Go programs. We studied six popular Go software including Docker, Kubernetes, and gRPC.

We analyzed 171 concurrency bugs in total, with more than half of them caused by non-traditional, Go-specific problems. Apart from root causes of these bugs, we also studied their fixes, performed experiments to reproduce them, and evaluated them with two publicly-available Go bug detectors.
Overall, our study provides a better understanding on Go’s concurrency models and can guide future researchers and practitioners in writing better, more reliable Go software and in developing debugging and diagnosis tools for Go.'

(via Bill de hOra)
via:dehora  golang  go  concurrency  bugs  lint  synchronization  threading  threads  bug-detection 
19 days ago by jm
Orbit Async
Orbit Async implements async-await methods in the JVM. It allows programmers to write asynchronous code in a sequential fashion. It was developed by BioWare, a division of Electronic Arts.

Open source, BSD-licensed.
async  await  java  jvm  bioware  coding  threading 
june 2015 by jm
ExecutorService - 10 tips and tricks
Excellent advice from Tomasz Nurkiewicz' blog for anyone using java.util.concurrent.ExecutorService regularly. The whole blog is full of great posts, btw.
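
A rough sketch of general ExecutorService hygiene, not code from the post: named threads, a bounded queue and an explicit shutdown. The class name, the "worker" prefix, the pool sizes and the caller-runs rejection policy are all my own illustrative assumptions.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class NamedBoundedPool {
    // Hypothetical helper: fixed-size pool with named threads and a bounded work queue.
    static ExecutorService newPool(String prefix, int threads, int queueCapacity) {
        AtomicInteger seq = new AtomicInteger();
        ThreadFactory factory = r -> {
            // Named threads make thread dumps and logs much easier to read.
            Thread t = new Thread(r, prefix + "-" + seq.incrementAndGet());
            t.setDaemon(true);
            return t;
        };
        return new ThreadPoolExecutor(
                threads, threads,
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity),      // bounded: no unbounded backlog
                factory,
                new ThreadPoolExecutor.CallerRunsPolicy());   // assumption: push back on the submitter when full
    }

    // Submits one task and returns the name of the pool thread that ran it.
    static String runOnce(String prefix) throws Exception {
        ExecutorService pool = newPool(prefix, 2, 100);
        try {
            Callable<String> task = () -> Thread.currentThread().getName();
            Future<String> result = pool.submit(task);
            return result.get(5, TimeUnit.SECONDS);
        } finally {
            // Explicit shutdown: stop accepting work, then wait for in-flight tasks.
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("task ran on thread " + runOnce("worker"));
    }
}
```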
concurrency  java  jvm  threading  threads  executors  coding 
november 2014 by jm
Google's purify/valgrind-like concurrency checking tool:

'As a bonus, ThreadSanitizer finds some other types of bugs: thread leaks, deadlocks, incorrect uses of mutexes, malloc calls in signal handlers, and more. It also natively understands atomic operations and thus can find bugs in lock-free algorithms. [...] The tool is supported by both Clang and GCC compilers (only on Linux/Intel64). Using it is very simple: you just need to add a -fsanitize=thread flag during compilation and linking. For Go programs, you simply need to add a -race flag to the go tool (supported on Linux, Mac and Windows).'
concurrency  bugs  valgrind  threadsanitizer  threading  deadlocks  mutexes  locking  synchronization  coding  testing 
june 2014 by jm
Why Disqus made the Python->Go switchover
for their realtime component, from the horse's mouth:
at higher contention, the CPU was choking everything. Switching over to Go removed that contention for us, which was the primary issue that we were seeing.
python  languages  concurrency  go  threading  gevent  scalability  disqus  realtime  hn 
may 2014 by jm
'Lightweight performance tools'.
Likwid stands for 'Like I knew what I am doing'. This project contributes easy to use command line tools for Linux to support programmers in developing high performance multi-threaded programs. It contains the following tools:

likwid-topology: Show the thread and cache topology
likwid-perfctr: Measure hardware performance counters on Intel and AMD processors
likwid-features: Show and Toggle hardware prefetch control bits on Intel Core 2 processors
likwid-pin: Pin your threaded application without touching your code (supports pthreads, Intel OpenMP and gcc OpenMP)
likwid-bench: Benchmarking framework allowing rapid prototyping of threaded assembly kernels
likwid-mpirun: Script enabling simple and flexible pinning of MPI and MPI/threaded hybrid applications
likwid-perfscope: Frontend for likwid-perfctr timeline mode. Allows live plotting of performance metrics.
likwid-powermeter: Tool for accessing RAPL counters and query Turbo mode steps on Intel processor.
likwid-memsweeper: Tool to cleanup ccNUMA memory domains.

No kernel patching required. (via kellabyte)
via:kellabyte  linux  performance  testing  perf  likwid  threading  multithreading  multicore  mpi  numa 
january 2014 by jm
Safe cross-thread publication of a non-final variable in the JVM
Scary, but potentially useful in future, so worth bookmarking. By carefully orchestrating memory accesses using volatile and non-volatile fields, one can ensure that a non-volatile, non-synchronized field's value is safely visible to all threads after that point due to JMM barrier semantics.

What you are looking to do is enforce a barrier between your initializing stores and your publishing store, without that publishing store being made to a volatile field. This can be done by using volatile access to other fields in the publication path, without using those variables in the later access paths to the published object.
volatile  atomic  java  jvm  gil-tene  synchronization  performance  threading  jmm  memory-barriers 
january 2014 by jm
A Non-Blocking HashTable by Dr. Cliff Click : programming
Proggit discovers the NonBlockingHashMap. This comment from Boundary's cscotta is particularly interesting: "The code is intricate and curiously-formatted, but NBHM is quite excellent. The majority of our analytics platform is backed by NBHMs updated rapidly in parallel. Cliff's a great, friendly, approachable guy; if you have any specific questions about the approaches or implementation, he may be happy to answer."
data-structures  algorithms  non-blocking  concurrency  threading  multicore  cliff-click  azul  maps  java  boundary 
january 2013 by jm
Locks & Condition Variables - Latency Impact

Firstly, this is 3 orders of magnitude greater latency than what I illustrated in the previous article using just memory barriers to signal between threads. This cost comes about because the kernel needs to get involved to arbitrate between the threads for the lock, and then manage the scheduling for the threads to awaken when the condition is signalled. The one-way latency to signal a change is pretty much the same as what is considered current state of the art for network hops between nodes via a switch. It is possible to get ~1µs latency with InfiniBand and less than 5µs with 10GigE and user-space IP stacks.

Secondly, the impact is clear when letting the OS choose what CPUs the threads get scheduled on rather than pinning them manually. I've observed this same issue across many use cases whereby Linux, in default configuration for its scheduler, will greatly impact the performance of a low-latency system by scheduling threads on different cores resulting in cache pollution. Windows by default seems to make a better job of this.
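
To get a feel for this kind of measurement, here is a minimal sketch of a two-thread ping-pong over a ReentrantLock and a Condition. It is not the article's benchmark: the class and method names are made up, there is no warm-up or thread pinning, and the numbers it prints depend heavily on the machine and scheduler.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class ConditionPingPong {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition changed = lock.newCondition();
    private long turn = 0; // guarded by lock; even = pinger's move, odd = ponger's move

    // Waits until 'turn' has the given parity, then advances it and signals the other thread.
    private void step(int parity) throws InterruptedException {
        lock.lock();
        try {
            while ((turn & 1) != parity) {
                changed.await();
            }
            turn++;
            changed.signal();
        } finally {
            lock.unlock();
        }
    }

    // Runs the ping-pong; returns {final value of turn, average nanoseconds per one-way signal}.
    static long[] run(int roundTrips) throws InterruptedException {
        ConditionPingPong pp = new ConditionPingPong();
        Thread ponger = new Thread(() -> {
            try {
                for (int i = 0; i < roundTrips; i++) {
                    pp.step(1);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        ponger.start();
        long start = System.nanoTime();
        for (int i = 0; i < roundTrips; i++) {
            pp.step(0);
        }
        ponger.join();
        long elapsed = System.nanoTime() - start;
        return new long[] { pp.turn, elapsed / (2L * roundTrips) };
    }

    public static void main(String[] args) throws InterruptedException {
        long[] result = run(100_000);
        System.out.println(result[0] + " signals, ~" + result[1] + " ns per one-way signal");
    }
}
```

Each one-way signal here goes through a lock acquisition plus an await/signal pair, which is the path the quoted passage says involves the kernel.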
locking  concurrency  java  jvm  signalling  locks  linux  threading 
september 2012 by jm
Martin "Disruptor" Thompson's Single Writer Principle
Contains these millisecond estimates for highly-contended inter-thread signalling when incrementing a 64-bit counter in java:
One Thread 300
One Thread with Memory Barrier 4,700
One Thread with CAS 5,700
Two Threads with CAS 18,000
One Thread with Lock 10,000
Two Threads with Lock 118,000

Undoubtedly not realistic for a lot of cases, but it's still useful for order-of-magnitude estimates of locking cost. Bottom line: avoid locking if you can, and bear in mind that even 'volatile' or AtomicFoo types are far from free.
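
For anyone who wants to poke at this themselves, a rough sketch of the same kind of comparison follows. It is not Thompson's benchmark code: the names and iteration count are assumptions of mine, there is no warm-up, and the JIT is free to optimise the plain loop away, so treat the output as a loose indication only. A proper harness such as JMH is the right tool for real numbers.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

public class CounterCosts {
    interface Work { long run(long n) throws Exception; }

    static volatile long volatileCounter;
    static long lockedCounter; // guarded by LOCK
    static final ReentrantLock LOCK = new ReentrantLock();

    // Plain local increment: the baseline (the JIT may collapse this loop entirely).
    static long plain(long n) {
        long c = 0;
        for (long i = 0; i < n; i++) c++;
        return c;
    }

    // Volatile increment from a single writer thread, so the non-atomic ++ is safe here.
    static long withVolatile(long n) {
        volatileCounter = 0;
        for (long i = 0; i < n; i++) volatileCounter++;
        return volatileCounter;
    }

    // Explicit compare-and-set retry loop.
    static void casIncrement(AtomicLong a) {
        long v;
        do {
            v = a.get();
        } while (!a.compareAndSet(v, v + 1));
    }

    // 'threads' threads each perform n CAS increments on one shared counter.
    static long withCas(long n, int threads) throws InterruptedException {
        AtomicLong counter = new AtomicLong();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (long i = 0; i < n; i++) casIncrement(counter);
            });
            workers[t].start();
        }
        for (Thread w : workers) w.join();
        return counter.get();
    }

    // Single thread taking and releasing an uncontended lock around every increment.
    static long withLock(long n) {
        lockedCounter = 0;
        for (long i = 0; i < n; i++) {
            LOCK.lock();
            try {
                lockedCounter++;
            } finally {
                LOCK.unlock();
            }
        }
        return lockedCounter;
    }

    static void report(String label, Work w, long n) throws Exception {
        long start = System.nanoTime();
        long count = w.run(n);
        long ms = (System.nanoTime() - start) / 1_000_000;
        System.out.println(label + ": " + count + " increments in " + ms + " ms");
    }

    public static void main(String[] args) throws Exception {
        long n = 10_000_000L; // assumption: small enough to finish quickly
        report("One thread", CounterCosts::plain, n);
        report("One thread with volatile", CounterCosts::withVolatile, n);
        report("One thread with CAS", k -> withCas(k, 1), n);
        report("Two threads with CAS", k -> withCas(k, 2), n);
        report("One thread with lock", CounterCosts::withLock, n);
    }
}
```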
java  jvm  performance  coding  concurrency  threading  cas  locking 
september 2012 by jm
Thousands of Threads and Blocking I/O [PDF]
Classic presentation from Paul Tyma of Mailinator comparing the java.nio (event-driven, non-threaded) model of server concurrency with the threaded, blocking I/O model, and backing up the scalability of threads on modern JVMs.
java  async  io  jvm  linux  performance  scalability  threading  threads  server  nio  paul-tyma  mailinator  from delicious
july 2010 by jm
