nhaliday + concurrency   38

algorithm - Skip List vs. Binary Search Tree - Stack Overflow
Skip lists are more amenable to concurrent access/modification. Herb Sutter wrote an article about data structures in concurrent environments. It has more in-depth information.

The most frequently used implementation of a binary search tree is a red-black tree. The concurrency problems arise because, when the tree is modified, it often needs to rebalance. The rebalance operation can affect large portions of the tree, which would require a mutex lock on many of the tree nodes. Inserting a node into a skip list is far more localized: only the nodes directly linked to the affected node need to be locked.
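A minimal sketch of that locality argument, using a single-level sorted linked list with a lock per node as a stand-in for one level of a skip list (hand-over-hand locking is an assumption here, not the only scheme; a real skip list repeats this at each level):

```python
import threading


class Node:
    def __init__(self, key):
        self.key = key
        self.next = None
        self.lock = threading.Lock()


class ConcurrentSortedList:
    """One level of a skip list: insertion locks only nodes near the insertion point."""

    def __init__(self):
        self.head = Node(float("-inf"))  # sentinel, never removed

    def insert(self, key):
        pred = self.head
        pred.lock.acquire()
        curr = pred.next
        # Hand-over-hand: hold at most two locks at a time while walking.
        while curr is not None and curr.key < key:
            curr.lock.acquire()
            pred.lock.release()
            pred, curr = curr, curr.next
        # Only the predecessor is locked while we splice in the new node;
        # the rest of the list remains available to other threads.
        node = Node(key)
        node.next = curr
        pred.next = node
        pred.lock.release()

    def keys(self):
        out, n = [], self.head.next
        while n:
            out.append(n.key)
            n = n.next
        return out
```

Contrast with a red-black tree, where a single insert can trigger rotations and recolorings all the way up to the root, so a lock-based implementation may need to hold locks along that whole path.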
q-n-a  stackex  nibble  programming  tcs  data-structures  performance  concurrency  comparison  cost-benefit  applicability-prereqs  random  trees 
3 days ago by nhaliday
Cilk Hub
looks like this is run by Billy Moses and Leiserson (the L in CLRS)
mit  tools  programming  pls  plt  systems  c(pp)  libraries  compilers  performance  homepage  concurrency 
18 days ago by nhaliday
x86 - What does multicore assembly language look like? - Stack Overflow
This isn't a direct answer to the question, but it's an answer to a question that appears in the comments. Essentially, the question is what support the hardware gives to multi-threaded operation.

Nicholas Flynt had it right, at least regarding x86. In a multi-threaded environment (hyper-threading, multi-core, or multi-processor), the bootstrap thread (usually thread 0 in core 0 in processor 0) starts up fetching code from address 0xfffffff0. All the other threads start up in a special sleep state called Wait-for-SIPI (WFS). As part of its initialization, the primary thread sends a special inter-processor interrupt (IPI) over the APIC called a SIPI (Startup IPI) to each thread that is in WFS. The SIPI contains the address from which that thread should start fetching code.

This mechanism allows each thread to execute code from a different address. All that's needed is software support for each thread to set up its own tables and messaging queues. The OS uses those to do the actual multi-threaded scheduling.

As far as the actual assembly is concerned, as Nicholas wrote, there's no difference between the assembly for a single-threaded or multi-threaded application. Each logical thread has its own register set, so writing:

mov edx, 0
will only update EDX for the currently running thread. There's no way to modify EDX on another processor using a single assembly instruction. You need some sort of system call to ask the OS to tell another thread to run code that will update its own EDX.
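The same idea can be illustrated at a higher level (a Python analogy, not assembly, and an assumption of this note rather than part of the answer): a thread's local variables play the role of its private register set, and a message queue plays the role of the OS-mediated request that asks another thread to update its own state.

```python
import queue
import threading


def worker(requests, results):
    edx = 0  # "register": private to this thread, like EDX per logical core
    while True:
        msg = requests.get()  # the only way another thread can influence us
        if msg is None:       # sentinel: shut down
            break
        edx = msg             # the thread updates its *own* "register"
        results.put(edx)


requests, results = queue.Queue(), queue.Queue()
t = threading.Thread(target=worker, args=(requests, results))
t.start()

requests.put(42)            # ask the other thread to set its own "EDX"
value = results.get()       # it reports back what it stored

requests.put(None)          # shut the worker down
t.join()
```

No thread ever writes the other's local variable directly; it can only send a request, which mirrors needing a system call (or IPI) rather than a single instruction to affect another core's registers.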
q-n-a  stackex  programming  nitty-gritty  systems  assembly  concurrency  init 
20 days ago by nhaliday
What is "vectorization"? - Stack Overflow
Many CPUs have "vector" or "SIMD" instruction sets which apply the same operation simultaneously to two, four, or more pieces of data. Modern x86 chips have the SSE instructions, many PPC chips have the "Altivec" instructions, and even some ARM chips have a vector instruction set, called NEON.

"Vectorization" (simplified) is the process of rewriting a loop so that instead of processing a single element of an array N times, it processes (say) 4 elements of the array simultaneously N/4 times.

(I chose 4 because it's what modern hardware is most likely to directly support; the term "vectorization" is also used to describe a higher level software transformation where you might just abstract away the loop altogether and just describe operating on arrays instead of the elements that comprise them)
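The loop transformation itself can be sketched in pure Python (the "lanes" here are just a list slice; on real hardware each 4-wide chunk would be a single SIMD instruction such as SSE `addps`, so this sketch shows the restructuring, not the speedup):

```python
def add_scalar(a, b):
    """Scalar loop: one element per iteration, N iterations."""
    out = [0] * len(a)
    for i in range(len(a)):
        out[i] = a[i] + b[i]
    return out


def add_vectorized(a, b, width=4):
    """The restructured loop a vectorizer produces: `width` lanes per
    iteration, roughly N/width iterations, plus a scalar tail loop for
    the leftover elements when N is not a multiple of `width`."""
    n = len(a)
    out = [0] * n
    i = 0
    while i + width <= n:
        # On real hardware these `width` additions happen simultaneously.
        out[i:i + width] = [a[i + j] + b[i + j] for j in range(width)]
        i += width
    while i < n:  # scalar tail
        out[i] = a[i] + b[i]
        i += 1
    return out
```

The higher-level sense of "vectorization" mentioned above corresponds to dropping the loop entirely and writing the operation over whole arrays (as NumPy does).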
q-n-a  stackex  programming  systems  performance  concurrency  numerics 
26 days ago by nhaliday
I don't understand Python's Asyncio | Armin Ronacher's Thoughts and Writings
Man that thing is complex and it keeps getting more complex. I do not have the mental capacity to casually work with asyncio. It requires constantly updating your knowledge as the language changes, and it has tremendously complicated the language. It's impressive that an ecosystem is evolving around it, but I can't help getting the impression that it will take quite a few more years for it to become a particularly enjoyable and stable development experience.

What landed in 3.5 (the actual new coroutine objects) is great. In particular, with the changes coming up, there is a sensible base that I wish had been in earlier versions. The entire mess of overloading generators to be coroutines was a mistake in my mind. With regard to what's in asyncio, I'm not sure of anything. It's an incredibly complex thing and super messy internally. It's hard to comprehend how it works in all its details: when you can pass a generator, when it has to be a real coroutine, what futures are, what tasks are, how the loop works; and that doesn't even get to the actual I/O part.

The worst part is that asyncio is not even particularly fast. David Beazley's live-demo hacked-up asyncio replacement is twice as fast as it. There is an enormous amount of complexity that's hard to understand and reason about, and then it fails on its main promise. I'm not sure what to think about it, but I know at least that I don't understand asyncio enough to feel confident about giving people advice about how to structure code for it.
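For reference, the 3.5-era pieces the post names (native `async def` coroutines, tasks wrapping them, and the loop driving both) in their minimal form; this is a trivial sketch, not a claim about asyncio's internals:

```python
import asyncio


async def fetch(name, delay):
    # A native coroutine: the post-3.5 syntax that replaced
    # generators decorated with @asyncio.coroutine.
    await asyncio.sleep(delay)
    return name


async def main():
    # ensure_future wraps each coroutine in a Task, scheduling it
    # on the running event loop; gather awaits them all.
    tasks = [asyncio.ensure_future(fetch(n, 0.01)) for n in ("a", "b")]
    return await asyncio.gather(*tasks)


results = asyncio.run(main())
```

Even this small example touches four distinct concepts (coroutine, task, future, loop), which is part of the complexity the post is complaining about.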
python  libraries  review  concurrency  programming  pls  rant  🖥  techtariat  intricacy 
october 2016 by nhaliday
