nhaliday + tcs   397

data structures - Why are Red-Black trees so popular? - Computer Science Stack Exchange
- AVL trees have smaller average depth than red-black trees, and thus searching for a value in an AVL tree is consistently faster.
- Red-black trees make fewer structural changes to balance themselves than AVL trees, which could make them potentially faster for insert/delete. I'm saying potentially, because this would depend on the cost of the structural change to the tree, which will depend a lot on the runtime and implementation (it might also be completely different in a functional language where the tree is immutable?)

There are many benchmarks online that compare AVL and Red-black trees, but what struck me is that my professor basically said, that usually you'd do one of two things:
- Either you don't really care that much about performance, in which case the 10-20% difference of AVL vs Red-black in most cases won't matter at all.
- Or you really care about performance, in which case you'd ditch both AVL and Red-black trees, and go with B-trees, which can be tweaked to work much better (or (a,b)-trees, I'm gonna put all of those in one basket.)

--

> For some kinds of binary search trees, including red-black trees but not AVL trees, the "fixes" to the tree can fairly easily be predicted on the way down and performed during a single top-down pass, making the second pass unnecessary. Such insertion algorithms are typically implemented with a loop rather than recursion, and often run slightly faster in practice than their two-pass counterparts.

So a red-black tree insert can be implemented without recursion. On some CPUs recursion is very expensive if you overrun the function-call cache (e.g. SPARC, due to its use of register windows).

--

There are some cases where you can't use B-trees at all.

One prominent case is std::map from the C++ STL. The standard requires that insert does not invalidate existing iterators.

...

I also believe that "single pass tail recursive" implementation is not the reason for red black tree popularity as a mutable data structure.

First of all, stack depth is irrelevant here, because (given log n height) you would run out of main memory before you ran out of stack space: a red-black tree's height is at most ~2 log2(n), so even 2^48 keys mean fewer than a hundred stack frames. jemalloc is happy with preallocating worst-case depth on the stack.
nibble  q-n-a  overflow  cs  algorithms  tcs  data-structures  functional  orders  trees  cost-benefit  tradeoffs  roots  explanans  impetus  performance  applicability-prereqs  programming  pls  c(pp)  ubiquity 
29 days ago by nhaliday
Bareiss algorithm - Wikipedia
During the execution of the Bareiss algorithm, every integer that is computed is the determinant of a submatrix of the input matrix. This allows one, using the Hadamard inequality, to bound the size of these integers. Otherwise, the Bareiss algorithm may be viewed as a variant of Gaussian elimination and needs roughly the same number of arithmetic operations.
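
A minimal sketch of the elimination step (my illustration, not from the article; it assumes all leading principal minors are nonzero and skips the pivoting a full implementation needs). The point is that every division is exact, so all intermediates stay integers:

```python
def bareiss_det(M):
    """Determinant of an integer matrix via Bareiss fraction-free elimination.

    Every intermediate A[i][j] is itself the determinant of a submatrix,
    so the division by the previous pivot below is always exact.
    """
    A = [row[:] for row in M]   # work on a copy
    n = len(A)
    prev_pivot = 1
    for k in range(n - 1):
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                # exact integer division: numerator divisible by prev_pivot
                A[i][j] = (A[i][j] * A[k][k] - A[i][k] * A[k][j]) // prev_pivot
        prev_pivot = A[k][k]
    return A[n - 1][n - 1]

print(bareiss_det([[2, 3, 1], [4, 1, 5], [6, 2, 2]]))   # 52
```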
nibble  ground-up  cs  tcs  algorithms  complexity  linear-algebra  numerics  sci-comp  fields  calculation  nitty-gritty 
5 weeks ago by nhaliday
What's the expected level of paper for top conferences in Computer Science - Academia Stack Exchange
Top. The top level.

My experience on program committees for STOC, FOCS, ITCS, SODA, SOCG, etc., is that there are FAR more submissions of publishable quality than can be accepted into the conference. By "publishable quality" I mean a well-written presentation of a novel, interesting, and non-trivial result within the scope of the conference.

...

There are several questions that come up over and over in the FOCS/STOC review cycle:

- How surprising / novel / elegant / interesting is the result?
- How surprising / novel / elegant / interesting / general are the techniques?
- How technically difficult is the result? Ironically, FOCS and STOC committees have a reputation for ignoring the distinction between trivial (easy to derive from scratch) and nondeterministically trivial (easy to understand after the fact).
- What is the expected impact of this result? Is this paper going to change the way people do theoretical computer science over the next five years?
- Is the result of general interest to the theoretical computer science community? Or is it only of interest to a narrow subcommunity? In particular, if the topic is outside the STOC/FOCS mainstream—say, for example, computational topology—does the paper do a good job of explaining and motivating the results to a typical STOC/FOCS audience?
nibble  q-n-a  overflow  academia  tcs  cs  meta:research  publishing  scholar  lens  properties  cost-benefit  analysis  impetus  increase-decrease  soft-question  motivation  proofs  search  complexity  analogy  problem-solving  elegance  synthesis  hi-order-bits  novelty  discovery 
6 weeks ago by nhaliday
Analysis of Current and Future Computer Science Needs via Advertised Faculty Searches for 2019 - CRN
Differences are also seen when analyzing results based on the type of institution. Positions related to Security have the highest percentages for all but top-100 institutions. The area of Artificial Intelligence/Data Mining/Machine Learning is of most interest for top-100 PhD institutions. Roughly 35% of positions for PhD institutions are in data-oriented areas. The results show a strong interest in data-oriented areas by public PhD and private PhD, MS, and BS institutions while public MS and BS institutions are most interested in Security.
org:edu  data  analysis  visualization  trends  recruiting  jobs  career  planning  academia  higher-ed  cs  tcs  machine-learning  systems  pro-rata  measure  long-term  🎓  uncertainty  progression  grad-school  phd  distribution  ranking  top-n  security  status  s-factor  comparison  homo-hetero  correlation  org:ngo  white-paper 
7 weeks ago by nhaliday
algorithm - Skip List vs. Binary Search Tree - Stack Overflow
Skip lists are more amenable to concurrent access/modification. Herb Sutter wrote an article about data structures in concurrent environments; it has more in-depth information.

The most frequently used implementation of a binary search tree is a red-black tree. The concurrency problems come in because, when the tree is modified, it often needs to rebalance. The rebalance operation can affect large portions of the tree, which would require a mutex lock on many of the tree nodes. Inserting a node into a skip list is far more localized: only nodes directly linked to the affected node need to be locked.
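
A single-threaded sketch of skip-list insertion (my illustration; the class names, MAX_LEVEL, and the 1/2 coin bias are arbitrary choices) that makes the locality visible: the write set is exactly the predecessor nodes gathered in `update`, which is what a fine-grained locking scheme would lock.

```python
import random

class Node:
    def __init__(self, key, height):
        self.key = key
        self.forward = [None] * height   # one next-pointer per level

class SkipList:
    MAX_LEVEL = 16

    def __init__(self):
        self.head = Node(None, self.MAX_LEVEL)   # sentinel, full height
        self.level = 1

    def insert(self, key):
        # Collect the predecessor of `key` at every level. These are the
        # ONLY nodes an insert writes to -- the locality the answer cites.
        update = [self.head] * self.MAX_LEVEL
        node = self.head
        for lvl in range(self.level - 1, -1, -1):
            while node.forward[lvl] is not None and node.forward[lvl].key < key:
                node = node.forward[lvl]
            update[lvl] = node

        # geometric coin flips pick the new node's height (p = 1/2)
        height = 1
        while height < self.MAX_LEVEL and random.random() < 0.5:
            height += 1
        self.level = max(self.level, height)

        new = Node(key, height)
        for lvl in range(height):   # splice: touches update[0..height) only
            new.forward[lvl] = update[lvl].forward[lvl]
            update[lvl].forward[lvl] = new

sl = SkipList()
for k in [5, 1, 9, 3]:
    sl.insert(k)
```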
q-n-a  stackex  nibble  programming  tcs  data-structures  performance  concurrency  comparison  cost-benefit  applicability-prereqs  random  trees  tradeoffs 
9 weeks ago by nhaliday
Complexity no Bar to AI - Gwern.net
Critics of AI risk suggest diminishing returns to computing (formalized asymptotically) means AI will be weak; this argument relies on a large number of questionable premises and ignoring additional resources, constant factors, and nonlinear returns to small intelligence advantages, and is highly unlikely. (computer science, transhumanism, AI, R)
created: 1 June 2014; modified: 01 Feb 2018; status: finished; confidence: likely; importance: 10
ratty  gwern  analysis  faq  ai  risk  speedometer  intelligence  futurism  cs  computation  complexity  tcs  linear-algebra  nonlinearity  convexity-curvature  average-case  adversarial  article  time-complexity  singularity  iteration-recursion  magnitude  multiplicative  lower-bounds  no-go  performance  hardware  humanity  psychology  cog-psych  psychometrics  iq  distribution  moments  complement-substitute  hanson  ems  enhancement  parable  detail-architecture  universalism-particularism  neuro  ai-control  environment  climate-change  threat-modeling  security  theory-practice  hacker  academia  realness  crypto  rigorous-crypto  usa  government 
april 2018 by nhaliday
If Quantum Computers are not Possible Why are Classical Computers Possible? | Combinatorics and more
As most of my readers know, I regard quantum computing as unrealistic. You can read more about it in my Notices AMS paper and its extended version (see also this post) and in the discussion of Puzzle 4 from my recent puzzles paper (see also this post). The amazing progress and huge investment in quantum computing (which I present and update routinely in this post) will put my analysis to the test in the next few years.
tcstariat  mathtariat  org:bleg  nibble  tcs  cs  computation  quantum  volo-avolo  no-go  contrarianism  frontier  links  quantum-info  analogy  comparison  synthesis  hi-order-bits  speedometer  questions  signal-noise 
november 2017 by nhaliday
Rank aggregation basics: Local Kemeny optimisation | David R. MacIver
This turns our problem from a global search to a local one: basically we can start from any point in the search space and search locally by swapping adjacent pairs until we hit a minimum. This turns out to be quite easy to do. _We basically run insertion sort_: at step n we have the first n items in a locally Kemeny optimal order. Swap the (n+1)th item backwards until the majority think its predecessor is < it. This ensures all adjacent pairs are in the majority order, so swapping them would result in a Kemeny score that is greater than or equal. This is of course an O(n^2) algorithm. In fact, the problem of merely finding a locally Kemeny optimal solution can be done in O(n log(n)) (for much the same reason as you can sort better than insertion sort): you just take the directed graph of majority votes and find a Hamiltonian path. The nice thing about the above version of the algorithm is that it gives you a lot of control over where you start your search.
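
A sketch of that insertion-sort pass (my illustration, not MacIver's code; `majority_prefers_factory` is a hypothetical helper that derives the majority relation from ballots):

```python
def majority_prefers_factory(ballots):
    """ballots: list of rankings (lists of items, best first)."""
    def majority_prefers(a, b):
        # True if a strict majority of ballots rank a above b
        wins = sum(r.index(a) < r.index(b) for r in ballots)
        return wins > len(ballots) / 2
    return majority_prefers

def locally_kemeny_optimal(items, majority_prefers):
    """Insertion sort under the (possibly intransitive) majority relation.

    After each step the prefix has every adjacent pair in majority order,
    which is exactly local Kemeny optimality: swapping any adjacent pair
    can only raise (or tie) the Kemeny score.
    """
    order = []
    for item in items:
        order.append(item)
        i = len(order) - 1
        # swap backwards until the predecessor beats the item by majority
        while i > 0 and majority_prefers(order[i], order[i - 1]):
            order[i], order[i - 1] = order[i - 1], order[i]
            i -= 1
    return order

# Condorcet cycle: a>b and b>c by 2/3 majorities, yet also c>a;
# the output is still locally optimal even though no global order exists
ballots = [["a", "b", "c"], ["b", "c", "a"], ["c", "a", "b"]]
print(locally_kemeny_optimal(["a", "b", "c"], majority_prefers_factory(ballots)))
```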
techtariat  liner-notes  papers  tcs  algorithms  machine-learning  acm  optimization  approximation  local-global  orders  graphs  graph-theory  explanation  iteration-recursion  time-complexity  nibble 
september 2017 by nhaliday
Correlated Equilibria in Game Theory | Azimuth
Given this, it’s not surprising that Nash equilibria can be hard to find. Last September a paper came out making this precise, in a strong way:

• Yakov Babichenko and Aviad Rubinstein, Communication complexity of approximate Nash equilibria.

The authors show there's no guaranteed method for players to find even an approximate Nash equilibrium unless they tell each other almost everything about their preferences. This makes the Nash equilibrium prohibitively difficult to find when there are lots of players… in general. There are particular games where it's not difficult, and that makes these games important: for example, if you're trying to run a government well. (A laughable notion these days, but still one can hope.)

Klarreich’s article in Quanta gives a nice readable account of this work and also a more practical alternative to the concept of Nash equilibrium. It’s called a ‘correlated equilibrium’, and it was invented by the mathematician Robert Aumann in 1974. You can see an attempt to define it here:
baez  org:bleg  nibble  mathtariat  commentary  summary  news  org:mag  org:sci  popsci  equilibrium  GT-101  game-theory  acm  conceptual-vocab  concept  definition  thinking  signaling  coordination  tcs  complexity  communication-complexity  lower-bounds  no-go  liner-notes  big-surf  papers  research  algorithmic-econ  volo-avolo 
july 2017 by nhaliday
Talks
Quantum Supremacy: Office of Science and Technology Policy QIS Forum, Eisenhower Executive Office Building, White House Complex, Washington DC, October 18, 2016. Another version at UTCS Faculty Lunch, October 26, 2016. Another version at UT Austin Physics Colloquium, Austin, TX, November 9, 2016.

Complexity-Theoretic Foundations of Quantum Supremacy Experiments: Quantum Algorithms Workshop, Aspen Center for Physics, Aspen, CO, March 25, 2016

When Exactly Do Quantum Computers Provide A Speedup?: Yale Quantum Institute Seminar, Yale University, New Haven, CT, October 10, 2014. Another version at UT Austin Physics Colloquium, Austin, TX, November 19, 2014; Applied and Interdisciplinary Mathematics Seminar, Northeastern University, Boston, MA, November 25, 2014; Hebrew University Physics Colloquium, Jerusalem, Israel, January 5, 2015; Computer Science Colloquium, Technion, Haifa, Israel, January 8, 2015; Stanford University Physics Colloquium, January 27, 2015
tcstariat  aaronson  tcs  complexity  quantum  quantum-info  talks  list  slides  accretion  algorithms  applications  physics  nibble  frontier  computation  volo-avolo  speedometer  questions 
may 2017 by nhaliday
Peter Norvig, the meaning of polynomials, debugging as psychotherapy | Quomodocumque
He briefly showed a demo where, given values of a polynomial, a machine can put together a few lines of code that successfully computes the polynomial. But the code looks weird to a human eye. To compute some quadratic, it nests for-loops and adds things up in a funny way that ends up giving the right output. So has it really “learned” the polynomial? I think in computer science, you typically feel you’ve learned a function if you can accurately predict its value on a given input. For an algebraist like me, a function determines but isn’t determined by the values it takes; to me, there’s something about that quadratic polynomial the machine has failed to grasp. I don’t think there’s a right or wrong answer here, just a cultural difference to be aware of. Relevant: Norvig’s description of “the two cultures” at the end of this long post on natural language processing (which is interesting all the way through!)
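
A hypothetical illustration of the flavor of synthesized code described (not Norvig's actual output): nested loops that only ever add 1, yet compute a quadratic.

```python
def f(n):
    # hypothetical reconstruction of the kind of code the demo produced:
    # opaque nested loops that happen to compute the quadratic n^2 + n
    total = 0
    for i in range(n):
        for j in range(n):
            total += 1   # contributes n*n over the whole run
        total += 1       # contributes n
    return total         # = n**2 + n for every n >= 0
```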
mathtariat  org:bleg  nibble  tech  ai  talks  summary  philosophy  lens  comparison  math  cs  tcs  polynomials  nlp  debugging  psychology  cog-psych  complex-systems  deep-learning  analogy  legibility  interpretability  debuggin  composition-decomposition  coupling-cohesion 
march 2017 by nhaliday
Equivalence between counting and sampling
also: every counting problem either has an FPTRAS or admits no approximation within a polynomial factor
pdf  exposition  lecture-notes  berkeley  nibble  tcs  counting  sampling  characterization  complexity  approximation  rand-approx  proofs 
february 2017 by nhaliday
MinHash - Wikipedia
- goal: compute Jaccard coefficient J(A, B) = |A∩B| / |A∪B| in sublinear space
- idea: pick random injective hash function h, define h_min(S) = argmin_{x in S} h(x), and note that Pr[h_min(A) = h_min(B)] = J(A, B)
- reduce variance by repeating with k independent hash functions and averaging (Chernoff bound)
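
A toy estimator along the lines of these notes (my sketch; salted built-in hashing stands in for truly random injective hash functions, and comparing minimum hash values is equivalent to comparing argmins when h is injective):

```python
import random

def estimate_jaccard(a, b, k=256, seed=0):
    """MinHash estimate of J(A,B) = |A∩B| / |A∪B| using k hash functions.

    Each function collides on the min with probability exactly J(A,B);
    averaging k independent indicators concentrates the estimate
    (Chernoff bound), as the note above says.
    """
    rng = random.Random(seed)
    salts = [rng.getrandbits(64) for _ in range(k)]
    # h_min(S) per salted hash function; min value stands in for argmin
    sig_a = [min(hash((s, x)) for x in a) for s in salts]
    sig_b = [min(hash((s, x)) for x in b) for s in salts]
    return sum(x == y for x, y in zip(sig_a, sig_b)) / k

A, B = set(range(80)), set(range(40, 120))
print(estimate_jaccard(A, B))   # true J = 40/120 ≈ 0.333
```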
algorithms  data-structures  sublinear  hashing  wiki  reference  random  tcs  nibble  measure  metric-space  metrics  similarity  PAC  intersection  intersection-connectedness 
february 2017 by nhaliday
Count–min sketch - Wikipedia
- estimates frequency vector (f_i)
- idea:
d = O(log 1/δ) hash functions h_j: [n] -> [w] (w = O(1/ε))
d*w counters a[r, c]
for each event i, increment counters a[1, h_1(i)], a[2, h_2(i)], ..., a[d, h_d(i)]
estimate for f_i is min_j a[j, h_j(i)]
- never underestimates but upward-biased
- pf: Markov to get constant probability of success, then exponential decrease with repetition
lecture notes: http://theory.stanford.edu/~tim/s15/l/l2.pdf
- note this can work w/ negative updates: just use median instead of min. pf still uses Markov on the absolute value of the error.
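
A minimal sketch of the structure as described in these notes (my code; salted hashing stands in for a pairwise-independent family, and w = ⌈e/ε⌉, d = ⌈ln(1/δ)⌉ is one standard choice of constants):

```python
import math
import random

class CountMinSketch:
    def __init__(self, eps=0.01, delta=0.01, seed=0):
        self.w = math.ceil(math.e / eps)         # columns: O(1/eps)
        self.d = math.ceil(math.log(1 / delta))  # rows: O(log 1/delta)
        rng = random.Random(seed)
        self.salts = [rng.getrandbits(64) for _ in range(self.d)]
        self.table = [[0] * self.w for _ in range(self.d)]

    def _cols(self, item):
        # h_j(i): one salted hash per row
        return [hash((s, item)) % self.w for s in self.salts]

    def update(self, item, count=1):
        for j, c in enumerate(self._cols(item)):
            self.table[j][c] += count

    def query(self, item):
        # min over rows: never underestimates (upward-biased); exceeds the
        # true count by more than eps * total mass with prob. <= delta
        return min(self.table[j][c] for j, c in enumerate(self._cols(item)))

cms = CountMinSketch()
for word in "a b a c a b".split():
    cms.update(word)
print(cms.query("a"))   # >= 3, almost surely exactly 3
```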
algorithms  data-structures  sublinear  hashing  wiki  reference  bias-variance  approximation  random  tcs  multi  stanford  lecture-notes  pdf  tim-roughgarden  nibble  pigeonhole-markov  PAC 
february 2017 by nhaliday
6.896: Essential Coding Theory
- probabilistic method and Chernoff bound for Shannon coding
- probabilistic method for asymptotically good Hamming codes (Gilbert coding)
- sparsity used for LDPC codes
mit  course  yoga  tcs  complexity  coding-theory  math.AG  fields  polynomials  pigeonhole-markov  linear-algebra  probabilistic-method  lecture-notes  bits  sparsity  concentration-of-measure  linear-programming  linearity  expanders  hamming  pseudorandomness  crypto  rigorous-crypto  communication-complexity  no-go  madhu-sudan  shannon  unit  p:**  quixotic  advanced 
february 2017 by nhaliday
What is the relationship between information theory and Coding theory? - Quora
basically:
- finite vs. asymptotic
- combinatorial vs. probabilistic (lotsa overlap there)
- worst-case (Hamming) vs. distributional (Shannon)

Information and coding theory most often appear together in the subject of error correction over noisy channels. Historically, they were born at almost exactly the same time - both Richard Hamming and Claude Shannon were working at Bell Labs when this happened. Information theory tends to heavily use tools from probability theory (together with an "asymptotic" way of thinking about the world), while traditional "algebraic" coding theory tends to employ mathematics that is much more finite/combinatorial in nature, working with fixed sequence lengths, including linear algebra over Galois fields. The emergence of codes over graphs in the late 90s and the first decade of the 2000s blurred this distinction, though, as code classes such as low-density parity-check codes employ both asymptotic analysis and random code selection techniques that have counterparts in information theory.

They do not subsume each other. Information theory touches on many other aspects that coding theory does not, and vice-versa. Information theory also touches on compression (lossy & lossless), statistics (e.g. large deviations), modeling (e.g. Minimum Description Length). Coding theory pays a lot of attention to sphere packing and coverings for finite length sequences - information theory addresses these problems (channel & lossy source coding) only in an asymptotic/approximate sense.
q-n-a  qra  math  acm  tcs  information-theory  coding-theory  big-picture  comparison  confusion  explanation  linear-algebra  polynomials  limits  finiteness  math.CO  hi-order-bits  synthesis  probability  bits  hamming  shannon  intricacy  nibble  s:null  signal-noise 
february 2017 by nhaliday