nhaliday + computer-memory   53

CppCon 2014: Chandler Carruth "Efficiency with Algorithms, Performance with Data Structures" - YouTube
- idk how I feel about this
- makes a distinction between efficiency (basically asymptotic complexity, "doing less work") and performance ("doing that work faster"). idiosyncratic terminology but similar to the "two performance aesthetics" described here: https://pinboard.in/u:nhaliday/b:913a284640c5
- some bikeshedding about vector::reserve and references
- "discontiguous data structures are the root of all evil" (cache-locality, don't use linked lists, etc)
- stacks? queues? just use vector. also suggests circular buffers. says std::deque is really bad
- std::map is bad too (for real SWE, not oly-programming). if you want ordered associative container, just binary search in vector
- std::unordered_map is poorly implemented, unfortunately (due to requirement for buckets in API)
- good implementation of hash table uses open addressing and local (linear?) probing
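the open-addressing + linear-probing idea he's pushing, as a minimal sketch (hypothetical `FlatSet`, non-negative int keys only, no deletion or resizing, assumes C++17):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal open-addressing hash set with linear probing.
// EMPTY is a sentinel, so keys must be non-negative; no deletion or resizing,
// just enough to show the probe loop.
struct FlatSet {
    static constexpr int EMPTY = -1;
    std::vector<int> slots;

    explicit FlatSet(std::size_t capacity) : slots(capacity, EMPTY) {}

    std::size_t probe(int key) const {
        std::size_t i = static_cast<std::size_t>(key) % slots.size();
        // Linear probing: on collision, scan the *next contiguous slot*,
        // so lookups walk cache lines instead of chasing bucket pointers.
        while (slots[i] != EMPTY && slots[i] != key)
            i = (i + 1) % slots.size();
        return i;
    }

    void insert(int key) { slots[probe(key)] = key; }
    bool contains(int key) const { return slots[probe(key)] == key; }
};
```

the point being that collisions land in adjacent slots of the same flat array (cache-friendly), instead of the per-bucket pointer chasing that std::unordered_map's API effectively mandates.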
video  presentation  performance  nitty-gritty  best-practices  working-stiff  programming  c(pp)  systems  data-structures  algorithms  jvm  pls  metal-to-virtual  stylized-facts  rhetoric  expert-experience  google  llvm  efficiency  time-complexity  mobile  computer-memory  caching  oly-programming  common-case  hashing  multi  energy-resources  methodology  trees  techtariat 
8 weeks ago by nhaliday
"Performance Matters" by Emery Berger - YouTube
Stabilizer is a tool that enables statistically sound performance evaluation, making it possible to understand the impact of optimizations and conclude things like the fact that the -O2 and -O3 optimization levels are indistinguishable from noise (sadly true).

Since compiler optimizations have run out of steam, we need better profiling support, especially for modern concurrent, multi-threaded applications. Coz is a new "causal profiler" that lets programmers optimize for throughput or latency, and which pinpoints and accurately predicts the impact of optimizations.

- randomize extraneous factors like code layout and stack size to avoid spurious speedups
- simulate speedup of component of concurrent system (to assess effect of optimization before attempting) by slowing down the complement (all but that component)
- latency vs. throughput, Little's law
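Little's law is just: mean number in system = throughput × mean latency. trivial sketch (helper name is mine, not from the talk):

```cpp
#include <cassert>
#include <cmath>

// Little's law: mean number in system L = arrival rate λ × mean time in system W.
// E.g. a server handling 500 requests/s at 40 ms average latency holds
// 500 * 0.040 = 20 requests in flight on average.
double littles_law_concurrency(double throughput_per_sec, double latency_sec) {
    return throughput_per_sec * latency_sec;
}
```

useful for sanity-checking which knob you're actually optimizing: at fixed concurrency, cutting latency raises throughput proportionally, and vice versa.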
video  presentation  programming  engineering  nitty-gritty  performance  devtools  compilers  latency-throughput  concurrency  legacy  causation  wire-guided  let-me-see  manifolds  pro-rata  tricks  endogenous-exogenous  control  random  signal-noise  comparison  marginal  llvm  systems  hashing  computer-memory  build-packaging  composition-decomposition  coupling-cohesion  local-global  dbs  direct-indirect  symmetry  research  models  metal-to-virtual  linux  measurement  simulation  magnitude  realness  hypothesis-testing  techtariat 
8 weeks ago by nhaliday
multithreading - C++11 introduced a standardized memory model. What does it mean? And how is it going to affect C++ programming? - Stack Overflow
I like the analogy of abandonment of sequential consistency to special relativity, tho I think (emphasis on *think*, not know...) GR might actually be the more appropriate one
q-n-a  stackex  programming  pls  c(pp)  systems  metal-to-virtual  computer-memory  concurrency  intricacy  nitty-gritty  analogy  comparison  physics  relativity  advanced 
august 2019 by nhaliday
What kind of problems give an MLE (memory limit exceeded)? - Quora
To be fair, memory limit exceeded is a pretty rare occurrence. Problems aren't usually set such that you have to optimize for memory.
q-n-a  qra  programming  oly  oly-programming  space-complexity  stylized-facts  computer-memory  prioritizing 
july 2019 by nhaliday
c++ - mmap() vs. reading blocks - Stack Overflow
The discussion of mmap/read reminds me of two other performance discussions:

Some Java programmers were shocked to discover that nonblocking I/O is often slower than blocking I/O, which made perfect sense if you know that nonblocking I/O requires making more syscalls.

Some other network programmers were shocked to learn that epoll is often slower than poll, which makes perfect sense if you know that managing epoll requires making more syscalls.

Conclusion: Use memory maps if you access data randomly, keep it around for a long time, or if you know you can share it with other processes (MAP_SHARED isn't very interesting if there is no actual sharing). Read files normally if you access data sequentially or discard it after reading. And if either method makes your program less complex, do that. For many real world cases there's no sure way to show one is faster without testing your actual application and NOT a benchmark.
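the mmap side of that conclusion, as a minimal read-only mapping sketch (POSIX; `map_file_readonly` is my name, not from the thread; error handling trimmed, empty files not handled):

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Map a whole file read-only. The kernel pages data in lazily on access,
// which is what makes mmap attractive for random access and for sharing
// pages across processes, and wasteful for a single sequential pass.
const char* map_file_readonly(const char* path, std::size_t* len_out) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return nullptr;
    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) { close(fd); return nullptr; }
    void* p = mmap(nullptr, static_cast<std::size_t>(st.st_size),
                   PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  // the mapping stays valid after close
    if (p == MAP_FAILED) return nullptr;
    *len_out = static_cast<std::size_t>(st.st_size);
    return static_cast<const char*>(p);
}
```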
q-n-a  stackex  programming  systems  performance  tradeoffs  objektbuch  stylized-facts  input-output  caching  computer-memory  sequential  applicability-prereqs 
july 2019 by nhaliday
Panel: Systems Programming in 2014 and Beyond | Lang.NEXT 2014 | Channel 9
- Bjarne Stroustrup, Niko Matsakis, Andrei Alexandrescu, Rob Pike
- 2014 so pretty outdated but rare to find a discussion with people like this together
- pretty sure Jonathan Blow asked a couple questions
- Rob Pike compliments Rust at one point. Also kinda softly rags on dynamic typing at one point ("unit testing is what they have instead of static types").
video  presentation  debate  programming  pls  c(pp)  systems  os  rust  d-lang  golang  computer-memory  legacy  devtools  formal-methods  concurrency  compilers  syntax  parsimony  google  intricacy  thinking  cost-benefit  degrees-of-freedom  facebook  performance  people  rsc  cracker-prog  critique  types  checking  api  flux-stasis  engineering  time  wire-guided  worse-is-better/the-right-thing  static-dynamic  latency-throughput  techtariat 
july 2019 by nhaliday
c++ - Which is faster: Stack allocation or Heap allocation - Stack Overflow
On my machine, using g++ 3.4.4 on Windows, I get "0 clock ticks" for both stack and heap allocation for anything less than 100000 allocations, and even then I get "0 clock ticks" for stack allocation and "15 clock ticks" for heap allocation. When I measure 10,000,000 allocations, stack allocation takes 31 clock ticks and heap allocation takes 1562 clock ticks.

so maybe around 100x difference? what does that work out to in terms of total workload?

hmm:
http://vlsiarch.eecs.harvard.edu/wp-content/uploads/2017/02/asplos17mallacc.pdf
Recent work shows that dynamic memory allocation consumes nearly 7% of all cycles in Google datacenters.

That's not too bad actually. Seems like I shouldn't worry about shifting from heap to stack/globals unless profiling says it's important, particularly for non-oly stuff.

edit: Actually, a 100x factor on 7% is pretty high; it could increase a constant factor by almost an order of magnitude.

edit: Well, actually that's not the right math: 93% + 7%*0.01 is not much smaller than 100%, so eliminating heap allocation entirely saves at most ~7% of total cycles.
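that last edit note is just Amdahl's law: speeding up a fraction f of the work by factor s bounds the whole-program speedup at 1/((1-f) + f/s). sketch:

```cpp
#include <cassert>
#include <cmath>

// Amdahl's law: speeding up a fraction f of total time by factor s
// gives overall speedup 1 / ((1 - f) + f / s).
double amdahl_speedup(double f, double s) {
    return 1.0 / ((1.0 - f) + f / s);
}
// If allocation is 7% of cycles and stack allocation were 100x faster,
// the whole program only gets ~7% faster: 1 / (0.93 + 0.07/100) ≈ 1.074.
```

so even a 100x faster allocator caps out around a 7.4% whole-program win when allocation is 7% of cycles, consistent with "don't bother unless profiling says so".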
q-n-a  stackex  programming  c(pp)  systems  memory-management  performance  intricacy  comparison  benchmarks  data  objektbuch  empirical  google  papers  nibble  time  measure  pro-rata  distribution  multi  pdf  oly-programming  computer-memory 
june 2019 by nhaliday
C++ Core Guidelines
This document is a set of guidelines for using C++ well. The aim of this document is to help people to use modern C++ effectively. By “modern C++” we mean effective use of the ISO C++ standard (currently C++17, but almost all of our recommendations also apply to C++14 and C++11). In other words, what would you like your code to look like in 5 years’ time, given that you can start now? In 10 years’ time?

https://isocpp.github.io/CppCoreGuidelines/
“Within C++ is a smaller, simpler, safer language struggling to get out.” – Bjarne Stroustrup

...

The guidelines are focused on relatively higher-level issues, such as interfaces, resource management, memory management, and concurrency. Such rules affect application architecture and library design. Following the rules will lead to code that is statically type safe, has no resource leaks, and catches many more programming logic errors than is common in code today. And it will run fast - you can afford to do things right.

We are less concerned with low-level issues, such as naming conventions and indentation style. However, no topic that can help a programmer is out of bounds.

Our initial set of rules emphasize safety (of various forms) and simplicity. They may very well be too strict. We expect to have to introduce more exceptions to better accommodate real-world needs. We also need more rules.

...

The rules are designed to be supported by an analysis tool. Violations of rules will be flagged with references (or links) to the relevant rule. We do not expect you to memorize all the rules before trying to write code.

contrary:
https://aras-p.info/blog/2018/12/28/Modern-C-Lamentations/
This will be a long wall of text, and kinda random! My main points are:
1. C++ compile times are important,
2. Non-optimized build performance is important,
3. Cognitive load is important. I don’t expand much on this here, but if a programming language or a library makes me feel stupid, then I’m less likely to use it or like it. C++ does that a lot :)
programming  engineering  pls  best-practices  systems  c(pp)  guide  metabuch  objektbuch  reference  cheatsheet  elegance  frontier  libraries  intricacy  advanced  advice  recommendations  big-picture  novelty  lens  philosophy  state  error  types  concurrency  memory-management  performance  abstraction  plt  compilers  expert-experience  multi  checking  devtools  flux-stasis  safety  system-design  techtariat  time  measure  dotnet  comparison  examples  build-packaging  thinking  worse-is-better/the-right-thing  cost-benefit  tradeoffs  essay  commentary  oop  correctness  computer-memory  error-handling  resources-effects  latency-throughput 
june 2019 by nhaliday
Frama-C
Frama-C is organized with a plug-in architecture (comparable to that of the Gimp or Eclipse). A common kernel centralizes information and conducts the analysis. Plug-ins interact with each other through interfaces defined by the kernel. This makes for robustness in the development of Frama-C while allowing a wide functionality spectrum.

...

Three heavyweight plug-ins that are used by the other plug-ins:

- Eva (Evolved Value analysis)
This plug-in computes variation domains for variables. It is quite automatic, although the user may guide the analysis in places. It handles a wide spectrum of C constructs. This plug-in uses abstract interpretation techniques.
- Jessie and Wp, two deductive verification plug-ins
These plug-ins are based on weakest precondition computation techniques. They allow one to prove that C functions satisfy their specifications as expressed in ACSL. These proofs are modular: the specifications of the called functions are used to establish the proof without looking at their code.

For browsing unfamiliar code:
- Impact analysis
This plug-in highlights the locations in the source code that are impacted by a modification.
- Scope & Data-flow browsing
This plug-in allows the user to navigate the dataflow of the program, from definition to use or from use to definition.
- Variable occurrence browsing
Also provided as a simple example for new plug-in development, this plug-in allows the user to reach the statements where a given variable is used.
- Metrics calculation
This plug-in allows the user to compute various metrics from the source code.

For code transformation:
- Semantic constant folding
This plug-in uses the results of the evolved value analysis plug-in to replace constant expressions in the source code with their values. Because it relies on EVA, it can do more of these simplifications than a syntactic analysis would.
- Slicing
This plug-in slices the code according to a user-provided criterion: it creates a copy of the program, but keeps only those parts which are necessary with respect to the given criterion.
- Spare code: remove "spare code", code that does not contribute to the final results of the program.
- E-ACSL: translate annotations into C code for runtime assertion checking.
For verifying functional specifications:

- Aoraï: verify specifications expressed as LTL (Linear Temporal Logic) formulas
Other functionalities documented together with the EVA plug-in can be considered as verifying low-level functional specifications (inputs, outputs, dependencies,…)
For test-case generation:

- PathCrawler automatically finds test-case inputs to ensure coverage of a C function. It can be used for structural unit testing, as a complement to static analysis or to study the feasible execution paths of the function.
For concurrent programs:

- Mthread
This plug-in automatically analyzes concurrent C programs, using the EVA plug-in, taking into account all possible thread interactions. At the end of its execution, the concurrent behavior of each thread is over-approximated, resulting in precise information about shared variables, which mutex protects a part of the code, etc.
Front-end for other languages

- Frama-Clang
This plug-in provides a C++ front-end to Frama-C, based on the clang compiler. It transforms C++ code into a Frama-C AST, which can then be analyzed by the plug-ins above. Note however that it is very experimental and only supports a subset of C++11.
tools  devtools  formal-methods  programming  software  c(pp)  systems  memory-management  ocaml-sml  debugging  checking  rigor  oss  code-dive  graphs  state  metrics  llvm  gallic  cool  worrydream  impact  flux-stasis  correctness  computer-memory  structure  static-dynamic 
may 2019 by nhaliday
c - What REALLY happens when you don't free after malloc? - Stack Overflow
keep this stuff in mind when writing competition stuff, can usually just omit deletes/frees unless you're really running up against the memory limit:
Just about every modern operating system will recover all the allocated memory space after a program exits.

...

On the other hand, the similar admonition to close your files on exit has a much more concrete result - if you don't, the data you wrote to them might not get flushed, or if they're a temp file, they might not get deleted when you're done. Also, database handles should have their transactions committed and then closed when you're done with them. Similarly, if you're using an object oriented language like C++ or Objective C, not freeing an object when you're done with it will mean the destructor will never get called, and any resources the class is responsible for might not get cleaned up.

--

I really consider this answer wrong. One should always deallocate resources after one is done with them, be it file handles/memory/mutexes. By having that habit, one will not make that sort of mistake when building servers. Some servers are expected to run 24x7. In those cases, any leak of any sort means that your server will eventually run out of that resource and hang/crash in some way. A short utility program, ya, a leak isn't that bad. Any server, any leak is death. Do yourself a favor. Clean up after yourself. It's a good habit.

--

Allocation Myth 4: Non-garbage-collected programs should always deallocate all memory they allocate.

The Truth: Omitted deallocations in frequently executed code cause growing leaks. They are rarely acceptable. But programs that retain most allocated memory until program exit often perform better without any intervening deallocation. Malloc is much easier to implement if there is no free.

In most cases, deallocating memory just before program exit is pointless. The OS will reclaim it anyway. Free will touch and page in the dead objects; the OS won't.

Consequence: Be careful with "leak detectors" that count allocations. Some "leaks" are good!
q-n-a  stackex  programming  memory-management  performance  systems  c(pp)  oly-programming  computer-memory 
april 2019 by nhaliday
Recitation 25: Data locality and B-trees
The same idea can be applied to trees. Binary trees are not good for locality because a given node of the binary tree probably occupies only a fraction of a cache line. B-trees are a way to get better locality. As in the hash table trick above, we store several elements in a single node -- as many as will fit in a cache line.

B-trees were originally invented for storing data structures on disk, where locality is even more crucial than with memory. Accessing a disk location takes about 5ms = 5,000,000ns. Therefore if you are storing a tree on disk you want to make sure that a given disk read is as effective as possible. B-trees, with their high branching factor, ensure that few disk reads are needed to navigate to the place where data is stored. B-trees are also useful for in-memory data structures because these days main memory is almost as slow relative to the processor as disk drives were when B-trees were introduced!
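sketch of the node-per-cache-line idea (hypothetical layout; 64-byte lines and 8-byte keys assumed):

```cpp
#include <cassert>
#include <cstdint>
#include <cstddef>

// A B-tree-style node whose key array fills one 64-byte cache line:
// a single memory access pulls in 8 keys to compare against, vs. roughly
// 2 keys per line when each binary-tree node sits in its own allocation.
constexpr std::size_t CACHE_LINE = 64;
constexpr std::size_t KEYS_PER_NODE = CACHE_LINE / sizeof(std::int64_t);  // 8

struct alignas(CACHE_LINE) BNode {
    std::int64_t keys[KEYS_PER_NODE];
    int count;                             // number of keys actually in use
    BNode* children[KEYS_PER_NODE + 1];    // internal nodes only
};

static_assert(sizeof(BNode) % CACHE_LINE == 0,
              "node occupies a whole number of cache lines");
```

same trick scales up for disk: make the node a multiple of the page/block size instead of the cache-line size.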
nibble  org:junk  org:edu  cornell  lecture-notes  exposition  programming  engineering  systems  dbs  caching  performance  memory-management  os  computer-memory  metal-to-virtual  trees  data-structures 
september 2017 by nhaliday
Ask HN: What is the state of C++ vs. Rust? | Hacker News
https://www.reddit.com/r/rust/comments/2mwpie/what_are_the_advantages_of_rust_over_modern_c/
https://www.reddit.com/r/rust/comments/bya8k6/programming_with_rust_vs_c_c/

Writing C++ from a Rust developers perspective: https://www.reddit.com/r/cpp/comments/b5wkw7/writing_c_from_a_rust_developers_perspective/

https://www.reddit.com/r/rust/comments/1gs93k/rust_for_game_development/
https://www.reddit.com/r/rust/comments/acjcbp/rust_2019_beat_c/
https://www.reddit.com/r/rust/comments/62sewn/did_you_ever_hear_the_tragedy_of_darth_stroustrup/

Rust from C++ perspective: https://www.reddit.com/r/cpp/comments/611811/have_you_used_rust_do_you_prefer_it_over_modern_c/

mostly discussion of templates:
What can C++ do that Rust cant?: https://www.reddit.com/r/rust/comments/5ti9fc/what_can_c_do_that_rust_cant/
Templates are a big part of C++; it's kind of unfair to exclude them. Type-level integers and variadic templates are not to be underestimated.
...
C++ has the benefit of many competing compilers, each with some of the best compiler architects in the industry (and the backing of extremely large companies). Rust so far has only rustc as a viable compiler.
--
A language specification.
--
Rust has principled overloading, while C++ has wild-wild-west overloading.

Ok rustaceans, here's a question for you. Is there anything that C++ templates can do that you can't do in rust?: https://www.reddit.com/r/rust/comments/7q7nn0/ok_rustaceans_heres_a_question_for_you_is_there/
I think you won't get the best answer about templates in the Rust community. People don't like them here, and there's... not an insignificant amount of FUD going around.

You can do most things with templates, and I think they're an elegant solution to the problem of creating generic code in an un-GC'd language. However, C++'s templates are hard to understand without the context of the design of C++'s types. C++'s class system is about the closest thing you can get to a duck typed ML module system.

I dunno, I'm not sure exactly where I'm going with this. There's a lot about the philosophy of C++'s type system that I think would be good to talk about, but it really requires a full on blog post or a talk or something. I don't think you'll get a good answer on reddit. Learning an ML will get you pretty close to understanding the philosophy behind the C++ type system though - functors are equivalent to templates, modules equivalent to classes.
--
...
You're making a greater distinction than is necessary. Aside from/given const generics, SFINAE and impl where clauses have similar power, and monomorphization is substitution.

Both C++ templates and Rust generics are turing-complete (via associated types), the difference lies in the explicitness of bounds and ad-hoc polymorphism.
In Rust we have implicit bounds out of the "WF" (well-formed) requirements of signatures, so you can imagine C++ as having WF over the entire body of a function (even if I don't think current-generation C++ compilers take advantage of this).

While the template expansion may seem more like a macro-by-example, it's still type/value-driven, just in a more ad-hoc and implicit way.

https://gist.github.com/brendanzab/9220415
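the "type-level integers" bit from the first quote, in a few lines — a non-type template parameter bakes the size into the type, so size mismatches die at compile time (illustrative sketch, analogous to Rust's const generics):

```cpp
#include <cassert>
#include <cstddef>

// Type-level integer: N is part of the type Vec<T, N>, checked at compile time.
template <typename T, std::size_t N>
struct Vec {
    T data[N];
};

// Only vectors of the *same* static length can be added; mixing lengths
// is a compile error, not a runtime bounds check.
template <typename T, std::size_t N>
Vec<T, N> add(const Vec<T, N>& a, const Vec<T, N>& b) {
    Vec<T, N> out{};
    for (std::size_t i = 0; i < N; ++i) out.data[i] = a.data[i] + b.data[i];
    return out;
}
// add(Vec<int, 3>{}, Vec<int, 4>{}) would fail to compile: N must match.
```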
discussion  hn  summary  comparison  pls  rust  pragmatic  c(pp)  performance  safety  memory-management  build-packaging  types  community  culture  coupling-cohesion  oop  multi  reddit  social  engineering  games  memes(ew)  troll  scifi-fantasy  plt  chart  paste  computer-memory  code-organizing  ecosystem  q-n-a 
october 2016 by nhaliday
Memory Leaks are Memory Safe | Huon on the internet
The best programs lie in the 👍 cells only: they manipulate valid things, and don’t manipulate invalid ones. Passable programs might also have some valid data that they don’t use (leak memory), but bad ones will try to use invalid data.

When a language, such as Rust, advertises itself as memory safe, it isn’t saying anything about whether memory leaks are impossible.

...

However, this was not correct in practice, as things like reference cycles and thread deadlock could cause memory to leak. Rust decided to make forget safe, focusing its guarantees on just preventing memory unsafety and instead making only best-effort attempts towards preventing memory leaks (like essentially all other languages, memory safe and otherwise).
systems  rust  programming  memory-management  explanation  compilers  pls  techtariat  computer-memory  safety  correctness  comparison 
april 2016 by nhaliday