nhaliday + compilers   76

CppCon 2015: Chandler Carruth "Tuning C++: Benchmarks, and CPUs, and Compilers! Oh My!" - YouTube
- very basics of benchmarking
- Q: why does preemptive reserve speed up push_back by 10x?
- favorite tool is Linux perf
- callgraph profiling
- important option: -fomit-frame-pointer
- perf has nice interface ('a' = "annotate") for reading assembly (good display of branches/jumps)
- A: optimized to no-op
- how to turn off optimizer
- profilers aren't infallible. a lot of the time samples are misattributed to neighboring ops
- fast mod example
- branch prediction hints (#define UNLIKELY(x), __builtin_expect, etc.); a small sketch follows below
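A minimal sketch of the hint-macro idea mentioned in the talk, assuming GCC/Clang's __builtin_expect (macro and function names here are my own illustration, not Chandler's):

    // Wrap __builtin_expect so hot paths read cleanly (GCC/Clang only).
    #define LIKELY(x)   __builtin_expect(!!(x), 1)
    #define UNLIKELY(x) __builtin_expect(!!(x), 0)

    int parse_header(const char* p) {
        if (UNLIKELY(p == nullptr)) return -1;  // hint: the error path is rare
        // ... common case continues as the straight-line path ...
        return 0;
    }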
video  presentation  c(pp)  pls  programming  unix  heavyweights  cracker-prog  benchmarks  engineering  best-practices  working-stiff  systems  expert-experience  google  llvm  common-case  stories  libraries  measurement  linux  performance  traces  graphs  static-dynamic  ui  assembly  compilers  methodology 
8 days ago by nhaliday
"Performance Matters" by Emery Berger - YouTube
Stabilizer is a tool that enables statistically sound performance evaluation, making it possible to understand the impact of optimizations and conclude things like the fact that the -O2 and -O3 optimization levels are indistinguishable from noise (sadly true).

Since compiler optimizations have run out of steam, we need better profiling support, especially for modern concurrent, multi-threaded applications. Coz is a new "causal profiler" that lets programmers optimize for throughput or latency, and which pinpoints and accurately predicts the impact of optimizations.

- randomize extraneous factors like code layout and stack size to avoid spurious speedups
- simulate speedup of component of concurrent system (to assess effect of optimization before attempting) by slowing down the complement (all but that component)
- latency vs. throughput, Little's law
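A quick worked instance of Little's law, with made-up numbers rather than anything from the talk: mean number in the system = arrival rate × mean time in the system (L = λW). A service handling 2,000 requests/sec at 50 ms mean latency holds on average 2000 × 0.05 = 100 requests in flight; halving latency at the same throughput halves the in-flight count, while raising throughput under a fixed concurrency budget requires cutting latency.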
video  presentation  programming  engineering  nitty-gritty  performance  devtools  compilers  latency-throughput  concurrency  legacy  causation  wire-guided  let-me-see  manifolds  pro-rata  tricks  endogenous-exogenous  control  random  signal-noise  comparison  marginal  llvm  systems  hashing  computer-memory  build-packaging  composition-decomposition  coupling-cohesion  local-global  dbs  direct-indirect  symmetry  research  models  metal-to-virtual  linux  measurement  simulation  magnitude  realness  hypothesis-testing 
10 days ago by nhaliday
CakeML
some interesting job openings in Sydney listed here
programming  pls  plt  functional  ocaml-sml  formal-methods  rigor  compilers  types  numerics  accuracy  estimate  research-program  homepage  anglo  jobs  tech  cool 
9 weeks ago by nhaliday
mypy - Optional Static Typing for Python
developed by Dropbox in Python; past contributors include Greg Price and Reid Barton

https://pyre-check.org
developed by Facebook in OCaml, seems less complete atm
python  types  programming  pls  devtools  compilers  formal-methods  dropbox  multi  facebook  ocaml-sml  libraries  the-prices  oly  static-dynamic 
11 weeks ago by nhaliday
Laurence Tratt: What Challenges and Trade-Offs do Optimising Compilers Face?
Summary
It's important to be realistic: most people don't care about program performance most of the time. Modern computers are so fast that most programs run fast enough even with very slow language implementations. In that sense, I agree with Daniel's premise: optimising compilers are often unimportant. But “often” is often unsatisfying, as it is here. Users find themselves transitioning from not caring at all about performance to suddenly really caring, often in the space of a single day.

This, to me, is where optimising compilers come into their own: they mean that even fewer people need care about program performance. And I don't mean that they get us from, say, 98 to 99 people out of 100 not needing to care: it's probably more like going from 80 to 99 people out of 100 not needing to care. This is, I suspect, more significant than it seems: it means that many people can go through an entire career without worrying about performance. Martin Berger reminded me of A N Whitehead’s wonderful line that “civilization advances by extending the number of important operations which we can perform without thinking about them” and this seems a classic example of that at work. Even better, optimising compilers are widely tested and thus generally much more reliable than the equivalent optimisations performed manually.

But I think that those of us who work on optimising compilers need to be honest with ourselves, and with users, about what performance improvement one can expect to see on a typical program. We have a tendency to pick the maximum possible improvement and talk about it as if it's the mean, when there's often a huge difference between the two. There are many good reasons for that gap, and I hope in this blog post I've at least made you think about some of the challenges and trade-offs that optimising compilers are subject to.

[1]
Most readers will be familiar with Knuth’s quip that “premature optimisation is the root of all evil.” However, I doubt that any of us have any real idea what proportion of time is spent in the average part of the average program. In such cases, I tend to assume that Pareto’s principle won't be too far wrong (i.e. that 80% of execution time is spent in 20% of code). In 1971, a study by Knuth and others of Fortran programs found that 50% of execution time was spent in 4% of code. I don't know of modern equivalents of this study, and for them to be truly useful, they'd have to be rather big. If anyone knows of something along these lines, please let me know!
techtariat  programming  compilers  performance  tradeoffs  cost-benefit  engineering  yak-shaving  pareto  plt  c(pp)  rust  golang  trivia  data  objektbuch  street-fighting  estimate  distribution  pro-rata 
july 2019 by nhaliday
Panel: Systems Programming in 2014 and Beyond | Lang.NEXT 2014 | Channel 9
- Bjarne Stroustrup, Niko Matsakis, Andrei Alexandrescu, Rob Pike
- 2014 so pretty outdated but rare to find a discussion with people like this together
- pretty sure Jonathan Blow asked a couple questions
- Rob Pike compliments Rust at one point. Also kinda softly rags on dynamic typing at one point ("unit testing is what they have instead of static types").
video  presentation  debate  programming  pls  c(pp)  systems  os  rust  d-lang  golang  computer-memory  legacy  devtools  formal-methods  concurrency  compilers  syntax  parsimony  google  intricacy  thinking  cost-benefit  degrees-of-freedom  facebook  performance  people  rsc  cracker-prog  critique  types  checking  api  flux-stasis  engineering  time  wire-guided  worse-is-better/the-right-thing  static-dynamic  latency-throughput 
july 2019 by nhaliday
C++ Core Guidelines
This document is a set of guidelines for using C++ well. The aim of this document is to help people to use modern C++ effectively. By “modern C++” we mean effective use of the ISO C++ standard (currently C++17, but almost all of our recommendations also apply to C++14 and C++11). In other words, what would you like your code to look like in 5 years’ time, given that you can start now? In 10 years’ time?

https://isocpp.github.io/CppCoreGuidelines/
“Within C++ is a smaller, simpler, safer language struggling to get out.” – Bjarne Stroustrup

...

The guidelines are focused on relatively higher-level issues, such as interfaces, resource management, memory management, and concurrency. Such rules affect application architecture and library design. Following the rules will lead to code that is statically type safe, has no resource leaks, and catches many more programming logic errors than is common in code today. And it will run fast - you can afford to do things right.

We are less concerned with low-level issues, such as naming conventions and indentation style. However, no topic that can help a programmer is out of bounds.

Our initial set of rules emphasize safety (of various forms) and simplicity. They may very well be too strict. We expect to have to introduce more exceptions to better accommodate real-world needs. We also need more rules.

...

The rules are designed to be supported by an analysis tool. Violations of rules will be flagged with references (or links) to the relevant rule. We do not expect you to memorize all the rules before trying to write code.

contrary:
https://aras-p.info/blog/2018/12/28/Modern-C-Lamentations/
This will be a long wall of text, and kinda random! My main points are:
1. C++ compile times are important,
2. Non-optimized build performance is important,
3. Cognitive load is important. I don’t expand much on this here, but if a programming language or a library makes me feel stupid, then I’m less likely to use it or like it. C++ does that a lot :)
programming  engineering  pls  best-practices  systems  c(pp)  guide  metabuch  objektbuch  reference  cheatsheet  elegance  frontier  libraries  intricacy  advanced  advice  recommendations  big-picture  novelty  lens  philosophy  state  error  types  concurrency  memory-management  performance  abstraction  plt  compilers  expert-experience  multi  checking  devtools  flux-stasis  safety  system-design  techtariat  time  measure  dotnet  comparison  examples  build-packaging  thinking  worse-is-better/the-right-thing  cost-benefit  tradeoffs  essay  commentary  oop  correctness  computer-memory  error-handling  resources-effects  latency-throughput 
june 2019 by nhaliday
packages - Are the TeX semantics and grammar defined somewhere in some official documents? - TeX - LaTeX Stack Exchange
The grammar of each TeX command is more or less completely given in The TeXBook. Note, however, that unlike most programming languages the lexical analysis and tokenisation of the input cannot be separated from execution as the catcode table which controls tokenisation is dynamically changeable. Thus parsing TeX tends to defeat most parser generation tools.

LaTeX is a set of macros written in TeX so is defined by its implementation, although there is fairly extensive documentation in The LaTeX Companion, the LaTeX book (LaTeX: A Document Preparation System), and elsewhere.
q-n-a  stackex  programming  compilers  latex  yak-shaving  nitty-gritty  syntax 
may 2019 by nhaliday
One week of bugs
If I had to guess, I'd say I probably work around hundreds of bugs in an average week, and thousands in a bad week. It's not unusual for me to run into a hundred new bugs in a single week. But I often get skepticism when I mention that I run into multiple new (to me) bugs per day, and that this is inevitable if we don't change how we write tests. Well, here's a log of one week of bugs, limited to bugs that were new to me that week. After a brief description of the bugs, I'll talk about what we can do to improve the situation. The obvious answer is to spend more effort on testing, but everyone already knows we should do that and no one does it. That doesn't mean it's hopeless, though.

...

Here's where I'm supposed to write an appeal to take testing more seriously and put real effort into it. But we all know that's not going to work. It would take 90k LOC of tests to get Julia to be as well tested as a poorly tested prototype (falsely assuming linear complexity in size). That's two person-years of work, not even including time to debug and fix bugs (which probably brings it closer to four or five years). Who's going to do that? No one. Writing tests is like writing documentation. Everyone already knows you should do it. Telling people they should do it adds zero information [1].

Given that people aren't going to put any effort into testing, what's the best way to do it?

Property-based testing. Generative testing. Random testing. Concolic Testing (which was done long before the term was coined). Static analysis. Fuzzing. Statistical bug finding. There are lots of options. Some of them are actually the same thing because the terminology we use is inconsistent and buggy. I'm going to arbitrarily pick one to talk about, but they're all worth looking into.

...

There are a lot of great resources out there, but if you're just getting started, I found this description of types of fuzzers to be one of the most helpful (and simplest) things I've read.

John Regehr has a udacity course on software testing. I haven't worked through it yet (Pablo Torres just pointed to it), but given the quality of Dr. Regehr's writing, I expect the course to be good.

For more on my perspective on testing, there's this.

Everything's broken and nobody's upset: https://www.hanselman.com/blog/EverythingsBrokenAndNobodysUpset.aspx
https://news.ycombinator.com/item?id=4531549

https://hypothesis.works/articles/the-purpose-of-hypothesis/
From the perspective of a user, the purpose of Hypothesis is to make it easier for you to write better tests.

From my perspective as the primary author, that is of course also a purpose of Hypothesis. I write a lot of code, it needs testing, and the idea of trying to do that without Hypothesis has become nearly unthinkable.

But, on a large scale, the true purpose of Hypothesis is to drag the world kicking and screaming into a new and terrifying age of high quality software.

Software is everywhere. We have built a civilization on it, and it’s only getting more prevalent as more services move online and embedded and “internet of things” devices become cheaper and more common.

Software is also terrible. It’s buggy, it’s insecure, and it’s rarely well thought out.

This combination is clearly a recipe for disaster.

The state of software testing is even worse. It’s uncontroversial at this point that you should be testing your code, but it’s a rare codebase whose authors could honestly claim that they feel its testing is sufficient.

Much of the problem here is that it’s too hard to write good tests. Tests take up a vast quantity of development time, but they mostly just laboriously encode exactly the same assumptions and fallacies that the authors had when they wrote the code, so they miss exactly the same bugs that the authors missed when writing it.

Preventing the Collapse of Civilization [video]: https://news.ycombinator.com/item?id=19945452
- Jonathan Blow

NB: DevGAMM is a game industry conference

- loss of technological knowledge (Antikythera mechanism, aqueducts, etc.)
- hardware driving most gains, not software
- software's actually less robust, often poorly designed and overengineered these days
- *list of bugs he's encountered recently*:
https://youtu.be/pW-SOdj4Kkk?t=1387
- knowledge of trivia becomes valued more than general, deep knowledge
- does at least acknowledge value of DRY, reusing code, abstraction saving dev time
techtariat  dan-luu  tech  software  error  list  debugging  linux  github  robust  checking  oss  troll  lol  aphorism  webapp  email  google  facebook  games  julia  pls  compilers  communication  mooc  browser  rust  programming  engineering  random  jargon  formal-methods  expert-experience  prof  c(pp)  course  correctness  hn  commentary  video  presentation  carmack  pragmatic  contrarianism  pessimism  sv  unix  rhetoric  critique  worrydream  hardware  performance  trends  multiplicative  roots  impact  comparison  history  iron-age  the-classics  mediterranean  conquest-empire  gibbon  technology  the-world-is-just-atoms  flux-stasis  increase-decrease  graphics  hmm  idk  systems  os  abstraction  intricacy  worse-is-better/the-right-thing  build-packaging  microsoft  osx  apple  reflection  assembly  things  knowledge  detail-architecture  thick-thin  trivia  info-dynamics  caching  frameworks  generalization  systematic-ad-hoc  universalism-particularism  analytical-holistic  structure  tainter  libraries  tradeoffs  prepping  threat-modeling  network-structure  writing  risk  local-glob 
may 2019 by nhaliday
c++ - Debugging template instantiations - Stack Overflow
Yes, there is a template metaprogramming debugger: Templight.

https://github.com/mikael-s-persson/templight
--
Seems to be dead now, though :( [ed.: Partially true. They've merged pull requests recently tho.]
--
Metashell is still in active development though: github.com/metashell/metashell
q-n-a  stackex  nitty-gritty  pls  types  c(pp)  debugging  devtools  tools  programming  howto  advice  checklists  multi  repo  wire-guided  static-dynamic  compilers  performance  measurement  time  latency-throughput 
may 2019 by nhaliday
What makes Java easier to parse than C? - Stack Overflow
Parsing C++ is getting hard. Parsing Java is getting to be just as hard

cf the Linked questions too, lotsa good stuff
q-n-a  stackex  compilers  pls  plt  jvm  c(pp)  intricacy  syntax  automata-languages  cost-benefit  incentives  legacy 
may 2019 by nhaliday
oop - Functional programming vs Object Oriented programming - Stack Overflow
When you anticipate a different kind of software evolution:
- Object-oriented languages are good when you have a fixed set of operations on things, and as your code evolves, you primarily add new things. This can be accomplished by adding new classes which implement existing methods, and the existing classes are left alone.
- Functional languages are good when you have a fixed set of things, and as your code evolves, you primarily add new operations on existing things. This can be accomplished by adding new functions which compute with existing data types, and the existing functions are left alone.

When evolution goes the wrong way, you have problems:
- Adding a new operation to an object-oriented program may require editing many class definitions to add a new method.
- Adding a new kind of thing to a functional program may require editing many function definitions to add a new case.

This problem has been well known for many years; in 1998, Phil Wadler dubbed it the "expression problem". Although some researchers think that the expression problem can be addressed with such language features as mixins, a widely accepted solution has yet to hit the mainstream.

What are the typical problem definitions where functional programming is a better choice?

Functional languages excel at manipulating symbolic data in tree form. A favorite example is compilers, where source and intermediate languages change seldom (mostly the same things), but compiler writers are always adding new translations and code improvements or optimizations (new operations on things). Compilation and translation more generally are "killer apps" for functional languages.
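A tiny C++17 sketch of that trade-off (types and names are my own illustration, not from the answer): with virtual dispatch, adding a new thing is local but a new operation touches every class; with a closed sum type and free functions, the situation is reversed.

    #include <type_traits>
    #include <variant>

    // OO style: adding a new shape = one new class; adding a new operation = edit every class.
    struct Shape { virtual double area() const = 0; virtual ~Shape() = default; };
    struct Circle : Shape { double r = 1; double area() const override { return 3.14159 * r * r; } };
    struct Square : Shape { double s = 1; double area() const override { return s * s; } };

    // FP style: adding a new operation = one new function; adding a new shape = edit every visit.
    struct C { double r; };
    struct Sq { double s; };
    using Shape2 = std::variant<C, Sq>;
    double area(const Shape2& sh) {
        return std::visit([](const auto& v) -> double {
            if constexpr (std::is_same_v<std::decay_t<decltype(v)>, C>) return 3.14159 * v.r * v.r;
            else return v.s * v.s;
        }, sh);
    }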
q-n-a  stackex  programming  engineering  nitty-gritty  comparison  best-practices  cost-benefit  functional  data-structures  arrows  flux-stasis  atoms  compilers  examples  pls  plt  oop  types 
may 2019 by nhaliday
language design - Why does C++ need a separate header file? - Stack Overflow
C++ does it that way because C did it that way, so the real question is why did C do it that way? Wikipedia speaks a little to this.

Newer compiled languages (such as Java, C#) do not use forward declarations; identifiers are recognized automatically from source files and read directly from dynamic library symbols. This means header files are not needed.
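A minimal sketch of the mechanism under discussion (file names hypothetical): the header carries declarations that every translation unit compiles against independently, while exactly one .cpp provides the definition the linker resolves.

    // point.h -- declarations only; every .cpp that uses Point includes this
    #ifndef POINT_H
    #define POINT_H
    struct Point { double x, y; };
    double dist(Point a, Point b);   // forward declaration: signature, no body
    #endif

    // point.cpp -- the single definition
    #include "point.h"
    #include <cmath>
    double dist(Point a, Point b) { return std::hypot(a.x - b.x, a.y - b.y); }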
q-n-a  stackex  programming  pls  c(pp)  compilers  trivia  roots  yak-shaving  flux-stasis  comparison  jvm  code-organizing 
may 2019 by nhaliday
c++ - Why are forward declarations necessary? - Stack Overflow
C++, while created almost 17 years later, was defined as a superset of C, and therefore had to use the same mechanism.

By the time Java rolled around in 1995, average computers had enough memory that holding a symbol table, even for a complex project, was no longer a substantial burden. And Java wasn't designed to be backwards-compatible with C, so it had no need to adopt a legacy mechanism. C# was similarly unencumbered.

As a result, their designers chose to shift the burden of compartmentalizing symbolic declaration back off the programmer and put it on the computer again, since its cost in proportion to the total effort of compilation was minimal.
q-n-a  stackex  programming  pls  c(pp)  trivia  yak-shaving  roots  compilers  flux-stasis  comparison  jvm  static-dynamic 
may 2019 by nhaliday
When to use C over C++, and C++ over C? - Software Engineering Stack Exchange
You pick C when
- you need portable assembler (which is what C is, really) for whatever reason,
- your platform doesn't provide C++ (a C compiler is much easier to implement),
- you need to interact with other languages that can only interact with C (usually the lowest common denominator on any platform) and your code consists of little more than the interface, not making it worth to lay a C interface over C++ code,
- you hack in an Open Source project (many of which, for various reasons, stick to C),
- you don't know C++.
In all other cases you should pick C++.

--

At the same time, I have to say that @Toll's answers (for one obvious example) have things just about backwards in most respects. Reasonably written C++ will generally be at least as fast as C, and often at least a little faster. Readability is generally much better, if only because you don't get buried in an avalanche of all the code for even the most trivial algorithms and data structures, all the error handling, etc.

...

As it happens, C and C++ are fairly frequently used together on the same projects, maintained by the same people. This allows something that's otherwise quite rare: a study that directly, objectively compares the maintainability of code written in the two languages by people who are equally competent overall (i.e., the exact same people). At least in the linked study, one conclusion was clear and unambiguous: "We found that using C++ instead of C results in improved software quality and reduced maintenance effort..."

--

(Side-note: Check out Linus Torvalds' rant on why he prefers C to C++. I don't necessarily agree with his points, but it gives you insight into why people might choose C over C++. Rather, people that agree with him might choose C for these reasons.)

http://harmful.cat-v.org/software/c++/linus

Why would anybody use C over C++? [closed]: https://stackoverflow.com/questions/497786/why-would-anybody-use-c-over-c
Joel's answer is good for reasons you might have to use C, though there are a few others:
- You must meet industry guidelines, which are easier to prove and test for in C.
- You have tools to work with C, but not C++ (think not just about the compiler, but all the support tools, coverage, analysis, etc)
- Your target developers are C gurus
- You're writing drivers, kernels, or other low level code
- You know the C++ compiler isn't good at optimizing the kind of code you need to write
- Your app not only doesn't lend itself to be object oriented, but would be harder to write in that form

In some cases, though, you might want to use C rather than C++:
- You want the performance of assembler without the trouble of coding in assembler (C++ is, in theory, capable of 'perfect' performance, but the compilers aren't as good at seeing optimizations a good C programmer will see)
- The software you're writing is trivial, or nearly so - whip out the tiny C compiler, write a few lines of code, compile and you're all set - no need to open a huge editor with helpers, no need to write practically empty and useless classes, deal with namespaces, etc. You can do nearly the same thing with a C++ compiler and simply use the C subset, but the C++ compiler is slower, even for tiny programs.
- You need extreme performance or small code size, and know the C++ compiler will actually make it harder to accomplish due to the size and performance of the libraries
- You contend that you could just use the C subset and compile with a C++ compiler, but you'll find that if you do that you'll get slightly different results depending on the compiler.

Regardless, if you're doing that, you're using C. Is your question really "Why don't C programmers use C++ compilers?" If it is, then you either don't understand the language differences, or you don't understand compiler theory.

--

- Because they already know C
- Because they're building an embedded app for a platform that only has a C compiler
- Because they're maintaining legacy software written in C
- You're writing something on the level of an operating system, a relational database engine, or a retail 3D video game engine.
q-n-a  stackex  programming  engineering  pls  best-practices  impetus  checklists  c(pp)  systems  assembly  compilers  hardware  embedded  oss  links  study  evidence-based  devtools  performance  rant  expert-experience  types  blowhards  linux  git  vcs  debate  rhetoric  worse-is-better/the-right-thing  cracker-prog  multi  metal-to-virtual  interface-compatibility 
may 2019 by nhaliday
Cilk Hub
looks like this is run by Billy Moses and Leiserson (the L in CLRS)
mit  tools  programming  pls  plt  systems  c(pp)  libraries  compilers  performance  homepage  concurrency 
may 2019 by nhaliday
coding style - C++ code in header files - Stack Overflow
There is occasionally some merit to putting code in the header, this can allow more clever inlining by the compiler. But at the same time, it can destroy your compile times since all code has to be processed every time it is included by the compiler.

Finally, it is often annoying to have circular object relationships (sometimes desired) when all the code is in the headers.

Bottom line, you were right, he is wrong.

EDIT: I have been thinking about your question. There is one case where what he says is true: templates. Many newer "modern" libraries such as boost make heavy use of templates and are often "header only." However, this should only be done when dealing with templates, as there it is the only way to do it.
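A sketch of the template exception the answer describes (file name hypothetical): the compiler must see the template body at each point of instantiation, so the code has to live in the header.

    // clamp.h -- header-only by necessity: each translation unit instantiates
    // clamp_to<int>, clamp_to<double>, ... and needs the body to do so.
    #ifndef CLAMP_H
    #define CLAMP_H
    template <typename T>
    T clamp_to(T lo, T hi, T v) {
        if (v < lo) return lo;
        if (v > hi) return hi;
        return v;
    }
    #endif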
q-n-a  stackex  programming  best-practices  c(pp)  pls  compilers  types  code-organizing 
april 2019 by nhaliday
parsing - lexers vs parsers - Stack Overflow
Yes, they are very different in theory, and in implementation.

Lexers are used to recognize "words" that make up language elements, because the structure of such words is generally simple. Regular expressions are extremely good at handling this simpler structure, and there are very high-performance regular-expression matching engines used to implement lexers.

Parsers are used to recognize "structure" of a language phrases. Such structure is generally far beyond what "regular expressions" can recognize, so one needs "context sensitive" parsers to extract such structure. Context-sensitive parsers are hard to build, so the engineering compromise is to use "context-free" grammars and add hacks to the parsers ("symbol tables", etc.) to handle the context-sensitive part.

Neither lexing nor parsing technology is likely to go away soon.

They may be unified by deciding to use "parsing" technology to recognize "words", as is currently explored by so-called scannerless GLR parsers. That has a runtime cost, as you are applying more general machinery to what is often a problem that doesn't need it, and usually you pay for that in overhead. Where you have lots of free cycles, that overhead may not matter. If you process a lot of text, then the overhead does matter and classical regular expression parsers will continue to be used.
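A toy C++ sketch of the division of labor described above (made-up mini-language, not from the answer): the lexer is a simple state machine over characters producing tokens; the parser imposes grammatical structure over those tokens.

    #include <cctype>
    #include <string>
    #include <vector>

    enum class Tok { Num, Plus, End };
    struct Token { Tok kind; long value = 0; };

    // Lexer: regular structure over characters -> tokens
    std::vector<Token> lex(const std::string& src) {
        std::vector<Token> out;
        size_t i = 0;
        while (i < src.size()) {
            unsigned char c = src[i];
            if (std::isspace(c)) { ++i; }
            else if (std::isdigit(c)) {
                long v = 0;
                while (i < src.size() && std::isdigit((unsigned char)src[i]))
                    v = v * 10 + (src[i++] - '0');
                out.push_back({Tok::Num, v});
            } else if (c == '+') { out.push_back({Tok::Plus, 0}); ++i; }
            else { ++i; }  // toy: silently skip anything else
        }
        out.push_back({Tok::End, 0});
        return out;
    }

    // Parser: context-free structure over tokens (expr := num ('+' num)*)
    long parse_sum(const std::vector<Token>& ts) {
        size_t i = 0;
        long total = ts[i++].value;   // assumes input starts with a number
        while (ts[i].kind == Tok::Plus) { ++i; total += ts[i++].value; }
        return total;
    }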
q-n-a  stackex  programming  compilers  explanation  comparison  jargon  strings  syntax  lexical  automata-languages 
november 2017 by nhaliday
Panopticon: A libre, cross platform disassembler for reverse engineering | Hacker News
Great work. So that's one more disassembler in the wild. Some of the great open source ones, reverse oriented, that I have been able to test are:
Metasm: https://github.com/jjyg/metasm/
Radare: http://radare.org/
Capstone: http://www.capstone-engine.org/
Capstone is based on LLVM, which you cannot beat in terms of architectures and industrial quality, and has great plugins, which make it kind of my favorite.
Of course, for non-scripting needs, you need decent graphical interfaces, as provided by these ones:
IDA: https://www.hex-rays.com/products/ida/
Hopper: http://www.hopperapp.com/
IDA is clearly the best, but Hopper is a fair choice if you need it and you can't afford IDA license for personal use.
[Edit: add forgotten IDA and Hopper...]
- jbaviat

also it's a c++ -> rust port
software  tools  hn  commentary  systems  rust  programming  pls  compilers  c(pp)  multi 
may 2016 by nhaliday
Memory Leaks are Memory Safe | Huon on the internet
The best programs lie in the 👍 cells only: they manipulate valid things, and don’t manipulate invalid ones. Passable programs might also have some valid data that they don’t use (leak memory), but bad ones will try to use invalid data.

When a language, such as Rust, advertises itself as memory safe, it isn’t saying anything about whether memory leaks are impossible.

...

However, this was not correct in practice, as things like reference cycles and thread deadlock could cause memory to leak. Rust decided to make forget safe, focusing its guarantees on just preventing memory unsafety and instead making only best-effort attempts towards preventing memory leaks (like essentially all other languages, memory safe and otherwise).
systems  rust  programming  memory-management  explanation  compilers  pls  techtariat  computer-memory  safety  correctness  comparison 
april 2016 by nhaliday
Rob Pike: Notes on Programming in C
Issues of typography
...
Sometimes they care too much: pretty printers mechanically produce pretty output that accentuates irrelevant detail in the program, which is as sensible as putting all the prepositions in English text in bold font. Although many people think programs should look like the Algol-68 report (and some systems even require you to edit programs in that style), a clear program is not made any clearer by such presentation, and a bad program is only made laughable.
Typographic conventions consistently held are important to clear presentation, of course - indentation is probably the best known and most useful example - but when the ink obscures the intent, typography has taken over.

...

Naming
...
Finally, I prefer minimum-length but maximum-information names, and then let the context fill in the rest. Globals, for instance, typically have little context when they are used, so their names need to be relatively evocative. Thus I say maxphysaddr (not MaximumPhysicalAddress) for a global variable, but np not NodePointer for a pointer locally defined and used. This is largely a matter of taste, but taste is relevant to clarity.

...

Pointers
C is unusual in that it allows pointers to point to anything. Pointers are sharp tools, and like any such tool, used well they can be delightfully productive, but used badly they can do great damage (I sunk a wood chisel into my thumb a few days before writing this). Pointers have a bad reputation in academia, because they are considered too dangerous, dirty somehow. But I think they are powerful notation, which means they can help us express ourselves clearly.
Consider: When you have a pointer to an object, it is a name for exactly that object and no other.

...

Comments
A delicate matter, requiring taste and judgement. I tend to err on the side of eliminating comments, for several reasons. First, if the code is clear, and uses good type names and variable names, it should explain itself. Second, comments aren't checked by the compiler, so there is no guarantee they're right, especially after the code is modified. A misleading comment can be very confusing. Third, the issue of typography: comments clutter code.
But I do comment sometimes. Almost exclusively, I use them as an introduction to what follows.

...

Complexity
Most programs are too complicated - that is, more complex than they need to be to solve their problems efficiently. Why? Mostly it's because of bad design, but I will skip that issue here because it's a big one. But programs are often complicated at the microscopic level, and that is something I can address here.
Rule 1. You can't tell where a program is going to spend its time. Bottlenecks occur in surprising places, so don't try to second guess and put in a speed hack until you've proven that's where the bottleneck is.

Rule 2. Measure. Don't tune for speed until you've measured, and even then don't unless one part of the code overwhelms the rest.

Rule 3. Fancy algorithms are slow when n is small, and n is usually small. Fancy algorithms have big constants. Until you know that n is frequently going to be big, don't get fancy. (Even if n does get big, use Rule 2 first.) For example, binary trees are always faster than splay trees for workaday problems.

Rule 4. Fancy algorithms are buggier than simple ones, and they're much harder to implement. Use simple algorithms as well as simple data structures.

The following data structures are a complete list for almost all practical programs:

array
linked list
hash table
binary tree
Of course, you must also be prepared to collect these into compound data structures. For instance, a symbol table might be implemented as a hash table containing linked lists of arrays of characters.
Rule 5. Data dominates. If you've chosen the right data structures and organized things well, the algorithms will almost always be self-evident. Data structures, not algorithms, are central to programming. (See The Mythical Man-Month: Essays on Software Engineering by F. P. Brooks, page 102.)

Rule 6. There is no Rule 6.

Programming with data.
...
One of the reasons data-driven programs are not common, at least among beginners, is the tyranny of Pascal. Pascal, like its creator, believes firmly in the separation of code and data. It therefore (at least in its original form) has no ability to create initialized data. This flies in the face of the theories of Turing and von Neumann, which define the basic principles of the stored-program computer. Code and data are the same, or at least they can be. How else can you explain how a compiler works? (Functional languages have a similar problem with I/O.)

Function pointers
Another result of the tyranny of Pascal is that beginners don't use function pointers. (You can't have function-valued variables in Pascal.) Using function pointers to encode complexity has some interesting properties.
Some of the complexity is passed to the routine pointed to. The routine must obey some standard protocol - it's one of a set of routines invoked identically - but beyond that, what it does is its business alone. The complexity is distributed.

There is this idea of a protocol, in that all functions used similarly must behave similarly. This makes for easy documentation, testing, growth and even making the program run distributed over a network - the protocol can be encoded as remote procedure calls.

I argue that clear use of function pointers is the heart of object-oriented programming. Given a set of operations you want to perform on data, and a set of data types you want to respond to those operations, the easiest way to put the program together is with a group of function pointers for each type. This, in a nutshell, defines class and method. The O-O languages give you more of course - prettier syntax, derived types and so on - but conceptually they provide little extra.
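A hedged C-style sketch of that idea (mine, not Pike's code): a struct of function pointers per type is, in effect, the class; the data plus a pointer to that struct is the object.

    #include <cstdio>

    struct ShapeOps {                       // the "class": a protocol of operations
        double (*area)(const void* self);
        void   (*print)(const void* self);
    };
    struct Obj { const ShapeOps* ops; const void* data; };   // the "object"

    struct Rect { double w, h; };
    static double rect_area(const void* self) { const Rect* r = (const Rect*)self; return r->w * r->h; }
    static void   rect_print(const void* self) { std::printf("rect area=%g\n", rect_area(self)); }
    static const ShapeOps rect_ops = { rect_area, rect_print };

    void describe(Obj o) { o.ops->print(o.data); }   // caller follows the protocol only

    // usage: Rect r{3, 4}; describe(Obj{&rect_ops, &r});   -> prints "rect area=12"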

...

Include files
Simple rule: include files should never include include files. If instead they state (in comments or implicitly) what files they need to have included first, the problem of deciding which files to include is pushed to the user (programmer) but in a way that's easy to handle and that, by construction, avoids multiple inclusions. Multiple inclusions are a bane of systems programming. It's not rare to have files included five or more times to compile a single C source file. The Unix /usr/include/sys stuff is terrible this way.
There's a little dance involving #ifdef's that can prevent a file being read twice, but it's usually done wrong in practice - the #ifdef's are in the file itself, not the file that includes it. The result is often thousands of needless lines of code passing through the lexical analyzer, which is (in good compilers) the most expensive phase.

Just follow the simple rule.
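For concreteness, the two placements of the #ifdef dance being contrasted (hypothetical files); note that many modern compilers special-case the first form, so its practical cost is smaller today than when this was written:

    /* Common form: guard inside the header itself. The preprocessor still has to
       open and scan the whole file on every repeated #include. */
    /* foo.h */
    #ifndef FOO_H
    #define FOO_H
    struct Foo { int x; };
    #endif

    /* Pike's preferred placement: guard in the includer, so an already-seen
       header is never even opened again. */
    /* bar.c */
    #ifndef FOO_H
    #include "foo.h"
    #endif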

cf https://stackoverflow.com/questions/1101267/where-does-the-compiler-spend-most-of-its-time-during-parsing
First, I don't think it actually is true: in many compilers, most time is not spent in lexing source code. For example, in C++ compilers (e.g. g++), most time is spent in semantic analysis, in particular in overload resolution (trying to find out what implicit template instantiations to perform). Also, in C and C++, most time is often spent in optimization (creating graph representations of individual functions or the whole translation unit, and then running long algorithms on these graphs).

When comparing lexical and syntactical analysis, it may indeed be the case that lexical analysis is more expensive. This is because both use state machines, i.e. there is a fixed number of actions per element, but the number of elements is much larger in lexical analysis (characters) than in syntactical analysis (tokens).

https://news.ycombinator.com/item?id=7728207
programming  systems  philosophy  c(pp)  summer-2014  intricacy  engineering  rhetoric  contrarianism  diogenes  parsimony  worse-is-better/the-right-thing  data-structures  list  algorithms  stylized-facts  essay  ideas  performance  functional  state  pls  oop  gotchas  blowhards  duplication  compilers  syntax  lexical  checklists  metabuch  lens  notation  thinking  neurons  guide  pareto  heuristic  time  cost-benefit  multi  q-n-a  stackex  plt  hn  commentary  minimalism  techtariat  rsc  writing  technical-writing  cracker-prog  code-organizing  grokkability  protocol-metadata  direct-indirect  grokkability-clarity  latency-throughput 
august 2014 by nhaliday

