nhaliday + metal-to-virtual   48

Parallel Computing: Theory and Practice
by Umut Acar who also co-authored a different book on parallel algorithms w/ Guy Blelloch from a more high-level and functional perspective
unit  books  cmu  cs  programming  tcs  algorithms  concurrency  c(pp)  divide-and-conquer  libraries  complexity  time-complexity  data-structures  orders  graphs  graph-theory  trees  models  functional  metal-to-virtual  systems 
yesterday by nhaliday
Two Performance Aesthetics: Never Miss a Frame and Do Almost Nothing - Tristan Hume
I’ve noticed when I think about performance nowadays that I think in terms of two different aesthetics. One aesthetic, which I’ll call Never Miss a Frame, comes from the world of game development and is focused on writing code that has good worst case performance by making good use of the hardware. The other aesthetic, which I’ll call Do Almost Nothing comes from a more academic world and is focused on algorithmically minimizing the work that needs to be done to the extent that there’s barely any work left, paying attention to the performance at all scales.

[ed.: Neither of these exactly matches TCS performance PoV but latter is closer (the focus on diffs is kinda weird).]

...

Never Miss a Frame

In game development the most important performance criteria is that your game doesn’t miss frame deadlines. You have a target frame rate and if you miss the deadline for the screen to draw a new frame your users will notice the jank. This leads to focusing on the worst case scenario and often having fixed maximum limits for various quantities. This property can also be important in areas other than game development, like other graphical applications, real-time audio, safety-critical systems and many embedded systems. A similar dynamic occurs in distributed systems where one server needs to query 100 others and combine the results, you’ll wait for the slowest of the 100 every time so speeding up some of them doesn’t make the query faster, and queries occasionally taking longer (e.g because of garbage collection) will impact almost every request!

...

In this kind of domain you’ll often run into situations where in the worst case you can’t avoid processing a huge number of things. This means you need to focus your effort on making the best use of the hardware by writing code at a low level and paying attention to properties like cache size and memory bandwidth.

Projects with inviolable deadlines need to adjust different factors than speed if the code runs too slow. For example a game might decrease the size of a level or use a more efficient but less pretty rendering technique.

Aesthetically: Data should be tightly packed, fixed size, and linear. Transcoding data to and from different formats is wasteful. Strings and their variable lengths and inefficient operations must be avoided. Only use tools that allow you to work at a low level, even if they’re annoying, because that’s the only way you can avoid piles of fixed costs making everything slow. Understand the machine and what your code does to it.

Personally I identify this aesthetic most with Jonathan Blow. He has a very strong personality and I’ve watched enough of videos of him that I find imagining “What would Jonathan Blow say?” as a good way to tap into this aesthetic. My favourite articles about designs following this aesthetic are on the Our Machinery Blog.

...

Do Almost Nothing

Sometimes, it’s important to be as fast as you can in all cases and not just orient around one deadline. The most common case is when you simply have to do something that’s going to take an amount of time noticeable to a human, and if you can make that time shorter in some situations that’s great. Alternatively each operation could be fast but you may run a server that runs tons of them and you’ll save on server costs if you can decrease the load of some requests. Another important case is when you care about power use, for example your text editor not rapidly draining a laptop’s battery, in this case you want to do the least work you possibly can.

A key technique for this approach is to never recompute something from scratch when it’s possible to re-use or patch an old result. This often involves caching: keeping a store of recent results in case the same computation is requested again.

The ultimate realization of this aesthetic is for the entire system to deal only in differences between the new state and the previous state, updating data structures with only the newly needed data and discarding data that’s no longer needed. This way each part of the system does almost no work because ideally the difference from the previous state is very small.

Aesthetically: Data must be in whatever structure scales best for the way it is accessed, lots of trees and hash maps. Computations are graphs of inputs and results so we can use all our favourite graph algorithms to optimize them! Designing optimal systems is hard so you should use whatever tools you can to make it easier, any fixed cost they incur will be made negligible when you optimize away all the work they need to do.

Personally I identify this aesthetic most with my friend Raph Levien and his articles about the design of the Xi text editor, although Raph also appreciates the other aesthetic and taps into it himself sometimes.

...

_I’m conflating the axes of deadline-oriented vs time-oriented and low-level vs algorithmic optimization, but part of my point is that while they are different, I think these axes are highly correlated._

...

Text Editors

Sublime Text is a text editor that mostly follows the Never Miss a Frame approach. ...

The Xi Editor is designed to solve this problem by being designed from the ground up to grapple with the fact that some operations, especially those interacting with slow compilers written by other people, can’t be made instantaneous. It does this using a fancy asynchronous plugin model and lots of fancy data structures.
...

...

Compilers

Jonathan Blow’s Jai compiler is clearly designed with the Never Miss a Frame aesthetic. It’s written to be extremely fast at every level, and the language doesn’t have any features that necessarily lead to slow compiles. The LLVM backend wasn’t fast enough to hit his performance goals so he wrote an alternative backend that directly writes x86 code to a buffer without doing any optimizations. Jai compiles something like 100,000 lines of code per second. Designing both the language and compiler to not do anything slow lead to clean build performance 10-100x faster than other commonly-used compilers. Jai is so fast that its clean builds are faster than most compilers incremental builds on common project sizes, due to limitations in how incremental the other compilers are.

However, Jai’s compiler is still O(n) in the codebase size where incremental compilers can be O(n) in the size of the change. Some compilers like the work-in-progress rust-analyzer and I think also Roslyn for C# take a different approach and focus incredibly hard on making everything fully incremental. For small changes (the common case) this can let them beat Jai and respond in milliseconds on arbitrarily large projects, even if they’re slower on clean builds.

Conclusion
I find both of these aesthetics appealing, but I also think there’s real trade-offs that incentivize leaning one way or the other for a given project. I think people having different performance aesthetics, often because one aesthetic really is better suited for their domain, is the source of a lot of online arguments about making fast systems. The different aesthetics also require different bases of knowledge to pursue, like knowledge of data-oriented programming in C++ vs knowledge of abstractions for incrementality like Adapton, so different people may find that one approach seems way easier and better for them than the other.

I try to choose how to dedicate my effort to pursuing each aesthetics on a per project basis by trying to predict how effort in each direction would help. Some projects I know if I code it efficiently it will always hit the performance deadline, others I know a way to drastically cut down on work by investing time in algorithmic design, some projects need a mix of both. Personally I find it helpful to think of different programmers where I have a good sense of their aesthetic and ask myself how they’d solve the problem. One reason I like Rust is that it can do both low-level optimization and also has a good ecosystem and type system for algorithmic optimization, so I can more easily mix approaches in one project. In the end the best approach to follow depends not only on the task, but your skills or the skills of the team working on it, as well as how much time you have to work towards an ambitious design that may take longer for a better result.
techtariat  reflection  things  comparison  aesthetics  lens  programming  engineering  cracker-prog  carmack  games  performance  big-picture  system-design  constraint-satisfaction  metrics  telos-atelos  worst-case  distributed  incentives  concurrency  cost-benefit  tradeoffs  systems  metal-to-virtual  latency-throughput  abstraction  marginal  caching  editors  strings  ideas  ui  common-case  examples  applications  flux-stasis  nitty-gritty  ends-means  thinking  summary  correlation  degrees-of-freedom  c(pp)  rust  interface  integration-extension 
13 days ago by nhaliday
multithreading - C++11 introduced a standardized memory model. What does it mean? And how is it going to affect C++ programming? - Stack Overflow
I like the analogy of abandonment of sequential consistency to special relativity tho I think (emphasis on *think*, not know...) GR might be actually be the more appropriate one
q-n-a  stackex  programming  pls  c(pp)  systems  metal-to-virtual  computer-memory  concurrency  intricacy  nitty-gritty  analogy  comparison  physics  relativity  advanced 
5 weeks ago by nhaliday
The Law of Leaky Abstractions – Joel on Software
[TCP/IP example]

All non-trivial abstractions, to some degree, are leaky.

...

- Something as simple as iterating over a large two-dimensional array can have radically different performance if you do it horizontally rather than vertically, depending on the “grain of the wood” — one direction may result in vastly more page faults than the other direction, and page faults are slow. Even assembly programmers are supposed to be allowed to pretend that they have a big flat address space, but virtual memory means it’s really just an abstraction, which leaks when there’s a page fault and certain memory fetches take way more nanoseconds than other memory fetches.

- The SQL language is meant to abstract away the procedural steps that are needed to query a database, instead allowing you to define merely what you want and let the database figure out the procedural steps to query it. But in some cases, certain SQL queries are thousands of times slower than other logically equivalent queries. A famous example of this is that some SQL servers are dramatically faster if you specify “where a=b and b=c and a=c” than if you only specify “where a=b and b=c” even though the result set is the same. You’re not supposed to have to care about the procedure, only the specification. But sometimes the abstraction leaks and causes horrible performance and you have to break out the query plan analyzer and study what it did wrong, and figure out how to make your query run faster.

...

- C++ string classes are supposed to let you pretend that strings are first-class data. They try to abstract away the fact that strings are hard and let you act as if they were as easy as integers. Almost all C++ string classes overload the + operator so you can write s + “bar” to concatenate. But you know what? No matter how hard they try, there is no C++ string class on Earth that will let you type “foo” + “bar”, because string literals in C++ are always char*’s, never strings. The abstraction has sprung a leak that the language doesn’t let you plug. (Amusingly, the history of the evolution of C++ over time can be described as a history of trying to plug the leaks in the string abstraction. Why they couldn’t just add a native string class to the language itself eludes me at the moment.)

- And you can’t drive as fast when it’s raining, even though your car has windshield wipers and headlights and a roof and a heater, all of which protect you from caring about the fact that it’s raining (they abstract away the weather), but lo, you have to worry about hydroplaning (or aquaplaning in England) and sometimes the rain is so strong you can’t see very far ahead so you go slower in the rain, because the weather can never be completely abstracted away, because of the law of leaky abstractions.

One reason the law of leaky abstractions is problematic is that it means that abstractions do not really simplify our lives as much as they were meant to. When I’m training someone to be a C++ programmer, it would be nice if I never had to teach them about char*’s and pointer arithmetic. It would be nice if I could go straight to STL strings. But one day they’ll write the code “foo” + “bar”, and truly bizarre things will happen, and then I’ll have to stop and teach them all about char*’s anyway.

...

The law of leaky abstractions means that whenever somebody comes up with a wizzy new code-generation tool that is supposed to make us all ever-so-efficient, you hear a lot of people saying “learn how to do it manually first, then use the wizzy tool to save time.” Code generation tools which pretend to abstract out something, like all abstractions, leak, and the only way to deal with the leaks competently is to learn about how the abstractions work and what they are abstracting. So the abstractions save us time working, but they don’t save us time learning.
techtariat  org:com  working-stiff  essay  programming  cs  software  abstraction  worrydream  thinking  intricacy  degrees-of-freedom  networking  examples  traces  no-go  volo-avolo  tradeoffs  c(pp)  pls  strings  dbs  transportation  driving  analogy  aphorism  learning  paradox  systems  elegance  nitty-gritty  concrete  cracker-prog  metal-to-virtual  protocol-metadata 
10 weeks ago by nhaliday
Computer latency: 1977-2017
If we look at overall results, the fastest machines are ancient. Newer machines are all over the place. Fancy gaming rigs with unusually high refresh-rate displays are almost competitive with machines from the late 70s and early 80s, but “normal” modern computers can’t compete with thirty to forty year old machines.

...

If we exclude the game boy color, which is a different class of device than the rest, all of the quickest devices are Apple phones or tablets. The next quickest device is the blackberry q10. Although we don’t have enough data to really tell why the blackberry q10 is unusually quick for a non-Apple device, one plausible guess is that it’s helped by having actual buttons, which are easier to implement with low latency than a touchscreen. The other two devices with actual buttons are the gameboy color and the kindle 4.

After that iphones and non-kindle button devices, we have a variety of Android devices of various ages. At the bottom, we have the ancient palm pilot 1000 followed by the kindles. The palm is hamstrung by a touchscreen and display created in an era with much slower touchscreen technology and the kindles use e-ink displays, which are much slower than the displays used on modern phones, so it’s not surprising to see those devices at the bottom.

...

Almost every computer and mobile device that people buy today is slower than common models of computers from the 70s and 80s. Low-latency gaming desktops and the ipad pro can get into the same range as quick machines from thirty to forty years ago, but most off-the-shelf devices aren’t even close.

If we had to pick one root cause of latency bloat, we might say that it’s because of “complexity”. Of course, we all know that complexity is bad. If you’ve been to a non-academic non-enterprise tech conference in the past decade, there’s a good chance that there was at least one talk on how complexity is the root of all evil and we should aspire to reduce complexity.

Unfortunately, it's a lot harder to remove complexity than to give a talk saying that we should remove complexity. A lot of the complexity buys us something, either directly or indirectly. When we looked at the input of a fancy modern keyboard vs. the apple 2 keyboard, we saw that using a relatively powerful and expensive general purpose processor to handle keyboard inputs can be slower than dedicated logic for the keyboard, which would both be simpler and cheaper. However, using the processor gives people the ability to easily customize the keyboard, and also pushes the problem of “programming” the keyboard from hardware into software, which reduces the cost of making the keyboard. The more expensive chip increases the manufacturing cost, but considering how much of the cost of these small-batch artisanal keyboards is the design cost, it seems like a net win to trade manufacturing cost for ease of programming.

...

If you want a reference to compare the kindle against, a moderately quick page turn in a physical book appears to be about 200 ms.

https://twitter.com/gravislizard/status/927593460642615296
almost everything on computers is perceptually slower than it was in 1983
https://archive.is/G3D5K
https://archive.is/vhDTL
https://archive.is/a3321
https://archive.is/imG7S
techtariat  dan-luu  performance  time  hardware  consumerism  objektbuch  data  history  reflection  critique  software  roots  tainter  engineering  nitty-gritty  ui  ux  hci  ios  mobile  apple  amazon  sequential  trends  increase-decrease  measure  analysis  measurement  os  systems  IEEE  intricacy  desktop  benchmarks  rant  carmack  system-design  degrees-of-freedom  keyboard  terminal  editors  links  input-output  networking  world  s:**  multi  twitter  social  discussion  tech  programming  web  internet  speed  backup  worrydream  interface  metal-to-virtual  latency-throughput  workflow 
10 weeks ago by nhaliday
Which of Haskell and OCaml is more practical? For example, in which aspect will each play a key role? - Quora
- Tikhon Jelvis,

Haskell.

This is a question I'm particularly well-placed to answer because I've spent quite a bit of time with both Haskell and OCaml, seeing both in the real world (including working at Jane Street for a bit). I've also seen the languages in academic settings and know many people at startups using both languages. This gives me a good perspective on both languages, with a fairly similar amount of experience in the two (admittedly biased towards Haskell).

And so, based on my own experience rather than the languages' reputations, I can confidently say it's Haskell.

Parallelism and Concurrency

...

Libraries

...

Typeclasses vs Modules

...

In some sense, OCaml modules are better behaved and founded on a sounder theory than Haskell typeclasses, which have some serious drawbacks. However, the fact that typeclasses can be reliably inferred whereas modules have to be explicitly used all the time more than makes up for this. Moreover, extensions to the typeclass system enable much of the power provided by OCaml modules.

...

Of course, OCaml has some advantages of its own as well. It has a performance profile that's much easier to predict. The module system is awesome and often missed in Haskell. Polymorphic variants can be very useful for neatly representing certain situations, and don't have an obvious Haskell analog.

While both languages have a reasonable C FFI, OCaml's seems a bit simpler. It's hard for me to say this with any certainty because I've only used the OCaml FFI myself, but it was quite easy to use—a hard bar for Haskell's to clear. One really nice use of modules in OCaml is to pass around values directly from C as abstract types, which can help avoid extra marshalling/unmarshalling; that seemed very nice in OCaml.

However, overall, I still think Haskell is the more practical choice. Apart from the reasoning above, I simply have my own observations: my Haskell code tends to be clearer, simpler and shorter than my OCaml code. I'm also more productive in Haskell. Part of this is certainly a matter of having more Haskell experience, but the delta is limited especially as I'm working at my third OCaml company. (Of course, the first two were just internships.)

Both Haskell and OCaml are uniquivocally superb options—miles ahead of any other languages I know. While I do prefer Haskell, I'd choose either one in a pinch.

--
I've looked at F# a bit, but it feels like it makes too many tradeoffs to be on .NET. You lose the module system, which is probably OCaml's best feature, in return for an unfortunate, nominally typed OOP layer.

I'm also not invested in .NET at all: if anything, I'd prefer to avoid it in favor of simplicity. I exclusively use Linux and, from the outside, Mono doesn't look as good as it could be. I'm also far more likely to interoperate with a C library than a .NET library.

If I had some additional reason to use .NET, I'd definitely go for F#, but right now I don't.

https://www.reddit.com/r/haskell/comments/3huexy/what_are_haskellers_critiques_of_f_and_ocaml/
https://www.reddit.com/r/haskell/comments/3huexy/what_are_haskellers_critiques_of_f_and_ocaml/cub5mmb/
Thinking about it now, it boils down to a single word: expressiveness. When I'm writing OCaml, I feel more constrained than when I'm writing Haskell. And that's important: unlike so many others, what first attracted me to Haskell was expressiveness, not safety. It's easier for me to write code that looks how I want it to look in Haskell. The upper bound on code quality is higher.

...

Perhaps it all boils down to OCaml and its community feeling more "worse is better" than Haskell, something I highly disfavor.

...

Laziness or, more strictly, non-strictness is big. A controversial start, perhaps, but I stand by it. Unlike some, I do not see non-strictness as a design mistake but as a leap in abstraction. Perhaps a leap before its time, but a leap nonetheless. Haskell lets me program without constantly keeping the code's order in my head. Sure, it's not perfect and sometimes performance issues jar the illusion, but they are the exception not the norm. Coming from imperative languages where order is omnipresent (I can't even imagine not thinking about execution order as I write an imperative program!) it's incredibly liberating, even accounting for the weird issues and jinks I'd never see in a strict language.

This is what I imagine life felt like with the first garbage collectors: they may have been slow and awkward, the abstraction might have leaked here and there, but, for all that, it was an incredible advance. You didn't have to constantly think about memory allocation any more. It took a lot of effort to get where we are now and garbage collectors still aren't perfect and don't fit everywhere, but it's hard to imagine the world without them. Non-strictness feels like it has the same potential, without anywhere near the work garbage collection saw put into it.

...

The other big thing that stands out are typeclasses. OCaml might catch up on this front with implicit modules or it might not (Scala implicits are, by many reports, awkward at best—ask Edward Kmett about it, not me) but, as it stands, not having them is a major shortcoming. Not having inference is a bigger deal than it seems: it makes all sorts of idioms we take for granted in Haskell awkward in OCaml which means that people simply don't use them. Haskell's typeclasses, for all their shortcomings (some of which I find rather annoying), are incredibly expressive.

In Haskell, it's trivial to create your own numeric type and operators work as expected. In OCaml, while you can write code that's polymorphic over numeric types, people simply don't. Why not? Because you'd have to explicitly convert your literals and because you'd have to explicitly open a module with your operators—good luck using multiple numeric types in a single block of code! This means that everyone uses the default types: (63/31-bit) ints and doubles. If that doesn't scream "worse is better", I don't know what does.

...

There's more. Haskell's effect management, brought up elsewhere in this thread, is a big boon. It makes changing things more comfortable and makes informal reasoning much easier. Haskell is the only language where I consistently leave code I visit better than I found it. Even if I hadn't worked on the project in years. My Haskell code has better longevity than my OCaml code, much less other languages.

http://blog.ezyang.com/2011/02/ocaml-gotchas/
One observation about purity and randomness: I think one of the things people frequently find annoying in Haskell is the fact that randomness involves mutation of state, and thus be wrapped in a monad. This makes building probabilistic data structures a little clunkier, since you can no longer expose pure interfaces. OCaml is not pure, and as such you can query the random number generator whenever you want.

However, I think Haskell may get the last laugh in certain circumstances. In particular, if you are using a random number generator in order to generate random test cases for your code, you need to be able to reproduce a particular set of random tests. Usually, this is done by providing a seed which you can then feed back to the testing script, for deterministic behavior. But because OCaml's random number generator manipulates global state, it's very easy to accidentally break determinism by asking for a random number for something unrelated. You can work around it by manually bracketing the global state, but explicitly handling the randomness state means providing determinism is much more natural.
q-n-a  qra  programming  pls  engineering  nitty-gritty  pragmatic  functional  haskell  ocaml-sml  dotnet  types  arrows  cost-benefit  tradeoffs  concurrency  libraries  performance  expert-experience  composition-decomposition  comparison  critique  multi  reddit  social  discussion  techtariat  reflection  review  random  data-structures  numerics  rand-approx  sublinear  syntax  volo-avolo  causation  scala  jvm  ecosystem  metal-to-virtual 
11 weeks ago by nhaliday
Hardware is unforgiving
Today, anyone with a CS 101 background can take Geoffrey Hinton's course on neural networks and deep learning, and start applying state of the art machine learning techniques in production within a couple months. In software land, you can fix minor bugs in real time. If it takes a whole day to run your regression test suite, you consider yourself lucky because it means you're in one of the few environments that takes testing seriously. If the architecture is fundamentally flawed, you pull out your copy of Feathers' “Working Effectively with Legacy Code” and you apply minor fixes until you're done.

This isn't to say that software isn't hard, it's just a different kind of hard: the sort of hard that can be attacked with genius and perseverance, even without experience. But, if you want to build a ship, and you "only" have a decade of experience with carpentry, milling, metalworking, etc., well, good luck. You're going to need it. With a large ship, “minor” fixes can take days or weeks, and a fundamental flaw means that your ship sinks and you've lost half a year of work and tens of millions of dollars. By the time you get to something with the complexity of a modern high-performance microprocessor, a minor bug discovered in production costs three months and five million dollars. A fundamental flaw in the architecture will cost you five years and hundreds of millions of dollars2.

Physical mistakes are costly. There's no undo and editing isn't simply a matter of pressing some keys; changes consume real, physical resources. You need enough wisdom and experience to avoid common mistakes entirely – especially the ones that can't be fixed.
techtariat  comparison  software  hardware  programming  engineering  nitty-gritty  realness  roots  explanans  startups  tech  sv  the-world-is-just-atoms  examples  stories  economics  heavy-industry  hard-tech  cs  IEEE  oceans  trade  korea  asia  recruiting  britain  anglo  expert-experience  growth-econ  world  developing-world  books  recommendations  intricacy  dan-luu  age-generation  system-design  correctness  metal-to-virtual 
june 2019 by nhaliday
One week of bugs
If I had to guess, I'd say I probably work around hundreds of bugs in an average week, and thousands in a bad week. It's not unusual for me to run into a hundred new bugs in a single week. But I often get skepticism when I mention that I run into multiple new (to me) bugs per day, and that this is inevitable if we don't change how we write tests. Well, here's a log of one week of bugs, limited to bugs that were new to me that week. After a brief description of the bugs, I'll talk about what we can do to improve the situation. The obvious answer to spend more effort on testing, but everyone already knows we should do that and no one does it. That doesn't mean it's hopeless, though.

...

Here's where I'm supposed to write an appeal to take testing more seriously and put real effort into it. But we all know that's not going to work. It would take 90k LOC of tests to get Julia to be as well tested as a poorly tested prototype (falsely assuming linear complexity in size). That's two person-years of work, not even including time to debug and fix bugs (which probably brings it closer to four of five years). Who's going to do that? No one. Writing tests is like writing documentation. Everyone already knows you should do it. Telling people they should do it adds zero information1.

Given that people aren't going to put any effort into testing, what's the best way to do it?

Property-based testing. Generative testing. Random testing. Concolic Testing (which was done long before the term was coined). Static analysis. Fuzzing. Statistical bug finding. There are lots of options. Some of them are actually the same thing because the terminology we use is inconsistent and buggy. I'm going to arbitrarily pick one to talk about, but they're all worth looking into.

...

There are a lot of great resources out there, but if you're just getting started, I found this description of types of fuzzers to be one of those most helpful (and simplest) things I've read.

John Regehr has a udacity course on software testing. I haven't worked through it yet (Pablo Torres just pointed to it), but given the quality of Dr. Regehr's writing, I expect the course to be good.

For more on my perspective on testing, there's this.

https://hypothesis.works/articles/the-purpose-of-hypothesis/
From the perspective of a user, the purpose of Hypothesis is to make it easier for you to write better tests.

From my perspective as the primary author, that is of course also a purpose of Hypothesis. I write a lot of code, it needs testing, and the idea of trying to do that without Hypothesis has become nearly unthinkable.

But, on a large scale, the true purpose of Hypothesis is to drag the world kicking and screaming into a new and terrifying age of high quality software.

Software is everywhere. We have built a civilization on it, and it’s only getting more prevalent as more services move online and embedded and “internet of things” devices become cheaper and more common.

Software is also terrible. It’s buggy, it’s insecure, and it’s rarely well thought out.

This combination is clearly a recipe for disaster.

The state of software testing is even worse. It’s uncontroversial at this point that you should be testing your code, but it’s a rare codebase whose authors could honestly claim that they feel its testing is sufficient.

Much of the problem here is that it’s too hard to write good tests. Tests take up a vast quantity of development time, but they mostly just laboriously encode exactly the same assumptions and fallacies that the authors had when they wrote the code, so they miss exactly the same bugs that you missed when they wrote the code.

Preventing the Collapse of Civilization [video]: https://news.ycombinator.com/item?id=19945452
- Jonathan Blow

NB: DevGAMM is a game industry conference

- loss of technological knowledge (Antikythera mechanism, aqueducts, etc.)
- hardware driving most gains, not software
- software's actually less robust, often poorly designed and overengineered these days
- *list of bugs he's encountered recently*:
https://youtu.be/pW-SOdj4Kkk?t=1387
- knowledge of trivia becomes more than general, deep knowledge
- does at least acknowledge value of DRY, reusing code, abstraction saving dev time
techtariat  dan-luu  tech  software  error  list  debugging  linux  github  robust  checking  oss  troll  lol  aphorism  webapp  email  google  facebook  games  julia  pls  compilers  communication  mooc  browser  rust  programming  engineering  random  jargon  formal-methods  expert-experience  prof  c(pp)  course  correctness  hn  commentary  video  presentation  carmack  pragmatic  contrarianism  pessimism  sv  unix  rhetoric  critique  worrydream  hardware  performance  trends  multiplicative  roots  impact  comparison  history  iron-age  the-classics  mediterranean  conquest-empire  gibbon  technology  the-world-is-just-atoms  flux-stasis  increase-decrease  graphics  hmm  idk  systems  os  abstraction  intricacy  worse-is-better/the-right-thing  build-packaging  microsoft  osx  apple  reflection  assembly  things  knowledge  detail-architecture  thick-thin  trivia  info-dynamics  caching  frameworks  generalization  systematic-ad-hoc  universalism-particularism  analytical-holistic  structure  tainter  libraries  tradeoffs  prepping  threat-modeling  network-structure  writing  risk  local-glob 
may 2019 by nhaliday
When to use C over C++, and C++ over C? - Software Engineering Stack Exchange
You pick C when
- you need portable assembler (which is what C is, really) for whatever reason,
- your platform doesn't provide C++ (a C compiler is much easier to implement),
- you need to interact with other languages that can only interact with C (usually the lowest common denominator on any platform) and your code consists of little more than the interface, not making it worth to lay a C interface over C++ code,
- you hack in an Open Source project (many of which, for various reasons, stick to C),
- you don't know C++.
In all other cases you should pick C++.

--

At the same time, I have to say that @Toll's answers (for one obvious example) have things just about backwards in most respects. Reasonably written C++ will generally be at least as fast as C, and often at least a little faster. Readability is generally much better, if only because you don't get buried in an avalanche of all the code for even the most trivial algorithms and data structures, all the error handling, etc.

...

As it happens, C and C++ are fairly frequently used together on the same projects, maintained by the same people. This allows something that's otherwise quite rare: a study that directly, objectively compares the maintainability of code written in the two languages by people who are equally competent overall (i.e., the exact same people). At least in the linked study, one conclusion was clear and unambiguous: "We found that using C++ instead of C results in improved software quality and reduced maintenance effort..."

--

(Side-note: Check out Linus Torvads' rant on why he prefers C to C++. I don't necessarily agree with his points, but it gives you insight into why people might choose C over C++. Rather, people that agree with him might choose C for these reasons.)

http://harmful.cat-v.org/software/c++/linus

Why would anybody use C over C++? [closed]: https://stackoverflow.com/questions/497786/why-would-anybody-use-c-over-c
Joel's answer is good for reasons you might have to use C, though there are a few others:
- You must meet industry guidelines, which are easier to prove and test for in C.
- You have tools to work with C, but not C++ (think not just about the compiler, but all the support tools, coverage, analysis, etc)
- Your target developers are C gurus
- You're writing drivers, kernels, or other low level code
- You know the C++ compiler isn't good at optimizing the kind of code you need to write
- Your app not only doesn't lend itself to be object oriented, but would be harder to write in that form

In some cases, though, you might want to use C rather than C++:
- You want the performance of assembler without the trouble of coding in assembler (C++ is, in theory, capable of 'perfect' performance, but the compilers aren't as good at seeing optimizations a good C programmer will see)
- The software you're writing is trivial, or nearly so - whip out the tiny C compiler, write a few lines of code, compile and you're all set - no need to open a huge editor with helpers, no need to write practically empty and useless classes, deal with namespaces, etc. You can do nearly the same thing with a C++ compiler and simply use the C subset, but the C++ compiler is slower, even for tiny programs.
- You need extreme performance or small code size, and know the C++ compiler will actually make it harder to accomplish due to the size and performance of the libraries
- You contend that you could just use the C subset and compile with a C++ compiler, but you'll find that if you do that you'll get slightly different results depending on the compiler.

Regardless, if you're doing that, you're using C. Is your question really "Why don't C programmers use C++ compilers?" If it is, then you either don't understand the language differences, or you don't understand compiler theory.

--

- Because they already know C
- Because they're building an embedded app for a platform that only has a C compiler
- Because they're maintaining legacy software written in C
- You're writing something on the level of an operating system, a relational database engine, or a retail 3D video game engine.
q-n-a  stackex  programming  engineering  pls  best-practices  impetus  checklists  c(pp)  systems  assembly  compilers  hardware  embedded  oss  links  study  evidence-based  devtools  performance  rant  expert-experience  types  blowhards  linux  git  vcs  debate  rhetoric  worse-is-better/the-right-thing  cracker-prog  multi  metal-to-virtual 
may 2019 by nhaliday
What is "vectorization"? - Stack Overflow
Many CPUs have "vector" or "SIMD" instruction sets which apply the same operation simultaneously to two, four, or more pieces of data. Modern x86 chips have the SSE instructions, many PPC chips have the "Altivec" instructions, and even some ARM chips have a vector instruction set, called NEON.

"Vectorization" (simplified) is the process of rewriting a loop so that instead of processing a single element of an array N times, it processes (say) 4 elements of the array simultaneously N/4 times.

(I chose 4 because it's what modern hardware is most likely to directly support; the term "vectorization" is also used to describe a higher level software transformation where you might just abstract away the loop altogether and just describe operating on arrays instead of the elements that comprise them)
q-n-a  stackex  programming  systems  performance  concurrency  numerics  metal-to-virtual  assembly 
april 2019 by nhaliday
Recitation 25: Data locality and B-trees
The same idea can be applied to trees. Binary trees are not good for locality because a given node of the binary tree probably occupies only a fraction of a cache line. B-trees are a way to get better locality. As in the hash table trick above, we store several elements in a single node -- as many as will fit in a cache line.

B-trees were originally invented for storing data structures on disk, where locality is even more crucial than with memory. Accessing a disk location takes about 5ms = 5,000,000ns. Therefore if you are storing a tree on disk you want to make sure that a given disk read is as effective as possible. B-trees, with their high branching factor, ensure that few disk reads are needed to navigate to the place where data is stored. B-trees are also useful for in-memory data structures because these days main memory is almost as slow relative to the processor as disk drives were when B-trees were introduced!
nibble  org:junk  org:edu  cornell  lecture-notes  exposition  programming  engineering  systems  dbs  caching  performance  memory-management  os  computer-memory  metal-to-virtual 
september 2017 by nhaliday
Anatomy of an SQL Index: What is an SQL Index
“An index makes the query fast” is the most basic explanation of an index I have ever seen. Although it describes the most important aspect of an index very well, it is—unfortunately—not sufficient for this book. This chapter describes the index structure in a less superficial way but doesn't dive too deeply into details. It provides just enough insight for one to understand the SQL performance aspects discussed throughout the book.

B-trees, etc.
techtariat  tutorial  explanation  performance  programming  engineering  dbs  trees  data-structures  nibble  caching  metal-to-virtual  abstraction  applications 
september 2017 by nhaliday

related tags

:)  abstraction  accretion  accuracy  acm  additive  advanced  advice  aesthetics  age-generation  aging  algorithmic-econ  algorithms  amazon  analogy  analysis  analytical-holistic  anglo  aphorism  app  apple  applicability-prereqs  applications  arbitrage  arrows  art  asia  assembly  atoms  backup  bangbang  benchmarks  best-practices  big-picture  blog  blowhards  books  britain  browser  build-packaging  business  c(pp)  caching  caltech  career  carmack  causation  chart  cheatsheet  checking  checklists  civilization  cmu  code-dive  code-organizing  commentary  common-case  communication  comparison  competition  compilers  complex-systems  complexity  composition-decomposition  computation  computer-memory  concrete  concurrency  confluence  conquest-empire  constraint-satisfaction  consumerism  contrarianism  cool  cornell  correctness  correlation  cost-benefit  coupling-cohesion  course  cracker-prog  critique  crypto  cs  culture  dan-luu  dark-arts  data  data-structures  dataviz  dbs  debate  debugging  degrees-of-freedom  desktop  detail-architecture  developing-world  devops  devtools  diogenes  direct-indirect  discussion  distributed  divide-and-conquer  documentation  dotnet  driving  dynamic  economics  ecosystem  editors  electromag  elegance  email  embedded  ends-means  engineering  erik-demaine  error  error-handling  essay  estimate  evidence-based  examples  expert-experience  explanans  explanation  exposition  extrema  facebook  flux-stasis  formal-methods  fourier  frameworks  frequency  frontier  functional  game-theory  games  generalization  gibbon  git  github  golang  google  graph-theory  graphics  graphs  ground-up  growth-econ  guide  hacker  hard-tech  hardware  haskell  hci  heavy-industry  history  hmm  hn  homepage  howto  ideas  idk  IEEE  impact  impetus  incentives  increase-decrease  info-dynamics  info-foraging  infographic  init  input-output  integration-extension  interface  internet  interview-prep  intricacy  investing  ios  iron-age  jargon  javascript  jobs  julia  jvm  keyboard  knowledge  korea  latency-throughput  learning  learning-theory  lecture-notes  lectures  legacy  lens  let-me-see  levers  libraries  links  linux  lisp  list  local-global  lol  long-short-run  machine-learning  magnitude  management  marginal  math  math.CA  math.CO  measure  measurement  mechanism-design  mediterranean  memory-management  metal-to-virtual  metrics  microsoft  minimalism  minimum-viable  mit  mobile  models  mooc  multi  multiplicative  network-structure  networking  nibble  nitty-gritty  no-go  nostalgia  numerics  objektbuch  ocaml-sml  oceans  ocw  oop  orders  org:com  org:edu  org:junk  org:med  os  oscillation  oss  osx  overflow  p:null  p:someday  paradox  parsimony  paste  people  performance  pessimism  philosophy  physics  pic  play  pls  plt  pragmatic  prediction  prepping  presentation  pro-rata  probability  problem-solving  productivity  prof  profile  programming  protocol-metadata  python  q-n-a  qra  quantitative-qualitative  questions  quixotic  rand-approx  random  rant  reading  realness  recommendations  recruiting  reddit  reference  reflection  relativity  repo  retrofit  review  rhetoric  risk  robust  roots  rust  s:**  s:***  scala  scale  scaling-tech  sci-comp  security  sequential  series  simulation  slides  social  software  speed  stackex  startups  static-dynamic  stories  stream  street-fighting  strings  structure  study  sublinear  summary  summer-2014  sv  syntax  synthesis  system-design  systematic-ad-hoc  systems  tainter  tcs  tech  tech-infrastructure  technology  techtariat  telos-atelos  terminal  the-classics  the-world-is-just-atoms  thick-thin  things  thinking  threat-modeling  tidbits  time  time-complexity  tip-of-tongue  tools  top-n  traces  trade  tradeoffs  transportation  trees  trends  tricks  trivia  troll  tutorial  twitter  types  ui  unit  universalism-particularism  unix  ux  vcs  video  virtualization  visual-understanding  visualization  volo-avolo  waves  web  webapp  workflow  working-stiff  world  worrydream  worse-is-better/the-right-thing  worst-case  writing  yoga  🔬  🖥 

Copy this bookmark:



description:


tags: