nhaliday + course   172

One week of bugs
If I had to guess, I'd say I probably work around hundreds of bugs in an average week, and thousands in a bad week. It's not unusual for me to run into a hundred new bugs in a single week. But I often get skepticism when I mention that I run into multiple new (to me) bugs per day, and that this is inevitable if we don't change how we write tests. Well, here's a log of one week of bugs, limited to bugs that were new to me that week. After a brief description of the bugs, I'll talk about what we can do to improve the situation. The obvious answer is to spend more effort on testing, but everyone already knows we should do that and no one does it. That doesn't mean it's hopeless, though.


Here's where I'm supposed to write an appeal to take testing more seriously and put real effort into it. But we all know that's not going to work. It would take 90k LOC of tests to get Julia to be as well tested as a poorly tested prototype (falsely assuming linear complexity in size). That's two person-years of work, not even including time to debug and fix bugs (which probably brings it closer to four or five years). Who's going to do that? No one. Writing tests is like writing documentation. Everyone already knows you should do it. Telling people they should do it adds zero information.

Given that people aren't going to put any effort into testing, what's the best way to do it?

Property-based testing. Generative testing. Random testing. Concolic Testing (which was done long before the term was coined). Static analysis. Fuzzing. Statistical bug finding. There are lots of options. Some of them are actually the same thing because the terminology we use is inconsistent and buggy. I'm going to arbitrarily pick one to talk about, but they're all worth looking into.
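As a rough illustration of what random (property-based) testing looks like in practice, here is a minimal sketch using only the Python standard library. The function under test (`run_length_encode` / `run_length_decode`) and the helper `find_counterexample` are hypothetical names invented for this example, not taken from the linked post or from any particular library; real tools add input shrinking and much smarter generators.

```python
import random


def run_length_encode(s):
    """Hypothetical function under test: 'aaab' -> [('a', 3), ('b', 1)]."""
    out = []
    for ch in s:
        if out and out[-1][0] == ch:
            out[-1] = (ch, out[-1][1] + 1)
        else:
            out.append((ch, 1))
    return out


def run_length_decode(pairs):
    """Inverse of run_length_encode."""
    return "".join(ch * n for ch, n in pairs)


def random_string(rng, max_len=20, alphabet="ab"):
    # A tiny alphabet makes repeated runs (the interesting case) likely.
    return "".join(rng.choice(alphabet) for _ in range(rng.randint(0, max_len)))


def find_counterexample(prop, gen, trials=1000, seed=0):
    """Return the first generated input for which prop is false, else None."""
    rng = random.Random(seed)  # fixed seed so a failure is reproducible
    for _ in range(trials):
        x = gen(rng)
        if not prop(x):
            return x
    return None


def roundtrip_holds(s):
    # Property: decoding an encoded string gives back the original.
    return run_length_decode(run_length_encode(s)) == s


if __name__ == "__main__":
    print(find_counterexample(roundtrip_holds, random_string))
```

The point of the sketch is that the test states a property ("encode then decode is the identity") rather than a handful of hand-picked examples, so the inputs are not limited to the cases the author thought of while writing the code.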


There are a lot of great resources out there, but if you're just getting started, I found this description of types of fuzzers to be one of the most helpful (and simplest) things I've read.

John Regehr has a Udacity course on software testing. I haven't worked through it yet (Pablo Torres just pointed to it), but given the quality of Dr. Regehr's writing, I expect the course to be good.

For more on my perspective on testing, there's this.

From the perspective of a user, the purpose of Hypothesis is to make it easier for you to write better tests.

From my perspective as the primary author, that is of course also a purpose of Hypothesis. I write a lot of code, it needs testing, and the idea of trying to do that without Hypothesis has become nearly unthinkable.

But, on a large scale, the true purpose of Hypothesis is to drag the world kicking and screaming into a new and terrifying age of high quality software.

Software is everywhere. We have built a civilization on it, and it’s only getting more prevalent as more services move online and embedded and “internet of things” devices become cheaper and more common.

Software is also terrible. It’s buggy, it’s insecure, and it’s rarely well thought out.

This combination is clearly a recipe for disaster.

The state of software testing is even worse. It’s uncontroversial at this point that you should be testing your code, but it’s a rare codebase whose authors could honestly claim that they feel its testing is sufficient.

Much of the problem here is that it’s too hard to write good tests. Tests take up a vast quantity of development time, but they mostly just laboriously encode exactly the same assumptions and fallacies that the authors had when they wrote the code, so they miss exactly the same bugs that the authors missed when they wrote the code.

Preventing the Collapse of Civilization [video]: https://news.ycombinator.com/item?id=19945452
- Jonathan Blow

NB: DevGAMM is a game industry conference

- loss of technological knowledge (Antikythera mechanism, aqueducts, etc.)
- hardware driving most gains, not software
- software's actually less robust, often poorly designed and overengineered these days
- *list of bugs he's encountered recently*:
- knowledge of trivia becomes more important than general, deep knowledge
- does at least acknowledge value of DRY, reusing code, abstraction saving dev time
techtariat  dan-luu  tech  software  error  list  debugging  linux  github  robust  checking  oss  troll  lol  aphorism  webapp  email  google  facebook  games  julia  pls  compilers  communication  mooc  browser  rust  programming  engineering  random  jargon  formal-methods  expert-experience  prof  c(pp)  course  correctness  hn  commentary  video  presentation  carmack  pragmatic  contrarianism  pessimism  sv  unix  rhetoric  critique  worrydream  hardware  performance  trends  multiplicative  roots  impact  comparison  history  iron-age  the-classics  mediterranean  conquest-empire  gibbon  technology  the-world-is-just-atoms  flux-stasis  increase-decrease  graphics  hmm  idk  systems  os  abstraction  intricacy  worse-is-better/the-right-thing  build-packaging  microsoft  osx  apple  reflection  assembly  things  knowledge  detail-architecture  thick-thin  trivia  info-dynamics  caching  frameworks  generalization  systematic-ad-hoc  universalism-particularism  analytical-holistic  structure  tainter  libraries  tradeoffs  prepping  threat-modeling  network-structure  writing  risk  local-glob 
7 weeks ago by nhaliday
Stat 260/CS 294: Bayesian Modeling and Inference
- Priors (conjugate, noninformative, reference)
- Hierarchical models, spatial models, longitudinal models, dynamic models, survival models
- Testing
- Model choice
- Inference (importance sampling, MCMC, sequential Monte Carlo)
- Nonparametric models (Dirichlet processes, Gaussian processes, neutral-to-the-right processes, completely random measures)
- Decision theory and frequentist perspectives (complete class theorems, consistency, empirical Bayes)
- Experimental design
unit  course  berkeley  expert  michael-jordan  machine-learning  acm  bayesian  probability  stats  lecture-notes  priors-posteriors  markov  monte-carlo  frequentist  latent-variables  decision-theory  expert-experience  confidence  sampling 
july 2017 by nhaliday
6.896: Essential Coding Theory
- probabilistic method and Chernoff bound for Shannon coding
- probabilistic method for asymptotically good Hamming codes (Gilbert coding)
- sparsity used for LDPC codes
mit  course  yoga  tcs  complexity  coding-theory  math.AG  fields  polynomials  pigeonhole-markov  linear-algebra  probabilistic-method  lecture-notes  bits  sparsity  concentration-of-measure  linear-programming  linearity  expanders  hamming  pseudorandomness  crypto  rigorous-crypto  communication-complexity  no-go  madhu-sudan  shannon  unit  p:**  quixotic  advanced 
february 2017 by nhaliday
CS 731 Advanced Artificial Intelligence - Spring 2011
- statistical machine learning
- sparsity in regression
- graphical models
- exponential families
- variational methods
- dimensionality reduction, eg, PCA
- Bayesian nonparametrics
- compressive sensing, matrix completion, and Johnson-Lindenstrauss
course  lecture-notes  yoga  acm  stats  machine-learning  graphical-models  graphs  model-class  bayesian  learning-theory  sparsity  embeddings  markov  monte-carlo  norms  unit  nonparametric  compressed-sensing  matrix-factorization  features 
january 2017 by nhaliday