nhaliday + scaling-tech   44

Why is Software Engineering so difficult? - James Miller
basic message: No silver bullet!

most interesting nuggets:
Scale and Complexity
- Windows 7 > 50 million LOC
Expect a staggering number of bugs.

- Well-written C and C++ code contains some 5 to 10 errors per 100 LOC after a clean compile, but before inspection and testing.
- At a 5% rate any 50 MLOC program will start off with some 2.5 million bugs.
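
The arithmetic behind the two bullets above, as a minimal sketch. The function name `initial_bug_estimate` is made up for illustration; the 50 MLOC size and the 5 to 10 errors per 100 LOC range come from the slides.

```python
def initial_bug_estimate(lines_of_code: int, errors_per_100_loc: float) -> float:
    """Expected defects after a clean compile, before inspection and testing."""
    return lines_of_code * errors_per_100_loc / 100


LOC = 50_000_000  # 50 MLOC, roughly the Windows 7 figure quoted above

low = initial_bug_estimate(LOC, 5)    # 5 errors per 100 LOC (the 5% rate)
high = initial_bug_estimate(LOC, 10)  # 10 errors per 100 LOC

print(f"{low:,.0f} to {high:,.0f} initial bugs")
```

The low end reproduces the 2.5 million figure; the high end of the quoted range doubles it.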

Bug removal
- Testing typically exercises only half the code.

Better bug removal?
- “There are better ways to do testing that do produce fantastic programs.”
- Are we sure about this fact?
* No, it's only an opinion!
* In general Software Engineering has ....

So why not do this?
- The costs are unbelievable.
- It’s not unusual for the qualification process to produce a half page of documentation for each line of code.
pdf  slides  engineering  nitty-gritty  programming  best-practices  roots  comparison  cost-benefit  software  systematic-ad-hoc  structure  error  frontier  debugging  checking  formal-methods  context  detail-architecture  intricacy  big-picture  system-design  correctness  scale  scaling-tech  shipping  money  data  stylized-facts  street-fighting  objektbuch  pro-rata  estimate  pessimism  degrees-of-freedom  volo-avolo  no-go  things  thinking  summary  quality  density 
may 2019 by nhaliday
How much would it cost to crawl 1 billion sites using rented AWS servers/bandwidth? - Quora
The best way IMHO to do such a crawl would be to recruit a group of say 100-1000 of your friends, and their friends, and write a simple distributed app running in background on their machines, when they sit idle or are lightly used. This way you will be amortizing their monthly broadband bills, with their monthly quotas (e.g. Comcast 250GB) largely unused anyway. I would think that you can get dozens of Mbps of cross bandwidth in such a network, which could do the job in a matter of months.

BTW, if you really meant 1 billion sites, as opposed to pages, multiply the above bills by 100x (average number of pages per site).
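
A rough back-of-envelope check of the "matter of months" claim, as a sketch only. The average page size (100 KB) and the 50 Mbps reading of "dozens of Mbps" are my assumptions, not figures from the answer; the 100 pages per site multiplier is the answer's own.

```python
SECONDS_PER_DAY = 86_400


def crawl_days(pages: float, avg_page_bytes: float, bandwidth_mbps: float) -> float:
    """Days needed to download `pages` pages at a sustained aggregate bandwidth."""
    total_bits = pages * avg_page_bytes * 8
    seconds = total_bits / (bandwidth_mbps * 1_000_000)
    return seconds / SECONDS_PER_DAY


AVG_PAGE_BYTES = 100_000  # assumption: ~100 KB per page, not from the source
BANDWIDTH_MBPS = 50       # assumption: one reading of "dozens of Mbps"
PAGES_PER_SITE = 100      # the answer's average number of pages per site

pages_days = crawl_days(1e9, AVG_PAGE_BYTES, BANDWIDTH_MBPS)
sites_days = crawl_days(1e9 * PAGES_PER_SITE, AVG_PAGE_BYTES, BANDWIDTH_MBPS)

print(f"1 billion pages: about {pages_days:.0f} days")
print(f"1 billion sites: about {sites_days / 365:.0f} years")
```

Under these assumptions a billion pages takes around six months, consistent with the answer, while a billion sites at the same bandwidth would take decades, so the 100x multiplier applies to time as well as to the bills.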


There is no need for you to crawl. Someone has already done the job for you. Common Crawl https://commoncrawl.org/ is a periodic crawl of the internet, and the results are stored in Amazon S3. You can directly use the results without any charge for any kind of analysis you want to do.
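
One way to start using those results is to look up captures in the Common Crawl URL index. This is a hedged sketch: the helper `cc_index_query_url` is hypothetical, the endpoint shape (`index.commoncrawl.org/<crawl-id>-index` with `url` and `output=json` parameters) reflects my understanding of the public index server and should be checked against the Common Crawl documentation, and the crawl id `CC-MAIN-2019-22` is only an example value.

```python
from urllib.parse import urlencode


def cc_index_query_url(crawl_id: str, url_pattern: str) -> str:
    """Build an index lookup URL for one crawl (assumed endpoint shape)."""
    params = urlencode({"url": url_pattern, "output": "json"})
    return f"https://index.commoncrawl.org/{crawl_id}-index?{params}"


# Example values only; substitute a crawl id listed on commoncrawl.org.
query_url = cc_index_query_url("CC-MAIN-2019-22", "commoncrawl.org/*")
print(query_url)
```

The sketch only builds the URL and makes no network request; fetching it and reading the returned records is left to whatever HTTP client is already in use.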
q-n-a  qra  quixotic  programming  engineering  search  minimum-viable  internet  web  huge-data-the-biggest  howto  init  advice  money  cost-benefit  strategy  scaling-tech  system-design 
may 2019 by nhaliday
- the genetic book of the dead [Dawkins]
- complementarity [Frank Wilczek]
- relative information
- effective theory [Lisa Randall]
- affordances [Dennett]
- spontaneous symmetry breaking
- relatedly, equipoise [Nicholas Christakis]
- case-based reasoning
- population reasoning (eg, common law)
- criticality [Cesar Hidalgo]
- Haldane's law of the right size (!SCALE!)
- polygenic scores
- non-ergodic
- ansatz
- state [Aaronson]: http://www.scottaaronson.com/blog/?p=3075
- transfer learning
- effect size
- satisficing
- scaling
- the breeder's equation [Greg Cochran]
- impedance matching

- reciprocal altruism
- life history [Plomin]
- intellectual honesty [Sam Harris]
- coalitional instinct (interesting claim: building coalitions around "rationality" actually makes it more difficult to update on new evidence as it makes you look like a bad person, eg, the Cathedral)
basically same: https://twitter.com/ortoiseortoise/status/903682354367143936

more: https://www.edge.org/conversation/john_tooby-coalitional-instincts

interesting timing. how woke is this dude?
org:edge  2017  technology  discussion  trends  list  expert  science  top-n  frontier  multi  big-picture  links  the-world-is-just-atoms  metameta  🔬  scitariat  conceptual-vocab  coalitions  q-n-a  psychology  social-psych  anthropology  instinct  coordination  duty  power  status  info-dynamics  cultural-dynamics  being-right  realness  cooperate-defect  westminster  chart  zeitgeist  rot  roots  epistemic  rationality  meta:science  analogy  physics  electromag  geoengineering  environment  atmosphere  climate-change  waves  information-theory  bits  marginal  quantum  metabuch  homo-hetero  thinking  sapiens  genetics  genomics  evolution  bio  GT-101  low-hanging  minimum-viable  dennett  philosophy  cog-psych  neurons  symmetry  humility  life-history  social-structure  GWAS  behavioral-gen  biodet  missing-heritability  ergodic  machine-learning  generalization  west-hunter  population-genetics  methodology  blowhards  spearhead  group-level  scale  magnitude  business  scaling-tech  tech  business-models  optimization  effect-size  aaronson  state  bare-hands  problem-solving  politics 
may 2017 by nhaliday

