nhaliday + programming   508

quality - Is the average number of bugs per loc the same for different programming languages? - Software Engineering Stack Exchange
Contrary to intuition, the number of errors per 1000 lines of code does seem to be relatively constant, regardless of the specific language involved. Steve McConnell, author of Code Complete and Software Estimation: Demystifying the Black Art, goes over this area in some detail.

I don't have my copies readily to hand - they're sitting on my bookshelf at work - but a quick Google found a relevant quote:

Industry Average: "about 15 - 50 errors per 1000 lines of delivered code."
(Steve) further says this is usually representative of code that has some level of structured programming behind it, but probably includes a mix of coding techniques.

Quoted from Code Complete, found here: http://mayerdan.com/ruby/2012/11/11/bugs-per-line-of-code-ratio/

If memory serves correctly, Steve goes into a thorough discussion of this, showing that the figures are constant across languages (C, C++, Java, Assembly and so on) and despite difficulties (such as defining what "line of code" means).

Most importantly he has lots of citations for his sources - he's not offering unsubstantiated opinions, but has the references to back them up.
q-n-a  stackex  programming  engineering  nitty-gritty  error  flux-stasis  books  recommendations  software  checking  debugging  pro-rata  pls  comparison  parsimony  measure 
1 hour ago by nhaliday
coding style - C++ code in header files - Stack Overflow
There is occasionally some merit to putting code in the header, this can allow more clever inlining by the compiler. But at the same time, it can destroy your compile times since all code has to be processed every time it is included by the compiler.

Finally, it is often annoying to have circular object relationships (sometimes desired) when all the code is in the headers.

Bottom line, you were right, he is wrong.

EDIT: I have been thinking about your question. There is one case where what he says is true: templates. Many newer "modern" libraries such as boost make heavy use of templates and are often "header only." However, this should only be done for templates, because putting the code in the header is the only way to make them work.
q-n-a  stackex  programming  best-practices  c(pp)  pls  compilers  types 
3 days ago by nhaliday
c - What REALLY happens when you don't free after malloc? - Stack Overflow
keep this stuff in mind when writing competition stuff, can usually just omit deletes/frees unless you're really running up against the memory limit:
Just about every modern operating system will recover all the allocated memory space after a program exits.


On the other hand, the similar admonition to close your files on exit has a much more concrete result - if you don't, the data you wrote to them might not get flushed, or if they're a temp file, they might not get deleted when you're done. Also, database handles should have their transactions committed and then closed when you're done with them. Similarly, if you're using an object oriented language like C++ or Objective C, not freeing an object when you're done with it will mean the destructor will never get called, and any resources the class is responsible for might not get cleaned up.


I really consider this answer wrong. One should always deallocate resources after one is done with them, be it file handles, memory, or mutexes. By having that habit, one will not make that sort of mistake when building servers. Some servers are expected to run 24x7. In those cases, any leak of any sort means that your server will eventually run out of that resource and hang or crash in some way. For a short utility program, yeah, a leak isn't that bad. For any server, any leak is death. Do yourself a favor. Clean up after yourself. It's a good habit.


Allocation Myth 4: Non-garbage-collected programs should always deallocate all memory they allocate.

The Truth: Omitted deallocations in frequently executed code cause growing leaks. They are rarely acceptable. But programs that retain most allocated memory until program exit often perform better without any intervening deallocation. Malloc is much easier to implement if there is no free.

In most cases, deallocating memory just before program exit is pointless. The OS will reclaim it anyway. Free will touch and page in the dead objects; the OS won't.

Consequence: Be careful with "leak detectors" that count allocations. Some "leaks" are good!
q-n-a  stackex  programming  memory-management  performance  systems  c(pp)  oly-programming 
12 days ago by nhaliday
parsing - lexers vs parsers - Stack Overflow
Yes, they are very different in theory, and in implementation.

Lexers are used to recognize "words" that make up language elements, because the structure of such words is generally simple. Regular expressions are extremely good at handling this simpler structure, and there are very high-performance regular-expression matching engines used to implement lexers.

Parsers are used to recognize the "structure" of a language's phrases. Such structure is generally far beyond what "regular expressions" can recognize, so one needs "context sensitive" parsers to extract such structure. Context-sensitive parsers are hard to build, so the engineering compromise is to use "context-free" grammars and add hacks to the parsers ("symbol tables", etc.) to handle the context-sensitive part.

Neither lexing nor parsing technology is likely to go away soon.

They may be unified by deciding to use "parsing" technology to recognize "words", as is currently explored by so-called scannerless GLR parsers. That has a runtime cost, as you are applying more general machinery to what is often a problem that doesn't need it, and usually you pay for that in overhead. Where you have lots of free cycles, that overhead may not matter. If you process a lot of text, then the overhead does matter and classical regular expression parsers will continue to be used.
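
A minimal sketch of the split described above, not taken from the answer: one regular expression does the lexing, and a small recursive-descent parser recovers the nested structure that a regular expression cannot. The grammar (integers, + - * /, parentheses) and all names (tokenize, Parser, parse) are made up for illustration.

```python
import re

# Lexer: one regular expression is enough to recognize the "words".
TOKEN_RE = re.compile(r"\s*(?:(\d+)|(.))")

def tokenize(text):
    """Yield (kind, value) pairs; the kinds 'num' and 'op' are illustrative."""
    for number, op in TOKEN_RE.findall(text):
        if number:
            yield ("num", int(number))
        elif op.strip():              # skip stray trailing whitespace
            yield ("op", op)

class Parser:
    """Recursive descent over: expr := term (('+'|'-') term)*,
    term := factor (('*'|'/') factor)*, factor := num | '(' expr ')'."""

    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.pos = 0

    def peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return ("end", None)

    def take(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        node = self.term()
        while self.peek() in (("op", "+"), ("op", "-")):
            op = self.take()[1]
            node = (op, node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek() in (("op", "*"), ("op", "/")):
            op = self.take()[1]
            node = (op, node, self.factor())
        return node

    def factor(self):
        kind, value = self.take()
        if kind == "num":
            return value
        if (kind, value) == ("op", "("):
            node = self.expr()        # nesting: beyond what a regex can track
            if self.take() != ("op", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {value!r}")

def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.expr()
    if parser.peek()[0] != "end":
        raise SyntaxError("trailing input")
    return tree

tree = parse("1 + 2 * (3 - 4)")   # ('+', 1, ('*', 2, ('-', 3, 4)))
```

The lexer never needs to know about parentheses matching; only the parser's recursion handles that.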
q-n-a  stackex  programming  compilers  automata  explanation  comparison  jargon  strings 
november 2017 by nhaliday
python - Short Description of the Scoping Rules? - Stack Overflow
Actually, a concise rule for Python Scope resolution, from Learning Python, 3rd. Ed.. (These rules are specific to variable names, not attributes. If you reference it without a period, these rules apply)

LEGB Rule.

L, Local — Names assigned in any way within a function (def or lambda), and not declared global in that function.

E, Enclosing-function locals — Name in the local scope of any and all statically enclosing functions (def or lambda), from inner to outer.

G, Global (module) — Names assigned at the top-level of a module file, or by executing a global statement in a def within the file.

B, Built-in (Python) — Names preassigned in the built-in names module : open,range,SyntaxError,...

As a caveat to Global access - reading a global variable can happen without explicit declaration, but assigning to it without first declaring it with a global statement (global var_name) will instead create a new local variable.


Essentially, the only thing in Python that introduces a new scope is a function definition. Classes are a bit of a special case in that anything defined directly in the body is placed in the class's namespace, but they are not directly accessible from within the methods (or nested classes) they contain.
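
The rules above can be checked with a short script. This is just an illustrative sketch; every name in it (outer, inner, Config, and so on) is made up for the example.

```python
x = "global"              # G: assigned at module top level

def outer():
    x = "enclosing"       # E: local to outer, enclosing scope for the nested defs
    def inner():
        x = "local"       # L: assigned inside inner
        return x
    def reads_enclosing():
        return x          # no local x, so the enclosing one is found
    return inner(), reads_enclosing()

def reads_global():
    return x              # no local or enclosing x, so the module-level one is found

def shadows_global():
    x = "shadow"          # plain assignment creates a new local; module x is untouched
    return x

def rebinds_global():
    global x              # the global statement makes the assignment hit module x
    x = "rebound"

class Config:
    limit = 3             # lives in the class namespace

    def get(self):
        return self.limit # reachable through the instance or the class

    def get_bare(self):
        return limit      # NameError: the class body is not an enclosing scope

print(outer())            # ('local', 'enclosing')
print(len("abc"))         # B: len is found in the built-in names
```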
q-n-a  stackex  programming  intricacy  gotchas  python  pls  objektbuch  cheatsheet 
november 2017 by nhaliday
Homebrew: List only installed top level formulas - Stack Overflow
Use brew leaves: show installed formulae that are not dependencies of another installed formula.
q-n-a  stackex  howto  yak-shaving  programming  osx  terminal  network-structure  graphs  trivia  tip-of-tongue  workflow  build-packaging 
november 2017 by nhaliday
awk - Assigning system command's output to variable - Stack Overflow
awk 'BEGIN {"date" | getline mydate; close("date"); print "returns", mydate}'
q-n-a  stackex  howto  yak-shaving  terminal  programming  gotchas 
november 2017 by nhaliday
Is the keyboard faster than the mouse?

It’s entirely possible that the mysterious studies Tog’s org spent $50M on prove that the mouse is faster than the keyboard for all tasks other than raw text input, but there doesn’t appear to be enough information to tell what the actual studies were. There are many public studies on user input, but I couldn’t find any that are relevant to whether or not I should use the mouse more or less at the margin.

When I look at various tasks myself, the results are mixed, and they’re mixed in the way that most programmers I polled predicted. This result is so boring that it would barely be worth mentioning if not for the large groups of people who believe that either the keyboard is always faster than the mouse or vice versa.

Please let me know if there are relevant studies on this topic that I should read! I’m not familiar with the relevant fields, so it’s possible that I’m searching with the wrong keywords and reading the wrong papers.
techtariat  dan-luu  engineering  programming  productivity  workflow  hci  hardware  working-stiff  benchmarks 
november 2017 by nhaliday
functions - What are the use cases for different scoping constructs? - Mathematica Stack Exchange
As you mentioned there are many things to consider and a detailed discussion is possible. But here are some rules of thumb that I apply the majority of the time:

Module[{x}, ...] is the safest and may be needed if either

There are existing definitions for x that you want to avoid breaking during the evaluation of the Module, or
There is existing code that relies on x being undefined (for example code like Integrate[..., x]).
Module is also the only choice for creating and returning a new symbol. In particular, Module is sometimes needed in advanced Dynamic programming for this reason.

If you are confident there aren't important existing definitions for x or any code relying on it being undefined, then Block[{x}, ...] is often faster. (Note that, in a project entirely coded by you, being confident of these conditions is a reasonable "encapsulation" standard that you may wish to enforce anyway, and so Block is often a sound choice in these situations.)

With[{x = ...}, expr] is the only scoping construct that injects the value of x inside Hold[...]. This is useful and important. With can be either faster or slower than Block depending on expr and the particular evaluation path that is taken. With is less flexible, however, since you can't change the definition of x inside expr.
q-n-a  stackex  programming  CAS  trivia  howto  best-practices  checklists 
november 2017 by nhaliday
design patterns - What is MVC, really? - Software Engineering Stack Exchange
The model manages fundamental behaviors and data of the application. It can respond to requests for information, respond to instructions to change the state of its information, and even to notify observers in event-driven systems when information changes. This could be a database, or any number of data structures or storage systems. In short, it is the data and data-management of the application.

The view effectively provides the user interface element of the application. It'll render data from the model into a form that is suitable for the user interface.

The controller receives user input and makes calls to model objects and the view to perform appropriate actions.


Though this answer has 21 upvotes, I find the sentence "This could be a database, or any number of data structures or storage systems. (tl;dr : it's the data and data-management of the application)" horrible. The model is the pure business/domain logic. And this can and should be so much more than data management of an application. I also differentiate between domain logic and application logic. A controller should not ever contain business/domain logic or talk to a database directly.
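
A rough sketch of the three roles, assuming a toy counter application; the class names and the "inc" command are invented for illustration. In line with the comment above, the one domain rule sits in the model and the controller only translates input into model calls.

```python
class CounterModel:
    """Model: holds state, enforces the domain rule, notifies observers."""

    def __init__(self):
        self.value = 0
        self._observers = []

    def subscribe(self, callback):
        self._observers.append(callback)

    def increment(self, amount):
        if amount <= 0:                      # domain rule lives in the model
            raise ValueError("amount must be positive")
        self.value += amount
        for notify in self._observers:       # event-driven update of the views
            notify(self.value)

class TextView:
    """View: renders model data into a user-facing form."""

    def __init__(self):
        self.lines = []

    def render(self, value):
        self.lines.append(f"count = {value}")

class Controller:
    """Controller: turns user input into calls on the model."""

    def __init__(self, model):
        self.model = model

    def handle(self, command):
        if command == "inc":
            self.model.increment(1)

model = CounterModel()
view = TextView()
model.subscribe(view.render)
controller = Controller(model)
controller.handle("inc")
controller.handle("inc")
print(view.lines)   # ['count = 1', 'count = 2']
```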
q-n-a  stackex  explanation  concept  conceptual-vocab  structure  composition-decomposition  programming  engineering  best-practices  pragmatic  jargon  thinking  metabuch  working-stiff  tech  🖥  checklists 
october 2017 by nhaliday
man page - Wikipedia
The name of the command or function, followed by a one-line description of what it does.
In the case of a command, a formal description of how to run it and what command line options it takes. For program functions, a list of the parameters the function takes and which header file contains its definition.
A textual description of the functioning of the command or function.
Some examples of common usage.
A list of related commands or functions.
explanation  programming  engineering  documentation  howto  terminal  unix  wiki  reference  cheatsheet  trivia  info-foraging 
september 2017 by nhaliday
Recitation 25: Data locality and B-trees
The same idea can be applied to trees. Binary trees are not good for locality because a given node of the binary tree probably occupies only a fraction of a cache line. B-trees are a way to get better locality. As in the hash table trick above, we store several elements in a single node -- as many as will fit in a cache line.

B-trees were originally invented for storing data structures on disk, where locality is even more crucial than with memory. Accessing a disk location takes about 5ms = 5,000,000ns. Therefore if you are storing a tree on disk you want to make sure that a given disk read is as effective as possible. B-trees, with their high branching factor, ensure that few disk reads are needed to navigate to the place where data is stored. B-trees are also useful for in-memory data structures because these days main memory is almost as slow relative to the processor as disk drives were when B-trees were introduced!
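
A back-of-the-envelope sketch of why the branching factor matters. It assumes a full, balanced tree, one disk read per level, and no caching; the 5 ms figure is the one quoted above, while the key counts and node sizes are made-up inputs.

```python
DISK_READ_MS = 5   # per-access cost quoted in the notes above

def levels_needed(num_keys, keys_per_node):
    """Smallest number of levels at which a full tree can hold num_keys keys."""
    branching = keys_per_node + 1          # a node with k keys has k + 1 children
    levels, capacity = 0, 0
    while capacity < num_keys:
        levels += 1
        capacity = branching ** levels - 1 # keys in a full tree of that height
    return levels

binary_levels = levels_needed(1_000_000, keys_per_node=1)     # 20
btree_levels = levels_needed(1_000_000, keys_per_node=255)    # 3

print(binary_levels * DISK_READ_MS, "ms vs", btree_levels * DISK_READ_MS, "ms")
```

Under these assumptions a lookup among a million keys costs about 20 reads in a binary tree and about 3 in a tree with 255 keys per node.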
nibble  org:junk  org:edu  cornell  lecture-notes  exposition  programming  engineering  systems  dbs  caching  performance  memory-management  os 
september 2017 by nhaliday
Anatomy of an SQL Index: What is an SQL Index
“An index makes the query fast” is the most basic explanation of an index I have ever seen. Although it describes the most important aspect of an index very well, it is—unfortunately—not sufficient for this book. This chapter describes the index structure in a less superficial way but doesn't dive too deeply into details. It provides just enough insight for one to understand the SQL performance aspects discussed throughout the book.

B-trees, etc.
techtariat  tutorial  explanation  performance  programming  engineering  dbs  trees  data-structures  nibble 
september 2017 by nhaliday
bash - How to find/replace and increment a matched number with sed/awk? - Stack Overflow
The /e flag lets you pass the matched part to an external command and substitute the command's output. GNU sed only.
why you need to get first and last part of lines: https://unix.stackexchange.com/questions/180783/sed-e-and-g-flags-not-working-together
That is a bit tortuously written. What it means is that, after the completion of a s/// command for this line, if there was a change, the (new) line is executed as a command and its output used as the replacement for this line.

example of what I had to do to get this to work w/ embedded quotes:
gsed -E 's/^\("(.*)", ([0-9]+)(.*)/echo "(\\\\"\1\\\\", $((\2+54))\3"/e'
maps ("foo", 3... -> ("foo", 57..
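
For comparison, a sketch of the same transformation without shelling out, using Python's re.sub with a replacement function. This is an alternative technique, not what sed does internally; the function name and the delta parameter are made up, and 54 is the offset from the command above.

```python
import re

# Same shape as the sed pattern: ("<name>", <number><rest of line>
PATTERN = re.compile(r'^\("(.*)", ([0-9]+)(.*)')

def add_to_count(line, delta=54):
    """Add delta to the first number that follows the quoted name."""
    def bump(match):
        name, number, rest = match.groups()
        return f'("{name}", {int(number) + delta}{rest}'
    return PATTERN.sub(bump, line, count=1)

print(add_to_count('("foo", 3, "bar")'))   # ("foo", 57, "bar")
```

Because the arithmetic happens in the callback, no quoting for a shell is needed.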
q-n-a  stackex  programming  howto  terminal  unix  yak-shaving  multi  gotchas 
september 2017 by nhaliday
linux - How do I replace the last occurrence of a character in a string using sed? - Unix & Linux Stack Exchange
You can do it with single command:

sed 's/\(.*\)-/\1 /'
The point is that sed is very greedy, so \(.*\) matches as many characters before the - as possible, including any earlier - characters.
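
The same greedy behaviour can be seen with Python's re module; a small sketch with a made-up function name, plus a non-greedy variant for contrast.

```python
import re

def replace_last_dash(text):
    # Greedy (.*) swallows everything up to the LAST "-" on the line.
    return re.sub(r"(.*)-", r"\1 ", text, count=1)

def replace_first_dash(text):
    # Non-greedy (.*?) stops at the FIRST "-" instead.
    return re.sub(r"(.*?)-", r"\1 ", text, count=1)

print(replace_last_dash("a-b-c"))    # a-b c
print(replace_first_dash("a-b-c"))   # a b-c
```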
q-n-a  stackex  howto  workflow  yak-shaving  terminal  unix  programming 
august 2017 by nhaliday
unix - How to split a delimited string into an array in awk? - Stack Overflow
To split a string to an array in awk we use the function split():

awk '{split($0, a, ":")}'
#           ^^  ^  ^^^
#           |   |   |
#      string   |   delimiter
#               |
#               array to store the pieces
If no separator is given, it uses the FS, which defaults to the space:

$ awk '{split($0, a); print a[2]}' <<< "a:b c:d e"
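
For readers more at home in Python, a rough equivalent of the two calls on the same sample line; note that awk arrays start at 1 while Python lists start at 0. The variable names are made up.

```python
line = "a:b c:d e"

by_whitespace = line.split()     # like split($0, a) with the default FS
by_colon = line.split(":")       # like split($0, a, ":")

print(by_whitespace[1])          # c:d   (awk's a[2])
print(by_colon)                  # ['a', 'b c', 'd e']
```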
q-n-a  stackex  programming  howto  yak-shaving  terminal  unix  workflow 
august 2017 by nhaliday
What is the best way to parse command-line arguments with Python? - Quora
- Anders Kaseorg

Use the standard optparse library.

It’s important to uphold your users’ expectation that your utility will parse arguments in the same way as every other UNIX utility. If you roll your own parsing code, you’ll almost certainly break that expectation in obvious or subtle ways.

Although the documentation claims that optparse has been deprecated in favor of argparse, which supports more features like optional option arguments and configurable prefix characters, I can’t recommend argparse until it’s been fixed to parse required option arguments in the standard UNIX way. Currently, argparse uses an unexpected heuristic which may lead to subtle bugs in other scripts that call your program.

consider also click (which uses the optparse behavior)
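
A small sketch of the difference being described, as I understand it from the CPython standard library: when a required option argument itself starts with a dash, optparse takes it as the value, while argparse treats it as another option and reports an error. The -x option and the sample argument list are made up for illustration.

```python
import argparse
import optparse

ARGV = ["-x", "-v"]   # hypothetical command line: -x requires an argument

# optparse: the next word is always consumed as the value of -x.
opt_parser = optparse.OptionParser()
opt_parser.add_option("-x")
options, _ = opt_parser.parse_args(ARGV)
print(options.x)      # -v

# argparse: "-v" looks like an option, so -x is left without its argument.
arg_parser = argparse.ArgumentParser()
arg_parser.add_argument("-x")
try:
    arg_parser.parse_args(ARGV)
    argparse_accepted = True
except SystemExit:    # argparse prints a usage error and exits
    argparse_accepted = False
print(argparse_accepted)
```

With argparse, writing the value attached to the option (-x=-v is not portable, but -x-v works) sidesteps the heuristic.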
q-n-a  qra  oly  best-practices  programming  terminal  unix  python  libraries  protocol  gotchas  howto  pls  yak-shaving  integration-extension 
august 2017 by nhaliday
Broadcasting — NumPy v1.13 Manual
When operating on two arrays, NumPy compares their shapes element-wise. It starts with the trailing dimensions, and works its way forward. Two dimensions are compatible when

they are equal, or
one of them is 1
If these conditions are not met, a ValueError: frames are not aligned exception is thrown, indicating that the arrays have incompatible shapes. The size of the resulting array is the maximum size along each dimension of the input arrays.

Arrays do not need to have the same number of dimensions. For example, if you have a 256x256x3 array of RGB values, and you want to scale each color in the image by a different value, you can multiply the image by a one-dimensional array with 3 values.
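
The compatibility rule can be written out in a few lines of plain Python. This is a sketch of the rule as stated above, not NumPy's implementation; the function name is made up.

```python
from itertools import zip_longest

def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape, or raise ValueError if incompatible."""
    result = []
    # Start with the trailing dimensions; a missing dimension behaves like 1.
    for a, b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if a == b or b == 1:
            result.append(a)
        elif a == 1:
            result.append(b)
        else:
            raise ValueError(f"incompatible dimensions {a} and {b}")
    return tuple(reversed(result))

print(broadcast_shape((256, 256, 3), (3,)))       # (256, 256, 3)
print(broadcast_shape((8, 1, 6, 1), (7, 1, 5)))   # (8, 7, 6, 5)
```

The first call is the RGB scaling case from the excerpt: the 3-element array lines up with the trailing color axis.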
python  libraries  programming  howto  protocol  numerics  pls  linear-algebra 
august 2017 by nhaliday