nhaliday + rand-approx   39

Which of Haskell and OCaml is more practical? For example, in which aspect will each play a key role? - Quora
- Tikhon Jelvis,


This is a question I'm particularly well-placed to answer because I've spent quite a bit of time with both Haskell and OCaml, seeing both in the real world (including working at Jane Street for a bit). I've also seen the languages in academic settings and know many people at startups using both languages. This gives me a good perspective on both languages, with a fairly similar amount of experience in the two (admittedly biased towards Haskell).

And so, based on my own experience rather than the languages' reputations, I can confidently say it's Haskell.

Parallelism and Concurrency




Typeclasses vs Modules


In some sense, OCaml modules are better behaved and founded on a sounder theory than Haskell typeclasses, which have some serious drawbacks. However, the fact that typeclasses can be reliably inferred whereas modules have to be explicitly used all the time more than makes up for this. Moreover, extensions to the typeclass system enable much of the power provided by OCaml modules.


Of course, OCaml has some advantages of its own as well. It has a performance profile that's much easier to predict. The module system is awesome and often missed in Haskell. Polymorphic variants can be very useful for neatly representing certain situations, and don't have an obvious Haskell analog.

While both languages have a reasonable C FFI, OCaml's seems a bit simpler. It's hard for me to say this with any certainty because I've only used the OCaml FFI myself, but it was quite easy to use—a hard bar for Haskell's to clear. One really nice use of modules in OCaml is to pass around values directly from C as abstract types, which can help avoid extra marshalling/unmarshalling; that seemed very nice in OCaml.

However, overall, I still think Haskell is the more practical choice. Apart from the reasoning above, I simply have my own observations: my Haskell code tends to be clearer, simpler and shorter than my OCaml code. I'm also more productive in Haskell. Part of this is certainly a matter of having more Haskell experience, but the delta is limited especially as I'm working at my third OCaml company. (Of course, the first two were just internships.)

Both Haskell and OCaml are uniquivocally superb options—miles ahead of any other languages I know. While I do prefer Haskell, I'd choose either one in a pinch.

I've looked at F# a bit, but it feels like it makes too many tradeoffs to be on .NET. You lose the module system, which is probably OCaml's best feature, in return for an unfortunate, nominally typed OOP layer.

I'm also not invested in .NET at all: if anything, I'd prefer to avoid it in favor of simplicity. I exclusively use Linux and, from the outside, Mono doesn't look as good as it could be. I'm also far more likely to interoperate with a C library than a .NET library.

If I had some additional reason to use .NET, I'd definitely go for F#, but right now I don't.

Thinking about it now, it boils down to a single word: expressiveness. When I'm writing OCaml, I feel more constrained than when I'm writing Haskell. And that's important: unlike so many others, what first attracted me to Haskell was expressiveness, not safety. It's easier for me to write code that looks how I want it to look in Haskell. The upper bound on code quality is higher.


Perhaps it all boils down to OCaml and its community feeling more "worse is better" than Haskell, something I highly disfavor.


Laziness or, more strictly, non-strictness is big. A controversial start, perhaps, but I stand by it. Unlike some, I do not see non-strictness as a design mistake but as a leap in abstraction. Perhaps a leap before its time, but a leap nonetheless. Haskell lets me program without constantly keeping the code's order in my head. Sure, it's not perfect and sometimes performance issues jar the illusion, but they are the exception not the norm. Coming from imperative languages where order is omnipresent (I can't even imagine not thinking about execution order as I write an imperative program!) it's incredibly liberating, even accounting for the weird issues and jinks I'd never see in a strict language.

This is what I imagine life felt like with the first garbage collectors: they may have been slow and awkward, the abstraction might have leaked here and there, but, for all that, it was an incredible advance. You didn't have to constantly think about memory allocation any more. It took a lot of effort to get where we are now and garbage collectors still aren't perfect and don't fit everywhere, but it's hard to imagine the world without them. Non-strictness feels like it has the same potential, without anywhere near the work garbage collection saw put into it.


The other big thing that stands out are typeclasses. OCaml might catch up on this front with implicit modules or it might not (Scala implicits are, by many reports, awkward at best—ask Edward Kmett about it, not me) but, as it stands, not having them is a major shortcoming. Not having inference is a bigger deal than it seems: it makes all sorts of idioms we take for granted in Haskell awkward in OCaml which means that people simply don't use them. Haskell's typeclasses, for all their shortcomings (some of which I find rather annoying), are incredibly expressive.

In Haskell, it's trivial to create your own numeric type and operators work as expected. In OCaml, while you can write code that's polymorphic over numeric types, people simply don't. Why not? Because you'd have to explicitly convert your literals and because you'd have to explicitly open a module with your operators—good luck using multiple numeric types in a single block of code! This means that everyone uses the default types: (63/31-bit) ints and doubles. If that doesn't scream "worse is better", I don't know what does.


There's more. Haskell's effect management, brought up elsewhere in this thread, is a big boon. It makes changing things more comfortable and makes informal reasoning much easier. Haskell is the only language where I consistently leave code I visit better than I found it. Even if I hadn't worked on the project in years. My Haskell code has better longevity than my OCaml code, much less other languages.

One observation about purity and randomness: I think one of the things people frequently find annoying in Haskell is the fact that randomness involves mutation of state, and thus be wrapped in a monad. This makes building probabilistic data structures a little clunkier, since you can no longer expose pure interfaces. OCaml is not pure, and as such you can query the random number generator whenever you want.

However, I think Haskell may get the last laugh in certain circumstances. In particular, if you are using a random number generator in order to generate random test cases for your code, you need to be able to reproduce a particular set of random tests. Usually, this is done by providing a seed which you can then feed back to the testing script, for deterministic behavior. But because OCaml's random number generator manipulates global state, it's very easy to accidentally break determinism by asking for a random number for something unrelated. You can work around it by manually bracketing the global state, but explicitly handling the randomness state means providing determinism is much more natural.
q-n-a  qra  programming  pls  engineering  nitty-gritty  pragmatic  functional  haskell  ocaml-sml  dotnet  types  arrows  cost-benefit  tradeoffs  concurrency  libraries  performance  expert-experience  composition-decomposition  comparison  critique  multi  reddit  social  discussion  techtariat  reflection  review  random  data-structures  numerics  rand-approx  sublinear  syntax  volo-avolo  causation  scala  jvm  ecosystem  metal-to-virtual 
11 weeks ago by nhaliday
Equivalence between counting and sampling
also: every counting problem either has FPTRAS or no approx. w/i polynomial factor
pdf  exposition  lecture-notes  berkeley  nibble  tcs  counting  sampling  characterization  complexity  approximation  rand-approx  proofs 
february 2017 by nhaliday
Lecture 16
In which we define a multi-commodity flow problem, and we see that its dual is the relaxation of a useful graph partitioning problem. The relaxation can be rounded to yield an approximate graph partitioning algorithm.
pdf  lecture-notes  exposition  optimization  linear-programming  graphs  graph-theory  algorithms  duality  rounding  stanford  approximation  rand-approx  luca-trevisan  relaxation  nibble  stock-flow  constraint-satisfaction  tcs  tcstariat 
january 2017 by nhaliday
Computational Complexity: Favorite Theorems: The Yao Principle
The Yao Principle applies when we don't consider the algorithmic complexity of the players. For example in communication complexity we have two players who each have a separate half of an input string and they want to compute some function of the input with the minimum amount of communication between them. The Yao principle states that the best probabilistic strategies for the players will achieve exactly the communication bounds as the best deterministic strategy over a worst-case distribution of inputs.

The Yao Principle plays a smaller role where we measure the running time of an algorithm since applying the Principle would require solving an extremely large linear program. But since so many of our bounds are in information-based models like communication and decision-tree complexity, the Yao Principle, though not particularly complicated, plays an important role in lower bounds in a large number of results in our field.
tcstariat  tcs  complexity  adversarial  rand-approx  algorithms  game-theory  yoga  levers  communication-complexity  random  lower-bounds  average-case  nibble  org:bleg 
january 2017 by nhaliday
Probabilistic Filters By Example: Cuckoo Filter and Bloom Filters
Bloom filters have been in use since the 1970s and are well understood. Implementations are widely available. Variants exist that support deletion and counting, though with expanded storage requirements.

Cuckoo filters were described in Cuckoo Filter: Practically Better Than Bloom, a paper by researchers at CMU in 2014. Cuckoo filters improve on Bloom filters by supporting deletion, limited counting, and bounded FPP with similar storage efficiency as a standard Bloom filter.
comparison  data-structures  tutorial  visualization  explanation  engineering  mihai  visual-understanding  techtariat  rand-approx 
september 2016 by nhaliday
Bayesianism, frequentism, and the planted clique, or do algorithms believe in unicorns? | Windows On Theory
But if you consider probabilities as encoding beliefs, then it’s quite likely that a computationally bounded observer is not certain whether {17} is in the clique or not. After all, finding a maximum clique is a hard computational problem. So if T is much smaller than the time it takes to solve the k-clique problem (which is n^{const\cdot k} as far as we know), then it might make sense for time T observers to assign a probability between 0 and 1 to this event. Can we come up with a coherent theory of such probabilities?
research  tcs  complexity  probability  stats  algorithms  yoga  speculation  tcstariat  frontier  insight  exposition  rand-approx  synthesis  big-picture  boaz-barak  org:bleg  nibble  frequentist  bayesian  subjective-objective 
april 2016 by nhaliday

bundles : academetcsvague

related tags

aaronson  abstraction  academia  accretion  acm  acmtariat  advanced  adversarial  advice  age-generation  aging  algorithmic-econ  algorithms  amortization-potential  announcement  applicability-prereqs  applications  approximation  arrows  art  atoms  average-case  bayesian  berkeley  best-practices  big-picture  big-surf  bio  boaz-barak  books  boolean-analysis  business  career  causation  certificates-recognition  characterization  chart  checking  circuits  clarity  clever-rats  cmu  code-dive  code-organizing  coding-theory  communication-complexity  comparison  competition  complexity  composition-decomposition  computation  concentration-of-measure  concept  concurrency  confluence  constraint-satisfaction  contiguity-proximity  cool  correctness  cost-benefit  counterexample  counting  coupling-cohesion  course  critique  crypto  cs  culture  dan-luu  dana-moshkovitz  dark-arts  data-structures  debugging  differential-privacy  dimensionality  discrete  discussion  dotnet  duality  ecosystem  electromag  elegance  embeddings  encyclopedic  engineering  erik-demaine  error  essay  estimate  evolution  examples  expanders  expectancy  expert  expert-experience  explanation  exposition  extrema  flux-stasis  fourier  frequentist  frontier  functional  game-theory  geometry  georgia  gowers  gradient-descent  graph-theory  graphs  greedy  ground-up  hardness  hardware  harvard  hashing  haskell  heuristic  hi-order-bits  hierarchy  high-dimension  hmm  homepage  IEEE  info-foraging  init  insight  intricacy  israel  iteration-recursion  iterative-methods  jobs  jvm  knowledge  latency-throughput  learning-theory  lecture-notes  lectures  legacy  len:long  lens  lesswrong  levers  libraries  linear-algebra  linear-programming  linearity  linux  list  lower-bounds  luca-trevisan  machine-learning  madhu-sudan  management  markov  math  math.CA  math.CO  math.GR  math.NT  measurement  mechanism-design  metabuch  metal-to-virtual  metameta  metric-space  microsoft  mihai  mit  mixing  monte-carlo  motivation  multi  multiplicative  naturality  news  nibble  nitty-gritty  numerics  ocaml-sml  ocw  oly  open-problems  optimization  org:bleg  org:inst  org:mag  org:mat  org:sci  os  overflow  p:**  p:***  p:null  p:someday  p:whenever  papers  pcp  pdf  people  performance  philosophy  pls  polynomials  popsci  pragmatic  princeton  probabilistic-method  probability  problem-solving  productivity  prof  profile  programming  proof-systems  proofs  pseudorandomness  puzzles  q-n-a  qra  quantum  quantum-info  questions  quixotic  rand-approx  rand-complexity  random  rationality  ratty  reading  recommendations  recruiting  reddit  reflection  relativization  relaxation  research  review  rhetoric  rigor  rigorous-crypto  rounding  ryan-odonnell  s:**  s:***  salil-vadhan  sampling  scala  SDP  sequential  skeleton  slides  social  space-complexity  spatial  spectral  speculation  spock  stanford  state  stats  stock-flow  subjective-objective  sublinear  submodular  sum-of-squares  survey  sv  syntax  synthesis  system-design  systems  talks  tcs  tcstariat  teaching  tech  techtariat  tensors  texas  the-prices  thinking  tim-roughgarden  time  time-complexity  toolkit  top-n  topics  tradeoffs  trees  tricki  tutorial  twitter  types  UGC  unit  unix  valiant  vazirani  visual-understanding  visualization  volo-avolo  washington  wigderson  working-stiff  yoga  👳  🖥  🤖 

Copy this bookmark: