jm + coding   242

repo
'The multiple repository tool'. How Google kludged around the split-repo problem when you don't have a monorepo.
kludges  git  monorepo  monorepi  google  android  aosp  repo  coding  version-control  dvcs 
10 days ago by jm
Input: Fonts for Code
Non-monospaced coding fonts! I'm all in favour...
As writing and managing code becomes more complex, today’s sophisticated coding environments are evolving to include everything from breakpoint markers to code folding and syntax highlighting. The typography of code should evolve as well, to explore possibilities beyond one font style, one size, and one character width.
input  fonts  via:its  typography  code  coding  font  text  ide  monospace 
12 days ago by jm
streamtools: a graphical tool for working with streams of data | nytlabs
Visual programming, Yahoo! Pipes style, back again:
we have created streamtools – a new, open source project by The New York Times R&D Lab which provides a general purpose, graphical tool for dealing with streams of data. It provides a vocabulary of operations that can be connected together to create live data processing systems without the need for programming or complicated infrastructure. These systems are assembled using a visual interface that affords both immediate understanding and live manipulation of the system.


via Aman
via:akohli  streaming  data  nytimes  visual-programming  coding 
13 days ago by jm
The Injector: A new Executor for Java
This honestly fits a narrow niche, but one that is gaining in popularity. If your messages take > 100μs to process, or your worker threads are consistently saturated, the standard ThreadPoolExecutor is likely perfectly adequate for your needs. If, on the other hand, you’re able to engineer your system to operate with one application thread per physical core you are probably better off looking at an approach like the LMAX Disruptor. However, if you fall in the crack in between these two scenarios, or are seeing a significant portion of time spent in futex calls and need a drop in ExecutorService to take the edge off, the injector may well be worth a look.
performance  java  executor  concurrency  disruptor  algorithms  coding  threads  threadpool  injector 
16 days ago by jm
Kappa
'a command line tool that (hopefully) makes it easier to deploy, update, and test functions for AWS Lambda.' much needed IMO -- Lambda is too closed
aws  lambda  mitch-garnaat  coding  testing  cli  kappa 
26 days ago by jm
'Microservice AntiPatterns'
presentation from last week's Craft Conference in Budapest; Tammer Saleh of Pivotal with a few antipatterns observed in dealing with microservices.
microservices  soa  architecture  design  coding  software  presentations  slides  tammer-saleh  pivotal  craft 
26 days ago by jm
ShellCheck
Static code analysis for shell scripts (via Tony Finch)
bash  cli  sh  linux  shell  coding  static-analysis  lint 
26 days ago by jm
AWS Lambda Event-Driven Architecture With Amazon SNS
Any message posted to an SNS topic can trigger the execution of custom code you have written, but you don’t have to maintain any infrastructure to keep that code available to listen for those events and you don’t have to pay for any infrastructure when the code is not being run. This is, in my opinion, the first time that Amazon can truly say that AWS Lambda is event-driven, as we now have a central, independent, event management system (SNS) where any authorized entity can trigger the event (post a message to a topic) and any authorized AWS Lambda function can listen for the event, and neither has to know about the other.
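
As a rough illustration of the decoupling described above, here is a minimal sketch of a function on the listening side, written in Python. The `handler(event, context)` signature and the `Records[].Sns` fields are the shape Lambda uses to deliver SNS notifications; the topic ARN and message payload in the sample event are made up for the example.

```python
import json


def handler(event, context):
    """Entry point Lambda invokes once per SNS delivery."""
    messages = []
    for record in event.get("Records", []):
        sns = record["Sns"]
        messages.append({
            "topic": sns["TopicArn"],
            "subject": sns.get("Subject"),
            "body": sns["Message"],  # the publisher's payload, always a string
        })
    return messages


# Hypothetical event for local experimentation; the ARN and payload are invented.
sample_event = {
    "Records": [{
        "EventSource": "aws:sns",
        "Sns": {
            "TopicArn": "arn:aws:sns:us-east-1:123456789012:example-topic",
            "Subject": "hello",
            "Message": json.dumps({"order_id": 42}),
        },
    }]
}

print(handler(sample_event, None))
```

The function knows only the topic's message format, not who published it, which is the point the excerpt makes.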
aws  ec2  lambda  sns  events  cep  event-processing  coding  cloud  hacks  eric-hammond 
5 weeks ago by jm
Rob Pike's 5 rules of optimization
these are great. I've run into rule #3 ("fancy algorithms are slow when n is small, and n is usually small") several times...
twitter  rob-pike  via:igrigorik  coding  rules  laws  optimization  performance  algorithms  data-structures  aphorisms 
5 weeks ago by jm
OG-Commons/Guavate.java
'Utilities that help bridge the gap between Java 8 and Google Guava. Guava has the {@link FluentIterable} concept which is similar to streams. In many ways, fluent iterable is nicer, because it directly binds to the immutable collection classes. However, on balance it seems wise to use the stream API rather than {@code FluentIterable} in Java 8.'
guava  java-8  java  fluentiterable  streams  fluent  coding 
6 weeks ago by jm
Stack Overflow Developer Survey 2015
wow, 52.5% of developers prefer a dark IDE theme?!
coding  jobs  work  careers  software  stack-overflow  surveys 
6 weeks ago by jm
On Ruby
The horrors of monkey-patching:
I call out the Honeybadger gem specifically because it was the most recent time I'd been bit by a seemingly good thing promoted in the community: monkey patching third party code. Now I don't fault Honeybadger for making their product this way. It provides their customers with direct business value: "just require 'honeybadger' and you're done!" I don't agree with this sort of practice. [....]

I distrust everything [in Ruby] but a small set of libraries I've personally vetted or are authored by people I respect. Why is this important? Without a certain level of scrutiny you will introduce odd and hard to reproduce bugs. This is especially important because Ruby offers you absolutely zero guarantee about the state your program is in when a given method is dispatched. Constants are not constants. Methods can be redefined at run time. Someone could have written a time sensitive monkey patch to randomly undefine methods from anything in ObjectSpace because they can. This example is so horribly bad that no one should ever do it, but the programming language allows this. Much worse, this code could be arbitrarily injected by some transitive dependency (do you even know what yours are?).
ruby  monkey-patching  coding  reliability  bugs  dependencies  libraries  honeybadger  sinatra 
7 weeks ago by jm
Reactive Programming for a demanding world
"building event-driven and responsive applications with RxJava", slides by Mario Fusco. Good info on practical Rx usage in Java
rxjava  rx  reactive  coding  backpressure  streams  observables 
7 weeks ago by jm
Bug Prediction at Google
LOL. grepping commit logs for /bug|fix/ does the job, apparently:
In the literature, Rahman et al. found that a very cheap algorithm actually performs almost as well as some very expensive bug-prediction algorithms. They found that simply ranking files by the number of times they've been changed with a bug-fixing commit (i.e. a commit which fixes a bug) will find the hot spots in a code base. Simple! This matches our intuition: if a file keeps requiring bug-fixes, it must be a hot spot because developers are clearly struggling with it.
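
A minimal sketch of that ranking, assuming the commit history has already been parsed into `(message, [paths])` pairs (for example from `git log --name-only`; the parsing is left out). The `/bug|fix/` pattern is the crude one from the note above, and `rank_hot_spots` is a name invented for this example.

```python
import re
from collections import Counter

# Deliberately crude: it will also match words such as "prefix".
BUGFIX_RE = re.compile(r"bug|fix", re.IGNORECASE)


def rank_hot_spots(commits):
    """Rank files by the number of bug-fixing commits that touched them.

    `commits` is an iterable of (message, [paths]) pairs.
    """
    counts = Counter()
    for message, paths in commits:
        if BUGFIX_RE.search(message):
            counts.update(set(paths))  # count each file once per commit
    return counts.most_common()


history = [
    ("Fix NPE in parser", ["parser.py", "lexer.py"]),
    ("Add docs", ["README"]),
    ("bug 123: off-by-one", ["parser.py"]),
]
print(rank_hot_spots(history))  # → [('parser.py', 2), ('lexer.py', 1)]
```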
bugs  rahman-algorithm  heuristics  source-code-analysis  coding  algorithms  google  static-code-analysis  version-control 
8 weeks ago by jm
Bazel
Google open sources a key part of their internal build system (known internally as "Blaze", it seems). Very nice indeed!
blaze  bazel  build-tools  building  open-source  google  coding  packaging 
8 weeks ago by jm
Combining static model checking with dynamic enforcement using the Statecall Policy Language
This looks quite nice -- a model-checker "for regular programmers". Example model for ping(1):

    automaton ping (int max_count, int count, bool can_timeout) {
      Initialize;
      during {
        count = 0;
        do {
          Transmit_Ping;
          either {
            Receive_Ping;
          } or (can_timeout) {
            Timeout_Ping;
          };
          count = count + 1;
        } until (count >= max_count);
      } handle {
        SIGINFO;
        Print_Summary;
      };
    }
ping  model-checking  models  formal-methods  verification  static  dynamic  coding  debugging  testing  distcomp  papers 
8 weeks ago by jm
ben-manes/caffeine
'Caffeine is a Java 8 based concurrency library that provides specialized data structures, such as a high performance cache.'
cache  java8  java  guava  caching  concurrency  data-structures  coding 
8 weeks ago by jm
Explanation of the Jump Consistent Hash algorithm
I blogged about the amazing stateless Jump Consistent Hash algorithm last year, but this is a good walkthrough of how it works.

Apparently one author, Eric Veach, is legendary -- https://news.ycombinator.com/item?id=9209891 : "Eric Veach is huge in the computer graphics world for laying a ton of the foundations of modern physically based rendering in his PhD thesis [1]. He then went on to work for Pixar and did a ton of work on Renderman (for which he recently got an Academy Award), and then in the early 2000ish left Pixar to go work for Google, where he was the lead on developing AdWords [2]. In short, he's had quite a career, and seeing a new paper from him is always interesting."
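
For reference, the algorithm itself is only a few lines. Below is a Python port of the C++ routine published in the Lamping and Veach paper; the explicit 64-bit mask stands in for C++ unsigned overflow, and the function name is chosen for this sketch.

```python
def jump_consistent_hash(key: int, num_buckets: int) -> int:
    """Map a 64-bit integer key to a bucket in [0, num_buckets)."""
    b, j = -1, 0
    while j < num_buckets:
        b = j
        # 64-bit linear congruential step; the mask emulates uint64 wraparound.
        key = (key * 2862933555777941757 + 1) & 0xFFFFFFFFFFFFFFFF
        # Jump ahead to the next bucket count at which this key would move.
        j = int((b + 1) * (float(1 << 31) / float((key >> 33) + 1)))
    return b


# When a bucket is added, a key either stays put or moves to the new bucket.
for n in (4, 5, 6):
    print(n, [jump_consistent_hash(k, n) for k in range(10)])
```

No table or ring is stored: the bucket is recomputed from the key and the bucket count on every call, which is what makes the algorithm stateless.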
eric-veach  consistent-hashing  algorithms  google  adwords  renderman  pixar  history  coding  c  c++ 
9 weeks ago by jm
Halcyon Days
Fantastic 1997-era book of interviews with the programmers behind some of the greatest games in retrogaming history:
Halcyon Days: Interviews with Classic Computer and Video Game Programmers was released as a commercial product in March 1997. At the time it was one of the first retrogaming projects to focus on lost history rather than game collecting, and certainly the first entirely devoted to the game authors themselves. Now a good number of the interviewees have their own web sites, but none of them did when I started contacting them in 1995. [...] If you have any of the giddy anticipation that I did whenever I picked up a magazine containing an interview with Mark Turmell or Dan [M.U.L.E.] Bunten, then you want to start reading.
book  games  history  coding  interviews  via:walter 
11 weeks ago by jm
Release Protocol Buffers v3.0.0-alpha-2 · google/protobuf
New major-version track for protobuf, with some interesting new features:

Removal of field presence logic for primitive value fields, removal of required fields, and removal of default values. This makes proto3 significantly easier to implement with open struct representations, as in languages like Android Java, Objective C, or Go.
Removal of unknown fields.
Removal of extensions, which are instead replaced by a new standard type called Any.
Fix semantics for unknown enum values.
Addition of maps.
Addition of a small set of standard types for representation of time, dynamic data, etc.
A well-defined encoding in JSON as an alternative to binary proto encoding.
protobuf  binary  marshalling  serialization  google  grpc  proto3  coding  open-source 
12 weeks ago by jm
how Curator fixed issues with the Hive ZooKeeper Lock Manager Implementation
Ugh, ZK is a bear to work with.
Apache Curator is open source software which is able to handle all of the above scenarios transparently. Curator is a Netflix ZooKeeper Library and it provides a high-level API, CuratorFramework, that simplifies using ZooKeeper. By using a singleton CuratorFramework instance in the new ZooKeeperHiveLockManager implementation, we not only fixed the ZooKeeper connection issues, but also made the code easy to understand and maintain.  
zookeeper  apis  curator  netflix  distributed-locks  coding  hive 
12 weeks ago by jm
Programmer IS A Career Path, Thank You
Well said -- Amazon had a good story around this btw
programming  coding  career  work  life 
12 weeks ago by jm
Why we run an open source program - Walmart Labs
This is a great exposition of why it's in a company's interest to engage with open source. Not sure I agree with 'engineers are the artists of our generation' but the rest are spot on
development  open-source  walmart  node  coding  via:hn  hiring 
12 weeks ago by jm
RateLimitedLogger
Our latest open source release from Swrve Labs: an Apache-licensed, SLF4J-compatible, simple, fluent API for rate-limited logging in Java:

'A RateLimitedLog object tracks the rate of log message emission, imposes an internal rate limit, and will efficiently suppress logging if this is exceeded. When a log is suppressed, at the end of the limit period, another log message is output indicating how many log lines were suppressed. This style of rate limiting is the same as the one used by UNIX syslog; this means it should be comprehensible, easy to predict, and familiar to many users, unlike more complex adaptive rate limits.'

We've been using this in production for months -- it's pretty nifty ;) Never fear your logs again!
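
To make the syslog-style behaviour concrete, here is a small Python sketch of the idea. It is not the library's API (that is Java and SLF4J-based); the class shape, parameter names and summary wording are invented for illustration, and the summary line is emitted lazily on the first log call after the period ends rather than by a timer.

```python
import time


class RateLimitedLog:
    """Emit at most `max_per_period` messages per `period` seconds."""

    def __init__(self, emit, max_per_period, period, clock=time.monotonic):
        self.emit = emit                  # callable that writes one log line
        self.max_per_period = max_per_period
        self.period = period
        self.clock = clock                # injectable so tests can fake time
        self.window_start = clock()
        self.count = 0                    # messages seen in the current window

    def log(self, message):
        now = self.clock()
        if now - self.window_start >= self.period:
            self._roll_over(now)
        self.count += 1
        if self.count <= self.max_per_period:
            self.emit(message)

    def _roll_over(self, now):
        suppressed = self.count - self.max_per_period
        if suppressed > 0:
            self.emit(f"(suppressed {suppressed} log lines)")
        self.window_start = now
        self.count = 0


fake_now = [0.0]
lines = []
log = RateLimitedLog(lines.append, max_per_period=2, period=10,
                     clock=lambda: fake_now[0])
for i in range(5):
    log.log(f"m{i}")
fake_now[0] = 10.0
log.log("next")
print(lines)  # → ['m0', 'm1', '(suppressed 3 log lines)', 'next']
```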
logs  logging  coding  java  open-source  swrve  slf4j  rate-limiting  libraries 
february 2015 by jm
google/error-prone
Nice looking static code validation tool for Java, from Google. I recognise a few of these errors ;)
google  static  code-validation  lint  testing  java  coding 
february 2015 by jm
Google Java Style
A good set of basic, controversy-free guidelines for clean Java code style
style  java  google  coding  guidelines  formatting  coding-standards 
february 2015 by jm
A Quiet Defense of Patterns
Marc Brooker: 'When it comes to building working software in the long term, the emotional pursuit of craft is not as important as the human pursuit of teamwork, or the intellectual pursuit of correctness. Patterns is one of the most powerful ideas we have. The critics may be right that it devalues the craft, but we would all do well to remember that the craft of software is a means, not an end.'
marc-brooker  design-patterns  coding  software  teamwork 
february 2015 by jm
8 gdb tricks you should know (Ksplice Blog)
These are very good -- bookmarking for the next time I'm using gdb, probably about 3 years from now
c  debugging  gdb  c++  tips  coding 
january 2015 by jm
Your anonymous code contributions probably aren't
Scraping the work of successful contributors to the Google Code Jam competition, the researchers found that a mere eight training files with 70 lines of code each were enough to identify authors based on their syntactic, lexical, and layout habits.
anonymous  coding  open-source  google-code-jam  research  fingerprinting 
january 2015 by jm
Functional Programming Patterns (BuildStuff '14)
Good, and very accessible even for FP noobs like myself ;)
clojure  fp  functional  patterns  coding  scala 
january 2015 by jm
Are you better off running your big-data batch system off your laptop?
Heh, nice trolling.
Here are two helpful guidelines (for largely disjoint populations):

If you are going to use a big data system for yourself, see if it is faster than your laptop.
If you are going to build a big data system for others, see that it is faster than my laptop. [...]

We think everyone should have to do this, because it leads to better systems and better research.
graph  coding  hadoop  spark  giraph  graph-processing  hardware  scalability  big-data  batch  algorithms  pagerank 
january 2015 by jm
A Case Study of Toyota Unintended Acceleration and Software Safety
I drive a Toyota, and this is scary stuff. Critical software systems need to be coded with care, and this isn't it -- they don't even have a bug tracking system!
Investigations into potential causes of Unintended Acceleration (UA) for Toyota vehicles have made news several times in the past few years. Some blame has been placed on floor mats and sticky throttle pedals. But, a jury trial verdict was based on expert opinions that defects in Toyota's Electronic Throttle Control System (ETCS) software and safety architecture caused a fatal mishap.  This talk will outline key events in the still-ongoing Toyota UA litigation process, and pull together the technical issues that were discovered by NASA and other experts. The results paint a picture that should inform future designers of safety critical software in automobiles and other systems.
toyota  safety  realtime  coding  etcs  throttle-control  nasa  code-review  embedded 
january 2015 by jm
'Uncertain<T>: A First-Order Type for Uncertain Data' [paper, PDF]
'Emerging applications increasingly use estimates such as sensor data (GPS), probabilistic models, machine learning, big data, and human data. Unfortunately, representing this uncertain data with discrete types (floats, integers, and booleans) encourages developers to pretend it is not probabilistic, which causes three types of uncertainty bugs. (1) Using estimates as facts ignores random error in estimates. (2) Computation compounds that error. (3) Boolean questions on probabilistic data induce false positives and negatives.

This paper introduces Uncertain<T>, a new programming language abstraction for uncertain data. We implement a Bayesian network semantics for computation and conditionals that improves program correctness. The runtime uses sampling and hypothesis tests to evaluate computation and conditionals lazily and efficiently. We illustrate with sensor and machine learning applications that Uncertain<T> improves expressiveness and accuracy.'

(via Tony Finch)
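
A toy Python sketch of the core idea, under loud assumptions: the paper's implementation is not in Python, and this version replaces its hypothesis tests with a fixed-size Monte Carlo estimate. The class and method names, and the numbers in the example, are invented. What it does show is a value carried as a lazy sampling function, arithmetic that propagates error, and a conditional that asks for evidence rather than comparing point estimates.

```python
import random


class Uncertain:
    """A value represented by a sampling function, not a single number."""

    def __init__(self, sampler):
        self.sampler = sampler  # callable: rng -> one sample

    def __add__(self, other):
        # Nothing is sampled here; the sum is just a new (lazy) sampler.
        return Uncertain(lambda rng: self.sampler(rng) + other.sampler(rng))

    def prob_greater_than(self, threshold, rng, n=2000):
        """Estimate P(value > threshold) from n samples."""
        hits = sum(self.sampler(rng) > threshold for _ in range(n))
        return hits / n


def gaussian(mean, stddev):
    return Uncertain(lambda rng: rng.gauss(mean, stddev))


rng = random.Random(42)
# Two noisy estimates whose errors compound when added (made-up numbers).
speed = gaussian(4.0, 1.0) + gaussian(0.5, 0.5)
p = speed.prob_greater_than(4.0, rng)

# The mean (4.5) is above 4.0, but the evidence is far from conclusive.
if p > 0.9:
    print("confident: faster than 4.0")
else:
    print(f"not confident: P(speed > 4.0) is roughly {p:.2f}")
```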
via:fanf  uncertainty  estimation  types  strong-typing  coding  probability  statistics  machine-learning  sampling 
december 2014 by jm
A Virtual Machine in Excel
'Ádám was trying his hand at a problem in Excel, but the official rules prohibit the use of Excel macros. In a daze, he came up with one of the most clever uses of Excel: building an assembly interpreter with the most popular spreadsheet program. This is a virtual Harvard architecture machine without writable RAM; the stack is only lots and lots of IFs.'
vms  excel  hacks  spreadsheets  coding 
december 2014 by jm
coz
A causal profiler for C++.
Causal profiling is a novel technique to measure optimization potential. This measurement matches developers' assumptions about profilers: that optimizing highly-ranked code will have the greatest impact on performance. Causal profiling measures optimization potential for serial, parallel, and asynchronous programs without instrumentation or special handling for library calls and concurrency primitives. Instead, a causal profiler uses performance experiments to predict the effect of optimizations. This allows the profiler to establish causality: "optimizing function X will have effect Y," exactly the measurement developers had assumed they were getting all along.


I can see this being a good technique to stochastically discover race conditions and concurrency bugs, too.
optimization  c++  performance  coding  profiling  speed  causal-profilers 
december 2014 by jm
Working Effectively with Unit Tests
$14.99 ebook, recommended by Steve Vinoski, looks good
unit-testing  testing  ebooks  jay-fields  tests  steve-vinoski  coding 
december 2014 by jm
Java for Everything
Actually, I'm really agreeing with a lot of this. Particularly this part:
Programmers will cringe at writing some kind of command dispatch list:

if command == "up":
    up()
elif command == "status":
    status()
elif command == "revert":
    revert()
...

so they’ll go off and write some introspecting auto-dispatch cleverness, but that takes longer to write and will surely confuse future readers who’ll wonder how the heck revert() ever gets called. Yet the programmer will incorrectly feel as though he saved himself time. This is the trap of the dynamic language. It feels like you’re being more productive, but aside from the first 10 minutes of a new program, you’re not. Just write the stupid dispatch manually and get on with the real work.


I've also gone right off dynamic languages for any kind of non-toy work.

Mind you he needs to get around to ditching Vim for a proper IDE. That's the key thing that makes coding in a statically-typed language really pleasant -- when graphical refactoring becomes easy and usable, and errors are visible as you type them...
java  coding  static-typing  python  unit-tests 
november 2014 by jm
ExecutorService - 10 tips and tricks
Excellent advice from Tomasz Nurkiewicz' blog for anyone using java.util.concurrent.ExecutorService regularly. The whole blog is full of great posts btw
concurrency  java  jvm  threading  threads  executors  coding 
november 2014 by jm
Flow, a new static type checker for JavaScript
Unlike the (excellent) TypeScript, it'll infer types:
Flow’s type checking is opt-in — you do not need to type check all your code at once. However, underlying the design of Flow is the assumption that most JavaScript code is implicitly statically typed; even though types may not appear anywhere in the code, they are in the developer’s mind as a way to reason about the correctness of the code. Flow infers those types automatically wherever possible, which means that it can find type errors without needing any changes to the code at all. On the other hand, some JavaScript code, especially frameworks, make heavy use of reflection that is often hard to reason about statically. For such inherently dynamic code, type checking would be too imprecise, so Flow provides a simple way to explicitly trust such code and move on. This design is validated by our huge JavaScript codebase at Facebook: Most of our code falls in the implicitly statically typed category, where developers can check their code for type errors without having to explicitly annotate that code with types.
facebook  flow  javascript  coding  types  type-inference  ocaml  typescript 
november 2014 by jm
How “Computer Geeks” replaced “Computer Girls”
As historian Nathan Ensmenger explained to a Stanford audience, as late as the 1960s many people perceived computer programming as a natural career choice for savvy young women. Even the trend-spotters at Cosmopolitan Magazine urged their fashionable female readership to consider careers in programming. In an article titled “The Computer Girls,” the magazine described the field as offering better job opportunities for women than many other professional careers. As computer scientist Dr. Grace Hopper told a reporter, programming was “just like planning a dinner. You have to plan ahead and schedule everything so that it’s ready when you need it…. Women are ‘naturals’ at computer programming.” James Adams, the director of education for the Association for Computing Machinery, agreed: “I don’t know of any other field, outside of teaching, where there’s as much opportunity for a woman.”
history  programming  sexism  technology  women  feminism  coding 
november 2014 by jm
How I reverse-engineered Google Docs to play back any document's keystrokes « James Somers (jsomers.net)
Excellent write-up of this little-known undocumented GDocs behaviour, an artifact of its operational-transformation sync mechanism
operational-transformation  ot  google  gdocs  coding  docs  sync  undocumented  reversing 
november 2014 by jm
Please grow your buffers exponentially
Although in some cases x1.5 is considered good practice. YMMV I guess
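
A quick way to see why the growth policy matters is to count how many element copies each policy causes. The sketch below is a simulation written for this note, not code from the linked post; it assumes every reallocation moves the whole buffer, which is the worst case.

```python
def copies_needed(n_appends, grow):
    """Count element copies made while appending n items one at a time.

    `grow` maps the current capacity to the next one.
    """
    capacity, size, copies = 1, 0, 0
    for _ in range(n_appends):
        if size == capacity:
            capacity = grow(capacity)
            copies += size  # worst case: realloc moves every existing element
        size += 1
    return copies


n = 100_000
doubling = copies_needed(n, lambda c: c * 2)
one_and_half = copies_needed(n, lambda c: max(c + 1, c * 3 // 2))
fixed_step = copies_needed(n, lambda c: c + 1024)

print("x2   :", doubling)
print("x1.5 :", one_and_half)
print("+1024:", fixed_step)
```

Both geometric policies stay within a small constant multiple of `n` copies, while the fixed increment grows quadratically. The x1.5 variant does somewhat more copying than x2 in exchange for less slack memory.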
malloc  memory  coding  buffers  exponential  jemalloc  firefox  heap  allocation 
november 2014 by jm
A Teenager Gets Grilled By Her Dad About Why She’s Not That Into Coding
Jay Rosen interviews his 17-year-old daughter. It's pretty eye-opening. Got to start them early!
culture  tech  coding  girls  women  feminism  teenagers  school  jay-rosen  stem 
october 2014 by jm
AtScript
a new "types for JavaScript" framework, from the team behind Angular.js -- they plan to "harmonize" it with TypeScript and pitch it for standardization, which would be awesome.

(via Rob Clancy)
via:robc  atscript  javascript  typescript  types  languages  coding  google  angular 
october 2014 by jm
Cuckoo Filters
'In many networking systems, Bloom filters are used for high-speed set membership tests. They permit a small fraction of false positive answers with very good space efficiency. However, they do not permit deletion of items from the set, and previous attempts to extend “standard” Bloom filters to support deletion all degrade either space or performance. We propose a new data structure called the cuckoo filter that can replace Bloom filters for approximate set membership tests. Cuckoo filters support adding and removing items dynamically while achieving even higher performance than Bloom filters. For applications that store many items and target moderately low false positive rates, cuckoo filters have lower space overhead than space-optimized Bloom filters. Our experimental results also show that cuckoo filters out-perform previous data structures that extend Bloom filters to support deletions substantially in both time and space.'
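
A toy Python version of the structure, to show where deletion support comes from: each item is stored as a short fingerprint in one of two candidate buckets, and the second bucket is derived from the first by XOR with a hash of the fingerprint, so either bucket can be computed from the other. The parameters (4-slot buckets, 16-bit fingerprints, SHA-256 as the hash) are choices made for this sketch, not taken from the paper's implementation.

```python
import hashlib
import random


class CuckooFilter:
    """Toy cuckoo filter; sizes and hash choices are illustrative only."""

    def __init__(self, num_buckets=1024, bucket_size=4, max_kicks=500):
        assert num_buckets & (num_buckets - 1) == 0, "must be a power of two"
        self.buckets = [[] for _ in range(num_buckets)]
        self.mask = num_buckets - 1
        self.bucket_size = bucket_size
        self.max_kicks = max_kicks
        self.rng = random.Random(0)

    @staticmethod
    def _hash(data: bytes) -> int:
        return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

    def _fingerprint(self, item: str) -> int:
        return (self._hash(b"fp:" + item.encode()) & 0xFFFF) or 1

    def _alt_index(self, index: int, fp: int) -> int:
        # XOR with a hash of the fingerprint: applying it twice is a no-op.
        return (index ^ self._hash(fp.to_bytes(2, "big"))) & self.mask

    def _locate(self, item: str):
        fp = self._fingerprint(item)
        i1 = self._hash(item.encode()) & self.mask
        return fp, i1, self._alt_index(i1, fp)

    def insert(self, item: str) -> bool:
        fp, i1, i2 = self._locate(item)
        for i in (i1, i2):
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        # Both buckets full: evict a resident fingerprint and relocate it.
        i = self.rng.choice((i1, i2))
        for _ in range(self.max_kicks):
            slot = self.rng.randrange(self.bucket_size)
            fp, self.buckets[i][slot] = self.buckets[i][slot], fp
            i = self._alt_index(i, fp)
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        return False  # treated as "filter full"; one fingerprint is dropped

    def contains(self, item: str) -> bool:
        fp, i1, i2 = self._locate(item)
        return fp in self.buckets[i1] or fp in self.buckets[i2]

    def delete(self, item: str) -> bool:
        fp, i1, i2 = self._locate(item)
        for i in (i1, i2):
            if fp in self.buckets[i]:
                self.buckets[i].remove(fp)
                return True
        return False


f = CuckooFilter()
f.insert("apple")
print(f.contains("apple"))   # → True
f.delete("apple")
print(f.contains("apple"))   # → False
```

As with any such filter, only items that were actually inserted should be deleted; removing a never-inserted item can evict another item's matching fingerprint.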
algorithms  cs  coding  cuckoo-filters  bloom-filters  sets  data-structures 
october 2014 by jm
Falsehoods programmers believe about time
I have repeatedly been confounded to discover just how many mistakes in both test and application code stem from misunderstandings or misconceptions about time. By this I mean both the interesting way in which computers handle time, and the fundamental gotchas inherent in how we humans have constructed our calendar — daylight savings being just the tip of the iceberg.

In fact I have seen so many of these misconceptions crop up in other people’s (and my own) programs that I thought it would be worthwhile to collect a list of the more common problems here.


See also the follow-up: http://infiniteundo.com/post/25509354022/more-falsehoods-programmers-believe-about-time-wisdom

(via Marc)
via:marcomorain  time  dates  timezones  coding  gotchas  calendar  bugs 
october 2014 by jm
Move Fast and Break Nothing
Great presentation about GitHub dev culture and building software without breakage, but still with real progress.
github  programming  communication  process  coding  teams  management  dev-culture  breakage 
october 2014 by jm
"Quantiles on Streams" [paper, 2009]
'Chiranjeeb Buragohain and Subhash Suri: "Quantiles on Streams" in Encyclopedia of Database Systems, Springer, pp 2235–2240, 2009. ISBN: 978-0-387-35544-3', cited by Martin Kleppmann in http://mail-archives.apache.org/mod_mbox/kafka-dev/201402.mbox/%3C131A7649-ED57-45CB-B4D6-F34063267664@linkedin.com%3E as a good, short literature survey re estimating percentiles with a small memory footprint.
latency  percentiles  coding  quantiles  streams  papers  algorithms 
october 2014 by jm
Validate SQL queries at compile-time in Rust
The sql! macro will validate that its string literal argument parses as a valid Postgres query.


Based on https://pganalyze.com/blog/parse-postgresql-queries-in-ruby.html , which links the PostgreSQL server code directly into a C extension. Mad stuff, Ted!

(via Rob Clancy)
macros  postgres  compile  validation  sql  rust  coding 
october 2014 by jm
A Linear-Time, One-Pass Majority Vote Algorithm
This algorithm, which Bob Boyer and I invented in 1980, decides which element of a sequence is in the majority, provided there is such an element.
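
The algorithm fits in a dozen lines. A Python sketch follows (function names are mine): the first pass keeps one candidate and a counter, and the second pass is needed only when a majority element is not guaranteed to exist.

```python
def majority_candidate(items):
    """One pass, O(1) extra space: returns the only possible majority element."""
    candidate, count = None, 0
    for item in items:
        if count == 0:
            candidate, count = item, 1
        elif item == candidate:
            count += 1
        else:
            count -= 1  # pair off one candidate vote against one other vote
    return candidate


def majority(items):
    """Verify the candidate; returns None if nothing has a strict majority."""
    items = list(items)
    candidate = majority_candidate(items)
    if items and items.count(candidate) * 2 > len(items):
        return candidate
    return None


print(majority("AAACCBBCCCBCC"))  # → C
print(majority([1, 2, 3]))        # → None
```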
algorithms  one-pass  o(1)  coding  majority  top-k  sorting 
september 2014 by jm
Alex Payne — Thoughts On Five Years of Emerging Languages
One could read the success of Go as an indictment of contemporary PLT, but I prefer to see it as a reminder of just how much language tooling matters. Perhaps even more critical, Go’s lean syntax, selective semantics, and cautiously-chosen feature set demonstrate the importance of a strong editorial voice in a language’s design and evolution.

Having co-authored a book on Scala, it’s been painful to see systems programmers in my community express frustration with the ambitious hybrid language. I’ve watched them abandon ship and swim back to the familiar shores of Java, or alternately into the uncharted waters of Clojure, Go, and Rust. A pity, but not entirely surprising if we’re being honest with ourselves.

Unlike Go, Scala has struggled with tooling from its inception. More than that, Scala has had a growing editorial problem. Every shop I know that’s been successful with Scala has limited itself to some subset of the language. Meanwhile, in pursuit of enterprise developers, its surface area has expanded in seemingly every direction. The folks behind Scala have, thankfully, taken notice: upcoming releases are promised to focus on simplicity, clarity, and better tooling.
scala  go  coding  languages 
september 2014 by jm
on using JSON as a config file format
Ben Hughes on twitter:

"JSON is fine for config files, if you don't want to comment your config file. Which is a way of saying, it isn't fine for config files."
ben-hughes  funny  json  file-formats  config-files  configuration  software  coding 
september 2014 by jm
CLion – Brand New IDE for C and C++ Developers
JetBrains (makers of the excellent IntelliJ IDEA) have come out with a C/C++ refactoring IDE which looks utterly fantastic. If I wind up hacking on C/C++ again in future, I'll be using this one
c  c++  refactoring  ide  intelli-j  clion  jetbrains  editors  coding 
september 2014 by jm
"Invertible Bloom Lookup Tables" [paper]
'We present a version of the Bloom filter data structure that supports not only the insertion, deletion, and lookup of key-value pairs, but also allows a complete listing of the pairs it contains with high probability, as long as the number of key-value pairs is below a designed threshold. Our structure allows the number of key-value pairs to greatly exceed this threshold during normal operation. Exceeding the threshold simply temporarily prevents content listing and reduces the probability of a successful lookup. If entries are later deleted to return the structure below the threshold, everything again functions appropriately. We also show that simple variations of our structure are robust to certain standard errors, such as the deletion of a key without a corresponding insertion or the insertion of two distinct values for a key. The properties of our structure make it suitable for several applications, including database and networking applications that we highlight.'
iblt  bloom-filters  data-structures  performance  algorithms  coding  papers  probabilistic 
september 2014 by jm
3 Rules of thumb for Bloom Filters
I often need to do rough back-of-the-envelope reasoning about things, and I find that doing a bit of work to develop an intuition for how a new technique performs is usually worthwhile. So, here are three broad rules of thumb to remember when discussing Bloom filters down the pub:

1 - One byte per item in the input set gives about a 2% false positive rate.

2 - The optimal number of hash functions is about 0.7 times the number of bits per item.

3 - The number of hashes dominates performance.

But see also http://stackoverflow.com/a/9554448 , http://www.eecs.harvard.edu/~kirsch/pubs/bbbf/esa06.pdf (thanks Tony Finch!)
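
The first two rules can be checked against the standard Bloom filter approximations: with b bits per item and k hash functions the false positive rate is roughly (1 - e^(-k/b))^k, which is minimised at k = b ln 2. A short Python check (function names are mine):

```python
import math


def optimal_hashes(bits_per_item):
    """k = b * ln 2, i.e. about 0.7 times the bits per item."""
    return math.log(2) * bits_per_item


def false_positive_rate(bits_per_item, num_hashes):
    """Standard approximation: (1 - e^(-k/b))^k."""
    return (1.0 - math.exp(-num_hashes / bits_per_item)) ** num_hashes


bits = 8                       # one byte per item
k = optimal_hashes(bits)       # about 5.5, so use 6 in practice
p = false_positive_rate(bits, round(k))
print(f"k = {k:.2f}, false positive rate = {p:.2%}")
```

At one byte per item this comes out at a little over 2%, in line with the first rule.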
bloom-filters  algorithm  probabilistic  rules  reasoning  via:norman-maurer  false-positives  hashing  coding 
august 2014 by jm
Collection Pipeline
a nice summarisation of the state of pipe/stream-oriented collection operations in various languages, from Martin Fowler
martin-fowler  patterns  coding  ruby  clojure  streams  pipelines  pipes  unix  lambda  fp  java  languages 
july 2014 by jm
Metrics-Driven Development
we believe MDD is equal parts engineering technique and cultural process. It separates the notion of monitoring from its traditional position of exclusivity as an operations thing and places it more appropriately next to its peers as an engineering process. Provided access to real-time production metrics relevant to them individually, both software engineers and operations engineers can validate hypotheses, assess problems, implement solutions, and improve future designs.


Broken down into the following principles: 'Instrumentation-as-Code', 'Single Source of Truth', 'Developers Curate Visualizations and Alerts', 'Alert on What You See', 'Show me the Graph', 'Don’t Measure Everything (YAGNI)'.

We do all of these at Swrve, naturally (a technique I happily stole from Amazon).
metrics  coding  graphite  mdd  instrumentation  yagni  alerting  monitoring  graphs 
july 2014 by jm
"Pitfalls of Object Oriented Programming", SCEE R&D
Good presentation discussing "data-oriented programming" -- the concept of optimizing memory access speed by laying out large data in a columnar format in RAM, rather than naively in the default layout that OOP design suggests
columnar  ram  memory  optimization  coding  c++  oop  data-oriented-programming  data  cache  performance 
july 2014 by jm
stout
a C++ library adding some modern language features like Option, Try, Stopwatch, and other Guava-ish things (via @cscotta)
c++  library  stout  option  try  guava  coding 
july 2014 by jm
ThreadSanitizer
Google's purify/valgrind-like concurrency checking tool:

'As a bonus, ThreadSanitizer finds some other types of bugs: thread leaks, deadlocks, incorrect uses of mutexes, malloc calls in signal handlers, and more. It also natively understands atomic operations and thus can find bugs in lock-free algorithms. [...] The tool is supported by both Clang and GCC compilers (only on Linux/Intel64). Using it is very simple: you just need to add a -fsanitize=thread flag during compilation and linking. For Go programs, you simply need to add a -race flag to the go tool (supported on Linux, Mac and Windows).'
concurrency  bugs  valgrind  threadsanitizer  threading  deadlocks  mutexes  locking  synchronization  coding  testing 
june 2014 by jm
How to make breaking changes and not break all the things
Well-written description of the "several backward-compatible changes" approach to breaking-change schema migration (via Marc)
databases  coding  compatibility  migration  schemas  sql  continuous-deployment 
june 2014 by jm
quotly/test/acceptance/adding_quotes_spec.rb at master · cavalle/quotly · GitHub
Decent demo of acceptance testing using rspec (and some syntactic sugar to make it read like Steak code, I think)
rspec  acceptance-testing  bdd  testing  ruby  coding 
june 2014 by jm
ScalaTest
Scala's BDD approach -- very similar to Steak in Rubyland I think
scala  testing  bdd  acceptance-testing  steak  coding  scalatest 
june 2014 by jm
cavalle/steak · GitHub
a minimal extension of RSpec-Rails that adds several conveniences to do acceptance testing of Rails applications using Capybara. It's an alternative to Cucumber in plain Ruby.


Good approach here to copy, but very tied to Rails.
rails  ruby  testing  acceptance-testing  steak  bdd  rspec  coding 
june 2014 by jm
PetRegistrationAndPurchase.cs
A good example of "raw" BDD, without using a framework like Cucumber, Steak etc.
bdd  testing  csharp  acceptance-tests  coding 
june 2014 by jm
Cap'n Proto, FlatBuffers, and SBE
a feature comparison of these new serialization formats from Kenton, the capnp dude
serialization  protobuf  capnproto  sbe  flatbuffers  google  coding  storage 
june 2014 by jm
#AltDevBlog » Parallel Implementations
John Carmack describes this code-evolution approach to adding new code:
The last two times I did this, I got the software rendering code running on the new platform first, so everything could be tested out at low frame rates, then implemented the hardware accelerated version in parallel, setting things up so you could instantly switch between the two at any time.  For a mobile OpenGL ES application being developed on a windows simulator, I opened a completely separate window for the accelerated view, letting me see it simultaneously with the original software implementation.  This was a very significant development win.

If the task you are working on can be expressed as a pure function that simply processes input parameters into a return structure, it is easy to switch it out for different implementations.  If it is a system that maintains internal state or has multiple entry points, you have to be a bit more careful about switching it in and out.  If it is a gnarly mess with lots of internal callouts to other systems to maintain parallel state changes, then you have some cleanup to do before trying a parallel implementation.

There are two general classes of parallel implementations I work with:  The reference implementation, which is much smaller and simpler, but will be maintained continuously, and the experimental implementation, where you expect one version to “win” and consign the other implementation to source control in a couple weeks after you have some confidence that it is both fully functional and a real improvement.

It is completely reasonable to violate some generally good coding rules while building an experimental implementation – copy, paste, and find-replace rename is actually a good way to start.  Code fearlessly on the copy, while the original remains fully functional and unmolested.  It is often tempting to shortcut this by passing in some kind of option flag to existing code, rather than enabling a full parallel implementation.  It is a  grey area, but I have been tending to find the extra path complexity with the flag approach often leads to messing up both versions as you work, and you usually compromise both implementations to some degree.
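A minimal sketch of the pure-function case in Python: a small reference implementation kept alongside an experimental copy, with a switch between them and a check that they agree. The blur functions and the switch are hypothetical stand-ins, not anything from Carmack's code.

```python
def blur_reference(samples):
    """Small, obviously correct version: 3-wide box average with clamped edges."""
    out = []
    n = len(samples)
    for i in range(n):
        left = samples[max(i - 1, 0)]
        right = samples[min(i + 1, n - 1)]
        out.append((left + samples[i] + right) / 3.0)
    return out


def blur_experimental(samples):
    """Copied-and-modified candidate: same result computed from a padded copy."""
    if not samples:
        return []
    padded = [samples[0]] + list(samples) + [samples[-1]]
    return [
        (padded[i] + padded[i + 1] + padded[i + 2]) / 3.0
        for i in range(len(samples))
    ]


# Both versions stay fully callable, so switching is a one-word change.
IMPLEMENTATIONS = {
    "reference": blur_reference,
    "experimental": blur_experimental,
}


def blur(samples, which="reference"):
    return IMPLEMENTATIONS[which](samples)


def implementations_agree(samples, tolerance=1e-9):
    """Run both versions on the same input and compare the outputs."""
    ref = blur(samples, "reference")
    exp = blur(samples, "experimental")
    return len(ref) == len(exp) and all(
        abs(a - b) <= tolerance for a, b in zip(ref, exp)
    )


data = [1.0, 4.0, 2.0, 8.0]
print(blur(data, "reference"))
print(implementations_agree(data))
```

Note that the experimental version is a separate function rather than an option flag threaded through the original, which is the trade-off the last quoted paragraph argues for.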


(via Marc)
via:marc  coding  john-carmack  parallel  development  evolution  lifecycle  project-management 
june 2014 by jm
"Taking the hotdog"
aka lock acquisition. Ex-Amazon-Dublin lingo, observed in the wild ;)
language  hotdog  archie-mcphee  amazon  dublin  intercom  coding  locks  synchronization 
may 2014 by jm
The programming error that cost Mt Gox 2609 bitcoins
Digging into broken Bitcoin scripts in the blockchain. Fascinating:
While analyzing coinbase transactions, I came across another interesting bug that lost bitcoins. Some transactions have the meaningless and unredeemable script:

OP_IFDUP
OP_IF
OP_2SWAP
OP_VERIFY
OP_2OVER
OP_DEPTH

That script turns out to be the ASCII text "script". Instead of putting the redemption script into the transaction, the P2Pool miners accidentally put in the literal word "script". The associated bitcoins are lost forever due to this error.
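To see the decoding for yourself, this short Python sketch maps each opcode listed above to its byte value in Bitcoin Script and reads the bytes back as ASCII. The dictionary covers only these six opcodes.

```python
# Byte values of the six opcodes from the quoted script.
OPCODES = {
    "OP_IFDUP": 0x73,
    "OP_IF": 0x63,
    "OP_2SWAP": 0x72,
    "OP_VERIFY": 0x69,
    "OP_2OVER": 0x70,
    "OP_DEPTH": 0x74,
}

script = ["OP_IFDUP", "OP_IF", "OP_2SWAP", "OP_VERIFY", "OP_2OVER", "OP_DEPTH"]

raw = bytes(OPCODES[op] for op in script)
decoded = raw.decode("ascii")
print(raw.hex(), decoded)
```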


(via Nelson)
programming  script  coding  bitcoin  mtgox  via:nelson  scripting  dsls 
may 2014 by jm