Notes on Distributed Systems for Young Bloods
'Below is a list of some lessons I’ve learned as a distributed systems engineer that are worth being told to a new engineer. Some are subtle, and some are surprising, but none are controversial. This list is for the new distributed systems engineer to guide their thinking about the field they are taking on. It’s not comprehensive, but it’s a good beginning.' This is a pretty nice list, a little over-stated, but that's the format. I particularly like the following: 'Exploit data-locality'; 'Learn to estimate your capacity'; 'Metrics are the only way to get your job done'; 'Use percentiles, not averages'; 'Extract services'.
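To make the 'Use percentiles, not averages' point concrete, here is a minimal sketch. The latency numbers are invented for illustration, and `percentile` is a hypothetical nearest-rank helper, not anything from the linked article.

```python
import statistics


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of
    all samples at or below it. `p` is an integer from 1 to 100."""
    ordered = sorted(samples)
    # Integer ceiling of p * n / 100, avoiding floating-point rounding.
    rank = max(1, -(-p * len(ordered) // 100))
    return ordered[rank - 1]


# Hypothetical request latencies in milliseconds:
# 98 fast requests and 2 pathologically slow ones.
latencies_ms = [10] * 98 + [2000] * 2

mean_ms = statistics.mean(latencies_ms)
p50_ms = percentile(latencies_ms, 50)
p99_ms = percentile(latencies_ms, 99)

print(f"mean={mean_ms} p50={p50_ms} p99={p99_ms}")
```

The mean lands somewhere no real request ever did, while the median and 99th percentile between them describe both the typical request and the slow tail.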
systems  distributed  distcomp  cap  metrics  coding  guidelines  architecture  backpressure  design  twitter 
january 2013
Effective Scala
Twitter's Scala style guide. 'While highly effective, Scala is also a large language, and our experiences have taught us to practice great care in its application. What are its pitfalls? Which features do we embrace, which do we eschew? When do we employ “purely functional style”, and when do we avoid it? In other words: what have we found to be an effective use of the language? This guide attempts to distill our experience into short essays, providing a set of best practices. Our use of Scala is mainly for creating high volume services that form distributed systems — and our advice is thus biased — but most of the advice herein should translate naturally to other domains.'
twitter  scala  coding  style 
january 2013
OmniTI's Experiences Adopting Chef
A good, in-depth writeup of OmniTI's best practices with respect to build-out of multiple customer deployments, using multi-tenant Chef from a version-controlled repo. Good suggestions, and I am really looking forward to this bit:

'Chef tries to turn your system configuration into code. That means you now inherit all the woes of software engineering: making changes in a coordinated manner and ensuring that changes integrate well are now an even greater concern. In part three of this series, we’ll look at applying software quality assurance and release management practices to Chef cookbooks and roles.'
chef  deployment  ops  omniti  systems  vagrant  automation 
january 2013
'uses DNS witchcraft to allow you to access US/UK-only audio and video services like Hulu.com, BBC iPlayer, etc. without using a VPN or Web proxy.' According to http://superuser.com/questions/461316/how-does-tunlr-work , it proxies the initial connection setup and geo-auth, then mangles the stream address to stream directly, not via proxy. Sounds pretty useful
proxy  network  vpn  dns  tunnel  content  video  audio  iplayer  bbc  hulu  streaming  geo-restriction 
january 2013
Dan McKinley :: Whom the Gods Would Destroy, They First Give Real-time Analytics
'It's important to divorce the concepts of operational metrics and product analytics. [..] Funny business with timeframes can coerce most A/B tests into statistical significance.' 'The truth is that there are very few product decisions that can be made in real time.'

HN discussion: http://news.ycombinator.com/item?id=5032588
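One way to see the 'funny business with timeframes' problem is to simulate A/A tests (both arms identical) and compare checking significance once at the end against peeking repeatedly and stopping at the first "significant" result. This is my own rough sketch, not code from the post; the conversion rate, sample sizes and the pooled two-proportion z-test are all assumptions chosen for illustration.

```python
import math
import random


def z_score(conv_a, n_a, conv_b, n_b):
    """Pooled two-proportion z statistic for arm B versus arm A."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    if pooled in (0.0, 1.0):
        return 0.0  # no variance yet, so nothing to test
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (conv_b / n_b - conv_a / n_a) / se


def run_aa_test(rng, visitors=2000, check_every=100, rate=0.1, z_crit=1.96):
    """Simulate one A/A test with identical true conversion rates.

    Returns (significant_at_any_peek, significant_at_end)."""
    conv_a = conv_b = 0
    peeked = False
    for i in range(1, visitors + 1):
        conv_a += rng.random() < rate
        conv_b += rng.random() < rate
        if i % check_every == 0 and abs(z_score(conv_a, i, conv_b, i)) > z_crit:
            peeked = True
    final = abs(z_score(conv_a, visitors, conv_b, visitors)) > z_crit
    return peeked, final


rng = random.Random(2013)  # fixed seed so reruns give the same numbers
runs = 200
results = [run_aa_test(rng) for _ in range(runs)]
peek_rate = sum(p for p, _ in results) / runs
final_rate = sum(f for _, f in results) / runs

print(f"false positives when peeking: {peek_rate:.1%}")
print(f"false positives at fixed end: {final_rate:.1%}")
```

Since the last peek coincides with the fixed end point, the peeking rate can never be lower than the fixed-horizon rate, and in practice it tends to be noticeably higher.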
real-time  analytics  statistics  a-b-testing 
january 2013
The Justin Masonic Lodge
whoa. (via Dave O'Riordan)
wtf  masons  names  me  texas 
january 2013
What happened to KHTML after Apple announced Safari
'There was a huge amount of excitement at the announcement that Safari would be using KHTML. At that time, it was almost a given that the OSS rendering engine was Gecko. KHTML was KDE's little engine that could. But nobody ever expected it to be picked up by other folks. One of the original parts of the KHTML-to-OS X port was KWQ (pronounced, "quack") that abstracted out the KDE API portions that were used in KHTML.
Folks were pretty ecstatic at first. It seemed very validating.
But that changed quickly. As Zack's post indicates, WebKit became a thing of unmergable code-drops. Even inside of the KDE community there became a split between the KHTML purists and the WebKit faction. They'd previously more or less all been KHTML developers, but post-WebKit there was something of a pragmatists vs. idealists split. Zack fell on the latter side of that (for understandable reasons: there was an existing community project, with its own set of values, and that was hijacked to a large extent by WebKit).
A few years later WebKit transformed itself into a more or less valid open source project (see webkit.org), but that didn't close the rift in the KDE community between the two, at that point rather divergent, rendering engines. There's still some remaining melancholy that stems from that initial hope and what could have potentially been, but wasn't.'
history  safari  open-source  code-drops  over-the-wall  webkit  khtml  kde  oss  apple 
january 2013
paperplanes. The Virtues of Monitoring, Redux
A rather vague and touchy-feely "state of the union" post on monitoring. Good set of links at the end, though; I like the look of Sensu and Tasseo, but am still unconvinced about the value of Boundary's offering
monitoring  metrics  ops 
january 2013
'a Nagios plugin to poll Graphite'. Necessary, since service metrics are the true source of service health information
nagios  graphite  service-metrics  ops 
january 2013
Pushover: Simple Mobile Notifications for Android and iOS
'Pushover makes it easy to send real-time notifications to your Android and iOS devices.' extremely simple HTTPS API; 'Pushover has no monthly subscription fees and users will always be able to receive unlimited messages for free. Most applications can send messages for free, subject to monthly limits.' Also supported by ifttt.com
ios  android  iphone  push  messaging 
january 2013
Greyhound agrees to change consumer contracts and make refunds - National Consumer Agency
Take note, switchers:

'The National Consumer Agency (NCA) has received a commitment from Greyhound that it will amend certain terms in its standard consumer contract, which the NCA thinks are unfair to consumers. This will be done by January 18 2013.

Among the terms considered unfair by the NCA are that consumers must forfeit their credit balance and pay a €45 administration fee, if they cancel their contract with Greyhound within 12 months. If you were charged money in these circumstances, Greyhound has agreed to refund you.

Greyhound will communicate these changes to all of its consumers by 18 January 2013. If you have any questions about the changes or getting a refund, you should contact Greyhound directly.'
greyhound  consumer  ireland  dublin  rubbish 
january 2013
Surprisingly Good Evidence That Real Name Policies Fail To Improve Comments
'Enough theorizing, there’s actually good evidence to inform the debate. For 4 years, Koreans enacted increasingly stiff real-name commenting laws, first for political websites in 2003, then for all websites receiving more than 300,000 viewers in 2007, and was finally tightened to 100,000 viewers a year later after online slander was cited in the suicide of a national figure. The policy, however, was ditched shortly after a Korean Communications Commission study found that it only decreased malicious comments by 0.9%. Korean sites were also inundated by hackers, presumably after valuable identities.

Further analysis by Carnegie Mellon’s Daegon Cho and Alessandro Acquisti, found that the policy actually increased the frequency of expletives in comments for some user demographics. While the policy reduced swearing and “anti-normative” behavior at the aggregate level by as much as 30%, individual users were not dismayed. “Light users”, who posted 1 or 2 comments, were most affected by the law, but “heavy” ones (11-16+ comments) didn’t seem to mind.

Given that the Commission estimates that only 13% of comments are malicious, a mere 30% reduction only seems to clean up the muddied waters of comment systems a depressingly negligent amount.

The finding isn’t surprising: social science researchers have long known that participants eventually begin to ignore cameras video taping their behavior. In other words, the presence of some phantom judgmental audience doesn’t seem to make us better versions of ourselves.'

(via Ronan Lyons)
anonymity  identity  policy  comments  privacy  politics  new-media  via:ronanlyons 
january 2013
Requests: HTTP for Humans
'an elegant and simple HTTP library for Python, built for human beings.' 'Requests is an Apache2 Licensed HTTP library, written in Python, for human beings. Python’s standard urllib2 module provides most of the HTTP capabilities you need, but the API is thoroughly broken. It was built for a different time — and a different web. It requires an enormous amount of work (even method overrides) to perform the simplest of tasks. Requests takes all of the work out of Python HTTP/1.1 — making your integration with web services seamless. There’s no need to manually add query strings to your URLs, or to form-encode your POST data. Keep-alive and HTTP connection pooling are 100% automatic, powered by urllib3, which is embedded within Requests.'
python  http  urllib  libraries  requests  via:mikeste 
january 2013
A Non-Blocking HashTable by Dr. Cliff Click : programming
Proggit discovers the NonBlockingHashMap. This comment from Boundary's cscotta is particularly interesting: "The code is intricate and curiously-formatted, but NBHM is quite excellent. The majority of our analytics platform is backed by NBHMs updated rapidly in parallel. Cliff's a great, friendly, approachable guy; if you have any specific questions about the approaches or implementation, he may be happy to answer."
data-structures  algorithms  non-blocking  concurrency  threading  multicore  cliff-click  azul  maps  java  boundary 
january 2013
Efficient In-Memory Indexing with Generalized Prefix Trees [PDF]
'Efficient data structures for in-memory indexing gain in importance due to (1) the exponentially increasing amount of data, (2) the growing main-memory capacity, and (3) the gap between main-memory and CPU speed. In consequence, there are high performance demands for in-memory data structures. Such index structures are used -- with minor changes -- as primary or secondary indices in almost every DBMS. Typically, tree-based or hash-based structures are used, while structures based on prefix-trees (tries) are neglected in this context. For tree-based and hash-based structures, the major disadvantages are inherently caused by the need for reorganization and key comparisons. In contrast, the major disadvantage of trie-based structures in terms of high memory consumption (created and accessed nodes) could be improved. In this paper, we argue for reconsidering prefix trees as in-memory index structures and we present the generalized trie, which is a prefix tree with variable prefix length for indexing arbitrary data types of fixed or variable length. The variable prefix length enables the adjustment of the trie height and its memory consumption. Further, we introduce concepts for reducing the number of created and accessed trie levels. This trie is order-preserving and has deterministic trie paths for keys, and hence, it does not require any dynamic reorganization or key comparisons. Finally, the generalized trie yields improvements compared to existing in-memory index structures, especially for skewed data. In conclusion, the generalized trie is applicable as general-purpose in-memory index structure in many different OLTP or hybrid (OLTP and OLAP) data management systems that require balanced read/write performance.' (via Tony Finch)
via:fanf  prefix-trees  tries  data-structures 
january 2013
The Adaptive Radix Tree: ARTful Indexing for Main-Memory Databases [PDF]
'Main memory capacities have grown up to a point where most databases fit into RAM. For main-memory database systems, index structure performance is a critical bottleneck. Traditional in-memory data structures like balanced binary search trees are not efficient on modern hardware, because they do not optimally utilize on-CPU caches. Hash tables, also often used for main-memory indexes, are fast but only support point queries. To overcome these shortcomings, we present ART, an adaptive radix tree (trie) for efficient indexing in main memory. Its lookup performance surpasses highly tuned, read-only search trees, while supporting very efficient insertions and deletions as well. At the same time, ART is very space efficient and solves the problem of excessive worst-case space consumption, which plagues most radix trees, by adaptively choosing compact and efficient data structures for internal nodes. Even though ART’s performance is comparable to hash tables, it maintains the data in sorted order, which enables additional operations like range scan and prefix lookup.' (via Tony Finch)
via:fanf  data-structures  trees  indexing  cache-aware  tries 
january 2013
HAT-trie: A Cache-conscious Trie-based Data Structure for Strings [PDF]
'Tries are the fastest tree-based data structures for managing strings in-memory, but are space-intensive. The burst-trie is almost as fast but reduces space by collapsing trie-chains into buckets. This is not however, a cache-conscious approach and can lead to poor performance on current processors. In this paper, we introduce the HAT-trie, a cache-conscious trie-based data structure that is formed by carefully combining existing components. We evaluate performance using several real-world datasets and against other high-performance data structures. We show strong improvements in both time and space; in most cases approaching that of the cache-conscious hash table. Our HAT-trie is shown to be the most efficient trie-based data structure for managing variable-length strings in-memory while maintaining sort order.' (via Tony Finch)
via:fanf  data-structures  tries  cache-aware  trees 
january 2013
"Matters Computational - Ideas, Algorithms, Source Code"
A hefty tome (in PDF format) containing lots of interesting algorithms and computational tricks; code is GPLv3 licensed
coding  algorithms  computation  via:cliffc  pdf  books 
january 2013
airlift/airline · GitHub
Annotations-based git-like CLI helper for Java
git  cli  java 
january 2013
Why did infinite scroll fail at Etsy?
'A/B testing must be done in a modularized fashion. The “fail” case he gave was when Etsy spent months developing and testing infinite scroll to their search listings, only to find that it had a negative impact on engagement.' [...] 'instead of having the goal of “test infinite scroll,” Etsy realized it needed to test each assumption separately, and this going forward is their game plan.'
usability  testing  design  etsy  ab-testing  test  modularization  via:hn 
january 2013
'a HTTP client mock library for Python, 100% inspired on ruby's FakeWeb [ https://github.com/chrisk/fakeweb ].' 'HTTPretty monkey patches Python's socket core module, reimplementing the HTTP protocol by mocking requests and responses.'
mocking  testing  http  python  ruby  unit-tests  tests  monkey-patching 
january 2013
English Letter Frequency Counts: Mayzner Revisited or ETAOIN SRHLDCU
Amazing how consistent the n-gram counts are between Peter Norvig's analysis of the 20120701 Google Books corpus and Mark Mayzner's 20,000-word corpus from the early 1960s
english  statistics  n-grams  words  etaoin-shrdlu  peter-norvig  mark-mayzner 
january 2013
Lesser known crimes: do you own that copyright?
A very interesting crime on the Irish statute books:

Section 141 of the Copyright and Related Rights Act 2000 provides: A person who, for financial gain, makes a claim to enjoy a right under this Part [ie. copyright] which is, and which he or she knows or has reason to believe is, false, shall be guilty of an offence and shall be liable on conviction on indictment to a fine not exceeding £100,000, or to imprisonment for a term not exceeding 5 years, or both.
ireland  copyright  ip  false-claims  law 
january 2013
Dan McKinley :: Effective Web Experimentation as a Homo Narrans
Good demo from Etsy's A/B testing, of how the human brain can retrofit a story onto statistically-insignificant results. To fix: 'avoid building tooling that enables fishing expeditions; limit our post-hoc rationalization by explicitly constraining it before the experiment. Whenever we test a feature on Etsy, we begin the process by identifying metrics that we believe will change if we 1) understand what is happening and 2) get the effect we desire.'
testing  etsy  statistics  a-b-testing  fishing  ulysses-contract  brain  experiments 
january 2013
Grand Grand SALE
Makers of "Feck It Sure It's Grand" merchandise are flogging stuff for 50 cent (+ shipping) in their "out with the old" sale
sales  posters  prints  feck  ireland  grand-grand 
january 2013
Keep predicting and you’ll be right eventually?
debunking Ken Ring, the kiwi “long term weather prediction” “scientist” who gets trundled out every year around this time
ken-ring  weather  predictions  ireland  rain 
january 2013
Patent trolls want $1,000 for using scanners
We are truly living in the future -- a dystopian future, but one nonetheless. A patent troll manages to obtain "gobbledigook" patents on using a scanner to scan to PDF, then attempts to shake down a bunch of small companies before eventually running into resistance, at which point it "forks" into a bunch of algorithmically-named shell companies, spammer-style, sending the same demands. Those demands in turn contain this beauty of Stockholm-syndrome-inducing prose:

'You should know also that we have had a positive response from the business community to our licensing program. As you can imagine, most businesses, upon being informed that they are infringing someone’s patent rights, are interested in operating lawfully and taking a license promptly. Many companies have responded to this licensing program in such a manner. Their doing so has allowed us to determine that a fair price for a license negotiated in good faith and without the need for court action is a payment of $900 per employee. We trust that your organization will agree to conform your behavior to respect our patent rights by negotiating a license rather than continuing to accept the benefits of our patented technology without a license. Assuming this is the case, we are prepared to make this pricing available to you.'

And here's an interesting bottom line:

The best strategy for target companies? It may be to ignore the letters, at least for now. “Ignorance, surprisingly, works,” noted Prof. Chien in an e-mail exchange with Ars.

Her study of startups targeted by patent trolls found that when confronted with a patent demand, 22 percent ignored it entirely. Compare that with the 35 percent that decided to fight back and 18 percent that folded. Ignoring the demand was the cheapest option ($3,000 on average) versus fighting in court, which was the most expensive ($870,000 on average).

Another tactic that clearly has an effect: speaking out, even when done anonymously. It hardly seems a coincidence that the Project Paperless patents were handed off to a web of generic-sounding LLCs, with demand letters signed only by “The Licensing Team,” shortly after the “Stop Project Paperless” website went up. It suggests those behind such low-level licensing campaigns aren’t proud of their behavior. And rightly so.
patents  via:fanf  networks  printing  printers  scanning  patent-trolls  project-paperless  adzpro  gosnel  faslan 
january 2013
Systemd, systemd-nspawn, and namespaces for Linux service compartmentalization
"Using ReadOnlyDirectories= and InaccessibleDirectories= you may setup a file system namespace jail for your service. Initially, it will be identical to your host OS' file system namespace. By listing directories in these directives you may then mark certain directories or mount points of the host OS as read-only or even completely inaccessible to the daemon."
compartmentalisation  security  systemd  jails  namespaces  linux 
january 2013
Scaling Crashlytics: Building Analytics on Redis 2.6
How one analytics/metrics co is using Redis on the backend
analytics  redis  presentation  metrics 
january 2013
29c3 HashDOS presentation slides (PDF)
Summary: MurmurHash still vulnerable, likewise Cityhash and Python's hash -- use SipHash
via:fanf  cityhash  siphash  hash  dos  security  hashdos  murmurhash 
january 2013
AWS Advent 2012
'an annual exploration of Amazon Web Services.' Some great hacks here
aws  amazon  advent  sysadmin  s3  ec2  chef  puppet  ops 
december 2012
How We Vagrant
the enStratus “solo installer”; what they use for one-box testing, staging, and customer stack deployment, using chef-solo and Vagrant
chef  virtualization  vagrant  chef-solo  deployment  enstratus  cluster  stack 
december 2012
Baklava code
'thin software layers don’t add much value, especially when you have many such layers piled on each other. Each layer has to be pushed onto your mental stack as you dive into the code. Furthermore, the layers of phyllo dough are permeable, allowing the honey to soak through. But software abstractions are best when they don’t leak. When you pile layer on top of layer in software, the layers are bound to leak.'
code  design  terminology  food  antipatterns 
december 2012
Cliff Click in "A JVM Does What?"
interesting YouTubed presentation from Azul's Cliff Click on some Java/JVM innards
presentation  concurrency  jvm  video  java  youtube  cliff-click 
december 2012
HBase Real-time Analytics & Rollbacks via Append-based Updates
Interesting concept for scaling up the write rate on massive key-value counter stores:
'Replace update (Get+Put) operations at write time with simple append-only writes and defer processing of updates to periodic jobs or perform aggregations on the fly if user asks for data earlier than individual additions are processed. The idea is simple and not necessarily novel, but given the specific qualities of HBase, namely fast range scans and high write throughput, this approach works very well.'
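A rough in-memory sketch of the append-then-aggregate idea described above. `AppendOnlyCounters` is a hypothetical stand-in I made up for illustration: it uses a plain list and dict rather than HBase or any real client API, and the "periodic job" is just a method call.

```python
from collections import defaultdict


class AppendOnlyCounters:
    """Counters where writes are blind appends and aggregation is deferred."""

    def __init__(self):
        self._compacted = defaultdict(int)  # totals already rolled up
        self._pending = []                  # append-only log of (key, delta)

    def increment(self, key, delta=1):
        # Write path: no Get+Put, just append the delta.
        self._pending.append((key, delta))

    def compact(self):
        # Stand-in for the periodic job: fold pending deltas into totals.
        for key, delta in self._pending:
            self._compacted[key] += delta
        self._pending.clear()

    def read(self, key):
        # Read path: aggregate on the fly if some deltas are not yet compacted.
        total = self._compacted.get(key, 0)
        total += sum(delta for k, delta in self._pending if k == key)
        return total


counters = AppendOnlyCounters()
for _ in range(3):
    counters.increment("page:/home")
counters.increment("page:/about", 5)

before = counters.read("page:/home")   # served from the pending log
counters.compact()
after = counters.read("page:/home")    # served from compacted totals
print(before, after, counters.read("page:/about"))
```

The trade-off is that reads get more expensive between compactions; the quoted argument is that fast range scans make that cost acceptable.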
counters  analytics  hbase  append  sematext  aggregation  big-data 
december 2012
The innards of Evernote's new business analytics data warehouse
replacing a giant MySQL star-schema reporting server with a Hadoop/Hive/ParAccel cluster
horizontal-scaling  scalability  bi  analytics  reporting  evernote  via:highscalability  hive  hadoop  paraccel 
december 2012
Bunnie Huang is building a once-off custom laptop design
As one commenter says, "it's like watching a Jedi construct his own light-saber." Quad-core ARM chips, on-board FPGA (!), and lots of other amazing hacker-friendly features; sounds like a one-of-a-kind device
laptop  hardware  bunnie-huang  arm  fpga  hackers 
december 2012
The Eire Markings
An attempt to catalogue some Emergency-era (ie. WWII) ground markings, used to notify US pilots that they were overflying the neutral Republic of Ireland
ireland  eire  history  wwii  the-emergency  war  geography  mapping 
december 2012
Shell Scripts Are Like Gremlins
Shell Scripts are like Gremlins. You start out with one adorably cute shell script. You commented it and it does one thing really well. It’s easy to read, everyone can use it. It’s awesome! Then you accidentally spill some water on it, or feed it late one night and omgwtf is happening!?

+1. I have to wean myself off the habit of automating with shell scripts where a clean, well-unit-tested piece of code would work better.
shell-scripts  scripting  coding  automation  sysadmin  devops  chef  deployment 
december 2012
The Aggregate Magic Algorithms
Obscure, low-level bit-twiddling tricks -- specifically:
Absolute Value of a Float, Alignment of Pointers, Average of Integers, Bit Reversal, Comparison of Float Values, Comparison to Mask Conversion, Divide Rounding, Dual-Linked List with One Pointer Field, GPU Any, GPU SyncBlocks, Gray Code Conversion, Integer Constant Multiply, Integer Minimum or Maximum, Integer Power, Integer Selection, Is Power of 2, Leading Zero Count, Least Significant 1 Bit, Log2 of an Integer, Next Largest Power of 2, Most Significant 1 Bit, Natural Data Type Precision Conversions, Polynomials, Population Count (Ones Count), Shift-and-Add Optimization, Sign Extension, Swap Values Without a Temporary, SIMD Within A Register (SWAR) Operations, Trailing Zero Count.

Many of these would be insane to use in anything other than the hottest of hot-spots, but good to have on file. (via Toby diPasquale)
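For a flavour of what a few of these look like, here are common textbook versions of three of the listed tricks. These are my own sketches, not copied from the linked page, and the function names are made up; the power-of-two rounding variant assumes values that fit in 32 bits and returns the input unchanged when it is already a power of two.

```python
def is_power_of_2(x: int) -> bool:
    """A power of two has exactly one set bit, so clearing the lowest
    set bit with x & (x - 1) must leave zero."""
    return x > 0 and (x & (x - 1)) == 0


def least_significant_1_bit(x: int) -> int:
    """Isolate the lowest set bit using two's-complement negation."""
    return x & -x


def power_of_2_at_least(x: int) -> int:
    """Smallest power of two >= x, assuming 1 <= x <= 2**31.

    Smears the highest set bit of x - 1 into every lower position,
    then adds one."""
    x -= 1
    x |= x >> 1
    x |= x >> 2
    x |= x >> 4
    x |= x >> 8
    x |= x >> 16
    return x + 1


print(is_power_of_2(64), least_significant_1_bit(12), power_of_2_at_least(5))
```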
hot-spots  optimisation  bit-twiddling  algorithms  via:codeslinger  snippets 
december 2012
The Mathematical Hacker
'The trouble with the Lisp-hacker tradition is that it is overly focused on the problem of programming -- compilers, abstraction, editors, and so forth -- rather than the problems outside the programmer's cubicle. I conjecture that the Lisp-school essayists -- Raymond, Graham, and Yegge -- have not “needed mathematics” because they spend their time worrying about how to make code more abstract. This kind of thinking may lead to compact, powerful code bases, but in the language of economics, there is an opportunity cost.'
mathematics  coding  maths  essay  hackers  lisp  fortran 
december 2012
Hotels to pay royalties on music - The Irish Times - Fri, Dec 14, 2012
'The operators of hotels, guesthouses and bed & breakfasts will have to pay royalties for any copyright music played in guest bedrooms [in Ireland]. [...] Under the agreement, the music charges will be set by Phonographic Performance Ireland Ltd (PPI). [...] When it initiated its case in 2010, the PPI said it was seeking payment of about €1 per bedroom per week or about 14 cent a night.'

I don't understand this. Most hotels do not play music in the rooms themselves. Does this apply if there is no music playing in the bedroom? Does it apply if the customer brings their own music? Are Dublin Bus to be next?
hotels  ppi  ireland  music  money  royalties 
december 2012
Authentication is machine learning
This may be the most insightful writing about authentication in years:
'From my brief time at Google, my internship at Yahoo!, and conversations with other companies doing web authentication at scale, I’ve observed that as authentication systems develop they gradually merge with other abuse-fighting systems dealing with various forms of spam (email, account creation, link, etc.) and phishing. Authentication eventually loses its binary nature and becomes a fuzzy classification problem.

This is not a new observation. It’s generally accepted for banking authentication and some researchers like Dinei Florêncio and Cormac Herley have made it for web passwords. Still, much of the security research community thinks of password authentication in a binary way [..]. Spam and phishing provide insightful examples: technical solutions (like Hashcash, DKIM signing, or EV certificates), have generally failed but in practice machine learning has greatly reduced these problems. The theory has largely held up that with enough data we can train reasonably effective classifiers to solve seemingly intractable problems.'

(via Tony Finch.)
passwords  authentication  big-data  machine-learning  google  abuse  antispam  dkim  via:fanf 
december 2012
Inside the Mcor IRIS
'The results are startlingly good. This 3D printed skull [see pic] looks almost real. This is the print quality everyone will be able to access when Mcor’s deal with Staples enables 3D printing from copy centers.'
mcor  staples  irish  tech  3d-printing  paper 
december 2012
Linux nukes 386 support
"there's a nostalgic cost: your old original 386 DX33 system from early 1991 won't be able to boot modern Linux kernels anymore. Sniff."

Now *THAT* is backwards compatibility.
linux  backwards-compatibility  386  history  linus-torvalds 
december 2012
GMail partial outage - Dec 10 2012 incident report [PDF]
TL;DR: a bad load balancer change was deployed globally, causing the impact. 21 minute time to detection. Single-location rollout is now on the cards
gmail  google  coe  incidents  postmortems  outages 
december 2012
Two Sides For Salvation « Code as Craft
Etsy's MySQL master-master pair configuration, and how it allows no-downtime schema changes
database  etsy  mysql  replication  schema  availability  downtime 
december 2012
BBC News - The hum that helps to fight crime
'Dr Harrison said: "If we have we can extract [the hum of the mains AC power's 50Hz wave] and compare it with the database, if it is a continuous recording, it will all match up nicely. "If we've got some breaks in the recording, if it's been stopped and started, the profiles won't match or there will be a section missing. Or if it has come from two different recordings looking as if it is one, we'll have two different profiles within that one recording." In the UK, because one national grid supplies the country with electricity, the fluctuations in frequency are the same the country over. So it does not matter if the recording has been made in Aberdeen or Southampton, the comparison will work.'
buzz  hum  uk  mains  power  50hz  crime  forensics  bbc 
december 2012
Damn Fine Print
lovely signed and editioned prints by Dublin's best illustrators at good prices. Turns out this was in connection with a show a few days ago, so the best ones are now sold out -- I love the Chris Judge Liberty Hall print -- but there's still a few good ones left. Brian Gallagher's Georgian doorway is a beauty.
illustration  dublin  prints  art  chris-judge 
december 2012
Back-up Tut and other decoy spatial antiquities
I like this idea -- a complete facsimile of King Tut's burial chamber. Bldgblog comments:
'On the 90th anniversary of the discovery of King Tut’s tomb, an “authorized facsimile of the burial chamber” has been created, complete “with sarcophagus, sarcophagus lid and the missing fragment from the south wall.” The resulting duplicate, created with the help of high-res cameras and lasers, is “an exact facsimile of the burial chamber,” one that is now “being sent to Cairo by The Ministry of Tourism of Egypt.”' [...]
'Interestingly, we read that this was "done under a licence to the University of Basel," which implies the very real possibility that unlicensed duplicate rooms might also someday be produced—that is, pirate interiors ripped or printed from the original data set, like building-scale "physibles," a kind of infringed architecture of object torrents taking shape as inhabitable rooms.' [...]
'In their book Anachronic Renaissance, for instance, Alexander Nagel and Christopher Wood write of what they call a long "chain of effective substitutions" or "effective surrogates for lost originals" that nonetheless reached the value and status of an icon in medieval Europe. "[O]ne might know that [these objects] were fabricated in the present or in the recent past," Nagel and Wood write, "but at the same time value them and use them as if they were very old things." They call this seeing in "substitutional terms".'
via:new-aesthetic  bldgblog  archaeology  facsimiles  copying  king-tut  egypt  history  3d-printing  physibles 
december 2012
A map of Dublin from 1686
via Come Here To Me -- 'The whole population of the county at the time was under 60,000. Ringsend, Merrion, Monkstown, Bullock and Dalkey on the Southside and Ballybough, Clontarf, Sutton and Hoath/Howth on the Northside are marked. Taken from the book Dublin: through space and time (2001).'

Massive tracts of land have been reclaimed since then, clearly -- the North bay comes all the way in to Ballybough!
via:chtm  maps  dublin  ireland  history 
december 2012
"This project aims at creating a simple efficient building block for "Big Data" libraries, applications and frameworks; thing that can be used as an in-memory, bounded queue with opaque values (sequence of JDK primitive values): insertions at tail, removal from head, single entry peeks), and that has minimal garbage collection overhead. Insertions and removals are as individual entries, which are sub-sequences of the full buffer.

GC overhead minimization is achieved by use of direct ByteBuffers (memory allocated outside of GC-prone heap); and bounded nature by only supporting storage of simple primitive value (byte, long) sequences where size is explicitly known.

Conceptually memory buffers are just simple circular buffers (ring buffers) that hold a sequence of primitive values, bit like arrays, but in a way that allows dynamic automatic resizings of the underlying storage. Library supports efficient reusing and sharing of underlying segments for sets of buffers, although for many use cases a single buffer suffices."
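
To make the circular-buffer idea concrete, here is a minimal sketch in Python rather than Java. It is not the library's API: the class name `LongRingBuffer` and its methods are hypothetical, it stores single 64-bit values rather than variable-length entries, it uses one preallocated `bytearray` where the library uses direct ByteBuffers, and it has a fixed capacity with no resizing or segment sharing.

```python
import struct


class LongRingBuffer:
    """Bounded FIFO of signed 64-bit values packed into one preallocated bytearray.

    Hypothetical illustration only; names and behaviour are assumptions.
    """

    def __init__(self, capacity):
        self._capacity = capacity
        self._buf = bytearray(capacity * 8)  # 8 bytes per value, allocated once
        self._head = 0                       # slot index of the oldest value
        self._size = 0                       # number of values currently stored

    def append(self, value):
        """Insert at the tail; return False instead of growing when full."""
        if self._size == self._capacity:
            return False
        tail = (self._head + self._size) % self._capacity  # wrap around the end
        struct.pack_into("<q", self._buf, tail * 8, value)
        self._size += 1
        return True

    def peek(self):
        """Return the oldest value without removing it, or None when empty."""
        if self._size == 0:
            return None
        return struct.unpack_from("<q", self._buf, self._head * 8)[0]

    def pop(self):
        """Remove and return the oldest value, or None when empty."""
        value = self.peek()
        if self._size > 0:
            self._head = (self._head + 1) % self._capacity
            self._size -= 1
        return value

    def __len__(self):
        return self._size
```

Because values are packed into a buffer that is allocated once, appending and popping create no per-entry container objects, which is the property the quoted description is after.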
gc  java  jvm  bytebuffer 
december 2012
The "MIG-in-the-middle" attack
or, a very effective demonstration of a man-in-the-middle interception and replay attack, from a 1980s Namibia-Angola war, via Ross Anderson
security  mig  war  mitm 
december 2012
Efficient concurrent long set and map
An ordered set and map data structure and algorithm for long keys and values, supporting concurrent reads by multiple threads and updates by a single thread.

Some good stuff in the linked blog posts about Clojure's PersistentHashMap and PersistentVector data structures, too.
arrays  java  tries  data-structures  persistent  clojure  concurrent  set  map 
december 2012
Hamming weight
Wikipedia page.
The Hamming weight of a string is the number of symbols that are different from the zero-symbol of the alphabet used. It is thus equivalent to the Hamming distance from the all-zero string of the same length. For the most typical case, a string of bits, this is the number of 1's in the string. In this binary case, it is also called the population count, popcount or sideways sum. It is the digit sum of the binary representation of a given number.

Contains an efficient algorithm to compute this for a given long value, by 'adding counts in a tree pattern.'
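
A minimal sketch of the tree-pattern approach for a 64-bit value, written in Python. The function name `popcount64` is my own, and the explicit masking is only there because Python integers are unbounded:

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def popcount64(x):
    """Count the 1 bits in a 64-bit value by adding counts in a tree pattern."""
    x &= MASK64
    # Each 2-bit field now holds the number of set bits in that pair.
    x = x - ((x >> 1) & 0x5555555555555555)
    # Add adjacent 2-bit counts into 4-bit fields.
    x = (x & 0x3333333333333333) + ((x >> 2) & 0x3333333333333333)
    # Add adjacent 4-bit counts into 8-bit fields.
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0F
    # The multiply sums all eight byte counts into the top byte.
    return ((x * 0x0101010101010101) & MASK64) >> 56
```

Each step halves the number of partial counts, so the whole thing takes a fixed handful of operations regardless of how many bits are set.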
algorithms  hamming-distance  bits  hamming  weight  binary 
december 2012
Irish mobile phone companies: still spammy
'Pro tip: if you're going to spam, try not to spam the DPC's Director of Investigations.' -- lolz
funny  oh-dear  three  hutchinson  ireland  mobile  spam  dpc  law 
december 2012
Scoop! The inside story of the news website that saved the BBC
The Register's take on the early days of www.bbc.co.uk. Lots of politics, unsurprisingly.
Fifteen years ago this month the BBC launched its News Online website. Developed internally with a skeleton team, the web service rapidly became the face of the BBC on the internet, and its biggest success story – winning four successive BAFTA awards.
Remarkably, it operated at a third of the cost of rival commercial online news operations – unheard of in public-sector IT projects. Devised before there were really any content management systems, the technical architecture became a template for all major news systems, and one that’s still in use today. The team endured some furious internal politicking and sabotage to survive.
bbc  news  history  web  uk  the-register 
december 2012
James Hamilton - Failures at Scale & How to Ride Through Them - AWS re:Invent 2012 - Cpn208
mostly an update of his classic USENIX paper, but pretty cool to come across a mention of a network monitoring system we've built on page 21 ;)
amazon  james-hamilton  reliabilty  slides  aws 
december 2012
Everything I Ever Learned About JVM Performance Tuning @Twitter
presentation by Attila Szegedi of Twitter from last year. Some good tips here, well-presented
tuning  jvm  java  gc  cms  presentations  slides  twitter 
december 2012
_The Pauseless GC Algorithm_ [pdf]
Paper from USENIX VEE '05, by Cliff Click, Gil Tene, and Michael Wolf of Azul Systems, describing some details of the Azul secret sauce (via b3n)
via:b3n  azul  gc  jvm  java  usenix  papers 
december 2012
The Rise And Fall Of The Obscure Music Download Blog: A Roundtable
One internet music "sharing" trend largely unnoticed by the powers that sue was the niche explosion of obscure music download blogs, lasting roughly from 2004-2008. Using free filesharing services like Rapidshare and Mediafire, and setting up sites on Blogspot and similar providers, these internet hubs stayed hidden in the open by catering to more discerning kleptomaniac audiophiles. Their specialty: parceling out ripped recordings — many of them copyrighted — from the more collectible and unknown corners of music's oddball, anomalous past.

While the RIAA was suing dead people for downloading Michael Jackson songs (and Madonna was using Soulseek to curse at teenagers), obscure music blogs racked up millions of hits, ripping and sharing 80s Japanese noise, 70s German prog, 60s San Francisco hippie freak-outs, 50s John Cage bootlegs, 30s gramophone oddities, Norwegian death metal, cold wave cassettes made by kids in their garages, and the like. It was the mid aughts, and the advent of digitization had inadvertently put the value of the music industry's "Top Ten" commercial product in peril. That same process transformed the value of old, collectible music as well. If one smart record collector was able to share the entire contents—music, artwork and all—of one vinyl LP on his blog, for free, and upload another item from his 1,000+ collection the next day, for weeks and years, and others like him did the same, competing with each other about who could upload the rarest and most sought-after record, and anyone who downloaded it could then share it again and again… Suddenly everyone in the world had the coolest record collection in the world; and soon, nobody in the world had the coolest record collection in the world.

Obscure music download blogs weren't shut down like Napster or Megaupload were (though they were indirectly affected by that crackdown); they just, mysteriously, seemed to burn out on their own sometime around 2008. While some are still around, their number represents only a fraction of that mid-00s heyday. Was this because obscure music blogs had overshared the underexposed and blown the whole thing into oblivion? Is the fact that a guy in Japan will no longer pay $500 on eBay for a first pressing of the No New York compilation because he can find it for free on the internet good for the world? Was the commodity-lost but the knowledge-gained an even exchange? To explore what was going on then, I assembled this email roundtable discussion between creators of some of the most popular blogs of the time: Eric Lumbleau of Mutant Sounds, Liam Elms of 8 Days in April, Frank of Systems of Romance and Brian Turner, Music Director of WFMU.

(via Loreana Rushe)
music  mp3  blogs  obscure  via-loreana-rushe  history  2000s 
november 2012
HTTP Error 403: The service you requested is restricted - Vodafone Community
Looks like Vodafone Ireland are failing to scale their censorware; clients on their network reporting "HTTP Error 403: The service you requested is restricted". According to a third-party site, this error is produced by the censorship software they use when it's insufficiently scaled for demand:

"When you try to use HTTP Vodafone route a request to their authentication server to see if your account is allow to connect to the site. By default they block a list of adult/premium web sites (this is service you have switched on or off with your account). The problem is at busy times this validation service is overloaded and so their systems get no response as to whether the site is allowed, so assume the site you asked for is restricted and gives the 403 error. Once this happens you seem to have to make new 3G data connection (reset the phone, move cell or let the connection time out) to get it to try again."

Sample: http://pic.twitter.com/N1lAwBjW
scaling  ireland  vodafone  fail  censorware  scalability  customer-service 
november 2012
Special encoding of small aggregate data types in Redis
Nice performance trick in Redis on hash storage:

'In theory in order to guarantee that we perform lookups in constant time (also known as O(1) in big O notation) there is the need to use a data structure with a constant time complexity in the average case, like an hash table. But many times hashes contain just a few fields. When hashes are small we can instead just encode them in an O(N) data structure, like a linear array with length-prefixed key value pairs. Since we do this only when N is small, the amortized time for HGET and HSET commands is still O(1): the hash will be converted into a real hash table as soon as the number of elements it contains will grow too much (you can configure the limit in redis.conf). This does not work well just from the point of view of time complexity, but also from the point of view of constant times, since a linear array of key value pairs happens to play very well with the CPU cache (it has a better cache locality than an hash table).'
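
A rough sketch of the idea in Python, not Redis's actual implementation: the class name `SmallHash` and the `max_flat_entries` parameter are hypothetical stand-ins for the real encoding and its redis.conf limit.

```python
class SmallHash:
    """Keep pairs in a flat list while small; switch to a real hash table when large.

    Hypothetical illustration only; names and the threshold are assumptions.
    """

    def __init__(self, max_flat_entries=64):
        self.max_flat_entries = max_flat_entries
        self._flat = []      # [key0, value0, key1, value1, ...]
        self._table = None   # becomes a dict once the flat list grows too big

    def is_flat(self):
        return self._table is None

    def hset(self, key, value):
        if self._table is not None:
            self._table[key] = value
            return
        # O(N) scan, but N is bounded by max_flat_entries.
        for i in range(0, len(self._flat), 2):
            if self._flat[i] == key:
                self._flat[i + 1] = value
                return
        self._flat.extend((key, value))
        if len(self._flat) // 2 > self.max_flat_entries:
            # One-way conversion to a real hash table.
            self._table = dict(zip(self._flat[0::2], self._flat[1::2]))
            self._flat = []

    def hget(self, key):
        if self._table is not None:
            return self._table.get(key)
        for i in range(0, len(self._flat), 2):
            if self._flat[i] == key:
                return self._flat[i + 1]
        return None
```

The linear scan is bounded by the threshold, so lookups stay effectively constant-time while the small case avoids the per-entry overhead of a hash table.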
memory  redis  performance  big-o  hash-tables  storage  coding  cache  arrays 
november 2012
The trench talk that is now entrenched in the English language
'From cushy to crummy and blind spot to binge drink, a new study reveals the impact the First World War had on the English language and the words it introduced.' Incredible comments, too...
english  etymology  history  wwi  great-war  via:sinead-gleeson  words  language 
november 2012
Nintendo's work on Miiverse Penis Drawing Detection

'The unique feature of the Miiverse is being able to send drawings, not just text. But since the advent of the internet, there have always been those who have used it for unsavory purposes.'
'Motoyama: we never had such a problem with our Hatena services. But, when we brought Hatena Flipnote to the West, we were caught off-guard by the amount of penises drawn by people.
Kurisu: So the team and I had to come up with a way to create a system that auto-detects those types of pictures. [...]
'Motoyama: After a week, we made very good progress on the system. Then we tested the system with Nintendo of America and told them to start drawing. It went horribly.
Kurisu: What we learned is that people enjoy drawing penises. Multiple ones. (laughs) The system was not prepared to handle that.'

See also the "time-to-penis" metric in MMO games: http://www.joystiq.com/2009/03/24/overheard-gdc09-ttp-time-to-penis/
nintendo  image-detection  ttp  metrics  games  gaming  mmo  miiverse  drawing 
november 2012
Conor’s 2012 Raspberry Pi Christmas Gift Guide
Ah, memories! Wish my kiddies were old enough for one of these...

I really think this Christmas could be a lovely replay of 1982 for a lot of people, like me, who got their first home computer that year. You could have so much fun on Christmas Day messing with the RPi rather than falling asleep in front of the fire. Just don’t fight over who gets the telly when Doctor Who is on.

Whilst the bare-bones nature of the Raspberry Pi is wonderful, it is unusable out of the box unless you are a house with smartphones, digital cameras and existing PCs already that you can raid for components. What you want to avoid is a repeat of me that December in 1982 with my brand-new 16K ZX Spectrum which didn’t work on our Nordmende TV until two weeks later when the RTV Rentals guy came and replaced the TV Tuner. Two weeks typing Beep 1,2 to make sure it wasn’t broken.
raspberry-pi  gifts  computers  kids  hacking  education  gadgets  christmas 
november 2012
Memory Barriers/Fences
Martin Thompson with a good description of the x86 memory barrier model and how it interacts with Java's JSR-133 memory model
architecture  hardware  programming  java  concurrency  volatile  jsr-133 
november 2012
Does it run Minecraft? Well, since you ask…
Going by the number of Minecraft fans among my friends' sons and daughters in the 8-12 age group, this is a great idea:
We sent a bunch of [Raspberry Pi] boards out to Notch and the guys at Mojang in Stockholm a little while back, and they’ve produced a port of Minecraft: Pocket Edition which they’re calling  Minecraft: Pi Edition. It’ll carry a revised feature set and support for several programming languages, so you can code direct into Minecraft before you start playing. (Or you can just – you know – play.)
minecraft  gaming  programming  coding  raspberry-pi  kids  learning  education 
november 2012
IBM insider: How I caught my wife while bug-hunting on OS/2 • The Register
Wow, working for IBM in the 80's was truly shitty.

'IBM HR came up with a plan that summed up the department's view of tech staff: a dinner dance. In Southsea. For our non-British readers this is not a glamorous location.

As a scumbag contractor I wasn’t invited, but since I was dating one of the seven women on the project, I went anyway and was impressed by the way IBM had tried so very hard to make the inside of a municipal leisure centre look like Hawaii. This is so crap that the integrity checks I’ve installed to watch myself for incipient senility keep flagging it as a false memory.

The only way I can force myself to believe the idea that the richest corporation on the planet behaved that way is that the girl who took me is now a reassuringly expensive lawyer who was kind enough to marry me and so we have photographic evidence.

(I wish to make it clear that I’m not saying IBM had the worst HR of any firm in the world, merely that my 28 years in technology and banking have never exposed a worse one to me.)'

And indeed, so were MS:

'We, on the other hand, were regarded as hopelessly bureaucratic. After Microsoft lost the source code for the actual build of OS/2 we shipped, I reported a bug triggered when you double-clicked on Chkdsk twice: the program would fire up twice and both would try to fix the disk at the same time, causing corruption. I noted that this “may not be consistent with the user's goals as he sees them at this time”. This was labelled a user error, and some guy called Ballmer questioned why I had this “obsession” with perfect code.'

(thanks, Conor!)
via:conor-delaney  os2  ibm  microsoft  work  1980s  pc  uk  steve-ballmer 
november 2012
John Carmack's .plan update from 10/14/98
John Carmack presciently defines the benefits of an event sourcing architecture in 1998, as a key part of Quake 3's design:

"The key point: Journaling of time along with other inputs turns a
realtime application into a batch process, with all the attendant
benefits for quality control and debugging. These problems, and
many more, just go away. With a full input trace, you can accurately
restart the session and play back to any point (conditional
breakpoint on a frame number), or let a session play back at an
arbitrarily degraded speed, but cover exactly the same code paths."

(This was the first time I'd heard of the concept, at least.)
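
A toy sketch of the replay idea in Python; it has nothing to do with Quake 3's actual code, and the `Event` type and the "move"/"jump" inputs are invented for illustration. State is only ever changed by applying journaled inputs, so replaying the journal up to a given frame reproduces the state at that frame.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    frame: int   # journaled time of the input
    kind: str    # hypothetical input types: "move" or "jump"
    amount: int


def apply_event(state, event):
    """Pure transition function: same state and event always give the same result."""
    x, y = state
    if event.kind == "move":
        return (x + event.amount, y)
    if event.kind == "jump":
        return (x, y + event.amount)
    raise ValueError("unknown event kind: " + event.kind)


def replay(journal, until_frame=None, initial=(0, 0)):
    """Rebuild state from the journal, optionally stopping after a given frame."""
    state = initial
    for event in journal:
        if until_frame is not None and event.frame > until_frame:
            break  # acts like a conditional breakpoint on a frame number
        state = apply_event(state, event)
    return state
```

Since the clock is part of the journal rather than read from the system, a replay can run at any speed and still cover the same code paths.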
john-carmack  design  software  coding  event-sourcing  events  quake-3 
november 2012
How Team Obama’s tech efficiency left Romney IT in dust | Ars Technica
The web-app dev and ops best practices used by the Obama campaign's tech team. Some key tools: Puppet, EC2, Asgard, Cacti, Opsview, StatsD, Graphite, Seyren, Route53, Loggly, etc.
obama  campaigns  tools  ops  asgard  ec2  aws  route53 
november 2012
Unlike other tools intended to solve the JVM startup problem (e.g. Nailgun, Cake), Drip does not use a persistent JVM. There are many pitfalls to using a persistent JVM, which we discovered while working on the Cake build tool for Clojure. The main problem is that the state of the persistent JVM gets dirty over time, producing strange errors and requiring liberal use of cake kill whenever any error is encountered, just in case dirty state is the cause.

Instead of going down this road, Drip uses a different strategy. It keeps a fresh JVM spun up in reserve with the correct classpath and other JVM options so you can quickly connect and use it when needed, then throw it away. Drip hashes the JVM options and stores information about how to connect to the JVM in a directory with the hash value as its name.

(via HN)
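
A small sketch of the "directory named after a hash of the JVM options" idea, in Python. The quoted text does not say which hash Drip uses or how its directories are laid out, so SHA-1, the function names and the base directory argument are all assumptions.

```python
import hashlib
from pathlib import Path


def options_key(classpath, jvm_args):
    """Hash the classpath and JVM options into a stable hex key (SHA-1 is an assumption)."""
    h = hashlib.sha1()
    for part in [classpath, *jvm_args]:
        h.update(part.encode("utf-8"))
        h.update(b"\0")  # separator so ["ab", "c"] and ["a", "bc"] hash differently
    return h.hexdigest()


def connection_dir(base, classpath, jvm_args):
    """Directory where connection details for a matching spare JVM would be stored."""
    return Path(base) / options_key(classpath, jvm_args)
```

Two invocations with identical options map to the same directory and can pick up the same spare JVM, while any change in options leads to a different directory.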
java  command-line  tools  startup  speed 
november 2012
Australian VCE exam question accidentally includes photoshopped Battletech mech
File under New Aesthetic:
Exams for the popular History: Revolution subject were originally supposed to include the artwork Storming the Winter palace on 25th October 1917 by Nikolai Kochergin, which depicts events during the October Revolution, which was instrumental in the larger Russian Revolution of 1917. When students opened their exam this morning they found an altered version of the work with what appears to be a large "BattleTech Marauder" robot aiding the rising revolutionaries in the background.
new-aesthetic  funny  photoshop  russia  1917  battletech  mechs  vcaa 
november 2012
Building an Impenetrable ZooKeeper (PDF)
great presentation on operational tips for a reliable ZK cluster (via Bill deHora)
via:bill-dehora  zookeeper  ops  syadmin 
november 2012
What can data scientists learn from DevOps?

'Rather than continuing to pretend analysis is a one-time, ad hoc action, automate it. [...] you need to maintain the automation machinery, but a cost-benefit analysis will show that the effort rapidly pays off — particularly for complex actions such as analysis that are nontrivial to get right.' (via @fintanr)
via:fintanr  data-science  data  automation  devops  analytics  analysis 
november 2012
WebTechStacks by martharotter - Kippt
A good set of infrastructure/devops tech blogs, collected by Martha Rotter
via:martharotter  blogs  infrastructure  devops  ops  web  links 
november 2012
AnandTech - The Intel SSD DC S3700: Intel's 3rd Generation Controller Analyzed
Interesting trend; Intel moved from a btree to an array-based data structure for their logical-block address indirection map, in order to reduce worst-case latencies (via Martin Thompson)
latency  intel  via:martin-thompson  optimization  speed  p99  data-structures  arrays  btrees  ssd  hardware 
november 2012