jm + postgres   8

glibc changed their UTF-8 character collation ordering across versions, breaking postgres
This is terrifying:
Streaming replicas—and by extension, base backups—can become dangerously broken when the source and target machines run slightly different versions of glibc. Particularly, differences in strcoll and strcoll_l leave "corrupt" indexes on the slave. These indexes are sorted out of order with respect to the strcoll running on the slave. Because postgres is unaware of the discrepancy is uses these "corrupt" indexes to perform merge joins; merges rely heavily on the assumption that the indexes are sorted and this causes all the results of the join past the first poison pill entry to not be returned. Additionally, if the slave becomes master, the "corrupt" indexes will in cases be unable to enforce uniqueness, but quietly allow duplicate values.


Moral of the story -- keep your libc versions in sync across storage replication sets!
postgresql  scary  ops  glibc  collation  utf-8  characters  indexing  sorting  replicas  postgres 
6 weeks ago by jm
When Boring is Awesome: Building a scalable time-series database on PostgreSQL
Nice. we built something along these lines atop MySQL before -- partitioning by timestamp is the key. (via Nelson)
database  postgresql  postgres  timeseries  tsd  storage  state  via:nelson 
april 2017 by jm
Why Uber Engineering Switched from Postgres to MySQL
Uber bringing the smackdown for the HN postgres fanclub, with some juicy technical details of issues that caused them pain. FWIW, I was bitten by crappy postgres behaviour in the past (specifically around vacuuming and pgbouncer), so I've long been a MySQL fan ;)
database  mysql  postgres  postgresql  uber  architecture  storage  sql 
july 2016 by jm
jgc on Cloudflare's log pipeline
Cloudflare are running a 40-machine, 50TB Kafka cluster, ingesting at 15 Gbps, for log processing. Also: Go producers/consumers, capnproto as wire format, and CitusDB/Postgres to store rolled-up analytics output. Also using Space Saver (top-k) and HLL (counting) estimation algorithms.
logs  cloudflare  kafka  go  capnproto  architecture  citusdb  postgres  analytics  streaming 
june 2015 by jm
Mnesia and CAP
A common “trick” is to claim:

'We assume network partitions can’t happen. Therefore, our system is CA according to the CAP theorem.'

This is a nice little twist. By asserting network partitions cannot happen, you just made your system into one which is not distributed. Hence the CAP theorem doesn’t even apply to your case and anything can happen. Your system may be linearizable. Your system might have good availability. But the CAP theorem doesn’t apply. [...]
In fact, any well-behaved system will be “CA” as long as there are no partitions. This makes the statement of a system being “CA” very weak, because it doesn’t put honesty first. I tries to avoid the hard question, which is how the system operates under failure. By assuming no network partitions, you assume perfect information knowledge in a distributed system. This isn’t the physical reality.
cap  erlang  mnesia  databases  storage  distcomp  reliability  ca  postgres  partitions 
october 2014 by jm
Validate SQL queries at compile-time in Rust
The sql! macro will validate that its string literal argument parses as a valid Postgres query.


Based on https://pganalyze.com/blog/parse-postgresql-queries-in-ruby.html , which links the PostgreSQL server code directly into a C extension. Mad stuff, Ted!

(via Rob Clancy)
macros  postgres  compile  validation  sql  rust  coding 
october 2014 by jm
Database Migrations Done Right
The rule is simple. You should never tie database migrations to application deploys or vice versa. By minimising dependencies you enable faster, easier and cleaner deployments.


A solid description of why this is a good idea, from an ex-Guardian dev.
migrations  database  sql  mysql  postgres  deployment  ops  dependencies  loose-coupling 
may 2014 by jm

Copy this bookmark:



description:


tags: