
A higher order estimate of the optimum checkpoint interval for restart dumps
tl;dr:
The bottom line is as follows, where δ is the time it takes to create a dump and M is the mean time between failures:
If δ < M/2, use τopt = √(2δM) − δ.
Otherwise (a dump takes longer than M/2 to create), just use τopt = M.
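As a rough illustration of the rule (a minimal sketch; the function and variable names are mine, and δ and M are assumed to be in the same time unit):

import math

def optimal_checkpoint_interval(dump_time, mtbf):
    """Compute time to run between restart dumps.

    dump_time -- delta, the time it takes to write one dump
    mtbf      -- M, the mean time between failures (same unit as dump_time)
    """
    if dump_time < mtbf / 2:
        # higher-order estimate: tau_opt = sqrt(2 * delta * M) - delta
        return math.sqrt(2 * dump_time * mtbf) - dump_time
    # dumps are so expensive that one checkpoint per MTBF is the best you can do
    return mtbf

# e.g. a 5-minute dump against a 24-hour (1440-minute) MTBF:
# sqrt(2 * 5 * 1440) - 5 = 115 minutes between dumps
print(optimal_checkpoint_interval(5, 1440))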
dumping  periodic-tasks  scheduling  frequency  maths  optimal  interval  checkpointing 
june 2015 by jm
Why I'm Walking Away From CouchDB
In practice there are two gotchas so painful that I am looking for a replacement with a different feature set than CouchDB provides. The location-tracking project icecondor.com uses CouchDB to store 20,000 new records per day; it has more write traffic than read traffic and runs on modest hardware. Those two gotchas are:

1. View Index updates.

While I have only a vague understanding of why view index updates are slow, bulky and important, in practice the behaviour is unworkable. Every write sets a trap for the first reader to come along after it: the more writes there are, the bigger the trap, because that first reader has to wait on the CouchDB process that refreshes the view index on an as-needed basis. I believe this trade-off was made to keep writes fast (no need to update the view index until all the writes are actually complete, right?), but write traffic here is heavier than read traffic, and the time needed for that index refresh causes the webapp to crash because it's not set up to handle timeouts from a database query. The workaround is as hackish as one can imagine: cron jobs that hit every map/reduce query to keep the indexes fresh, as sketched below.
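A minimal sketch of that cron-job workaround, assuming a local CouchDB reached over plain HTTP with the Python requests library; the database name is hypothetical. It walks the design documents and queries each view with limit=0, which is enough to make CouchDB bring the index up to date without returning any rows.

import requests  # assumption: talking straight HTTP to a local CouchDB

COUCH = "http://localhost:5984"
DB = "icecondor"  # hypothetical database name

# list the design documents (their ids all sort between "_design/" and "_design0")
rows = requests.get(
    f"{COUCH}/{DB}/_all_docs",
    params={"startkey": '"_design/"', "endkey": '"_design0"', "include_docs": "true"},
).json()["rows"]

for row in rows:
    ddoc_name = row["id"].split("/", 1)[1]
    for view_name in row["doc"].get("views", {}):
        # querying the view forces the index refresh; limit=0 skips the result rows
        requests.get(f"{COUCH}/{DB}/_design/{ddoc_name}/_view/{view_name}",
                     params={"limit": 0})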

2. Append-only database file

Append-only is, in theory, a great way to ensure on-disk reliability: a system crash during an append should only affect that append. It's a crash during an update to existing parts of the file that risks the integrity of more than what's being updated. With so many layers of caching and optimization in the kernel, the filesystem and now the internals of SSD drives, I'm not sure append-only gives extra protection anymore.

What it does do is create a huge operational headache: the on-disk file can never be allowed to grow beyond half the available storage space. Record deletion consumes new disk space, and as the half-full mark approaches, compaction must be run, rewriting the entire database to the filesystem and leaving out the records that are no longer needed (a compaction trigger is sketched below). If the data file does grow beyond half the partition, the system has essentially crashed: there is no way to compact the file, and soon the partition will be full. This is a likely scenario when there is a lot of record-deletion activity.

The system in question does a lot of writes of temporary data that is followed by deletes a few days later. There is also a lot of permanently stored data that hardly ever gets used. Rewriting every byte of those long-lived records during compaction is an enormous amount of wasted I/O, doubly so given that SSD drives have a limited write-cycle lifespan.
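For what it's worth, compaction itself is just an HTTP call. A sketch of the kind of size check and trigger described above; the database name, mount point and 40% threshold are all assumptions for illustration.

import shutil
import requests

COUCH = "http://localhost:5984"
DB = "icecondor"                      # hypothetical database name
DATA_PARTITION = "/var/lib/couchdb"   # assumed mount point holding the .couch file

db_file_size = requests.get(f"{COUCH}/{DB}").json()["disk_size"]
partition_size = shutil.disk_usage(DATA_PARTITION).total

# compact well before the file reaches half the partition; past that point
# there is no longer enough free space for the rewritten copy
if db_file_size > 0.4 * partition_size:
    requests.post(f"{COUCH}/{DB}/_compact",
                  headers={"Content-Type": "application/json"})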
nosql  couchdb  consistency  checkpointing  databases  data-stores  indexing 
april 2013 by jm
