jm + elasticsearch   14

Developing a time-series "database" based on HdrHistogram
Histogram aggregation is definitely a sensible way to store this kind of data
storage  elasticsearch  metrics  hdrhistogram  histograms  tideways 
april 2017 by jm
Syscall Auditing at Scale
auditd -> go-audit -> elasticsearch at Slack
elasticsearch  auditd  syscalls  auditing  ops  slack 
november 2016 by jm
Building a Regex Search Engine for DNA | Hacker News
The original post is pretty mediocre -- a search engine which handles a corpus of "thousands" of plasmids from "a scientist's personal library", and which doesn't handle fuzzy matches? I think that's called grep -- but the HN comments are good
grep  regular-expressions  hacker-news  strings  dna  genomics  search  elasticsearch 
april 2016 by jm
Heroic
Spotify wrote their own metrics store on ElasticSearch and Cassandra. Sounds very similar to Prometheus
cassandra  elasticsearch  spotify  monitoring  metrics  heroic 
december 2015 by jm
Intercom Engineering Insights - Scale and Reliability 2015
next Intercom hiring^Wevent coming up, Dec 10th in Dublin, talking about how they scale and ops their ElasticSearch and Mongo clusters
elasticsearch  mongodb  intercom  engineering  talks  dublin 
december 2015 by jm
Elasticsearch and data loss
"@alexbfree @ThijsFeryn [ElasticSearch is] fine as long as data loss is acceptable. https://aphyr.com/posts/317-call-me-maybe-elasticsearch . We lose ~1% of all writes on average."
elasticsearch  data-loss  reliability  data  search  aphyr  jepsen  testing  distributed-systems  ops 
october 2015 by jm
Call me maybe: Elasticsearch 1.5.0
tl;dr: Elasticsearch still hoses data integrity on partition, badly
elasticsearch  reliability  data  storage  safety  jepsen  testing  aphyr  partition  network-partitions  cap 
may 2015 by jm
Maintaining performance in distributed systems [slides]
Great slide deck from Elasticsearch on JVM/dist-sys performance optimization
performance  elasticsearch  java  jvm  ops  tuning 
january 2015 by jm
Call me maybe: Elasticsearch
Wow, these are terrible results. From the sounds of it, ES just cannot deal with realistic outage scenarios and is liable to suffer catastrophic damage in reasonably-common partitions.
If you are an Elasticsearch user (as I am): good luck. Some people actually advocate using Elasticsearch as a primary data store; I think this is somewhat less than advisable at present. If you can, store your data in a safer database, and feed it into Elasticsearch gradually. Have processes in place that continually traverse the system of record, so you can recover from ES data loss automatically.
elasticsearch  ops  storage  databases  jepsen  partition  network  outages  reliability 
june 2014 by jm
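The recovery advice quoted above (keep a safer system of record, traverse it continually, and repair whatever Elasticsearch lost) can be sketched roughly as follows. This is a minimal illustration only: plain in-memory dicts stand in for both the primary database and the ES index, and `reconcile`, `system_of_record`, and `search_index` are hypothetical names, not part of any Elasticsearch client API.

```python
from typing import Dict, List


def reconcile(system_of_record: Dict[str, dict],
              search_index: Dict[str, dict]) -> List[str]:
    """Re-index every document that is missing from, or stale in, the index.

    Both arguments are plain dicts keyed by document ID; they are hypothetical
    stand-ins for a real primary database and a real search index.
    Returns the sorted IDs that had to be repaired.
    """
    repaired = []
    for doc_id, doc in system_of_record.items():
        if search_index.get(doc_id) != doc:
            # Copy the document so the two stores never share a mutable object.
            search_index[doc_id] = dict(doc)
            repaired.append(doc_id)
    return sorted(repaired)


if __name__ == "__main__":
    primary = {"a": {"v": 1}, "b": {"v": 2}, "c": {"v": 3}}
    index = {"a": {"v": 1}, "c": {"v": 0}}  # "b" was lost, "c" is stale
    print(reconcile(primary, index))        # IDs that needed repair
    print(index == primary)                 # index now matches the primary
```

A real deployment would page through the primary store and issue bulk index requests rather than compare whole dicts, but the shape of the loop is the same: the system of record is authoritative and the index is treated as a rebuildable cache.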
Resiliency And Elasticsearch
Blog post from the ES team. They use "evil tests" -- basically unit/system tests, particularly using randomized error-injecting mock infrastructure. Good practices; I've done the same myself quite recently for Swrve's realtime infrastructure
elasticsearch  resiliency  network-partitions  reliability  testing  mocking  error-injection 
april 2014 by jm
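For a rough idea of what randomized error-injecting mock infrastructure looks like, here is a minimal sketch. It is not the ES team's test harness; `FlakyStore` and `put_with_retry` are hypothetical names invented for illustration. The mock fails writes at random from a seeded generator, so a failing run can be replayed exactly by reusing the seed.

```python
import random


class FlakyStore:
    """Hypothetical mock store that randomly fails writes (seeded, so replayable)."""

    def __init__(self, seed: int, failure_rate: float):
        self._rng = random.Random(seed)  # private RNG: same seed, same failures
        self._failure_rate = failure_rate
        self.data = {}

    def put(self, key, value):
        # Inject a failure before the write, so a failed put leaves no data behind.
        if self._rng.random() < self._failure_rate:
            raise ConnectionError("injected failure")
        self.data[key] = value


def put_with_retry(store, key, value, attempts=50):
    """Example code under test: retry a write until it succeeds or attempts run out."""
    for _ in range(attempts):
        try:
            store.put(key, value)
            return True
        except ConnectionError:
            continue
    return False


if __name__ == "__main__":
    # Run the same check across many seeds; report the seed on failure for replay.
    for seed in range(100):
        store = FlakyStore(seed=seed, failure_rate=0.5)
        acknowledged = put_with_retry(store, "k", seed)
        # Invariant: an acknowledged write must be readable afterwards.
        assert not acknowledged or store.data["k"] == seed, f"failed with seed {seed}"
```

The useful property is the invariant check rather than any particular failure pattern: whatever errors get injected, a write that was acknowledged must still be present.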
Behind the Screens at Loggly
Boost ASIO at the front end (!), Kafka 0.8, Storm, and ElasticSearch
boost  scalability  loggly  logging  ingestion  cep  stream-processing  kafka  storm  architecture  elasticsearch 
september 2013 by jm
Interview with the Github Elasticsearch Team
Good background on GitHub's Elasticsearch scaling efforts. Some rather horrific split-brain problems under load, and crashes due to OpenJDK bugs (sounds like OpenJDK *still* isn't ready for production). Painful.
elasticsearch  github  search  ops  scaling  split-brain  outages  openjdk  java  jdk  jvm 
september 2013 by jm
ElasticSearch
nifty; Apache-licensed distributed, RESTful, JSON-over-HTTP, schemaless search server with multi-tenancy
search  distributed  rest  json  apache  elasticsearch  http  from delicious
february 2010 by jm
