ML   39776

We enable the next generation of therapeutic devices, clinical services, and disease-modifying therapeutic drugs for patients with neurological disease.
pharmaceutical  software  ml  USA 
3 days ago by shalmaneser
The Case for Learned Index Structures
'Indexes are models: a B-Tree-Index can be seen as a model to map a key to the position of a record within a sorted array, a Hash-Index as a model to map a key to a position of a record within an unsorted array, and a BitMap-Index as a model to indicate if a data record exists or not. In this exploratory research paper, we start from this premise and posit that all existing index structures can be replaced with other types of models, including deep-learning models, which we term learned indexes. The key idea is that a model can learn the sort order or structure of lookup keys and use this signal to effectively predict the position or existence of records. We theoretically analyze under which conditions learned indexes outperform traditional index structures and describe the main challenges in designing learned index structures. Our initial results show, that by using neural nets we are able to outperform cache-optimized B-Trees by up to 70% in speed while saving an order-of-magnitude in memory over several real-world data sets. More importantly though, we believe that the idea of replacing core components of a data management system through learned models has far reaching implications for future systems designs and that this work just provides a glimpse of what might be possible.'

Excellent follow-up thread from Henry Robinson:

'The fact that the learned representation is more compact is very neat. But also it's not really a surprise that, given the entire dataset, we can construct a more compact function than a B-tree which is *designed* to support efficient updates.' [...] 'given that the model performs best when trained on the whole data set - I strongly doubt B-trees are the best we can do with the current state-of-the art.'
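To make the "indexes are models" premise concrete, here is a minimal sketch, not the paper's implementation: a single least-squares line stands in for the neural nets the abstract mentions, and the class and method names (`LinearLearnedIndex`, `lookup`) are hypothetical. The model predicts a key's position in a sorted array, and the worst-case prediction error seen at build time bounds a small binary search around that guess. It assumes a non-empty list of numeric keys.

```python
import bisect


class LinearLearnedIndex:
    """Toy learned index: a fitted line maps key -> position in a sorted array.

    Illustrative only; assumes a non-empty list of numeric keys.
    """

    def __init__(self, keys):
        self.keys = sorted(keys)
        n = len(self.keys)
        # Ordinary least-squares fit of position against key.
        mean_k = sum(self.keys) / n
        mean_p = (n - 1) / 2
        var_k = sum((k - mean_k) ** 2 for k in self.keys)
        cov = sum((k - mean_k) * (p - mean_p) for p, k in enumerate(self.keys))
        self.slope = cov / var_k if var_k else 0.0
        self.intercept = mean_p - self.slope * mean_k
        # The largest error over the stored keys bounds the search window.
        self.max_err = max(abs(self._predict(k) - p) for p, k in enumerate(self.keys))

    def _predict(self, key):
        return int(round(self.slope * key + self.intercept))

    def lookup(self, key):
        """Return a position of `key` in the sorted array, or None if absent."""
        n = len(self.keys)
        guess = self._predict(key)
        lo = min(max(0, guess - self.max_err), n)
        hi = max(lo, min(n, guess + self.max_err + 1))
        # Binary search only inside the error window around the prediction.
        i = bisect.bisect_left(self.keys, key, lo, hi)
        if i < n and self.keys[i] == key:
            return i
        return None


index = LinearLearnedIndex([3, 8, 15, 16, 23, 42, 108, 4000])
print(index.lookup(42))  # position of 42 in the sorted array
print(index.lookup(5))   # None, since 5 is not stored
```

The sketch also illustrates Robinson's point: the model and its error bound are fitted to the whole data set up front, so an insert can invalidate `max_err`, whereas a B-tree is designed to absorb updates.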
data-structures  ml  google  b-trees  storage  indexes  deep-learning  henry-robinson 
3 days ago by jm
Scooped by AI » Nieman Journalism Lab
already happening:

ProPublica’s Jeremy Merrill used machine learning to detect the issues uniquely important to each member of Congress.
BuzzFeed News’s Peter Aldhous, Christian Stork, and Charles Seife used machine learning to identify surveillance aircraft run by the U.S. Marshals and military contractors.
The Atlantic’s Andrew McGill used machine learning to figure out whether Donald Trump is writing his own tweets.
dj  ml  ai 
3 days ago by paulbradshaw
Natural Language Processing in the kitchen - Data Desk - Los Angeles Times
We had a pile of a couple thousand records – news stories, columns and more – and each record contained one or more recipes. We needed to do the following:

Separate the recipes from the rest of the story, while keeping the story intact for display alongside the recipe later.
Determine how many recipes there were – more than one in many cases, and counts up to a dozen weren’t particularly unusual.
For each recipe, find the name, ingredients, steps, prep time, servings, nutrition and more.
Load these into a database, preserving the relationships between the recipes that ran together in the newspaper.
nlp  ml  dj  recipes  food 
3 days ago by paulbradshaw
How we reported this story - LA Times
The computer program pulled crime data from the previous Times review to learn key words that identified an assault as serious or minor. The algorithm then analyzed nearly eight years of data in search of classification errors.
ml  dj  sl  latimes  crime 
3 days ago by paulbradshaw
Using machine learning to extract quotes from text | Reveal
This is mine: the citizen-quotes project, an app that uses simple machine learning techniques to extract more than 40,000 quotes from every article that ran on The Bay Citizen since it launched in 2010. The goal was to build something that accounts for the limitations of the traditional method of solving quote extraction – regular expressions and pattern matching. And sure enough, it does a pretty good job.
ml  dj  chasedavis  quotes 
3 days ago by paulbradshaw
AJC News Apps
But as time passes during a session, our old model becomes less and less useful, as the graph below shows. Unlike the last chart, this shows how the model fares when you ask how good the forecasts were on any given day during a legislative session, in real time. As bills pass and leave the real-time pool of bills or run out of time towards the end of the session, the model falls out of whack.
ml  sl  ai  dj 
3 days ago by paulbradshaw
Brown signs dozens of bills previously vetoed by Schwarzenegger | California Watch
The California Watch analysis identified the bills based on the similarity of their text, not bill titles or sponsors, which often change between sessions. Multiple bills that were consolidated into one or those that changed dramatically since the last session may not have appeared in the analysis.
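The excerpt does not say how text similarity was measured, so the following is only one plausible sketch: Jaccard similarity over three-word shingles. The function names (`shingles`, `jaccard`), the shingle size, and the sample bill texts are all hypothetical.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word n-grams in `text` (lowercased)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b, size=3):
    """Jaccard similarity of two texts: shared shingles / all shingles."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


# Hypothetical bill texts, for illustration only.
old_bill = "An act to require schools to test drinking water for lead"
new_bill = "An act to require public schools to test drinking water for lead"
unrelated = "An act relating to highway tolls and bridge maintenance funding"

print(jaccard(old_bill, new_bill))   # high: mostly the same wording
print(jaccard(old_bill, unrelated))  # low: no shared three-word runs
```

A measure like this also shows why the caveat in the excerpt holds: a bill that was consolidated with others or heavily rewritten shares few shingles with its predecessor and would score below any reasonable matching threshold.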
ml  ai  ul 
3 days ago by paulbradshaw
