papers   23956


[1802.07000] CASPaxos: Replicated State Machines without logs
CASPaxos is a replicated state machine (RSM) protocol, an extension of Synod. Unlike Raft and Multi-Paxos, it doesn't use leader election and log replication, thus avoiding associated complexity. Its symmetric peer-to-peer approach achieves optimal commit latency in wide-area networks and doesn't cause transient unavailability when any ⌊(N−1)/2⌋ of N nodes crash.
The lightweight nature of CASPaxos allows new combinations of RSMs in the designs of distributed systems. For example, a representation of a key-value storage as a hashtable with independent RSM per key increases fault tolerance and improves performance on multi-core systems compared with a hashtable behind a single RSM.
This paper describes the CASPaxos protocol, formally proves its safety properties, covers cluster membership change, and evaluates the benefits of a CASPaxos-based key-value storage.
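The log-free idea can be sketched in a few lines: a single round both reads the current value on a quorum and writes a changed value back, so no replicated log is needed. A minimal illustrative sketch (class and method names are invented, not the paper's pseudocode; real deployments need networking, retries, and ballot uniqueness):

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class Promise:
    promised: bool
    accepted_ballot: int = -1
    value: Any = None

class Acceptor:
    """Toy in-memory acceptor, for illustration only."""
    def __init__(self):
        self.promised_ballot = -1
        self.accepted_ballot = -1
        self.value = None

    def prepare(self, ballot):
        if ballot <= self.promised_ballot:
            return Promise(False)
        self.promised_ballot = ballot
        return Promise(True, self.accepted_ballot, self.value)

    def accept(self, ballot, value):
        if ballot < self.promised_ballot:
            return False
        self.promised_ballot = self.accepted_ballot = ballot
        self.value = value
        return True

def propose(acceptors, ballot, change_fn):
    """One CASPaxos round: read the value on a quorum, write change_fn(value)."""
    quorum = len(acceptors) // 2 + 1
    # Phase 1: prepare(ballot); a quorum must promise and report any accepted value.
    promises = [a.prepare(ballot) for a in acceptors]
    ok = [p for p in promises if p.promised]
    if len(ok) < quorum:
        raise RuntimeError("prepare rejected")
    current = max(ok, key=lambda p: p.accepted_ballot).value
    # Phase 2: accept(ballot, f(current)); commit on a quorum of acks.
    new_value = change_fn(current)
    acks = sum(1 for a in acceptors if a.accept(ballot, new_value))
    if acks < quorum:
        raise RuntimeError("accept rejected")
    return new_value
```

The per-key hashtable design the abstract mentions would simply run one such register per key, each with its own ballot sequence.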
distributed-algorithms  papers  to-read 
4 hours ago by absfac
[1802.08195] Adversarial Examples that Fool both Human and Computer Vision
"Machine learning models are vulnerable to adversarial examples: small changes to images can cause computer vision models to make mistakes such as identifying a school bus as an ostrich. However, it is still an open question whether humans are prone to similar mistakes. Here, we create the first adversarial examples designed to fool humans, by leveraging recent techniques that transfer adversarial examples from computer vision models with known parameters and architecture to other models with unknown parameters and architecture, and by modifying models to more closely match the initial processing of the human visual system. We find that adversarial examples that strongly transfer across computer vision models influence the classifications made by time-limited human observers."
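For context on the construction the abstract builds from, the basic gradient-sign attack (FGSM) nudges the input by eps in the direction that increases the loss. A toy sketch against a linear logistic model (the model, weights, and eps here are made up; the paper's actual method involves transfer across vision models, not this toy):

```python
import numpy as np

def fgsm_linear(w, b, x, y, eps):
    """One FGSM step against p = sigmoid(w.x + b), label y in {0, 1}."""
    p = 1.0 / (1.0 + np.exp(-(w @ x + b)))
    grad_x = (p - y) * w          # d(cross-entropy)/dx for this linear model
    return x + eps * np.sign(grad_x)
```

Each coordinate moves by at most eps, yet the perturbation reliably pushes the model's prediction away from the true label.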
papers  adversarial-examples  via:jseppanen 
19 hours ago by arsyed
[1802.07814] Learning to Explain: An Information-Theoretic Perspective on Model Interpretation
"We introduce instancewise feature selection as a methodology for model interpretation. Our method is based on learning a function to extract a subset of features that are most informative for each given example. This feature selector is trained to maximize the mutual information between selected features and the response variable, where the conditional distribution of the response variable given the input is the model to be explained. We develop an efficient variational approximation to the mutual information, and show that the resulting method compares favorably to other model explanation methods on a variety of synthetic and real data sets using both quantitative metrics and human evaluation."
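The objective — pick features that are most informative about the response — can be illustrated with a crude global analogue: rank features by empirical mutual information with the label. This is only the filter-style baseline, not the paper's learned per-instance selector or its variational approximation:

```python
import numpy as np

def mutual_info(x, y):
    """Empirical mutual information (nats) between two discrete arrays."""
    mi = 0.0
    for xv in np.unique(x):
        for yv in np.unique(y):
            pxy = np.mean((x == xv) & (y == yv))
            if pxy > 0:
                mi += pxy * np.log(pxy / (np.mean(x == xv) * np.mean(y == yv)))
    return mi

def top_k_features(X, y, k):
    """Indices of the k columns of X with highest MI against y."""
    scores = [mutual_info(X[:, j], y) for j in range(X.shape[1])]
    return sorted(np.argsort(scores)[-k:].tolist())
```

The paper's contribution is making this selection instancewise (a different subset per example) and differentiable.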
papers  information-theory  interpretation  feature-selection 
19 hours ago by arsyed
MillWheel: Fault-Tolerant Stream Processing at Internet Scale
MillWheel is a framework for building low-latency data-processing applications that is widely used at Google. Users specify a directed computation graph and application code for individual nodes, and the system manages persistent state and the continuous flow of records, all within the envelope of the framework's fault-tolerance guarantees. This paper describes MillWheel's programming model as well as its implementation. The case study of a continuous anomaly detector in use at Google serves to motivate how many of MillWheel's features are used. MillWheel's programming model provides a notion of logical time, making it simple to write time-based aggregations. MillWheel was designed from the outset with fault tolerance and scalability in mind. In practice, we find that MillWheel's unique combination of scalability, fault tolerance, and a versatile programming model lends itself to a wide variety of problems at Google.
big-data  papers 
yesterday by foodbaby
The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing
Unbounded, unordered, global-scale datasets are increasingly common in day-to-day business (e.g. Web logs, mobile usage statistics, and sensor networks). At the same time, consumers of these datasets have evolved sophisticated requirements, such as event-time ordering and windowing by features of the data themselves, in addition to an insatiable hunger for faster answers. Meanwhile, practicality dictates that one can never fully optimize along all dimensions of correctness, latency, and cost for these types of input. As a result, data processing practitioners are left with the quandary of how to reconcile the tensions between these seemingly competing propositions, often resulting in disparate implementations and systems. We propose that a fundamental shift of approach is necessary to deal with these evolved requirements in modern data processing. We as a field must stop trying to groom unbounded datasets into finite pools of information that eventually become complete, and instead live and breathe under the assumption that we will never know if or when we have seen all of our data, only that new data will arrive, old data may be retracted, and the only way to make this problem tractable is via principled abstractions that allow the practitioner the choice of appropriate tradeoffs along the axes of interest: correctness, latency, and cost. In this paper, we present one such approach, the Dataflow Model, along with a detailed examination of the semantics it enables, an overview of the core principles that guided its design, and a validation of the model itself via the real-world experiences that led to its development.
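One concrete piece of the model — windowing by features of the data themselves — is session windows: each event opens a proto-window of one gap length, and overlapping proto-windows merge. A minimal sketch (function name and second-based timestamps are assumptions, not the paper's API):

```python
def session_windows(timestamps, gap):
    """Merge per-event proto-windows [t, t + gap) into session windows,
    returned as a sorted list of (start, end) pairs."""
    windows = []
    for t in sorted(timestamps):
        if windows and t < windows[-1][1]:        # overlaps the current session
            windows[-1] = (windows[-1][0], max(windows[-1][1], t + gap))
        else:
            windows.append((t, t + gap))
    return windows
```

Because the merge is defined on event time rather than arrival order, late or out-of-order data can still land in the right session.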
big-data  papers 
yesterday by foodbaby
Deep Learning, Structure and Innate Priors | Abigail See
Video. Yann LeCun and Christopher Manning discuss the role of priors/structure in machine learning.
yannlecun  ChrisManning  watchlist  papers  prior  statistics  NeuralNetworks  MachineLearning 
yesterday by csantos
[1506.02640] You Only Look Once: Unified, Real-Time Object Detection
We present YOLO, a new approach to object detection. Prior work on object detection repurposes classifiers to perform detection. Instead, we frame object detection as a regression problem to spatially separated bounding boxes and associated class probabilities. A single neural network predicts bounding boxes and class probabilities directly from full images in one evaluation. Since the whole detection pipeline is a single network, it can be optimized end-to-end directly on detection performance.
Our unified architecture is extremely fast. Our base YOLO model processes images in real-time at 45 frames per second. A smaller version of the network, Fast YOLO, processes an astounding 155 frames per second while still achieving double the mAP of other real-time detectors. Compared to state-of-the-art detection systems, YOLO makes more localization errors but is far less likely to predict false detections where nothing exists. Finally, YOLO learns very general representations of objects. It outperforms all other detection methods, including DPM and R-CNN, by a wide margin when generalizing from natural images to artwork on both the Picasso Dataset and the People-Art Dataset.
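The "detection as regression" framing means the network emits one S x S x (B*5 + C) tensor: each grid cell predicts B boxes (x, y, w, h, confidence) plus C class probabilities. A rough decoding sketch (threshold and exact cell layout are assumptions; the real pipeline also applies non-max suppression):

```python
import numpy as np

def decode(output, S, B, C, conf_thresh=0.5):
    """Turn a raw S x S x (B*5 + C) output grid into
    (cell_row, cell_col, box, class_index) detections."""
    detections = []
    for i in range(S):
        for j in range(S):
            cell = output[i, j]
            class_probs = cell[B * 5:]
            for b in range(B):
                x, y, w, h, conf = cell[b * 5:(b + 1) * 5]
                score = conf * class_probs.max()   # class-specific confidence
                if score >= conf_thresh:
                    detections.append((i, j, (x, y, w, h), int(class_probs.argmax())))
    return detections
```

A single forward pass plus this decoding replaces the per-region classifier invocations of prior detectors, which is where the speed comes from.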
papers  cv  deeplearning 
4 days ago by jblocksom
[1612.08242] YOLO9000: Better, Faster, Stronger
We introduce YOLO9000, a state-of-the-art, real-time object detection system that can detect over 9000 object categories. First we propose various improvements to the YOLO detection method, both novel and drawn from prior work. The improved model, YOLOv2, is state-of-the-art on standard detection tasks like PASCAL VOC and COCO. At 67 FPS, YOLOv2 gets 76.8 mAP on VOC 2007. At 40 FPS, YOLOv2 gets 78.6 mAP, outperforming state-of-the-art methods like Faster RCNN with ResNet and SSD while still running significantly faster. Finally we propose a method to jointly train on object detection and classification. Using this method we train YOLO9000 simultaneously on the COCO detection dataset and the ImageNet classification dataset. Our joint training allows YOLO9000 to predict detections for object classes that don't have labelled detection data. We validate our approach on the ImageNet detection task. YOLO9000 gets 19.7 mAP on the ImageNet detection validation set despite only having detection data for 44 of the 200 classes. On the 156 classes not in COCO, YOLO9000 gets 16.0 mAP. But YOLO can detect more than just 200 classes; it predicts detections for more than 9000 different object categories. And it still runs in real-time.
yolo  cv  papers 
4 days ago by jblocksom
ImageNet Classification with Deep Convolutional Neural Networks
Alex Krizhevsky, University of Toronto
Ilya Sutskever, University of Toronto
Geoffrey E. Hinton, University of Toronto
neuralnetworks  deeplearning  research  papers  cnn  2012 
5 days ago by ianchanning

