amy + machine_learning   1259

An algorithm that learns through rewards may show how our brain does too - MIT Technology Review
By optimizing reinforcement-learning algorithms, DeepMind uncovered new details about how dopamine helps the brain learn.
machine_learning  neuroscience 
9 days ago by amy
An Adversarial Approach for the Robust Classification of Pneumonia from Chest Radiographs
While deep learning has shown promise in the domain of disease classification from medical images, models based on state-of-the-art convolutional neural network architectures often exhibit performance loss due to dataset shift. Models trained using data from one hospital system achieve high predictive performance when tested on data from the same hospital, but perform significantly worse when they are tested in different hospital systems. Furthermore, even within a given hospital system, deep learning models have been shown to depend on hospital- and patient-level confounders rather than meaningful pathology to make classifications. In order for these models to be safely deployed, we would like to ensure that they do not use confounding variables to make their classification, and that they will work well even when tested on images from hospitals that were not included in the training data. We attempt to address this problem in the context of pneumonia classification from chest radiographs. We propose an approach based on adversarial optimization, which allows us to learn more robust models that do not depend on confounders. Specifically, we demonstrate improved out-of-hospital generalization performance of a pneumonia classifier by training a model that is invariant to the view position of chest radiographs (anterior-posterior vs. posterior-anterior). Our approach leads to better predictive performance on external hospital data than both a standard baseline and previously proposed methods to handle confounding, and also suggests a method for identifying models that may rely on confounders.
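The general recipe can be sketched with a gradient-reversal layer: a shared backbone feeds a pneumonia head and an adversarial head that tries to predict the confounding view position, and the reversed gradient pushes the backbone to discard the view signal. This is a minimal illustration of that idea, not the authors' code; the backbone, input shape, and loss weights are arbitrary choices.

    import tensorflow as tf

    @tf.custom_gradient
    def grad_reverse(x):
        # Identity on the forward pass; flips the sign of the gradient on the backward pass.
        def grad(dy):
            return -dy
        return tf.identity(x), grad

    class GradReverse(tf.keras.layers.Layer):
        def call(self, x):
            return grad_reverse(x)

    inputs = tf.keras.Input(shape=(224, 224, 3))
    features = tf.keras.applications.DenseNet121(
        include_top=False, weights=None, input_tensor=inputs, pooling="avg").output
    pneumonia = tf.keras.layers.Dense(1, activation="sigmoid", name="pneumonia")(features)
    # Adversarial head: predicts the confounder (AP vs. PA view) from the same features;
    # the reversed gradient trains the backbone to make that prediction impossible.
    view = tf.keras.layers.Dense(1, activation="sigmoid", name="view")(GradReverse()(features))

    model = tf.keras.Model(inputs, [pneumonia, view])
    model.compile(
        optimizer="adam",
        loss={"pneumonia": "binary_crossentropy", "view": "binary_crossentropy"},
        loss_weights={"pneumonia": 1.0, "view": 0.5})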
machine_learning  bias 
11 days ago by amy
AI NEXTCon 2020 - Seattle
Serverless Machine Learning with TensorFlow 2.0
events  moi  machine_learning  TensorFlow 
13 days ago by amy
GoogleCloudPlatform/cloudml-hypertune
Metric Reporting Python Package for CloudML Hypertune
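Typical usage looks roughly like this (the metric tag and values are placeholders; the call reports a metric for the hyperparameter tuning service to optimize):

    # pip install cloudml-hypertune
    import hypertune

    hpt = hypertune.HyperTune()
    hpt.report_hyperparameter_tuning_metric(
        hyperparameter_metric_tag='val_accuracy',  # must match the tag in the tuning job config
        metric_value=0.93,                         # placeholder value from your evaluation
        global_step=1000)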
machine_learning  gcp  cmle 
19 days ago by amy
jarokaz/mlops-labs
This repo manages a set of labs designed to demonstrate best practices and patterns for implementing and operationalizing production grade ML workflows on Google Cloud Platform.

With a few exceptions, the labs are self-contained: they don't rely on other labs. The goal is to create a portfolio of labs that can be utilized in the development and delivery of scenario-specific demos and workshops.
machine_learning  kubeflow  ml_ops 
19 days ago by amy
Advances and Open Problems in Federated Learning
Federated learning (FL) is a machine learning setting where many clients (e.g. mobile devices or whole organizations) collaboratively train a model under the orchestration of a central server (e.g. service provider), while keeping the training data decentralized. FL embodies the principles of focused data collection and minimization, and can mitigate many of the systemic privacy risks and costs resulting from traditional, centralized machine learning and data science approaches. Motivated by the explosive growth in FL research, this paper discusses recent advances and presents an extensive collection of open problems and challenges.
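As a point of reference, the basic federated averaging loop (one of the canonical FL algorithms, not something specific to this survey) fits in a few lines; here NumPy least-squares updates stand in for real on-device training:

    import numpy as np

    def local_update(w, client_data, lr=0.1, epochs=5):
        # One client's contribution: a few SGD steps on its own data only.
        X, y = client_data
        w = w.copy()
        for _ in range(epochs):
            w -= lr * X.T @ (X @ w - y) / len(y)
        return w

    def federated_averaging(global_w, clients, rounds=20):
        for _ in range(rounds):
            # Clients train locally; only model weights travel to the server.
            local_ws = [local_update(global_w, data) for data in clients]
            sizes = [len(data[1]) for data in clients]
            # The server aggregates with a data-size-weighted average (FedAvg).
            global_w = np.average(local_ws, axis=0, weights=sizes)
        return global_w

    rng = np.random.default_rng(0)
    true_w = np.array([2.0, -1.0])
    clients = []
    for _ in range(5):
        X = rng.normal(size=(40, 2))
        clients.append((X, X @ true_w + 0.01 * rng.normal(size=40)))
    print(federated_averaging(np.zeros(2), clients))  # should approach true_w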
machine_learning 
20 days ago by amy
To Balance or Not to Balance: An Embarrassingly Simple Approach for Learning with Long-Tailed Distributions
Real-world visual data often exhibits a long-tailed distribution, where some “head” classes have a large number of samples, yet only a few samples are available for the “tail” classes. Such an imbalanced distribution poses a great challenge for learning a deep neural network, which can be boiled down to a dilemma: on the one hand, we prefer to increase the exposure of tail-class samples to avoid the excessive dominance of head classes in classifier training. On the other hand, oversampling tail classes makes the network prone to over-fitting, since the head-class samples are often consequently under-represented. To resolve this dilemma, in this paper we propose an embarrassingly simple yet effective approach. The key idea is to split a network into a classifier part and a feature extractor part, and then employ different training strategies for each part. Specifically, to promote awareness of tail classes, a class-balanced sampling scheme is utilised for training both the classifier and the feature extractor. For the feature extractor, we also introduce an auxiliary training task, which is to train a classifier under the regular random sampling scheme. In this way, the feature extractor is jointly trained from both sampling strategies and thus can take advantage of all training data and avoid the over-fitting issue. Apart from this basic auxiliary task, we further explore the benefit of using self-supervised learning as the auxiliary task. Without using any bells and whistles, our model achieves superior performance over the state-of-the-art solutions.
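A rough sketch of the two-head idea, with made-up shapes and a toy backbone (not the authors' code): the main classifier and the shared feature extractor are trained on class-balanced batches, while an auxiliary classifier feeds the feature extractor gradients from ordinary randomly sampled batches.

    import tensorflow as tf

    num_classes = 10   # illustrative
    backbone = tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(32, 32, 3)),
        tf.keras.layers.GlobalAveragePooling2D()])
    main_head = tf.keras.layers.Dense(num_classes)   # the "classifier part"
    aux_head = tf.keras.layers.Dense(num_classes)    # auxiliary classifier for the feature extractor
    main_head.build((None, 32))
    aux_head.build((None, 32))

    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
    optimizer = tf.keras.optimizers.Adam()

    @tf.function
    def train_step(balanced_x, balanced_y, random_x, random_y):
        with tf.GradientTape() as tape:
            # Classifier and feature extractor both see the class-balanced batch...
            main_loss = loss_fn(balanced_y, main_head(backbone(balanced_x)))
            # ...while the auxiliary head feeds the feature extractor gradients from
            # an ordinary randomly sampled batch, so no training data is wasted.
            aux_loss = loss_fn(random_y, aux_head(backbone(random_x)))
            total = main_loss + aux_loss
        variables = (backbone.trainable_variables +
                     main_head.trainable_variables + aux_head.trainable_variables)
        optimizer.apply_gradients(zip(tape.gradient(total, variables), variables))
        return total

The balanced batches would come from a class-balanced sampler, and the random batches from plain shuffling of the full training set.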
machine_learning  bias 
5 weeks ago by amy
Open source NLU and NLP tool for natural language understanding - formerly Rasa NLU - Rasa
Open source language understanding for conversational bots and assistants (formerly Rasa NLU)
nlp  machine_learning 
6 weeks ago by amy
SeldonIO/mlgraph: Machine Learning Inference Graph Spec
MLGraph defines a graph of machine learning components. The goal is to provide a simple machine learning focused specification for defining:

Easy model experimentation and A/B tests
Advanced routing with multi-armed bandits
Ensembling of models
Explanations, outlier detection, skew and bias detection
Builds upon KFServing and other ML serving components
Flexible graph nodes:
  References or inline specs
  Custom user-provided components
Auto-validation of the graph
machine_learning  kubeflow  seldon 
7 weeks ago by amy
pipelines/TFX_pipeline.ipynb at master · kubeflow/pipelines
TFX Components
This notebook shows how to create a pipeline that uses the following TFX components (see the sketch after the list):

CsvExampleGen
StatisticsGen
SchemaGen
ExampleValidator
Transform
Trainer
Evaluator
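Wiring these components together typically looks something like the sketch below. Constructor arguments differ between TFX releases (this assumes a version where CsvExampleGen takes input_base), and the bucket paths and module files are placeholders.

    from tfx.components import (CsvExampleGen, Evaluator, ExampleValidator,
                                SchemaGen, StatisticsGen, Trainer, Transform)
    from tfx.orchestration import pipeline
    from tfx.proto import trainer_pb2

    example_gen = CsvExampleGen(input_base='gs://my-bucket/data')        # placeholder path
    statistics_gen = StatisticsGen(examples=example_gen.outputs['examples'])
    schema_gen = SchemaGen(statistics=statistics_gen.outputs['statistics'])
    example_validator = ExampleValidator(
        statistics=statistics_gen.outputs['statistics'],
        schema=schema_gen.outputs['schema'])
    transform = Transform(
        examples=example_gen.outputs['examples'],
        schema=schema_gen.outputs['schema'],
        module_file='preprocessing.py')                                  # placeholder module
    trainer = Trainer(
        module_file='model.py',                                          # placeholder module
        examples=transform.outputs['transformed_examples'],
        transform_graph=transform.outputs['transform_graph'],
        schema=schema_gen.outputs['schema'],
        train_args=trainer_pb2.TrainArgs(num_steps=1000),
        eval_args=trainer_pb2.EvalArgs(num_steps=100))
    evaluator = Evaluator(
        examples=example_gen.outputs['examples'],
        model=trainer.outputs['model'])

    tfx_pipeline = pipeline.Pipeline(
        pipeline_name='tfx-demo',
        pipeline_root='gs://my-bucket/pipeline-root',                    # placeholder
        components=[example_gen, statistics_gen, schema_gen,
                    example_validator, transform, trainer, evaluator])
    # A runner (e.g. the Kubeflow DAG runner used in this notebook) would then
    # compile and submit tfx_pipeline.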
kubeflow  machine_learning  tfx 
7 weeks ago by amy
[1811.10959] Dataset Distillation
Model distillation aims to distill the knowledge of a complex model into a simpler one. In this paper, we consider an alternative formulation called dataset distillation: we keep the model fixed and instead attempt to distill the knowledge from a large training dataset into a small one. The idea is to synthesize a small number of data points that do not need to come from the correct data distribution, but will, when given to the learning algorithm as training data, approximate the model trained on the original data. For example, we show that it is possible to compress 60,000 MNIST training images into just 10 synthetic distilled images (one per class) and achieve close to original performance with only a few gradient descent steps, given a fixed network initialization. We evaluate our method in various initialization settings and with different learning objectives. Experiments on multiple datasets show the advantage of our approach compared to alternative methods.
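The underlying bi-level optimization is easy to see in a toy JAX sketch (a linear model and synthetic shapes chosen purely for illustration, not the paper's setup): take one gradient step on the distilled data, then differentiate the resulting loss on real data with respect to the distilled data itself.

    import jax
    import jax.numpy as jnp

    def loss(w, x, y):
        # Simple least-squares loss for a linear model.
        return jnp.mean((x @ w - y) ** 2)

    def distill_objective(x_syn, y_syn, w0, x_real, y_real, inner_lr=0.1):
        # Inner step: train the model for one SGD step on the synthetic data...
        w1 = w0 - inner_lr * jax.grad(loss)(w0, x_syn, y_syn)
        # ...outer objective: how well does that model do on the real data?
        return loss(w1, x_real, y_real)

    x_real = jax.random.normal(jax.random.PRNGKey(0), (256, 8))
    y_real = x_real @ jnp.ones(8)
    x_syn = jax.random.normal(jax.random.PRNGKey(1), (10, 8))   # 10 distilled examples
    y_syn = jnp.zeros(10)
    w0 = jnp.zeros(8)

    # Gradient descent on the synthetic data itself (the distilled dataset).
    outer_grad = jax.jit(jax.grad(distill_objective, argnums=(0, 1)))
    for _ in range(200):
        gx, gy = outer_grad(x_syn, y_syn, w0, x_real, y_real)
        x_syn -= 0.05 * gx
        y_syn -= 0.05 * gy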
machine_learning 
7 weeks ago by amy
Self-training with Noisy Student improves ImageNet classification
We present a simple self-training method that achieves 87.4% top-1 accuracy on ImageNet, which is 1.0% better than the state-of-the-art model that requires 3.5B weakly labeled Instagram images. On robustness test sets, it improves ImageNet-A top-1 accuracy from 16.6% to 74.2%, reduces ImageNet-C mean corruption error from 45.7 to 31.2, and reduces ImageNet-P mean flip rate from 27.8 to 16.1. To achieve this result, we first train an EfficientNet model on labeled ImageNet images and use it as a teacher to generate pseudo labels on 300M unlabeled images. We then train a larger EfficientNet as a student model on the combination of labeled and pseudo-labeled images. We iterate this process by putting back the student as the teacher. During the generation of the pseudo labels, the teacher is not noised, so that the pseudo labels are as good as possible. But during the learning of the student, we inject noise such as data augmentation, dropout, and stochastic depth, so that the noised student is forced to learn harder from the pseudo labels.
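In outline, the loop is simple; the sketch below uses a toy MLP and random data standing in for EfficientNet and ImageNet, just to show the teacher/pseudo-label/noised-student iteration.

    import numpy as np
    import tensorflow as tf

    def make_model(width, dropout):
        # Stand-in for EfficientNet: a small MLP; "width" plays the role of model capacity.
        return tf.keras.Sequential([
            tf.keras.layers.Dense(width, activation="relu", input_shape=(32,)),
            tf.keras.layers.Dropout(dropout),            # noise applied to the student
            tf.keras.layers.Dense(10, activation="softmax")])

    # Toy labeled and unlabeled data standing in for ImageNet / the unlabeled corpus.
    x_lab = np.random.randn(512, 32).astype("float32")
    y_lab = np.random.randint(0, 10, 512)
    x_unlab = np.random.randn(2048, 32).astype("float32")

    teacher = make_model(width=64, dropout=0.0)           # the teacher is not noised
    teacher.compile("adam", "sparse_categorical_crossentropy")
    teacher.fit(x_lab, y_lab, epochs=3, verbose=0)

    for _ in range(3):                                    # iterate: the student becomes the new teacher
        # Teacher (without noise) produces pseudo labels on the unlabeled data.
        pseudo = teacher.predict(x_unlab, verbose=0).argmax(axis=1)
        # Train a larger, noised student on labeled + pseudo-labeled data.
        student = make_model(width=128, dropout=0.3)
        student.compile("adam", "sparse_categorical_crossentropy")
        student.fit(np.concatenate([x_lab, x_unlab]),
                    np.concatenate([y_lab, pseudo]),
                    epochs=3, verbose=0)
        teacher = student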
machine_learning  TensorFlow  google 
10 weeks ago by amy
nlpyang/PreSumm: code for EMNLP 2019 paper Text Summarization with Pretrained Encoders
This code is for EMNLP 2019 paper Text Summarization with Pretrained Encoders
nlp  bert  machine_learning 
10 weeks ago by amy
[1910.13038] Learning to Predict Without Looking Ahead: World Models Without Forward Prediction

Much of model-based reinforcement learning involves learning a model of an agent's world, and training an agent to leverage this model to perform a task more efficiently. While these models are demonstrably useful for agents, every naturally occurring model of the world of which we are aware (e.g., a brain) arose as the byproduct of competing evolutionary pressures for survival, not minimization of a supervised forward-predictive loss via gradient descent. That useful models can arise out of the messy and slow optimization process of evolution suggests that forward-predictive modeling can arise as a side-effect of optimization under the right circumstances. Crucially, this optimization process need not explicitly be a forward-predictive loss. In this work, we introduce a modification to traditional reinforcement learning which we call observational dropout, whereby we limit the agent's ability to observe the real environment at each timestep. In doing so, we can coerce an agent into learning a world model to fill in the observation gaps during reinforcement learning. We show that the emerged world model, while not explicitly trained to predict the future, can help the agent learn key skills required to perform well in its environment. Videos of our results are available at this https URL
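The mechanism itself is essentially an environment wrapper: at each timestep the agent sees the real observation only with some small probability, and otherwise sees its world model's prediction. A toy sketch (the env and world_model interfaces here are hypothetical, not the paper's code):

    import numpy as np

    class ObservationalDropout:
        """Wraps an environment so the agent only occasionally sees the real observation."""

        def __init__(self, env, world_model, peek_prob=0.1, seed=0):
            # env is assumed to expose reset()/step(action); world_model is assumed
            # to expose predict(last_obs, action). Both interfaces are placeholders.
            self.env, self.world_model, self.peek_prob = env, world_model, peek_prob
            self.rng = np.random.default_rng(seed)
            self.last_obs = None

        def reset(self):
            self.last_obs = self.env.reset()
            return self.last_obs

        def step(self, action):
            real_obs, reward, done = self.env.step(action)
            if self.rng.random() < self.peek_prob:
                obs = real_obs                                        # rare peek at reality
            else:
                obs = self.world_model.predict(self.last_obs, action)  # filled-in observation
            self.last_obs = obs
            return obs, reward, done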
machine_learning 
11 weeks ago by amy
Custom layers  |  TensorFlow Core
    self.bn2a = tf.keras.layers.BatchNormalization()
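That line appears to come from the tutorial's residual-block example, which composes existing layers inside a tf.keras.Model subclass; a self-contained version of that pattern (reconstructed from memory of the tutorial, with illustrative shapes) looks like this:

    import tensorflow as tf

    class ResnetIdentityBlock(tf.keras.Model):
        """A small residual block built from existing Keras layers."""

        def __init__(self, kernel_size, filters):
            super().__init__(name="")
            filters1, filters2, filters3 = filters
            self.conv2a = tf.keras.layers.Conv2D(filters1, (1, 1))
            self.bn2a = tf.keras.layers.BatchNormalization()
            self.conv2b = tf.keras.layers.Conv2D(filters2, kernel_size, padding="same")
            self.bn2b = tf.keras.layers.BatchNormalization()
            self.conv2c = tf.keras.layers.Conv2D(filters3, (1, 1))
            self.bn2c = tf.keras.layers.BatchNormalization()

        def call(self, input_tensor, training=False):
            x = tf.nn.relu(self.bn2a(self.conv2a(input_tensor), training=training))
            x = tf.nn.relu(self.bn2b(self.conv2b(x), training=training))
            x = self.bn2c(self.conv2c(x), training=training)
            return tf.nn.relu(x + input_tensor)   # residual connection

    block = ResnetIdentityBlock(1, [1, 2, 3])
    _ = block(tf.zeros([1, 2, 3, 3]))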
TensorFlow  machine_learning 
12 weeks ago by amy
Simplifying Model Management with MLflow - Matei Zaharia (Databricks) Corey Zumar (Databricks) - YouTube
Last summer, Databricks launched MLflow, an open source platform to manage the machine learning lifecycle, including experiment tracking, reproducible runs and model packaging. MLflow has grown quickly since then, with over 120 contributors from dozens of companies, including major contributions from RStudio and Microsoft. It has also gained new capabilities such as automatic logging from TensorFlow and Keras, Kubernetes integrations, and a high-level Java API. In this talk, we’ll cover some of the new features that have come to MLflow, and then focus on a major upcoming feature: model management with the MLflow Model Registry. Many organizations face challenges tracking which models are available in the organization and which ones are in production. The MLflow Model Registry provides a centralized database to keep track of these models, share and describe new model versions, and deploy the latest version of a model through APIs. We’ll demonstrate how these features can simplify common ML lifecycle tasks.
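A compressed sketch of that registry workflow (the model name, metric, and dataset are placeholders, it assumes a tracking backend with the Model Registry enabled, and exact APIs vary somewhat across MLflow versions):

    import mlflow
    import mlflow.sklearn
    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression

    X, y = load_iris(return_X_y=True)

    with mlflow.start_run() as run:
        model = LogisticRegression(max_iter=200).fit(X, y)
        mlflow.log_metric("train_accuracy", model.score(X, y))
        mlflow.sklearn.log_model(model, "model")

    # Register the logged model under a name in the Model Registry; later versions of
    # the same name can then be promoted and loaded through the registry APIs.
    result = mlflow.register_model(f"runs:/{run.info.run_id}/model", "iris-classifier")
    loaded = mlflow.pyfunc.load_model(f"models:/iris-classifier/{result.version}")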
machine_learning 
october 2019 by amy
GitHub - google/jax: Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
JAX is Autograd and XLA, brought together for high-performance machine learning research.

With its updated version of Autograd, JAX can automatically differentiate native Python and NumPy functions. It can differentiate through loops, branches, recursion, and closures, and it can take derivatives of derivatives of derivatives. It supports reverse-mode differentiation (a.k.a. backpropagation) via grad as well as forward-mode differentiation, and the two can be composed arbitrarily to any order.

What’s new is that JAX uses XLA to compile and run your NumPy programs on GPUs and TPUs. Compilation happens under the hood by default, with library calls getting just-in-time compiled and executed. But JAX also lets you just-in-time compile your own Python functions into XLA-optimized kernels using a one-function API, jit. Compilation and automatic differentiation can be composed arbitrarily, so you can express sophisticated algorithms and get maximal performance without leaving Python.

Dig a little deeper, and you'll see that JAX is really an extensible system for composable function transformations. Both grad and jit are instances of such transformations. Another is vmap for automatic vectorization, with more to come.
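A tiny example of composing those transformations (toy function, purely illustrative):

    import jax
    import jax.numpy as jnp

    def predict(w, x):
        return jnp.tanh(jnp.dot(x, w))

    def loss(w, x, y):
        return jnp.mean((predict(w, x) - y) ** 2)

    grad_loss = jax.jit(jax.grad(loss))                 # differentiate, then JIT-compile via XLA
    per_example = jax.vmap(predict, in_axes=(None, 0))  # auto-vectorize over a batch of x

    w = jnp.zeros(3)
    x = jnp.ones((8, 3))
    y = jnp.zeros(8)
    print(grad_loss(w, x, y).shape)    # (3,)
    print(per_example(w, x).shape)     # (8,)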
machine_learning  google  python 
october 2019 by amy
[1909.12744] On the use of BERT for Neural Machine Translation
Exploiting large pretrained models for various NMT tasks has gained a lot of visibility recently. In this work we study how BERT pretrained models could be exploited for supervised Neural Machine Translation. We compare various ways to integrate a pretrained BERT model with an NMT model and study the impact of the monolingual data used for BERT training on the final translation quality. We use WMT-14 English-German, IWSLT15 English-German and IWSLT14 English-Russian datasets for these experiments. In addition to standard task test set evaluation, we perform evaluation on out-of-domain test sets and noise-injected test sets, in order to assess how BERT pretrained representations affect model robustness.
machine_learning  nlp 
october 2019 by amy
[1901.07291] Cross-lingual Language Model Pretraining
BERT-related?
Recent studies have demonstrated the efficiency of generative pretraining for English natural language understanding. In this work, we extend this approach to multiple languages and show the effectiveness of cross-lingual pretraining. We propose two methods to learn cross-lingual language models (XLMs): one unsupervised that only relies on monolingual data, and one supervised that leverages parallel data with a new cross-lingual language model objective. We obtain state-of-the-art results on cross-lingual classification, unsupervised and supervised machine translation. On XNLI, our approach pushes the state of the art by an absolute gain of 4.9% accuracy. On unsupervised machine translation, we obtain 34.3 BLEU on WMT'16 German-English, improving the previous state of the art by more than 9 BLEU. On supervised machine translation, we obtain a new state of the art of 38.5 BLEU on WMT'16 Romanian-English, outperforming the previous best approach by more than 4 BLEU. Our code and pretrained models will be made publicly available.
machine_learning  nlp 
october 2019 by amy
[1902.04094] BERT has a Mouth, and It Must Speak: BERT as a Markov Random Field Language Model
We show that BERT (Devlin et al., 2018) is a Markov random field language model. This formulation gives way to a natural procedure to sample sentences from BERT. We generate from BERT and find that it can produce high-quality, fluent generations. Compared to the generations of a traditional left-to-right language model, BERT generates sentences that are more diverse but of slightly worse quality.
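The sampling idea can be sketched with the Hugging Face transformers library; this is a rough Gibbs-style illustration, not the authors' exact procedure, and it assumes a recent transformers version where the model output exposes .logits:

    import torch
    from transformers import BertForMaskedLM, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertForMaskedLM.from_pretrained("bert-base-uncased").eval()

    length = 12
    ids = torch.full((1, length), tokenizer.mask_token_id, dtype=torch.long)
    ids[0, 0], ids[0, -1] = tokenizer.cls_token_id, tokenizer.sep_token_id

    with torch.no_grad():
        for _ in range(200):
            # Pick a position, mask it, and resample it from BERT's conditional distribution.
            pos = torch.randint(1, length - 1, (1,)).item()
            ids[0, pos] = tokenizer.mask_token_id
            probs = torch.softmax(model(ids).logits[0, pos], dim=-1)
            ids[0, pos] = torch.multinomial(probs, 1).item()

    print(tokenizer.decode(ids[0, 1:-1]))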
machine_learning  nlp 
october 2019 by amy