amy + tensorflow   447

Google AI Blog: Advances in Semantic Textual Similarity
The recent rapid progress of neural network-based natural language understanding research, especially on learning semantic text representations, can enable truly novel products such as Smart Compose and Talk to Books. It can also help improve performance on a variety of natural language tasks which have limited amounts of training data, such as building strong text classifiers from as few as 100 labeled examples.

Below, we discuss two papers reporting recent progress on semantic representation research at Google, as well as two new models available for download on TensorFlow Hub that we hope developers will use to build new and exciting applications.
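
Getting embeddings out of these models takes only a few lines, since they ship as TF-Hub modules. A minimal sketch, assuming the Universal Sentence Encoder handle on tfhub.dev (the URL here is illustrative; check the post for the exact released modules):

```python
# Minimal sketch: sentence embeddings from a TF-Hub module.
# The module URL is illustrative; see tfhub.dev for the released handles.
import tensorflow as tf
import tensorflow_hub as hub

embed = hub.Module("https://tfhub.dev/google/universal-sentence-encoder/1")
embeddings = embed(["The quick brown fox jumps over the lazy dog.",
                    "Semantic similarity works on short texts too."])

with tf.Session() as sess:
    sess.run([tf.global_variables_initializer(), tf.tables_initializer()])
    vectors = sess.run(embeddings)  # one fixed-length vector per sentence
```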
TensorFlow  machine_learning  google 
8 hours ago by amy
model-analysis/examples/chicago_taxi at master · tensorflow/model-analysis
The Chicago Taxi example demonstrates the end-to-end workflow for transforming data, then training, analyzing, and serving a model, using:

TensorFlow Transform for feature preprocessing
TensorFlow Estimators for training
TensorFlow Model Analysis and Jupyter for evaluation
TensorFlow Serving for serving
The example shows two modes of deployment.

The first is a “local mode” with all necessary dependencies and components deployed locally.
The second is a “cloud mode”, where all components will be deployed on Google Cloud.
In the future we will be showing additional deployment modes, so dear reader, feel free to check back in periodically!
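
As a flavor of the TensorFlow Transform step in the list above, here is a minimal preprocessing_fn sketch; the feature names are illustrative, not the actual Chicago Taxi schema:

```python
# Hypothetical preprocessing_fn for the tf.Transform step; feature
# names are made up for illustration.
import tensorflow_transform as tft

def preprocessing_fn(inputs):
    return {
        # Scale a numeric column to z-scores using full-pass statistics.
        'fare_scaled': tft.scale_to_z_score(inputs['fare']),
        # Replace a string column with indices into a generated vocabulary.
        'payment_type_id': tft.string_to_int(inputs['payment_type']),
    }
```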
machine_learning  TensorFlow  google 
7 weeks ago by amy
google/deepvariant: DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data.
DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data.
google  genomics  dna  machine_learning  TensorFlow 
7 weeks ago by amy
SeldonIO/seldon-core: Machine Learning Deployment for Kubernetes
Seldon Core is an open source platform for deploying machine learning models on Kubernetes.
machine_learning  kubernetes  k8s  TensorFlow 
7 weeks ago by amy
Cyclic computational graphs with Tensorflow or Theano - Stack Overflow
TensorFlow does support cyclic computation graphs. The tf.while_loop() function lets you specify a while loop with arbitrary subgraphs for the condition and the body of the loop, and the runtime can execute independent loop iterations in parallel. The tf.scan() function is a higher-level API similar to Theano's theano.scan() function. Both allow you to loop over tensors of dynamic size.
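
A quick sketch of both primitives, computing a running sum each way (TF 1.x graph-mode API):

```python
import tensorflow as tf

xs = tf.constant([1.0, 2.0, 3.0, 4.0])

# tf.scan threads an accumulator over the first axis of a tensor.
running_sum = tf.scan(lambda acc, x: acc + x, xs)  # [1., 3., 6., 10.]

# tf.while_loop takes explicit condition and body subgraphs.
cond = lambda i, total: i < tf.size(xs)
body = lambda i, total: (i + 1, total + xs[i])
_, total = tf.while_loop(cond, body, [tf.constant(0), tf.constant(0.0)])

with tf.Session() as sess:
    print(sess.run([running_sum, total]))  # [array([1., 3., 6., 10.]), 10.0]
```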
TensorFlow 
11 weeks ago by amy
Twitter
RT : I'm doing a webinar on a development workflow for models written in -- to go from data…
tensorflow  machinelearning  from twitter
11 weeks ago by amy
Propel ML
Propel provides a GPU-backed, NumPy-like infrastructure for scientific computing in JavaScript. JavaScript is a fast, dynamic language which, we think, could support an ideal workflow for scientific programmers of all sorts.
machine_learning  TensorFlow  javascript 
12 weeks ago by amy
Machine Learning with TensorFlow on Google Cloud Platform: code samples
New courses: “Machine Learning with TensorFlow on Google Cloud Platform: code samples”

via @lak_gcp
TensorFlow  machine_learning  gcp 
12 weeks ago by amy
design | Architecture and UX design of KAML-D
KAML-D can be deployed on any cloud (or on-premises) platform that allows you to run Kubernetes. Most of the components are open source. As a SaaS, it integrates with the cloud provider's identity management system for users; on-premises, with something like LDAP.

Existing open source components KAML-D uses:

Kubernetes for workload management and to ensure portability
TensorFlow for machine learning execution
JupyterHub for data scientists (dev/test of algorithms)
Storage layer to hold the datasets: Minio, Ceph, as well as cloud-provider-specific offerings such as EBS, with built-in dotmesh support for snapshots
New components KAML-D introduces:

KAML-D Workbench: a graphical UI for data scientists, data engineers, developers, and SREs to manage datasets as well as to test and deploy ML algorithms. Builds on the metadata layer to find and visualize datasets. Builds on the storage layer to store and load datasets.
KAML-D Metadata Hub: a data and metadata layer using PrestoDB and Elasticsearch for indexing and querying datasets.
KAML-D Observation Hub: a comprehensive observability suite for SREs and admins (as well as developers on the app level) to understand the health of the KAML-D platform and troubleshoot issues on the platform and application level:
Prometheus and Grafana for end-to-end metrics and monitoring/alerting
EFK stack for (aggregated) logging
Jaeger for (distributed) tracing
The user management and access control part is outside the scope of KAML-D, but standard integration points such as LDAP are offered.
machine_learning  kubernetes  TensorFlow 
february 2018 by amy
nikhilk/node-tensorflow: Node.js + TensorFlow
TensorFlow is Google's machine learning runtime. It is implemented as a C++ runtime, along with a Python framework to support building a variety of models, especially neural networks for deep learning.

From Nikhil

It is interesting to be able to use TensorFlow in a node.js application using just JavaScript (or TypeScript if that's your preference). However, the Python functionality is vast (several ops, estimator implementations, etc.) and continually expanding. Instead, it is more practical to build Graphs and train models in Python, and then consume them for runtime use cases (like prediction or inference) in a pure node.js, Python-free deployment. This is what this node module enables.

This module takes care of the building blocks and mechanics of working with the TensorFlow C API, exposing instead an API around Tensors, Graphs, Sessions, and Models.

This is still a work in progress, and was recently revamped to support TensorFlow 1.4+.
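
The Python half of that workflow might look like the sketch below: build and train a graph in Python, then export a SavedModel for a Python-free runtime to consume. Paths and tensor names are illustrative:

```python
# Illustrative Python-side export; the node.js side would load the result.
import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[None, 4], name='x')
w = tf.Variable(tf.random_normal([4, 1]))
y = tf.matmul(x, w, name='y')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # ... training would happen here ...
    builder = tf.saved_model.builder.SavedModelBuilder('/tmp/example_model')
    builder.add_meta_graph_and_variables(
        sess, [tf.saved_model.tag_constants.SERVING])
    builder.save()
```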
machinelearning  TensorFlow 
february 2018 by amy
Forbes: 12 Amazing Deep Learning Breakthroughs of 2017
1. DeepMind’s AlphaZero Clobbered The Top AI Champions In Go, Shogi, And Chess
2. OpenAI’s Universe Gained Traction With High-Profile Partners
3. Sonnet & Tensorflow Eager Joined Their Fellow Open-Source Frameworks
4. Facebook & Microsoft Joined Forces To Enable AI Framework Interoperability
5. Unity Enabled Developers To Easily Build Intelligent Agents In Games
6. Machine Learning As A Service (MLAAS) Platforms Sprout Up Everywhere
7. The Gan Zoo Continued To Grow
8. Who Needs Recurrence Or Convolution When You Have Attention? (Transformer)
9. AutoML Simplified The Lives Of Data Scientists & Machine Learning Engineers
10. Hinton Declared Backprop Dead, Finally Dropped His Capsule Networks
11. Quantum & Optical Computing Entered The AI Hardware Wars
12. Ethics & Fairness Of ML Systems Took Center Stage
machine_learning  TensorFlow  google  gcp 
february 2018 by amy
GitHub Issues | Kaggle
GitHub issue titles and descriptions for NLP analysis.
machine_learning  TensorFlow  kaggle  nlp  github 
february 2018 by amy
hannw/nlstm: Nested LSTM Cell
Here is a TensorFlow implementation of the Nested LSTM cell.
TensorFlow  machine_learning 
february 2018 by amy
models/research/slim/nets/nasnet at master · tensorflow/models
This directory contains the code for the NASNet-A model from the paper Learning Transferable Architectures for Scalable Image Recognition by Zoph et al. In nasnet.py, three different configurations of NASNet-A are implemented. One of the models is the NASNet-A built for CIFAR-10, and the other two are variants of NASNet-A trained on ImageNet, which are listed below.
TensorFlow  machine_learning  google 
january 2018 by amy
Stanford University: Tensorflow for Deep Learning Research
Course Description
TensorFlow is a powerful open-source software library for machine learning developed by researchers at Google. It has many pre-built functions to ease the task of building different neural networks. TensorFlow allows distribution of computation across different computers, as well as multiple CPUs and GPUs within a single machine. TensorFlow provides a Python API, as well as a less documented C++ API. For this course, we will be using Python.

This course will cover the fundamentals and contemporary usage of the TensorFlow library for deep learning research. We aim to help students understand the graphical computational model of TensorFlow, explore the functions it has to offer, and learn how to build and structure models best suited for a deep learning project. Through the course, students will use TensorFlow to build models of varying complexity, from simple linear/logistic regression to convolutional and recurrent neural networks, solving tasks such as word embedding, translation, optical character recognition, and reinforcement learning. Students will also learn best practices for structuring a model and managing research experiments.
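
For a taste of the graph-then-session style the course teaches, a minimal linear regression sketch:

```python
import tensorflow as tf

# Build the graph: a one-variable linear model and its loss.
x = tf.placeholder(tf.float32, name='x')
y = tf.placeholder(tf.float32, name='y')
w = tf.Variable(0.0, name='weight')
b = tf.Variable(0.0, name='bias')
loss = tf.reduce_mean(tf.square(w * x + b - y))
train_op = tf.train.GradientDescentOptimizer(0.01).minimize(loss)

# Execute it in a session.
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(100):
        sess.run(train_op, feed_dict={x: [1., 2., 3.], y: [2., 4., 6.]})
    print(sess.run([w, b]))  # approaches w=2, b=0
```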
stanford  machine_learning  TensorFlow  education 
january 2018 by amy
[1707.03717] Using Transfer Learning for Image-Based Cassava Disease Detection
Cassava is the third-largest source of carbohydrates for human food in the world, but is vulnerable to virus diseases, which threaten to destabilize food security in sub-Saharan Africa. Novel methods of cassava disease detection are needed to support improved control, which will help prevent this crisis. Image recognition offers both a cost-effective and scalable technology for disease detection. New transfer learning methods offer an avenue for this technology to be easily deployed on mobile devices. Using a dataset of cassava disease images taken in the field in Tanzania, we applied transfer learning to train a deep convolutional neural network to identify three diseases and two types of pest damage (or lack thereof). The best trained model accuracies were 98% for brown leaf spot (BLS), 96% for red mite damage (RMD), 95% for green mite damage (GMD), 98% for cassava brown streak disease (CBSD), and 96% for cassava mosaic disease (CMD). The best model achieved an overall accuracy of 93% for data not used in the training process. Our results show that the transfer learning approach for image recognition of field images offers a fast, affordable, and easily deployable strategy for digital plant disease detection.
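
The general recipe the paper applies looks roughly like the sketch below: reuse a pre-trained image network as a feature extractor and train only a new classifier head. The paper itself retrained Inception v3; the TF-Hub module URL and the six-class head here are illustrative assumptions.

```python
# Illustrative transfer-learning sketch; module URL and class count are
# assumptions, not the paper's exact setup.
import tensorflow as tf
import tensorflow_hub as hub

images = tf.placeholder(tf.float32, [None, 299, 299, 3])
labels = tf.placeholder(tf.int64, [None])

# Pre-trained Inception v3 features; weights stay frozen by default.
module = hub.Module(
    "https://tfhub.dev/google/imagenet/inception_v3/feature_vector/1")
features = module(images)

# New head: e.g. 3 diseases + 2 pest damages + healthy = 6 classes.
logits = tf.layers.dense(features, 6)
loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)
```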
machine_learning  TensorFlow 
january 2018 by amy
tensorflow/tensorflow/contrib/gan at master · tensorflow/tensorflow
TFGAN is a lightweight library for training and evaluating Generative Adversarial Networks (GANs). This technique allows you to train a network (called the 'generator') to sample from a distribution, without having to explicitly model the distribution and without writing an explicit loss. For example, the generator could learn to draw samples from the distribution of natural images. For more details on this technique, see 'Generative Adversarial Networks' by Goodfellow et al. See tensorflow/models for examples, and this tutorial for an introduction.
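
In code, the library reduces the setup to a handful of calls. A minimal sketch, assuming toy generator and discriminator functions (the library's losses and many more options are documented in the repo):

```python
# Minimal TFGAN sketch; the generator/discriminator here are toys.
import tensorflow as tf
tfgan = tf.contrib.gan

def generator_fn(noise):
    return tf.layers.dense(noise, 784, activation=tf.tanh)

def discriminator_fn(data, unused_conditioning):
    return tf.layers.dense(data, 1)

noise = tf.random_normal([32, 64])
real_images = tf.placeholder(tf.float32, [32, 784])

gan_model = tfgan.gan_model(generator_fn, discriminator_fn,
                            real_data=real_images, generator_inputs=noise)
gan_loss = tfgan.gan_loss(gan_model)  # well-tested default losses
train_ops = tfgan.gan_train_ops(
    gan_model, gan_loss,
    generator_optimizer=tf.train.AdamOptimizer(1e-4),
    discriminator_optimizer=tf.train.AdamOptimizer(1e-4))
```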
machine_learning  TensorFlow  google  GANs 
january 2018 by amy
Research Blog: TFGAN: A Lightweight Library for Generative Adversarial Networks
Training a neural network usually involves defining a loss function, which tells the network how close or far it is from its objective. For example, image classification networks are often given a loss function that penalizes them for giving wrong classifications; a network that mislabels a dog picture as a cat will get a high loss. However, not all problems have easily-defined loss functions, especially if they involve human perception, such as image compression or text-to-speech systems. Generative Adversarial Networks (GANs), a machine learning technique that has led to improvements in a wide range of applications including generating images from text, superresolution, and helping robots learn to grasp, offer a solution. However, GANs introduce new theoretical and software engineering challenges, and it can be difficult to keep up with the rapid pace of GAN research.

A video of a generator improving over time. It begins by producing random noise, and eventually learns to generate MNIST digits.
In order to make GANs easier to experiment with, we’ve open sourced TFGAN, a lightweight library designed to make it easy to train and evaluate GANs. It provides the infrastructure to easily train a GAN, provides well-tested loss and evaluation metrics, and gives easy-to-use examples that highlight the expressiveness and flexibility of TFGAN. We’ve also released a tutorial that includes a high-level API to quickly get a model trained on your data.
machine_learning  google  GANs  TensorFlow 
january 2018 by amy
[1712.01769] State-of-the-art Speech Recognition With Sequence-to-Sequence Models
Attention-based encoder-decoder architectures such as Listen, Attend, and Spell (LAS) subsume the acoustic, pronunciation, and language model components of a traditional automatic speech recognition (ASR) system into a single neural network. In our previous work, we have shown that such architectures are comparable to state-of-the-art ASR systems on dictation tasks, but it was not clear if such architectures would be practical for more challenging tasks such as voice search. In this work, we explore a variety of structural and optimization improvements to our LAS model which significantly improve performance. On the structural side, we show that word piece models can be used instead of graphemes. We introduce a multi-head attention architecture, which offers improvements over the commonly used single-head attention. On the optimization side, we explore techniques such as synchronous training, scheduled sampling, label smoothing, and minimum word error rate optimization, all of which are shown to improve accuracy. We present results with a unidirectional LSTM encoder for streaming recognition. On a 12,500-hour voice search task, we find that the proposed changes improve the WER of the LAS system from 9.2% to 5.6%, while the best conventional system achieves a 6.7% WER. We also test both models on a dictation dataset, where our model provides a 4.1% WER while the conventional system provides a 5.0% WER.
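
Of the optimization techniques listed, label smoothing is the simplest to show in isolation; TensorFlow exposes it directly on the softmax cross-entropy loss (shapes here are illustrative):

```python
import tensorflow as tf

logits = tf.placeholder(tf.float32, [None, 10])
onehot_labels = tf.placeholder(tf.float32, [None, 10])

# label_smoothing=0.1 mixes each one-hot target with the uniform
# distribution, so the model is never pushed to fully saturate.
loss = tf.losses.softmax_cross_entropy(
    onehot_labels=onehot_labels, logits=logits, label_smoothing=0.1)
```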
machine_learning  google  TensorFlow  seq2seq 
january 2018 by amy
Research Blog: Improving End-to-End Models For Speech Recognition
Traditional automatic speech recognition (ASR) systems, used for a variety of voice search applications at Google, are composed of an acoustic model (AM), a pronunciation model (PM) and a language model (LM), all of which are independently trained, and often manually designed, on different datasets [1]. AMs take acoustic features and predict a set of subword units, typically context-dependent or context-independent phonemes. Next, a hand-designed lexicon (the PM) maps a sequence of phonemes produced by the acoustic model to words. Finally, the LM assigns probabilities to word sequences. Training independent components creates added complexities and is suboptimal compared to training all components jointly. Over the last several years, there has been growing interest in developing end-to-end systems, which attempt to learn these separate components jointly as a single system. While these end-to-end models have shown promising results in the literature [2, 3], it is not yet clear if such approaches can improve on current state-of-the-art conventional systems.

Today we are excited to share “State-of-the-art Speech Recognition With Sequence-to-Sequence Models [4],” which describes a new end-to-end model that surpasses the performance of a conventional production system [1]. We show that our end-to-end system achieves a word error rate (WER) of 5.6%, which corresponds to a 16% relative improvement over a strong conventional system which achieves a 6.7% WER. Additionally, the end-to-end model used to output the initial word hypothesis, before any hypothesis rescoring, is 18 times smaller than the conventional model, as it contains no separate LM and PM.
machine_learning  google  TensorFlow  research 
january 2018 by amy
[1710.05941] Searching for Activation Functions
The choice of activation functions in deep networks has a significant effect on the training dynamics and task performance. Currently, the most successful and widely-used activation function is the Rectified Linear Unit (ReLU). Although various hand-designed alternatives to ReLU have been proposed, none have managed to replace it due to inconsistent gains. In this work, we propose to leverage automatic search techniques to discover new activation functions. Using a combination of exhaustive and reinforcement learning-based search, we discover multiple novel activation functions. We verify the effectiveness of the searches by conducting an empirical evaluation with the best discovered activation function. Our experiments show that the best discovered activation function, f(x) = x·sigmoid(βx), which we name Swish, tends to work better than ReLU on deeper models across a number of challenging datasets. For example, simply replacing ReLUs with Swish units improves top-1 classification accuracy on ImageNet by 0.9% for Mobile NASNet-A and 0.6% for Inception-ResNet-v2. The simplicity of Swish and its similarity to ReLU make it easy for practitioners to replace ReLUs with Swish units in any neural network.
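
Swish is a one-liner in TensorFlow; beta can be a constant or a trainable variable:

```python
import tensorflow as tf

def swish(x, beta=1.0):
    """Swish activation: f(x) = x * sigmoid(beta * x)."""
    return x * tf.sigmoid(beta * x)
```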
machine_learning  google  TensorFlow 
january 2018 by amy
[1611.01578] Neural Architecture Search with Reinforcement Learning
Neural networks are powerful and flexible models that work well for many difficult learning tasks in image, speech and natural language understanding. Despite their success, neural networks are still hard to design. In this paper, we use a recurrent network to generate the model descriptions of neural networks and train this RNN with reinforcement learning to maximize the expected accuracy of the generated architectures on a validation set. On the CIFAR-10 dataset, our method, starting from scratch, can design a novel network architecture that rivals the best human-invented architecture in terms of test set accuracy. Our CIFAR-10 model achieves a test error rate of 3.65, which is 0.09 percent better and 1.05x faster than the previous state-of-the-art model that used a similar architectural scheme. On the Penn Treebank dataset, our model can compose a novel recurrent cell that outperforms the widely-used LSTM cell, and other state-of-the-art baselines. Our cell achieves a test set perplexity of 62.4 on the Penn Treebank, which is 3.6 perplexity better than the previous state-of-the-art model. The cell can also be transferred to the character language modeling task on PTB and achieves a state-of-the-art perplexity of 1.214.
machine_learning  google  TensorFlow 
january 2018 by amy
[1703.01041] Large-Scale Evolution of Image Classifiers
Neural networks have proven effective at solving difficult problems but designing their architectures can be challenging, even for image classification problems alone. Our goal is to minimize human participation, so we employ evolutionary algorithms to discover such networks automatically. Despite significant computational requirements, we show that it is now possible to evolve models with accuracies within the range of those published in the last year. Specifically, we employ simple evolutionary techniques at unprecedented scales to discover models for the CIFAR-10 and CIFAR-100 datasets, starting from trivial initial conditions and reaching accuracies of 94.6% (95.6% for ensemble) and 77.0%, respectively. To do this, we use novel and intuitive mutation operators that navigate large search spaces; we stress that no human participation is required once evolution starts and that the output is a fully-trained model. Throughout this work, we place special emphasis on the repeatability of results, the variability in the outcomes and the computational requirements.
machine_learning  google  TensorFlow 
january 2018 by amy
[1706.03762] Attention Is All You Need
The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles, by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.
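
The core of the Transformer is scaled dot-product attention, Attention(Q, K, V) = softmax(QKᵀ/√d_k)V; a sketch (without masking or multiple heads):

```python
import tensorflow as tf

def scaled_dot_product_attention(q, k, v):
    """q: [batch, q_len, d_k]; k, v: [batch, k_len, d_k]."""
    d_k = tf.cast(tf.shape(k)[-1], tf.float32)
    scores = tf.matmul(q, k, transpose_b=True) / tf.sqrt(d_k)
    weights = tf.nn.softmax(scores)  # attention weights over k_len
    return tf.matmul(weights, v)     # [batch, q_len, d_k]
```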
machine_learning  google  TensorFlow  attention  nlp 
december 2017 by amy
[1706.03292] Poseidon: An Efficient Communication Architecture for Distributed Deep Learning on GPU Clusters
Deep learning models can take weeks to train on a single GPU-equipped machine, necessitating scaling out DL training to a GPU-cluster. However, current distributed DL implementations can scale poorly due to substantial parameter synchronization over the network, because the high throughput of GPUs allows more data batches to be processed per unit time than CPUs, leading to more frequent network synchronization. We present Poseidon, an efficient communication architecture for distributed DL on GPUs. Poseidon exploits the layered model structures in DL programs to overlap communication and computation, reducing bursty network communication. Moreover, Poseidon uses a hybrid communication scheme that optimizes the number of bytes required to synchronize each layer, according to layer properties and the number of machines. We show that Poseidon is applicable to different DL frameworks by plugging Poseidon into Caffe and TensorFlow. We show that Poseidon enables Caffe and TensorFlow to achieve 15.5x speed-up on 16 single-GPU machines, even with limited bandwidth (10GbE) and the challenging VGG19-22K network for image classification. Moreover, Poseidon-enabled TensorFlow achieves 31.5x speed-up with 32 single-GPU machines on Inception-V3, a 50% improvement over the open-source TensorFlow (20x speed-up).
machine_learning  TensorFlow 
november 2017 by amy