amy + tensorflow   469

tensor2tensor/tensor2tensor/mesh_tensorflow at master · tensorflow/tensor2tensor
Mesh TensorFlow (mtf) is a language for distributed deep learning, capable of specifying a broad class of distributed tensor computations. The purpose of mesh-tensorflow is to formalize and implement distribution strategies for your computation graph over your hardware/processors. For example: "Split the batch over rows of processors and split the units in the hidden layer across columns of processors." Mesh-TensorFlow is implemented as a layer over TensorFlow.
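The "split the batch over rows, split the hidden units over columns" layout can be sketched in plain Python (this is an illustrative sketch of the sharding arithmetic, not the mtf API; the function name is hypothetical):

```python
# Illustrative sketch (not the mtf API): each processor in a
# mesh_rows x mesh_cols mesh holds one shard of a [batch, hidden] tensor,
# with the batch split over rows and the hidden dimension over columns.

def shard_shape(batch, hidden, mesh_rows, mesh_cols):
    """Per-processor shard of a [batch, hidden] tensor on the mesh."""
    assert batch % mesh_rows == 0 and hidden % mesh_cols == 0
    return (batch // mesh_rows, hidden // mesh_cols)

# A [512, 1024] activation on a 4x8 processor mesh: each holds [128, 128].
print(shard_shape(512, 1024, 4, 8))  # (128, 128)
```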
TensorFlow  machine_learning 
3 days ago by amy
tensorflow/probability: Probabilistic reasoning and statistical analysis in TensorFlow
TensorFlow Probability is a library for probabilistic reasoning and statistical analysis in TensorFlow. As part of the TensorFlow ecosystem, TensorFlow Probability provides integration of probabilistic methods with deep networks, gradient-based inference via automatic differentiation, and scalability to large datasets and models via hardware acceleration (e.g., GPUs) and distributed computation.
TensorFlow  statistics  machine_learning  probability 
9 days ago by amy
tf.contrib.eager.defun  |  TensorFlow
Compiles a Python function into a callable TensorFlow graph.
TensorFlow  machine_learning 
4 weeks ago by amy
AutoGraph converts Python into TensorFlow graphs – TensorFlow – Medium
We’d like to tell you about a new TensorFlow feature called “AutoGraph”. AutoGraph converts Python code, including control flow, print() and other Python-native features, into pure TensorFlow graph code.
TensorFlow  machine_learning  google 
4 weeks ago by amy
training-data-analyst/serving_embed.ipynb at master · GoogleCloudPlatform/training-data-analyst
Serving embeddings
This notebook illustrates how to:

- Create a custom embedding as part of a regression/classification model
- Represent categorical variables in different ways
- Do math with feature columns
- Serve out the embedding, as well as the original model's predictions
TensorFlow  machine_learning  gcp 
5 weeks ago by amy
tensorflow - train_and_evaluate() batch size with TPU on GCMLE - Stack Overflow
The batch size handling is slightly different between normal Estimator and TPUEstimator.

For normal Estimator, the batch size is not explicitly visible to Estimator; instead, it is part of the input_fn story, like your example is doing.

For TPU, batch size is handled differently. To be specific, the "xxx_batch_size" family, e.g., the train batch size in the TPUEstimator constructor, is the global batch size for your model. Depending on the value of tf.contrib.tpu.TPUConfig.per_host_input_for_training, your input_fn is invoked by TPUEstimator in different ways.

Here, the params['batch_size'] is the shard batch size, calculated from the train_batch_size passed to the constructor.

A concrete example: say train_batch_size is 64, and for Cloud TPU,

If per_host_input_for_training is False, input_fn will be invoked 8 times on Cloud TPU (this is called per-core mode). In this case, the params['batch_size'] in input_fn is 64/8 = 8. The total global batch size your model sees is 64, which is the train_batch_size passed via the TPUEstimator constructor.

If per_host_input_for_training is flipped to True, params['batch_size'] in input_fn will be 64 (not 64/8) and input_fn will be called only once. So, the global batch size is still 64.

The same input_fn can work in both cases.

For TPU Pods, the story is the same: params['batch_size'] is the shard batch size with respect to each host.

To summarize:

The global batch size should be passed via TPUEstimator constructor.

The input_fn should read the shard batch size from params['batch_size'] and use it to create your dataset.
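The batch-size arithmetic described above can be sketched in plain Python (an illustrative sketch, not TPUEstimator itself; the function name and the num_cores default are assumptions for a single Cloud TPU):

```python
# Hedged sketch of how params['batch_size'] relates to the global
# train_batch_size on a single Cloud TPU (8 cores), per the answer above.

def shard_batch_size(train_batch_size, num_cores=8,
                     per_host_input_for_training=False):
    """Batch size that input_fn would see in params['batch_size']."""
    if per_host_input_for_training:
        # Per-host mode: input_fn is called once with the full global batch.
        return train_batch_size
    # Per-core mode: input_fn is called once per core with a shard.
    return train_batch_size // num_cores

print(shard_batch_size(64))                                    # per-core: 8
print(shard_batch_size(64, per_host_input_for_training=True))  # per-host: 64
```

Either way the model sees a global batch of 64; only how input_fn is invoked changes.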
TensorFlow  machine_learning  TPUs 
10 weeks ago by amy
Predicting San Francisco Bikeshare availability with TensorFlow and LSTMs
machine_learning  TensorFlow  LSTMs 
10 weeks ago by amy
Google AI Blog: Advances in Semantic Textual Similarity
The recent rapid progress of neural network-based natural language understanding research, especially on learning semantic text representations, can enable truly novel products such as Smart Compose and Talk to Books. It can also help improve performance on a variety of natural language tasks which have limited amounts of training data, such as building strong text classifiers from as few as 100 labeled examples.

Below, we discuss two papers reporting recent progress on semantic representation research at Google, as well as two new models available for download on TensorFlow Hub that we hope developers will use to build new and exciting applications.
TensorFlow  machine_learning  google 
12 weeks ago by amy
model-analysis/examples/chicago_taxi at master · tensorflow/model-analysis
The Chicago Taxi example demonstrates the end-to-end workflow: how to transform data, train a model, and analyze and serve it, using:

TensorFlow Transform for feature preprocessing
TensorFlow Estimators for training
TensorFlow Model Analysis and Jupyter for evaluation
TensorFlow Serving for serving
The example shows two modes of deployment.

The first is a “local mode” with all necessary dependencies and components deployed locally.
The second is a “cloud mode”, where all components will be deployed on Google Cloud.
In the future we will be showing additional deployment modes, so dear reader, feel free to check back in periodically!
machine_learning  TensorFlow  google 
april 2018 by amy
google/deepvariant: DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data.
DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data.
google  genomics  dna  machine_learning  TensorFlow 
march 2018 by amy
SeldonIO/seldon-core: Machine Learning Deployment for Kubernetes
Seldon Core is an open source platform for deploying machine learning models on Kubernetes.
machine_learning  kubernetes  k8s  TensorFlow 
march 2018 by amy
Cyclic computational graphs with Tensorflow or Theano - Stack Overflow
TensorFlow does support cyclic computation graphs. The tf.while_loop() function allows you to specify a while loop with arbitrary subgraphs for the condition and the body of the loop, and the runtime will execute the loop in parallel. The tf.scan() function is a higher-level API that is similar to Theano's theano.scan() function. Both allow you to loop over tensors of dynamic size.
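The looping-over-tensors semantics of tf.scan (and theano.scan) can be illustrated with a plain-Python analogue (a sketch of the behavior only, not the TensorFlow API; names are illustrative):

```python
# Plain-Python analogue of scan semantics: apply fn(accumulator, elem) across
# the elements, threading the accumulator and keeping every intermediate value.

def scan(fn, elems, initializer):
    """Return the list of all intermediate accumulator values."""
    acc = initializer
    results = []
    for e in elems:
        acc = fn(acc, e)
        results.append(acc)
    return results

# Running sum, the classic scan example:
print(scan(lambda a, x: a + x, [1, 2, 3, 4], 0))  # [1, 3, 6, 10]
```

tf.scan does the same over the leading dimension of a tensor, building the loop into the graph so the size can be dynamic.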
TensorFlow 
march 2018 by amy
Twitter
RT : I'm doing a webinar on a development workflow for models written in -- to go from data…
tensorflow  machinelearning  from twitter
march 2018 by amy
Propel ML
Propel provides a GPU-backed numpy-like infrastructure for scientific computing in JavaScript. JavaScript is a fast, dynamic language which, we think, could act as an ideal workflow for scientific programmers of all sorts.
machine_learning  TensorFlow  javascript 
february 2018 by amy
Machine Learning with TensorFlow on Google Cloud Platform: code samples
new courses!: “Machine Learning with TensorFlow on Google Cloud Platform: code samples”

via @lak_gcp
TensorFlow  machine_learning  gcp 
february 2018 by amy
design | Architecture and UX design of KAML-D
KAML-D can be deployed on any cloud (or on-premises) platform that can run Kubernetes. Most of the components are open source. As a SaaS, it integrates with the cloud provider's (user) identity management system; on-premises, with something like LDAP.

Existing open source components KAML-D uses:

Kubernetes for workload management and to ensure portability
TensorFlow for machine learning execution
JupyterHub for data scientists (dev/test of algorithms)
Storage layer to hold the datasets: Minio or Ceph, as well as cloud-provider-specific offerings such as EBS, with built-in dotmesh support for snapshots
New components KAML-D introduces:

KAML-D Workbench: a graphical UI for data scientists, data engineers, developers, and SREs to manage datasets as well as to test and deploy ML algorithms. Builds on the metadata layer to find and visualize datasets. Builds on the storage layer to store and load datasets.
KAML-D Metadata Hub: a data and metadata layer using PrestoDB and Elasticsearch for indexing and querying datasets.
KAML-D Observation Hub: a comprehensive observability suite for SREs and admins (as well as developers on the app level) to understand the health of the KAML-D platform and troubleshoot issues on the platform and application level:
Prometheus and Grafana for end-to-end metrics and monitoring/alerting
EFK stack for (aggregated) logging
Jaeger for (distributed) tracing
The user management and access control part is outside of the scope of KAML-D but standard integration points such as LDAP are offered.
machine_learning  kubernetes  TensorFlow 
february 2018 by amy
nikhilk/node-tensorflow: Node.js + TensorFlow
TensorFlow is Google's machine learning runtime. It is implemented as a C++ runtime, along with a Python framework to support building a variety of models, especially neural networks for deep learning.

From Nikhil

It is interesting to be able to use TensorFlow in a node.js application using just JavaScript (or TypeScript if that's your preference). However, the Python functionality is vast (several ops, estimator implementations etc.) and continually expanding. Instead, it would be more practical to consider building Graphs and training models in Python, and then consuming those for runtime use-cases (like prediction or inference) in a pure node.js and Python-free deployment. This is what this node module enables.

This module takes care of the building blocks and mechanics for working with the TensorFlow C API, and instead provides an API around Tensors, Graphs, Sessions and Models.

This is still in the works, and recently revamped to support TensorFlow 1.4+.
machinelearning  TensorFlow 
february 2018 by amy
Forbes: 12 Amazing Deep Learning Breakthroughs of 2017
1. DeepMind’s AlphaZero Clobbered The Top AI Champions In Go, Shogi, And Chess
2. OpenAI’s Universe Gained Traction With High-Profile Partners
3. Sonnet & Tensorflow Eager Joined Their Fellow Open-Source Frameworks
4. Facebook & Microsoft Joined Forces To Enable AI Framework Interoperability
5. Unity Enabled Developers To Easily Build Intelligent Agents In Games
6. Machine Learning As A Service (MLAAS) Platforms Sprout Up Everywhere
7. The GAN Zoo Continued To Grow
8. Who Needs Recurrence Or Convolution When You Have Attention? (Transformer)
9. AutoML Simplified The Lives Of Data Scientists & Machine Learning Engineers
10. Hinton Declared Backprop Dead, Finally Dropped His Capsule Networks
11. Quantum & Optical Computing Entered The AI Hardware Wars
12. Ethics & Fairness Of ML Systems Took Center Stage
machine_learning  TensorFlow  google  gcp 
february 2018 by amy
GitHub Issues | Kaggle
GitHub issue titles and descriptions for NLP analysis.
machine_learning  TensorFlow  kaggle  nlp  github 
february 2018 by amy
hannw/nlstm: Nested LSTM Cell
Here is a TensorFlow implementation of the Nested LSTM cell.
TensorFlow  machine_learning 
february 2018 by amy
models/research/slim/nets/nasnet at master · tensorflow/models
This directory contains the code for the NASNet-A model from the paper Learning Transferable Architectures for Scalable Image Recognition by Zoph et al. In nasnet.py there are three different configurations of NASNet-A that are implemented. One of the models is the NASNet-A built for CIFAR-10 and the other two are variants of NASNet-A trained on ImageNet, which are listed below.
TensorFlow  machine_learning  google 
january 2018 by amy
Stanford University: Tensorflow for Deep Learning Research
Course Description
TensorFlow is a powerful open-source software library for machine learning developed by researchers at Google. It has many pre-built functions to ease the task of building different neural networks. TensorFlow allows distribution of computation across different computers, as well as multiple CPUs and GPUs within a single machine. TensorFlow provides a Python API, as well as a less documented C++ API. For this course, we will be using Python.

This course will cover the fundamentals and contemporary usage of the TensorFlow library for deep learning research. We aim to help students understand the graphical computational model of TensorFlow, explore the functions it has to offer, and learn how to build and structure models best suited for a deep learning project. Through the course, students will use TensorFlow to build models of different complexity, from simple linear/logistic regression to convolutional and recurrent neural networks, to solve tasks such as word embedding, translation, optical character recognition, and reinforcement learning. Students will also learn best practices to structure a model and manage research experiments.
stanford  machine_learning  TensorFlow  education 
january 2018 by amy
[1707.03717] Using Transfer Learning for Image-Based Cassava Disease Detection
Cassava is the third largest source of carbohydrates for human food in the world but is vulnerable to virus diseases, which threaten to destabilize food security in sub-Saharan Africa. Novel methods of cassava disease detection are needed to support improved control which will prevent this crisis. Image recognition offers both a cost effective and scalable technology for disease detection. New transfer learning methods offer an avenue for this technology to be easily deployed on mobile devices. Using a dataset of cassava disease images taken in the field in Tanzania, we applied transfer learning to train a deep convolutional neural network to identify three diseases and two types of pest damage (or lack thereof). The best trained model accuracies were 98% for brown leaf spot (BLS), 96% for red mite damage (RMD), 95% for green mite damage (GMD), 98% for cassava brown streak disease (CBSD), and 96% for cassava mosaic disease (CMD). The best model achieved an overall accuracy of 93% for data not used in the training process. Our results show that the transfer learning approach for image recognition of field images offers a fast, affordable, and easily deployable strategy for digital plant disease detection.
machine_learning  TensorFlow 
january 2018 by amy
tensorflow/tensorflow/contrib/gan at master · tensorflow/tensorflow
TFGAN is a lightweight library for training and evaluating Generative Adversarial Networks (GANs). This technique allows you to train a network (called the 'generator') to sample from a distribution, without having to explicitly model the distribution and without writing an explicit loss. For example, the generator could learn to draw samples from the distribution of natural images. For more details on this technique, see 'Generative Adversarial Networks' by Goodfellow et al. See tensorflow/models for examples, and this tutorial for an introduction.
machine_learning  TensorFlow  google  GANs 
january 2018 by amy
Research Blog: TFGAN: A Lightweight Library for Generative Adversarial Networks
Training a neural network usually involves defining a loss function, which tells the network how close or far it is from its objective. For example, image classification networks are often given a loss function that penalizes them for giving wrong classifications; a network that mislabels a dog picture as a cat will get a high loss. However, not all problems have easily-defined loss functions, especially if they involve human perception, such as image compression or text-to-speech systems. Generative Adversarial Networks (GANs), a machine learning technique that has led to improvements in a wide range of applications including generating images from text, superresolution, and helping robots learn to grasp, offer a solution. However, GANs introduce new theoretical and software engineering challenges, and it can be difficult to keep up with the rapid pace of GAN research.

A video of a generator improving over time. It begins by producing random noise, and eventually learns to generate MNIST digits.
In order to make GANs easier to experiment with, we’ve open sourced TFGAN, a lightweight library designed to make it easy to train and evaluate GANs. It provides the infrastructure to easily train a GAN, provides well-tested loss and evaluation metrics, and gives easy-to-use examples that highlight the expressiveness and flexibility of TFGAN. We’ve also released a tutorial that includes a high-level API to quickly get a model trained on your data.
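The "not all problems have easily-defined loss functions" point is what the adversarial setup solves; the standard GAN losses can be sketched in plain Python (an illustrative sketch of the minimax losses from Goodfellow et al., not TFGAN's API; function names are assumptions):

```python
import math

# D(x) is the discriminator's probability that x is real (in (0, 1)).
# These are the classic GAN losses, sketched for two samples.

def discriminator_loss(d_real, d_fake):
    # Penalize D for calling a real sample fake or a fake sample real.
    return -(math.log(d_real) + math.log(1.0 - d_fake))

def generator_loss(d_fake):
    # Non-saturating generator loss: G is rewarded when D is fooled.
    return -math.log(d_fake)

# A confident, correct discriminator has lower loss than a guessing one;
# the generator's loss falls as it fools the discriminator.
print(discriminator_loss(0.9, 0.1) < discriminator_loss(0.5, 0.5))  # True
print(generator_loss(0.9) < generator_loss(0.1))                    # True
```

The generator never needs an explicit model of the data distribution; its training signal comes entirely from the discriminator.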
machine_learning  google  GANs  TensorFlow 
january 2018 by amy
[1712.01769] State-of-the-art Speech Recognition With Sequence-to-Sequence Models
Attention-based encoder-decoder architectures such as Listen, Attend, and Spell (LAS) subsume the acoustic, pronunciation and language model components of a traditional automatic speech recognition (ASR) system into a single neural network. In our previous work, we have shown that such architectures are comparable to state-of-the-art ASR systems on dictation tasks, but it was not clear if such architectures would be practical for more challenging tasks such as voice search. In this work, we explore a variety of structural and optimization improvements to our LAS model which significantly improve performance. On the structural side, we show that word piece models can be used instead of graphemes. We introduce a multi-head attention architecture, which offers improvements over the commonly-used single-head attention. On the optimization side, we explore techniques such as synchronous training, scheduled sampling, label smoothing, and minimum word error rate optimization, which are all shown to improve accuracy. We present results with a unidirectional LSTM encoder for streaming recognition. On a 12,500 hour voice search task, we find that the proposed changes improve the WER of the LAS system from 9.2% to 5.6%, while the best conventional system achieves 6.7% WER. We also test both models on a dictation dataset, and our model provides 4.1% WER while the conventional system provides 5% WER.
machine_learning  google  TensorFlow  seq2seq 
january 2018 by amy
Research Blog: Improving End-to-End Models For Speech Recognition
Traditional automatic speech recognition (ASR) systems, used for a variety of voice search applications at Google, are comprised of an acoustic model (AM), a pronunciation model (PM) and a language model (LM), all of which are independently trained, and often manually designed, on different datasets [1]. AMs take acoustic features and predict a set of subword units, typically context-dependent or context-independent phonemes. Next, a hand-designed lexicon (the PM) maps a sequence of phonemes produced by the acoustic model to words. Finally, the LM assigns probabilities to word sequences. Training independent components creates added complexities and is suboptimal compared to training all components jointly. Over the last several years, there has been a growing popularity in developing end-to-end systems, which attempt to learn these separate components jointly as a single system. While these end-to-end models have shown promising results in the literature [2, 3], it is not yet clear if such approaches can improve on current state-of-the-art conventional systems.

Today we are excited to share “State-of-the-art Speech Recognition With Sequence-to-Sequence Models [4],” which describes a new end-to-end model that surpasses the performance of a conventional production system [1]. We show that our end-to-end system achieves a word error rate (WER) of 5.6%, which corresponds to a 16% relative improvement over a strong conventional system which achieves a 6.7% WER. Additionally, the end-to-end model used to output the initial word hypothesis, before any hypothesis rescoring, is 18 times smaller than the conventional model, as it contains no separate LM and PM.
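The "16% relative improvement" figure follows directly from the two WER numbers quoted; a one-liner to check the arithmetic (function name is illustrative):

```python
# Relative improvement of a new word error rate over a baseline:
# (baseline - new) / baseline.

def relative_improvement(baseline_wer, new_wer):
    return (baseline_wer - new_wer) / baseline_wer

# 6.7% WER baseline vs. 5.6% WER end-to-end model, as quoted above:
print(round(relative_improvement(6.7, 5.6) * 100))  # 16
```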
machine_learning  google  TensorFlow  research 
january 2018 by amy
[1710.05941] Searching for Activation Functions
The choice of activation functions in deep networks has a significant effect on the training dynamics and task performance. Currently, the most successful and widely-used activation function is the Rectified Linear Unit (ReLU). Although various hand-designed alternatives to ReLU have been proposed, none have managed to replace it due to inconsistent gains. In this work, we propose to leverage automatic search techniques to discover new activation functions. Using a combination of exhaustive and reinforcement learning-based search, we discover multiple novel activation functions. We verify the effectiveness of the searches by conducting an empirical evaluation with the best discovered activation function. Our experiments show that the best discovered activation function, f(x)=x⋅sigmoid(βx), which we name Swish, tends to work better than ReLU on deeper models across a number of challenging datasets. For example, simply replacing ReLUs with Swish units improves top-1 classification accuracy on ImageNet by 0.9\% for Mobile NASNet-A and 0.6\% for Inception-ResNet-v2. The simplicity of Swish and its similarity to ReLU make it easy for practitioners to replace ReLUs with Swish units in any neural network.
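The discovered activation is simple enough to sketch directly from the abstract's formula, f(x) = x · sigmoid(βx) (a plain-Python sketch, not the paper's implementation):

```python
import math

# Swish: f(x) = x * sigmoid(beta * x). With beta = 1 this is x * sigmoid(x);
# for large positive x it approaches the identity, like ReLU, but it is
# smooth and non-monotonic near zero.

def swish(x, beta=1.0):
    return x * (1.0 / (1.0 + math.exp(-beta * x)))

print(swish(0.0))             # 0.0
print(round(swish(10.0), 3))  # ~10.0: behaves like identity for large x
```

The drop-in similarity to ReLU is why the authors argue it is easy for practitioners to swap in.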
machine_learning  google  TensorFlow 
january 2018 by amy