Case Study: A world class image classifier for dogs and cats (err.., anything)

february 2018

It is amazing how far computer vision has come in the last couple of years. Problems that are insanely intractable for classical machine learning methods are a piece of cake for the emerging field of…

deep-learning
convolutions
convolutional-neural-networks
neural-networks
differential-learning-rates
learning-rate
kaggle
fast.ai
transfer-learning
from pocket
february 2018

Decoding the ResNet architecture // teleported.in

february 2018

A blog where I share my intuitions about artificial intelligence, machine learning, and deep learning.

resnet
shortcut-connections
network-architecture
convolutional-neural-networks
cnns
deep-learning
fast.ai
from pocket
february 2018

Yet Another ResNet Tutorial (or not) – Apil Tamang – Medium

february 2018

The purpose of this article is to expose the most fundamental concept driving the design and success of ResNet architectures. Many blogs and articles go on and on describing how this architecture is…
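
The fundamental concept in question is the shortcut (identity) connection: instead of learning a full mapping H(x), each block learns only the residual F(x) = H(x) - x and outputs F(x) + x, which is much easier to optimize in deep stacks. A minimal sketch in PyTorch; the channel count and layer sizes are illustrative, not taken from the article:

    import torch
    from torch import nn

    class ResidualBlock(nn.Module):
        """Minimal residual block: output = relu(F(x) + x)."""
        def __init__(self, channels=64):
            super().__init__()
            self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
            self.bn1 = nn.BatchNorm2d(channels)
            self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
            self.bn2 = nn.BatchNorm2d(channels)
            self.relu = nn.ReLU()

        def forward(self, x):
            out = self.relu(self.bn1(self.conv1(x)))
            out = self.bn2(self.conv2(out))
            return self.relu(out + x)  # the shortcut: add the input back

    block = ResidualBlock()
    y = block(torch.randn(1, 64, 8, 8))  # output shape matches input: (1, 64, 8, 8)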

resnet
neural-networks
network-architecture
from pocket
february 2018

Improving the way we work with learning rate. – techburst

february 2018

Most optimization algorithms (such as SGD, RMSprop, and Adam) require setting the learning rate, the most important hyper-parameter for training deep neural networks. The naive method for choosing the learning…
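
The excerpt is cut off, but the tags point at learning-rate annealing: decaying the rate as training progresses rather than fixing it once. A minimal sketch of two common schedules, with illustrative hyper-parameter values of my own choosing:

    import math

    def step_decay(epoch, base_lr=0.1, drop=0.5, epochs_per_drop=10):
        """Step decay: multiply the LR by `drop` every `epochs_per_drop` epochs."""
        return base_lr * drop ** (epoch // epochs_per_drop)

    def cosine_annealing(epoch, total_epochs, base_lr=0.1, min_lr=1e-4):
        """Cosine annealing: decay smoothly from base_lr down to min_lr."""
        t = epoch / total_epochs
        return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * t))

    for epoch in (0, 10, 25, 49):
        print(epoch, step_decay(epoch), round(cosine_annealing(epoch, 50), 5))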

deep-learning
learning-rate
cyclical-learning-rate
fast.ai
learning-rate-annealing
from pocket
february 2018

The Cyclical Learning Rate technique // teleported.in

february 2018

Learning rate (LR) is one of the most important hyperparameters to be tuned and holds the key to faster and more effective training of neural networks. Simply put, the LR decides how much of the loss gradient is to be applied to our current weights to move them in the direction of lower loss.
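
In symbols, that update is w ← w − lr · ∇L(w). A minimal sketch of the plain update together with the triangular cyclical schedule from Leslie Smith's CLR paper; the base_lr, max_lr, and step_size values are illustrative:

    import math

    def sgd_step(weights, grads, lr):
        """The update the excerpt describes: move each weight against its
        loss gradient, scaled by the learning rate."""
        return [w - lr * g for w, g in zip(weights, grads)]

    def triangular_clr(iteration, base_lr=1e-4, max_lr=1e-2, step_size=2000):
        """Triangular cyclical LR: ramp linearly from base_lr up to max_lr
        and back down, completing one cycle every 2 * step_size iterations."""
        cycle = math.floor(1 + iteration / (2 * step_size))
        x = abs(iteration / step_size - 2 * cycle + 1)
        return base_lr + (max_lr - base_lr) * max(0.0, 1 - x)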

cyclical-learning-rate
learning-rate
fast.ai
sgdr
from pocket
february 2018

Batch normalization in Neural Networks

february 2018

This article explains batch normalization in a simple way. I wrote it based on what I learned from Fast.ai and deeplearning.ai. I will start with why we need it, how it works, and then how to…
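
As a rough sketch of the training-time forward pass (a plain-NumPy version of my own, not the article's code; at inference time running averages of the batch statistics are used instead):

    import numpy as np

    def batchnorm_forward(x, gamma, beta, eps=1e-5):
        """Normalize each feature to zero mean and unit variance over the
        batch, then rescale with the learned parameters gamma (scale) and
        beta (shift). x has shape (batch, features)."""
        mu = x.mean(axis=0)
        var = x.var(axis=0)
        x_hat = (x - mu) / np.sqrt(var + eps)
        return gamma * x_hat + beta

    x = np.random.randn(32, 4) * 5 + 3  # a batch far from zero mean, unit variance
    out = batchnorm_forward(x, gamma=np.ones(4), beta=np.zeros(4))
    print(out.mean(axis=0).round(3), out.std(axis=0).round(3))  # ≈ 0 and ≈ 1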

batch-normalization
transfer-learning
deep-learning
neural-networks
fast.ai
from pocket
february 2018

Why squared error? | benkuhn.net

february 2018

Why squared error is used instead of absolute error.
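
One concrete way to see the difference: the constant prediction that minimizes squared error is the mean, while the one that minimizes absolute error is the median.

    \frac{d}{dc}\sum_i (y_i - c)^2 = -2\sum_i (y_i - c) = 0
        \;\Rightarrow\; c = \tfrac{1}{n}\sum_i y_i \quad (\text{the mean})

    \frac{d}{dc}\sum_i |y_i - c| = \sum_i \operatorname{sign}(c - y_i) = 0
        \;\Rightarrow\; c = \operatorname{median}(y) \quad (\text{equal counts above and below})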

statistics
error
squared-error
absolute-error
from pocket
february 2018

Designing great data products - O'Reilly Media

february 2018

The Drivetrain Approach for optimization; a toy code sketch follows the six steps below.

Step 1: Define the objective (and metric)

Step 2: Identify levers (what inputs can we control)

Step 3: Collect data

Step 4: Model (how the levers influence the objective)

Step 5: Simulate (see how the levers affect the distribution of the objective)

Step 6: Optimize (choose the possible outcome that best meets the objective)
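
A toy end-to-end sketch of the approach. Everything below is hypothetical: a made-up pricing lever, an invented demand model standing in for Step 4, and scipy's bounded optimizer covering Steps 5 and 6:

    import numpy as np
    from scipy.optimize import minimize_scalar

    # Step 1: objective = expected profit.  Step 2: lever = the price we set.
    # Step 4: a hypothetical model, fit from collected data (Step 3), of how
    # the lever drives the objective: demand falls off as price rises.
    def expected_profit(price, unit_cost=3.0):
        demand = 1000 * np.exp(-0.05 * price)  # illustrative demand model
        return (price - unit_cost) * demand

    # Steps 5 and 6: sweep/optimize the lever to maximize the objective.
    result = minimize_scalar(lambda p: -expected_profit(p),
                             bounds=(3, 100), method="bounded")
    print(f"best price: {result.x:.2f}")  # analytically, 3 + 1/0.05 = 23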

data-science
mental-models
drivetrain-approach
from pocket
february 2018

absolute-error
adam
arxiv
batch-normalization
batch-size
big-data
bilingual-word-embeddings
bitcoin
blockchain
blogs
bootstrapping
career
class-imbalance
classification
cloud-computing
cnns
confidence-intervals
convolutional-neural-networks
convolutions
correlation
cryptocurrencies
cryptonetworks
cuda
cyclical-learning-rate
data-augmentation
data-science
datascience
decentralization
decentralized-networks
decision-trees
deep-learning
design-principles
differential-learning-rates
distributed-computing
double-exponential-smoothing
drivetrain-approach
dropout
ds
eda
embeddings
ensembles
error
exploratory-data-analysis
exponential-smoothing
exponentials
fast.ai
feature-importance
finance
forecasting
gartner-hype-cycles
generalization
gini-importance
gradient-boosting
holt-winters-forecasting
human-biases
human-computer-interaction
hyperparameter-optimization
hypothesis-testing
incentives
individual-conditional-expectation
instapaper
interpretability
interpretable
interpretation
jobs
kaggle
learning
learning-rate
learning-rate-annealing
learning-rate-finder
libraries
linear-regression
logarithm
logs
long-short-term-memory-networks
machine-learning
math
matplotlib
memex
mental-models
model-interpretation
modular-network
network-architecture
neural-networks
nlp
normal-distribution
oob-error
ordinary-least-squares
organization
oversampling
packages
pandas
partial-dependence
prioritization
probability
programming
python
random-forest
rectified-linear-unit
recurrent-neural-networks
reinforcement-learning
resnet
rnns
s-curve
scikit-learn
seasonality
sgd
sgdr
shared-representation
shortcut-connections
simple-exponential-smoothing
simple-moving-average
sklearn
smote
squared-error
statistics
t-sne
t-tests
test-time-augmentation
time-series-data
transfer-learning
tree-interpreter
triple-exponential-smoothing
undersampling
validation
visualization
waterfall-charts
word-embeddings