jm + training   9

A history of the neural net/tank legend in AI, and other examples of reward hacking
@gwern: "A history of the neural net/tank legend in AI: https://t.co/2s4AOGMS3a (Feel free to suggest more sightings or examples of reward hacking!)"
gwern  history  ai  machine-learning  ml  genetic-algorithms  neural-networks  perceptron  learning  training  data  reward-hacking 
8 weeks ago by jm
Put Down the Pink Dumbbell
So, ladies, let’s first put down the two-pound, pink dumbbells. We have been sold a false story about fitness, health (and its connection to weight loss).
I was exercised by wolves. And I’m going to tell you all the secrets and tricks I learned by avoiding the fitness-industrial complex.
Most of what I’ll say applies to men, but I have discovered that most of the outrageously wrong advice is given to women. [...]
So, here: truth number one. Very few of us consider strength-training as essential exercise, but it is. It is especially crucial as one ages, because a natural part of the aging process is losing muscle. Women, especially, need to lift weights, and the trick to lifting weights is stressing muscles. And that weight has to be a real weight, progressively increased, and barring health issues, an average woman should not even bother with two-pound weights because that won’t stress your muscles enough to benefit you.
The exercise industry is surely partly to blame for why people don’t exercise regularly: it promises the wrong thing (weight loss) and then doesn’t push or guide people to do the right thing.
exercise  health  fitness  weight-loss  zeynep-tufekci  strength  aging  weights  training 
april 2017 by jm
Research Blog: Federated Learning: Collaborative Machine Learning without Centralized Training Data
Great stuff from Google - a really nifty approach to large-scale privacy-preserving machine learning:

It works like this: your device downloads the current model, improves it by learning from data on your phone, and then summarizes the changes as a small focused update. Only this update to the model is sent to the cloud, using encrypted communication, where it is immediately averaged with other user updates to improve the shared model. All the training data remains on your device, and no individual updates are stored in the cloud.

Federated Learning allows for smarter models, lower latency, and less power consumption, all while ensuring privacy. And this approach has another immediate benefit: in addition to providing an update to the shared model, the improved model on your phone can also be used immediately, powering experiences personalized by the way you use your phone.

Papers:
https://arxiv.org/pdf/1602.05629.pdf , https://arxiv.org/pdf/1610.05492.pdf
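
For a feel of the mechanism, here's a minimal sketch of the federated-averaging loop described above, using a linear model and plain numpy. The function names and the uniform (unweighted) averaging are my own illustrative simplifications, not taken from Google's papers:

    import numpy as np

    def local_update(global_w, X, y, lr=0.1, epochs=5):
        # Client side: improve the model using only on-device data,
        # then return just the small weight delta -- never the raw data.
        w = global_w.copy()
        for _ in range(epochs):
            grad = X.T @ (X @ w - y) / len(y)   # squared-loss gradient
            w -= lr * grad
        return w - global_w

    def federated_averaging(global_w, clients, rounds=10):
        # Server side: average the clients' deltas into the shared model.
        w = global_w.copy()
        for _ in range(rounds):
            deltas = [local_update(w, X, y) for X, y in clients]
            w = w + np.mean(deltas, axis=0)     # only updates reach the server
        return w

The actual FedAvg algorithm (first paper above) samples a fraction of clients per round and weights each delta by that client's example count; the production system layers encrypted aggregation on top.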
google  ml  machine-learning  training  federated-learning  gboard  models  privacy  data-privacy  data-protection 
april 2017 by jm
How a Machine Learns Prejudice - Scientific American
Agreed, this is a big issue.
If artificial intelligence takes over our lives, it probably won’t involve humans battling an army of robots that relentlessly apply Spock-like logic as they physically enslave us. Instead, the machine-learning algorithms that already let AI programs recommend a movie you’d like or recognize your friend’s face in a photo will likely be the same ones that one day deny you a loan, lead the police to your neighborhood or tell your doctor you need to go on a diet. And since humans create these algorithms, they're just as prone to biases that could lead to bad decisions—and worse outcomes.
These biases create some immediate concerns about our increasing reliance on artificially intelligent technology, as any AI system designed by humans to be absolutely "neutral" could still reinforce humans’ prejudicial thinking instead of seeing through it.
prejudice  bias  machine-learning  ml  data  training  race  racism  google  facebook 
january 2017 by jm
Tesla Autopilot mode is learning
This is really impressive, but also a little scary. Tesla Model S drivers are "phoning home" training data as they drive:
A Model S owner by the username Khatsalano kept a count of how many times he had to “rescue” (meaning taking control after an alert) his Model S while using the Autopilot on his daily commute. He counted 6 “rescues” on his first day; by the fourth day of using the system on his 23.5-mile commute, he only had to take over once. Musk said that Model S owners could add ~1 million miles of new data every day, which is helping the company create “high precision maps”.


Wonder if the data protection/privacy implications have been considered for EU use.
autopilot  tesla  maps  mapping  training  machine-learning  eu  privacy  data-protection 
november 2015 by jm
The reusable holdout: Preserving validity in adaptive data analysis
Useful stats hack from Google: "We show how to safely reuse a holdout data set many times to validate the results of adaptively chosen analyses."
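The underlying "Thresholdout" mechanism is simple enough to sketch. The version below is a simplified rendering under my own naming -- it omits the paper's query-budget accounting, and the threshold and noise scales are illustrative, not the paper's:

    import numpy as np

    def thresholdout(train_vals, holdout_vals, threshold=0.04, sigma=0.01,
                     rng=None):
        # Answer one adaptively-chosen query phi, given phi evaluated on
        # every training point and every holdout point.
        if rng is None:
            rng = np.random.default_rng()
        train_mean = np.mean(train_vals)
        holdout_mean = np.mean(holdout_vals)
        # Only consult the holdout when train and holdout visibly disagree...
        if abs(train_mean - holdout_mean) > threshold + rng.laplace(0, 4 * sigma):
            # ...and even then, release only a noised holdout estimate.
            return holdout_mean + rng.laplace(0, sigma)
        # Otherwise the training estimate is already close; answering with it
        # reveals nothing new about the holdout, so the holdout stays fresh.
        return train_mean

Because the holdout only ever influences answers through noisy comparisons, the same holdout set remains a valid check across many adaptively chosen analyses.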
statistics  google  reusable-holdout  training  ml  machine-learning  data-analysis  holdout  corpus  sampling 
august 2015 by jm
How Curiosity, Luck, and the Flip of a Switch Saved the Moon Program | Motherboard
"SCE to off?" someone said. The switch was so obscure that neither of his bosses knew what he was talking about. "What the hell's that," blurted out Gerald Carr, who was in charge of communicating with the capsule. The rookie flight director, Gerry Griffin, didn't know either.

Sixty seconds had passed since the initial lightning strike. No one else knew what to do. The call to abort was fast approaching. 

Finally, Carr reluctantly gave the order in a voice far cooler than the moment. "Apollo 12, Houston, try SCE to Auxiliary, over."
spaceflight  stories  apollo  sce-to-aux  power  lightning  weather  outages  simulation  training  nasa 
november 2014 by jm
Practical machine learning tricks from the KDD 2011 best industry paper
Wow, this is a fantastic paper. It's a Google paper on detecting scam/spam ads using machine learning -- but not just that: it covers how to build out such a classifier to production scale, and make it operationally resilient and, indeed, operable.

I've come across a few of these ideas before, and I'm happy to say I might have reinvented a few (particularly around the feature space), but all of them together make extremely good sense. If I wind up working on large-scale classification again, this is the first paper I'll go back to. Great info! (via Toby diPasquale.)
classification  via:codeslinger  training  machine-learning  google  ops  kdd  best-practices  anti-spam  classifiers  ensemble  map-reduce 
july 2012 by jm
_Building High-level Features Using Large Scale Unsupervised Learning_ [paper, PDF]
"We consider the problem of building highlevel, class-specific feature detectors from only unlabeled data. For example, is it possible to learn a face detector using only unlabeled images using unlabeled images? To answer this, we train a 9-layered locally connected sparse autoencoder with pooling and local contrast normalization on a large dataset of images (the model has 1 billion connections, the dataset has 10 million 200x200 pixel images downloaded from the Internet). We train this network using model parallelism and asynchronous SGD on a cluster with 1,000 machines (16,000 cores) for three days. Contrary to what appears to be a widely-held intuition, our experimental results reveal that it is possible to train a face detector without having to label images as containing a face or not. Control experiments show that this feature detector is robust not only to translation but also to scaling and out-of-plane rotation. We also find that the same network is sensitive to other high-level concepts such as cat faces and human bodies. Starting with these learned features, we trained our network to obtain 15.8% accuracy in recognizing 20,000 object categories from ImageNet, a leap of 70% relative improvement over the previous state-of-the-art."
algorithms  machine-learning  neural-networks  sgd  labelling  training  unlabelled-learning  google  research  papers  pdf 
june 2012 by jm
