janpeuker + ai   165

Car Wars | this.
In this work of speculative fiction, author Cory Doctorow takes us into a near future where the roads are solely populated by self-driving cars.
ai  future  literature 
2 days ago by janpeuker
TensorFlow Wide & Deep Learning Tutorial  |  TensorFlow
In this tutorial, we'll introduce how to use the tf.estimator API to jointly train a wide linear model and a deep feed-forward neural network. This approach combines the strengths of memorization and generalization. It's useful for generic large-scale regression and classification problems with sparse input features (e.g., categorical features with a large number of possible feature values). If you're interested in learning more about how Wide & Deep Learning works, please check out our research paper.

The figure above shows a comparison of a wide model (logistic regression with sparse features and transformations), a deep model (feed-forward neural network with an embedding layer and several hidden layers), and a Wide & Deep model (joint training of both). At a high level, there are only 3 steps to configure a wide, deep, or Wide & Deep model using the tf.estimator API:

Select features for the wide part: Choose the sparse base columns and crossed columns you want to use.
Select features for the deep part: Choose the continuous columns, the embedding dimension for each categorical column, and the hidden layer sizes.
Put them all together in a Wide & Deep model (DNNLinearCombinedClassifier).
ai  library  algorithm 
7 days ago by janpeuker
AI Has a Hallucination Problem That's Proving Tough to Fix | WIRED
Solving that problem—which could challenge designers of self-driving vehicles—may require a more radical rethink of machine-learning technology. “The fundamental problem I would say is that a deep neural network is very different from a human brain,” says Li.

Humans aren’t immune to sensory trickery. We can be fooled by optical illusions, and a recent paper from Google created weird images that tricked both software and humans who glimpsed them for less than a tenth of a second to mistake cats for dogs. But when interpreting photos we look at more than patterns of pixels, and consider the relationship between different components of an image, such as the features of a person’s face, says Li.
psychology  bias  ai 
7 days ago by janpeuker
NSynth Super
As part of this exploration, they've created NSynth Super in collaboration with Google Creative Lab. It’s an open source experimental instrument which gives musicians the ability to make music using completely new sounds generated by the NSynth algorithm from 4 different source sounds. The experience prototype (pictured above) was shared with a small community of musicians to better understand how they might use it in their creative process.
music  hardware  ai 
9 days ago by janpeuker
Deep Neural Network implemented in pure SQL over BigQuery
Now let us look at the deeper implications of a distributed SQL engine in the context of deep learning. One limitation of warehouse SQL engines like BigQuery and Presto is that the query processing is performed using CPUs instead of GPUs. It would be interesting to check out the results with GPU-accelerated SQL databases like blazingdb and mapd. One straightforward approach to check out would be to perform query and data distribution using a distributed SQL engine and to perform the local computations using a GPU accelerated database.
database  analytics  ai 
9 days ago by janpeuker
Reptile: A Scalable Meta-Learning Algorithm
We’ve developed a simple meta-learning algorithm called Reptile which works by repeatedly sampling a task, performing stochastic gradient descent on it, and updating the initial parameters towards the final parameters learned on that task. This method performs as well as MAML, a broadly applicable meta-learning algorithm, while being simpler to implement and more computationally efficient.
It has been argued that humans’ fast-learning abilities can be explained as Bayesian inference, and that the key to developing algorithms with human-level learning speed is to make our algorithms more Bayesian. However, in practice, it is challenging to develop (from first principles) Bayesian machine learning algorithms that make use of deep neural networks and are computationally feasible.

Meta-learning has emerged recently as an approach for learning from small amounts of data. Rather than trying to emulate Bayesian inference (which may be computationally intractable), meta-learning seeks to directly optimize a fast-learning algorithm, using a dataset of tasks. Specifically, we assume access to a distribution over tasks, where each task is, for example, a classification task. From this distribution, we sample a training set and a test set. Our algorithm is fed the training set, and it must produce an agent that has good average performance on the test set. Since each task corresponds to a learning problem, performing well on a task corresponds to learning quickly.
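The update rule described in the excerpt can be sketched in a few lines of NumPy. The quadratic toy tasks and all hyperparameters below are illustrative assumptions, not values from the paper:

```python
import numpy as np

def sgd_on_task(params, task_grad_fn, steps=32, lr=0.02):
    """Plain SGD on one sampled task, starting from the shared init."""
    p = params.copy()
    for _ in range(steps):
        p = p - lr * task_grad_fn(p)
    return p

def reptile_step(params, sample_task, outer_lr=0.1):
    """One Reptile meta-update: sample a task, train on it, then move
    the initialization toward the task-adapted parameters."""
    task_grad_fn = sample_task()
    adapted = sgd_on_task(params, task_grad_fn)
    return params + outer_lr * (adapted - params)

# Toy task distribution: each task is a quadratic pulling the parameters
# toward a random target (gradient of ||p - target||^2).
rng = np.random.default_rng(0)
def sample_task():
    target = rng.normal(size=3)
    return lambda p: 2.0 * (p - target)

params = np.zeros(3)
for _ in range(100):
    params = reptile_step(params, sample_task)
```

Unlike MAML, no second-order gradients are needed: the meta-update is just a step along the difference between the adapted and initial parameters.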
ai  learning  research  psychology 
11 days ago by janpeuker
How to Make A.I. That’s Good for People - The New York Times
Sometimes this difference is trivial. For instance, in my lab, an image-captioning algorithm once fairly summarized a photo as “a man riding a horse” but failed to note the fact that both were bronze sculptures. Other times, the difference is more profound, as when the same algorithm described an image of zebras grazing on a savanna beneath a rainbow. While the summary was technically correct, it was entirely devoid of aesthetic awareness, failing to detect any of the vibrancy or depth a human would naturally appreciate.

That may seem like a subjective or inconsequential critique, but it points to a major aspect of human perception beyond the grasp of our algorithms. How can we expect machines to anticipate our needs — much less contribute to our well-being — without insight into these “fuzzier” dimensions of our experience?
ai  article  psychology 
13 days ago by janpeuker
12 Useful Things to Know about Machine Learning – James Le – Medium
10 — Simplicity Does Not Imply Accuracy
Occam’s razor famously states that entities should not be multiplied beyond necessity. In machine learning, this is often taken to mean that, given two classifiers with the same training error, the simpler of the two will likely have the lowest test error. Purported proofs of this claim appear regularly in the literature, but in fact there are many counter-examples to it, and the “no free lunch” theorems imply it cannot be true.
ai  research  bias 
14 days ago by janpeuker
The Building Blocks of Interpretability
In our view, features do not need to be flawless detectors for it to be useful for us to think about them as such. In fact, it can be interesting to identify when a detector misfires.

With regards to attribution, recent work suggests that many of our current techniques are unreliable. One might even wonder if the idea is fundamentally flawed, since a function’s output could be the result of non-linear interactions between its inputs. One way these interactions can pan out is as attribution being “path-dependent”. A natural response to this would be for interfaces to explicitly surface this information: how path-dependent is the attribution? A deeper concern, however, would be whether this path-dependency dominates the attribution.
ai  documentation  Emergence 
16 days ago by janpeuker
ND4J: N-Dimensional Arrays for Java - N-Dimensional Scientific Computing for Java
A usability gap has separated Java, Scala and Clojure programmers from the most powerful tools in data analysis, like NumPy or Matlab. Libraries like Breeze don’t support n-dimensional arrays, or tensors, which are necessary for deep learning and other tasks. Libraries like Colt and Parallel Colt use or have dependencies with GPL in the license, making them unsuitable for commercial use. ND4J and ND4S are used by national laboratories such as NASA JPL for tasks such as climatic modeling, which require computationally intensive simulations.
ai  java  scala  Python  library 
19 days ago by janpeuker
Notes on Gartner’s 2018 Data Science and Machine Learning MQ | ML/DL
While Apache Spark remains the go-to tool for data engineering and application development, interest among data scientists peaked a year or so ago. TensorFlow is now the cool kid on the block. We’re also seeing renewed interest in Caffe/Caffe2, due to the hot market for image classification and recognition.

Yeah, I know. I forgot PyTorch.

Apache Flink has solid use cases in stream processing, but its champions no longer bother to say it’s a tool for machine learning. Here’s a bye-ku for Flink:

Ten guys in Berlin

Thought Flink would eat the world, but

Budding users yawned

We can also drop Mahout and Pig from the chart. And now that Neo4J has a Spark backend, you can stick a fork in GraphX. Please.
analytics  ai  opensource 
24 days ago by janpeuker
Preparing for Malicious Uses of AI
AI is a technology capable of immensely positive and immensely negative applications. We should take steps as a community to better evaluate research projects for perversion by malicious actors, and engage with policymakers to understand areas of particular sensitivity. As we write in the paper: “Surveillance tools can be used to catch terrorists or oppress ordinary citizens. Information content filters could be used to bury fake news or manipulate public opinion. Governments and powerful private actors will have access to many of these AI tools and could use them for public good or harm.” Some potential solutions to these problems include pre-publication risk assessments for certain bits of research, selectively sharing some types of research with a significant safety or security component among a small set of trusted organizations, and exploring how to embed norms into the scientific community that are responsive to dual-use concerns.
ai  security  future  research 
28 days ago by janpeuker
t-SNE – Laurens van der Maaten
t-Distributed Stochastic Neighbor Embedding (t-SNE) is a (prize-winning) technique for dimensionality reduction that is particularly well suited for the visualization of high-dimensional datasets. The technique can be implemented via Barnes-Hut approximations, allowing it to be applied on large real-world datasets. We applied it on data sets with up to 30 million examples. The technique and its variants are introduced in the following papers:
visualization  ai  algorithm 
4 weeks ago by janpeuker
The Benjamin Franklin Method of Reading Programming Books | Path-Sensitive
This process is a little bit like being a human autoencoder. An autoencoder is a neural network that tries to produce output the same as its input, but passing through an intermediate layer which is too small to fully represent the data. In doing so, it’s forced to learn a more compact representation. Here, the neural net in question is that den of dendrons in your head.

K. Anders Ericsson likens it to how artists practice by trying to imitate some famous work. Mathematicians are taught to attempt to prove most theorems themselves when reading a book or paper --- even if they can’t, they’ll have an easier time compressing the proof to its basic insight. I used this process to get a better eye for graphical design; it was like LASIK.

But the basic idea, applied to programming books, is particularly simple yet effective.

Here’s how it works:
Read your programming book as normal. When you get to a code sample, read it over.

Then close the book.

Then try to type it up.
learning  book  ai 
4 weeks ago by janpeuker
Attention and Augmented Recurrent Neural Networks
Neural Turing Machines [2] combine a RNN with an external memory bank. Since vectors are the natural language of neural networks, the memory is an array of vectors:

Memory is an array of vectors.
Network A writes and reads from this memory each step.
But how does reading and writing work? The challenge is that we want to make them differentiable. In particular, we want to make them differentiable with respect to the location we read from or write to, so that we can learn where to read and write. This is tricky because memory addresses seem to be fundamentally discrete. NTMs take a very clever solution to this: every step, they read and write everywhere, just to different extents.

As an example, let’s focus on reading. Instead of specifying a single location, the RNN outputs an “attention distribution” that describes how we spread out the amount we care about different memory positions. As such, the result of the read operation is a weighted sum.
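The weighted-sum read described above can be made concrete in a few lines; the memory contents and controller scores below are made-up toy values:

```python
import numpy as np

# Memory: N slots, each a vector (the "natural language" of the network).
memory = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [0.5, 0.5]])

# The controller emits scores; a softmax turns them into an attention
# distribution over every memory position.
scores = np.array([2.0, 0.1, 0.1])
weights = np.exp(scores) / np.exp(scores).sum()

# "Read everywhere, just to different extents": the result is a weighted
# sum over all slots, differentiable with respect to the weights.
read_vector = weights @ memory
```

Because every slot contributes (weighted by attention) rather than a single discrete address being selected, gradients can flow back through the read location.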
ai  algorithm 
6 weeks ago by janpeuker
Embed, encode, attend, predict: The new deep learning formula for state-of-the-art NLP models | Blog | Explosion AI
An embedding table maps long, sparse, binary vectors into shorter, dense, continuous vectors. For example, imagine we receive our text as a sequence of ASCII characters. There are 256 possible values, so we can represent each value as a binary vector with 256 dimensions. The value for a will be a vector of 0s, with a 1 at column 97, while the value for b will be a vector of zeros with a 1 at column 98. This is called the "one hot" encoding scheme. Different values receive entirely different vectors.
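A minimal sketch of the scheme described above; the embedding dimension and the random table are arbitrary choices for illustration:

```python
import numpy as np

VOCAB = 256     # one slot per ASCII value
EMBED_DIM = 8   # arbitrary dense-vector size

def one_hot(char):
    """256-dim binary vector with a single 1 at the character's code."""
    v = np.zeros(VOCAB)
    v[ord(char)] = 1.0
    return v

# An embedding table maps the long sparse vector to a short dense one.
rng = np.random.default_rng(0)
table = rng.normal(size=(VOCAB, EMBED_DIM))

dense_a = one_hot('a') @ table   # equivalent to looking up table[97]
dense_b = one_hot('b') @ table   # equivalent to looking up table[98]
```

The matrix product makes explicit that an embedding lookup is just multiplication of a one-hot vector with the table; in practice libraries index the row directly.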

Most neural network models begin by tokenising the text into words, and embedding the words into vectors. Other models extend the word vector representation with other information. For instance, it's often useful to pass forward a sequence of part-of-speech tags, in addition to the word IDs. You can then learn tag embeddings, and concatenate the tag embedding to the word embedding. This lets you push some amount of position-sensitive information into the word representation. However, there's a much more powerful way to make the word representations context-specific.

Step 2: Encode
Given a sequence of word vectors, the encode step computes a representation that I'll call a sentence matrix, where each row represents the meaning of each token in the context of the rest of the sentence.

The technology used for this purpose is a bidirectional RNN. Both LSTM and GRU architectures have been shown to work well for this. The vector for each token is computed in two parts: one part by a forward pass, and another part by a backward pass. To get the full vector, we simply stick the two together.
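The two-pass encoding can be sketched with a bare tanh RNN standing in for the LSTM/GRU; the sizes and random weights here are arbitrary:

```python
import numpy as np

def rnn_pass(inputs, W, U):
    """Minimal tanh RNN; returns the hidden state at every step."""
    h = np.zeros(U.shape[0])
    states = []
    for x in inputs:
        h = np.tanh(W @ x + U @ h)
        states.append(h)
    return np.stack(states)

rng = np.random.default_rng(1)
D, H = 4, 3                                   # word-vector and hidden sizes
words = [rng.normal(size=D) for _ in range(5)]
W, U = rng.normal(size=(H, D)), rng.normal(size=(H, H))

forward = rnn_pass(words, W, U)               # left-to-right context
backward = rnn_pass(words[::-1], W, U)[::-1]  # right-to-left, realigned
sentence_matrix = np.concatenate([forward, backward], axis=1)  # (5, 2H)
```

Each row of `sentence_matrix` sticks the two halves together, so every token's vector carries context from both directions.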
algorithm  ai 
6 weeks ago by janpeuker
Understanding LSTM Networks -- colah's blog
Attention isn’t the only exciting thread in RNN research. For example, Grid LSTMs by Kalchbrenner, et al. (2015) seem extremely promising. Work using RNNs in generative models – such as Gregor, et al. (2015), Chung, et al. (2015), or Bayer & Osendorfer (2015) – also seems very interesting. The last few years have been an exciting time for recurrent neural networks, and the coming ones promise to only be more so!
ai  algorithm  mathematics 
6 weeks ago by janpeuker
How to build your own AlphaZero AI using Python and Keras
This file contains the Residual_CNN class, which defines how to build an instance of the neural network.

It uses a condensed version of the neural network architecture in the AlphaGoZero paper — i.e. a convolutional layer, followed by many residual layers, then splitting into a value and policy head.

The depth and number of convolutional filters can be specified in the config file.

The Keras library is used to build the network, with a backend of Tensorflow.

To view individual convolutional filters and densely connected layers in the neural network, run the following inside the run.ipynb notebook:
ai  Python  howto 
7 weeks ago by janpeuker
Ethics in Machine Learning – Roya Pakzad – Medium
The other issue is the need for collaboration between social scientists and AI researchers. You know, you can’t expect AI researchers themselves to come up with a clear understanding of fairness. Not only do we need people in the social sciences to collaborate with us in defining these words, but we also need to keep this collaboration going through to the end of product research and development.

“One very important issue is the lack of a concrete definition of fairness.”
But it’s important to note that some collaborations between AI researchers and social scientists are already underway. For example, Solon Barocas (Cornell University) and Moritz Hardt at UC Berkeley have been working on the issue of defining and modeling fairness in active collaboration with social scientists.
society  ai  philosophy 
8 weeks ago by janpeuker
One model to learn them all | the morning paper
We’d need to be able to support different input and output modalities (as required by the task in hand), we’d need a common representation of the learned knowledge that was shared across all of these modalities, and we’d need sufficient ‘apparatus’ such that tasks which need a particular capability (e.g. attention) are able to exploit it. ‘One model to rule them all’ introduces a MultiModel architecture with exactly these features, and it performs impressively well.
ai  Architecture 
9 weeks ago by janpeuker
Turning Design Mockups Into Code With Deep Learning - FloydHub Blog
LSTMs are a lot heavier for my cognition compared to CNNs. When I unrolled all the LSTMs they became easier to understand. Fast.ai’s video on RNNs was super useful. Also, focus on the input and output features before you try understanding how they work.
Building a vocabulary from the ground up is a lot easier than narrowing down a huge vocabulary. This includes everything from fonts, div sizes, hex colors to variable names and normal words.
Most of the libraries are created to parse text documents and not code. In documents, everything is separated by a space, but in code, you need custom parsing.
You can extract features with a model that’s trained on Imagenet. This might seem counterintuitive since Imagenet has few web images. However, the loss is 30% higher compared to a pix2code model, which is trained from scratch. It would be interesting to use a pre-trained inception-resnet type of model based on web screenshots.
ai  design  engineering 
10 weeks ago by janpeuker
Google and Others Are Building AI Systems That Doubt Themselves - MIT Technology Review
The work reflects the realization that uncertainty is a key aspect of human reasoning and intelligence. Adding it to AI programs could make them smarter and less prone to blunders, says Zoubin Ghahramani, a prominent AI researcher who is a professor at the University of Cambridge and chief scientist at Uber.

This may prove vitally important as AI systems are used in ever more critical scenarios. “We want to have a rock-solid framework for deep learning, but make it easier for people to represent uncertainty,” Ghahramani told me recently over coffee one morning during a major AI conference in Long Beach, California.

Pyro is a new programming language released by Uber that merges deep learning with probabilistic programming.
library  ai  psychology  Emergence 
10 weeks ago by janpeuker
A gentle introduction to genetic algorithms | sausheong's space
More technically speaking, mutations get us out of a local maximum in order to find the global maximum. If we look at genetic algorithms as a mechanism to find the optimal solution, if we don’t have mutation, once a local maximum is found the mechanism will simply settle on that and never moves on to find the global maximum. Mutations can jolt the population out of a local maximum and therefore provide an opportunity for the algorithm to continue looking for the global maximum.
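The role of mutation can be seen in a toy bit-string optimizer (selection and mutation only, no crossover); the genome length, rates, and fitness function are all arbitrary illustrations:

```python
import random

def mutate(genome, rate=0.05):
    """Flip each bit with a small probability; these random jolts are
    what lets the population escape a local maximum."""
    return [1 - g if random.random() < rate else g for g in genome]

def fitness(genome):
    return sum(genome)  # toy objective: count of 1s (global max = all 1s)

random.seed(0)
population = [[random.randint(0, 1) for _ in range(20)] for _ in range(30)]
for _ in range(50):
    # Keep the fittest individuals, then breed mutated copies of them.
    population.sort(key=fitness, reverse=True)
    parents = population[:10]
    population = [mutate(random.choice(parents)) for _ in range(30)]

best = max(population, key=fitness)
```

With the mutation line removed, every generation would be an exact copy of the current best individuals, and the search would stall at whatever optimum the initial population happened to reach.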
algorithm  Emergence  ai 
10 weeks ago by janpeuker
LeCun vs Rahimi: Has Machine Learning Become Alchemy?
LeCun agreed with Rahimi’s views on pedagogy, saying “Simple and general theorems are good… but it could very well be that we won’t have ‘simple’ theorems that are more specific to neural networks, for the same reasons we don’t have analytical solutions of Navier-Stokes or the 3-body problem.”

The Rahimi — LeCun debate grew into a wide-ranging discussion at NIPS and on the internet. Dr. Yiran Chen, Director of the Duke Center of Evolutionary Lab, attempted to make peace, suggesting LeCun had overreacted, and that the opposing positions were actually not so contradictory.
philosophy  ai  research 
10 weeks ago by janpeuker
Transfer Learning - Machine Learning's Next Frontier
In the real world, however, we would like an agent to be able to deal with tasks that gradually become more complex by leveraging its past experience. To this end, we need to enable a model to learn continuously without forgetting. This area of machine learning is known as learning to learn [36], meta-learning, life-long learning, or continuous learning.
ai  psychology  research 
11 weeks ago by janpeuker
Multivariate Linear Regression, Gradient Descent in JavaScript - RWieruch
Multivariate Gradient Descent (Vectorized) in JavaScript
Now it is time to implement the gradient descent algorithm to train the theta parameters of the hypothesis function. The hypothesis function can be used later on to predict future housing prices by their number of bedrooms and size. If you recall from the introductory article about gradient descent, the algorithm takes a learning rate alpha and an initial definition of the theta parameters for the hypothesis. After an amount of iterations, it returns the trained theta parameters.
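The loop the article builds in JavaScript looks like this in vectorized NumPy (Python is used here for consistency with the other snippets; the toy housing data is invented):

```python
import numpy as np

def gradient_descent(X, y, alpha=0.05, iterations=20000):
    """Vectorized batch gradient descent for the linear hypothesis
    h(x) = X @ theta.  X is (m, n) with a leading column of 1s."""
    m, n = X.shape
    theta = np.zeros(n)                     # initial theta parameters
    for _ in range(iterations):
        gradient = X.T @ (X @ theta - y) / m
        theta -= alpha * gradient           # learning rate alpha
    return theta

# Invented data: price = 50 + 10 * bedrooms + 20 * size (100s of sq ft)
X = np.array([[1.0, 1, 2],
              [1.0, 2, 1],
              [1.0, 3, 3],
              [1.0, 4, 2]])
y = np.array([100.0, 90.0, 140.0, 130.0])
theta = gradient_descent(X, y)
prediction = np.array([1.0, 3, 2]) @ theta  # 3 bedrooms, 200 sq ft
```

After enough iterations the trained theta parameters recover the generating coefficients, and the hypothesis function can then predict a price for an unseen bedroom/size combination.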
javascript  howto  ai 
11 weeks ago by janpeuker
Neuroevolution: A different kind of deep learning - O'Reilly Media
deep learning traditionally focuses on programming an ANN to learn, while the concern in neuroevolution focuses on the origin of the architecture of the brain itself, which may encompass what is connected to what, the weights of those connections, and (sometimes) how those connections are allowed to change. There is, of course, some overlap between the two fields—an ANN still needs connection weights suited to its task, whether evolved or not, and it's possible that evolved ANNs might leverage the methods used in deep learning (for instance, stochastic gradient descent) to obtain those weights. In fact, deep learning might even be viewed as a sibling of neuroevolution that studies how weights are learned within either an evolved or preconceived architecture.

However, it's also conceivable that the mechanism of learning itself could be evolved, potentially transcending or elaborating the conventional techniques of deep learning as well. In short, the brain—including its architecture and how it learns—is a product of natural evolution, and neuroevolution can probe all the factors that contribute to its emergence, or borrow some from deep learning and let evolution determine the rest.
psychology  ai  Emergence 
december 2017 by janpeuker
How Adversarial Attacks Work
The simplest yet still very efficient algorithm is known as Fast Gradient Step Method (FGSM). The core idea is to add some weak noise on every step of optimization, drifting towards the desired class — or, if you wish, away from the correct one. Sometimes we will have to limit the amplitude of noise to keep the attack subtle — for example, in case a human might be investigating our shenanigans. The amplitude in our case means the intensity of a pixel’s channel — limiting it ensures that the noise will be almost imperceptible, and in the most extreme case will look like an overly compressed JPEG.
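The core of the step is a one-liner; the image and gradient below are random stand-ins, since computing a real loss gradient would need a trained model:

```python
import numpy as np

def fgsm_perturb(image, grad, epsilon=0.01):
    """Fast Gradient Sign Method step: nudge every pixel by epsilon in
    the direction of the loss gradient's sign, then clip to the valid
    range.  Limiting epsilon keeps the noise nearly imperceptible."""
    adversarial = image + epsilon * np.sign(grad)
    return np.clip(adversarial, 0.0, 1.0)

rng = np.random.default_rng(0)
image = rng.uniform(size=(8, 8))   # stand-in for a normalized image
grad = rng.normal(size=(8, 8))     # stand-in for dLoss/dImage
adv = fgsm_perturb(image, grad, epsilon=0.01)
```

To drift toward a desired class instead of away from the correct one, the gradient of the loss with respect to the target label is used and the sign step is subtracted rather than added.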
security  ai  research 
december 2017 by janpeuker
Feature Visualization
Neural feature visualization has made great progress over the last few years. As a community, we’ve developed principled ways to create compelling visualizations. We’ve mapped out a number of important challenges and found ways of addressing them.

In the quest to make neural networks interpretable, feature visualization stands out as one of the most promising and developed research directions. By itself, feature visualization will never give a completely satisfactory understanding. We see it as one of the fundamental building blocks that, combined with additional tools, will empower humans to understand these systems.
ai  visualization  research  Emergence 
december 2017 by janpeuker
Crapularity Hermeneutics
Compared with 1970s/1980s database dragnets, contemporary big data analytics have only become even more speculative, since their focus is no longer on drawing conclusions for the present from the past, but on guessing the future, and since they no longer target people based on the fact that their data matches other database records but instead based on more speculative statistical probabilities of environmental factors and behavioral patterns. Whether or not human-created (and hence human-tainted) data is to be blamed for discrimination, or for the hidden assumptions hard-coded into algorithms that are employed for processing this data – or whether machine-generated data can even be biased – they all confirm Cayley’s observation that language is “easy to capture but difficult to read”;
The “open society” is now better known under the name coined by Popper’s Mont Pelerin Society collaborator Alexander Rüstow, “neoliberalism”,90 which has historically proven to be able to falsify anything but itself.

This explains the resurgence of fascism and other forms of populism in the context of the crapularity. On the basis of Carl Schmitt’s political theology, populism offers a more honest alternative to the existing regime: against equilibrium promises and crapular reality, the proposed antidote is the state of exception; against invisible hands, the remedy is decision-making as a virtue in itself, what Schmitt referred to as “decisionism”.91 In other words, the states of exception and decisionism that various “systems” (from international political treaties to big data analytics) and post-democratic powers currently conceal, seem to become tangible and accountable again through populist re-embodiment.
philosophy  ai  cybernetics  analytics 
december 2017 by janpeuker
How Cargo Cult Bayesians encourage Deep Learning Alchemy
Perhaps there is an equivalent to this in deep learning? “Every time you fire a statistician or Bayesian, then the performance of your deep learning system goes up.” ;-) The insinuation of Jelinek’s quote is that premature ideas of how complex systems work can be detrimental to its performance. We understand this in computer science as premature optimization, where if we pre-maturely optimize a subcomponent it can become a performance bottleneck later.
The legendary Isaac Newton was in fact very involved in alchemy. Here’s an image of his manuscript on the subject of transmutation for gold:
mathematics  algorithm  ai  history  Emergence 
december 2017 by janpeuker
Understanding Hinton’s Capsule Networks. Part I: Intuition.
Inspired by this idea, Hinton argues that brains, in fact, do the opposite of rendering. He calls it inverse graphics: from visual information received by eyes, they deconstruct a hierarchical representation of the world around us and try to match it with already learned patterns and relationships stored in the brain. This is how recognition happens. And the key idea is that representation of objects in the brain does not depend on view angle.
ai  psychology  research 
december 2017 by janpeuker
Machine Learning for Creativity and Design | NIPS 2017 Workshop, Long Beach, California, USA
We will look at algorithms for generation and creation of new media and new designs, engaging researchers building the next generation of generative models (GANs, RL, etc) and also from a more information-theoretic view of creativity (compression, entropy, etc). We will investigate the social and cultural impact of these new models, engaging researchers from HCI/UX communities.
conference  ai  innovation  art 
december 2017 by janpeuker
Understand Deep Residual Networks — a simple, modular learning framework that has redefined state…
Let us consider a shallower architecture and its deeper counterpart that adds more layers onto it. There exists a solution to the deeper model by construction: the layers are copied from the learned shallower model, and the added layers are identity mapping. The existence of this constructed solution indicates that a deeper model should produce no higher training error than its shallower counterpart.
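The constructed solution is easy to verify numerically: with the residual branch's weights at zero, a block computes exactly the identity. A minimal two-layer residual block (the ReLU form and sizes are arbitrary, NumPy only):

```python
import numpy as np

def residual_block(x, W1, W2):
    """y = x + F(x): the skip connection makes identity the default."""
    return x + W2 @ np.maximum(0.0, W1 @ x)  # F is ReLU(W1 x) fed to W2

dim = 4
x = np.array([1.0, -2.0, 0.5, 3.0])

# Zero weights reproduce the constructed solution: the added layers act
# as an identity mapping, so depth alone cannot raise training error.
W_zero = np.zeros((dim, dim))
identity_out = residual_block(x, W_zero, W_zero)
```

A plain (non-residual) stack would instead have to *learn* the identity through its nonlinearities, which is exactly the difficulty the excerpt argues residual connections remove.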
ai  algorithm  performance 
november 2017 by janpeuker
Solving Logistic Regression with Newton's Method
I like to think of the likelihood function as “the likelihood that our model will correctly predict any given y value, given its corresponding feature vector x̂”. It is, however, important to distinguish between probability and likelihood.

Now, we expand our likelihood function by applying it to every sample in our training data. We multiply each individual likelihood together to get the cumulative likelihood that our model is accurately predicting y values of our training data:
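The cumulative likelihood described above can be sketched for logistic regression (toy data; Newton's method itself is omitted):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def likelihood(theta, X, y):
    """Cumulative likelihood over the training set: the product of each
    sample's probability of its observed label under the model."""
    p = sigmoid(X @ theta)                 # P(y=1 | x) for every sample
    return np.prod(np.where(y == 1, p, 1.0 - p))

# Toy data: a bias column plus one feature; labels follow the feature.
X = np.array([[1.0, 2.0], [1.0, -1.0], [1.0, 3.0]])
y = np.array([1, 0, 1])

better = likelihood(np.array([0.0, 1.0]), X, y)   # aligned with the data
worse = likelihood(np.array([0.0, -1.0]), X, y)   # opposed to the data
```

Maximizing this product (in practice, its logarithm) over theta is the objective that Newton's method is then applied to.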
mathematics  howto  ai  algorithm 
november 2017 by janpeuker
TensorFlow and deep learning, without a PhD
ai  howto  library 
november 2017 by janpeuker
Colaboratory – Google
Colaboratory is a research project created to help disseminate machine learning education and research. It’s a Jupyter notebook environment that requires no setup to use. For more information, see our FAQ.
Python  visualization  ai  google 
november 2017 by janpeuker
[1711.00165] Deep Neural Networks as Gaussian Processes
In this work, we derive this correspondence and develop a computationally efficient pipeline to compute the covariance functions. We then use the resulting GP to perform Bayesian inference for deep neural networks on MNIST and CIFAR-10. We find that the GP-based predictions are competitive and can outperform neural networks trained with stochastic gradient descent.
ai  algorithm  research 
november 2017 by janpeuker
[1710.08864] One pixel attack for fooling deep neural networks
73.8% of the test images can be crafted into adversarial images by modifying just one pixel, with 98.7% confidence on average. In addition, it is known that investigating the robustness problem of DNNs can bring critical clues for understanding the geometrical features of the DNN decision map in high-dimensional input space.
security  visualization  ai 
october 2017 by janpeuker
NeuroEvolution with MarI/O
Seth’s implementation (in Lua) is based on the concept of NeuroEvolution of Augmenting Topologies (or NEAT). NEAT is a type of genetic algorithm which generates efficient artificial neural networks (ANNs) from a very simple starting network. It does so rather quickly too (compared to other evolutionary algorithms).
ai  games  Emergence 
october 2017 by janpeuker
Edward – Home
A library for probabilistic modeling, inference, and criticism.

Edward is a Python library for probabilistic modeling, inference, and criticism. It is a testbed for fast experimentation and research with probabilistic models, ranging from classical hierarchical models on small data sets to complex deep probabilistic models on large data sets. Edward fuses three fields: Bayesian statistics and machine learning, deep learning, and probabilistic programming.

It supports modeling with

Directed graphical models
Neural networks (via libraries such as Keras and TensorFlow Slim)
Implicit generative models
Bayesian nonparametrics and probabilistic programs
ai  engineering  mathematics  model 
october 2017 by janpeuker
Machine Learning FAQ
Random Forests vs. SVMs

I would say that random forests are probably THE “worry-free” approach - if such a thing exists in ML: There are no real hyperparameters to tune (maybe except for the number of trees; typically, the more trees we have the better). On the contrary, there are a lot of knobs to be turned in SVMs: Choosing the “right” kernel, regularization penalties, the slack variable, …

Both random forests and SVMs are non-parametric models (i.e., the complexity grows as the number of training samples increases). Training a non-parametric model can thus be more expensive, computationally, compared to a generalized linear model, for example. The more trees we have, the more expensive it is to build a random forest. Also, we can end up with a lot of support vectors in SVMs; in the worst-case scenario, we have as many support vectors as we have samples in the training set. Although there are multi-class SVMs, the typical implementation for multi-class classification is One-vs.-All; thus, we have to train an SVM for each class – in contrast to decision trees or random forests, which can handle multiple classes out of the box.

To summarize, random forests are much simpler to train for a practitioner; it’s easier to find a good, robust model. The complexity of a random forest grows with the number of trees in the forest, and the number of training samples we have. In SVMs, we typically need to do a fair amount of parameter tuning, and in addition to that, the computational cost grows linearly with the number of classes as well.
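The One-vs.-All scheme mentioned above can be sketched in a few lines of pure Python; here a toy nearest-centroid scorer stands in for each binary SVM, and all data and names are hypothetical:

```python
from math import dist  # Python 3.8+

def centroid(points):
    """Component-wise mean of a list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def train_one_vs_rest(data, classes):
    """Train one binary scorer per class (a toy centroid model standing
    in for a binary SVM). data is a list of (features, label) pairs."""
    scorers = {}
    for k in classes:
        pos = centroid([x for x, y in data if y == k])
        neg = centroid([x for x, y in data if y != k])
        # higher score = more confident the point belongs to class k
        scorers[k] = lambda x, p=pos, n=neg: dist(x, n) - dist(x, p)
    return scorers

def predict(scorers, x):
    """Multi-class prediction: the class whose binary scorer wins."""
    return max(scorers, key=lambda k: scorers[k](x))
```

Note that one scorer is trained per class, which is exactly why the cost grows linearly with the number of classes.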
ai  howto  algorithm 
october 2017 by janpeuker
Research Blog: TensorFlow Lattice: Flexibility Empowered by Prior Knowledge
We take advantage of the look-up table’s structure, which can be keyed by multiple inputs to approximate an arbitrarily flexible relationship, to satisfy monotonic relationships that you specify in order to generalize better. That is, the look-up table values are trained to minimize the loss on the training examples, but in addition, adjacent values in the look-up table are constrained to increase along given directions of the input space, which makes the model outputs increase in those directions.
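The adjacency constraint can be imitated with a crude monotone projection over a 1-D look-up table; this is only an illustrative sketch, not the TensorFlow Lattice API (a real projection would use isotonic regression):

```python
def project_monotone(table):
    """Clamp adjacent look-up table values so they never decrease
    along the input direction (a crude monotone projection, applied
    e.g. after each gradient update)."""
    out = list(table)
    for i in range(1, len(out)):
        if out[i] < out[i - 1]:
            out[i] = out[i - 1]
    return out

def lattice_eval(table, x):
    """Piecewise-linear interpolation into a 1-D look-up table
    keyed by x in [0, 1]."""
    pos = x * (len(table) - 1)
    i = min(int(pos), len(table) - 2)
    frac = pos - i
    return table[i] * (1 - frac) + table[i + 1] * frac
```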
ai  google  library  analytics 
october 2017 by janpeuker
CS231n Convolutional Neural Networks for Visual Recognition
These notes accompany the Stanford CS class CS231n: Convolutional Neural Networks for Visual Recognition.
For questions/concerns/bug reports contact Justin Johnson regarding the assignments, or contact Andrej Karpathy regarding the course notes. You can also submit a pull request directly to our git repo.
We encourage the use of the hypothes.is extension to annotate comments and discuss these notes inline.
ai  howto 
october 2017 by janpeuker
The Unreasonable Effectiveness of Recurrent Neural Networks
Viewed this way, RNNs essentially describe programs. In fact, it is known that RNNs are Turing-Complete in the sense that they can simulate arbitrary programs (with proper weights). But similar to universal approximation theorems for neural nets you shouldn’t read too much into this. In fact, forget I said anything.
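The “RNNs describe programs” intuition in miniature: a vanilla RNN cell applies the same weights at every step, like a loop body. The weights below are hypothetical placeholders:

```python
import math

def rnn_step(x, h, Wxh, Whh, bh):
    """One vanilla RNN step: h' = tanh(Wxh.x + Whh.h + bh).
    Vectors are lists; matrices are lists of rows."""
    def matvec(M, v):
        return [sum(m * vi for m, vi in zip(row, v)) for row in M]
    pre = [a + b + c for a, b, c in zip(matvec(Wxh, x), matvec(Whh, h), bh)]
    return [math.tanh(p) for p in pre]

def run_rnn(inputs, h0, Wxh, Whh, bh):
    """Unroll the cell over a sequence: the same weights are reused at
    every step, which is what lets an RNN express loop-like programs."""
    h = h0
    for x in inputs:
        h = rnn_step(x, h, Wxh, Whh, bh)
    return h
```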
ai  learning  engineering  Emergence  reference 
october 2017 by janpeuker
DAOs, DACs, DAs and More: An Incomplete Terminology Guide - Ethereum Blog
an AI is completely autonomous, whereas a DAO still requires heavy involvement from humans specifically interacting according to a protocol defined by the DAO in order to operate. We can classify DAOs, DOs (and plain old Os), AIs and a fourth category, plain old robots, according to a good old quadrant chart, with another quadrant chart to classify entities that do not have internal capital thus altogether making a cube:

DAOs == automation at the center, humans at the edges. Thus, on the whole, it makes most sense to see Bitcoin and Namecoin as DAOs, albeit ones that barely cross the threshold from the DA mark.
economics  ai  blockchain  reference 
october 2017 by janpeuker
[1705.07962] pix2code: Generating Code from a Graphical User Interface Screenshot
Transforming a graphical user interface screenshot created by a designer into computer code is a typical task conducted by a developer in order to build customized software, websites, and mobile applications. In this paper, we show that deep learning methods can be leveraged to train a model end-to-end to automatically generate code from a single input image with over 77% of accuracy for three different platforms (i.e. iOS, Android and web-based technologies).
gui  ai  research  design 
october 2017 by janpeuker
ML Algorithms addendum: Passive Aggressive Algorithms - Giuseppe Bonaccorso
Temporal - Time-based Algorithm

Crammer K., Dekel O., Keshet J., Shalev-Shwartz S., Singer Y., Online Passive-Aggressive Algorithms, Journal of Machine Learning Research 7 (2006) 551–585
ai  research 
october 2017 by janpeuker
Forget Killer Robots—Bias Is the Real AI Danger - MIT Technology Review
The problem of bias in machine learning is likely to become more significant as the technology spreads to critical areas like medicine and law, and as more people without a deep technical understanding are tasked with deploying it. Some experts warn that algorithmic bias is already pervasive in many industries, and that almost no one is making an effort to identify or correct it (see “Biased Algorithms Are Everywhere, and No One Seems to Care”).
psychology  bias  ai  analytics 
october 2017 by janpeuker
The Seven Deadly Sins of AI Predictions - MIT Technology Review
It turns out that many AI researchers and AI pundits, especially those pessimists who indulge in predictions about AI getting out of control and killing people, are similarly imagination-challenged. They ignore the fact that if we are able to eventually build such smart devices, the world will have changed significantly by then. We will not suddenly be surprised by the existence of such super-intelligences. They will evolve technologically over time, and our world will come to be populated by many other intelligences, and we will have lots of experience already. Long before there are evil super-intelligences that want to get rid of us, there will be somewhat less intelligent, less belligerent machines. Before that, there will be really grumpy machines. Before that, quite annoying machines. And before them, arrogant, unpleasant machines. We will change our world along the way, adjusting both the environment for new technologies and the new technologies themselves. I am not saying there may not be challenges. I am saying that they will not be sudden and unexpected, as many people think.
ai  research  article 
october 2017 by janpeuker
Is AI Riding a One-Trick Pony? - MIT Technology Review
Neural nets are just thoughtless fuzzy pattern recognizers, and as useful as fuzzy pattern recognizers can be—hence the rush to integrate them into just about every kind of software—they represent, at best, a limited brand of intelligence, one that is easily fooled. A deep neural net that recognizes images can be totally stymied when you change a single pixel, or add visual noise that’s imperceptible to a human. Indeed, almost as often as we’re finding new ways to apply deep learning, we’re finding more of its limits. Self-driving cars can fail to navigate conditions they’ve never seen before. Machines have trouble parsing sentences that demand common-sense understanding of how the world works.

Deep learning in some ways mimics what goes on in the human brain, but only in a shallow way—which perhaps explains why its intelligence can sometimes seem so shallow. Indeed, backprop wasn’t discovered by probing deep into the brain, decoding thought itself; it grew out of models of how animals learn by trial and error in old classical-conditioning experiments. And most of the big leaps that came about as it developed didn’t involve some new insight about neuroscience; they were technical improvements, reached by years of mathematics and engineering. What we know about intelligence is nothing against the vastness of what we still don’t know.
ai  research  psychology  algorithm 
october 2017 by janpeuker
Numenta.com • Guest Post: Behind the Idea – HTM Based Autonomous Agent
As a firm believer in the power of video games as a communication tool, my main goal was to explore the feasibility of an HTM based game agent which can explore its environment and learn behaviors that are rewarding. The literature is almost non-existent on an unsupervised HTM based autonomous agent. I proposed a real-time agent architecture involving a hierarchy of HTM layers that can learn action sequences with respect to the stimulated reward. This agent navigates a procedurally generated 3D environment and models the patterns streaming onto its visual sensor shown in Figures 1 and 2.
games  ai  research 
october 2017 by janpeuker
Bizarre interview with Tim O’Reilly where he praises Chinese authoritarianism and lauds Jeff Bezos
ai  culture  politics  from twitter_favs
october 2017 by janpeuker
How Computers Do Genocide
SHIBBOLETH MACHINES: Simulations of our machines show initial levels of apparently random behavior giving way, around generation 300, to high rates of cooperation that coincide with near-complete domination by a single machine that drives others to extinction. This enforced cooperation collapses around generation 450. From then on, the system alternates between these two extremes. Green and yellow bands correspond to eras of high and low cooperation, respectively.
Francis Fukuyama might have been thinking along these lines when he penned his end-of-history thesis in 1992. Though Fukuyama’s argument was rooted in 19th-century German philosophers such as Friedrich Nietzsche and Georg Wilhelm Friedrich Hegel, we might rewrite it this way: A sufficiently complex simulation of human life would terminate in a rational, liberal-democratic, and capitalist order standing against a scattered and dispersing set of enemies.
Prisoner's Dilemma Cellular Automata
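The Prisoner’s Dilemma dynamics behind these simulations can be sketched with the standard payoff values (3/0/5/1 assumed here; the article’s exact parameters are not given):

```python
PAYOFF = {  # (my move, their move) -> my score; C = cooperate, D = defect
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

def play(strategy_a, strategy_b, rounds):
    """Iterated Prisoner's Dilemma: each strategy sees the opponent's
    full history and returns 'C' or 'D'."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a, b = strategy_a(hist_b), strategy_b(hist_a)
        score_a += PAYOFF[(a, b)]
        score_b += PAYOFF[(b, a)]
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b

def tit_for_tat(opponent_history):
    """Cooperate first, then mirror the opponent's last move."""
    return opponent_history[-1] if opponent_history else "C"

def always_defect(opponent_history):
    return "D"
```

Mutual tit-for-tat sustains the high-cooperation eras; a defector dominates a cooperator in the short run, mirroring the collapse phases described above.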
ai  society  innovation 
september 2017 by janpeuker
Connectionism (Stanford Encyclopedia of Philosophy)
Philosophers have become interested in connectionism because it promises to provide an alternative to the classical theory of the mind: the widely held view that the mind is something akin to a digital computer processing a symbolic language. Exactly how and to what extent the connectionist paradigm constitutes a challenge to classicism has been a matter of hot debate in recent years.
ai  philosophy  psychology  research 
september 2017 by janpeuker
Build delightful and natural conversational experiences
Give users new ways to interact with your product by building engaging voice and text-based conversational apps with API.AI.

Chatbot / Google Assistant / Actions on Google
ai  development  android 
september 2017 by janpeuker
Unsupervised Feature Learning and Deep Learning Tutorial
Softmax regression (or multinomial logistic regression) is a generalization of logistic regression to the case where we want to handle multiple classes. In logistic regression we assumed that the labels were binary: y(i) ∈ {0, 1}. We used such a classifier to distinguish between two kinds of hand-written digits. Softmax regression allows us to handle y(i) ∈ {1, …, K} where K is the number of classes.
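A minimal pure-Python sketch of the softmax over K classes (weights and inputs hypothetical); with K = 2 the first component reduces to the logistic function:

```python
import math

def softmax(logits):
    """Turn K real-valued scores into a probability distribution over
    K classes. Subtracting the max is the usual numerical-stability trick."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_predict(weights, biases, x):
    """Score for class k is w_k . x + b_k; the predicted class is the
    argmax of the softmax probabilities."""
    logits = [sum(w * xi for w, xi in zip(wk, x)) + bk
              for wk, bk in zip(weights, biases)]
    probs = softmax(logits)
    return max(range(len(probs)), key=probs.__getitem__), probs
```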
howto  algorithm  ai 
september 2017 by janpeuker
Meet Michelangelo: Uber's Machine Learning Platform
To address these issues, we created a DSL (domain specific language) that modelers use to select, transform, and combine the features that are sent to the model at training and prediction times. The DSL is implemented as a subset of Scala. It is a pure functional language with a complete set of commonly used functions. With this DSL, we also provide the ability for customer teams to add their own user-defined functions. There are accessor functions that fetch feature values from the current context (data pipeline in the case of an offline model or current request from client in the case of an online model) or from the Feature Store.
scala  DSL  Architecture  ai 
september 2017 by janpeuker
It's Been 100 Years and the Robots Still Haven't Taken Over | Literary Hub
A similar, optimistic view of artificial intelligence informs Frank Herbert’s novel Destination: Void (1966). His four scientists aboard the spaceship Earthling—a psychiatrist, a life-systems engineer, a doctor who specializes in brain chemistry, and a computer scientist—represent the four disciplines most closely allied with the understanding and development of cognitive science. In the critical circumstances that attend their lone journey through space, they come to the realization that their survival depends on developing high-level artificial intelligence. Herbert’s view is clearly that machine intelligence in cooperation with human intelligence is our only hope for the future and that scientists are therefore indispensable for the very reasons that led to their vilification by the majority of novelists discussed hitherto.
ai  article  future 
september 2017 by janpeuker
Cheat Sheets for AI, Neural Networks, Machine Learning, Deep Learning & Big Data
scikit-learn algorithm cheat sheet
- classification (labeled categories)
- clustering (unlabeled categories)
- regression (quantity, binary)
- dimensionality reduction (distribution, visualization, investigation)
ai  howto  Python 
september 2017 by janpeuker
Fooling The Machine | Popular Science
“We show you a photo that’s clearly a photo of a school bus, and we make you think it’s an ostrich,” says Ian Goodfellow, a researcher at Google who has driven much of the work on adversarial examples.
By altering the images fed into a deep neural network by just four percent, researchers were able to trick it into misclassifying the image with a success rate of 97 percent. Even when they did not know how the network was processing the images, they could deceive the network with nearly 85 percent accuracy. That latter research, tricking the network without knowing its architecture, is called a black box attack. This is the first documented research of a functional black box attack on a deep learning system, which is important because this is the most likely scenario in the real world.
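The small-perturbation attack is easiest to see on a plain linear classifier: nudge each input component by a small epsilon in the direction of the weight signs, in the spirit of the fast-gradient-sign method. All numbers and names below are hypothetical, not the study’s setup:

```python
def linear_predict(w, b, x):
    """Linear classifier: label 1 if w.x + b > 0, else 0."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

def adversarial_example(w, x, eps):
    """Fast-gradient-sign-style perturbation for a linear model: move
    every input component by eps in the direction that pushes the
    score toward the opposite class."""
    sign = lambda v: (v > 0) - (v < 0)
    score = sum(wi * xi for wi, xi in zip(w, x))
    direction = -1 if score > 0 else 1
    return [xi + direction * eps * sign(wi) for wi, xi in zip(w, x)]
```

With many input dimensions, tiny per-component changes add up to a large change in the score, which is why high-dimensional images are so easy to flip.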
ai  security  psychology 
september 2017 by janpeuker
Logistic Regression for Machine Learning - Machine Learning Mastery
Ultimately in predictive modeling machine learning projects you are laser focused on making accurate predictions rather than interpreting the results. As such, you can break some assumptions as long as the model is robust and performs well.

Binary Output Variable: This might be obvious as we have already mentioned it, but logistic regression is intended for binary (two-class) classification problems. It will predict the probability of an instance belonging to the default class, which can be snapped into a 0 or 1 classification.
Remove Noise: Logistic regression assumes no error in the output variable (y), consider removing outliers and possibly misclassified instances from your training data.
Gaussian Distribution: Logistic regression is a linear algorithm (with a non-linear transform on output). It does assume a linear relationship between the input variables with the output. Data transforms of your input variables that better expose this linear relationship can result in a more accurate model. For example, you can use log, root, Box-Cox and other univariate transforms to better expose this relationship.
Remove Correlated Inputs: Like linear regression, the model can overfit if you have multiple highly-correlated inputs. Consider calculating the pairwise correlations between all inputs and removing highly correlated inputs.
Fail to Converge: It is possible for the maximum likelihood estimation process that learns the coefficients to fail to converge. This can happen if there are many highly correlated inputs in your data or the data is very sparse (e.g. lots of zeros in your input data).
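The “Remove Correlated Inputs” step above can be sketched with a pairwise Pearson correlation filter (the threshold and data are hypothetical):

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def drop_correlated(columns, threshold=0.9):
    """Greedily keep a column only if it is not strongly correlated
    with any column kept so far. columns: dict name -> list of values."""
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) < threshold
               for k in kept):
            kept.append(name)
    return kept
```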
ai  algorithm  howto  mathematics 
september 2017 by janpeuker
New version of Cloud Datalab: Jupyter meets TensorFlow, cloud meets local deployment | Google Cloud Big Data and Machine Learning Blog  |  Google Cloud Platform
Google Cloud Datalab beta, an easy-to-use interactive tool for large-scale data exploration, analysis and visualization using Google Cloud Platform services such as Google BigQuery, Google App Engine Flex and Google Cloud Storage. Based on Jupyter (formerly IPython),
ai  Python  visualization  cloud 
september 2017 by janpeuker
Human-Centered Machine Learning – Google Design – Medium
4. Weigh the costs of false positives and false negatives
Your ML system will make mistakes. It’s important to understand what these errors look like and how they might affect the user’s experience of the product. In one of the questions in point 2 we mentioned something called the confusion matrix. This is a key concept in ML and describes what it looks like when an ML system gets it right and gets it wrong.
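The confusion matrix and the cost weighing described here, as a minimal binary-classification sketch (the cost values are hypothetical):

```python
def confusion_matrix(y_true, y_pred):
    """Counts for a binary classifier: true/false positives/negatives."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return {"tp": tp, "tn": tn, "fp": fp, "fn": fn}

def expected_cost(matrix, cost_fp, cost_fn):
    """Weigh the two error types differently, as the guideline suggests:
    e.g. a missed disease (fn) may cost far more than a false alarm (fp)."""
    return matrix["fp"] * cost_fp + matrix["fn"] * cost_fn
```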
ai  google  research  design 
august 2017 by janpeuker
Introduction to Local Interpretable Model-Agnostic Explanations (LIME) - O'Reilly Media
Introduction to Local Interpretable Model-Agnostic Explanations (LIME)
A technique to explain the predictions of any machine learning classifier.

By Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin, August 12, 2016

Happy predictions. (source: Jared Hersch on Flickr)
Machine learning is at the core of many recent advances in science and technology. With computers beating professionals in games like Go, many people have started asking if machines would also make for better drivers or even better doctors.

In many applications of machine learning, users are asked to trust a model to help them make decisions. A doctor will certainly not operate on a patient simply because “the model said so.” Even in lower-stakes situations, such as when choosing a movie to watch from Netflix, a certain measure of trust is required before we surrender hours of our time based on a model. Despite the fact that many machine learning models are black boxes, understanding the rationale behind the model's predictions would certainly help users decide when to trust or not to trust their predictions. An example is shown in Figure 1, in which a model predicts that a certain patient has the flu. The prediction is then explained by an "explainer" that highlights the symptoms that are most important to the model. With this information about the rationale behind the model, the doctor is now empowered to trust the model—or not.
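A simplified perturbation sketch in the spirit of LIME, not Ribeiro et al.’s actual algorithm: hide each feature in turn and see how far the black-box prediction moves. The model below is a made-up logistic scorer that leans on feature 0:

```python
import math

def perturbation_importance(predict_fn, x, baseline=0.0):
    """For each feature, replace it with a baseline value and record how
    far the model's output moves; larger moves mean the feature matters
    more to THIS prediction (a local, model-agnostic explanation)."""
    base_pred = predict_fn(x)
    importances = []
    for i in range(len(x)):
        x_masked = list(x)
        x_masked[i] = baseline
        importances.append(abs(base_pred - predict_fn(x_masked)))
    return importances

# hypothetical black-box model for illustration only
model = lambda x: 1 / (1 + math.exp(-(3.0 * x[0] + 0.2 * x[1])))
```

Applied to the flu example, such scores are what lets the "explainer" highlight the symptoms the model relied on most.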
ai  research  documentation 
august 2017 by janpeuker
Project Naptha
Project Naptha highlight, copy and translate text from any image

Project Naptha started out as "Images as Text", the second-place entry in the 2013 HackMIT Hackathon.

It launched 5 months later, reaching 200,000 users in a week, and was featured on the front page of Hacker News, Reddit, Engadget, Lifehacker, The Verge, and PCWorld.
ai  visualization  Software 
august 2017 by janpeuker
Introducing Seldon Deploy – Open Source Machine Learning – Medium

Model Explanations
In May 2018 the new General Data Protection Regulation (GDPR) will give consumers a legal “right to explanation” from organisations that use algorithmic decision making.
And as more important decisions are being made and automated on the basis of machine learning models, organisations are seeking to understand why models give a certain output. This is a tough challenge considering there are many types of models with varying degrees of interpretability. For example, it’s easy to traverse the decision trees generated by a random forest algorithm, but the connections between the nodes and layers of a neural network model are beyond human comprehension.
ai  devops  cloud  library 
august 2017 by janpeuker
Hands-On Machine Learning with Scikit-Learn and TensorFlow - O'Reilly Media
Explore the machine learning landscape, particularly neural nets
Use scikit-learn to track an example machine-learning project end-to-end
Explore several training models, including support vector machines, decision trees, random forests, and ensemble methods
Use the TensorFlow library to build and train neural nets
Dive into neural net architectures, including convolutional nets, recurrent nets, and deep reinforcement learning
Learn techniques for training and scaling deep neural nets
Apply practical code examples without acquiring excessive machine learning theory or algorithm details
book  Python  ai 
august 2017 by janpeuker
Grotesque and Gorgeous: 100,000 Art and Medicine Images Released for Open Use
These images, freely available from the Wellcome Library, exist at the intersection of art and medicine.
ai  visualization  medicine  from twitter_favs
august 2017 by janpeuker
Research Blog: Facets: An Open Source Visualization Tool for Machine Learning Training Data
Facets, an open source visualization tool to aid in understanding and analyzing ML datasets. Facets consists of two visualizations that allow users to see a holistic picture of their data at different granularities. Get a sense of the shape of each feature of the data using Facets Overview, or explore a set of individual observations using Facets Dive.
visualization  ai  analytics 
july 2017 by janpeuker
CS231n Convolutional Neural Networks for Visual Recognition
Human Perception - Convolutional Neural Networks take advantage of the fact that the input consists of images and they constrain the architecture in a more sensible way. In particular, unlike a regular Neural Network, the layers of a ConvNet have neurons arranged in 3 dimensions: width, height, depth. (Note that the word depth here refers to the third dimension of an activation volume, not to the depth of a full Neural Network, which can refer to the total number of layers in a network.)
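The width × height × depth bookkeeping for an activation volume follows the standard formula (W − F + 2P)/S + 1; a small helper (the layer sizes used below are illustrative, AlexNet-style numbers):

```python
def conv_output_volume(w, h, d_in, n_filters, f, stride=1, pad=0):
    """Shape of a conv layer's output activation volume.
    Spatial size uses (W - F + 2P) / S + 1; the output depth equals
    the number of filters, regardless of the input depth d_in."""
    out_w = (w - f + 2 * pad) // stride + 1
    out_h = (h - f + 2 * pad) // stride + 1
    return out_w, out_h, n_filters
```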
ai  algorithm  psychology 
july 2017 by janpeuker
Probabilistic programming from scratch
A simple algorithm for Bayesian inference

We can do that using Bayesian inference. Bayesian inference is a method for updating your knowledge about the world with the information you learn during an experiment. It derives from a simple equation called Bayes’s Rule. In its most advanced and efficient forms, it can be used to solve huge problems. But we’re going to use a specific, simple inference algorithm called Approximate Bayesian Computation (ABC), which is barely a couple of lines of Python:
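An ABC sketch for a toy version of this setting: infer an unknown success rate from k successes in n trials under a uniform prior. The numbers are hypothetical, not the article’s exact code:

```python
import random

def abc_posterior(observed_successes, n_trials, n_samples=10000, seed=0):
    """Approximate Bayesian Computation: draw a candidate rate from a
    uniform prior, simulate the experiment with it, and keep the
    candidate only if the simulation reproduces the observed data."""
    rng = random.Random(seed)
    posterior = []
    for _ in range(n_samples):
        p = rng.random()  # uniform prior over the unknown rate
        simulated = sum(rng.random() < p for _ in range(n_trials))
        if simulated == observed_successes:
            posterior.append(p)
    return posterior
```

The accepted candidates approximate the posterior; with 3 successes in 10 trials their mean settles near 1/3.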
ai  engineering  mathematics  analytics 
july 2017 by janpeuker
Rise of the machines: who is the ‘internet of things’ good for? | Technology | The Guardian
There is a clear philosophical position, even a worldview, behind all of this: that the world is in principle perfectly knowable, its contents enumerable and their relations capable of being meaningfully encoded in a technical system, without bias or distortion. As applied to the affairs of cities, this is effectively an argument that there is one and only one correct solution to each identified need; that this solution can be arrived at algorithmically, via the operations of a technical system furnished with the proper inputs; and that this solution is something that can be encoded in public policy, without distortion. (Left unstated, but strongly implicit, is the presumption that whatever policies are arrived at in this way will be applied transparently, dispassionately and in a manner free from politics.)
iot  ai  article  philosophy 
june 2017 by janpeuker
TensorFlow Linear Model Tutorial  |  TensorFlow
How Logistic Regression Works

Finally, let's take a minute to talk about what the Logistic Regression model actually looks like in case you're not already familiar with it. We'll denote the label as Y, and the set of observed features as a feature vector x = [x1, x2, ..., xd]. We define Y = 1 if an individual earned more than 50,000 dollars and Y = 0 otherwise. In Logistic Regression, the probability of the label being positive (Y = 1) given the features x is given as:

P(Y = 1 | x) = 1 / (1 + exp(−(wᵀx + b)))

where w = [w1, w2, ..., wd] are the model weights for the features x = [x1, x2, ..., xd], and b is a constant that is often called the bias of the model. The equation consists of two parts: a linear model, wᵀx + b, and a logistic function, 1 / (1 + exp(−z)), which converts the linear score into a probability between 0 and 1.
Model training is an optimization problem: The goal is to find a set of model weights (i.e. model parameters) to minimize a loss function defined over the training data, such as logistic loss for Logistic Regression models. The loss function measures the discrepancy between the ground-truth label and the model's prediction. If the prediction is very close to the ground-truth label, the loss value will be low; if the prediction is very far from the label, then the loss value would be high.
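The linear-model-plus-logistic-function decomposition and the log loss described here can be written out in a few lines (the weights in the example are hypothetical):

```python
import math

def sigmoid(z):
    """The logistic function: squashes a real score into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, b, x):
    """P(Y = 1 | x): a linear model w.x + b fed through the logistic."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def logistic_loss(w, b, data):
    """Average log loss over (x, y) pairs with y in {0, 1}: low when
    predictions sit near the ground-truth labels, high when far away."""
    total = 0.0
    for x, y in data:
        p = predict_proba(w, b, x)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(data)
```

Training then means searching for the w and b that minimize this loss over the training data.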
mathematics  ai  howto 
june 2017 by janpeuker
TensorFlow Lite Introduces Machine Learning in Mobile Apps
TensorFlow Lite, a streamlined version of TensorFlow for mobile, was announced by Dave Burke, vice president of engineering for Android. Mr. Burke said: “TensorFlow Lite will leverage a new neural network API to tap into silicon-specific accelerators, and over time we expect to see DSPs (Digital Signal Processors) specifically designed for neural network inference and training.” He also added: “We think these new capabilities will help power the next generation of on-device speech processing, visual search, augmented reality, and more.” TensorFlow Lite comes at a time where silicon manufacturers like Qualcomm have begun adding on-chip machine learning capabilities to their products, and as OEMs have increasingly been adopting varying degrees of “AI” into their ROMs.
android  ai  library 
june 2017 by janpeuker