Software 2.0 – Andrej Karpathy – Medium


105 bookmarks. First posted by kartik november 2017.


I sometimes see people refer to neural networks as just “another tool in your machine learning toolbox”. They have some pros and cons, they work here or there,…
from instapaper
yesterday by goldenberg
It turns out that a large portion of real-world problems have the property that it is significantly easier to collect the data (or more generally, identify a desirable behavior) than to explicitly write the program. In these cases, the programmers will split into two teams. The 2.0 programmers manually curate, maintain, massage, clean and label datasets; each labeled example literally programs the final system because the dataset gets compiled into Software 2.0 code via the optimization. Meanwhile, the 1.0 programmers maintain the surrounding tools, analytics, visualizations, labeling interfaces, infrastructure, and the training code.

Software 2.0 is:

Computationally homogeneous. It is much easier to make various correctness/performance guarantees.

Simple to bake into silicon. As a corollary, since the instruction set of a neural network is relatively small, it is significantly easier to implement these networks much closer to silicon, e.g. with custom ASICs, neuromorphic chips, and so on.

Constant running time. Every iteration of a typical neural net forward pass takes exactly the same amount of FLOPS. There is zero variability based on the different execution paths your code could take through some sprawling C++ code base (see the sketch after this list).

Constant memory use. Related to the above, there is no dynamically allocated memory anywhere, so there is also little possibility of swapping to disk, or memory leaks that you have to hunt down in your code.

It is highly portable. A sequence of matrix multiplies is significantly easier to run on arbitrary computational configurations compared to classical binaries or scripts.

It is very agile. In Software 2.0 we can take our network, remove half of the channels, retrain, and there — it runs exactly at twice the speed and works a bit worse.

It is better than you. Finally, and most importantly, a neural network is a better piece of code than anything you or I can come up with in a large fraction of valuable verticals, which currently at the very least involve anything to do with images/video and sound/speech.
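
To make the constant-cost items concrete, here is a minimal sketch in NumPy (the layer shapes are illustrative, not from the article): the forward pass is a fixed sequence of matrix multiplies, so its FLOP count falls out of the weight shapes alone and never depends on the input values.

    import numpy as np

    rng = np.random.default_rng(0)
    W1 = rng.standard_normal((784, 256))   # layer 1 weights (shapes are illustrative)
    W2 = rng.standard_normal((256, 10))    # layer 2 weights

    def forward(x):
        h = np.maximum(x @ W1, 0.0)        # matrix multiply + threshold at zero (ReLU)
        return h @ W2                      # matrix multiply

    # Cost per example falls out of the fixed shapes alone, never the input:
    flops = 2 * (784 * 256 + 256 * 10)     # 406,528 FLOPs for every single input
    print(forward(rng.standard_normal(784)).shape, flops)   # (10,) 406528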

The 2.0 stack can fail in unintuitive and embarrassing ways, or worse, it can “silently fail”, e.g., by silently adopting biases in its training data, which are very difficult to properly analyze and examine when dataset sizes are easily in the millions in most cases.

Finally, we’re still discovering some of the peculiar properties of this stack. For instance, the existence of adversarial examples and attacks highlights the unintuitive nature of this stack.
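
For concreteness, the classic construction of such adversarial examples takes only a few lines. A minimal sketch of the fast gradient sign method in PyTorch, assuming a hypothetical batched classifier `model` with inputs `x` and integer class `label`s:

    import torch
    import torch.nn.functional as F

    def fgsm(model, x, label, eps=0.01):
        """Return an adversarial neighbor of x: a tiny nudge along the sign of
        the loss gradient, typically invisible to a human, that can flip the
        model's prediction."""
        x = x.clone().requires_grad_(True)
        loss = F.cross_entropy(model(x), label)
        loss.backward()
        return (x + eps * x.grad.sign()).detach()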

When the network fails in some hard or rare cases, we do not fix those predictions by writing code, but by including more labeled examples of those cases. Who is going to develop the first Software 2.0 IDEs, which help with all of the workflows in accumulating, visualizing, cleaning, labeling, and sourcing datasets? Perhaps the IDE bubbles up images that the network suspects are mislabeled based on the per-example loss, or assists in labeling by seeding labels with predictions, or suggests useful examples to label based on the uncertainty of the network’s predictions. (A sketch of these two heuristics appears after this entry.)
machinelearning  workbench 
8 days ago by mike
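
Both of the suggested IDE heuristics fall out of quantities the network already computes. A minimal sketch in PyTorch; `model`, `labeled`, and `unlabeled` are hypothetical stand-ins for a trained classifier and its datasets:

    import torch
    import torch.nn.functional as F

    def suspect_mislabels(model, labeled, k=10):
        """Indices of the k highest per-example-loss items: candidates the IDE
        might bubble up as possibly mislabeled."""
        model.eval()
        losses = []
        with torch.no_grad():
            for x, y in labeled:                 # x: input tensor, y: int class label
                logits = model(x.unsqueeze(0))
                losses.append(F.cross_entropy(logits, torch.tensor([y])).item())
        return sorted(range(len(losses)), key=lambda i: -losses[i])[:k]

    def suggest_labels(model, unlabeled, k=10):
        """Indices of the k unlabeled examples the network is least sure about,
        ranked by predictive entropy."""
        model.eval()
        entropies = []
        with torch.no_grad():
            for x in unlabeled:
                p = F.softmax(model(x.unsqueeze(0)), dim=-1)
                entropies.append(-(p * (p + 1e-12).log()).sum().item())
        return sorted(range(len(entropies)), key=lambda i: -entropies[i])[:k]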
"The 2.0 stack can fail in unintuitive and embarrassing ways ,or worse, they can “silently fail”, e.g., by silently adopting biases in their training data, which are very difficult to properly analyze and examine when their sizes are easily in the millions in most cases." - Software 2.0 https://ift.tt/2hsOCzx
quote 
may 2018 by techczech
I sometimes see people refer to neural networks as just “another tool in your machine learning toolbox”. They have some pros and cons, they work here or there, and sometimes you can use them to win Kaggle competitions. Unfortunately, this interpretation completely misses the forest for the trees. Neural networks are not just another classifier, they represent the beginning of a fundamental shift in how we write software. They are Software 2.0.
Softwear  Engineering  SiliconValley 
may 2018 by mikon_nikon
Neural networks are not just another classifier, they represent the beginning of a fundamental shift in how we write software. They are Software 2.0.
AI  ML  coding  future 
march 2018 by cierniak
It turns out that a large portion of real-world problems have the property that it is significantly easier to collect the data (or more generally, identify a desirable behavior) than to explicitly write the program. A large portion of programmers of tomorrow do not maintain complex software repositories, write intricate programs, or analyze their running times. They collect, clean, manipulate, label, analyze and visualize data that feeds neural networks.
machine-learning  design  philosophy  business  career 
march 2018 by brandon.w.barry

In contrast, Software 2.0 is written in neural network weights. No human is involved in writing this code because there are a lot of weights (typical networks might have millions), and coding directly in weights is kind of hard (I tried). Instead, we specify some constraints on the behavior of a desirable program (e.g., a dataset of input output pairs of examples) and use the computational resources at our disposal to search the program space for a program that satisfies the constraints. In the case of neural networks, we restrict the search to a continuous subset of the program space where the search process can be made (somewhat surprisingly) efficient with backpropagation and stochastic gradient descent.
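
As an illustration of that search, here is a minimal sketch in PyTorch on a made-up toy task: the only specification of the desired program is a dataset of input/output pairs, and backpropagation plus SGD walk the continuous space of weights toward a program that satisfies it.

    import torch
    import torch.nn as nn

    # Constraints on desirable behavior: a toy dataset of input/output pairs.
    xs = torch.randn(256, 4)
    ys = (xs.sum(dim=1, keepdim=True) > 0).float()

    # The program space: every setting of these weights is a candidate program.
    program = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))
    opt = torch.optim.SGD(program.parameters(), lr=0.1)
    loss_fn = nn.BCEWithLogitsLoss()

    for step in range(500):
        opt.zero_grad()
        loss = loss_fn(program(xs), ys)  # how far this program is from the spec
        loss.backward()                  # backprop supplies the search direction
        opt.step()                       # one SGD step through program space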

It is very agile. If you had some C++ code and someone wanted you to make it twice as fast (at a cost of performance if needed), it would be highly non-trivial to tune the system for the new spec. However, in Software 2.0 we can take our network, remove half of the channels, retrain, and there — it runs exactly at twice the speed and works a bit worse. It’s magic. Conversely, if you happen to get more data/compute, you can immediately make your program work better just by adding more channels and retraining.
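
A minimal sketch of that dial in PyTorch, with a hypothetical toy ConvNet: the channel width is a single parameter, so "remove half of the channels and retrain" is a one-argument change (the exact speedup depends on which layers dominate the compute).

    import torch.nn as nn

    def convnet(width):
        return nn.Sequential(
            nn.Conv2d(3, width, 3, padding=1), nn.ReLU(),
            nn.Conv2d(width, width, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(width, 10),
        )

    full = convnet(64)  # the original program
    slim = convnet(32)  # half the channels: retrain it, trade accuracy for speed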
Modules can meld into an optimal whole. Our software is often decomposed into modules that communicate through public functions, APIs, or endpoints. However, if two Software 2.0 modules that were originally trained separately interact, we can easily backpropagate through the whole. Think about how amazing it could be if your web browser could automatically re-design the low-level system instructions 10 stacks down to achieve a higher efficiency in loading web pages. With 2.0, this is the default behavior.
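
A minimal sketch of melding in PyTorch; `encoder` and `decoder` are hypothetical stand-ins for two separately trained modules. Composing them and backpropagating once sends gradients across the interface, so the pair is optimized jointly rather than against a frozen API.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    encoder = nn.Sequential(nn.Linear(8, 4), nn.ReLU())  # pretend: trained separately
    decoder = nn.Linear(4, 2)                            # pretend: trained separately
    opt = torch.optim.SGD(
        list(encoder.parameters()) + list(decoder.parameters()), lr=0.01
    )

    x, target = torch.randn(32, 8), torch.randn(32, 2)
    opt.zero_grad()
    loss = F.mse_loss(decoder(encoder(x)), target)
    loss.backward()  # gradients flow through both modules at once
    opt.step()       # the composed whole moves toward a jointly optimal solution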

And Software 3.0? That will be entirely up to the AGI.
programming  computerscience  history  future  2017  teslamotors 
february 2018 by WimLeers
Quote: "Software 2.0 is not going to replace 1.0 (indeed, a large amount of 1.0 infrastructure is needed for training and inference to “compile” 2.0 code), but it is going to take over increasingly large portions of what Software 1.0 is responsible for today. Let’s examine some examples of the ongoing transition to make this more concrete..." The ".. increasingly large portions of what Software 1.0 is.." seems like a stretch since the vast majority of code in the world isn't doing speech recogn...
ai  programming  software  machinelearning  future  ml 
november 2017 by ajohnson1200
The “classical stack” of Software 1.0 is what we’re all familiar with. It consists of explicit instructions to the computer written by a programmer. In contrast, Software 2.0 is written in neural network weights. No human is involved in writing this code because there are a lot of weights (typical networks might have millions), and coding directly in weights is kind of hard.

Benefits:
1. Computationally homogeneous.
2. Simple to bake into silicon.
3. Constant running time.
4. Constant memory use.
5. It is highly portable.
6. It is very agile.
7. Modules can meld into an optimal whole.
8. It is easy to pick up.
9. It is better than you.

Limitations:
1. At the end of the optimization we’re left with large networks that work well, but it’s very hard to tell how.
2. The 2.0 stack can fail in unintuitive and embarrassing ways, or worse, it can “silently fail”, e.g., by silently adopting biases in its training data.
3. Finally, we’re still discovering some of the peculiar properties of this stack. For instance, the existence of adversarial examples and attacks.
programming  ai  benefits  limitations 
november 2017 by drmeme
RT ehddn1 : Software 2.0, as discussed by Tesla AI director Andrej Karpathy. It's somewhat long, but worth reading for the range of views it offers on the future evolution of software. http://bit.ly/2jt5tD3 November 15, 2017 at 07:26AM http://twitter.com/ehddn1/status/930562531252363266
IFTTT  Twitter  ththlink 
november 2017 by seoulrain
Software 2.0 is written in neural network weights. No human is involved in writing this code because there are a lot of weights (typical networks might have millions), and coding directly in weights is kind of hard (I tried). Instead, we specify some constraints on the behavior of a desirable program (e.g., a dataset of input output pairs of examples) and use the computational resources at our disposal to search the program space for a program that satisfies the constraints. In the case of neural networks, we restrict the search to a continuous subset of the program space where the search process can be made (somewhat surprisingly) efficient with backpropagation and stochastic gradient descent.

It turns out that a large portion of real-world problems have the property that it is significantly easier to collect the data than to explicitly write the program.

If you think of neural networks as a software stack and not just a pretty good classifier, it becomes quickly apparent that they have a huge number of advantages and a lot of potential for transforming software in general.
development  !publish 
november 2017 by zephyr777
In the future, humans will exist to provide training data for neural nets.
engineering  machinelearning  from iphone
november 2017 by danielbachhuber
Andrej Karpathy on how data-taught self-coding networks will write software of the future:
Software 2.0 is not going to replace 1.0 (indeed, a large amount of 1.0 infrastructure is needed for training and inference to “compile” 2.0 code), but it is going to take over increasingly large portions of what Software 1.0 is responsible for today. Let’s examine some examples of the ongoing transition to make this more concrete:

Visual recognition used to consist of engineered features with a bit of machine learning sprinkled on top at the end (e.g., an SVM). Since then, we developed the machinery to discover much more powerful image analysis programs (in the family of ConvNet architectures), and more recently we’ve begun searching over architectures.

Speech recognition used to involve a lot of preprocessing, Gaussian mixture models and hidden Markov models, but today consists almost entirely of neural net stuff.

Speech synthesis has historically been approached with various stitching mechanisms, but today the state-of-the-art models are large ConvNets (e.g. WaveNet) that produce raw audio signal outputs.

Machine translation has usually been approached with phrase-based statistical techniques, but neural networks are quickly becoming dominant. My favorite architectures are trained in the multilingual setting, where a single model translates from any source language to any target language, and in weakly supervised (or entirely unsupervised) settings.

Robotics has a long tradition of breaking down the problem into blocks of sensing, pose estimation, planning, control, uncertainty modeling, etc., using explicit representations and algorithms over intermediate representations. We’re not quite there yet, but research at UC Berkeley and Google hints that Software 2.0 may be able to do a much better job of representing all of this code.

Games: Go-playing programs have existed for a long while, but AlphaGo Zero (a ConvNet that looks at the raw state of the board and plays a move) has now become by far the strongest player of the game. I expect we’re going to see very similar results in other areas, e.g. DOTA 2 or StarCraft.
ai  programming 
november 2017 by charlesarthur
It is better than you. Finally, and most importantly, a neural network is a better piece of code than anything you or I can come up with in a large fraction of valuable verticals, which currently at the very least involve anything to do with images/video, sound/speech, and text.
AI 
november 2017 by kristofger