22 bookmarks. First posted by skchrko january 2018.
architecture machine learning deeplearning
One model to learn them all Kaiser et al., arXiv 2017 You almost certainly have an abstract conception of a banana in your head. Suppose you ask me if I’d like anything to eat. I can say the word ‘banana’ (such that you hear it spoken), send you a text message whereby you see (and…
january 2018 by e2b
We’d need to be able to support different input and output modalities (as required by the task in hand), we’d need a common representation of the learned knowledge that was shared across all of these modalities, and we’d need sufficient ‘apparatus’ such that tasks which need a particular capability (e.g. attention) are able to exploit it. ‘One model to rule them all’ introduces a MultiModel architecture with exactly these features, and it performs impressively weai Architecture
january 2018 by janpeuker