19 bookmarks. First posted by skchrko 9 days ago.
architecture machine learning deeplearning
One model to learn them all Kaiser et al., arXiv 2017 You almost certainly have an abstract conception of a banana in your head. Suppose you ask me if I’d like anything to eat. I can say the word ‘banana’ (such that you hear it spoken), send you a text message whereby you see (and…
6 days ago by e2b
We’d need to be able to support different input and output modalities (as required by the task in hand), we’d need a common representation of the learned knowledge that was shared across all of these modalities, and we’d need sufficient ‘apparatus’ such that tasks which need a particular capability (e.g. attention) are able to exploit it. ‘One model to rule them all’ introduces a MultiModel architecture with exactly these features, and it performs impressively weai Architecture
8 days ago by janpeuker