List of 27 papers (supposedly) given to John Carmack by Ilya Sutskever: "If you really learn all of these, you’ll know 90% of what matters today." Taken from keshavchan 1787861946173186062 and arc folder D0472A20-9C20-4D3F-B145-D2865C0A9FEE.
It is not confirmed whether this is the actual list, but I think it’s a good list anyway, so I decided to save it in case it gets lost in the dead internet.
- The Annotated Transformer (nlp.seas.harvard.edu)
- The First Law of Complexodynamics (scottaaronson.blog)
- The Unreasonable Effectiveness of RNNs (karpathy.github.io)
- Understanding LSTM Networks (colah.github.io)
- Recurrent Neural Network Regularization (arxiv.org)
- Keeping Neural Networks Simple by Minimizing the Description Length of the Weights (cs.toronto.edu)
- Pointer Networks (arxiv.org)
- ImageNet Classification with Deep CNNs (proceedings.neurips.cc)
- Order Matters: Sequence to sequence for sets (arxiv.org)
- GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism (arxiv.org)
- Deep Residual Learning for Image Recognition (arxiv.org)
- Multi-Scale Context Aggregation by Dilated Convolutions (arxiv.org)
- Neural Quantum Chemistry (arxiv.org)
- Attention Is All You Need (arxiv.org)
- Neural Machine Translation by Jointly Learning to Align and Translate (arxiv.org)
- Identity Mappings in Deep Residual Networks (arxiv.org)
- A Simple NN Module for Relational Reasoning (arxiv.org)
- Variational Lossy Autoencoder (arxiv.org)
- Relational RNNs (arxiv.org)
- Quantifying the Rise and Fall of Complexity in Closed Systems: The Coffee Automaton (arxiv.org)
- Neural Turing Machines (arxiv.org)
- Deep Speech 2: End-to-End Speech Recognition in English and Mandarin (arxiv.org)
- Scaling Laws for Neural LMs (arxiv.org)
- A Tutorial Introduction to the Minimum Description Length Principle (arxiv.org)
- Machine Super Intelligence Dissertation (vetta.org)
- PAGE 434 onwards: Kolmogorov Complexity (lirmm.fr)
- CS231n Convolutional Neural Networks for Visual Recognition (cs231n.github.io)