Let's build GPT: from scratch, in code, spelled out.
Source: Andrej Karpathy | Date: 2023-01-17 17:33:27 | Duration: 01:56:20
Comments
14:47
Is this supposed to cause seizures?
Jesus he is good.
Thank you for a superb tutorial. The major revelation for me was that you actually need a lot more than just attention. I had neglected to add that last layer normalization (ln_f) in the forward method of the BigramLanguageModel class, and my loss was stuck at around 2.5. Once I caught and corrected that error, I duplicated Karpathy's result. After having read a number of books and papers and taken a few of the Stanford graduate NLP and deep generative models courses, this really brought it all together.
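For anyone chasing the same bug, here is a minimal sketch of where ln_f sits in the forward pass. The stack of transformer Blocks is stubbed out with nn.Identity so the snippet runs stand-alone, and the hyperparameter defaults are illustrative rather than the video's exact settings:

```python
import torch
import torch.nn as nn

class BigramLanguageModel(nn.Module):
    def __init__(self, vocab_size=65, n_embd=32, block_size=8):
        super().__init__()
        self.token_embedding_table = nn.Embedding(vocab_size, n_embd)
        self.position_embedding_table = nn.Embedding(block_size, n_embd)
        self.blocks = nn.Identity()        # stand-in for the stack of transformer Blocks
        self.ln_f = nn.LayerNorm(n_embd)   # the final layer norm the comment refers to
        self.lm_head = nn.Linear(n_embd, vocab_size)

    def forward(self, idx):
        B, T = idx.shape
        tok_emb = self.token_embedding_table(idx)                                    # (B, T, n_embd)
        pos_emb = self.position_embedding_table(torch.arange(T, device=idx.device))  # (T, n_embd)
        x = tok_emb + pos_emb
        x = self.blocks(x)
        x = self.ln_f(x)           # easy to omit; without it the loss plateaus higher
        logits = self.lm_head(x)   # (B, T, vocab_size)
        return logits

logits = BigramLanguageModel()(torch.zeros(4, 8, dtype=torch.long))
print(logits.shape)  # torch.Size([4, 8, 65])
```

In the pre-norm arrangement the residual stream is never normalized after the last block, which is why this one extra LayerNorm before the projection to logits matters.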
blew my mind!!
Thank you man!
Please put auto-dubbing on your videos, I really want to learn. Thanks a lot
gyatt
"1000usd" thanks for the video
Thank you so much for this video. You explain some very complex topics in simple ways. I understood so much more from this than from many other YouTube videos 👍👍
💗 A great teacher out there!
Thank you for making this video!
I guess this is the same one from freeCodeCamp that came out 5 years ago
Around 12:50 it is very weird that space is the first character but the code for it is "1", not "0" 🤔
Pretty sure enumerate shouldn't be doing that…
UPD: nvm, the first character is actually \n or something
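The update is right, and enumerate does start at 0: the video builds the vocabulary from sorted(list(set(text))) over the Tiny Shakespeare file, whose sorted character set begins with the newline '\n' (it sorts before the space), so '\n' gets code 0 and ' ' gets code 1. A self-contained sketch, with a made-up text sample standing in for the dataset:

```python
# Character-level vocabulary, as built around 12:50 in the video.
text = "\nFirst Citizen:\nBefore we proceed any further, hear me speak."
chars = sorted(list(set(text)))
stoi = {ch: i for i, ch in enumerate(chars)}  # string -> integer code

print(repr(chars[0]), stoi["\n"])  # '\n' 0 -- newline sorts before space
print(repr(chars[1]), stoi[" "])   # ' ' 1  -- so space lands at index 1
```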
Is there a new successor to this guide?
A deep and sincere thank you!
Thank you very much. I have read the Transformer paper multiple times over the past two years, and I still felt I had not completely grasped the full picture of it until watching this video and going through the code you provided. It also clarified a few important concepts in deep learning, such as normalization, dropout, and residual connections. I guess I will watch it at least one more time.
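Those three ideas come together in the video's transformer Block: pre-norm residual connections wrap a self-attention sub-layer (tokens "communicating") and a feed-forward sub-layer (per-token "computation"), with dropout inside each. A sketch of that structure, substituting PyTorch's stock nn.MultiheadAttention for the causal multi-head attention the video builds by hand, so the snippet runs stand-alone:

```python
import torch
import torch.nn as nn

class Block(nn.Module):
    """Pre-norm transformer block: x = x + sublayer(LayerNorm(x))."""
    def __init__(self, n_embd, n_head, dropout=0.2):
        super().__init__()
        self.ln1 = nn.LayerNorm(n_embd)
        self.sa = nn.MultiheadAttention(n_embd, n_head, dropout=dropout, batch_first=True)
        self.ln2 = nn.LayerNorm(n_embd)
        self.ffwd = nn.Sequential(
            nn.Linear(n_embd, 4 * n_embd),
            nn.ReLU(),
            nn.Linear(4 * n_embd, n_embd),
            nn.Dropout(dropout),
        )

    def forward(self, x):
        T = x.size(1)
        # True above the diagonal = "may not attend": keeps the attention causal.
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool, device=x.device), diagonal=1)
        h = self.ln1(x)                                 # normalize *before* the sub-layer
        attn_out, _ = self.sa(h, h, h, attn_mask=mask)  # communication between tokens
        x = x + attn_out                                # residual (skip) connection
        x = x + self.ffwd(self.ln2(x))                  # per-token computation, again residual
        return x

x = torch.randn(4, 8, 32)      # (batch, time, channels)
print(Block(32, 4)(x).shape)   # torch.Size([4, 8, 32])
```

The residual additions give gradients an unimpeded path from the loss back to the embeddings, which is what lets a deep stack of these blocks train at all.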
cool
Imagine being between your job at Tesla and your job at OpenAI, being a tad bored and, just for fun, dropping the best introduction to deep learning and NLP from scratch so far on YouTube, for free. Amazing people do amazing things, even as a hobby.
oooooo