Kevin Thomas ✅
@kevinthomas@defcon.social · 20 hours ago

Stop importing "Magic." Start importing Math.

If you write from transformers import AutoModel, you are a consumer. If you write class SelfAttention(nn.Module), you are an Engineer.

Abstraction is great for production, but it is poison for understanding. To truly master Large Language Models, you have to strip away the libraries and look at the raw tensors.

I decided to build a GPT architecture from scratch to see exactly where the gradients flow and where they vanish.

I call it MicroGPT.

In my latest engineering log, I break down the entire architecture line by line in pure PyTorch:

🔹 The Mechanics: How Query, Key, and Value actually interact.

🔹 The Mask: Why we force the model to respect the "arrow of time."

🔹 The Block: Pre-Norm vs. Post-Norm and why the order matters.

🔹 The Expansion: Why the Feed Forward Network blows up dimensions by 4x.
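
To make the four points above concrete, here is a minimal sketch in plain PyTorch. It is not the MicroGPT code from the engineering log; the class layout, the names d_model and max_len, the single attention head, and the GELU activation are all assumptions chosen for illustration.

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F


class SelfAttention(nn.Module):
    """Single-head causal self-attention (illustrative, not the author's code)."""

    def __init__(self, d_model: int, max_len: int):
        super().__init__()
        # The Mechanics: three learned projections of the same input.
        self.query = nn.Linear(d_model, d_model, bias=False)
        self.key = nn.Linear(d_model, d_model, bias=False)
        self.value = nn.Linear(d_model, d_model, bias=False)
        # The Mask: lower-triangular, so position t only sees positions <= t.
        mask = torch.tril(torch.ones(max_len, max_len)).bool()
        self.register_buffer("mask", mask)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        _, T, C = x.shape
        q, k, v = self.query(x), self.key(x), self.value(x)
        # Scaled dot-product scores between every query and every key: (B, T, T).
        scores = q @ k.transpose(-2, -1) / math.sqrt(C)
        # Future positions get -inf, which softmax turns into weight 0.
        scores = scores.masked_fill(~self.mask[:T, :T], float("-inf"))
        weights = F.softmax(scores, dim=-1)
        return weights @ v


class FeedForward(nn.Module):
    """The Expansion: project up to 4 * d_model, apply a nonlinearity, project back."""

    def __init__(self, d_model: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),  # assumed activation
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


class Block(nn.Module):
    """The Block, in Pre-Norm form: LayerNorm runs before each sublayer."""

    def __init__(self, d_model: int, max_len: int):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = SelfAttention(d_model, max_len)
        self.ln2 = nn.LayerNorm(d_model)
        self.ffn = FeedForward(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Pre-Norm keeps the residual path free of normalization.
        # Post-Norm would instead compute self.ln1(x + self.attn(x)).
        x = x + self.attn(self.ln1(x))
        x = x + self.ffn(self.ln2(x))
        return x
```

Feeding a tensor of shape (batch, time, d_model) through Block returns the same shape, and changing the tokens at later positions leaves the outputs at earlier positions untouched, which is the "arrow of time" constraint in practice.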

This isn't a "Hello World" tutorial. This is an architectural teardown of the engine that changed the world.

If you are ready to stop guessing and start building, read the full code breakdown here: https://mytechnotalent.substack.com/

#ArtificialIntelligence #MachineLearning #DeepLearning #PyTorch #Engineering #BuildInPublic
