FilterHN
Tiny hackable CUDA language model implementation
19 points by markusheimerl 2 days ago | 1 comment | github.com
yobbo 1 hour ago
Looks very nice, but I can't find numerical gradient checks, which are helpful for verifying that the backward pass is correct:
https://github.com/markusheimerl/gpt/blob/main/transformer/a...
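For readers unfamiliar with the idea, here is a minimal sketch of a numerical gradient check. It is not taken from the linked repository: the toy loss and the names `loss`, `analytic_grad`, `numerical_grad`, and `max_rel_error` are hypothetical, chosen only for illustration. The check compares a hand-written backward pass against central finite differences of the forward pass.

```python
import random

def loss(w, x, y):
    """Forward pass of a toy model (assumed for illustration): 0.5 * (w.x - y)^2."""
    pred = sum(wi * xi for wi, xi in zip(w, x))
    return 0.5 * (pred - y) ** 2

def analytic_grad(w, x, y):
    """Hand-derived backward pass: dL/dw_i = (w.x - y) * x_i."""
    pred = sum(wi * xi for wi, xi in zip(w, x))
    return [(pred - y) * xi for xi in x]

def numerical_grad(f, w, eps=1e-5):
    """Central differences: (f(w + eps*e_i) - f(w - eps*e_i)) / (2 * eps)."""
    grad = []
    for i in range(len(w)):
        w_plus = list(w)
        w_plus[i] += eps
        w_minus = list(w)
        w_minus[i] -= eps
        grad.append((f(w_plus) - f(w_minus)) / (2 * eps))
    return grad

def max_rel_error(a, b, tiny=1e-12):
    """Largest elementwise relative difference between two gradient vectors."""
    return max(abs(ai - bi) / max(abs(ai) + abs(bi), tiny) for ai, bi in zip(a, b))

# Fixed seed so the check is reproducible.
rng = random.Random(0)
w = [rng.uniform(-1, 1) for _ in range(4)]
x = [rng.uniform(-1, 1) for _ in range(4)]
y = 0.3

g_analytic = analytic_grad(w, x, y)
g_numeric = numerical_grad(lambda v: loss(v, x, y), w)
err = max_rel_error(g_analytic, g_numeric)
print("max relative error:", err)
```

A small relative error suggests the backward pass matches the forward pass, while a large one usually points to a bug in the hand-written gradient. The same pattern could in principle be applied to a GPU implementation by perturbing one parameter at a time on the host, ideally in double precision, since single-precision rounding makes finite differences noisy.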