Debugging misaligned completions with sparse-autoencoder latent attribution
1 points
1 day ago
| 0 comments
| alignment.openai.com
| HN
No one has commented on this post.