But if this is targeting real-world applications, I'd have concerns about price-to-performance. High-level synthesis tools often result in fairly poor performance compared to writing Verilog or SystemVerilog. Also, AI-focused SoCs like the Nvidia Jetson usually offer better price-to-performance and performance-per-watt than FPGA systems like the KV260.
That said, focusing on specialized transformer architectures with high sparsity or aggressive quantization could potentially give FPGAs an advantage over AI chips.
Not to toot my own horn, but I recently wrote up a piece on open-source FPGA development that goes a bit deeper into some of these points, including why AI might not be the best use case for open-source FPGA applications: https://www.zach.be/p/how-to-build-a-commercial-open-source
> High-level synthesis tools often result in fairly poor performance compared to writing Verilog or SystemVerilog.
Agreed.