They claim the algorithm "discovered" the new techniques, but the methods described in section 5 do not seem all that novel to me. It smells like it could be "laundering" the literature [1] and reshuffling existing techniques. This is not inherently a bad thing, but I would hope that if it is borrowing existing techniques, the appropriate citation would eventually make it into this paper.
In the future, we will all be Jürgen Schmidhuber. :-)
Am I reading this wrong, or does this only support FP16 inputs and compare its performance against an FP32 solver?
> To valid kernel correctness, we need to compare its output to a reference correct kernel with the same inputs.
No, you need a numerical proof, which you don't have.
This is a standard which few kernels will ever meet. I'd say requiring a numerical proof is the same as requiring no proof at all - because it won't ever happen unless you're validating silicon or something equally expensive.
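For concreteness, here is a minimal sketch of what "compare its output to a reference kernel with the same inputs" can look like. Everything in it is an assumption for illustration, not the paper's harness: the "kernel" is just a dot product, `candidate_dot_fp16` is a hypothetical stand-in that rounds every intermediate to FP16 via the stdlib `struct` module, and the tolerances are arbitrary. It shows why such a test is evidence rather than proof: the verdict depends on the tolerance and the inputs you picked.

```python
import random
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def reference_dot(a, b):
    """Reference kernel: dot product accumulated in float64."""
    return sum(x * y for x, y in zip(a, b))


def candidate_dot_fp16(a, b):
    """Hypothetical candidate kernel: every intermediate is rounded to FP16."""
    acc = 0.0
    for x, y in zip(a, b):
        product = to_fp16(to_fp16(x) * to_fp16(y))
        acc = to_fp16(acc + product)
    return acc


def matches_reference(candidate, reference, inputs, rtol, atol):
    """Return True if candidate agrees with reference on every input pair."""
    for a, b in inputs:
        got, want = candidate(a, b), reference(a, b)
        if abs(got - want) > atol + rtol * abs(want):
            return False
    return True


# Fixed seed so the same inputs are used on every run.
rng = random.Random(0)
inputs = [
    ([rng.uniform(-1, 1) for _ in range(64)],
     [rng.uniform(-1, 1) for _ in range(64)])
    for _ in range(100)
]

# Same candidate, same inputs, two tolerances: the outcome is a policy choice.
loose = matches_reference(candidate_dot_fp16, reference_dot, inputs, rtol=0.0, atol=0.5)
strict = matches_reference(candidate_dot_fp16, reference_dot, inputs, rtol=0.0, atol=0.0)
print("loose tolerance:", loose)
print("exact match:", strict)
```

A check like this says nothing about inputs outside the sampled range (large magnitudes, cancellation, overflow), which is the gap a numerical proof would close.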