Matrix Core Programming on AMD CDNA Architecture (rocm.blogs.amd.com)
19 points by salykova 5 days ago | 2 comments
  • phkahler 3 hours ago

    So from CDNA3 to 4 they doubled fp16 and fp8 performance but cut fp32 and fp64 by half?

    Wonder why the regression on non-AI workloads?

    • bigdict 3 hours ago

      cuz area and power