Intel seems to have tricked me slightly in this slide. The numbers they give are per Xe-core; I thought they were per Vector/Matrix Engine.
Operations/Clock per Engine (the slide's per-Xe-core numbers divided by 8):
Vector: 32 FP32/FP64, 64 FP16
Matrix: 256 TF32, 512 FP16/BF16, 1024 INT8
I only noticed because, reading the slide's numbers as per-engine, PVC would have to be running at just 172 MHz to hit 45 TFLOPS FP32/FP64...
Reading them as per Xe-core instead of per Vector/Matrix Engine gives 1.373 GHz, which is far more reasonable.
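
For anyone who wants to check the arithmetic, here is a rough sketch in Python. The unit counts are my assumptions, not something stated above: 128 Xe-cores on the full PVC, 8 Vector Engines per Xe-core, and 256 FP32 ops/clock as the figure printed on the slide. The function and variable names are just for illustration.

```python
# Assumed figures (not from the post itself):
XE_CORES = 128             # assumed Xe-core count for full PVC
ENGINES_PER_XE_CORE = 8    # assumed Vector Engines per Xe-core
SLIDE_OPS_PER_CLOCK = 256  # assumed FP32 ops/clock figure shown on the slide
TARGET_FLOPS = 45e12       # 45 TFLOPS FP32/FP64


def implied_clock_hz(flops, units, ops_per_clock_per_unit):
    """Clock needed to reach `flops` with `units` units at the given ops/clock each."""
    return flops / (units * ops_per_clock_per_unit)


# Misreading: the slide figure applies to every Vector Engine.
per_engine_reading = implied_clock_hz(
    TARGET_FLOPS, XE_CORES * ENGINES_PER_XE_CORE, SLIDE_OPS_PER_CLOCK
)

# Correct reading: the slide figure applies to a whole Xe-core.
per_xe_core_reading = implied_clock_hz(TARGET_FLOPS, XE_CORES, SLIDE_OPS_PER_CLOCK)

print(f"per-engine reading:  {per_engine_reading / 1e6:.0f} MHz")
print(f"per-Xe-core reading: {per_xe_core_reading / 1e9:.3f} GHz")
```

With those assumptions the two readings differ by exactly the factor of 8 engines per Xe-core, which is why the first result looked so implausibly low.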