[website] Expected performance results on website are mostly wrong #520

Closed
sbryngelson opened this issue Jul 13, 2024 · 2 comments · Fixed by #537
Labels
documentation (Improvements or additions to documentation) · website (Website changes)

Comments

sbryngelson (Member) commented Jul 13, 2024

https://mflowcode.github.io/documentation/md_expectedPerformance.html

Most of these numbers are incorrect, and it's unclear where things went wrong. @wilfonba and I have already confirmed that the A100 numbers are off.

One comment: this page should include an example of exactly how to run the performance test locally, for both CPU and GPU; see the commands just below.
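Something like this (the CPU command is what I used; per the note above, GPU cases add `--gpu` — exact flag placement may need checking):

```shell
# CPU: 8 ranks, 8 build jobs, case-optimized pre_process + simulation
./mfc.sh run -n 8 -j 8 ./examples/3D_performance_test/case.py --case-optimization -t pre_process simulation

# GPU: same invocation with --gpu added
./mfc.sh run -n 8 -j 8 ./examples/3D_performance_test/case.py --case-optimization -t pre_process simulation --gpu
```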

I ran the 3D_performance_test example with 1M, 4M, and 8M grid points on my M1 Max (8 cores, gfortran 14.1.0) and got:

  • 1M GPs (100^3): Performance: 74.107741811522786 ns/gp/eq/rhs
  • 4M GPs (159^3): Performance: 70.347097355807136 ns/gp/eq/rhs
  • 8M GPs (200^3): Performance: 71.969625308176333 ns/gp/eq/rhs

which is about 5x faster than what's on the website for the M2 chip. The M1 Max is probably faster than the M2 for this workload, but not 5x faster. Again, @wilfonba reproduced this discrepancy on NVIDIA A100s. These results should all be updated.
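For anyone sanity-checking these figures: my reading of the metric (an assumption on my part; the authoritative definition is in the simulation timing code) is wall time per RHS evaluation normalized by grid points and equations,

$$ \text{perf} = \frac{T_\text{rhs} \times 10^9}{N_\text{gp}\, N_\text{eq}} \;\; \text{ns/gp/eq/rhs}, $$

so at 72 ns/gp/eq/rhs on the 200^3 case ($8\times10^6$ grid points) with, say, $N_\text{eq} = 6$ (an illustrative count, not necessarily this case's), one RHS evaluation takes roughly $72\times10^{-9} \cdot 8\times10^6 \cdot 6 \approx 3.5$ s.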

We can drop the Summit performance results in favor of generic V100 test results. We also don't need separate 1M, 4M, and 8M grid point cases; the numbers are nearly identical across sizes. I think we should converge on 8M grid points (a 200^3 simulation) for all performance tests, which is big enough to be meaningful but not big enough to overwhelm the memory of any real device.

Open to other suggestions!

sbryngelson added the documentation and website labels Jul 13, 2024
sbryngelson (Member Author) commented Jul 13, 2024

I'm gathering more data, all with 8M grid points; this is everything I have so far. I didn't run a test on Frontier, but we should update that number too.

| Device | Configuration | Performance (ns/gp/eq/rhs) |
|---|---|---|
| Intel Xeon Gold 6226 (Cascade Lake) @ 2.70 GHz, 12-core CPU (Phoenix) | best performance using 12 cores; Intel oneAPI 2022.1.0 | 151.599077472947 |
| AMD EPYC 7713 (Milan), 64-core CPU | best performance using 32 cores; GCC 12.1.0 | 137.48353539352445 |
| Apple M1 Max, 8 cores | GCC 14.1 | 71.969625308176333 |
| RTX6000 (single-precision GPU upconverting to DP in software), Phoenix | NVHPC 22.11 | 3.851041689413657 |
| A40 (single-precision GPU upconverting to DP in software), NCSA Delta | NVHPC 22.11 | 3.316569112456631 |
| MI250X, 1 GCD | CCE 16.0.1 | 1.0871197509246793 |
| A30, RG | NVHPC 24.1 | 1.055906093866407 |
| V100-32GB, Phoenix | NVHPC 24.5 | 0.9892712201437496 |
| A100-80GB, Phoenix | NVHPC 22.11 | 0.6163026871295073 |
| H100 80GB PCIe, Rogues Gallery | NVHPC 24.5 | 0.4362547841810634 |
| GH200 (only the GPU is used), Rogues Gallery | NVHPC 24.1 | 0.3201266592472489 |
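For whoever regenerates the website table: each figure above is just the `Performance:` line from the simulation output. A quick way to grab it (a sketch, assuming the line format matches what I pasted above and the log goes to stdout):

```shell
# Run the 8M-point case and extract the performance figure from the log
./mfc.sh run -n 8 -j 8 ./examples/3D_performance_test/case.py --case-optimization -t pre_process simulation \
  | grep -Eo 'Performance: *[0-9.]+ *ns/gp/eq/rhs'
```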

sbryngelson (Member Author) commented Jul 24, 2024

I want to add the A40 and RTX____ (single-precision GPUs that convert to DP in software) to this list.

Update: added. I'd also like to add MI100 and MI210 numbers if possible; working on it.

sbryngelson linked a pull request Jul 27, 2024 that will close this issue