Hi,
I am running WRF simulations on multiple nodes and am running into
problems where the simulation will randomly slow down. The model still
works, but slows down tremendously. I looked at each node and
found that one node is using only 25% of the CPU, while the others
are using 100%. Is there a chance that this is related to MPI? I can
resubmit the same run on different nodes and sometimes it will work,
and other times it slows down.
Are there any commands I can use that could point me to what is causing that node to use only 25%?
Thanks