Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Bad parallel scaling using Code Saturne with openmpi
From: Dugenoux Albert (dugenouxa_at_[hidden])
Date: 2012-07-11 16:21:50

Hi.

To answer the different remarks:

1) Code Saturne itself launches embedded Python and bash scripts that set the mpiexec parameters, but I will test your parameter next week and will give you the result of this benchmark.

2) I do not think there is a problem with load balancing: Code Saturne partitions the mesh itself with the reliable and well-known Metis graph-partitioning library, so the CPUs are equally busy.

3) The CPUs are Xeons, which have multithreading capabilities. However, I have tested this by setting np=24 in the server_priv/nodes file of the PBS server and compared that with a configuration of np=12. The results are very similar: there is no gain of 20% or 30%.

4) I will examine the hardware options as you have suggested, but I will have to convince my office of such an investment!

________________________________
From: Gus Correa <gus_at_[hidden]>
To: Open MPI Users <users_at_[hidden]>
Sent: Wednesday, July 11, 2012, 00:51
Subject: Re: [OMPI users] Bad parallel scaling using Code Saturne with openmpi

On 07/10/2012 05:31 PM, Jeff Squyres wrote:
> +1.  Also, not all Ethernet switches are created equal --
> particularly commodity 1GB Ethernet switches.
> I've seen plenty of crappy Ethernet switches rated for 1GB
> that could not reach that speed when under load.
>

Are you perhaps belittling my dear $43 [brand undisclosed] 5-port GigE
SoHo switch, that connects my Pentium-III toy cluster, just because it
drops a few packets [per microsec]?
It looks so good, with all those fiercely blinking green LEDs.
Where else could I fool around with cluster setup and test the new
Open MPI releases? :)
The production cluster is just too crowded for this, maybe because it
has a decent HP GigE switch [IO] and Infiniband [MPI] ...

Gus

>
>
> On Jul 10, 2012, at 10:47 AM, Ralph Castain wrote:
>
>> I suspect it mostly reflects communication patterns. I don't know
>> anything about Saturne, but shared memory is a great deal faster
>> than TCP, so the more processes sharing a node the better. You may
>> also be hitting some natural boundary in your model - perhaps with
>> 8 processes/node you wind up with more processes that cross the node
>> boundary, further increasing the communication requirement.
>>
>> Do things continue to get worse if you use all 4 nodes with 6 processes/node?
>>
>>
>> On Jul 10, 2012, at 7:31 AM, Dugenoux Albert wrote:
>>
>>> Hi.
>>>
>>> I have recently built a cluster upon a Dell PowerEdge server with a
>>> Debian 6.0 OS. This server is composed of 4 system boards of 2
>>> hexa-core processors each, so it gives 12 cores per system board.
>>> The boards are linked with a local Gbit switch.
>>>
>>> In order to parallelize the software Code Saturne, which is a CFD
>>> solver, I have configured the cluster such that there is a PBS
>>> server/mom on 1 system board and a mom on each of the 3 other boards.
>>> So this leads to 48 cores dispatched over 4 nodes of 12 CPUs.
>>> Code Saturne is compiled with the Open MPI 1.6 version.
>>>
>>> When I launch a simulation using 2 nodes with 12 cores, elapsed time
>>> is good and the network traffic is not full. But when I launch the
>>> same simulation using 3 nodes with 8 cores, elapsed time is 5 times
>>> the previous one. In both cases, I use 24 cores and the network does
>>> not seem to be saturated.
>>>
>>> I have tested several configurations: binaries on a local file system
>>> or on NFS. But the results are the same. I have visited several
>>> forums (in particular
>>> and read lots of threads, but as I am not an expert at clusters,
>>> I presently do not see where it is wrong!
>>>
>>> Is it a problem in the configuration of PBS (I have installed it from
>>> the deb packages), subtle compilation options of Open MPI, or a bad
>>> network configuration?
>>>
>>> Regards.
>>>
>>> B. S.

_______________________________________________
users mailing list
users_at_[hidden]
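For reference, a minimal sketch of how the 24-rank runs discussed above could be launched and checked by hand with Open MPI 1.6's mpiexec. The -np, -npernode, --bind-to-core and --report-bindings options are standard Open MPI 1.6 flags; the solver binary name ./cs_solver is a placeholder, since Code Saturne normally generates its own mpiexec command from its scripts.

    # 2 nodes x 12 ranks/node: most ranks talk over shared memory within a board
    mpiexec -np 24 -npernode 12 --bind-to-core --report-bindings ./cs_solver

    # 3 nodes x 8 ranks/node: same 24 ranks, but more traffic crosses the GigE switch
    mpiexec -np 24 -npernode 8 --bind-to-core --report-bindings ./cs_solver

    # Ralph's suggestion: all 4 nodes with 6 ranks/node
    mpiexec -np 24 -npernode 6 --bind-to-core --report-bindings ./cs_solver

--report-bindings prints where each rank ends up, so one can confirm that ranks are pinned to distinct physical cores rather than floating across hyper-threaded logical cores.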
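Regarding point 3, the per-node slot count mentioned there lives in the Torque/PBS server_priv/nodes file. A sketch of the np=12 configuration, assuming the usual Torque install path (/var/spool/torque) and placeholder hostnames node01 through node04:

    $ cat /var/spool/torque/server_priv/nodes
    node01 np=12
    node02 np=12
    node03 np=12
    node04 np=12

With np=12 each node exposes only its physical cores to the scheduler; with np=24 the hyper-threaded logical cores become schedulable as well, which rarely helps a memory-bandwidth-bound CFD solver and is consistent with the similar timings reported above.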