It depends on the application you are running. Some are "balanced", i.e., they run faster when the number of processes is a power of two. In your numbers n4 is faster than n3, n8 is faster than n7, and n16 is faster than n15, so that is likely what is happening here.
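
One quick way to check this against the timings quoted below is to average the repeats for each process count and compare each power of two with the count just below it. This is only a minimal sketch in Python, with a subset of the quoted values copied in; the names `timings`, `is_power_of_two`, and `mean_by_count` are made up for illustration.

```python
from statistics import mean

# Subset of the timings quoted below: process count -> repeated measurements.
timings = {
    3: [8.2069, 8.1628, 8.1311],
    4: [7.5307, 7.5323, 7.5858],
    7: [10.640, 10.650, 10.638],
    8: [8.6822, 8.6630, 8.6903],
    15: [14.533, 14.529, 14.586],
    16: [8.6542, 8.6731, 8.6586],
}

def is_power_of_two(n):
    """True for 1, 2, 4, 8, ... (exactly one bit set)."""
    return n > 0 and (n & (n - 1)) == 0

def mean_by_count(data):
    """Average the repeated runs for each process count."""
    return {n: mean(runs) for n, runs in data.items()}

means = mean_by_count(timings)
for n in sorted(means):
    # Compare each power-of-two count with the count just below it.
    if is_power_of_two(n) and (n - 1) in means:
        ratio = means[n - 1] / means[n]
        print(f"n{n - 1} / n{n} = {ratio:.2f}")
```

A ratio above 1 means the power-of-two run was faster than its neighbor, which is the case for all three pairs here.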


On Jun 6, 2013, at 4:10 PM, "Blosch, Edwin L" <edwin.l.blosch@lmco.com> wrote:

I am running single-node Sandy Bridge cases with OpenMPI and looking at scaling.
 
I’m using --bind-to-core without any other options (the default is --bycore, I believe).
 
In the labels below, the number of cores comes first and the number after the dot is the run number (except for n=1, all runs were repeated 3 times). Any thoughts on why n15 should be so much slower than n16? I also measured the RSS of the running processes: in the n=15 cases the rank 0 process uses about 2x more memory than all the other ranks, whereas in the n=16 cases all the ranks use the same amount of memory.
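
For reference, one way to take a per-process memory reading like the one described above is to have each rank report its own peak resident set size. The sketch below uses only the Python standard library and is an illustration, not the method used for the measurements in this message. `peak_rss_kib` is a made-up helper name, the rank is read from the `OMPI_COMM_WORLD_RANK` environment variable that Open MPI's launcher sets (it falls back to "?" when run outside mpirun), and the kilobyte unit of `ru_maxrss` is the Linux behaviour.

```python
import os
import resource
import sys

def peak_rss_kib():
    """Peak resident set size of the calling process, in KiB.

    On Linux ru_maxrss is reported in kilobytes; on macOS it is in
    bytes, so convert there to keep the unit consistent.
    """
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    if sys.platform == "darwin":
        peak //= 1024
    return peak

# Rank as exported by Open MPI's mpirun; "?" if not launched under it.
rank = os.environ.get("OMPI_COMM_WORLD_RANK", "?")

before = peak_rss_kib()
# Allocate and write 16 MiB so the peak is likely to rise.
payload = bytearray(b"x" * (16 * 1024 * 1024))
after = peak_rss_kib()

print(f"rank {rank}: peak RSS {before} KiB -> {after} KiB")
```

Printing this from every rank at the end of a run would show whether rank 0 stands out from the others.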
 
Thanks for any insights,
 
Ed
 
n1.1:    6.9530   
n2.1:    7.0185   
n2.2:    7.0313   
n3.1:    8.2069
n3.2:    8.1628   
n3.3:    8.1311   
n4.1:    7.5307   
n4.2:    7.5323   
n4.3:    7.5858   
n5.1:    9.5693   
n5.2:    9.5104   
n5.3:    9.4821   
n6.1:    8.9821   
n6.2:    8.9720   
n6.3:    8.9541   
n7.1:    10.640   
n7.2:    10.650   
n7.3:    10.638   
n8.1:    8.6822   
n8.2:    8.6630   
n8.3:    8.6903   
n9.1:    9.5058   
n9.2:    9.5255   
n9.3:    9.4809   
n10.1:    10.484    
n10.2:    10.452    
n10.3:    10.516    
n11.1:    11.327    
n11.2:    11.316    
n11.3:    11.318    
n12.1:    12.285    
n12.2:    12.303    
n12.3:    12.272    
n13.1:    13.127    
n13.2:    13.113    
n13.3:    13.113    
n14.1:    14.035    
n14.2:    13.989    
n14.3:    14.021    
n15.1:    14.533    
n15.2:    14.529    
n15.3:    14.586    
n16.1:    8.6542    
n16.2:    8.6731    
n16.3:    8.6586    