Open MPI User's Mailing List Archives

From: Jeff Pummill (jpummil_at_[hidden])
Date: 2007-06-11 09:06:29


Glad to contribute Victor!

I am running on a home workstation with an AMD 3800 CPU and 2 GB of RAM.
My timings for FT were 175 seconds on one core and 110 seconds on two
cores, compiled with -O3 and -mtune=amd64.
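
In case it is useful, here is a rough sketch of how a class B FT build
usually looks (this assumes the stock NPB3.2-MPI tree; the variable names
come from its config/make.def template, so adjust them for your own
compilers and flags):

    # config/make.def (start from config/make.def.template)
    MPIF77     = mpif90            # Open MPI Fortran wrapper compiler
    FFLAGS     = -O3 -mtune=amd64
    FLINKFLAGS = -O3

    # build FT, class B, for 2 processes, then run it
    make ft CLASS=B NPROCS=2
    mpirun -np 2 bin/ft.B.2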

Brock, Terry and Jeff are all exactly correct in their comments
regarding benchmarks. There are simply too many variables to contend
with. In addition, one- and two-core runs on a single workstation
probably aren't the best evaluation of Open MPI. As you expand to more
nodes and run bigger problems (HPL or HPCC, for example), a better
overall picture will emerge.
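
If you do get access to more machines, the usual next step looks
something like the sketch below (the hostnames are made up; Open MPI
reads them from a hostfile passed with --hostfile):

    # hostfile: one machine per line, slots = cores to use on it
    node01 slots=2
    node02 slots=2

    mpirun --hostfile hostfile -np 4 ./ft.B.4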

Jeff F. Pummill
Senior Linux Cluster Administrator
University of Arkansas

victor marian wrote:
> Thank you everybody for the advice.
> I ran the NAS benchmark, class B, and it runs in 181 seconds on one
> core and in 90 seconds on two cores, so it scales almost perfectly.
> What were your timings, Jeff, and exactly what processor do you have?
> Mine is a Pentium D at 2.8 GHz.
>
> Victor
>
>
> --- Jeff Pummill <jpummil_at_[hidden]> wrote:
>
>
>> Victor,
>>
>> Build the FT benchmark as a class B problem. It will run in the
>> 1-2 minute range, instead of the 2-4 seconds the CG class A benchmark
>> takes.
>>
>>
>> Jeff F. Pummill
>> Senior Linux Cluster Administrator
>> University of Arkansas
>>
>>
>>
>> Terry Frankcombe wrote:
>>
>>> Hi Victor
>>>
>>> I'd suggest 3 seconds of CPU time is far, far too small a problem to
>>> do scaling tests with. Even with only 2 CPUs, I wouldn't go below 100
>>> times that.
>>>
>>> On Mon, 2007-06-11 at 01:10 -0700, victor marian wrote:
>>>
>>>> Hi Jeff
>>>>
>>>> I ran the NAS Parallel Benchmark and it gives for me:
>>>>
>>>> -bash%/export/home/vmarian/fortran/benchmarks/NPB3.2/NPB3.2-MPI/bin$
>>>> mpirun -np 1 cg.A.1
>>>>
>>>>
>>>> --------------------------------------------------------------------------
>>>> [0,1,0]: uDAPL on host SERVSOLARIS was unable to find any NICs.
>>>> Another transport will be used instead, although this may result in
>>>> lower performance.
>>>> --------------------------------------------------------------------------
>>>> NAS Parallel Benchmarks 3.2 -- CG Benchmark
>>>>
>>>> Size: 14000
>>>> Iterations: 15
>>>> Number of active processes: 1
>>>> Number of nonzeroes per row: 11
>>>> Eigenvalue shift: .200E+02
>>>> Benchmark completed
>>>> VERIFICATION SUCCESSFUL
>>>> Zeta is 0.171302350540E+02
>>>> Error is 0.512264003323E-13
>>>>
>>>>
>>>> CG Benchmark Completed.
>>>> Class = A
>>>> Size = 14000
>>>> Iterations = 15
>>>> Time in seconds = 3.02
>>>> Total processes = 1
>>>> Compiled procs = 1
>>>> Mop/s total = 495.93
>>>> Mop/s/process = 495.93
>>>> Operation type = floating point
>>>> Verification = SUCCESSFUL
>>>> Version = 3.2
>>>> Compile date = 11 Jun 2007
>>>>
>>>>
>>>>
>>>>
>>>> -bash%/export/home/vmarian/fortran/benchmarks/NPB3.2/NPB3.2-MPI/bin$
>>>> mpirun -np 2 cg.A.2
>>>>
>>>> --------------------------------------------------------------------------
>>>> [0,1,0]: uDAPL on host SERVSOLARIS was unable to find any NICs.
>>>> Another transport will be used instead, although this may result in
>>>> lower performance.
>>>> --------------------------------------------------------------------------
>>>>
>>>> --------------------------------------------------------------------------
>>>> [0,1,1]: uDAPL on host SERVSOLARIS was unable to find any NICs.
>>>> Another transport will be used instead, although this may result in
>>>> lower performance.
>>>> --------------------------------------------------------------------------
>>>> NAS Parallel Benchmarks 3.2 -- CG Benchmark
>>>>
>>>> Size: 14000
>>>> Iterations: 15
>>>> Number of active processes: 2
>>>> Number of nonzeroes per row: 11
>>>> Eigenvalue shift: .200E+02
>>>>
>>>> Benchmark completed
>>>> VERIFICATION SUCCESSFUL
>>>> Zeta is 0.171302350540E+02
>>>> Error is 0.522633719989E-13
>>>>
>>>>
>>>> CG Benchmark Completed.
>>>> Class = A
>>>> Size = 14000
>>>> Iterations = 15
>>>> Time in seconds = 2.47
>>>> Total processes = 2
>>>> Compiled procs = 2
>>>> Mop/s total = 606.32
>>>> Mop/s/process = 303.16
>>>> Operation type = floating point
>>>> Verification = SUCCESSFUL
>>>> Version = 3.2
>>>> Compile date = 11 Jun 2007
>>>>
>>>>
>>>> You will notice that the scaling is not as good as yours. Maybe I am
>>>> having communication problems between processors.
>>>> You can also see that I am faster on one process compared to your
>>>> processor.
>>>>
>>>> Victor
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> --- Jeff Pummill <jpummil_at_[hidden]> wrote:
>>>>
>>>>
>>>>
>>>>> Perfect! Thanks Jeff!
>>>>>
>>>>> The NAS Parallel Benchmark on a dual core AMD
>>>>> machine now returns this...
>>>>> [jpummil_at_localhost bin]$ mpirun -np 1 cg.A.1
>>>>> NAS Parallel Benchmarks 3.2 -- CG Benchmark
>>>>> CG Benchmark Completed.
>>>>> Class = A
>>>>> Size = 14000
>>>>> Iterations = 15
>>>>> Time in seconds = 4.75
>>>>> Total processes = 1
>>>>> Compiled procs = 1
>>>>> Mop/s total = 315.32
>>>>>
>>>>> ...and...
>>>>>
>>>>> [jpummil_at_localhost bin]$ mpirun -np 2 cg.A.2
>>>>> NAS Parallel Benchmarks 3.2 -- CG Benchmark
>>>>> CG Benchmark Completed.
>>>>> Class = A
>>>>> Size = 14000
>>>>> Iterations = 15
>>>>> Time in seconds = 2.48
>>>>> Total processes = 2
>>>>> Compiled procs = 2
>>>>> Mop/s total = 604.46
>>>>>
>>>>> Not quite linear, but one must account for all of the OS traffic
>>>>> that one core or the other must deal with.
>>>>>
>>>>>
>>>>> Jeff F. Pummill
>>>>> Senior Linux Cluster Administrator
>>>>> University of Arkansas
>>>>> Fayetteville, Arkansas 72701
>>>>> (479) 575 - 4590
>>>>> http://hpc.uark.edu
>>>>>
>>>>> "A supercomputer is a device for turning
>>>>> compute-bound
>>>>> problems into I/O-bound problems." -Seymour Cray
>>>>>
>>>>>
>>>>> Jeff Squyres wrote:
>>>>>
>>>>>
>>>>>> Just remove the -L and -l arguments -- OMPI's "mpif90" (and other
>>>>>> wrapper compilers) will do all that magic for you.
> === message truncated ===
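
On the mpif90 point in the quoted message above: if you ever want to see
exactly what the wrapper adds, Open MPI's wrapper compilers accept a
--showme option that prints the underlying command line instead of running
it (myprog.f is just a placeholder name here):

    mpif90 --showme               # full compile/link command the wrapper would use
    mpif90 --showme:link          # only the link flags it adds
    mpif90 -O3 myprog.f -o myprog # normal use; no extra -L/-l needed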