Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Question regarding osu-benchamarks 3.1.1
From: Jeffrey Squyres (jsquyres_at_[hidden])
Date: 2012-02-29 12:56:14


FWIW, I'm immediately suspicious of *any* MPI application that uses the MPI one-sided operations (i.e., MPI_PUT and MPI_GET). It looks like these two OSU benchmarks are using those operations.

Is it known that these two benchmarks are correct?
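
One rough way to look at the numbers in the output quoted below before digging into the stack itself: the following Python sketch parses the size/bandwidth rows and reports where bandwidth falls as the message size grows, plus the last size that completed. It is not part of the OSU suite or Open MPI; the function names (`parse_rows`, `bandwidth_drops`, `last_size`) are made up for illustration, and it assumes the two-column "size bandwidth" row format shown in the quoted output.

```python
import re

# Matches OSU data rows such as "> 8192 16.19".
# The leading "> " quote marker from the email is optional.
ROW = re.compile(r"^\s*(?:>\s*)?(\d+)\s+(\d+(?:\.\d+)?)\s*$")


def parse_rows(text):
    """Return [(message_size_bytes, bandwidth_mb_s), ...] from benchmark output."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows.append((int(m.group(1)), float(m.group(2))))
    return rows


def bandwidth_drops(rows):
    """Return (size, previous_bw, bw) wherever bandwidth is lower than at the previous size."""
    return [
        (size, prev_bw, bw)
        for (_, prev_bw), (size, bw) in zip(rows, rows[1:])
        if bw < prev_bw
    ]


def last_size(rows):
    """Largest message size that produced a result, or None if there were no rows."""
    return rows[-1][0] if rows else None


if __name__ == "__main__":
    # A few rows copied from the osu_put_bibw output quoted in this thread.
    sample = """\
> 4096 8.10
> 8192 16.19
> 16384 8.46
> 32768 20.34
"""
    rows = parse_rows(sample)
    for size, prev_bw, bw in bandwidth_drops(rows):
        print(f"{size}: {prev_bw} -> {bw} MB/s")
    print("last completed size:", last_size(rows))
```

Fed the full quoted output, this would show the last completed size as 524288 in both runs, which suggests the hang is at the next message size (presumably 1 MB, if the benchmark keeps doubling). The bandwidth dips it reports are only a hint about where to look, not proof of a bug in either the benchmark or the stack.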

On Feb 29, 2012, at 11:33 AM, Venkateswara Rao Dokku wrote:

> Sorry, I forgot to describe the system. We are running a customized OFED stack implemented to work on our specific hardware. We tested the stack with qperf and the Intel MPI Benchmarks (IMB 3.2.2), and both ran fine. We now want to run the osu_benchmarks-3.1.1 suite on our OFED stack.
>
> On Wed, Feb 29, 2012 at 9:57 PM, Venkateswara Rao Dokku <dvrao.584_at_[hidden]> wrote:
> Hi,
> I tried running the osu_benchmarks-3.1.1 suite with openmpi-1.4.3. I could run 10 of the 14 benchmark tests in the suite; the remaining four (osu_put_bibw, osu_put_bw, osu_get_bw, osu_latency_mt) hang at some message size. The output is shown below.
>
> [root_at_test2 ~]# mpirun --prefix /usr/local/ -np 2 --mca btl openib,self,sm -H 192.168.0.175,192.168.0.174 --mca orte_base_help_aggregate 0 /root/ramu/ofed_pkgs/osu_benchmarks-3.1.1/osu_put_bibw
> failed to create doorbell file /dev/plx2_char_dev
> --------------------------------------------------------------------------
> WARNING: No preset parameters were found for the device that Open MPI
> detected:
>
> Local host: test1
> Device name: plx2_0
> Device vendor ID: 0x10b5
> Device vendor part ID: 4277
>
> Default device parameters will be used, which may result in lower
> performance. You can edit any of the files specified by the
> btl_openib_device_param_files MCA parameter to set values for your
> device.
>
> NOTE: You can turn off this warning by setting the MCA parameter
> btl_openib_warn_no_device_params_found to 0.
> --------------------------------------------------------------------------
> failed to create doorbell file /dev/plx2_char_dev
> --------------------------------------------------------------------------
> WARNING: No preset parameters were found for the device that Open MPI
> detected:
>
> Local host: test2
> Device name: plx2_0
> Device vendor ID: 0x10b5
> Device vendor part ID: 4277
>
> Default device parameters will be used, which may result in lower
> performance. You can edit any of the files specified by the
> btl_openib_device_param_files MCA parameter to set values for your
> device.
>
> NOTE: You can turn off this warning by setting the MCA parameter
> btl_openib_warn_no_device_params_found to 0.
> --------------------------------------------------------------------------
> alloc_srq max: 512 wqe_shift: 5
> alloc_srq max: 512 wqe_shift: 5
> alloc_srq max: 512 wqe_shift: 5
> alloc_srq max: 512 wqe_shift: 5
> alloc_srq max: 512 wqe_shift: 5
> alloc_srq max: 512 wqe_shift: 5
> # OSU One Sided MPI_Put Bi-directional Bandwidth Test v3.1.1
> # Size Bi-Bandwidth (MB/s)
> plx2_create_qp line: 415
> plx2_create_qp line: 415
> plx2_create_qp line: 415
> plx2_create_qp line: 415
> 1 0.00
> 2 0.00
> 4 0.01
> 8 0.03
> 16 0.07
> 32 0.15
> 64 0.11
> 128 0.21
> 256 0.43
> 512 0.88
> 1024 2.10
> 2048 4.21
> 4096 8.10
> 8192 16.19
> 16384 8.46
> 32768 20.34
> 65536 39.85
> 131072 84.22
> 262144 142.23
> 524288 234.83
> mpirun: killing job...
>
> --------------------------------------------------------------------------
> mpirun noticed that process rank 0 with PID 7305 on node test2 exited on signal 0 (Unknown signal 0).
> --------------------------------------------------------------------------
> 2 total processes killed (some possibly by mpirun during cleanup)
> mpirun: clean termination accomplished
>
> [root_at_test2 ~]# mpirun --prefix /usr/local/ -np 2 --mca btl openib,self,sm -H 192.168.0.175,192.168.0.174 --mca orte_base_help_aggregate 0 /root/ramu/ofed_pkgs/osu_benchmarks-3.1.1/osu_put_bw
> failed to create doorbell file /dev/plx2_char_dev
> --------------------------------------------------------------------------
> WARNING: No preset parameters were found for the device that Open MPI
> detected:
>
> Local host: test1
> Device name: plx2_0
> Device vendor ID: 0x10b5
> Device vendor part ID: 4277
>
> Default device parameters will be used, which may result in lower
> performance. You can edit any of the files specified by the
> btl_openib_device_param_files MCA parameter to set values for your
> device.
>
> NOTE: You can turn off this warning by setting the MCA parameter
> btl_openib_warn_no_device_params_found to 0.
> --------------------------------------------------------------------------
> failed to create doorbell file /dev/plx2_char_dev
> --------------------------------------------------------------------------
> WARNING: No preset parameters were found for the device that Open MPI
> detected:
>
> Local host: test2
> Device name: plx2_0
> Device vendor ID: 0x10b5
> Device vendor part ID: 4277
>
> Default device parameters will be used, which may result in lower
> performance. You can edit any of the files specified by the
> btl_openib_device_param_files MCA parameter to set values for your
> device.
>
> NOTE: You can turn off this warning by setting the MCA parameter
> btl_openib_warn_no_device_params_found to 0.
> --------------------------------------------------------------------------
> alloc_srq max: 512 wqe_shift: 5
> alloc_srq max: 512 wqe_shift: 5
> alloc_srq max: 512 wqe_shift: 5
> alloc_srq max: 512 wqe_shift: 5
> alloc_srq max: 512 wqe_shift: 5
> alloc_srq max: 512 wqe_shift: 5
> # OSU One Sided MPI_Put Bandwidth Test v3.1.1
> # Size Bandwidth (MB/s)
> plx2_create_qp line: 415
> plx2_create_qp line: 415
> plx2_create_qp line: 415
> plx2_create_qp line: 415
> 1 0.02
> 2 0.05
> 4 0.10
> 8 0.19
> 16 0.39
> 32 0.77
> 64 1.53
> 128 2.57
> 256 4.16
> 512 8.30
> 1024 16.62
> 2048 33.22
> 4096 66.51
> 8192 42.45
> 16384 11.99
> 32768 18.20
> 65536 76.04
> 131072 98.64
> 262144 407.66
> 524288 489.84
> mpirun: killing job...
>
> --------------------------------------------------------------------------
> mpirun noticed that process rank 0 with PID 7314 on node test2 exited on signal 0 (Unknown signal 0).
> --------------------------------------------------------------------------
> 2 total processes killed (some possibly by mpirun during cleanup)
> mpirun: clean termination accomplished
>
> I checked the logs as well, but I couldn't see any errors.
> Could you suggest a way to overcome or debug this issue?
>
> Thanks for your kind reply.
>
>
> --
> Thanks & Regards,
> D.Venkateswara Rao,
> Software Engineer,One Convergence Devices Pvt Ltd.,
> Jubille Hills,Hyderabad.

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/