Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Bus Error (7) on PS3 running HPL (OpenMPI 1.2.8)
From: vipin kumar (vipinkumar41_at_[hidden])
Date: 2009-08-07 00:12:42


Maybe it is because of insufficient storage space? (I mean hard disk
space.)
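
Here is a quick way to check that guess. The backtrace quoted below fails inside HPL_hpalloc, so it may be worth ruling out a shortage of free huge pages at the same time. This is only a minimal sketch, assuming a Linux system that exposes the usual HugePages_* fields in /proc/meminfo; the function names parse_hugepages and free_disk_mib are my own and not part of Open MPI or HPL.

```python
import shutil


def parse_hugepages(meminfo_text):
    """Return the HugePages_* and Hugepagesize fields of /proc/meminfo as ints.

    Assumes the usual "Key:   value [unit]" layout of /proc/meminfo.
    """
    fields = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and key.startswith(("HugePages_", "Hugepagesize")):
            fields[key] = int(parts[0])
    return fields


def free_disk_mib(path="/"):
    """Free space, in MiB, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free // (1024 * 1024)


if __name__ == "__main__":
    print("free disk (MiB):", free_disk_mib("/"))
    try:
        with open("/proc/meminfo") as f:
            print("huge pages:", parse_hugepages(f.read()))
    except OSError as exc:
        # /proc/meminfo is Linux-specific; report instead of crashing elsewhere.
        print("could not read /proc/meminfo:", exc)
```

If HugePages_Free is 0 just before xhpl starts, the huge page pool rather than the disk would be the first thing I would look at.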

On Thu, Aug 6, 2009 at 11:23 PM, Jeff Squyres <jsquyres_at_[hidden]> wrote:

> Any chance you could re-try the experiment with Open MPI 1.3.3?
>
>
> On Aug 4, 2009, at 11:10 AM, Hoelzlwimmer Andreas - S0810595005 wrote:
>
>> Hello,
>>
>> I wanted to run MPI on a couple of PS3s here. According to a colleague
>> who set it up, I had to set several HugePages. As the PS3 RAM is limited I
>> had to allocate 2 HugePages. I ran HPL at first with the following command
>> (out of a tutorial):
>> mpirun --mca btl_openib_want_fork_support 0 -np 1 numactl --physcpubind=0
>> ./xhpl : -np 1 numactl --physcpubind=1 ./xhpl
>>
>> Now, as I had very little memory, I had to disable some services. I did so
>> (Wifi service, Bluetooth, printing, and other unneeded ones). After running
>> the same command again, I got an error message (see below). Can anyone help
>> me here? I have no idea what the error message actually means, and I can’t
>> find anything useful about it. It’s running on Yellow Dog Linux, using
>> OpenMPI 1.2.8.
>>
>> Cheers,
>> Andreas Hoelzlwimmer
>>
>> Error Message:
>> [PS02:04815] *** Process received signal ***
>> [PS02:04815] Signal: Bus error (7)
>> [PS02:04815] Signal code: (2)
>> [PS02:04815] Failing at address: 0x4000ca78008
>> [PS02:04816] *** Process received signal ***
>> [PS02:04816] Signal: Bus error (7)
>> [PS02:04816] Signal code: (2)
>> [PS02:04816] Failing at address: 0x4000ca78008
>> [PS02:04816] [ 0] [0x1003e8]
>> [PS02:04816] [ 1] ./xhpl(HPL_hpalloc-0x17cc8c) [0x1001103c]
>> [PS02:04816] [ 2] ./xhpl(HPL_pdtest-0x17da40) [0x100101f8]
>> [PS02:04816] [ 3] ./xhpl(main-0x182f2c) [0x1000acdc]
>> [PS02:04816] [ 4] /lib64/libc.so.6 [0x80ca0e966c]
>> [PS02:04816] [ 5] /lib64/libc.so.6(__libc_start_main-0x1473e0) [0x80ca0e98e8]
>> [PS02:04816] *** End of error message ***
>> [PS02:04815] [ 0] [0x1003e8]
>> [PS02:04815] [ 1] ./xhpl(HPL_hpalloc-0x17cc8c) [0x1001103c]
>> [PS02:04815] [ 2] ./xhpl(HPL_pdtest-0x17da40) [0x100101f8]
>> [PS02:04815] [ 3] ./xhpl(main-0x182f2c) [0x1000acdc]
>> [PS02:04815] [ 4] /lib64/libc.so.6 [0x80ca0e966c]
>> [PS02:04815] [ 5] /lib64/libc.so.6(__libc_start_main-0x1473e0) [0x80ca0e98e8]
>> [PS02:04815] *** End of error message ***
>> mpirun noticed that job rank 0 with PID 4815 on node PS02 exited on signal
>> 7 (Bus error).
>> 1 additional process aborted (not shown)
>> _______________________________________________
>> users mailing list
>> users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>
>
> --
> Jeff Squyres
> jsquyres_at_[hidden]
>
>

-- 
Vipin K.
Research Engineer,
C-DOTB, India