Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] SIGSEV when running OMPI Java binding
From: Saliya Ekanayake (esaliya_at_[hidden])
Date: 2014-03-13 00:01:04


Just checking if there's some solution for this.

Thank you,
Saliya

On Tue, Mar 11, 2014 at 10:54 PM, Saliya Ekanayake <esaliya_at_[hidden]> wrote:

> I forgot to mention that I tried the hello.c version instead of Java and
> it too failed in a similar manner, but
>
> 1. On a single node with --mca btl ^tcp it went up to 24 procs before
> failing
> 2. On 8 nodes with --mca btl ^tcp it could go only up to 16 procs
>
>
> On Tue, Mar 11, 2014 at 5:06 PM, Saliya Ekanayake <esaliya_at_[hidden]> wrote:
>
>> I just tested with "ml" turned off as you suggested, but unfortunately it
>> didn't solve the issue.
>>
>> However, I found that by explicitly setting --mca btl ^tcp the code
>> worked on up to 4 nodes, each running 8 procs. If I don't specify this,
>> it simply fails even on one node with 8 procs.
>>
>> Thank you,
>> Saliya
>>
>>
>> On Tue, Mar 11, 2014 at 4:35 PM, Jeff Squyres (jsquyres) <
>> jsquyres_at_[hidden]> wrote:
>>
>>> Looks like we still have a bug in one of our components -- can you try:
>>>
>>> mpirun --mca coll ^ml ...
>>>
>>> This will deactivate the "ml" collective component. See if that enables
>>> you to run (this particular component has nothing to do with Java).
>>>
>>>
>>> On Mar 11, 2014, at 1:33 AM, Saliya Ekanayake <esaliya_at_[hidden]> wrote:
>>>
>>> > Just tested that this happens even with the simple Hello.java program
>>> > given in the OMPI distribution.
>>> >
>>> > I've made a tarball containing details of the error, following
>>> > http://www.open-mpi.org/community/help/. Please let me know if I have
>>> > missed any necessary info.
>>> >
>>> > Thank you,
>>> > Saliya
>>> >
>>> >
>>> >
>>> >
>>> > On Mon, Mar 10, 2014 at 10:46 AM, Jeff Squyres (jsquyres) <jsquyres_at_[hidden]> wrote:
>>> > Greetings, and thanks for trying out our Java bindings.
>>> >
>>> > Can you provide some more details? E.g., is there a particular
>>> > program you're running that incurs these problems? Or is there even a
>>> > particular MPI function that you're using that results in this segv (e.g.,
>>> > perhaps we have a specific bug somewhere)?
>>> >
>>> > Can you reduce the segv to a small example that we can reproduce (and
>>> > therefore fix)?
>>> >
>>> >
>>> > On Mar 10, 2014, at 12:05 AM, Saliya Ekanayake <esaliya_at_[hidden]> wrote:
>>> >
>>> > > Hi,
>>> > >
>>> > > I have 8 nodes, each with 2 quad-core sockets. The nodes also have
>>> > > IB connectivity. I am trying to run the OMPI Java binding in OMPI trunk
>>> > > revision 30301 with 8 procs per node, totaling 64 procs. This gives a
>>> > > SIGSEGV error as below.
>>> > >
>>> > > I wonder if you have any suggestion to resolve this?
>>> > >
>>> > > Thank you,
>>> > > Saliya
>>> > >
>>> > > # A fatal error has been detected by the Java Runtime Environment:
>>> > > #
>>> > > # SIGSEGV (0xb) at pc=0x000000313867b75b, pid=12229, tid=47864973515072
>>> > > #
>>> > > # JRE version: Java(TM) SE Runtime Environment (8.0-b118) (build 1.8.0-ea-b118)
>>> > > # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.0-b60 mixed mode linux-amd64 compressed oops)
>>> > > # Problematic frame:
>>> > > # C [libc.so.6+0x7b75b] memcpy+0x15b
>>> > >
>>> > >
>>> > > --
>>> > > Saliya Ekanayake esaliya_at_[hidden]
>>> > > http://saliya.org
>>> > > _______________________________________________
>>> > > users mailing list
>>> > > users_at_[hidden]
>>> > > http://www.open-mpi.org/mailman/listinfo.cgi/users
>>> >
>>> >
>>> > --
>>> > Jeff Squyres
>>> > jsquyres_at_[hidden]
>>> > For corporate legal information go to:
>>> > http://www.cisco.com/web/about/doing_business/legal/cri/
>>> >
>>> >
>>> >
>>> >
>>> > --
>>> > Saliya Ekanayake esaliya_at_[hidden]
>>> > Cell 812-391-4914 Home 812-961-6383
>>> > http://saliya.org
>>> > <hellobug.tar.gz>
>>>
>>>
>>>
>>
>>
>>
>>
>
>
>
>

-- 
Saliya Ekanayake esaliya_at_[hidden]
Cell 812-391-4914 Home 812-961-6383
http://saliya.org
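
For readers retracing the thread: the two workarounds tried above both use Open MPI's MCA exclusion syntax, where a leading ^ tells a framework to skip the named component (--mca coll ^ml, --mca btl ^tcp). The sketch below only assembles such command lines so the combinations can be compared side by side. The helper names (mca_exclude, build_mpirun) are hypothetical, and the "java Hello" launch line and -np 64 are assumptions based on the Hello.java example and the 64-proc run described in the thread, not a confirmed reproduction recipe.

```python
import shlex


def mca_exclude(framework, *components):
    """Return the argv fragment that excludes components from an MCA framework.

    Hypothetical helper: mca_exclude("coll", "ml") -> ["--mca", "coll", "^ml"].
    """
    if not components:
        raise ValueError("name at least one component to exclude")
    return ["--mca", framework, "^" + ",".join(components)]


def build_mpirun(np, program, exclusions=None):
    """Assemble an mpirun argv list (hypothetical helper, nothing is executed).

    exclusions maps a framework name to the components to exclude from it.
    """
    argv = ["mpirun", "-np", str(np)]
    for framework, components in (exclusions or {}).items():
        argv += mca_exclude(framework, *components)
    return argv + list(program)


if __name__ == "__main__":
    # Assumed launch line for the Hello.java example with both workarounds.
    cmd = build_mpirun(64, ["java", "Hello"], {"coll": ["ml"], "btl": ["tcp"]})
    # shlex.join quotes the ^ so the line can be pasted into a shell safely.
    print(shlex.join(cmd))
```

Building the argument list separately from running it makes it easy to toggle one exclusion at a time, which is how the thread narrowed things down (coll ^ml alone did not help, btl ^tcp changed the failure threshold).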