Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] SIGSEV when running OMPI Java binding
From: Saliya Ekanayake (esaliya_at_[hidden])
Date: 2014-03-11 22:54:20


I forgot to mention that I also tried the hello.c version instead of Java, and
it failed in a similar manner, with these differences:

1. On a single node with --mca btl ^tcp it went up to 24 procs before
failing
2. On 8 nodes with --mca btl ^tcp it could go only up to 16 procs

On Tue, Mar 11, 2014 at 5:06 PM, Saliya Ekanayake <esaliya_at_[hidden]> wrote:

> I just tested with "ml" turned off as you suggested, but unfortunately it
> didn't solve the issue.
>
> However, I found that by explicitly setting --mca btl ^tcp the code worked
> on up to 4 nodes, each running 8 procs. If I don't specify this, it
> fails even on one node with 8 procs.
>
> Thank you,
> Saliya
>
>
> On Tue, Mar 11, 2014 at 4:35 PM, Jeff Squyres (jsquyres) <
> jsquyres_at_[hidden]> wrote:
>
>> Looks like we still have a bug in one of our components -- can you try:
>>
>> mpirun --mca coll ^ml ...
>>
>> This will deactivate the "ml" collective component. See if that enables
>> you to run (this particular component has nothing to do with Java).
>>
>>
>> On Mar 11, 2014, at 1:33 AM, Saliya Ekanayake <esaliya_at_[hidden]> wrote:
>>
>> > Just tested that this happens even with the simple Hello.java program
>> given in the OMPI distribution.
>> >
>> > I've made a tarball containing details of the error adhering to
>> http://www.open-mpi.org/community/help/. Please let me know if I have
>> missed any info necessary.
>> >
>> > Thank you,
>> > Saliya
>> >
>> >
>> >
>> >
>> > On Mon, Mar 10, 2014 at 10:46 AM, Jeff Squyres (jsquyres) <
>> jsquyres_at_[hidden]> wrote:
>> > Greetings, and thanks for trying out our Java bindings.
>> >
>> > Can you provide some more details? E.g., is there a particular program
>> you're running that incurs these problems? Or is there even a particular
>> MPI function that you're using that results in this segv (e.g., perhaps we
>> have a specific bug somewhere)?
>> >
>> > Can you reduce the segv to a small example that we can reproduce (and
>> therefore fix)?
>> >
>> >
>> > On Mar 10, 2014, at 12:05 AM, Saliya Ekanayake <esaliya_at_[hidden]>
>> wrote:
>> >
>> > > Hi,
>> > >
>> > > I have 8 nodes each with 2 quad core sockets. Also, the nodes have IB
>> connectivity. I am trying to run OMPI Java binding in OMPI trunk revision
>> 30301 with 8 procs per node totaling 64 procs. This gives a SIGSEGV error as
>> below.
>> > >
>> > > I wonder if you have any suggestion to resolve this?
>> > >
>> > > Thank you,
>> > > Saliya
>> > >
>> > > # A fatal error has been detected by the Java Runtime Environment:
>> > > #
>> > > # SIGSEGV (0xb) at pc=0x000000313867b75b, pid=12229,
>> tid=47864973515072
>> > > #
>> > > # JRE version: Java(TM) SE Runtime Environment (8.0-b118) (build
>> 1.8.0-ea-b118)
>> > > # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.0-b60 mixed mode
>> linux-amd64 compressed oops)
>> > > # Problematic frame:
>> > > # C [libc.so.6+0x7b75b] memcpy+0x15b
>> > >
>> > >
>> > > --
>> > > Saliya Ekanayake esaliya_at_[hidden]
>> > > http://saliya.org
>> > > _______________________________________________
>> > > users mailing list
>> > > users_at_[hidden]
>> > > http://www.open-mpi.org/mailman/listinfo.cgi/users
>> >
>> >
>> > --
>> > Jeff Squyres
>> > jsquyres_at_[hidden]
>> > For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>> >
>> > --
>> > Saliya Ekanayake esaliya_at_[hidden]
>> > Cell 812-391-4914 Home 812-961-6383
>> > http://saliya.org
>> > <hellobug.tar.gz>

-- 
Saliya Ekanayake esaliya_at_[hidden]
Cell 812-391-4914 Home 812-961-6383
http://saliya.org