Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] openmpi linking problem
From: Jeff Squyres (jsquyres) (jsquyres_at_[hidden])
Date: 2014-06-26 06:47:14


This doesn't sound like a linking problem; this sounds like there's an error in your application that is causing it to abort before completing.
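
To make the three cases in the mpirun message quoted below concrete, here is a toy model in Python. It is only a sketch under stated assumptions: `RankRecord` and `classify_exit` are hypothetical names invented for illustration, the model does not reflect how Open MPI's runtime is actually implemented, and case 3 is simplified (the `orte_create_session_dirs` condition is not modeled).

```python
from dataclasses import dataclass


@dataclass
class RankRecord:
    """What one process did before exiting (hypothetical bookkeeping)."""
    rank: int
    called_init: bool
    called_finalize: bool
    called_abort: bool = False


def classify_exit(records):
    """Return {rank: reason} for each rank whose exit would count as improper."""
    any_init = any(r.called_init for r in records)
    problems = {}
    for r in records:
        if r.called_abort:
            # Case 3 (simplified): the process aborted explicitly.
            problems[r.rank] = "called MPI_Abort"
        elif any_init and not r.called_init:
            # Case 1: some ranks called init, this one never did.
            problems[r.rank] = "exited without calling init while other ranks did"
        elif r.called_init and not r.called_finalize:
            # Case 2: init was called, but the process exited before finalize.
            problems[r.rank] = "called init but exited without calling finalize"
    return problems


# Six ranks, as in "mpirun -np 6"; rank 1 exits early (for example because
# a solver call failed and the application stopped) and never reaches finalize.
records = [
    RankRecord(rank=i, called_init=True, called_finalize=(i != 1))
    for i in range(6)
]
print(classify_exit(records))
```

In this model, an application that stops as soon as a library call reports an error, without reaching finalize, lands in case 2. That would match an application-level failure rather than a linking problem.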

On Jun 25, 2014, at 12:19 PM, Sergii Veremieiev <s.veremieiev_at_[hidden]> wrote:

> Dear Sir/Madam,
>
> I'm trying to run a 64-bit parallel finite element analysis code on my desktop (Windows 7, Cygwin, Open MPI 1.7.5, 64 GB RAM, 6-core Intel Core i7-3930K CPU) with the command "mpirun -np 6 executable". The code runs fine, but if I increase the number of elements beyond a critical value (roughly 100k), the built-in MUMPS library returns an error message (please see below). Could you advise what the problem might be? According to Task Manager, the code uses about 3-6 GB per process, or about 20 GB in total, which is much less than the 55 GB of physical memory available on the system. Is there perhaps a per-process memory limit in Windows? Thank you.
>
> Best regards,
>
> Sergii
>
>
> mpirun has exited due to process rank 1 with PID 6028 on
> node exiting improperly. There are three reasons this could occur:
>
> 1. this process did not call "init" before exiting, but others in
> the job did. This can cause a job to hang indefinitely while it waits
> for all processes to call "init". By rule, if one process calls "init",
> then ALL processes must call "init" prior to termination.
>
> 2. this process called "init", but exited without calling "finalize".
> By rule, all processes that call "init" MUST call "finalize" prior to
> exiting or it will be considered an "abnormal termination"
>
> 3. this process called "MPI_Abort" or "orte_abort" and the mca parameter
> orte_create_session_dirs is set to false. In this case, the run-time cannot
> detect that the abort call was an abnormal termination. Hence, the only
> error message you will receive is this one.
>
> This may have caused other processes in the application to be
> terminated by signals sent by mpirun (as reported here).
>
> You can avoid this message by specifying -quiet on the mpirun command line.
>
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: http://www.open-mpi.org/community/lists/users/2014/06/24703.php

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/