Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Open MPI:Problem with 64-bit openMPI and intel compiler
From: Sims, James S. Dr. (james.sims_at_[hidden])
Date: 2009-07-23 21:49:33


[sims_at_raritan openmpi]$ mpirun -V
mpirun (Open MPI) 1.3.1rc4

________________________________________
From: users-bounces_at_[hidden] [users-bounces_at_[hidden]] On Behalf Of Ralph Castain [rhc_at_[hidden]]
Sent: Thursday, July 23, 2009 5:44 PM
To: Open MPI Users
Subject: Re: [OMPI users] Open MPI:Problem with 64-bit openMPI and intel compiler

What OMPI version are you using?

On Jul 23, 2009, at 3:00 PM, Sims, James S. Dr. wrote:

> I have an OpenMPI program compiled with a version of OpenMPI built
> using the ifort 10.1 compiler. I can compile and run this code with
> no problem using the 32-bit version of ifort, and I can also submit
> batch jobs using torque with this 32-bit code. However, compiling the
> same code to produce a 64-bit executable produces a code that runs
> correctly only in the simplest cases. It does not run correctly under
> the torque batch queuing system: it runs for a while and then gives a
> segmentation violation in a section of code that is fine in the
> 32-bit version.
>
> I have to run the mpi multinode jobs using our torque batch queuing
> system,
> but we do have the capability of running the jobs in an interactive
> batch environment.
>
> If I do a qsub -I -l nodes=1:x4gb
> I get an interactive session on the remote node assigned to my job.
> I can run the
> job using either
> ./MPI_li_64 or
> mpirun -np 1 ./MPI_li_64
> and the job runs successfully to completion. I can also
> start an interactive shell using
> qsub -I -l nodes=1:ppn=2:x4gb
> and I will get a single dual-processor (or greater) node. On this
> single node,
> mpirun -np 2 ./MPI_li_64 works.
> However, if instead I ask for two nodes in my interactive batch session,
> qsub -I -l nodes=2:x4gb,
> two nodes will be assigned to me, but when I enter
> mpirun -np 2 ./MPI_li_64
> the job runs awhile, then fails with a
> mpirun noticed that process rank 1 with PID 23104 on node n339
> exited on signal 11 (Segmentation fault).
>
> I can trace this in the intel debugger and see that the segmentation
> fault is occurring in what should be good code, and in code that
> executes with no problem when everything is compiled 32-bit. I am at
> a loss for what could be preventing this code from running within the
> batch queuing environment in the 64-bit version.
>
> Jim
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
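
For reference, the three interactive requests quoted above differ only in the resource string passed to qsub -l. The sketch below prints them side by side rather than submitting anything. The helper name resource_spec is hypothetical and is not part of torque or Open MPI; the node counts, ppn value, and x4gb property are taken from the quoted message.

```shell
#!/bin/sh
# resource_spec is a hypothetical helper (not a torque or Open MPI command).
# It builds the "-l" resource string from a node count, an optional
# processors-per-node value, and an optional node property.
resource_spec() {
    nodes=$1
    ppn=$2
    prop=$3
    spec="nodes=${nodes}"
    if [ -n "$ppn" ]; then spec="${spec}:ppn=${ppn}"; fi
    if [ -n "$prop" ]; then spec="${spec}:${prop}"; fi
    printf '%s\n' "$spec"
}

# Print (do not run) the three requests described in the message.
echo "qsub -I -l $(resource_spec 1 '' x4gb)   # one node: job completes"
echo "qsub -I -l $(resource_spec 1 2 x4gb)   # one node, ppn=2: mpirun -np 2 works"
echo "qsub -I -l $(resource_spec 2 '' x4gb)   # two nodes: rank 1 gets signal 11"
```

Only the last request places the two ranks on different nodes, which is the one configuration in which the 64-bit build fails.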
