Open MPI User's Mailing List Archives

From: George Bosilca (bosilca_at_[hidden])
Date: 2007-08-02 19:22:40


Allocating memory is one thing; being able to use it is a completely
different story. Once you allocate the 8GB array, can you fill it with
some random values? This will force the kernel to really give you the
8GB of memory. If that segfaults, then that's the problem. If not ...
then I guess the problem comes from Open MPI.
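A minimal sketch of that kind of check (my illustration, not code from
the thread; it assumes a 64-bit Linux box) is below. Note that malloc()
alone may succeed because of lazy allocation and overcommit; it is the
loop writing into every page that forces the kernel to actually back
the 8GB:

    /* touch_8gb.c - allocate ~8 GB and write into every page so the
     * kernel really has to back the allocation with memory (or swap). */
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        size_t n = (size_t)8 * 1024 * 1024 * 1024;   /* 8 GB */
        char *buf = malloc(n);
        if (buf == NULL) {
            fprintf(stderr, "malloc of %zu bytes failed\n", n);
            return 1;
        }
        /* One write per 4 KB page is enough to fault every page in. */
        for (size_t i = 0; i < n; i += 4096)
            buf[i] = (char)(i & 0xff);
        printf("touched %zu bytes successfully\n", n);
        free(buf);
        return 0;
    }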

   Thanks,
     george.

On Aug 2, 2007, at 6:59 PM, Juan Carlos Guzman wrote:

> Jelena, George,
>
> Thanks for your replies.
>
>> It is possible that the problem is not in MPI - I've seen a similar
>> problem on some of our workstations some time ago.
>> Juan, are you sure you can allocate more than 2x 4GB of data in a
>> non-MPI program on your system?
> Yes, I wrote a small program that can allocate more than 8 GB of
> memory (using malloc()).
>
> Cheers,
> Juan-Carlos.
>
>>
>> Thanks,
>> Jelena
>>
>> On Wed, 1 Aug 2007, George Bosilca wrote:
>>
>>> Juan,
>>>
>>> I have to check to see what's wrong there. We build Open MPI with
>>> full support for data transfers up to sizeof(size_t) bytes, so your
>>> case should be covered. However, there are some known problems with
>>> the MPI interface for data counts larger than an int can represent.
>>> As an example, the _count field in the MPI_Status structure will be
>>> truncated ...
>>>
>>> Thanks,
>>> george.
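The int-count limitation mentioned above is commonly worked around by
splitting a large transfer into chunks whose element count stays below
INT_MAX. A rough sketch of the sending side (my own illustration, not
code from this thread; the chunk size and tag are arbitrary):

    /* Send 'count' floats even when 'count' does not fit in an int,
     * by issuing several MPI_Send calls of at most CHUNK elements.
     * The receiver has to loop over the same chunk sizes. */
    #include <mpi.h>

    #define CHUNK (1 << 28)   /* 2^28 floats = 1 GB per MPI_Send call */

    static void send_large(const float *buf, size_t count, int dest,
                           MPI_Comm comm)
    {
        size_t sent = 0;
        while (sent < count) {
            size_t left = count - sent;
            int this_chunk = (left > CHUNK) ? CHUNK : (int)left;
            MPI_Send((void *)(buf + sent), this_chunk, MPI_FLOAT,
                     dest, 0, comm);
            sent += (size_t)this_chunk;
        }
    }

Because each individual receive then describes at most CHUNK elements,
the truncation of the _count field in MPI_Status is also avoided.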
>>>
>>> On Jul 30, 2007, at 1:47 AM, Juan Carlos Guzman wrote:
>>>
>>>> Hi,
>>>>
>>>> Does anyone know the maximum buffer size I can use in the
>>>> MPI_Send()/MPI_Recv() functions? I was doing some testing using two
>>>> nodes on my cluster to measure the point-to-point MPI message rate
>>>> as a function of size. The test program exchanges MPI_FLOAT
>>>> datatypes between two nodes. I was able to send up to 4 GB of data
>>>> (500 Mega MPI_FLOATs) before the process crashed with a
>>>> segmentation fault message.
>>>>
>>>> Is the maximum message size limited by the maximum value of an int
>>>> (the count argument) times the size of the MPI datatype used in the
>>>> MPI_Send()/MPI_Recv() calls?
>>>>
>>>> My cluster has openmpi 1.2.3 installed. Each node has 2 x dual core
>>>> AMD Opteron and 12 GB RAM.
>>>>
>>>> Thanks in advance.
>>>> Juan-Carlos.
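For reference, a bare-bones version of the kind of timing test
described above (two ranks exchanging MPI_FLOAT buffers and reporting
a rate) might look like the sketch below; the message size and
repetition count are placeholders, not values from the original
program:

    /* rate_test.c - minimal point-to-point exchange between ranks 0
     * and 1, timed with MPI_Wtime(). Run with: mpirun -np 2 rate_test */
    #include <stdio.h>
    #include <stdlib.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, count = 1 << 20;          /* 1M floats per message */
        int reps = 100;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        float *buf = calloc((size_t)count, sizeof(float));
        if (buf == NULL)
            MPI_Abort(MPI_COMM_WORLD, 1);

        MPI_Status status;
        double t0 = MPI_Wtime();

        for (int i = 0; i < reps; i++) {
            if (rank == 0) {
                MPI_Send(buf, count, MPI_FLOAT, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, count, MPI_FLOAT, 1, 0, MPI_COMM_WORLD, &status);
            } else if (rank == 1) {
                MPI_Recv(buf, count, MPI_FLOAT, 0, 0, MPI_COMM_WORLD, &status);
                MPI_Send(buf, count, MPI_FLOAT, 0, 0, MPI_COMM_WORLD);
            }
        }

        if (rank == 0)
            printf("%d round trips of %zu bytes in %.3f s\n",
                   reps, (size_t)count * sizeof(float), MPI_Wtime() - t0);

        free(buf);
        MPI_Finalize();
        return 0;
    }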
>>>> _______________________________________________
>>>> users mailing list
>>>> users_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>> _______________________________________________
>>> users mailing list
>>> users_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>
>> --
>> Jelena Pjesivac-Grbovic, Pjesa
>> Graduate Research Assistant
>> Innovative Computing Laboratory
>> Computer Science Department, UTK
>> Claxton Complex 350
>> (865) 974 - 6722
>> (865) 974 - 6321
>> jpjesiva_at_[hidden]
>>
>> "The only difference between a problem and a solution is that
>> people understand the solution."
>> -- Charles Kettering
>>
>>
>>
>> ------------------------------
>>
>> Message: 2
>> Date: Wed, 1 Aug 2007 15:06:56 -0500
>> From: "Adams, Samuel D Contr AFRL/HEDR" <Samuel.Adams_at_[hidden]>
>> Subject: Re: [OMPI users] torque and openmpi
>> To: "Open MPI Users" <users_at_[hidden]>
>> Message-ID:
>> <8BF06A36E7AD424197195998D9A0B8E1D7724F_at_FBRMLBR01.Enterprise.afmc.ds.af.mil>
>>
>> Content-Type: text/plain; charset="us-ascii"
>>
>> I reran the configure script with the --with-tm flag this time.
>> Thanks for the info. It was working before for clients with ssh
>> properly configured (i.e. my account only). But now it is working
>> without having to use ssh for all accounts (i.e. biologist and
>> physicist users).
>>
>> Sam Adams
>> General Dynamics Information Technology
>> Phone: 210.536.5945
>>
>> -----Original Message-----
>> From: users-bounces_at_[hidden] [mailto:users-bounces_at_open-mpi.org]
>> On Behalf Of Jeff Squyres
>> Sent: Friday, July 27, 2007 2:58 PM
>> To: Open MPI Users
>> Subject: Re: [OMPI users] torque and openmpi
>>
>> On Jul 27, 2007, at 2:48 PM, Galen Shipman wrote:
>>
>>>> I set up ompi before I configured Torque. Do I need to recompile
>>>> ompi with the appropriate Torque configure options to get better
>>>> integration?
>>>
>>> If libtorque wasn't present on the machine at configure time, then
>>> yes, you need to run:
>>>
>>> ./configure --with-tm=<path>
>>
>> You don't *have* to do this, of course. If you've got it working
>> with ssh, that's fine. But the integration with torque can be
>> better:
>>
>> - you can disable ssh for non-root accounts (assuming no other
>> services need rsh/ssh)
>> - users don't have to set up ssh keys to run MPI jobs (a small thing,
>> but sometimes nice when the users aren't computer scientists)
>> - torque knows about all processes on all nodes (not just the mother
>> superior) and can therefore both track and kill them if necessary
>>
>> Just my $0.02...
>>
>> --
>> Jeff Squyres
>> Cisco Systems
>>
>> _______________________________________________
>> users mailing list
>> users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>>
>>
>> ------------------------------
>>
>> Message: 3
>> Date: Wed, 1 Aug 2007 20:58:44 -0400
>> From: Jeff Squyres <jsquyres_at_[hidden]>
>> Subject: Re: [OMPI users] unable to compile open mpi using pgf90 in
>> AMD opteron system
>> To: Open MPI Users <users_at_[hidden]>
>> Message-ID: <5453C030-B7C9-48E1-BBA7-F04BCC43C9CB_at_[hidden]>
>> Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed
>>
>> On Aug 1, 2007, at 11:38 AM, S.Sundar Raman wrote:
>>
>>> Dear Open MPI users,
>>> I am trying to compile Open MPI with the pgf90 compiler on an AMD
>>> Opteron system.
>>> I followed the procedure given in the mailing list archives.
>>
>> What procedure are you referring to, specifically?
>>
>>> I ran into the following problem.
>>> Please kindly help me in this regard; I am eagerly waiting for your
>>> reply.
>>> make[2]: Entering directory `/usr/local/openmpi-1.2.3/ompi/mpi/f90'
>>>
>>> /bin/sh ../../../libtool --mode=link pgf90 -I../../../ompi/include
>>> -I../../../ompi/include -I. -I. -I../../../ompi/mpi/f90
>>> -export-dynamic -o libmpi_f90.la -rpath /usr/local/mpi/lib mpi.lo
>>> mpi_sizeof.lo mpi_comm_spawn_multiple_f90.lo mpi_testall_f90.lo
>>> mpi_testsome_f90.lo mpi_waitall_f90.lo mpi_waitsome_f90.lo
>>> mpi_wtick_f90.lo mpi_wtime_f90.lo -lnsl -lutil -lm
>>>
>>> libtool: link: pgf90 -shared -fPIC -Mnomain .libs/mpi.o
>>> .libs/mpi_sizeof.o .libs/mpi_comm_spawn_multiple_f90.o
>>> .libs/mpi_testall_f90.o .libs/mpi_testsome_f90.o
>>> .libs/mpi_waitall_f90.o .libs/mpi_waitsome_f90.o
>>> .libs/mpi_wtick_f90.o .libs/mpi_wtime_f90.o -lnsl -lutil -lm
>>> -Wl,-soname -Wl,libmpi_f90.so.0 -o .libs/libmpi_f90.so.0.0.0
>>>
>>> /usr/bin/ld: .libs/mpi.o: relocation R_X86_64_PC32 against
>>> `__pgio_ini' can not be used when making a shared object; recompile
>>> with -fPIC
>>
>> I can usually compile with the PGI compilers without needing to do
>> anything special (PGI v6.2-5 and 7.0-2), although I usually do add
>> the following option to configure:
>>
>> --with-wrapper-cxxflags=-fPIC
>>
>> This puts "-fPIC" in the flags that the mpiCC wrapper compiler will
>> automatically insert when compiling MPI C++ applications.
>>
>> Can you send all the information listed here:
>>
>> http://www.open-mpi.org/community/help/
>>
>> --
>> Jeff Squyres
>> Cisco Systems
>>
>>
>>
>> ------------------------------
>>
>> _______________________________________________
>> users mailing list
>> users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>> End of users Digest, Vol 657, Issue 1
>> *************************************
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users