Hi,
  I think it is time to see the actual code :) Would it be possible to send us a part of the code that we can run and test?

With best regards,
-Belaid.

From: dtustudy68@hotmail.com
To: users@open-mpi.org
Date: Tue, 15 Mar 2011 09:44:35 -0600
Subject: Re: [OMPI users] OMPI seg fault by a class with weird address.

This should be the configuration of the compiler behind the Open MPI wrapper (mpic++) that I am using.


-bash-3.2$ mpic++ -v
Using built-in specs.
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-libgcj-multifile --enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk --disable-dssi --disable-plugin --with-java-home=/usr/lib/jvm/java-1.4.2-gcj-1.4.2.0/jre --with-cpu=generic --host=x86_64-redhat-linux
Thread model: posix
gcc version 4.1.2 20080704 (Red Hat 4.1.2-50)

 


thanks


From: samuel@lanl.gov
To: users@open-mpi.org
Date: Tue, 15 Mar 2011 09:27:35 -0600
Subject: Re: [OMPI users] OMPI seg fault by a class with weird address.

I -think- setting OMPI_MCA_memory_ptmalloc2_disable to 1 will turn off OMPI's memory wrappers without having to rebuild.  Someone please correct me if I'm wrong :-).

For example (bash-like shell):

export OMPI_MCA_memory_ptmalloc2_disable=1
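
To check that the variable actually reaches each launched process, a small helper along these lines could be called at startup. This is only a sketch: the function name ptmalloc2_disable_requested is made up for illustration, and it only inspects the environment. It does not show whether Open MPI honored the setting.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical helper (not part of Open MPI): returns true when the
// environment variable suggested above is set to "1" in this process.
bool ptmalloc2_disable_requested() {
    const char* value = std::getenv("OMPI_MCA_memory_ptmalloc2_disable");
    return value != NULL && std::strcmp(value, "1") == 0;
}
```

Printing the result once per rank, next to the rank number, would show whether every process sees the setting.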

Hope that helps,

--
Samuel K. Gutierrez
Los Alamos National Laboratory 


On Mar 15, 2011, at 9:19 AM, Jack Bryan wrote:

Thanks,

I do not have system administrator privileges,
so I am afraid that I cannot rebuild Open MPI with --without-memory-manager.

Are there other ways to get around it?

For example, is there something else that can replace "ptmalloc"?

Any help is really appreciated. 

thanks 


From: belaid_moa@hotmail.com
To: dtustudy68@hotmail.com; users@open-mpi.org
Subject: RE: [OMPI users] OMPI seg fault by a class with weird address.
Date: Tue, 15 Mar 2011 08:00:56 +0000

Hi Jack,
  I may need to see the whole code to decide, but my quick look suggests that ptmalloc is causing a problem with the STL vector allocation. ptmalloc is Open MPI's internal malloc library. Could you try to build Open MPI without memory management (using --without-memory-manager) and let us know the outcome? ptmalloc is not needed if you are not using an RDMA interconnect.

  With best regards,
-Belaid.


From: dtustudy68@hotmail.com
To: belaid_moa@hotmail.com; users@open-mpi.org
Subject: RE: [OMPI users] OMPI seg fault by a class with weird address.
Date: Tue, 15 Mar 2011 00:30:19 -0600

Hi, 

Because the code is very long, I will just show the calling relationship of the functions.

main()
{
    scheduler();
}

scheduler()
{
    ImportIndices();
}

ImportIndices()
{
    Index IdxNode;
    IdxNode = ReadFile("fileName");
}

Index ReadFile(const char* fileinput)
{
    Index TempIndex;
    .........
}

vector<int> Index::GetPosition() const { return Position; }
vector<int> Index::GetColumn() const { return Column; }
vector<int> Index::GetYear() const { return Year; }
vector<string> Index::GetName() const { return Name; }
int Index::GetPosition(const int idx) const { return Position[idx]; }
int Index::GetColumn(const int idx) const { return Column[idx]; }
int Index::GetYear(const int idx) const { return Year[idx]; }
string Index::GetName(const int idx) const { return Name[idx]; }
int Index::GetSize() const { return Position.size(); }
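
The indexed getters above use operator[], which does no bounds checking, so an out-of-range idx would be undefined behavior and could corrupt memory without any immediate error. While debugging, a checked variant can help rule that out. The following is a sketch on a cut-down stand-in class (CheckedIndex is a hypothetical name, and only one member is shown). It uses std::vector::at, which throws std::out_of_range on a bad index.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Cut-down, hypothetical stand-in for the Index class in this thread.
class CheckedIndex {
public:
    void Add(int position) { Position.push_back(position); }

    // at() throws std::out_of_range instead of reading past the end.
    int GetPosition(std::size_t idx) const { return Position.at(idx); }

    std::size_t GetSize() const { return Position.size(); }

private:
    std::vector<int> Position;
};
```

If a call such as GetPosition(5) on a one-element object throws, the caller is passing a bad index; if nothing ever throws, the getters are probably not the source of the crash.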

The sequential code works well; it has no scheduler().

The parallel code output from gdb:
----------------------------------------------
Breakpoint 1, myNeplanTaskScheduler(CNSGA2 *, int, int, int, ._85 *, char, int, message_para_to_workers_VecT &, MPI_Datatype, int &, int &, std::vector<std::vector<double, std::allocator<double> >, std::allocator<std::vector<double, std::allocator<double> > > > &, std::vector<std::vector<double, std::allocator<double> >, std::allocator<std::vector<double, std::allocator<double> > > > &, std::vector<double, std::allocator<double> > &, int, std::vector<std::vector<double, std::allocator<double> >, std::allocator<std::vector<double, std::allocator<double> > > > &, MPI_Datatype, int, MPI_Datatype, int) (nsga2=0x118c490, 
    popSize=<value optimized out>, nodeSize=<value optimized out>, 
    myRank=<value optimized out>, myChildpop=0x1208d80, genCandTag=65 'A', 
    generationNum=1, myPopParaVec=std::vector of length 4, capacity 4 = {...}, 
    message_to_master_type=0x7fffffffd540, myT1Flag=@0x7fffffffd68c, 
    myT2Flag=@0x7fffffffd688, 
    resultTaskPackageT1=std::vector of length 4, capacity 4 = {...}, 
    resultTaskPackageT2Pr=std::vector of length 4, capacity 4 = {...}, 
    xdataV=std::vector of length 4, capacity 4 = {...}, objSize=7, 
    resultTaskPackageT12=std::vector of length 4, capacity 4 = {...}, 
    xdata_to_workers_type=0x121c410, myGenerationNum=1, 
    Mpara_to_workers_type=0x121b9b0, nconNum=0)
    at src/nsga2/myNetplanScheduler.cpp:109
109                     ImportIndices();
(gdb) c
Continuing.

Breakpoint 2, ImportIndices () at src/index.cpp:120
120             IdxNode = ReadFile("prepdata/idx_node.csv");
(gdb) c
Continuing.

Breakpoint 4, ReadFile (fileinput=0xd8663d "prepdata/idx_node.csv")
    at src/index.cpp:86
86              Index TempIndex;
(gdb) c
Continuing.

Breakpoint 5, Index::Index (this=0x7fffffffcb80) at src/index.cpp:20
20              Name(0) {}
(gdb) c
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0x00002aaaab3b0b81 in opal_memory_ptmalloc2_int_malloc ()
   from /opt/openmpi-1.3.4-gnu/lib/libopen-pal.so.0

---------------------------------------
the backtrace output from the above parallel OpenMPI code:

(gdb) bt
#0  0x00002aaaab3b0b81 in opal_memory_ptmalloc2_int_malloc ()
   from /opt/openmpi-1.3.4-gnu/lib/libopen-pal.so.0
#1  0x00002aaaab3b2bd3 in opal_memory_ptmalloc2_malloc ()
   from /opt/openmpi-1.3.4-gnu/lib/libopen-pal.so.0
#2  0x0000003f7c8bd1dd in operator new(unsigned long) ()
   from /usr/lib64/libstdc++.so.6
#3  0x00000000004646a7 in __gnu_cxx::new_allocator<int>::allocate (
    this=0x7fffffffcb80, __n=0)
    at /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/ext/new_allocator.h:88
#4  0x00000000004646cf in std::_Vector_base<int, std::allocator<int> >::_M_allocate (this=0x7fffffffcb80, __n=0)
    at /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/bits/stl_vector.h:127
#5  0x0000000000464701 in std::_Vector_base<int, std::allocator<int> >::_Vector_base (this=0x7fffffffcb80, __n=0, __a=...)
    at /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/bits/stl_vector.h:113
#6  0x0000000000464d0b in std::vector<int, std::allocator<int> >::vector (
    this=0x7fffffffcb80, __n=0, __value=@0x7fffffffc968, __a=...)
    at /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/bits/stl_vector.h:216
#7  0x00000000004890d7 in Index::Index (this=0x7fffffffcb80)
    at src/index.cpp:20
#8  0x000000000048927a in ReadFile (fileinput=0xd8663d "prepdata/idx_node.csv")
    at src/index.cpp:86
#9  0x0000000000489533 in ImportIndices () at src/index.cpp:120
#10 0x0000000000445e0e in myNeplanTaskScheduler(CNSGA2 *, int, int, int, ._85 *, char, int, message_para_to_workers_VecT &, MPI_Datatype, int &, int &, std::vector<std::vector<double, std::allocator<double> >, std::allocator<std::vector<double, std::allocator<double> > > > &, std::vector<std::vector<double, std::allocator<double> >, std::allocator<std::vector<double, std::allocator<double> > > > &, std::vector<double, std::allocator<double> > &, int, std::vector<std::vector<double, std::allocator<double> >, std::allocator<std::vector<double, std::allocator<double> > > > &, MPI_Datatype, int, MPI_Datatype, int) (nsga2=0x118c490, 
    popSize=<value optimized out>, nodeSize=<value optimized out>, 
    myRank=<value optimized out>, myChildpop=0x1208d80, genCandTag=65 'A', 
    generationNum=1, myPopParaVec=std::vector of length 4, capacity 4 = {...}, 
    message_to_master_type=0x7fffffffd540, myT1Flag=@0x7fffffffd68c, 
    myT2Flag=@0x7fffffffd688, 
    resultTaskPackageT1=std::vector of length 4, capacity 4 = {...}, 
    resultTaskPackageT2Pr=std::vector of length 4, capacity 4 = {...}, 
    xdataV=std::vector of length 4, capacity 4 = {...}, objSize=7, 
    resultTaskPackageT12=std::vector of length 4, capacity 4 = {...}, 
    xdata_to_workers_type=0x121c410, myGenerationNum=1, 
    Mpara_to_workers_type=0x121b9b0, nconNum=0)
    at src/nsga2/myNetplanScheduler.cpp:109
#11 0x000000000044f44b in main (argc=1, argv=0x7fffffffd998)
    at src/nsga2/main-parallel2.cpp:216
----------------------------------------------------

What is "opal_memory_ptmalloc2_int_malloc ()" ?

The gdb output from sequential code: 
-------------------------------------
Breakpoint 1, main (argc=<value optimized out>, argv=<value optimized out>)
    at src/nsga2/main-seq.cpp:32
32              ImportIndices();
(gdb) c
Continuing.

Breakpoint 2, ImportIndices () at src/index.cpp:115
115             IdxNode = ReadFile("prepdata/idx_node.csv");
(gdb) c
Continuing.

Breakpoint 4, ReadFile (fileinput=0xd6bb9d "prepdata/idx_node.csv")
    at src/index.cpp:86
86              Index TempIndex;
(gdb) c
Continuing.

Breakpoint 5, Index::Index (this=0x7fffffffd6d0) at src/index.cpp:20
20              Name(0) {}
(gdb) c
Continuing.

Breakpoint 4, ReadFile (fileinput=0xd6bbb3 "prepdata/idx_ud.csv")
    at src/index.cpp:86
86              Index TempIndex;
(gdb) bt
#0  ReadFile (fileinput=0xd6bbb3 "prepdata/idx_ud.csv") at src/index.cpp:86
#1  0x0000000000471cc9 in ImportIndices () at src/index.cpp:116
#2  0x000000000043bba6 in main (argc=<value optimized out>, 
    argv=<value optimized out>) at src/nsga2/main-seq.cpp:32

--------------------------------------
thanks



From: belaid_moa@hotmail.com
To: users@open-mpi.org; dtustudy68@hotmail.com
Subject: RE: [OMPI users] OMPI seg fault by a class with weird address.
Date: Tue, 15 Mar 2011 06:16:35 +0000

Hi Jack,
1- Where is your main function, so that we can see how you call your class?
2- I do not see the implementations of GetPosition, GetName, etc.

With best regards,
-Belaid.
  


From: dtustudy68@hotmail.com
To: users@open-mpi.org
Date: Mon, 14 Mar 2011 19:04:12 -0600
Subject: [OMPI users] OMPI seg fault by a class with weird address.

Hi, 

I got a run-time error in an Open MPI C++ program.

The following output is from gdb: 

--------------------------------------------------------------------------
Program received signal SIGSEGV, Segmentation fault.
0x00002aaaab3b0b81 in opal_memory_ptmalloc2_int_malloc ()
   from /opt/openmpi-1.3.4-gnu/lib/libopen-pal.so.0

At the point 

Breakpoint 9, Index::Index (this=0x7fffffffcb80) at src/index.cpp:20
20              Name(0) {}

The Index constructor has been called before this point without any problem:
-------------------------------------------------------
Breakpoint 9, Index::Index (this=0x117d800) at src/index.cpp:20
20              Name(0) {}
(gdb) c
Continuing.

Breakpoint 9, Index::Index (this=0x117d860) at src/index.cpp:20
20              Name(0) {}
(gdb) c
Continuing.
----------------------------------------------------------------------------

It seems that the address 0x7fffffffcb80 is the problem.

But I do not know the reason or how to remove the bug.
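
For what it is worth, an address such as 0x7fffffffcb80 is typical for an object on the stack of a 64-bit Linux process, while a low address such as 0x117d800 usually belongs to heap or static storage. So the address by itself may not indicate a bug. The sketch below, using a made-up Probe type, prints both kinds of address for comparison. It is an illustration only, and the exact values will differ per run and per system.

```cpp
#include <cstdio>

// Hypothetical probe type: remembers the address it was constructed at,
// i.e. the value gdb shows as "this" inside the constructor.
struct Probe {
    const void* self;
    Probe() : self(this) {}
};

// Prints where a local (stack) object and a new'ed (heap) object live.
void print_probe_addresses() {
    Probe local;                 // automatic storage, like a local Index
    Probe* dynamic = new Probe;  // dynamic storage
    std::printf("local   this=%p\n", local.self);
    std::printf("dynamic this=%p\n", dynamic->self);
    delete dynamic;
}
```

Calling print_probe_addresses() once from main is enough to see the two address ranges side by side.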

Any help is really appreciated. 

thanks

The following is the Index definition.

---------------------------------------------------------
class Index {
    public:
        Index();
        Index(const Index& rhs);
        ~Index();
        Index& operator=(const Index& rhs);
        vector<int> GetPosition() const;
        vector<int> GetColumn() const;
        vector<int> GetYear() const;
        vector<string> GetName() const;
        int GetPosition(const int idx) const;
        int GetColumn(const int idx) const;
        int GetYear(const int idx) const;
        string GetName(const int idx) const;
        int GetSize() const;
        void Add(const int idx, const int col, const string& name);
        void Add(const int idx, const int col, const int year, const string& name);
        void Add(const int idx, const Step& col, const string& name);
        void WriteFile(const char* fileinput) const;
    private:
        vector<int> Position;
        vector<int> Column;
        vector<int> Year;
        vector<string> Name;
};
// Constructors and destructor for the Index class
Index::Index() :
    Position(0),
    Column(0),
    Year(0),
    Name(0) {}

Index::Index(const Index& rhs) :
    Position(rhs.GetPosition()),
    Column(rhs.GetColumn()),
    Year(rhs.GetYear()),
    Name(rhs.GetName()) {}

Index::~Index() {}

Index& Index::operator=(const Index& rhs) {
    Position = rhs.GetPosition();
    Column = rhs.GetColumn();
    Year = rhs.GetYear();
    Name = rhs.GetName();
    return *this;
}
----------------------------------------------------------
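
The hand-written copy constructor, destructor and assignment operator above do what the compiler would generate anyway for a class whose members are all standard containers. A reduced sketch that leaves them out, and returns the vectors by const reference to avoid a copy per call, might look like this. SimpleIndex is a hypothetical name, only two of the four members are shown, and this is a tidy-up rather than a fix for the segfault.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical reduced version of the Index class from this thread.
class SimpleIndex {
public:
    // No user-declared constructor, destructor or copy operations:
    // the compiler-generated ones copy each vector member-wise.
    const std::vector<int>& GetPosition() const { return Position; }
    const std::vector<std::string>& GetName() const { return Name; }
    std::size_t GetSize() const { return Position.size(); }

    void Add(int idx, const std::string& name) {
        Position.push_back(idx);
        Name.push_back(name);
    }

private:
    std::vector<int> Position;      // default-constructed empty
    std::vector<std::string> Name;  // default-constructed empty
};
```

Copies made with "SimpleIndex b = a;" or "b = a;" are independent of the original, the same as with the explicit versions.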



_______________________________________________ users mailing list users@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users