Hi,
In my program I am calling MPI_Barrier(MPI_COMM_WORLD), but
it seems to cause an error on one node. Which node fails changes depending
on how many total nodes I have (it could be 4 or 2). I suspect
MPI_Barrier because I put print statements around it and that is where the
program terminates. This is the error message that I get:
Signal:11 info.si_errno:0(Success) si_code:1(SEGV_MAPERR)
Failing at addr:0xc900000002
[0] func:/opt/openmpi/st/lib/libopal.so.0 [0x2aaaab04dbc8]
[1] func:/lib64/libpthread.so.0 [0x3be4f0c530]
[2] func:/opt/openmpi/st/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_recv+0x2a3) [0x2aaaacf26f33]
[3] func:/opt/openmpi/st/lib/openmpi/mca_coll_tuned.so(ompi_coll_tuned_barrier_intra_recursivedoubling+0x14a) [0x2aaaad8699ba]
[4] func:/opt/openmpi/st/lib/libmpi.so.0(PMPI_Barrier+0xa4) [0x2aaaaad87294]
[5] func:Debug/BioRiskAssessmentMpiLibTest(_ZN2BL14CMpiTestRunner11SynchronizeEv+0xe) [0x490846]
[6] func:Debug/BioRiskAssessmentMpiLibTest(_ZN2BL8CMpiTest12FinishedTestEi+0x3c) [0x490884]
[7] func:Debug/BioRiskAssessmentMpiLibTest(_ZN15CMpiProcessTest8RunTestsEv+0x269) [0x490297]
[8] func:Debug/BioRiskAssessmentMpiLibTest(_ZN29CMpiConsequenceCalculatorTest3RunEP19ompi_communicator_t+0xdf) [0x45a8e7]
[9] func:Debug/BioRiskAssessmentMpiLibTest(_ZN2BL14CMpiTestRunner3RunEv+0x60) [0x4909ba]
[10] func:Debug/BioRiskAssessmentMpiLibTest(main+0x42) [0x44558a]
[11] func:/lib64/libc.so.6(__libc_start_main+0xef) [0x3be481c40f]
[12] func:Debug/BioRiskAssessmentMpiLibTest(__gxx_personality_v0+0x99) [0x4454b9]
*** End of error message ***
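In case it helps with reading frames [5] through [9], which are C++ mangled names, here is a minimal sketch of a demangling helper. It assumes a GCC- or Clang-style toolchain that provides abi::__cxa_demangle in <cxxabi.h>; the function name demangle is only an illustration and is not part of my program.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang C++ ABI, assumed available)
#include <cstdlib>    // std::free
#include <string>

// Hypothetical helper (not part of the original program): returns the
// demangled form of `mangled`, or the input unchanged if demangling fails.
std::string demangle(const char* mangled) {
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result with malloc.
    char* out = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || out == nullptr) {
        return mangled;
    }
    std::string result(out);
    std::free(out);  // release the malloc'ed buffer
    return result;
}
```

Calling demangle("_ZN2BL14CMpiTestRunner11SynchronizeEv") should give BL::CMpiTestRunner::Synchronize(), and frame [8] reads as CMpiConsequenceCalculatorTest::Run(ompi_communicator_t*), so a communicator is being passed into the test that eventually reaches the barrier.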
I’m using Open MPI version 1.1.2. Not sure if it
matters, but before I call MPI_Barrier I create a communicator subset (which
in this case happens to contain all of the same processes that are in
MPI_COMM_WORLD).
Does anybody have an idea what my problem might be,
or what I should do to get more information?
Thanks!
Matt
______________________________
Matt Cupp
Battelle Memorial Institute
Statistics and Information Analysis