
Subject: Re: [OMPI users] Where is the error? (MPI program in fortran)
From: Oscar Mojica (o_mojical_at_[hidden])
Date: 2014-04-16 08:30:56


What would the command line be to compile with the -g option? And what debugger can I use?
Thanks

Sent from my iPad

> On 15/04/2014, at 18:20, "Gus Correa" <gus_at_[hidden]> wrote:
>
> Or just compiling with -g or -traceback (depending on the compiler) will
> give you more information about the point of failure
> in the error message.
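>
> For example (the flag spelling depends on which compiler sits underneath
> the mpif90 wrapper; these are the usual ones for gfortran and Intel ifort):
>
> mpif90 -g -fbacktrace -o exe mpivfsa_version2.f   # gfortran
> mpif90 -g -traceback  -o exe mpivfsa_version2.f   # ifort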
>
>> On 04/15/2014 04:25 PM, Ralph Castain wrote:
>> Have you tried using a debugger to look at the resulting core file? It
>> will probably point you right at the problem. Most likely a case of
>> overrunning some array when #temps > 5
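>>
>> On Linux, something along these lines should do it (the core file name
>> can vary, e.g. core.<pid>, and core dumps may be disabled by default):
>>
>> ulimit -c unlimited        # allow core files to be written
>> mpirun -np 4 ./exe
>> gdb ./exe core             # then 'bt' shows the failing line,
>>                            # provided the code was built with -g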
>>
>>
>>
>>
>> On Tue, Apr 15, 2014 at 10:46 AM, Oscar Mojica
>> <o_mojical_at_[hidden]> wrote:
>>
>> Hello everybody
>>
>> I implemented a parallel simulated annealing algorithm in Fortran.
>> The algorithm is described as follows:
>>
>> 1. The MPI program initially generates P processes with ranks
>> 0, 1, ..., P-1.
>> 2. The MPI program generates a starting point, sends it to all
>> processes, and sets T = T0.
>> 3. At the current temperature T, each process executes its
>> iterative operations.
>> 4. At the end of the iterations, the process with rank 0 collects
>> the solutions obtained by every process at the current temperature.
>> 5. The best of these solutions is broadcast to all participating
>> processes.
>> 6. Each process cools the temperature and goes back to step 3,
>> until the maximum number of temperatures is reached.
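>>
>> A minimal free-form sketch of this outline (not the attached program;
>> the model size, cost function, and perturbation are placeholders) might
>> look like this:
>>
>> program sa_skeleton
>>   use mpi
>>   implicit none
>>   integer, parameter :: n = 10            ! model size (assumed)
>>   integer :: ierr, rank, nprocs, it, ntemps, bestrank
>>   double precision :: model(n), pert(n), t, cost, cool
>>   double precision :: pair_in(2), pair_out(2)
>>
>>   call MPI_Init(ierr)
>>   call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)   ! step 1
>>   call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr) ! P in the outline
>>
>>   ntemps = 15               ! maximum number of temperatures (step 6)
>>   t      = 1.0d0            ! T0
>>   cool   = 0.9d0            ! cooling factor
>>
>>   ! Step 2: rank 0 generates the starting point and sends it to all.
>>   ! In practice each rank would also seed its RNG differently so the
>>   ! searches diverge.
>>   if (rank == 0) call random_number(model)
>>   call MPI_Bcast(model, n, MPI_DOUBLE_PRECISION, 0, MPI_COMM_WORLD, ierr)
>>
>>   do it = 1, ntemps
>>     ! Step 3: each process runs its iterative moves at temperature t
>>     ! (placeholder: one random perturbation and a dummy cost)
>>     call random_number(pert)
>>     model = model + t*(pert - 0.5d0)
>>     cost  = sum(model**2)
>>
>>     ! Steps 4-5: find the rank holding the lowest cost and broadcast
>>     ! that rank's model so every process continues from the best point
>>     pair_in(1) = cost
>>     pair_in(2) = dble(rank)
>>     call MPI_Allreduce(pair_in, pair_out, 1, MPI_2DOUBLE_PRECISION, &
>>                        MPI_MINLOC, MPI_COMM_WORLD, ierr)
>>     bestrank = int(pair_out(2))
>>     call MPI_Bcast(model, n, MPI_DOUBLE_PRECISION, bestrank, &
>>                    MPI_COMM_WORLD, ierr)
>>
>>     ! Step 6: cool the temperature
>>     t = cool*t
>>   end do
>>
>>   if (rank == 0) print *, 'final cost =', sum(model**2)
>>   call MPI_Finalize(ierr)
>> end program sa_skeleton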
>>
>> I compiled with: mpif90 -o exe mpivfsa_version2.f
>> and ran with: mpirun -np 4 ./exe on a single machine.
>>
>> So I have 4 processes, 1 iteration per temperature, and, for example,
>> 15 temperatures. When I run the program with just 5 temperatures it
>> works well, but when the number of temperatures is higher than 5 it
>> doesn't write the output files and I get the following error message:
>>
>>
>> [oscar-Vostro-3550:06740] *** Process received signal ***
>> [oscar-Vostro-3550:06741] *** Process received signal ***
>> [oscar-Vostro-3550:06741] Signal: Segmentation fault (11)
>> [oscar-Vostro-3550:06741] Signal code: Address not mapped (1)
>> [oscar-Vostro-3550:06741] Failing at address: 0xad6af
>> [oscar-Vostro-3550:06742] *** Process received signal ***
>> [oscar-Vostro-3550:06740] Signal: Segmentation fault (11)
>> [oscar-Vostro-3550:06740] Signal code: Address not mapped (1)
>> [oscar-Vostro-3550:06740] Failing at address: 0xad6af
>> [oscar-Vostro-3550:06742] Signal: Segmentation fault (11)
>> [oscar-Vostro-3550:06742] Signal code: Address not mapped (1)
>> [oscar-Vostro-3550:06742] Failing at address: 0xad6af
>> [oscar-Vostro-3550:06740] [ 0]
>> /lib/x86_64-linux-gnu/libc.so.6(+0x364a0) [0x7f49ee2224a0]
>> [oscar-Vostro-3550:06740] [ 1]
>> /lib/x86_64-linux-gnu/libc.so.6(cfree+0x1c) [0x7f49ee26f54c]
>> [oscar-Vostro-3550:06740] [ 2] ./exe() [0x406742]
>> [oscar-Vostro-3550:06740] [ 3] ./exe(main+0x34) [0x406ac9]
>> [oscar-Vostro-3550:06740] [ 4]
>> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7f49ee20d76d]
>> [oscar-Vostro-3550:06742] [ 0]
>> /lib/x86_64-linux-gnu/libc.so.6(+0x364a0) [0x7f6877fdc4a0]
>> [oscar-Vostro-3550:06742] [ 1]
>> /lib/x86_64-linux-gnu/libc.so.6(cfree+0x1c) [0x7f687802954c]
>> [oscar-Vostro-3550:06742] [ 2] ./exe() [0x406742]
>> [oscar-Vostro-3550:06742] [ 3] ./exe(main+0x34) [0x406ac9]
>> [oscar-Vostro-3550:06742] [ 4]
>> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7f6877fc776d]
>> [oscar-Vostro-3550:06742] [ 5] ./exe() [0x401399]
>> [oscar-Vostro-3550:06742] *** End of error message ***
>> [oscar-Vostro-3550:06740] [ 5] ./exe() [0x401399]
>> [oscar-Vostro-3550:06740] *** End of error message ***
>> [oscar-Vostro-3550:06741] [ 0]
>> /lib/x86_64-linux-gnu/libc.so.6(+0x364a0) [0x7fa6c4c6e4a0]
>> [oscar-Vostro-3550:06741] [ 1]
>> /lib/x86_64-linux-gnu/libc.so.6(cfree+0x1c) [0x7fa6c4cbb54c]
>> [oscar-Vostro-3550:06741] [ 2] ./exe() [0x406742]
>> [oscar-Vostro-3550:06741] [ 3] ./exe(main+0x34) [0x406ac9]
>> [oscar-Vostro-3550:06741] [ 4]
>> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7fa6c4c5976d]
>> [oscar-Vostro-3550:06741] [ 5] ./exe() [0x401399]
>> [oscar-Vostro-3550:06741] *** End of error message ***
>> --------------------------------------------------------------------------
>> mpirun noticed that process rank 0 with PID 6917 on node
>> oscar-Vostro-3550 exited on signal 11 (Segmentation fault).
>> --------------------------------------------------------------------------
>> 2 total processes killed (some possibly by mpirun during cleanup)
>>
>> If there were a segmentation fault it shouldn't work in any case.
>> I checked the program and didn't find the error. Why does the
>> program work with five temperatures?
>> Could someone please help me find the error and answer my question?
>>
>> The program and the necessary files to run it are attached
>>
>> Thanks
>>
>>
>> _Oscar Fabian Mojica Ladino_
>> Geologist M.S. in Geophysics
>>
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users