
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] Process size
From: Josh Hursey (jjhursey_at_[hidden])
Date: 2008-05-30 09:33:15


Leonardo,

The CRCP 'coord' component implements the bookmark exchange, and I
store the message signatures for it. Since I am implementing this
above the Open MPI point-to-point stack (the PML), I need to keep
track of this message information to implement post-checkpoint
resolution of drained messages.

After a successful checkpoint operation I should be able to free the
memory for most of the messages, excluding those that were drained
during the checkpoint operation but not fully matched. Unfortunately
when I looked back at the code I noticed that I was *not* freeing any
memory, but was continuing to append messages as usual. This works
correctly, but it quickly becomes a resource and performance problem
for large numbers of messages.

The re-work of the 'coord' component that I am currently working on
will be more careful with memory. I'll let you know when the new
component is made available.

Cheers,
Josh

On May 30, 2008, at 8:09 AM, Leonardo Fialho wrote:

> Josh,
>
> Some time ago I was studying the CRCP component. I'm not sure, but I
> remember that this component is used for bookmark exchange. Do you
> store this information exactly for that purpose (bookmark exchange)?
> And can you free this memory after a successful checkpoint operation?
>
> Thanks,
> Leonardo
>
> Josh Hursey wrote:
>> Leonardo,
>>
>> You are exactly correct. The CRCP module/component will probably grow
>> the application's memory footprint for every message that you send or
>> receive.
>> This is because the CRCP component tracks the signature {data_size,
>> tag, communicator, peer} (*not* the contents of the message) of every
>> message sent/received.
>>
>> I have in development some fixes for the CRCP component to make it
>> behave a bit better for large numbers of messages, and as a result
>> will also help control the number of memory allocations needed by
>> this
>> component. Unfortunately it is not 100% ready for public use at the
>> moment, but hopefully soon.
>>
>> As an aside: to see clearly the effect of turning the CRCP component
>> on/off at runtime, try the two commands below.
>> Without CRCP:
>> shell$ mpirun -np 2 -am ft-enable-cr -mca crcp none simple-ping 20 1
>> With CRCP:
>> shell$ mpirun -np 2 -am ft-enable-cr simple-ping 20 1
>>
>> -- Josh
>>
>> On May 29, 2008, at 7:54 AM, Leonardo Fialho wrote:
>>
>>
>>> Hi All,
>>>
>>> I ran some tests with a dummy "ping" application and hit some memory
>>> problems. The tests produced the following results:
>>>
>>> 1) Open MPI (without FT):
>>> - delaying 1 second before sending the token to the next node: orted
>>> and application size stable;
>>> - delaying 0 seconds: orted and application size stable.
>>>
>>> 2) Open MPI (with CRCP FT):
>>> - delaying 1 second before sending the token to the next node: orted
>>> stable; application size grows during the first seconds and then
>>> levels off;
>>> - delaying 0 seconds: orted stable; application size grows
>>> continuously.
>>>
>>> I think that it is something in the CRCP module/component...
>>>
>>> Thanks,
>>>
>>> --
>>> Leonardo Fialho
>>> Computer Architecture and Operating Systems Department - CAOS
>>> Universidad Autonoma de Barcelona - UAB
>>> ETSE, Edificio Q, QC/3088
>>> http://www.caos.uab.es
>>> Phone: +34-93-581-2888
>>> Fax: +34-93-581-2478
>>>
>>> #include <mpi.h>
>>> #include <stdio.h>
>>> #include <stdlib.h>
>>> #include <unistd.h>
>>>
>>> int main(int argc, char *argv[]) {
>>>     double time_end, time_start;
>>>     int count, rank, fim;
>>>     char buffer[6] = "test!";
>>>     MPI_Status status;
>>>
>>>     if (argc < 3) {
>>>         printf("\n Insufficient arguments (%d)\n\n ping <times> <delay>\n\n", argc);
>>>         exit(1);
>>>     }
>>>
>>>     if (MPI_Init(&argc, &argv) == MPI_SUCCESS) {
>>>         time_start = MPI_Wtime();
>>>         MPI_Comm_size(MPI_COMM_WORLD, &count);
>>>         MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>>         for (fim = 1; fim <= atoi(argv[1]); fim++) {
>>>             if (rank == 0) {
>>>                 printf("(%d) sent token to (%d)\n", rank, rank + 1);
>>>                 fflush(stdout);
>>>                 sleep(atoi(argv[2]));
>>>                 MPI_Send(buffer, 5, MPI_CHAR, 1, 1, MPI_COMM_WORLD);
>>>                 MPI_Recv(buffer, 5, MPI_CHAR, count - 1, 1,
>>>                          MPI_COMM_WORLD, &status);
>>>             } else {
>>>                 MPI_Recv(buffer, 5, MPI_CHAR, rank - 1, 1,
>>>                          MPI_COMM_WORLD, &status);
>>>                 printf("(%d) sent token to (%d)\n", rank,
>>>                        (rank == count - 1) ? 0 : rank + 1);
>>>                 fflush(stdout);
>>>                 sleep(atoi(argv[2]));
>>>                 MPI_Send(buffer, 5, MPI_CHAR,
>>>                          (rank == count - 1) ? 0 : rank + 1, 1,
>>>                          MPI_COMM_WORLD);
>>>             }
>>>         }
>>>
>>>         time_end = MPI_Wtime();
>>>         MPI_Finalize();
>>>
>>>         if (rank == 0) {
>>>             printf("%f\n", time_end - time_start);
>>>         }
>>>     }
>>>
>>>     return 0;
>>> }
>>> _______________________________________________
>>> users mailing list
>>> users_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>
>
>
>