
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] Simple program (103 lines) makes Open-1.4.3 hang
From: Sébastien Boisvert (Sebastien.Boisvert.3_at_[hidden])
Date: 2010-11-23 23:47:30


On Tuesday, 23 November 2010 at 20:21 -0500, Jeff Squyres (jsquyres)
wrote:

> Beware that MPI-request-free on active buffers is valid but evil. You CANNOT be sure when the buffer is available for reuse.

Yes, but as I said, in my program an MPI rank never floods other MPI
ranks.
(I like to think they respect each other haha)

Therefore the evilness is no more; it is cast away into oblivion.

If I understand correctly, a call to MPI_Request_free does not affect in
any way the void* buffer associated with the request; it just frees the
memory of the MPI_Request.
For statuses, I use MPI_STATUS_IGNORE, except with my MPI_Iprobe,
obviously!

So, in a way, MPI_REQUEST_IGNORE would be cool, but it does not exist.

For buffer availability:

For MPI_Recv and MPI_Isend, buffers are allocated with a
"RingAllocator" (one malloc at the start of execution).
But the ring is mostly unnecessary, as most of the time there is only
one active send.
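
To make the "one malloc at the start of execution" idea concrete, here is a minimal sketch of a ring allocator. It is not the RingAllocator from the program discussed in this message; the class name, the fixed slot size, and the round-robin reuse policy are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical sketch: one malloc up front, then fixed-size slots handed
// out in round-robin order. NOT the actual RingAllocator from the program.
class RingAllocatorSketch {
public:
    RingAllocatorSketch(std::size_t slots, std::size_t bytesPerSlot)
        : m_slots(slots), m_bytesPerSlot(bytesPerSlot), m_next(0),
          m_memory(static_cast<char*>(std::malloc(slots * bytesPerSlot))) {
        assert(slots > 0 && bytesPerSlot > 0);
        assert(m_memory != nullptr);
    }
    ~RingAllocatorSketch() { std::free(m_memory); }

    // The ring owns its memory, so copying is disabled.
    RingAllocatorSketch(const RingAllocatorSketch&) = delete;
    RingAllocatorSketch& operator=(const RingAllocatorSketch&) = delete;

    // Returns the next slot. After `slots` calls the first slot is handed
    // out again, whether or not its previous user is done with it.
    void* allocate() {
        void* p = m_memory + m_next * m_bytesPerSlot;
        m_next = (m_next + 1) % m_slots;
        return p;
    }

private:
    std::size_t m_slots;
    std::size_t m_bytesPerSlot;
    std::size_t m_next;
    char* m_memory;
};
```

Such a ring never checks whether a slot is still in use, so it is only safe when the number of sends in flight stays below the number of slots. With a single outstanding send at a time, one slot would be enough, which is the point made above.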

Here is an example from my code (14567 lines, yet MPI_Isend and
MPI_Recv each appear only once).
P.S. It is GPLed!

These bits extract a k-mer (a string of k symbols) from a DNA (the code
of life) sequence and send it to the right MPI rank:

void VerticesExtractor::process(...){
        if(!m_ready){
                return;
        }
...
                if(isValidDNA(memory)){
                        VERTEX_TYPE a=wordId(memory);
                        int rankToFlush=0;
                        if(*m_reverseComplementVertex==false){
                                rankToFlush=vertexRank(a,size);
                                m_disData->m_messagesStock.addAt(rankToFlush,a);
                        }else{
                                VERTEX_TYPE b=complementVertex(a,m_wordSize,m_colorSpaceMode);
                                rankToFlush=vertexRank(b,size);
                                m_disData->m_messagesStock.addAt(rankToFlush,b);
                        }
                        if(m_disData->m_messagesStock.flush(rankToFlush,1,TAG_VERTICES_DATA,m_outboxAllocator,m_outbox,rank,false)){
                                m_ready=false;
                        }
                }
...
}

So, if the "toilet" is flushed, the rank sets its member called m_ready
to false.
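
For readers unfamiliar with k-mers, the extraction step above can be illustrated with a small sketch. The real wordId, complementVertex and vertexRank are not shown in this message, so everything below is an assumption: a 2-bit encoding (A=0, C=1, G=2, T=3) with the first symbol in the most significant bits, k limited to 32, and a plain modulo to choose the owning rank. All names ending in Sketch are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// True if the string contains only the four nucleotide symbols.
bool isValidDnaSketch(const std::string& s) {
    for (char c : s) {
        if (c != 'A' && c != 'C' && c != 'G' && c != 'T') return false;
    }
    return true;
}

// Packs a k-mer into 64 bits, 2 bits per symbol (assumed encoding).
std::uint64_t encodeKmerSketch(const std::string& kmer) {
    assert(kmer.size() <= 32);
    std::uint64_t word = 0;
    for (char c : kmer) {
        std::uint64_t code = 0;            // 'A'
        if (c == 'C') code = 1;
        else if (c == 'G') code = 2;
        else if (c == 'T') code = 3;
        word = (word << 2) | code;
    }
    return word;
}

// Reverse complement of an encoded k-mer: read symbols from the last to
// the first and replace each with its complement (A<->T, C<->G).
std::uint64_t reverseComplementSketch(std::uint64_t word, int k) {
    std::uint64_t result = 0;
    for (int i = 0; i < k; i++) {
        std::uint64_t code = word & 3u;
        result = (result << 2) | (3u - code);
        word >>= 2;
    }
    return result;
}

// Assumed ownership rule: plain modulo. A real program may hash first.
int ownerRankSketch(std::uint64_t word, int size) {
    assert(size > 0);
    return static_cast<int>(word % static_cast<std::uint64_t>(size));
}
```

Because every rank applies the same deterministic rule, all occurrences of a given k-mer end up on the same rank, which is what lets the receiver count coverage locally.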

The following bits select the message handler:

O(1) message handler selection !

void MessageProcessor::processMessage(Message*message){
        int tag=message->getTag();
        FNMETHOD f=m_methods[tag];
        (this->*f)(message);
}

Obviously, it calls something like this:
(note that a reply is sent)

void MessageProcessor::call_TAG_VERTICES_DATA(Message*message){
        void*buffer=message->getBuffer();
        int count=message->getCount();
        VERTEX_TYPE*incoming=(VERTEX_TYPE*)buffer;
        int length=count;
        for(int i=0;i<length;i++){
                VERTEX_TYPE l=incoming[i];

                #ifdef SHOW_PROGRESS
                if((*m_last_value)!=(int)m_subgraph->size() and (int)m_subgraph->size()%100000==0){
                        (*m_last_value)=m_subgraph->size();
                        cout<<"Rank "<<rank<<" has "<<m_subgraph->size()<<" vertices "<<endl;
                }
                #endif
                SplayNode<VERTEX_TYPE,Vertex>*tmp=m_subgraph->insert(l);
                #ifdef ASSERT
                assert(tmp!=NULL);
                #endif
                if(m_subgraph->inserted()){
                        tmp->getValue()->constructor();
                }

                tmp->getValue()->setCoverage(tmp->getValue()->getCoverage()+1);
                #ifdef ASSERT
                assert(tmp->getValue()->getCoverage()>0);
                #endif
        }
        Message aMessage(NULL,0,MPI_UNSIGNED_LONG_LONG,message->getSource(),TAG_VERTICES_DATA_REPLY,rank);
        m_outbox->push_back(aMessage);
}
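
The tag-indexed lookup used by processMessage can be shown in isolation. This is a minimal sketch, assuming tags are small consecutive integers; the class, tag names and handlers below are hypothetical and are not taken from the program.

```cpp
#include <cassert>

// Hypothetical tags; the last value is only used to size the table.
enum TagSketch { TAG_DATA_SKETCH = 0, TAG_REPLY_SKETCH = 1, TAG_COUNT_SKETCH = 2 };

class DispatcherSketch {
public:
    // Pointer to a member function taking the message payload.
    typedef void (DispatcherSketch::*Handler)(int payload);

    DispatcherSketch() : m_sum(0), m_replies(0) {
        m_methods[TAG_DATA_SKETCH] = &DispatcherSketch::onData;
        m_methods[TAG_REPLY_SKETCH] = &DispatcherSketch::onReply;
    }

    // O(1) selection: one array lookup and one indirect call, with no
    // switch statement or chain of comparisons on the tag.
    void process(int tag, int payload) {
        assert(tag >= 0 && tag < TAG_COUNT_SKETCH);
        Handler f = m_methods[tag];
        (this->*f)(payload);
    }

    int sum() const { return m_sum; }
    int replies() const { return m_replies; }

private:
    void onData(int payload) { m_sum += payload; }
    void onReply(int) { m_replies++; }

    Handler m_methods[TAG_COUNT_SKETCH];
    int m_sum;
    int m_replies;
};
```

The bounds check matters in practice: an unexpected tag would otherwise index past the table and call through a garbage pointer.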

These bits process the reply:
(all my message handlers are called call_<TAG_NAME>)

void MessageProcessor::call_TAG_VERTICES_DATA_REPLY(Message*message){
        m_verticesExtractor->setReadiness();
}

And, finally, here it goes:

void VerticesExtractor::setReadiness(){
        m_ready=true;
}

So, you can see that there is no problem with my use of MPI_Isend
followed by MPI_Request_free.
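
The handshake described above (send, clear m_ready, wait for the empty reply, set m_ready again) can be modelled without MPI. The sketch below is only a toy simulation with hypothetical names: a std::queue stands in for the network, and a receiver that handles one message every `delay` ticks stands in for the remote rank. It models the flow control only, not MPI's rules about when a send buffer may be reused.

```cpp
#include <cassert>
#include <queue>

// Hypothetical model of a sender that waits for a reply between sends.
struct SenderSketch {
    bool ready = true;
    int inFlight = 0;
    int maxInFlight = 0;
    int sent = 0;

    // Mirrors process(): do nothing until the previous message is acknowledged.
    bool trySend(std::queue<int>& network, int value) {
        if (!ready) return false;
        network.push(value);
        ready = false;
        inFlight++;
        if (inFlight > maxInFlight) maxInFlight = inFlight;
        sent++;
        return true;
    }

    // Mirrors setReadiness(), called when the empty reply arrives.
    void onReply() {
        inFlight--;
        ready = true;
    }
};

// Sends `total` values to a receiver that handles one message every
// `delay` ticks, and returns the largest number of unacknowledged
// messages observed at any time.
int runSketch(int total, int delay) {
    assert(total > 0 && delay > 0);
    SenderSketch sender;
    std::queue<int> network;
    int delivered = 0;
    for (int tick = 0; delivered < total; tick++) {
        if (sender.sent < total) {
            sender.trySend(network, sender.sent);
        }
        if (tick % delay == delay - 1 && !network.empty()) {
            network.pop();      // receiver consumes the data...
            delivered++;
            sender.onReply();   // ...and its reply re-enables the sender
        }
    }
    return sender.maxInFlight;
}
```

In this model the sender never has more than one unacknowledged message, however slow the receiver is, which is the property the argument relies on.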

Thanks !

>
> There was a sentence or paragraph added to MPI 2.2 describing exactly this case.
>
> Sent from my PDA. No type good.
>
> On Nov 23, 2010, at 5:36 PM, Sébastien Boisvert <Sebastien.Boisvert.3_at_[hidden]> wrote:
>
> > On Tuesday, 23 November 2010 at 17:28 -0500, George Bosilca wrote:
> >> Sebastien,
> >>
> >> Using MPI_Isend doesn't guarantee asynchronous progress. As you might be aware, the non-blocking communications are guaranteed to progress only when the application is in the MPI library. Currently very few MPI implementations progress asynchronously (and unfortunately Open MPI is not one of them).
> >>
> >
> > Regardless, I just need the non-blocking behavior.
> > I call MPI_Request_free just after MPI_Isend, and I use a ring allocator
> > to allocate message buffers.
> >
> > Message recipients just reply with another message to the source, using
> > a NULL buffer.
> >
> > The sender waits for the reply before sending the next message.
> >
> > And it works for assembling bacterial genomes on many MPI ranks:
> >
> > ...
> > Rank 0: 162 contigs/4576725 nucleotides
> >
> > Rank 0 reports the elapsed time, Tue Nov 23 01:35:48 2010
> > ---> Step: Collection of fusions
> > Elapsed time: 0 seconds
> > Since beginning: 17 minutes, 33 seconds
> >
> > Elapsed time for each step, Tue Nov 23 01:35:48 2010
> >
> > Beginning of computation: 1 seconds
> > Distribution of sequence reads: 7 minutes, 49 seconds
> > Distribution of vertices: 19 seconds
> > Calculation of coverage distribution: 1 seconds
> > Distribution of edges: 29 seconds
> > Indexing of sequence reads: 1 seconds
> > Computation of seeds: 2 minutes, 33 seconds
> > Computation of library sizes: 1 minutes, 47 seconds
> > Extension of seeds: 3 minutes, 34 seconds
> > Computation of fusions: 59 seconds
> > Collection of fusions: 0 seconds
> > Completion of the assembly: 17 minutes, 33 seconds
> >
> > Rank 0 wrote Ecoli-THEONE.CoverageDistribution.txt
> > Rank 0 wrote Ecoli-THEONE.fasta
> > Rank 0 wrote Ecoli-THEONE.ReceivedMessages.txt
> > Rank 0 wrote Ecoli-THEONE.Library0.txt
> > Rank 0 wrote Ecoli-THEONE.Library1.txt
> >
> > Au revoir !
> >
> >
> >> george.
> >>
> >> On Nov 23, 2010, at 17:17 , Sébastien Boisvert wrote:
> >>
> >>> I now use MPI_Isend, so the problem is no more.
> >>
> >>
> >> _______________________________________________
> >> devel mailing list
> >> devel_at_[hidden]
> >> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> >
> > --
> > M. Sébastien Boisvert
> > PhD student in physiology-endocrinology at Université Laval
> > Scholarship holder, Canadian Institutes of Health Research
> > Team of Professor Jacques Corbeil
> >
> > Centre de recherche en infectiologie de l'Université Laval
> > Local R-61B
> > 2705, boulevard Laurier
> > Québec, Québec
> > Canada G1V 4G2
> > Telephone: 418 525 4444 46342
> >
> > Email: SEB_at_[hidden]
> > Web: http://boisvert.info
> >
> > "Innovation comes only from an assault on the unknown" -Sydney Brenner
> >
