Open MPI Development Mailing List Archives

From: Andre Lichei (andre.lichei_at_[hidden])
Date: 2006-05-26 05:12:02


Many thanks for the fast reply!
I checked again, but it still isn't clear to me. :(

> Looks like you missed the bitmap.
I ignored the bitmap on purpose. The only lines where the bitmap is
changed are in the loop that iterates over all BTL modules. Something
like this:

for each btl {
    ompi_bitmap_clear_all_bits(reachable);              /* line 229 */
    rc = btl->btl_add_procs(btl, n_new_procs, new_procs,
                            btl_endpoints, reachable);  /* line 232 */
}
So when the add_procs function of the r2 component returns, the bitmap
only holds the information about which processes are reachable by the
last BTL. Here that is the BTL with the lowest exclusivity. I could not
imagine what purpose that would serve, so I ignored it.
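
To make the point concrete, here is a minimal sketch of the effect I mean. It is not the r2 code: toy_bitmap_t, toy_btl_t and all toy_* functions are hypothetical stand-ins, and I assume at most 32 processes so that a plain integer can act as the bitmap. The first function clears the bitmap before every BTL, as in the loop above. The second keeps the union over all BTLs, which is what I would have expected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for ompi_bitmap_t: one bit per process (max 32). */
typedef uint32_t toy_bitmap_t;

/* Hypothetical stand-in for a BTL module: a mask of the processes it reaches. */
typedef struct {
    uint32_t reach_mask;
} toy_btl_t;

/* Stand-in for btl_add_procs: set one bit for every process this BTL reaches. */
static void toy_btl_add_procs(const toy_btl_t *btl, size_t nprocs,
                              toy_bitmap_t *reachable)
{
    for (size_t i = 0; i < nprocs; ++i) {
        if (btl->reach_mask & ((uint32_t)1 << i)) {
            *reachable |= (uint32_t)1 << i;
        }
    }
}

/* Clears the bitmap before every BTL, like the loop quoted above. */
toy_bitmap_t toy_last_btl_only(const toy_btl_t *btls, size_t nbtls,
                               size_t nprocs)
{
    toy_bitmap_t reachable = 0;
    assert(nprocs <= 32);
    for (size_t b = 0; b < nbtls; ++b) {
        reachable = 0; /* plays the role of ompi_bitmap_clear_all_bits() */
        toy_btl_add_procs(&btls[b], nprocs, &reachable);
    }
    return reachable; /* only the last BTL's bits survive */
}

/* Alternative: accumulate the per-BTL results into one bitmap. */
toy_bitmap_t toy_union_of_btls(const toy_btl_t *btls, size_t nbtls,
                               size_t nprocs)
{
    toy_bitmap_t all = 0;
    assert(nprocs <= 32);
    for (size_t b = 0; b < nbtls; ++b) {
        toy_bitmap_t reachable = 0;
        toy_btl_add_procs(&btls[b], nprocs, &reachable);
        all |= reachable;
    }
    return all;
}
```

With two toy BTLs, one reaching processes 0 and 1 and the other reaching only process 2, the first function reports just process 2, while the second reports processes 0, 1 and 2.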

> Every time one of the endpoints is reachable, the corresponding bit
> in the bitmap is set to one.
By "endpoint is reachable", do you mean that the process is reachable? I
believe the r2 function behaves differently: the bitmap only holds the
information from the last BTL. I should add that I'm not too familiar
with C, and I think I made a mistake in my last mail.
mca_bml_r2_add_procs() creates a new array of processes, holding only
the processes which are really new, but does NOT return it. (I was
confused by the pointers. Sorry.) The endpoints in the bml_endpoints
array correspond to the processes in the new array, so they do not
correspond to the processes in the array the upper level holds. The
same is true for the bitmap.
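
The pointer confusion can be shown with a small sketch. This is not the r2 code: toy_add_procs, TOY_MAX and the use of plain int process ids are hypothetical, and "known_id" is only a stand-in for the check whether a process already has an endpoint. The function rebinds its own copies of procs and nprocs to the filtered list, and the caller sees neither change.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX 8 /* hypothetical upper bound on the number of processes */

/*
 * Hypothetical callee: filters out the already-known process and then
 * continues to work on the shorter list. Both parameters are passed by
 * value, so rebinding them here changes only the local copies.
 * Returns the number of entries the callee itself worked on.
 */
size_t toy_add_procs(size_t nprocs, const int *procs, int known_id)
{
    static int new_procs[TOY_MAX];
    size_t n_new = 0;

    assert(nprocs <= TOY_MAX);
    for (size_t i = 0; i < nprocs; ++i) {
        if (procs[i] != known_id) {
            new_procs[n_new++] = procs[i];
        }
    }

    procs = new_procs; /* the caller's array pointer is unaffected */
    nprocs = n_new;    /* the caller's count is unaffected as well */

    /* From here on the callee only sees the filtered list. */
    for (size_t i = 0; i < nprocs; ++i) {
        assert(procs[i] != known_id);
    }
    return nprocs;
}
```

Called with the ids 0, 1, 2, 3 and process 2 as the known one, the function works on three entries internally, but the caller still holds four processes in its original array.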

> The upper level reparses the bitmap and detects the number of
> registered BTLs.
Sorry, but I don't understand this.

The more I think about it, the more I believe that the add_procs
function in the bml framework should behave like this. When the
function returns:
- procs holds all processes,
- bml_endpoints holds the endpoints, each corresponding to one process
  in the procs array,
- the corresponding bit in the bitmap is set when the bml can reach the
  process.
Is that right?
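
As a sketch of the behaviour I have in mind (only my assumption of the contract, not what bml_r2.c does): toy_proc_t, TOY_NO_ENDPOINT and the made-up endpoint handles are all hypothetical, and the bitmap is again a plain 32-bit integer. The index i always refers to the caller's procs array, so a known process keeps its endpoint and nothing is overwritten.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NO_ENDPOINT (-1) /* hypothetical marker: no endpoint yet */

/* Hypothetical stand-in for ompi_proc_t. */
typedef struct {
    int id;        /* process id */
    int endpoint;  /* existing endpoint handle, or TOY_NO_ENDPOINT */
    int can_reach; /* nonzero if at least one BTL could reach the process */
} toy_proc_t;

/*
 * Proposed contract: endpoints[i] always belongs to procs[i] of the
 * caller's array, and bit i of the result is set when procs[i] is
 * reachable. Known processes keep their existing endpoint.
 */
uint32_t toy_bml_add_procs(size_t nprocs, toy_proc_t *procs, int *endpoints)
{
    uint32_t reachable = 0;

    assert(nprocs <= 32);
    for (size_t i = 0; i < nprocs; ++i) {
        if (procs[i].endpoint == TOY_NO_ENDPOINT && procs[i].can_reach) {
            /* New, reachable process: create an endpoint (made-up handle). */
            procs[i].endpoint = 100 + procs[i].id;
        }
        endpoints[i] = procs[i].endpoint;
        if (endpoints[i] != TOY_NO_ENDPOINT) {
            reachable |= (uint32_t)1 << i;
        }
    }
    return reachable;
}
```

For four processes where process 2 is already known and process 3 cannot be reached, this would leave the endpoint of process 2 untouched, create endpoints for processes 0 and 1, and set bits 0 to 2 in the bitmap.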

Many thanks!!

André

>
> Thanks,
> george.
>
> On May 24, 2006, at 6:12 AM, Andre Lichei wrote:
>
>> Hello
>>
>> Currently I'm working on the r2 component of the bml framework. When I
>> tried to get a deeper understanding of the component, I had
>> difficulty figuring out how the add_procs function should behave. So
>> my question is: how should the function behave, and what is the purpose
>> of the bml_endpoints array? An explanation of my difficulties follows.
>>
>> The add_procs function is implemented in bml_r2.c and starts at line
>> 164:
>>
>> mca_bml_r2_add_procs(size_t nprocs,
>>                      struct ompi_proc_t** procs,
>>                      struct mca_bml_base_endpoint_t** bml_endpoints,
>>                      struct ompi_bitmap_t* reachable)
>>
>> When I first read it, it seemed that the function accepts an array of
>> ompi_proc_t structs and returns an array of the same size which
>> contains one bml_endpoint for every process in the procs array.
>> At the beginning of the function (lines 193 to 204) there is a loop
>> checking whether there are processes which are not new. If this is the
>> case, the existing bml_endpoint is selected and stored in the endpoint
>> array. New processes are stored in a different array. This means that
>> if all processes are known, the function behaves as described above.
>> When there are new processes, the procs array is overwritten with the
>> newly created array of new processes (line 210). This array may be
>> shorter (when there was at least one known process), so the number of
>> elements in nprocs is overwritten too (line 211). But this number is
>> not a pointer, so the calling function cannot notice the change.
>> Now new bml_endpoints are created and stored in the bml_endpoints
>> array. But they are stored at the position the process has in the new
>> array (line 271)! So existing entries may be overwritten.
>>
>> Example:
>> The function receives an array with 4 processes (process 0 to 3).
>> Process 2 is already known. So in the first loop the bml_endpoint of
>> process 2 is stored at bml_endpoints[2]. Also, a new array is created
>> containing processes 0, 1 and 3. This new array replaces the procs
>> array. Then bml_endpoints are created for all three processes and
>> stored at bml_endpoints[0], [1] and [2]. So the existing entry
>> (bml_endpoints[2]) is overwritten.
>> So the bml_endpoints array contains only three elements, but the
>> calling function still has 4 as the count, because the new number
>> can't be returned.
>>
>> So my question again: is this the intended behavior or is it a bug?
>> How should the function behave?
>>
>> Thanks,
>> André
>>
>>
>> _______________________________________________
>> devel mailing list
>> devel_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel