Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] btl_openib_cpc_include rdmacm questions
From: Brock Palen (brockp_at_[hidden])
Date: 2011-07-27 11:45:36


Sorry to bring this back up.
We recently had an outage, updated the firmware on our GD4700, and installed a new Mellanox-provided OFED stack, and the problem has returned.
Specifically, I am able to reproduce the problem with IMB on 4 12-core nodes when it tries to go to 16 cores. I have verified that setting btl_openib_flags to 313 fixes the issue, albeit with a bit lower bandwidth for some message sizes.
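
For reference, the flag can go on the mpirun command line; something like this (illustrative IMB run for our 4 nodes / 48 ranks):

   mpirun -np 48 -mca btl_openib_flags 313 ./IMB-MPI1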

Has there been any progress on this issue?

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
brockp_at_[hidden]
(734)936-1985

On May 18, 2011, at 10:25 AM, Brock Palen wrote:

> Well, I have a new wrench in this situation.
> We had a power failure at our datacenter that took down our entire system: nodes, switch, SM.
> Now I am unable to reproduce the error with oob, default ibflags, etc.
>
> Does this shed any light on the issue? It also makes the issue hard to debug now that I can no longer reproduce it.
>
> Any thoughts? Am I overlooking something?
>
> Brock Palen
> www.umich.edu/~brockp
> Center for Advanced Computing
> brockp_at_[hidden]
> (734)936-1985
>
>
>
> On May 17, 2011, at 2:18 PM, Brock Palen wrote:
>
>> Sorry, typo: 314, not 313.
>>
>> Brock Palen
>> www.umich.edu/~brockp
>> Center for Advanced Computing
>> brockp_at_[hidden]
>> (734)936-1985
>>
>>
>>
>> On May 17, 2011, at 2:02 PM, Brock Palen wrote:
>>
>>> Thanks, I thought of looking at ompi_info after I sent that note, sigh.
>>>
>>> SEND_INPLACE appears to improve performance for larger messages in my synthetic benchmarks compared to plain SEND. It also appears that SEND_INPLACE still lets our code run.
>>>
>>> We are working on getting devs access to our system and code.
>>>
>>> Brock Palen
>>> www.umich.edu/~brockp
>>> Center for Advanced Computing
>>> brockp_at_[hidden]
>>> (734)936-1985
>>>
>>>
>>>
>>> On May 16, 2011, at 11:49 AM, George Bosilca wrote:
>>>
>>>> Here is the output of the "ompi_info --param btl openib":
>>>>
>>>> MCA btl: parameter "btl_openib_flags" (current value: <306>, data source: default value)
>>>>          BTL bit flags (general flags: SEND=1, PUT=2, GET=4, SEND_INPLACE=8,
>>>>          RDMA_MATCHED=64, HETEROGENEOUS_RDMA=256; flags only used by the "dr"
>>>>          PML (ignored by others): ACK=16, CHECKSUM=32, RDMA_COMPLETION=128;
>>>>          flags only used by the "bfo" PML (ignored by others): FAILOVER_SUPPORT=512)
>>>>
>>>> So the flags value 305 means: HETEROGENEOUS_RDMA | CHECKSUM | ACK | SEND. Most of these flags are totally useless in the current version of Open MPI (DR is not supported), so the only part that really matters is SEND | HETEROGENEOUS_RDMA.
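>>>>
>>>> Spelling out the arithmetic: 305 = 256 + 32 + 16 + 1 (HETEROGENEOUS_RDMA +
>>>> CHECKSUM + ACK + SEND); neither PUT (2) nor GET (4) is set, so the RDMA
>>>> protocols are disabled.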
>>>>
>>>> If you want to enable the send protocol, first try SEND | SEND_INPLACE (9); if that doesn't work, downgrade to SEND (1).
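>>>>
>>>> For example (a sketch; substitute your own application and launch options):
>>>>
>>>>    mpirun -mca btl_openib_flags 9 ./your_app    # SEND | SEND_INPLACE
>>>>    mpirun -mca btl_openib_flags 1 ./your_app    # fallback: SEND only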
>>>>
>>>> george.
>>>>
>>>> On May 16, 2011, at 11:33 , Samuel K. Gutierrez wrote:
>>>>
>>>>>
>>>>> On May 16, 2011, at 8:53 AM, Brock Palen wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On May 16, 2011, at 10:23 AM, Samuel K. Gutierrez wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Just out of curiosity - what happens when you add the following MCA option to your openib runs?
>>>>>>>
>>>>>>> -mca btl_openib_flags 305
>>>>>>
>>>>>> You, sir, found the magic combination.
>>>>>
>>>>> :-) - cool.
>>>>>
>>>>> Developers - does this smell like a registered memory availability hang?
>>>>>
>>>>>> I verified this lets IMB and CRASH progress past their lockup points;
>>>>>> I will have a user test this.
>>>>>
>>>>> Please let us know what you find.
>>>>>
>>>>>> Is this an ok option to put in our environment? What does 305 mean?
>>>>>
>>>>> There may be a performance hit associated with this configuration, but if it lets your users run, then I don't see a problem with adding it to your environment.
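>>>>>
>>>>> One sketch of making it the default (assuming a standard install; adjust the path for your prefix):
>>>>>
>>>>>    # site-wide: <prefix>/etc/openmpi-mca-params.conf
>>>>>    btl_openib_flags = 305
>>>>>
>>>>> or per-user via the environment:
>>>>>
>>>>>    export OMPI_MCA_btl_openib_flags=305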
>>>>>
>>>>> If I'm reading things correctly, 305 turns off RDMA PUT/GET and turns on SEND.
>>>>>
>>>>> OpenFabrics gurus - please correct me if I'm wrong :-).
>>>>>
>>>>> Samuel Gutierrez
>>>>> Los Alamos National Laboratory
>>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>> Brock Palen
>>>>>> www.umich.edu/~brockp
>>>>>> Center for Advanced Computing
>>>>>> brockp_at_[hidden]
>>>>>> (734)936-1985
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Samuel Gutierrez
>>>>>>> Los Alamos National Laboratory
>>>>>>>
>>>>>>> On May 13, 2011, at 2:38 PM, Brock Palen wrote:
>>>>>>>
>>>>>>>> On May 13, 2011, at 4:09 PM, Dave Love wrote:
>>>>>>>>
>>>>>>>>> Jeff Squyres <jsquyres_at_[hidden]> writes:
>>>>>>>>>
>>>>>>>>>> On May 11, 2011, at 3:21 PM, Dave Love wrote:
>>>>>>>>>>
>>>>>>>>>>> We can reproduce it with IMB. We could provide access, but we'd have to
>>>>>>>>>>> negotiate with the owners of the relevant nodes to give you interactive
>>>>>>>>>>> access to them. Maybe Brock's would be more accessible? (If you
>>>>>>>>>>> contact me, I may not be able to respond for a few days.)
>>>>>>>>>>
>>>>>>>>>> Brock has replied off-list that he, too, is able to reliably reproduce the issue with IMB, and is working to get access for us. Many thanks for your offer; let's see where Brock's access takes us.
>>>>>>>>>
>>>>>>>>> Good. Let me know if we could be useful.
>>>>>>>>>
>>>>>>>>>>>> -- we have not closed this issue,
>>>>>>>>>>>
>>>>>>>>>>> Which issue? I couldn't find a relevant-looking one.
>>>>>>>>>>
>>>>>>>>>> https://svn.open-mpi.org/trac/ompi/ticket/2714
>>>>>>>>>
>>>>>>>>> Thanks. In case it's useful info: it hangs for me with 1.5.3 & np=32 on
>>>>>>>>> ConnectX with more than one collective (I can't recall which).
>>>>>>>>
>>>>>>>> Extra data point: that ticket said it ran with mpi_preconnect_mpi 1, but that doesn't help here; both my production code (CRASH) and IMB still hang.
>>>>>>>>
>>>>>>>>
>>>>>>>> Brock Palen
>>>>>>>> www.umich.edu/~brockp
>>>>>>>> Center for Advanced Computing
>>>>>>>> brockp_at_[hidden]
>>>>>>>> (734)936-1985
>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Excuse the typing -- I have a broken wrist
>>>>>>>>>
>>>>
>>>> George Bosilca
>>>> Research Assistant Professor
>>>> Innovative Computing Laboratory
>>>> Department of Electrical Engineering and Computer Science
>>>> University of Tennessee, Knoxville
>>>> http://web.eecs.utk.edu/~bosilca/