
Subject: Re: [OMPI users] About MPI_TAG_UB
From: Sébastien Boisvert (sebastien.boisvert.3_at_[hidden])
Date: 2012-09-28 09:50:32


Hi,

I did not know about shared receive queues.

With them, the job no longer runs out of memory. ;-)

But the latency is not very good.

** Test 1

--mca btl_openib_max_send_size 4096 \
--mca btl_openib_eager_limit 4096 \
--mca btl_openib_rndv_eager_limit 4096 \
--mca btl_openib_receive_queues S,4096,2048,1024,32 \

I get 1.5 milliseconds.

  => https://gist.github.com/3799889
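
For reference, here is how those flags fit into a complete command line. This is only a sketch: the process count, BTL list, and binary name below are placeholders, not my actual job script.

    mpirun -np 4096 \
           --mca btl openib,sm,self \
           --mca btl_openib_max_send_size 4096 \
           --mca btl_openib_eager_limit 4096 \
           --mca btl_openib_rndv_eager_limit 4096 \
           --mca btl_openib_receive_queues S,4096,2048,1024,32 \
           ./my_application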

** Test 2

--mca btl_openib_receive_queues S,65536,256,128,32 \

I get around 1.5 milliseconds too.

  => https://gist.github.com/3799940

With my virtual router, I am confident I can get around 270 microseconds.

Just out of curiosity, does Open MPI make heavy use of negative tag values
internally, separate from user-provided MPI tags?

If the negative tags are internal to Open MPI, my code will never touch
these reserved values, right?

Sébastien

On 28/09/12 08:59 AM, Jeff Squyres wrote:
> On Sep 27, 2012, at 7:22 PM, Sébastien Boisvert wrote:
>
>> Without the virtual message router, I get messages like these:
>>
>> [cp2558][[30209,1],0][connect/btl_openib_connect_oob.c:490:qp_create_one] error creating qp errno says Cannot allocate memory
>
> You're running out of registered memory. Check out these FAQ items:
>
> http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages
> http://www.open-mpi.org/faq/?category=openfabrics#ib-receive-queues
>
> The second one tells you how to change your receive queue types; Open MPI defaults to 1 per-peer receive queue and several shared receive queues. You might want to change to all shared receive queues.
>
>> The real message tag, the real source, and the real destination are all stored
>> in the MPI tag. I know this is ugly, but it works. I cannot store this
>> information in the message buffer because the buffer can be NULL.
>>
>> bits 0 to 7: tag (8 bits, values from 0 to 255, 256 possible values)
>> bits 8 to 19: true source (12 bits, values from 0 to 4095, 4096 possible values)
>> bits 20 to 31: true destination (12 bits, values from 0 to 4095, 4096 possible values)
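
The packing looks roughly like this in code. It is only a sketch with hypothetical helper names; note that a destination of 2048 or more sets bit 31, which makes the packed tag negative, and negative tags are invalid in MPI.

    /* Sketch of the 8/12/12-bit packing described above; helper names
     * are hypothetical. Unsigned arithmetic avoids shifting into the
     * sign bit, but a destination >= 2048 still yields a negative tag
     * once the value is converted back to int. */
    static inline int pack_routing_tag(unsigned tag, unsigned source, unsigned destination) {
        return (int)((tag & 0xffu)
                     | ((source & 0xfffu) << 8)
                     | ((destination & 0xfffu) << 20));
    }

    static inline int unpack_tag(int packed)         { return (int)((unsigned)packed & 0xffu); }
    static inline int unpack_source(int packed)      { return (int)(((unsigned)packed >> 8) & 0xfffu); }
    static inline int unpack_destination(int packed) { return (int)(((unsigned)packed >> 20) & 0xfffu); }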
>>
>> Without the virtual router, my code stays within the guarantee that the value
>> returned by MPI_Comm_get_attr(MPI_COMM_WORLD, MPI_TAG_UB, ...) is at least 32767 (my tags are <= 255).
>>
>> When I run jobs with 4096 processes through the virtual message router, I get the error:
>>
>> MPI_ERR_TAG: invalid tag.
>>
>> Without the virtual message router, I get:
>>
>> [cp2558][[30209,1],0][connect/btl_openib_connect_oob.c:490:qp_create_one] error creating qp errno says Cannot allocate memory
>>
>> With Open MPI 1.5.4, the upper bound is 17438272 (at least in our build). That explains the MPI_ERR_TAG error.
>
> +1 on what Hristo said -- remember that you get a pointer to an MPI_Aint. So you need to dereference it to get the value back.
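
To make sure I apply this correctly, here is a minimal sketch of the query, assuming (as the MPI standard specifies for C) that the returned attribute value points to an int:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        void *attr_val;
        int flag;

        MPI_Init(&argc, &argv);
        /* attr_val receives a pointer to the bound, not the bound itself,
         * so it must be dereferenced. */
        MPI_Comm_get_attr(MPI_COMM_WORLD, MPI_TAG_UB, &attr_val, &flag);
        if (flag) {
            printf("MPI_TAG_UB = %d\n", *(int *)attr_val);
        }
        MPI_Finalize();
        return 0;
    }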
>
>> My 2 questions:
>>
>> 1. Is there a better way to store routing information ?
>
> Seems fine to me. Just stay <= INT_MAX and you should be fine.
>
>> 2. Can I create my own communicator and set its MPI_TAG_UB to whatever I want ?
>
> As Hristo said, no. It's a limit in Open MPI.
>