On Sep 28, 2012, at 9:50 AM, Sébastien Boisvert wrote:
> I did not know about shared queues.
> It does not run out of memory. ;-)
It runs out of *registered* memory, which could be far less than your actual RAM. Check this FAQ item in particular:
> But the latency is not very good.
> ** Test 1
> --mca btl_openib_max_send_size 4096 \
> --mca btl_openib_eager_limit 4096 \
> --mca btl_openib_rndv_eager_limit 4096 \
> --mca btl_openib_receive_queues S,4096,2048,1024,32 \
> I get 1.5 milliseconds.
> => https://gist.github.com/3799889
> ** Test 2
> --mca btl_openib_receive_queues S,65536,256,128,32 \
> I get around 1.5 milliseconds too.
> => https://gist.github.com/3799940
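As a back-of-the-envelope check on the registered-memory point, here is a rough sketch of how much receive-buffer payload the two receive_queues settings above ask for. It assumes the usual layout of each queue spec (type, buffer size in bytes, number of buffers, then optional tuning fields) and that queues are separated by colons. The function name and the `peers` parameter are made up for illustration, and the estimate ignores per-buffer headers and anything else the BTL registers, so treat it as a lower bound rather than an exact figure.

```python
def estimate_queue_bytes(spec, peers=1):
    """Rough payload bytes requested by a btl_openib_receive_queues value.

    Hypothetical helper: only looks at the buffer size and buffer count
    of each queue and ignores headers and other registered memory.
    """
    total = 0
    for queue in spec.split(":"):
        fields = queue.split(",")
        kind = fields[0]              # "P" = per-peer, "S" = shared
        size = int(fields[1])         # bytes per receive buffer
        num_buffers = int(fields[2])  # number of buffers posted
        # Assumption: per-peer queues are replicated once per peer,
        # while any other queue type is counted once per process.
        copies = peers if kind == "P" else 1
        total += size * num_buffers * copies
    return total


test1 = estimate_queue_bytes("S,4096,2048,1024,32")
test2 = estimate_queue_bytes("S,65536,256,128,32")
print(test1 // 2**20, "MiB")  # 4096 * 2048 bytes
print(test2 // 2**20, "MiB")  # 65536 * 256 bytes
```

Under these assumptions Test 1 comes to 8 MiB and Test 2 to 16 MiB of buffers per process, which is worth comparing against whatever registered-memory limit the nodes are configured with.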
Are you saying 1.5us is bad? That's actually not bad at all. On the most modern hardware with a bunch of software tuning, you can probably get closer to 1us.
> With my virtual router I am sure I can get something around 270 microseconds.
OTOH, that's pretty bad. :-)
I'm not sure why it would be so bad -- are you hammering the virtual router with small incoming messages? You might need to do a little profiling to see where the bottlenecks are.
> Just out of curiosity, does Open-MPI utilize heavily negative values
> internally for user-provided MPI tags ?
I know offhand we use them for collectives. Something is tickling my brain that we use them for other things, too (CID allocation, perhaps?), but I don't remember for sure.
I'm just saying: YMMV. Buyer be warned. And all that. :-)
> If the negative tags are internal to Open-MPI, my code will not touch
> these private variables, right ?
It's not a variable that's the issue. If you do a receive for tag -3 and OMPI sends an internal control message with tag -3, you might receive it instead of OMPI's core. And that would be Bad.
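The safe approach is to keep application tags in the range the MPI standard defines for users: 0 through MPI_TAG_UB, where MPI_TAG_UB is guaranteed to be at least 32767. Below is a minimal sketch of a guard for that. It is plain Python with no MPI calls; `check_user_tag` and the `MessageTag` values are hypothetical names for illustration, and a real program would query the MPI_TAG_UB attribute on its communicator instead of hard-coding the minimum.

```python
from enum import IntEnum

# The MPI standard guarantees MPI_TAG_UB >= 32767; an implementation may
# allow more, so real code should query the attribute at run time.
MIN_GUARANTEED_TAG_UB = 32767


class MessageTag(IntEnum):
    """Hypothetical application message types, all non-negative."""
    WORK_REQUEST = 0
    WORK_REPLY = 1
    SHUTDOWN = 2


def check_user_tag(tag, tag_ub=MIN_GUARANTEED_TAG_UB):
    """Return the tag unchanged if it lies in the user range, else raise."""
    if not 0 <= tag <= tag_ub:
        raise ValueError(f"tag {tag} is outside the user range 0..{tag_ub}")
    return int(tag)


print(check_user_tag(MessageTag.WORK_REPLY))  # accepted: in range
try:
    check_user_tag(-3)  # negative tags may collide with internal traffic
except ValueError as err:
    print("rejected:", err)
```

Running every tag through a check like this before it reaches a send or receive keeps application traffic out of whatever negative tag space the library uses internally.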