Thanks a lot for your reply,
I'm using blocking Send and Receive. All the clients are sending data, and
the server receives the messages from the clients with MPI_ANY_SOURCE as
the source. Do you think there is a race condition in this pattern?
I searched a lot and used TotalView, but I couldn't detect such a case. I
would really appreciate it if you could send me a link or give an example
of a possible race condition in that scenario.
Also, when I partition the message into smaller parts (sent in sequence,
with all the other clients waiting until the send finishes), it works fine.
Does that rule out a race condition?
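For concreteness, the partitioned send described above can be sketched
roughly as follows. This is only a sketch under assumptions, not the actual
application code: the names send_fn, chunk_count, send_in_chunks, send_log
and log_send are hypothetical, the callback stands in for a blocking
MPI_Send so the logic can be checked without MPI, and the sizes (1 MB =
1048576 bytes, 20 KB = 20480 bytes per part) are one reading of the figures
in the quoted message below.

```c
#include <stddef.h>

/* Hypothetical callback type: sends `len` bytes starting at `data`.
 * In an MPI program this would wrap a blocking MPI_Send call.
 * Returns 0 on success, nonzero on failure. */
typedef int (*send_fn)(const unsigned char *data, size_t len, void *ctx);

/* Number of parts needed to cover `total` bytes when each part holds
 * at most `chunk` bytes. Returns 0 if `chunk` is 0. */
size_t chunk_count(size_t total, size_t chunk)
{
    if (chunk == 0)
        return 0;
    return total / chunk + (total % chunk != 0 ? 1 : 0);
}

/* Send `buf` as consecutive parts of at most `chunk` bytes each.
 * Stops at the first failed send and returns the number of bytes
 * that were sent successfully. */
size_t send_in_chunks(const unsigned char *buf, size_t total, size_t chunk,
                      send_fn send, void *ctx)
{
    size_t sent = 0;

    if (chunk == 0)
        return 0;

    while (sent < total) {
        size_t len = total - sent;
        if (len > chunk)
            len = chunk;            /* the last part may be shorter */
        if (send(buf + sent, len, ctx) != 0)
            break;
        sent += len;
    }
    return sent;
}

/* Stand-in for the real transport: only records what it was asked to send. */
struct send_log {
    size_t calls;    /* number of parts sent */
    size_t bytes;    /* total bytes across all parts */
    size_t largest;  /* size of the largest single part */
};

int log_send(const unsigned char *data, size_t len, void *ctx)
{
    struct send_log *rec = ctx;
    (void)data;                     /* payload is not inspected here */
    rec->calls += 1;
    rec->bytes += len;
    if (len > rec->largest)
        rec->largest = len;
    return 0;
}
```

With those sizes, a 1 MB buffer goes out as 52 parts, the last one being
4096 bytes. In a real client the callback would wrap MPI_Send, and the
server would have to post one matching receive per part.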
Regards,
Amr
>>We've seen similar things in our code. In our case it is probably due to a
>>race condition. Try running the segv'ing process in a debugger, and it will
>>likely show you a bug in your code.
>>On Feb 24, 2010 9:36 PM, "Amr Hassan" <amr.abdelaziz_at_[hidden]> wrote:
>>Hi All,
>>I'm facing a strange problem with OpenMPI.
>>I'm developing an application which is required to send a message from each
>>client (1 MB each) to a server node around 10 times per second (it's a
>>distributed render application and I'm trying to reach a higher frame rate).
>>The problem is that OpenMPI crashes in that case and only works if I
>>partition these messages into a set of 20 k sub-messages with a sleep
>>between each one of them of around 1 to 10 ms! This solution is very
>>expensive in terms of the time needed to send the data. Are there any other
>>solutions?
>>The error I get now is:
>>Signal: Segmentation fault (11)
>>Signal code: Address not mapped (1)
>>Failing at address: xxxxxxxxxxxxx
>>The OS is Linux CentOS. I'm using the latest version of OpenMPI.
>>I appreciate any help regarding that.
>>Regards,
>>Amr