Good to know that I'm not just not finding the solution, there simply is none.
The system is actually dedicated to the job. But while working, the process may receive a signal that alters the ongoing job, for example a terminate signal or more data to be taken into consideration. That's why I need to listen in parallel, and why having one CPU core less is troublesome.
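
A workaround sometimes used for this situation is to poll for pending messages and sleep between polls, trading a little latency for a mostly idle core. The following is only a minimal sketch of that loop, not Open MPI code: `ready_fn`, `poll_with_sleep`, and `countdown_ready` are hypothetical names introduced for illustration. In an MPI program the callback would presumably wrap `MPI_Iprobe`, with `MPI_Recv` called once a message is reported as pending.

```c
#define _POSIX_C_SOURCE 200809L  /* for nanosleep() */
#include <assert.h>
#include <time.h>

/* Hypothetical callback type: returns nonzero once a message is pending. */
typedef int (*ready_fn)(void *ctx);

/*
 * Poll `ready` until it reports a pending message, sleeping `sleep_ns`
 * nanoseconds after each unsuccessful poll so the core is free for
 * other work in the meantime.
 * Returns the number of sleeps taken before success, or -1 if
 * `max_polls` polls passed without a message.
 */
static int poll_with_sleep(ready_fn ready, void *ctx, long sleep_ns, int max_polls)
{
    assert(ready != NULL && sleep_ns >= 0);

    struct timespec pause = {
        .tv_sec  = sleep_ns / 1000000000L,
        .tv_nsec = sleep_ns % 1000000000L
    };

    for (int polls = 0; polls < max_polls; ++polls) {
        if (ready(ctx))
            return polls;
        nanosleep(&pause, NULL);  /* yield the core instead of spinning */
    }
    return -1;
}

/*
 * Stand-in for a real probe (assumption, for demonstration only):
 * reports "not ready" until the counter behind `ctx` has reached zero.
 */
static int countdown_ready(void *ctx)
{
    int *remaining = ctx;
    if (*remaining > 0) {
        --*remaining;
        return 0;
    }
    return 1;
}
```

Called as `poll_with_sleep(countdown_ready, &remaining, 1000000L, 100)`, the loop sleeps one millisecond after each unsuccessful poll. A longer sleep lowers the CPU load further but delays the reaction to a terminate request, so the interval has to be tuned to the job.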

George Bosilca wrote:
Currently there is no workaround for this issue. We consider(ed) that when you run an MPI job the cluster is in dedicated mode, so 100% CPU consumption is acceptable. However, as we discussed at our last meeting, there are other reasons to be able to yield the CPU until a message arrives. Therefore, we plan to have a blocking mode in the near future. There is no timeframe for this, but the discussions have already started (that is usually a good sign).


On Oct 23, 2007, at 9:17 AM, Murat Knecht wrote:

Thanks for answering. Unfortunately, I did try that, too. The point is that I don't understand the resource consumption. Even if the processor is yielded, the process is still busy waiting, wasting system resources which could otherwise be used for actual work. Isn't there some way to activate an interrupt mechanism, so that the wait/recv blocks the thread, e.g. puts it to sleep, until notified?
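
To illustrate what "blocks the thread until notified" means at the operating-system level, here is a minimal sketch using POSIX `poll()` on a file descriptor. It is not how Open MPI works internally, and `wait_readable` and `read_when_ready` are hypothetical helper names. It only shows the kind of kernel-level wait the question is asking about, during which the waiting thread consumes no CPU.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/*
 * Sleep in the kernel until `fd` becomes readable or `timeout_ms`
 * milliseconds elapse (-1 waits indefinitely).
 * Returns 1 if a read would not block, 0 on timeout, -1 on error.
 */
static int wait_readable(int fd, int timeout_ms)
{
    assert(fd >= 0);

    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int rc = poll(&pfd, 1, timeout_ms);

    if (rc < 0)
        return -1;   /* error, e.g. interrupted by a signal (errno == EINTR) */
    if (rc == 0)
        return 0;    /* timed out, nothing arrived */
    /* POLLHUP means the peer closed; a read returns immediately then, too. */
    return (pfd.revents & (POLLIN | POLLHUP)) ? 1 : -1;
}

/*
 * Wait for data without spinning, then read it.
 * Returns the number of bytes read, or -1 on timeout or error.
 */
static ssize_t read_when_ready(int fd, void *buf, size_t len, int timeout_ms)
{
    if (wait_readable(fd, timeout_ms) != 1)
        return -1;
    return read(fd, buf, len);
}
```

With a TCP socket or a pipe as `fd`, a thread calling `read_when_ready(fd, buf, sizeof buf, -1)` stays asleep until the kernel wakes it, which is the behaviour asked for above.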


Tim Mattox wrote:
You should look at these two FAQ entries:

To get what you want, you need to force Open MPI to yield the processor rather than aggressively wait for a message.

On 10/23/07, Murat Knecht <> wrote:
Hi,

Testing a distributed system locally, I couldn't help but notice that a blocking MPI_Recv causes 100% CPU load. I deactivated the shared-memory BTL (at both compile time and run time) and specified "tcp,self" to be used. Still, one core is busy.

Even on a distributed system I intend to perform work while waiting for incoming requests. For this purpose, having one core busy waiting for requests is uncomfortable, to say the least. Does Open MPI not use some blocking system call on a TCP port internally? Since I deactivated the understandably costly shared-memory waits, this seems weird to me.

Does someone have an explanation or, even better, a fix / workaround / solution?

Thanks,
Murat
_______________________________________________
users mailing list