Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Drastic change in ORTE behavior between trunk and 1.5
From: Ralph Castain (rhc_at_[hidden])
Date: 2011-12-14 20:59:35


On Dec 14, 2011, at 6:51 PM, George Bosilca wrote:

> To be honest I'm totally lost in the naming scheme, which got me confused about the RFC you're referring to. We had an MCA parameter to start a vm, so I thought VM was some kind of special virtualized environment and not the entire ORTE. Based on the behavior of the trunk and the RFC you referred to, it seems that ORTE is now a VM (and only that). What is the real truth? Why did we have a need for orte_vm_launch, and why did this need suddenly disappear?

No mystery here. As was explained in the RFC and the calls, we have to launch the daemons -before- we can map the job in order to have the hardware topology info to support the hardware-based mapping schemes. Maintaining both vm and non-vm launch mechanisms made the code a mess, and so we opted to support only the vm-based launch.
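
To make the ordering concrete, here is a minimal Python sketch of "daemons first, then map". It is not ORTE code: the function names (`launch_daemons`, `map_by_core`, `vm_launch`) and the idea of reducing topology to a core count per node are assumptions made purely for illustration.

```python
def launch_daemons(allocation, probe_topology):
    """Start one (simulated) daemon per allocated node and collect what it reports.

    probe_topology is a hypothetical callable returning the core count of a node.
    """
    return {node: probe_topology(node) for node in allocation}


def map_by_core(topology, nprocs):
    """Place processes on (node, core) slots; needs the topology to exist first."""
    slots = [(node, core)
             for node, cores in topology.items()
             for core in range(cores)]
    if nprocs > len(slots):
        raise ValueError("not enough cores for %d processes" % nprocs)
    return slots[:nprocs]


def vm_launch(allocation, probe_topology, nprocs):
    # Step 1: daemons go to every node in the allocation.
    topology = launch_daemons(allocation, probe_topology)
    # Step 2: only now can a hardware-based mapping be computed.
    placement = map_by_core(topology, nprocs)
    return topology, placement


cores = {"node01": 2, "node02": 2, "node03": 2}
topology, placement = vm_launch(list(cores), cores.get, 3)
print(sorted(topology))   # nodes that received a daemon
print(placement)          # where the application processes landed
```

In this toy run every node gets a daemon while the three processes occupy only the first two nodes, which is the behavior change being discussed in this thread.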

>
> I'm really amazed. Open MPI is the only MPI library doing everything in the reverse way, and all this blessed by the community. We had features that no other MPI implementations supported (but were in the MPI standard), but we removed them (sic).

I have no idea what "feature" you are saying has been removed. We have significantly -increased- the feature set with the revised mapping system, supposedly at the express request of the user community. It was our understanding that most production users utilize their entire allocation, and so doing a vm launch costs nothing - we'd be launching the same daemons anyway.

The hostfile-based user was raised as a possible issue, as I noted before. Hopefully, the recent fix should help ease that situation.
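
As a rough illustration of the intended hostfile behavior (restrict a system-wide default hostfile to the nodes named with --host), here is a small Python sketch. It is not the actual fix: `parse_hostfile` and `select_nodes` are hypothetical helpers, and rejecting hosts that are missing from the hostfile is an assumption of this sketch.

```python
def parse_hostfile(text):
    """Return hostnames in file order, skipping comments and blank lines."""
    hosts = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            # First token is the hostname; "slots=N" and similar are ignored here.
            hosts.append(line.split()[0])
    return hosts


def select_nodes(default_hosts, host_option=None):
    """Limit the default hostfile to the comma-separated --host list, if any."""
    if not host_option:
        return list(default_hosts)
    requested = [h.strip() for h in host_option.split(",") if h.strip()]
    unknown = [h for h in requested if h not in default_hosts]
    if unknown:
        # Assumption: a host outside the default hostfile is treated as an error.
        raise ValueError("hosts not in default hostfile: " + ", ".join(unknown))
    wanted = set(requested)
    return [h for h in default_hosts if h in wanted]


hosts = parse_hostfile("node01 slots=4\n# spare\nnode02 slots=4\nnode03\n")
print(select_nodes(hosts, "node03,node01"))
```

With a filter like this applied before any daemons are started, only the nodes the user asked for would receive a daemon.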

> Meanwhile, the other MPI implementations implemented them … Thus, their feature list increases while ours decreases. Clearly all successful projects should be inspired by our growth strategy.
>
> george.
>
> PS: Thanks for the fix regarding --host. We have encountered another issue. A job that terminates abnormally (MPI_Abort or segfault) will leave daemons behind. Usually it is not very bothersome, except that now with the new VM, our entire cluster is full of useless processes, to the point where after a while we have to reboot the machines to free up PIDs.

I'll look into the lingering daemon issue. Sounds like they (a) aren't properly terminating on abnormal termination, and (b) aren't suiciding when the HNP goes away.

>
> On Dec 14, 2011, at 10:08 , Ralph Castain wrote:
>
>> On Dec 13, 2011, at 9:10 PM, George Bosilca wrote:
>>
>>> I noticed today a drastic change in how ORTE deals with the hostfile between trunk and 1.5.
>>>
>>> 1. 1.5 and prior used the hostfile as a suggestion, a placeholder from which to pick the requested number of daemons during the launch. The current trunk spawns daemons on all the nodes listed in the hostfile, and then spawns the apps only on some of them.
>>
>> It was in the RFC about the revised mapping system, George, and discussed multiple times on the telecons. I even raised this specific point at least twice on those telecons.
>>
>>> 2. If a default hostfile is provided and --host is specified, 1.5 and prior use the --host list to limit the environment to the requested nodes. The current trunk seems to ignore the --host option if a default hostfile is available.
>>
>> I'll check that one - we should limit the operation to the --host list.
>>
>>> In my configuration the hostfile is system-wide, specified in /etc via orte_default_hostfile. It contains all the nodes in the cluster; the users are supposed to use --host to limit their mpirun to a specified subset.
>>>
>>> This seems like quite a significant change. I would have expected an RFC.
>>>
>>> george.
>>>
>>>
>>> _______________________________________________
>>> devel mailing list
>>> devel_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel