On 2/6/12 8:14 AM, "Reuti" <reuti_at_[hidden]> wrote:
>> If I need MPI_THREAD_MULTIPLE, and openmpi is compiled with thread support,
>> it's not clear to me whether MPI::Init_Thread() and
>> MPI::Init_Thread(MPI::THREAD_MULTIPLE) would give me the same behavior from
>> Open MPI.
> If you need thread support, you will need MPI::Init_Thread and it needs one
> argument (or three).
Sorry, typo on my side. I meant to compare
MPI::Init_thread(MPI::THREAD_MULTIPLE) and MPI::Init(). I think that your
first reply mentioned replacing MPI::Init_thread by MPI::Init.
> I suggest using a stable version, 1.4.4, for your experiments. As you said you
> are new to MPI, you could get misled between wrong error messages and bugs and
> error messages due to a programming error on your side.
OK. I'll certainly set it up so that I can validate what's supposed to
work. I'll have to check with our main MPI developers to see whether
there's anything in 1.5.x that they need.
>> 1. I'm still surprised that the SGE behavior is so different when I
>> configure my SGE queue differently. See test "a" in the .tgz. When I just
>> run mpitest in mpi.sh and ask for exactly 5 slots (-pe orte 5-5), it works
>> if the queue is configured to use a single host. I see 1 MASTER and 4
>> SLAVES in qstat -g t, and I get the correct output.
> Fine. ("job_is_first_task true" in the PE according to this.)
Yes, I believe that job_is_first_task will need to be true for our
>> If the queue is set to
>> use multiple hosts, the jobs hang in spawn/init, and I get errors
>> _complete_connect] connect() to 192.168.122.1 failed: Connection refused
> What is the setting in SGE for:
> $ qconf -sconf
> qlogin_command builtin
> qlogin_daemon builtin
> rlogin_command builtin
> rlogin_daemon builtin
> rsh_command builtin
> rsh_daemon builtin
> If it's set to use ssh,
Nope. My output is the same as yours.
> But I wonder, why it's working for some nodes?
I don't think that it's working on some nodes. In my other cases where it
hangs, I don't always get those "connection refused" errors.
I'm not sure, but the "connection refused" errors might be a red herring.
The machines' primary NICs are on a different private network (172.28.*.*).
The 192.168.122.1 address is actually the machine's own virbr0 device, which
the documentation says is a "xen interface used by Virtualization guest and
host oses for network communication."
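To make the "different network" point concrete, here is a minimal sketch of the check I have in mind. It is not part of the test .tgz; parse_ipv4 and in_subnet are hypothetical helper names, and I am assuming that 172.28.*.* corresponds to the prefix 172.28.0.0/16.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Parse dotted-quad IPv4 text into a 32-bit value; returns false if malformed.
bool parse_ipv4(const std::string& text, std::uint32_t& out) {
    std::istringstream in(text);
    std::uint32_t value = 0;
    for (int i = 0; i < 4; ++i) {
        int octet = -1;
        char dot = 0;
        if (!(in >> octet) || octet < 0 || octet > 255) return false;
        if (i < 3 && (!(in >> dot) || dot != '.')) return false;
        value = (value << 8) | static_cast<std::uint32_t>(octet);
    }
    // Reject trailing characters such as "1.2.3.4.5".
    char extra = 0;
    if (in >> extra) return false;
    out = value;
    return true;
}

// True if addr lies inside network/prefix_len (assumed here: 172.28.0.0/16).
bool in_subnet(const std::string& addr, const std::string& network, int prefix_len) {
    std::uint32_t a = 0, n = 0;
    if (prefix_len < 0 || prefix_len > 32) return false;
    if (!parse_ipv4(addr, a) || !parse_ipv4(network, n)) return false;
    // A /0 mask is special-cased: shifting a 32-bit value by 32 is undefined.
    const std::uint32_t mask =
        prefix_len == 0 ? 0u : ~std::uint32_t{0} << (32 - prefix_len);
    return (a & mask) == (n & mask);
}
```

With these helpers, in_subnet("192.168.122.1", "172.28.0.0", 16) is false, which matches my reading that the refused connection targets an address outside the network the primary NICs use.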
> Are there custom configurations per node, and some are faulty:
I did a qconf -sconf machine for each host in my grid. I get identical
output like this for each machine.
$ qconf -sconf grid-03
So, I think that the SGE config is the same across those machines.
>> 2. I guess I'm not sure how SGE is supposed to behave. Experiment "a" and
>> "b" were identical except that I changed -pe orte 5-5 to -pe orte 5-. The
>> single case works like before, and the multiple exec host case fails as
>> before. The difference is that qstat -g t shows additional SLAVEs that
>> don't seem to correspond to any jobs on the exec hosts. Are these SLAVEs
>> just slots that are reserved for my job but that I'm not using? If my job
>> will only use 5 slots, then I should set the SGE qsub job to ask for exactly
>> 5 with "-pe orte 5-5", right?
> Correct. The remaining ones are just unused. You could adjust your application
> of course to check how many slots were granted, and start slaves according to
> the information you got to use all granted slots.
OK. That makes sense. In our intended uses, I believe that we'll know
exactly how many slots the application will need, and it will use the same
number of slots throughout the entire job.
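For reference, if we ever did want to check how many slots were granted, a minimal sketch might look like the following. It assumes that SGE exports the path of the granted-host list in $PE_HOSTFILE and that each line starts with a host name followed by a slot count; parse_pe_hostfile, total_slots and granted_slots_from_env are hypothetical names, not anything from Open MPI or SGE.

```cpp
#include <cstdlib>
#include <fstream>
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// One (host, slots) pair per line of the PE hostfile.
using HostSlots = std::pair<std::string, int>;

// Parse lines assumed to look like "<host> <slots> <queue> <processor range>".
// Lines that do not start with a host name and an integer are skipped.
std::vector<HostSlots> parse_pe_hostfile(std::istream& in) {
    std::vector<HostSlots> hosts;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string host;
        int slots = 0;
        if (fields >> host >> slots) {
            hosts.emplace_back(host, slots);
        }
    }
    return hosts;
}

// Total number of slots granted across all hosts.
int total_slots(const std::vector<HostSlots>& hosts) {
    int total = 0;
    for (const auto& h : hosts) total += h.second;
    return total;
}

// Read the file named by $PE_HOSTFILE; returns an empty list if it is unset
// or unreadable (for example, when run outside SGE).
std::vector<HostSlots> granted_slots_from_env() {
    const char* path = std::getenv("PE_HOSTFILE");
    if (path == nullptr) return {};
    std::ifstream in(path);
    return parse_pe_hostfile(in);
}
```

The application could then compare total_slots(granted_slots_from_env()) against the number of processes it intends to spawn before it starts any slaves.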
>> 3. Experiment "d" was similar to "b", but mpi.sh uses "mpiexec -np 1
>> mpitest" instead of running mpitest directly. Now both the single machine
>> queue and multiple machine queue work. So, mpiexec seems to make my
>> multi-machine configuration happier. In this case, I'm still using "-pe
>> orte 5-", and I'm still seeing the extra SLAVE slots granted in qstat -g t.
> Then case a) could show a bug in 1.5.4. For me both were working, but the
OK. That helps to explain my confusion. Our previous experiments (where I
was told that case (a) was working) were with Open MPI 1.4.x. Should I open
a bug for this issue?
> allocation was different. The correct allocation I only got with "mpiexec -np
> 1". In your case 4 were routed to one remote machine: the machine where the
> jobscript runs is usually the first entry in the machinefile, but on grid-03
> you got only one slot by accident, and so the 4 additional ones were routed to
> the next machine it found in the machinefile.
FYI, I think that this particular allocation was a mis-configuration on my
side. It looks like SGE thinks that grid-03 only has 1 slot available.
>> 4. Based on "d", I thought that I could follow the approach in "a". That
>> is, for experiment "e", I used mpiexec -np 1, but I also used -pe orte 5-5.
>> I thought that this would make the multi-machine queue reserve only the 5
>> slots that I needed. The single machine queue works correctly, but now the
>> multi-machine case hangs with no errors. The output from qstat and pstree
>> are what I'd expect, but it seems to hang in Spawn_multiple and Init_thread.
>> I really expected this to work.
> Yes, this should work across multiple machines. And it's using `qrsh -inherit
> ...` so it's failing somewhere in Open MPI - is it working with 1.4.4?
I'm not sure. We no longer have our 1.4 test environment, so I'm in the
process of building that now. I'll let you know once I have a chance to run