Open MPI Development Mailing List Archives

From: sadfub_at_[hidden]
Date: 2007-06-22 05:55:22


Markus Daene wrote:
> Hi.
>
> I don't think it is necessary to specify the hosts via a hostfile when using
> SGE with OpenMPI; even $NSLOTS is not necessary. Just run
> "mpirun executable"; this works very well.

This produces the same error, but thanks for your suggestion. (Out of
interest: how does Open MPI then determine how many slots it may use?)

> Regarding your memory problem:
> I had similar problems when I specified the h_vmem option in SGE.
> Without SGE everything works, but starting with SGE gives such memory errors.
> You can easily check this with 'qconf -sc'. If you have used this option, try
> without it. The problem in my case was that OpenMPI sometimes allocates a lot
> of memory, so the job gets killed immediately by SGE and one gets such error
> messages; see my posting from some days ago. I am not sure if this helps in
> your case, but it could be an explanation.

Hmm, it seems that I'm not using such an option (for my queue the h_vmem
and s_vmem values are set to infinity). Here is the output of the qconf
-sc command. (Sorry for posting SGE-related stuff on this mailing list):
[~]# qconf -sc
#name               shortcut   type      relop requestable consumable default  urgency
#----------------------------------------------------------------------------------------
arch                a          RESTRING  ==    YES         NO         NONE     0
calendar            c          RESTRING  ==    YES         NO         NONE     0
cpu                 cpu        DOUBLE    >=    YES         NO         0        0
h_core              h_core     MEMORY    <=    YES         NO         0        0
h_cpu               h_cpu      TIME      <=    YES         NO         0:0:0    0
h_data              h_data     MEMORY    <=    YES         NO         0        0
h_fsize             h_fsize    MEMORY    <=    YES         NO         0        0
h_rss               h_rss      MEMORY    <=    YES         NO         0        0
h_rt                h_rt       TIME      <=    YES         NO         0:0:0    0
h_stack             h_stack    MEMORY    <=    YES         NO         0        0
h_vmem              h_vmem     MEMORY    <=    YES         NO         0        0
hostname            h          HOST      ==    YES         NO         NONE     0
load_avg            la         DOUBLE    >=    NO          NO         0        0
load_long           ll         DOUBLE    >=    NO          NO         0        0
load_medium         lm         DOUBLE    >=    NO          NO         0        0
load_short          ls         DOUBLE    >=    NO          NO         0        0
mem_free            mf         MEMORY    <=    YES         NO         0        0
mem_total           mt         MEMORY    <=    YES         NO         0        0
mem_used            mu         MEMORY    >=    YES         NO         0        0
min_cpu_interval    mci        TIME      <=    NO          NO         0:0:0    0
np_load_avg         nla        DOUBLE    >=    NO          NO         0        0
np_load_long        nll        DOUBLE    >=    NO          NO         0        0
np_load_medium      nlm        DOUBLE    >=    NO          NO         0        0
np_load_short       nls        DOUBLE    >=    NO          NO         0        0
num_proc            p          INT       ==    YES         NO         0        0
qname               q          RESTRING  ==    YES         NO         NONE     0
rerun               re         BOOL      ==    NO          NO         0        0
s_core              s_core     MEMORY    <=    YES         NO         0        0
s_cpu               s_cpu      TIME      <=    YES         NO         0:0:0    0
s_data              s_data     MEMORY    <=    YES         NO         0        0
s_fsize             s_fsize    MEMORY    <=    YES         NO         0        0
s_rss               s_rss      MEMORY    <=    YES         NO         0        0
s_rt                s_rt       TIME      <=    YES         NO         0:0:0    0
s_stack             s_stack    MEMORY    <=    YES         NO         0        0
s_vmem              s_vmem     MEMORY    <=    YES         NO         0        0
seq_no              seq        INT       ==    NO          NO         0        0
slots               s          INT       <=    YES         YES        1        1000
swap_free           sf         MEMORY    <=    YES         NO         0        0
swap_rate           sr         MEMORY    >=    YES         NO         0        0
swap_rsvd           srsv       MEMORY    >=    YES         NO         0        0
swap_total          st         MEMORY    <=    YES         NO         0        0
swap_used           su         MEMORY    >=    YES         NO         0        0
tmpdir              tmp        RESTRING  ==    NO          NO         NONE     0
virtual_free        vf         MEMORY    <=    YES         NO         0        0
virtual_total       vt         MEMORY    <=    YES         NO         0        0
virtual_used        vu         MEMORY    >=    YES         NO         0        0
# >#< starts a comment but comments are not saved across edits --------
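
For anyone who wants to check such a listing without reading it by eye, here is a minimal Python sketch. It assumes the qconf -sc output (without the shell prompt line) has been saved as text, and that every complex has exactly the eight columns named in the header. The function names parse_complexes and limiting_memory_complexes are made up for illustration, and the rule "consumable, or a non-zero default" is only my guess at what would make a memory complex apply to a job that did not request it explicitly.

```python
from typing import Dict, List

# Column names taken from the header line of the qconf -sc output.
FIELDS = ["name", "shortcut", "type", "relop",
          "requestable", "consumable", "default", "urgency"]


def parse_complexes(text: str) -> List[Dict[str, str]]:
    """Parse qconf -sc output into one dict per complex.

    Tokens are collected across lines and grouped in eights, so entries
    that a mail client wrapped onto two lines are still read correctly.
    """
    tokens: List[str] = []
    for line in text.splitlines():
        parts = line.split()
        if not parts or parts[0].startswith("#"):
            continue  # blank line or comment/header line
        if parts == ["default", "urgency"]:
            continue  # remnant of a wrapped header line
        tokens.extend(parts)
    n = len(FIELDS)
    return [dict(zip(FIELDS, tokens[i:i + n]))
            for i in range(0, len(tokens) - n + 1, n)]


def limiting_memory_complexes(entries: List[Dict[str, str]]) -> List[str]:
    """Return names of MEMORY complexes that are consumable or carry a
    non-zero default (an assumed heuristic, not an SGE rule)."""
    return [e["name"] for e in entries
            if e["type"] == "MEMORY"
            and (e["consumable"] == "YES" or e["default"] != "0")]


# Two entries copied from the listing above, in the wrapped form.
SAMPLE = """\
#name shortcut type relop requestable consumable
default urgency
h_vmem h_vmem MEMORY <= YES NO
0 0
slots s INT <= YES YES
1 1000
"""

if __name__ == "__main__":
    print(limiting_memory_complexes(parse_complexes(SAMPLE)))
```

Note that this only looks at the complex definitions. Limits set per queue or requested at submission time are not visible in qconf -sc and would have to be checked separately.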

Thanks for your help.