
Open MPI User's Mailing List Archives


From: Marcelo Maia Garcia (marcelomgarcia_at_[hidden])
Date: 2007-01-15 12:13:18


Hi

  I am trying to set up SGE to run DLPOLY compiled with mpif90 (Open MPI 1.2b2,
PathScale Fortran compiler and gcc for C/C++). In general I have much better
luck running DLPOLY interactively than through SGE. The error that I get
is: Signal:7 info.si_errno:0(Success) si_code:2()[1]. A previous message on
the list[2] says that this is more likely to be a configuration problem.
But what kind of configuration? Is it something at run time?

  Another error that I sometimes get is related to "writev"[3], but this is
pretty rare.

  My network is Gigabit Ethernet, the OS is Red Hat EL 4, and the nodes are
dual-core Opteron 275 (4 cores). I did not use any option (except for prefix)
during the configuration process.

  Any suggestions?

  Thanks

Marcelo Garcia

[1]
[ocf_at_master TEST2]$ mpirun -np 16 --hostfile
/home/ocf/SRIFBENCH/DLPOLY3/data/nodes_16_slots4.txt
/home/ocf/SRIFBENCH/DLPOLY3/execute/DLPOLY.Y
Signal:7 info.si_errno:0(Success) si_code:2()
Failing at addr:0x5107b0
(...)

[2] http://www.open-mpi.org/community/lists/users/2007/01/2423.php

[3]
[node007:05003] mca_btl_tcp_frag_send: writev failed with errno=104
[node007:05004] mca_btl_tcp_frag_send: writev failed with errno=104
[node006:05170] mca_btl_tcp_frag_send: writev failed with errno=104
[node007:05005] mca_btl_tcp_frag_send: writev failed with errno=104
[node007:05006] mca_btl_tcp_frag_send: writev failed with errno=104
[node006:05170] mca_btl_tcp_frag_send: writev failed with errno=104
[node006:05171] mca_btl_tcp_frag_send: writev failed with errno=104
[node006:05171] mca_btl_tcp_frag_send: writev failed with errno=104
[node006:05172] mca_btl_tcp_frag_send: writev failed with errno=104
[node006:05172] mca_btl_tcp_frag_send: writev failed with errno=104
[node006:05173] mca_btl_tcp_frag_send: writev failed with errno=104
[node006:05173] mca_btl_tcp_frag_send: writev failed with errno=104
mpirun noticed that job rank 0 with PID 0 on node node003 exited on signal 48.
15 additional processes aborted (not shown)
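
For reference, the numeric codes in the output above can be looked up with a short script. This is only a minimal sketch, assuming it is run on one of the Linux nodes, because signal and errno numbering is platform-specific. On Linux x86-64, signal 7 is SIGBUS and errno 104 is ECONNRESET ("Connection reset by peer"). The helper names `describe_signal` and `describe_errno` are made up for illustration and are not part of Open MPI.

```python
import errno
import os
import signal


def describe_signal(signum: int) -> str:
    """Return the symbolic name of a signal number on this platform."""
    try:
        return signal.Signals(signum).name
    except ValueError:
        # Numbers without a named constant (for example most real-time
        # signals on Linux) are not in the enum, so fall back to the number.
        return f"unknown signal {signum}"


def describe_errno(code: int) -> str:
    """Return the symbolic name and message of an errno value on this platform."""
    name = errno.errorcode.get(code, "unknown")
    return f"{name}: {os.strerror(code)}"


if __name__ == "__main__":
    # Codes taken from the mpirun output quoted in [1] and [3].
    print("signal 7  ->", describe_signal(7))
    print("errno 104 ->", describe_errno(104))
    print("signal 48 ->", describe_signal(48))
```

Knowing the symbolic names does not identify the cause by itself, but it makes the messages easier to search for in the list archives.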