Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Solaris sigbus error in ompi_info
From: Siegmar Gross (Siegmar.Gross_at_[hidden])
Date: 2014-01-02 05:44:49


Hi,

> I have no ideas here as there isn't enough info - can you dig
> deeper to tell us where the sigbus happens?
>
> Meantime, I filed a ticket against it and we can capture your
> response there.
>
> https://svn.open-mpi.org/trac/ompi/ticket/4042

Yes. Jeff has an account on my machine, so he can look at the
problem as well. He can even use our system with little- and
big-endian machines, if he has time and wants Open MPI to work
in a truly heterogeneous environment with different
architectures and not only different operating systems (LAM-MPI
can use all of our machines in a single job).

tyr bin 49 /opt/solstudio12.3/bin/sparcv9/dbx ompi_info
For information about new features see `help changes'
To remove this message, put `dbxenv suppress_startup_message 7.9' in your .dbxrc
Reading ompi_info
Reading ld.so.1
Reading libmpi.so.0.0.0
...
(dbx) run -a
Running: ompi_info -a
(process id 12251)
Reading libc_psr.so.1
Reading mca_compress_bzip.so
...
            MCA compress: parameter "compress_base_verbose" (current value:
                          "-1", data source: default, level: 8 dev/detail,
                          type: int)
                          Verbosity level for the compress framework (0 = no
                          verbosity)
t@1 (l@1) signal BUS (invalid address alignment) in var_value_string at line 1685 in file "mca_base_var.c"
 1685 ret = asprintf (value_string, var_type_formats[var->mbv_type], value[0]);
(dbx)
(dbx)
(dbx) check -all
dbx: warning: check -all will be turned on in the next run of the process
access checking - OFF
memuse checking - OFF
(dbx) run -a
Running: ompi_info -a
(process id 12253)
Reading rtcapihook.so
...
RTC: Running program...
Write to unallocated (wua) on thread 1:
Attempting to write 1 byte at address 0xffffffff79f04000
t@1 (l@1) stopped in _readdir at 0xffffffff53d74e40
0xffffffff53d74e40: _readdir+0x0064: call _PROCEDURE_LINKAGE_TABLE_+0x23a0 [PLT] ! 0xffffffff53f42aa0
Current function is find_dyn_components
  393 if (0 != lt_dlforeachfile(dir, save_filename, NULL)) {
(dbx)
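
For readers who do not work on SPARC: "invalid address alignment" means
that a multi-byte value was loaded from or stored to an address that is
not a multiple of the alignment its type requires. SPARC raises SIGBUS
in that case, while x86 usually tolerates it. The following C sketch
only illustrates that general mechanism and one common way around it.
It is not taken from mca_base_var.c and makes no claim about the actual
cause at line 1685; the helper names is_aligned and load_int are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if p is a multiple of `alignment`
 * (which must be a power of two), otherwise 0. */
static int is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)p & (alignment - 1)) == 0;
}

/* Hypothetical helper: reads an int from an arbitrary address.
 * Dereferencing a misaligned `int *` directly can raise SIGBUS on
 * SPARC; copying the bytes into an aligned local avoids that. */
static int load_int(const void *p)
{
    int v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

A caller that cannot guarantee alignment could use load_int(ptr) instead
of *(int *)ptr, or check is_aligned(ptr, sizeof(int)) first.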

Kind regards

Siegmar

> On Jan 1, 2014, at 1:48 AM, Siegmar Gross <Siegmar.Gross_at_[hidden]> wrote:
>
> > Unfortunately I still get a "SIGBUS Error" on "Solaris Sparc"
> > for "ompi_info -a".
> >
> > tyr openmpi-1.9 99 ompi_info | grep MPI:
> > Open MPI: 1.9a1r30100
> > tyr openmpi-1.9 100 ompi_info -a |& grep Signal
> > [tyr:09699] Signal: Bus Error (10)
> > [tyr:09699] Signal code: Invalid address alignment (1)
> > .../openmpi-1.9_64_cc/lib64/libopen-pal.so.0.0.0:0x1321b8
> > [ Signal 2099900312 (?)]
> > Bus error
> > tyr openmpi-1.9 101
> >
> >
> > I can compile and run a small MPI program without "SIGBUS Error".
> > Jeff, thank you very much for solving this problem.
> >
> > tyr small_prog 110 mpicc init_finalize.c
> > tyr small_prog 111 mpiexec -np 1 a.out
> > Hello!
> > tyr small_prog 112
> >
>