Open MPI User's Mailing List Archives

This page is part of a frozen web archive of this mailing list.

You can still navigate around this archive, but know that no new mails have been added to it since July of 2016.

Subject: Re: [OMPI users] Troubles with MPI-IO Test and Torque/PVFS
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2008-04-13 08:27:11


It looks like you're segfaulting when calling some flavor of printf
(perhaps vsnprintf?) in the make_error_messages() function.

You might want to double check the read_write_file() function to see
exactly what kind of error it is encountering such that it is calling
report_errs().
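
For illustration, here is a minimal sketch of one common way a
vsnprintf-based error helper ends up faulting: a format specifier that
does not match its argument (for example, a %s that consumes an integer
and treats it as a pointer). The mpiIO_test source is not shown in this
thread, so this is only an assumption about the kind of bug to look for,
not a diagnosis. The names format_error(), safe_str() and
describe_bad_bytes() are hypothetical and do not come from mpiIO_test.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, NOT the real mpiIO_test code. On GCC and Clang the
 * format attribute lets the compiler check every argument against fmt, so
 * a "%s" paired with an integer is reported when building with -Wall. */
#if defined(__GNUC__)
__attribute__((format(printf, 3, 4)))
#endif
int format_error(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int n;

    assert(buf != NULL && size > 0);
    va_start(ap, fmt);
    n = vsnprintf(buf, size, fmt, ap);  /* never writes more than size bytes */
    va_end(ap);
    return n;  /* length the full message would have had */
}

/* Hypothetical helper: passing NULL to "%s" is undefined behavior in
 * standard C, so substitute a printable placeholder explicitly. */
const char *safe_str(const char *s)
{
    return s != NULL ? s : "(null)";
}

/* Hypothetical message builder modeled on the wording in the log below;
 * every specifier matches the type of its argument. */
int describe_bad_bytes(char *buf, size_t size, long long bad_bytes,
                       long long offset, const char *expected,
                       const char *received)
{
    return format_error(buf, size,
                        "%lld bad bytes at file offset %lld. "
                        "Expected %s, received %s",
                        bad_bytes, offset,
                        safe_str(expected), safe_str(received));
}
```

Calling describe_bad_bytes() with a 128-byte buffer and a NULL "expected"
pointer produces a complete message instead of a crash. If the real
make_error_messages() is a variadic wrapper like this, rebuilding with
-Wall and a similar format attribute may point straight at the bad call.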

On Apr 10, 2008, at 3:29 PM, Davi Vercillo C. Garcia wrote:
> Hi all,
>
> I have a cluster with Torque and PVFS. I'm trying to test my
> environment with MPI-IO Test, but some segfaults are occurring.
> Does anyone know what is happening? The error output is below:
>
> Rank 1 Host campogrande03.dcc.ufrj.br WARNING ERROR 1207853304: 1 bad
> bytes at file offset 0. Expected (null), received (null)
> Rank 2 Host campogrande02.dcc.ufrj.br WARNING ERROR 1207853304: 1 bad
> bytes at file offset 0. Expected (null), received (null)
> [campogrande01:10646] *** Process received signal ***
> Rank 0 Host campogrande04.dcc.ufrj.br WARNING ERROR 1207853304: 1 bad
> bytes at file offset 0. Expected (null), received (null)
> Rank 0 Host campogrande04.dcc.ufrj.br WARNING ERROR 1207853304: 65537
> bad bytes at file offset 0. Expected (null), received (null)
> [campogrande04:05192] *** Process received signal ***
> [campogrande04:05192] Signal: Segmentation fault (11)
> [campogrande04:05192] Signal code: Address not mapped (1)
> [campogrande04:05192] Failing at address: 0x10000
> Rank 1 Host campogrande03.dcc.ufrj.br WARNING ERROR 1207853304: 65537
> bad bytes at file offset 0. Expected (null), received (null)
> [campogrande03:05377] *** Process received signal ***
> [campogrande03:05377] Signal: Segmentation fault (11)
> [campogrande03:05377] Signal code: Address not mapped (1)
> [campogrande03:05377] Failing at address: 0x10000
> [campogrande03:05377] [ 0] [0xffffe440]
> [campogrande03:05377] [ 1]
> /lib/tls/i686/cmov/libc.so.6(vsnprintf+0xb4) [0xb7d5fef4]
> [campogrande03:05377] [ 2] mpiIO_test(make_error_messages+0xcf)
> [0x80502e4]
> [campogrande03:05377] [ 3] mpiIO_test(warning_msg+0x8c) [0x8050569]
> [campogrande03:05377] [ 4] mpiIO_test(report_errs+0xe2) [0x804d413]
> [campogrande03:05377] [ 5] mpiIO_test(read_write_file+0x594)
> [0x804d9c2]
> [campogrande03:05377] [ 6] mpiIO_test(main+0x1d0) [0x804aa14]
> [campogrande03:05377] [ 7]
> /lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe0) [0xb7d15050]
> [campogrande03:05377] [ 8] mpiIO_test [0x804a7e1]
> [campogrande03:05377] *** End of error message ***
> Rank 2 Host campogrande02.dcc.ufrj.br WARNING ERROR 1207853304: 65537
> bad bytes at file offset 0. Expected (null), received (null)
> [campogrande02:05187] *** Process received signal ***
> [campogrande02:05187] Signal: Segmentation fault (11)
> [campogrande02:05187] Signal code: Address not mapped (1)
> [campogrande02:05187] Failing at address: 0x10000
> [campogrande01:10646] Signal: Segmentation fault (11)
> [campogrande01:10646] Signal code: Address not mapped (1)
> [campogrande01:10646] Failing at address: 0x1a0000
> [campogrande02:05187] [ 0] [0xffffe440]
> [campogrande02:05187] [ 1]
> /lib/tls/i686/cmov/libc.so.6(vsnprintf+0xb4) [0xb7d5fef4]
> [campogrande02:05187] [ 2] mpiIO_test(make_error_messages+0xcf)
> [0x80502e4]
> [campogrande02:05187] [ 3] mpiIO_test(warning_msg+0x8c) [0x8050569]
> [campogrande02:05187] [ 4] mpiIO_test(report_errs+0xe2) [0x804d413]
> [campogrande02:05187] [ 5] mpiIO_test(read_write_file+0x594)
> [0x804d9c2]
> [campogrande02:05187] [ 6] mpiIO_test(main+0x1d0) [0x804aa14]
> [campogrande02:05187] [ 7]
> /lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe0) [0xb7d15050]
> [campogrande02:05187] [ 8] mpiIO_test [0x804a7e1]
> [campogrande02:05187] *** End of error message ***
> [campogrande04:05192] [ 0] [0xffffe440]
> [campogrande04:05192] [ 1]
> /lib/tls/i686/cmov/libc.so.6(vsnprintf+0xb4) [0xb7d5fef4]
> [campogrande04:05192] [ 2] mpiIO_test(make_error_messages+0xcf)
> [0x80502e4]
> [campogrande04:05192] [ 3] mpiIO_test(warning_msg+0x8c) [0x8050569]
> [campogrande04:05192] [ 4] mpiIO_test(report_errs+0xe2) [0x804d413]
> [campogrande04:05192] [ 5] mpiIO_test(read_write_file+0x594)
> [0x804d9c2]
> [campogrande04:05192] [ 6] mpiIO_test(main+0x1d0) [0x804aa14]
> [campogrande04:05192] [ 7]
> /lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe0) [0xb7d15050]
> [campogrande04:05192] [ 8] mpiIO_test [0x804a7e1]
> [campogrande04:05192] *** End of error message ***
> [campogrande01:10646] [ 0] [0xffffe440]
> [campogrande01:10646] [ 1]
> /lib/tls/i686/cmov/libc.so.6(vsnprintf+0xb4) [0xb7d5fef4]
> [campogrande01:10646] [ 2] mpiIO_test(make_error_messages+0xcf)
> [0x80502e4]
> [campogrande01:10646] [ 3] mpiIO_test(warning_msg+0x8c) [0x8050569]
> [campogrande01:10646] [ 4] mpiIO_test(report_errs+0xe2) [0x804d413]
> [campogrande01:10646] [ 5] mpiIO_test(read_write_file+0x594)
> [0x804d9c2]
> [campogrande01:10646] [ 6] mpiIO_test(main+0x1d0) [0x804aa14]
> [campogrande01:10646] [ 7]
> /lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe0) [0xb7d15050]
> [campogrande01:10646] [ 8] mpiIO_test [0x804a7e1]
> [campogrande01:10646] *** End of error message ***
> mpiexec noticed that job rank 0 with PID 5192 on node campogrande04
> exited on signal 11 (Segmentation fault).
>
> --
> Davi Vercillo Carneiro Garcia
>
> Universidade Federal do Rio de Janeiro
> Departamento de Ciência da Computação
> DCC-IM/UFRJ - http://www.dcc.ufrj.br
>
> "Good things come to those who... wait." - Debian Project
>
> "A computer is like air conditioning: it becomes useless when you open
> windows." - Linus Torvalds
>
> "There are two infinite things, the universe and human stupidity. And I
> am not sure about the former." - Albert Einstein
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users

-- 
Jeff Squyres
Cisco Systems