Hi all,
I think I found a bug and a fix for it.
Could someone verify the rationale behind this bug? I get this
SIGSEGV on only one of two machines, and I don't quite see why it doesn't
always occur (same test program, identically compiled OpenMPI 1.2.4).
The fix does prevent the segmentation fault, though. :)
Thanks,
Murat
Where:
Bug: free() crashes when trying to free stack memory
    ompi/communicator/comm_dyn.c:630
        OBJ_RELEASE(apps[i]);
SIGSEGV:
    orte/mca/rmgr/rmgr_types.h:113
        free (app_context->cwd);
There are two ways that apps[i]->cwd is filled:
1. dynamically allocated memory
    548    if ( !have_wdir ) {
               getcwd(cwd, OMPI_PATH_MAX);
               apps[i]->cwd = strdup(cwd); // <--
           }
2. stack memory
    354    char cwd[OMPI_PATH_MAX];
           // ...
    516    /* check for 'wdir' */
           ompi_info_get (array_of_info[i], "wdir", valuelen, cwd, &flag);
           if ( flag ) {
               apps[i]->cwd = cwd; // <--
               have_wdir = 1;
           }
Fix: Always allocate cwd on the heap and make sure it is freed afterwards.
1.
< char cwd[OMPI_PATH_MAX];
---
> char *cwd = (char*)malloc(OMPI_PATH_MAX);
2. And on cleanup (somewhere below line 624)
> if ( !have_wdir ) {
> getcwd(cwd, OMPI_PATH_MAX);
> apps[i]->cwd = strdup(cwd);
> }
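
For comparison, here is a rough sketch of another way to get the same effect. It is not the patch above and has not been tested against the OpenMPI tree. Since the destructor in rmgr_types.h always calls free() on cwd, the idea is to keep the stack buffer and copy the string on both paths, so the app context always owns heap memory. The names app_ctx_t, dup_str, set_cwd, release_ctx and fill_cwd are hypothetical stand-ins, not the real ORTE types or functions.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the ORTE app context; only cwd matters here. */
typedef struct {
    char *cwd;   /* always heap-allocated and owned by the context */
} app_ctx_t;

/* Heap copy of a string; does the same job as POSIX strdup(),
 * written out so the sketch only needs standard C. */
static char *dup_str(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL) {
        memcpy(copy, s, n);
    }
    return copy;
}

/* Store a private copy of dir, so the context never points into a
 * caller's stack buffer. Returns 0 on success, -1 on allocation failure. */
static int set_cwd(app_ctx_t *app, const char *dir)
{
    char *copy = dup_str(dir);
    if (copy == NULL) {
        return -1;
    }
    free(app->cwd);   /* free(NULL) is a no-op */
    app->cwd = copy;
    return 0;
}

/* Mirrors what the destructor does: an unconditional free() of cwd. */
static void release_ctx(app_ctx_t *app)
{
    free(app->cwd);
    app->cwd = NULL;
}

/* Both paths (explicit "wdir" value or fallback directory) go through
 * set_cwd(), so release_ctx() is safe in either case. */
static int fill_cwd(app_ctx_t *app, const char *wdir, const char *fallback)
{
    char buf[256];   /* stack buffer, as in the original code */
    const char *src = (wdir != NULL) ? wdir : fallback;
    snprintf(buf, sizeof buf, "%s", src);
    return set_cwd(app, buf);
}
```

With this shape the stack buffer is only ever a scratch area, and nothing has to remember which of the two branches filled apps[i]->cwd.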