Hi.

I am having trouble understanding the logic of ompi_mpi_init(). Could you please tell me if you have any guesses about the following behavior?

If I understand correctly, ompi_mpi_init() contains a block that exchanges proc information between processes (call this block 'modex'):
    coll = OBJ_NEW(orte_grpcomm_collective_t);
    coll->id = orte_process_info.peer_modex;
    if (ORTE_SUCCESS != (ret = orte_grpcomm.modex(coll))) {
        error = "orte_grpcomm_modex failed";
        goto error;
    }
    /* wait for modex to complete - this may be moved anywhere in mpi_init
     * so long as it occurs prior to calling a function that needs
     * the modex info!
     */
    while (coll->active) {
        opal_progress();  /* block in progress pending events */
    }
    OBJ_RELEASE(coll);
and several instructions later there is a block that synchronizes the processes (call this block 'barrier'):
    coll = OBJ_NEW(orte_grpcomm_collective_t);
    coll->id = orte_process_info.peer_init_barrier;
    if (ORTE_SUCCESS != (ret = orte_grpcomm.barrier(coll))) {
        error = "orte_grpcomm_barrier failed";
        goto error;
    }
    /* wait for barrier to complete */
    while (coll->active) {
        opal_progress();  /* block in progress pending events */
    }
    OBJ_RELEASE(coll);
So, initially ompi_mpi_init() has the following structure:
...
'modex' block;
...
'barrier' block;
...
I ran several experiments with this code, and the following one is of interest: if I add a sequence of two additional blocks, 'barrier' and 'modex', right after the 'modex' block, then ompi_mpi_init() hangs in opal_progress() of the last 'modex' block:
...
'modex' block;
'barrier' block;
'modex' block; <- hangs
...
'barrier' block;
...
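For reference, the wait loop shared by both blocks can be modeled in isolation. The following is only a toy sketch: toy_coll_t, toy_start(), toy_progress() and toy_run_block() are hypothetical stand-ins, not ORTE/OPAL functions, and the max_spins bound is an added assumption so that a collective which never completes shows up as a -1 return instead of an endless spin.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for orte_grpcomm_collective_t: only the fields
 * the wait loop cares about, plus a toy completion counter. */
typedef struct {
    int  id;           /* collective id (toy value, not an ORTE id) */
    bool active;       /* true until the collective completes */
    int  events_left;  /* toy: progress calls needed before completion */
} toy_coll_t;

/* The single collective currently in flight in this toy model. */
static toy_coll_t *toy_pending = NULL;

/* Hypothetical stand-in for orte_grpcomm.modex()/barrier():
 * marks the collective active and registers it as pending. */
int toy_start(toy_coll_t *coll, int id, int events)
{
    assert(coll != NULL);
    coll->id = id;
    coll->active = true;
    coll->events_left = events;
    toy_pending = coll;
    return 0;
}

/* Hypothetical stand-in for opal_progress(): handles one event and
 * clears the active flag once the pending collective has seen enough. */
void toy_progress(void)
{
    if (toy_pending != NULL && --toy_pending->events_left <= 0) {
        toy_pending->active = false;
        toy_pending = NULL;
    }
}

/* One 'modex' or 'barrier' block: start the collective, then spin in
 * progress until it completes. Returns the number of progress calls,
 * or -1 if the collective is still active after max_spins calls. */
int toy_run_block(int id, int events, int max_spins)
{
    toy_coll_t coll;
    int spins = 0;

    if (0 != toy_start(&coll, id, events)) {
        return -1;
    }
    while (coll.active) {
        if (spins >= max_spins) {
            toy_pending = NULL;  /* give up: report the "hang" */
            return -1;
        }
        toy_progress();
        spins++;
    }
    return spins;
}
```

With these definitions, toy_run_block(1, 3, 100) returns 3, while toy_run_block(2, 1000, 10) returns -1, which is how the hang described above would look in this model.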
Thanks,
Victor Kocheganov.