Open MPI Development Mailing List Archives

From: Tim Prins (tprins_at_[hidden])
Date: 2007-08-16 08:49:53


Sorry, I pushed the wrong button and sent this before it was ready....

Tim Prins wrote:
> Hi folks,
>
> I am running into a problem with the ibm test 'group'. I will try to
> explain what I think is going on, but I do not really understand the
> group code so please forgive me if it is wrong...
>
> The test creates a group based on MPI_COMM_WORLD (group1), and a group
> that has half the procs in group1 (newgroup). Next, all the processes do:
>
> MPI_Group_intersection(newgroup,group1,&group2)
>
> ompi_group_intersection figures out what procs are needed for group2,
> then calls
>
> ompi_group_incl, passing 'newgroup' and '&group2'
>
> This then calls (since I am not using sparse groups) ompi_group_incl_plist
>
> However, ompi_group_incl_plist assumes that the current process is a member
> of the passed group ('newgroup'). Thus when it calls
> ompi_group_peer_lookup on 'newgroup', half of the processes get garbage
> back since they are not in 'newgroup'. In most cases, memory is
> initialized to \0 and things fall through, but we get intermittent
> segfaults in optimized builds.
>
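To make the membership issue concrete, here is a minimal, self-contained sketch in plain C. It is not Open MPI code: `sketch_group_t`, `sketch_group_rank`, `sketch_self_lookup_is_safe`, and the `SKETCH_RANK_UNDEFINED` sentinel are all hypothetical names, and the group is modelled as a bare list of world ranks. The assumption is that the failing lookup is keyed on the caller's own rank in the group, which is undefined for non-members.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sentinel for "this process is not in the group".
 * Real MPI reports MPI_UNDEFINED from MPI_Group_rank in that case. */
#define SKETCH_RANK_UNDEFINED (-1)

/* Toy group: a list of world ranks (not Open MPI's ompi_group_t). */
typedef struct {
    const int *world_ranks;
    size_t count;
} sketch_group_t;

/* Return the caller's rank inside 'group', or the sentinel if absent. */
static int sketch_group_rank(const sketch_group_t *group, int my_world_rank)
{
    for (size_t i = 0; i < group->count; i++) {
        if (group->world_ranks[i] == my_world_rank) {
            return (int)i;
        }
    }
    return SKETCH_RANK_UNDEFINED;
}

/* A lookup indexed by the caller's own group rank is only safe for members:
 * returns 1 when the rank is a valid index, 0 otherwise. */
static int sketch_self_lookup_is_safe(const sketch_group_t *group,
                                      int my_world_rank)
{
    int r = sketch_group_rank(group, my_world_rank);
    return r >= 0 && (size_t)r < group->count;
}
```

With a four-process world and a 'newgroup' holding world ranks 0 and 1, ranks 2 and 3 get the sentinel back, so any unguarded array access using that value reads outside the group's table.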
Here is a patch to an error check that highlights the problem:
Index: group/group.h
===================================================================
--- group/group.h (revision 15869)
+++ group/group.h (working copy)
@@ -308,7 +308,7 @@
  static inline struct ompi_proc_t* ompi_group_peer_lookup(ompi_group_t *group, int peer_id)
  {
  #if OMPI_ENABLE_DEBUG
- if (peer_id >= group->grp_proc_count) {
+ if (peer_id >= group->grp_proc_count || peer_id < 0) {
          opal_output(0, "ompi_group_lookup_peer: invalid peer index (%d)", peer_id);
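
For readers who want to try the two-sided bounds check outside the Open MPI tree, the following is a rough standalone sketch of the same idea. The types `sketch_proc_t` and `sketch_peer_group_t` and the function `sketch_peer_lookup` are hypothetical stand-ins, not the real `ompi_proc_t`, `ompi_group_t`, or `ompi_group_peer_lookup`. Returning NULL on a bad index is an assumption made for the sketch, since the remainder of the patched function is not shown above, and the check here is unconditional rather than guarded by OMPI_ENABLE_DEBUG.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a process descriptor. */
typedef struct {
    int id;
} sketch_proc_t;

/* Hypothetical stand-in for a dense group: a table of proc pointers. */
typedef struct {
    sketch_proc_t **procs;
    int proc_count;
} sketch_peer_group_t;

/* Look up a peer by index, rejecting both negative and too-large indices.
 * Returns NULL (an assumption of this sketch) when the index is invalid. */
static sketch_proc_t *sketch_peer_lookup(const sketch_peer_group_t *group,
                                         int peer_id)
{
    if (peer_id < 0 || peer_id >= group->proc_count) {
        fprintf(stderr, "sketch_peer_lookup: invalid peer index (%d)\n",
                peer_id);
        return NULL;
    }
    return group->procs[peer_id];
}
```

Checking only the upper bound lets a negative index through, which is the case the added `peer_id < 0` test in the patch catches.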

> Thanks,
>
> Tim