Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] [EXTERNAL] Possible memory leak(s) in OpenMPI 1.6.3?
From: Barrett, Brian W (bwbarre_at_[hidden])
Date: 2013-01-21 17:50:20


Thanks for the bug report. I've fixed the leak in our development branch and it should make its way to both the 1.6 and 1.7 release series.

Brian

On 1/21/13 6:53 AM, "Victor Vysotskiy" <victor.vysotskiy_at_[hidden]> wrote:

Since my question has gone unanswered for 4 days, I am repeating the original post.

Dear Developers,

I am running into memory problems when an MPI window and its memory are created and freed frequently. The sample code below reproduces the problem:

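#include <stdio.h>
#include <mpi.h>
#define NEL 8
#define NTIMES 1000000

int main (int argc,char *argv[]) {
  int i;
  double w[NEL];            /* buffer exposed through the window */
  MPI_Aint win_size,warr_size;
  MPI_Win *win;             /* window handle, itself allocated by MPI */

  win_size=sizeof(MPI_Win);
  /* size in bytes of the exposed buffer: sizeof(double), not
     sizeof(MPI_DOUBLE), which is the size of a datatype handle */
  warr_size=sizeof(double)*NEL;

  MPI_Init (&argc, &argv);

  /* create and free a window over and over; memory use should stay flat */
  for(i=0;i<NTIMES;i++) {
      MPI_Alloc_mem(win_size,MPI_INFO_NULL,&win);

      MPI_Win_create(w,warr_size,sizeof(double),MPI_INFO_NULL,MPI_COMM_WORLD,win);
      MPI_Win_free(win);

      MPI_Free_mem(win);
  }

  MPI_Finalize();

  return 0;

}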

While this program runs, its memory consumption grows steadily, regardless of the Linux distribution and gcc version used. I have reproduced the problem on 32-bit Ubuntu 12.04.1 with gcc 4.6.3 and on 64-bit CentOS 5.8 with gcc 4.6.2. For instance, the Valgrind (massif) profile of the runtime memory usage is listed below:

Command: ./mleak.win
ms_print arguments: massif.out.15028
  n time(ms) total(B) useful-heap(B)
  1 10,960 4,290,024 3,986,911
  2 26,979 5,248,248 4,562,911
  3 42,586 6,174,440 5,118,311
  4 54,892 6,904,736 5,556,391
  5 69,562 7,771,088 6,074,535
  6 79,334 8,351,144 6,422,575
  7 90,920 9,038,208 6,834,799
  8 103,449 9,787,760 7,286,215
  9 115,984 10,534,056 7,734,511
 10 130,692 11,407,016 8,258,287
 11 146,637 12,352,376 8,825,503
 12 155,095 12,854,016 9,126,487
 13 163,884 13,376,016 9,439,687
 14 173,036 13,919,256 9,765,631
 15 182,559 14,484,488 10,104,751
 16 192,465 15,072,688 10,457,671
 17 202,770 15,684,768 10,824,919
 18 213,499 16,321,768 11,207,119
 19 224,658 16,984,608 11,604,823
 20 236,275 17,674,320 12,018,631
 21 248,366 18,392,040 12,449,263
 22 260,954 19,138,960 12,897,415
 23 274,111 19,916,152 13,363,711
 24 287,781 20,724,912 13,848,967
 25 302,012 21,566,552 14,353,951
 26 316,817 22,442,344 14,879,407
 27 332,189 23,353,664 15,426,199
 28 348,179 24,301,984 15,995,191
 29 364,829 25,288,816 16,587,271
 30 373,404 25,797,176 16,892,287
 31 382,159 26,315,736 17,203,423
 32 391,086 26,844,696 17,520,799
 33 400,196 27,384,336 17,844,583
 34 409,491 27,934,808 18,174,847
 35 418,958 28,496,328 18,511,759
 36 428,621 29,069,168 18,855,463
 37 438,478 29,653,488 19,206,055
 38 448,539 30,249,560 19,563,679
 39 458,806 30,857,640 19,928,527
 40 469,271 31,477,920 20,300,695
 41 479,955 32,110,680 20,680,351
 42 490,849 32,756,120 21,067,615
 43 501,960 33,414,552 21,462,655
 44 513,294 34,086,232 21,865,663
 45 524,859 34,771,392 22,276,759
 46 536,729 35,470,304 22,696,087
 47 548,772 36,183,304 23,123,887
 48 561,047 36,910,624 23,560,279
 49 573,561 37,652,544 24,005,431
 50 579,928 38,029,080 24,231,343
 51 586,337 38,409,376 24,459,511
 52 592,820 38,793,496 24,689,983
 53 599,367 39,181,456 24,922,759
 54 605,983 39,573,296 25,157,863
 55 612,664 39,969,056 25,395,319
 56 619,412 40,368,776 25,635,151
 57 626,225 40,772,488 25,877,359
 58 633,113 41,180,248 26,122,015
 59 640,071 41,592,088 26,369,119
 60 647,095 42,008,048 26,618,695
 61 654,192 42,428,168 26,870,767
 62 661,358 42,852,488 27,125,359
 63 668,594 43,281,040 27,382,471
 64 675,905 43,713,880 27,642,175

On the other hand, binaries built with MPICH2 and MVAPICH2 always use a constant amount of memory during the execution of this test.

Could you please comment on the problem?

Enclosed please find the source code and the corresponding Valgrind output, generated with the "--tool=massif --time-unit=ms" options.

Regards,
Victor.

--
  Brian W. Barrett
  Scalable System Software Group
  Sandia National Laboratories