Open MPI User's Mailing List Archives

Subject: [OMPI users] on cluster job slowdown near end
From: Tom Hilinski (tom.hilinski_at_[hidden])
Date: 2011-09-22 17:44:11


Hi,

A job I am running slows down as it approaches the end. I'd appreciate
any ideas you may have on possible causes, or on what else I can look
at for diagnostic information.

Environment:
* Linux cluster, very recent version of Fedora.
* Open MPI 1.5

Characteristics of job:
* Tasks are all the same size and duration.
* 56K tasks, but multiple tasks given to each process.
* Typically run 120 processes.
* Slowdown starts at ~52K completed tasks; the task completion rate
then declines geometrically from ~1K/minute to 4/minute at 54K.
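
To put numbers on the decline described above, one option is to record a
timestamp each time a task completes and count completions per minute.
The following is only a sketch: it assumes such a list of completion
times (in seconds) is available, and the function name and input format
are hypothetical, not part of the job itself.

```python
from collections import Counter


def completions_per_minute(timestamps):
    """Count task completions in each one-minute bin.

    `timestamps` is an iterable of completion times in seconds
    (hypothetical input; the job would have to log these itself).
    Returns {minute_index: count}, with minute 0 starting at the
    earliest timestamp.
    """
    timestamps = list(timestamps)
    if not timestamps:
        return {}
    start = min(timestamps)
    # Bin each completion by whole minutes elapsed since the first one.
    counts = Counter(int((t - start) // 60) for t in timestamps)
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    # Three completions in the first minute, then one in each of the next two.
    print(completions_per_minute([0, 10, 59, 60, 125]))
```

Plotting these counts against the number of completed tasks would show
whether the rate really falls off geometrically or drops in steps.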

Here are some queries done when the slowdown occurs:

* "ps" on master node - most processes are in the sleeping (S) state:
F S   UID   PID  PPID  C PRI  NI ADDR     SZ WCHAN  TTY       TIME CMD
0 S  3348 27933 15675  0  80   0 -     13608 poll_s pts/0 00:00:00 mpiexec
0 S  3348 28009 27933 14  80   0 -    227632 epoll_ pts/0 00:08:13 C5MPI
0 S  3348 28011 27933 14  80   0 -    227672 epoll_ pts/0 00:08:17 C5MPI
0 S  3348 28013 27933 13  80   0 -    227713 epoll_ pts/0 00:08:06 C5MPI
0 S  3348 28015 27933 13  80   0 -    227844 epoll_ pts/0 00:08:02 C5MPI
0 S  3348 28017 27933 14  80   0 -    227849 epoll_ pts/0 00:08:13 C5MPI
0 S  3348 28019 27933 13  80   0 -    227892 epoll_ pts/0 00:08:07 C5MPI
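
A listing like the one above can be summarized by tallying the state (S)
and wait-channel (WCHAN) columns per command. The sketch below parses
the "ps" text quoted in this message; the helper name is hypothetical
and it assumes whitespace-separated "ps -l" style columns.

```python
from collections import Counter

# The "ps" output quoted in this message.
PS_OUTPUT = """\
F S UID PID PPID C PRI NI ADDR SZ WCHAN TTY TIME CMD
0 S 3348 27933 15675 0 80 0 - 13608 poll_s pts/0 00:00:00 mpiexec
0 S 3348 28009 27933 14 80 0 - 227632 epoll_ pts/0 00:08:13 C5MPI
0 S 3348 28011 27933 14 80 0 - 227672 epoll_ pts/0 00:08:17 C5MPI
0 S 3348 28013 27933 13 80 0 - 227713 epoll_ pts/0 00:08:06 C5MPI
0 S 3348 28015 27933 13 80 0 - 227844 epoll_ pts/0 00:08:02 C5MPI
0 S 3348 28017 27933 14 80 0 - 227849 epoll_ pts/0 00:08:13 C5MPI
0 S 3348 28019 27933 13 80 0 - 227892 epoll_ pts/0 00:08:07 C5MPI
"""


def tally_states(ps_text):
    """Count processes by (command, state, wait channel)."""
    lines = ps_text.strip().splitlines()
    header = lines[0].split()
    s_idx = header.index("S")
    wchan_idx = header.index("WCHAN")
    cmd_idx = header.index("CMD")
    counts = Counter()
    for line in lines[1:]:
        # Split into at most len(header) fields so CMD may contain spaces.
        fields = line.split(None, len(header) - 1)
        if len(fields) < len(header):
            continue  # skip malformed or truncated rows
        counts[(fields[cmd_idx], fields[s_idx], fields[wchan_idx])] += 1
    return counts


if __name__ == "__main__":
    for (cmd, state, wchan), n in sorted(tally_states(PS_OUTPUT).items()):
        print(f"{cmd:8s} state={state} wchan={wchan:8s} count={n}")
```

For the rows shown here, every C5MPI process is in state S with a wait
channel beginning "epoll_", i.e. sleeping in a poll call rather than
computing.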

* file handles (allocated handle count is ~constant):
$ cat /proc/sys/fs/file-nr
3968 0 801014

* Processes in a sleeping (S) or running (R) state (counts vary):
$ orte-top -pid 27933 | grep ' S |' | wc -l
124
$ orte-top -pid 27933 | grep ' R |'
Rank | Nodename  | Command | Pid   | State | Time | Pri | #threads | Vsize  | RSS   | Peak Vsize | Shr Size |
   0 | rubel-001 | C5MPI   | 14700 | R     | 2.2H |  20 |        1 | 246208 | 12660 |     246208 |    17664 |
   1 | rubel-001 | C5MPI   | 14702 | R     | 2.2H |  20 |        1 | 245360 | 44860 |     245360 |    17664 |