Hi,
I found the reason why the program is killed by the operating system
when the problem size is large.
It consumes more memory, which leads to more memory swapping.
This also degrades the program's performance.
But, I cannot determine which function of the worker process
causes the problem.
I have used try-catch in my code but no exception popped out.
I found this explanation:
-------------------------------------------------------------------
When the processes running on your server attempt to allocate more
memory than your system has available, the kernel begins to swap
memory pages to and from the disk.
This is done in order to free up sufficient physical memory to meet
the RAM allocation requirements of the requestor.
------------------------------------------------------------------
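One way to see whether a worker is actually being pushed into swap is to have it read its own counters. The following is only a minimal sketch, assuming a Linux node with a /proc filesystem; read_status_kb is a hypothetical helper name, not part of Open MPI or CPLEX. It returns the "VmRSS" (resident memory) or "VmSwap" (swapped-out memory) value that the kernel reports for the calling process.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper (assumes Linux): returns the value in kB of a field
// such as "VmRSS" or "VmSwap" from /proc/self/status, or -1 if the file
// or the field is not available.
long read_status_kb(const std::string& field) {
    std::ifstream status("/proc/self/status");
    std::string line;
    const std::string prefix = field + ":";
    while (std::getline(status, line)) {
        // Lines look like "VmRSS:     12345 kB".
        if (line.compare(0, prefix.size(), prefix) == 0) {
            std::istringstream rest(line.substr(prefix.size()));
            long kb = -1;
            rest >> kb;
            return kb;
        }
    }
    return -1;  // no /proc, or the field is missing
}
```

Calling read_status_kb("VmSwap") at a few points in the worker and printing the result together with the process rank would show whether swap use grows with the problem size.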
I am not sure whether it is really caused by CPLEX (an optimization model solver),
by other routines, or by dynamic memory allocation that the CPLEX API library
performs in the background.
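A possible way to narrow down which call is responsible is to log the process's peak resident memory before and after each suspicious call. If the kill comes from the kernel's out-of-memory handling, it arrives as a signal rather than as a C++ exception, which would be consistent with try-catch seeing nothing. The sketch below uses the POSIX getrusage call; the helper names and allocate_and_touch (a stand-in for a solver call) are hypothetical, and the kB unit of ru_maxrss is the Linux convention.

```cpp
#include <sys/resource.h>
#include <cstddef>
#include <cstdio>
#include <vector>

// Peak resident set size of this process so far (kB on Linux),
// or -1 if getrusage fails.
long peak_rss_kb() {
    struct rusage usage;
    if (getrusage(RUSAGE_SELF, &usage) != 0) return -1;
    return usage.ru_maxrss;
}

// Runs fn, prints the peak RSS before and after it to stderr,
// and returns how much the peak grew during the call (in kB).
template <typename Fn>
long report_peak_growth(const char* label, Fn fn) {
    const long before = peak_rss_kb();
    fn();
    const long after = peak_rss_kb();
    std::fprintf(stderr, "[mem] %s: peak RSS %ld kB -> %ld kB\n",
                 label, before, after);
    return after - before;
}

// Hypothetical stand-in for a memory-hungry call such as a solver run.
void allocate_and_touch(std::size_t bytes) {
    std::vector<char> buf(bytes, 1);      // filling the buffer touches every page
    volatile char sink = buf[bytes / 2];  // keep the buffer from being optimised away
    (void)sink;
}
```

Wrapping each candidate call, for example report_peak_growth("solve", [] { allocate_and_touch(64u << 20); });, leaves one line per call in the worker's stderr, so the last lines before the kill point at the call that was growing.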
Any help is really appreciated.
Jack
From: rhc@open-mpi.org
Date: Wed, 13 Apr 2011 10:34:38 -0600
To: users@open-mpi.org
Subject: Re: [OMPI users] OMPI monitor each process behavior
On Apr 13, 2011, at 10:19 AM, Jack Bryan wrote:
Hi, I am using
mpirun (Open MPI) 1.3.4
But, I have these,
orte-clean orted orte-iof orte-ps orterun
Can they do the same thing?
Unfortunately, no.
If I use them, will they use a lot of memory on each worker node and print a lot of output to log files?
No, but they won't help. orte-top would be run only on the head node (i.e., where you are logged in), and would generate output to your screen.
But you don't have it with that release, so the point is moot. Afraid there isn't much else you can do - you might talk to your sys admin and see what tools are available on your cluster for this purpose. Perhaps a nice parallel debugger is available?