Open MPI User's Mailing List Archives

Subject: [OMPI users] Solved: Visual debugging on the cluster
From: devendra rai (rai.devendra_at_[hidden])
Date: 2011-10-26 12:29:11

Hello Meredith

Hmm.. Got my X forwarding to work, so I can debug on multiple computers. So far, so good! Marking the problem as solved.

Thanks for your time.

Best
Devendra Rai

________________________________
From: Meredith Creekmore <mtcreekmore_at_[hidden]>
To: devendra rai <rai.devendra_at_[hidden]>
Sent: Tuesday, 25 October 2011, 18:06
Subject: RE: [OMPI users] Visual debugging on the cluster

Another dumb/obvious question, but have you tried to submit a sample compiled application across multiple nodes? I once did this and it was forever stuck in a wait state. The reason was that the admin had not cleared my account to use multiple nodes. Once he realized the job had been stuck that way for over a month, he corrected it.

There are video tutorials available online, I think. I personally found a PowerPoint presentation which went step by step. The problem is that Eclipse and the plugin change so often that those tutorials can be a bit hard to follow, because many things may have changed, especially the menus where they tell you to find things.

From: devendra rai [mailto:rai.devendra_at_[hidden]]
Sent: Tuesday, October 25, 2011 5:29 AM
To: Meredith Creekmore; Open MPI Users
Subject: Re: [OMPI users] Visual debugging on the cluster

Hello Meredith,

Yes, I have tried the plugin already. The problem is that the plugin seems to be forever stuck in the "Waiting for job information" stage. I scouted around a bit on how to solve the problem, and it did not seem straightforward. At least, the solution seemed to me like a one-time wonder.

And this is how I shifted to parallel visual debugging using other tools like kdbg.

However, in case you have the PTP plugin working for you on Linux, it would help a lot if you could send screenshots/notes on how to set it up for multiple machines.

So, summing up, I am still clueless.

Thanks for your time though.
Best
Devendra

________________________________
From: Meredith Creekmore <mtcreekmore_at_[hidden]>
To: devendra rai <rai.devendra_at_[hidden]>; Open MPI Users <users_at_[hidden]>
Sent: Monday, 24 October 2011, 22:31
Subject: RE: [OMPI users] Visual debugging on the cluster

Not a direct answer to your question, but have you tried using Eclipse with the Parallel Tools Platform installed?

From: users-bounces_at_[hidden] [mailto:users-bounces_at_[hidden]] On Behalf Of devendra rai
Sent: Monday, October 24, 2011 2:50 PM
To: users_at_[hidden]
Subject: [OMPI users] Visual debugging on the cluster

Hello Community,

I have been struggling with visual debugging on cluster machines. So far, I have tried to work around the problem, or totally avoid it, but no more.

I have three machines on the cluster: a.s1.s2, b.s1.s2 and c.s1.s2. I do not have admin privileges on any of these machines.

Now, I want to run a visual debugger on all of these machines, and have the windows come up.

So far, from: ( category=running)

13. Can I run GUI applications with Open MPI?

Yes, but it will depend on your local setup and may require additional setup. In short: you will need to have X forwarding enabled from the remote processes to the display where you want output to appear. In a secure environment, you can simply allow all X requests to be shown on the target display and set the DISPLAY environment variable in all MPI process' environments to the target display, perhaps something like this:

    shell$ hostname
    shell$ xhost +
    shell$ mpirun -np 4 -x a.out

However, this technique is not generally suitable for unsecure environments (because it allows anyone to read and write to your display).
A slightly more secure way is to only allow X connections from the nodes where your application will be running:

    shell$ hostname
    shell$ xhost +compute1 +compute2 +compute3 +compute4
    compute1 being added to access control list
    compute2 being added to access control list
    compute3 being added to access control list
    compute4 being added to access control list
    shell$ mpirun -np 4 -x a.out

(assuming that the four nodes you are running on are compute1 through compute4).

Other methods are available, but they involve sophisticated X forwarding through mpirun and are generally more complicated than desirable.

This still gives me the "Error: Can't open display:" problem. My mpirun shell script contains:

    mpirun-1.4.3 -hostfile hostfile -np 3 -v -nooversubscribe --rankfile rankfile.txt --report-bindings  -timestamp-output ./

where rankfile and hostfile contain a.s1.s2, b.s1.s2 and c.s1.s2, and are proper.

The file ./

    #!/bin/bash
    echo "Running xeyes on `hostname`"
    DISPLAY=a.s1.s2:11.0 xeyes
    exit 0

I see that my xauth list output already contains entries like:

    a.s1.s2/unix:12  MIT-MAGIC-COOKIE-1  aa16a9573f42224d760c7bb618b48a6f
    a.s1.s2/unix:10  MIT-MAGIC-COOKIE-1  0fb6fe3c2e35676136c8642412fb5809
    a.s1.s2/unix:11  MIT-MAGIC-COOKIE-1  a3a65970b5f545bc750e3520a4e3b872

I seem to have run out of ideas now.

However, this works perfectly on any of the machines a.s1.s2, b.s1.s2 or c.s1.s2 (for example, running from a.s1.s2):

    ssh b.s1.s2 xeyes

Can someone help?

Best
Devendra Rai
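The FAQ text quoted in the original post says to set DISPLAY in every MPI process's environment, and mpirun's -x option is the usual way to export a variable to the launched processes. The following is a minimal sketch of that idea, not the fix that was eventually used in this thread (the thread does not say what it was). It only prints the command instead of running it. The helper build_display and the wrapper name ./run_xeyes.sh are hypothetical, since the script name is not given in the post; the host and display number are taken from the post as example values.

```shell
#!/bin/bash
# Sketch only: compose a DISPLAY value and show the mpirun command that would
# export it to every rank. Nothing is launched; the command is just printed.

# build_display HOST DISPLAY_NUMBER -> prints "HOST:DISPLAY_NUMBER"
# (hypothetical helper, introduced for illustration)
build_display() {
    printf '%s:%s' "$1" "$2"
}

# Example values taken from the post; adjust them for your own session.
target_host="a.s1.s2"
display_number="11.0"
wrapper="./run_xeyes.sh"   # hypothetical name; the post does not give one

display_value="$(build_display "$target_host" "$display_number")"

# "-x NAME=value" asks Open MPI's mpirun to set NAME in each launched process.
echo "mpirun -hostfile hostfile -np 3 -x DISPLAY=$display_value $wrapper"
```

With DISPLAY exported this way, the wrapper script could call xeyes directly and rely on the inherited variable rather than hard-coding it. Whether the X server on the target machine accepts the connection still depends on the xhost, xauth, or ssh forwarding setup, which this sketch does not address.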