On Apr 4, 2014, at 7:39 AM, Reuti <reuti@staff.uni-marburg.de> wrote:

Am 04.04.2014 um 05:55 schrieb Ralph Castain:

On Apr 3, 2014, at 8:03 PM, Nisha Dhankher -M.Tech(CSE) <nishadhankher-coaeseeit@pau.edu> wrote:

Thank you, Ralph.
Yes, the cluster is heterogeneous...

And did you configure OMPI with --enable-heterogeneous? And are you running it with --hetero-nodes? What version of OMPI are you using anyway?
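For example (the prefix path below is just a placeholder), heterogeneous support has to be built in at configure time and then requested at run time:

  shell$ ./configure --enable-heterogeneous --prefix=/opt/openmpi
  shell$ make all install
  shell$ mpirun --hetero-nodes -np 64 -machinefile mf mpiblast -d all.fas -p blastn -i query.fas -o output.txt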

Note that we don't care if the host PCs are hetero - what we care about is the VMs. If all the VMs are the same, then it shouldn't matter. However, most VM technologies don't handle hetero hardware very well - i.e., you can't emulate an x86 architecture on top of a SPARC or Power chip, or vice versa.

Well - you have to emulate the CPU. There were products that ran a virtual x86 PC on a Mac with a PowerPC chip. And IBM has a product called PowerVM Lx86 to run software compiled for Linux x86 directly on a PowerLinux machine.

As I said, "most" VM technologies won't do that :-)


-- Reuti


And I haven't set up the compute nodes directly on the physical machines (PCs) because, at college, it isn't possible to take the whole lab of 32 PCs for your own work, so I ran them in VMs.

Yes, but at least it would let you test the setup by running MPI across even a couple of PCs - this is simple debugging practice.

In a Rocks cluster the frontend gives the same kickstart to all the PCs, so the Open MPI version should be the same, I guess.

Guess, or know? It makes a difference - might be worth testing.
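A quick way to check (the node names below are just examples; use your actual compute node names or IPs):

  shell$ for n in compute-0-0 compute-0-1; do ssh $n "which mpirun; mpirun --version"; done

If the reported versions or install paths differ from node to node, that would explain the handshake failure.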

Sir 
mpiformatdb is a command that distributes database fragments to the different compute nodes after partitioning the database.
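A typical invocation looks something like this (I am quoting the fragment-count flag from memory, so it should be checked against mpiformatdb --help; -i and -p are the usual formatdb options):

  shell$ mpiformatdb --nfrags=12 -i all.fas -p F

with the fragment count matched to the number of compute nodes.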
And sir, have you run mpiBLAST yourself?

Nope - but that isn't the issue, is it? The issue is with the MPI setup.



On Fri, Apr 4, 2014 at 4:48 AM, Ralph Castain <rhc@open-mpi.org> wrote:
What is "mpiformatdb"? We don't have an MPI database in our system, and I have no idea what that command means.

As for that error - it means that the identifier we exchange between processes is failing to be recognized. This could mean a couple of things:

1. the OMPI version on the two ends is different - could be that you aren't getting the right paths set on the various machines (see the quick check below)

2. the cluster is heterogeneous
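One quick sanity check for the paths (the node name here is just an example): see what a non-interactive shell on a remote node actually picks up:

  shell$ ssh compute-0-0 'which mpirun; echo $PATH; echo $LD_LIBRARY_PATH'

If that resolves to a different Open MPI installation than the one mpirun was started from, you have found the mismatch.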

You say you have "virtual nodes" running on various PCs? That would be an unusual setup - VMs can be problematic given the way they handle TCP connections, so that might be another source of the problem if my understanding of your setup is correct. Have you tried running this across the PCs directly - i.e., without any VMs?
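Even a two-machine test would tell us a lot - something like this (substitute the addresses of two of the physical PCs for the placeholders):

  shell$ cat mf.test
  10.1.255.236
  10.1.255.244
  shell$ mpirun -np 2 -machinefile mf.test hostname

If that prints both host names, the basic Open MPI/TCP path between those machines is fine and we can look elsewhere.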


On Apr 3, 2014, at 10:13 AM, Nisha Dhankher -M.Tech(CSE) <nishadhankher-coaeseeit@pau.edu> wrote:

I first formatted my database with the mpiformatdb command, then I ran:
mpirun -np 64 -machinefile mf mpiblast -d all.fas -p blastn -i query.fas -o output.txt
but then it gave this error 113 from some hosts and continued to run on the others, though with no results even after 2 hours had elapsed... on a Rocks 6.0 cluster with 12 virtual nodes on the PCs (2 on each, created with virt-manager, 1 GB of RAM each).


On Thu, Apr 3, 2014 at 10:41 PM, Nisha Dhankher -M.Tech(CSE) <nishadhankher-coaeseeit@pau.edu> wrote:
I also made a machinefile containing the IP addresses of all the compute nodes, plus a .ncbirc file with the path to mpiBLAST and the shared and local storage paths...
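Roughly, the files look like this (paths are examples, and I am quoting the .ncbirc section and key names from memory, so they may not be exact):

  shell$ cat mf
  10.1.255.236
  10.1.255.244
  (one IP per compute node, 12 lines in total)

  shell$ cat ~/.ncbirc
  [mpiBLAST]
  Shared=/mnt/shared/blastdb
  Local=/tmp/blastdb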
Sir,
I ran the same mpirun command on my college supercomputer (8 nodes, each with 24 processors), but it just keeps running... it has given no result even after 3 hours...


On Thu, Apr 3, 2014 at 10:39 PM, Nisha Dhankher -M.Tech(CSE) <nishadhankher-coaeseeit@pau.edu> wrote:
I first formatted my database with the mpiformatdb command, then I ran:
mpirun -np 64 -machinefile mf mpiblast -d all.fas -p blastn -i query.fas -o output.txt
but then it gave this error 113 from some hosts and continued to run on the others, though with no results even after 2 hours had elapsed... on a Rocks 6.0 cluster with 12 virtual nodes on the PCs (2 on each, created with virt-manager, 1 GB of RAM each).



On Thu, Apr 3, 2014 at 8:37 PM, Ralph Castain <rhc@open-mpi.org> wrote:
I'm having trouble understanding your note, so perhaps I am getting this wrong. Let's see if I can figure out what you said:

* your perl command fails with "no route to host" - but I don't see any host in your command. Maybe I'm just missing something.

* you tried running a couple of "mpirun", but the mpirun command wasn't recognized? Is that correct?

* you then ran mpiblast and it sounds like it successfully started the processes, but then one aborted? Was there an error message beyond just the -1 return status?
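Note that mpirun run by itself, with only MCA options and no application to launch, will simply error out - the --mca settings need to go on the same command line as the application, e.g. (the interface name here is a guess for your network):

  shell$ mpirun --mca btl_tcp_if_include eth0 -np 64 -machinefile mf mpiblast -d all.fas -p blastn -i query.fas -o output.txt

Newer Open MPI releases also let btl_tcp_if_include take a subnet in CIDR form (e.g. 10.1.255.0/24), which is often easier when interface names differ across nodes.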


On Apr 2, 2014, at 11:17 PM, Nisha Dhankher -M.Tech(CSE) <nishadhankher-coaeseeit@pau.edu> wrote:

error btl_tcp_endpoint.c:638: connection failed due to error 113

In Open MPI: this error came when I ran my mpiBLAST program on the Rocks cluster. Connecting to the hosts at 10.1.255.236 and 10.1.255.244 failed. When I run the following command:

  shell$ perl -e 'die$!=113'

this message comes up: "No route to host at -e line 1."

I also executed:

  shell$ mpirun --mca btl ^tcp
  shell$ mpirun --mca btl_tcp_if_include eth1,eth2
  shell$ mpirun --mca btl_tcp_if_include 10.1.255.244

but it did not recognize these commands... and aborted... What should I do?

When I ran my mpiBLAST program for the first time it gave an MPI_ABORT error... bailing out of signal -1 on rank 2... then I removed my public Ethernet cable... and after that it gave the btl_tcp_endpoint error 113...


_______________________________________________
users mailing list
users@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users