Open MPI User's Mailing List Archives

From: Alex Tumanov (atumanov_at_[hidden])
Date: 2007-02-06 11:40:31


Thank you for your reply, Reese!

> What version of GM are you running?
# rpm -qa |egrep "^gm-[0-9]+|^gm-devel"
gm-2.0.24-1
gm-devel-2.0.24-1
Is this too old?

> And are you sure that gm_board_info
> shows all the nodes that are listed in your machine file?
Yes, that was the issue: a bad cable connection to my compute node
prevented it from being seen on the fabric :( Thanks for pointing this
out to me.
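
(For the archives: a quick way to catch such a mismatch is to compare
the machine file against the gmName column of the gm_board_info route
table. A rough sketch, not a Myricom-supplied tool; "machines" is an
assumed machine-file name, and the awk filter assumes the route-table
layout shown further down in this message:

  gm_board_info | awk '$1 ~ /^[0-9]+$/ && $2 ~ /:/ {print $3}' | sort > fabric_hosts
  sort -u machines > listed_hosts
  diff listed_hosts fabric_hosts   # "<" lines are hosts missing from the fabric

Any host that appears only in the machine file is not visible on the
fabric and is a likely source of the startup errors discussed here.)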

> Could you send
> a copy of your gm_board_info output , please?
Sure:
# ./gm_board_info
GM build ID is "2.0.24_Linux_rc20051223164441PST
@dr11.myco.com:/usr/src/redhat/BUILD/gm-2.0.24_Linux Tue Jan 30
23:07:45 EST 2007."

Board number 0:
  lanai_cpu_version = 0x0a00 (LANai10.0)
  lanai_sram_size = 0x001fe000 (2040K bytes)
ROM settings:
  MAC=00:60:dd:49:1e:bf
  SN=187449
  PC=M3F-PCIXD-2
  PN=09-02666
LANai time is 0x209b211b12 ticks, or about 1043 minutes since reset.
Mapper is 00:60:dd:49:99:96.
Map version is 1965903.
2 hosts.
Network is fully configured.
This node is "dr11.myco.com"
Board has room for 16 ports, 1559 nodes/routes, 16384 cache entries
          Port token cnt: send=61, recv=253
Port: Status PID
   0: BUSY 7489 (this process [gm_board_info])
   1: BUSY 25113
Route table for this node follows:
gmID MAC Address gmName Route
---- ----------------- -------------------------------- ---------------------
   1 00:60:dd:49:1e:bf dr11.myco.com (this node)
   2 00:60:dd:49:99:96 dr05.myco.com 81 (mapper)
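
(For completeness: once every host from the machine file shows up in
this route table, the job can be pinned to Myrinet explicitly. A minimal
sketch for Open MPI 1.x; the executable name is a placeholder:

  mpirun --mca btl gm,self -np 2 -machinefile machines ./mpi_hello

"self" is listed alongside "gm" so each process can also send messages
to itself.)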

> A mismatch between the list
> of nodes actually configured onto the Myrinet fabric and the machine file is
> a common source of errors like this. The mismatch could be caused by cable
> failure or other mapping issues.
Could you elaborate on the mapping issues you mentioned? What are they?

> Why GM instead of MX, by the way?
We have a few MX cards in-house, but no MX switch, given its current
market price. So we can only test MX over direct-connect cables, which
is not very exciting :) By contrast, we already had GM boards and a
switch, and found them sufficient for Open MPI testing purposes. It
would be great to upgrade to MX in the near future.

Thank you very much for your help.

Sincerely,
Alex.