I am actually running the released 1.1. I can send you my code if you want, and you could try running it on a single node with -np 4 or 5 (oversubscribing) to see if you get a BUS_ADRALN error on one node. The only requirement for compiling the code is that the X libs be available (a display is not required for execution, though it's more fun :P).
On Wednesday 28 June 2006 13:02, Terry D. Dontje wrote:
> Well, I've been using the trunk and not 1.1. I also just built
> 1.1.1a1r10538 and ran it with no bus error. Though you are running
> 1.1b5r10421, so we're not running the same thing, as of yet.
> I have a cluster of two v440s that have 4 CPUs each, running Solaris 10.
> The tests I am running are np=2, one process on each node.
> Eric Thibodeau wrote:
> > I was about to comment on this. Could you tell me the specs of your machine? As you will notice in "my thread", I am running into problems on Sparc SMP systems where the CPU boards' RTCs are in a doubtful state. Are you running 1.1 on SMP machines? If so, on how many procs, and what hardware/OS version is this running on?
> >On Wednesday 28 June 2006 10:35, Terry D. Dontje wrote:
> >>Can you set your limit coredumpsize to non-zero, rerun the program,
> >>and then get the stack via dbx?
> >>So, I have a similar case of BUS_ADRALN on SPARC systems with an
> >>older version (June 21st) of the trunk. I've since run using the latest
> >>trunk and the bus error went away. I am now going to try this out with
> >>v1.1 to see if I get results. Your stack would help me try and determine
> >>if this is an OpenMPI issue or possibly some type of platform problem.
> >>There is another thread with Eric Thibodeau, and I am unsure whether it
> >>is the same issue as either of our situations.
Neural Bucket Solutions Inc.
T. (514) 736-1436
C. (514) 710-0517