Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Crashes over TCP/ethernet but not on shared memory
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2008-10-27 17:59:09


On Oct 24, 2008, at 12:10 PM, V. Ram wrote:

> Resuscitating this thread...
>
> Well, we spent some time testing the various options, and Leonardo's
> suggestion seems to work!
>
> We disabled TCP segmentation offload (TSO) on the e1000 NICs using
> "ethtool -K eth<X> tso off" and this type of crash no longer happens.
>
> I hope this message can help anyone else experiencing the same issues.
> Thanks Leonardo!
>
> OMPI devs: does this imply bug(s) in the e1000 driver/chip? Should I
> contact the driver authors?

Maybe? :-)

I don't think that we do anything particularly wacky, TCP-wise -- we
just open sockets and read/write plain vanilla data down the fds. So
it might be worth contacting them and asking whether there are any
known issues.
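
For anyone who wants to see what "plain vanilla" socket I/O looks like, here is a minimal sketch in Python. It is not Open MPI's code (Open MPI is written in C and its TCP transport is more involved); the helper names send_all, recv_exact and roundtrip are made up for illustration. It runs over the loopback interface, so it never touches the e1000 NIC or TSO. To probe a suspect NIC, the same pattern would have to run between two hosts.

```python
import socket
import threading


def send_all(sock, payload):
    """Write the whole payload, looping because send() may be partial."""
    view = memoryview(payload)
    sent = 0
    while sent < len(view):
        sent += sock.send(view[sent:])
    return sent


def recv_exact(sock, nbytes):
    """Read exactly nbytes, looping because recv() may return less."""
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed before all bytes arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def roundtrip(payload):
    """Send payload over a loopback TCP connection and return what arrives."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server, _ = listener.accept()
    try:
        # Send from a separate thread so a large payload cannot block
        # forever while nobody is draining the receive side.
        sender = threading.Thread(target=send_all, args=(client, payload))
        sender.start()
        received = recv_exact(server, len(payload))
        sender.join()
        return received
    finally:
        client.close()
        server.close()
        listener.close()


if __name__ == "__main__":
    data = bytes(range(256)) * 1024
    print("payload intact:", roundtrip(data) == data)
```

Comparing the received bytes against what was sent is the simplest way to tell whether anything was altered in transit.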

-- 
Jeff Squyres
Cisco Systems