On Oct 24, 2008, at 12:10 PM, V. Ram wrote:
> Resuscitating this thread...
>
> Well, we spent some time testing the various options, and Leonardo's
> suggestion seems to work!
>
> We disabled TCP segmentation offload (TSO) on the e1000 NICs using
> "ethtool -K eth<X> tso off", and this type of crash no longer happens.
>
> I hope this message can help anyone else experiencing the same issues.
> Thanks Leonardo!
>
> OMPI devs: does this imply bug(s) in the e1000 driver/chip? Should I
> contact the driver authors?
Maybe? :-)
I don't think that we do anything particularly wacky, TCP-wise -- we
just open sockets and read/write plain vanilla data down the fds. So
it might be worth contacting them and asking if there are any known
issues...?
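
If it helps when talking to the driver folks, here is a rough sketch of the
kind of "plain vanilla" socket traffic I mean. It is not Open MPI code and it
does not reproduce the crash; the function names (recv_all, transfer) and the
loopback address are made up for illustration. To exercise a real NIC you
would presumably run the sender and receiver on two hosts instead.

```python
import hashlib
import socket
import threading


def recv_all(conn):
    """Read until the peer closes the connection; return the bytes received."""
    chunks = []
    while True:
        data = conn.recv(65536)
        if not data:  # empty read means the sender closed its end
            break
        chunks.append(data)
    return b"".join(chunks)


def transfer(payload, host="127.0.0.1"):
    """Send payload over a plain TCP socket and return what the receiver read.

    The host default is an assumption for illustration; loopback traffic
    never touches the e1000 hardware.
    """
    received = {}
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind((host, 0))  # port 0: let the OS pick a free port
        server.listen(1)
        port = server.getsockname()[1]

        def receiver():
            conn, _ = server.accept()
            with conn:
                received["data"] = recv_all(conn)

        worker = threading.Thread(target=receiver)
        worker.start()
        # Leaving the with-block closes the client socket, which signals
        # end-of-stream to the receiver.
        with socket.create_connection((host, port)) as client:
            client.sendall(payload)
        worker.join()
    return received["data"]


if __name__ == "__main__":
    payload = bytes(range(256)) * 4096  # 1 MiB of patterned data
    echoed = transfer(payload)
    # Compare digests so corruption anywhere in the stream is detected.
    same = hashlib.sha256(payload).digest() == hashlib.sha256(echoed).digest()
    print("payload intact:", same)
```

Large writes like this are the sort of thing the kernel may hand to the NIC
for segmentation when TSO is on, so comparing runs with tso on and off might
be a useful data point to give them.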
--
Jeff Squyres
Cisco Systems