Some other nice tools for IB monitoring:
1. perfquery (part of OFED). Example report:
Port counters: Lid 12 port 1
PortSelect:......................1
CounterSelect:...................0x0000
SymbolErrors:....................7836
LinkRecovers:....................255
LinkDowned:......................0
RcvErrors:.......................24058
RcvRemotePhysErrors:.............6159
RcvSwRelayErrors:................0
XmtDiscards:.....................3176
XmtConstraintErrors:.............0
RcvConstraintErrors:.............0
LinkIntegrityErrors:.............0
ExcBufOverrunErrors:.............0
VL15Dropped:.....................0
XmtData:.........................1930
RcvData:.........................1708
XmtPkts:.........................114
RcvPkts:.........................114
2. collectl (http://collectl.sourceforge.net/). Example report:
#<--------CPU--------><-----------Memory----------><----------InfiniBand---------->
#cpu sys inter ctxsw free buff cach inac slab  map KBin pktIn KBOut pktOut   Errs
   1   0   847  1273   1G 264M   3G 594M   1G 234M    2    29     2     29 123242
   2   1   851  2578   1G 264M   3G 594M   1G 234M    1     5     1      5 123391
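
If you want to post-process the perfquery report (for example, to flag non-zero error counters from a cron job), the "Name:....value" layout shown above is easy to parse. Below is a minimal Python sketch, not part of OFED. The function names and the choice of which counters to treat as errors are my own assumptions for illustration, and the regular expression only covers the line format seen in the report above.

```python
import re

# Matches lines such as "SymbolErrors:....................7836".
# The header line "Port counters: Lid 12 port 1" does not match and is skipped.
_COUNTER_LINE = re.compile(r"^\s*(\w+):\.+(\w+)\s*$")


def parse_perfquery(text):
    """Return {counter_name: int} parsed from perfquery-style report text."""
    counters = {}
    for line in text.splitlines():
        match = _COUNTER_LINE.match(line)
        if not match:
            continue
        name, value = match.groups()
        # CounterSelect is printed in hex (0x0000); the rest are decimal.
        if value.lower().startswith("0x"):
            counters[name] = int(value, 16)
        else:
            counters[name] = int(value)
    return counters


def nonzero_counters(counters, names):
    """Return the subset of `names` whose value is present and non-zero."""
    return {n: counters[n] for n in names if counters.get(n, 0) != 0}
```

Feed parse_perfquery() the text captured from a perfquery run, then pass the counter names you care about (SymbolErrors, RcvErrors, XmtDiscards and so on) to nonzero_counters().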
Pavel Shamis (Pasha) wrote:
> SLIM H.A. wrote:
>
>> Is it possible to get information about the usage of hca ports similar
>> to the result of the mx_endpoint_info command for Myrinet boards?
>>
>> The ibstat command gives information like this:
>>
>> Port 1:
>> State: Active
>> Physical state: LinkUp
>>
>> but does not say whether a job is actually using an infiniband port or
>> communicates through plain ethernet.
>>
>> I would be grateful for any advice
>>
>>
> You have access to some counters in
> /sys/class/infiniband/mlx4_0/ports/1/counters/ (counters for HCA
> mlx4_0, port 1)
>
>
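
To see whether anything is actually moving over the port while a job runs, one option is to sample that counters directory twice and compare. Below is a rough Python sketch under a few assumptions: the function names are made up for illustration, the exact set of files (for example port_xmit_data, port_rcv_data) depends on the driver, and on some setups the counters are narrow and stop at their maximum instead of wrapping. It shows that the port carries traffic, not which job generates it.

```python
from pathlib import Path


def read_counters(counters_dir):
    """Read every numeric counter file in a sysfs counters directory."""
    values = {}
    for path in sorted(Path(counters_dir).iterdir()):
        try:
            values[path.name] = int(path.read_text().strip())
        except (OSError, ValueError):
            continue  # skip unreadable or non-numeric entries
    return values


def counter_delta(before, after):
    """Return only the counters whose value changed between two samples."""
    return {
        name: after[name] - before[name]
        for name in after
        if name in before and after[name] != before[name]
    }
```

Typical use: call read_counters("/sys/class/infiniband/mlx4_0/ports/1/counters") before and after the job (or a few seconds apart while it runs) and print counter_delta(before, after). If the data and packet counters do not move, the job is most likely not going over that port.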
--
Pavel Shamis (Pasha)
Mellanox Technologies