test_hca_stateΒΆ

test_hca_state (part of the BEF_Scripts for xCAT) v3.2.27

Usage: test_hca_state NODERANGE [FILTER] | xcoll

    --help  Display this help output.

    NODERANGE
        An xCAT noderange on which to operate.

    FILTER
        A string to match in the output, filtering out everything else.  This
        is passed to "egrep" and can be a simple string or a regular
        expression.

Purpose:

    This tool provides a quick and easily repeatable method of
    validating key InfiniBand adapter (HCA) and node based InfiniBand
    settings across an entire cluster.

    Having consistent OFED settings, and even HCA firmware, can be very
    important for a properly functioning InfiniBand fabric.  This tool
    can help you confirm that your nodes are using the settings you
    want, and if any nodes have settings discrepancies.


Example output:

    #
    # This example shows that all of rack 14 has the same settings.
    #
    root@mgt1:~ # test_hca_state rack14 | xcoll
    ====================================
    rack14
    ====================================
    OFED Version: MLNX_OFED_LINUX-2.0-3.0.0.3 (OFED-2.0-3.0.0):
    mlx4_0
      PCI: Gen3
      Firmware installed: 2.30.3200
      Firmware active:    2.30.3200
      log_num_mtt:      20
      log_mtts_per_seg: 3
      Port 1: InfiniBand    phys_state: 5: LinkUp
        state: 4: ACTIVE
        rate: 40 Gb/sec (4X FDR10)
        symbol_error: 0
        port_rcv_errors: 0
      Port 2: InfiniBand    phys_state: 3: Disabled
        state: 1: DOWN
        rate: 10 Gb/sec (4X)
        symbol_error: 0
        port_rcv_errors: 0

      IPoIB
        recv_queue_size: 8192
        send_queue_size: 8192
        ib0:
          Mode: datagram
          MTU:  4092
          Mode: up
        ib1:
          Mode: datagram
          MTU:  4092
          Mode: up


    #
    # This example uses a FILTER on the word 'firmware'.  In this case, we've
    # upgraded the firmware across rack11 and rack12.
    #
    #   - On rack11, we've also restarted the IB stack (/etc/init.d/openibd
    #     restart) to activate the new firmware.
    #
    #   - Rack 12 has also been updated, as we can see from the 'Firmware
    #     installed' line, but it's nodes are still running with their prior
    #     level of firmware and must reload the IB stack to have it take effect.
    #
    root@mgt1:~ # test_hca_state rack11,rack12 firmware | xcoll
    ====================================
    rack11
    ====================================
      Firmware installed: 2.30.3200
      Firmware active:    2.30.3200

    ====================================
    rack12
    ====================================
      Firmware installed: 2.30.3200
      Firmware active:    2.11.1260

Author: Brian Finley