For switch-based hardware discovery, servers are identified through the switches and switch ports they are directly connected to.
The examples in this document use the following configuration:
Management Node info:
MN Hostname: xcat1
MN NIC info for Management Network (Host network): eth1, 10.0.1.1/16
MN NIC info for Service Network (FSP/BMC network): eth2, 22.214.171.124/16
Dynamic IP range for Hosts: 10.0.100.1-10.0.100.100
Dynamic IP range for FSP/BMC: 126.96.36.199-188.8.131.52
Compute Node info:
CN Hostname: cn1
Machine type/model: 8247-22L
Serial: 10112CA
IP Address: 10.0.101.1
Root Password: cluster
Desired FSP/BMC IP Address: 184.108.40.206
DHCP assigned FSP/BMC IP Address: 220.127.116.11
FSP/BMC username: ADMIN
FSP/BMC Password: admin
Switch info:
Switch name: switch1
Switch username: xcat
Switch password: passw0rd
Switch IP Address: 10.0.201.1
Switch port for Compute Node: port0
Configure network table¶
Normally, after xCAT is installed, the networks table contains at least two entries, one for each of the two subnets on the MN:
#tabdump networks
#netname,net,mask,mgtifname,gateway,dhcpserver,tftpserver,nameservers,ntpservers,logservers,dynamicrange,staticrange,staticrangeincrement,nodehostname,ddnsdomain,vlanid,domain,mtu,comments,disable
"10_0_0_0-255_255_0_0","10.0.0.0","255.255.0.0","eth1","<xcatmaster>",,"10.0.1.1",,,,,,,,,,,,,
"50_0_0_0-255_255_0_0","18.104.22.168","255.255.0.0","eth2","<xcatmaster>",,"22.214.171.124",,,,,,,,,,,,,
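To make the column layout easier to read, the following sketch pulls netname, mgtifname, and dynamicrange out of tabdump-style text. It works on a pasted copy of the first row above instead of calling tabdump, the helper name show_networks is made up for this example, and it assumes no quoted value contains a comma.

```shell
# Print netname, management interface, and dynamic range for each row of
# tabdump-style CSV read from stdin.
# Assumption: no quoted value contains a comma, so a plain split is enough.
show_networks() {
    awk -F, '
        /^#/ { next }                  # skip the command echo and the header
        {
            gsub(/"/, "")              # strip quoting; awk re-splits the record
            range = ($11 == "") ? "none" : $11
            print $1, $4, range        # columns 1, 4, 11 per the header above
        }'
}

show_networks <<'EOF'
#netname,net,mask,mgtifname,gateway,dhcpserver,tftpserver,nameservers,ntpservers,logservers,dynamicrange,staticrange,staticrangeincrement,nodehostname,ddnsdomain,vlanid,domain,mtu,comments,disable
"10_0_0_0-255_255_0_0","10.0.0.0","255.255.0.0","eth1","<xcatmaster>",,"10.0.1.1",,,,,,,,,,,,,
EOF
```

On a management node the same function could be fed with `tabdump networks | show_networks`; a row reporting `none` has no dynamic range set yet.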
If the networks table has no entries, run the following command to add them:
makenetworks
Set the NICs on which the DHCP server provides service:
chdef -t site dhcpinterfaces=eth1,eth2
Add dynamic ranges for assigning temporary IP addresses to FSP/BMCs and hosts:
chdef -t network 10_0_0_0-255_255_0_0 dynamicrange="10.0.100.1-10.0.100.100"
chdef -t network 50_0_0_0-255_255_0_0 dynamicrange="126.96.36.199-188.8.131.52"
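One sanity check that may be useful here, assuming you want permanent node addresses to stay outside the temporary pool (this document does not state that as a requirement), is to test whether an address falls inside a dynamic range. The helper names ip_to_int and in_range are hypothetical and not part of xCAT.

```shell
# Convert a dotted-quad IPv4 address to a single integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1                     # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if IP lies within START-END (inclusive). Usage: in_range IP START-END
in_range() {
    ip=$(ip_to_int "$1")
    lo=$(ip_to_int "${2%-*}")     # text before the dash
    hi=$(ip_to_int "${2#*-}")     # text after the dash
    [ "$ip" -ge "$lo" ] && [ "$ip" -le "$hi" ]
}

# The host range and the cn1 address come from the example configuration.
if in_range 10.0.101.1 10.0.100.1-10.0.100.100; then
    echo "10.0.101.1 is inside the dynamic range"
else
    echo "10.0.101.1 is outside the dynamic range"
fi
```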
Update DHCP configuration file:
makedhcp -n
makedhcp -a
Config passwd table¶
Set the passwords that xCAT requires for hardware management and/or OS provisioning by adding entries to the xCAT passwd table:
# tabedit passwd
# key,username,password,cryptmethod,authdomain,comments,disable
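For hardware management with ipmi, add the following line (shown here with the FSP/BMC username and password from the example configuration):
"ipmi","ADMIN","admin",,,,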
Verify the genesis packages¶
The xcat-genesis packages should have been installed when xCAT was installed, and hardware discovery will run into problems if they are missing. They are required to create the genesis root image used for hardware discovery, and the genesis kernel resides in /tftpboot/xcat/. Verify that the genesis-base packages are installed:
On RPM-based distributions:
rpm -qa | grep -i genesis
On Debian-based distributions:
dpkg -l | grep -i genesis
If missing, install them from the xcat-deps package and run mknb ppc64 to create the genesis network boot root image.
To differentiate one node from another, the admin needs to predefine nodes in the xCAT database based on the switch information. This consists of two parts: predefining the switches and predefining the server node.
The predefined switches represent the devices that the physical servers are connected to. xCAT needs to access those switches through SNMP v3 to get server-related information, so the admin needs to make sure the switches are configured correctly with SNMP v3 enabled.
Then define the switch information in xCAT:
nodeadd switch1 groups=switch,all
chdef switch1 ip=10.0.201.1
tabch switch=switch1 switches.snmpversion=3 switches.username=xcat switches.password=passw0rd switches.auth=sha
Add the switch to DNS using the following commands:
makehosts switch1
makedns -n
Predefine Server node
After switches are defined, the server node can be predefined with the following commands:
nodeadd cn1 groups=powerLE,all
chdef cn1 mgt=ipmi cons=ipmi ip=10.0.101.1 bmc=184.108.40.206 netboot=petitboot installnic=mac primarynic=mac
chdef cn1 switch=switch1 switchport=0
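When several servers hang off the same switch, the same three commands repeat per node. The sketch below only prints the commands for review and does not run them. It assumes a hypothetical plan in which node cnN has IP 10.0.101.N and is cabled to switch port N-1 (which matches cn1 above), and it leaves out the bmc attribute, which has to be filled in per node. The function name predefine_cmds is made up for this example.

```shell
# Print (not run) the predefine commands for nodes cn<first>..cn<last>.
# Assumptions (hypothetical, adjust to the real cabling and IP plan):
#   - node cnN uses host IP 10.0.101.N
#   - node cnN is connected to switch port N-1
#   - the bmc= attribute is set separately for each node
predefine_cmds() {
    switch=$1
    first=$2
    last=$3
    n=$first
    while [ "$n" -le "$last" ]; do
        echo "nodeadd cn$n groups=powerLE,all"
        echo "chdef cn$n mgt=ipmi cons=ipmi ip=10.0.101.$n netboot=petitboot installnic=mac primarynic=mac"
        echo "chdef cn$n switch=$switch switchport=$((n - 1))"
        n=$((n + 1))
    done
}

predefine_cmds switch1 1 2
```

Reviewing the printed commands before pasting them into a shell on the management node keeps a wrong port mapping from reaching the xCAT database.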
[Optional] If more configuration is planned for the BMC, the following commands are also needed:
chdef cn1 bmcvlantag=<vlanid> # tag VLAN ID for BMC
chdef cn1 bmcusername=<desired_username>
chdef cn1 bmcpassword=<desired_password>
To configure the BMC during the discovery process, set runcmd=bmcsetup in the chain attribute:
chdef cn1 chain="runcmd=bmcsetup"
[Optional] If more operations are planned after hardware discovery is done, the ondiscover option can be used.
For example, to configure the console, copy the SSH key for OpenBMC, and then disable powersupplyredundancy:
chdef cn1 -p chain="ondiscover=makegocons|rspconfig:sshcfg|rspconfig:powersupplyredundancy=disabled"
| is used to split commands, and : is used to split a command from its option.
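To make the two separators concrete, here is a small sketch that takes an ondiscover value apart the way the sentence above describes. It only mirrors the described syntax and is not xCAT's own parser; the function name split_ondiscover is made up for this example.

```shell
# Split an ondiscover value into commands on "|" and split each command
# from its option on the first ":". Prints one line per command.
split_ondiscover() {
    printf '%s\n' "$1" | tr '|' '\n' | while IFS= read -r step; do
        case $step in
            *:*) printf 'command=%s option=%s\n' "${step%%:*}" "${step#*:}" ;;
            *)   printf 'command=%s option=\n' "$step" ;;
        esac
    done
}

split_ondiscover 'makegocons|rspconfig:sshcfg|rspconfig:powersupplyredundancy=disabled'
```

For the value used above this prints three lines: makegocons with an empty option, then rspconfig with sshcfg, then rspconfig with powersupplyredundancy=disabled.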
Set the target osimage into the chain table to automatically provision the operating system after the node discovery is complete.
chdef cn1 -p chain="osimage=<osimage_name>"
For more information about chain, refer to Chain
Add cn1 into DNS:
makehosts cn1
makedns -n
Discover server and define¶
After the environment is ready and the server has power, the server discovery process can start. The first step is to discover the FSP/BMC of the server, which powers on automatically as soon as the physical server receives power.
Use the bmcdiscover command to discover the BMCs responding over an IP range and write the output into the xCAT database. This discovered BMC node is used to control the physical server during hardware discovery and will be deleted after the correct server node object is matched to a pre-defined node. You must use the -w option to write the output into the xCAT database.
To discover the BMC with an IP address range of 220.127.116.11-100:
bmcdiscover --range 18.104.22.168-100 -z -w
The discovered nodes are written to the xCAT database. The discovered BMC nodes are named in the form node-model_type-serial. To view the discovered nodes:
lsdef /node-.*
The bmcdiscover command uses the username/password from the passwd table entry corresponding to key=ipmi. To override these with a different username/password, use the -u and -p options of bmcdiscover.
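Because the discovered BMC node is named after the machine, its name can be worked out ahead of time. The sketch below builds the name from the machine type/model and serial in the example configuration, following the node-model_type-serial form. Lowercasing is an assumption inferred from the example node names in this document, and discovered_node_name is a hypothetical helper.

```shell
# Build the expected discovered-BMC node name from machine type/model and
# serial, following the node-model_type-serial form.
# Assumption: xCAT lowercases the name, as the example names here suggest.
discovered_node_name() {
    printf 'node-%s-%s\n' "$1" "$2" | tr '[:upper:]' '[:lower:]'
}

discovered_node_name 8247-22L 10112CA
```

The result can then be compared with the node names that bmcdiscover wrote to the database.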
Start discovery process¶
To start the discovery process, power on the PBMC node remotely with the following command. The discovery process starts automatically after the host is powered on:
rpower node-8247-22l-10112ca on
[Optional] To monitor the discovery process, use:
makegocons node-8247-22l-10112ca
rcons node-8247-22l-10112ca
Verify node definition¶
The following is an example of the server node definition after hardware discovery:
#lsdef cn1
Object name: cn1
    arch=ppc64
    bmc=22.214.171.124
    cons=ipmi
    cpucount=192
    cputype=POWER8E (raw), altivec supported
    groups=powerLE,all
    installnic=mac
    ip=10.0.101.1
    mac=6c:ae:8b:02:12:50
    memory=65118MB
    mgt=ipmi
    mtm=8247-22L
    netboot=petitboot
    postbootscripts=otherpkgs
    postscripts=syslog,remoteshell,syncfiles
    primarynic=mac
    serial=10112CA
    supportedarchs=ppc64
    switch=switch1
    switchport=0
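A quick way to confirm that discovery matched the right node is to check a few attributes in the lsdef output. The following sketch checks for exact key=value lines. It runs against a pasted excerpt of the definition above instead of a live lsdef call, and has_attr is a hypothetical helper, not an xCAT command.

```shell
# Succeed only if lsdef-style text on stdin contains the exact line
# "key=value" (leading whitespace ignored).
# Usage on a management node: lsdef cn1 | has_attr switch switch1
has_attr() {
    sed 's/^[[:space:]]*//' | grep -qxF "$1=$2"
}

# Excerpt of the example definition, pasted as sample input.
sample='Object name: cn1
    mac=6c:ae:8b:02:12:50
    mtm=8247-22L
    serial=10112CA
    switch=switch1
    switchport=0'

for pair in mtm=8247-22L serial=10112CA switch=switch1 switchport=0; do
    if printf '%s\n' "$sample" | has_attr "${pair%%=*}" "${pair#*=}"; then
        echo "ok: $pair"
    else
        echo "MISMATCH: $pair"
    fi
done
```

Matching whole lines means that switchport=0 is not mistaken for switchport=01 or a similar longer value.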