Replacing nodes nondisruptively: 2145-SV1

The following procedures describe how to nondisruptively replace most nodes with SAN Volume Controller 2145-SV1 nodes.

Before you begin

The replacement procedures are nondisruptive because no changes are required to your networking environment. The replacement 2145-SV1 node uses the same worldwide node name (WWNN) as the node that you are replacing. An alternative to this procedure is to replace nodes disruptively either by moving volumes to a new I/O group or by rezoning the SAN. However, the disruptive procedures require more work on the hosts.

Some system performance might be lost while the nodes are being replaced. Volumes that are managed by the I/O group that contains the node to be replaced become degraded when one of the nodes is shut down at the start of this procedure. System performance returns when both nodes are running and are accessing a new array of SSD or HDD drives in an expansion enclosure or in backend storage.

This task assumes that the following conditions are met. If any conditions are not met, do not continue this task unless you are instructed to do so by IBM® support.

  • The system must be running system software level 7.7.1 or higher to recognize the new 2145-SV1 node. Use the management GUI to display information about the system software level or enter the lssystem command.
  • If encryption is enabled on the system, a new encryption license must be installed on each new node before it can be added to the system. Use the management GUI to install the new license; see Activating encryption license for more information.
  • If you are replacing a 2145-CG8 node, that node must not contain any flash drives. Remove any flash drives from the storage pool and change the drive status to unused before you remove them from the 2145-CG8 node.
  • The replacement 2145-SV1 node must have at least as many Fibre Channel, Fibre Channel over Ethernet (FCoE), and Ethernet ports as the node that is being replaced.
  • All nodes that are configured in the system are present and online.
  • All errors in the system event log are addressed and marked as fixed.
  • No volumes, managed disks (MDisks), or external storage systems have a status of degraded or offline.
  • You backed up the system configuration and saved the svc.config.backup.xml file.
  • 2145-SV1 nodes support 4-port 16 Gbps Fibre Channel and 10 Gbps Ethernet adapters. 2145-SV1 can also support optional 2-port 25 Gbps Ethernet adapters (RoCE or iWARP) for iSCSI.
  • Set the Fibre Channel device driver on each Fibre Channel attached host to time out a missing fibre path in 3 seconds or less. If it is not practical to check the parameters of the Fibre Channel driver on each host, you must reboot the new 2145-SV1 node shortly after it is added to the system. The fibre paths to the host then stop long enough to ensure that they are recovered properly when the 2145-SV1 is active again.
    Tip: The timeout setting for the Emulex Fibre Channel device driver might default to 30 seconds, so it needs to be changed.
Important Notes:
  1. Review all of the following steps before you proceed with this task. If you are not familiar with the system environment or the tasks that are described, do not continue this procedure.
  2. Review the detailed information in Setting the Fibre Channel port mapping: 2145-SV1. You need to use this information to complete this task.
  3. Ensure the replacement 2145-SV1 node has at least as much RAM as the node that is being replaced.
  4. The node ID might change during this task; the node name might also change. After the system assigns the node ID, the ID cannot be changed. However, you can change the node name after this task is complete.
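
The port-count and RAM conditions above can be checked mechanically once the values for both nodes are written down. The following Python sketch is only an illustration: the NodeHardware record, its field names, and the sample numbers are hypothetical and are not produced by any system command.

```python
from dataclasses import dataclass


@dataclass
class NodeHardware:
    """Hypothetical record of the values that are compared before a replacement."""
    fc_ports: int
    fcoe_ports: int
    ethernet_ports: int
    ram_gb: int


def hardware_shortfalls(old: NodeHardware, new: NodeHardware) -> list[str]:
    """Return the fields where the replacement node has less than the original node."""
    fields = ("fc_ports", "fcoe_ports", "ethernet_ports", "ram_gb")
    return [name for name in fields if getattr(new, name) < getattr(old, name)]


# Illustrative values only; record the real values from your own hardware.
old_node = NodeHardware(fc_ports=4, fcoe_ports=0, ethernet_ports=3, ram_gb=24)
new_node = NodeHardware(fc_ports=8, fcoe_ports=0, ethernet_ports=4, ram_gb=64)

shortfalls = hardware_shortfalls(old_node, new_node)
if shortfalls:
    print("Do not continue; replacement node falls short on:", ", ".join(shortfalls))
else:
    print("Replacement node meets the port and RAM conditions.")
```

An empty list means that the replacement node has at least as many ports of each type and at least as much RAM as the node that it replaces.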

Procedure

  1. Confirm that the system is running software level 7.7.1 or later. If system software level 7.7.1 or later is not installed, the system software must be upgraded before you continue this procedure. You can use the management GUI to view and update the software level. For more information, see Updating the system.
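
If you script this check, the comparison against 7.7.1 should be numeric rather than a string comparison. The following Python sketch assumes that you already extracted the software level as text (for example, a value such as "7.8.1.5 (build ...)"); the function names and the sample strings are assumptions for illustration.

```python
import re

MINIMUM_LEVEL = (7, 7, 1)


def parse_level(text: str) -> tuple[int, ...]:
    """Extract the leading dotted level, for example '7.8.1.5 (build x)' -> (7, 8, 1, 5)."""
    match = re.match(r"\s*(\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError(f"no software level found in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(text: str, minimum: tuple[int, ...] = MINIMUM_LEVEL) -> bool:
    """Return True if the level in text is at or above the minimum level."""
    level = parse_level(text)
    # Pad both tuples with zeros so that '7.7' is compared as 7.7.0.
    width = max(len(level), len(minimum))
    padded_level = level + (0,) * (width - len(level))
    padded_minimum = minimum + (0,) * (width - len(minimum))
    return padded_level >= padded_minimum


# Hypothetical sample values.
for sample in ("7.6.1.8", "7.7.1.0", "8.1.0.2 (build 137.4)"):
    print(sample, "->", "OK" if meets_minimum(sample) else "update required")
```

Comparing tuples of integers avoids the pitfall of plain string comparison, where "7.10" would sort before "7.7".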

Collect important information about the node you are replacing

  1. Determine the ID, name, I/O group ID, I/O group name, and system configuration node status for the node that you want to replace.

    To determine this information, you can use the management GUI or complete the following steps.

    1. Issue the lsnode command from the command line interface.
      svcinfo lsnode -delim : 
      The system displays information about the nodes that are currently defined in the system.
    2. Record the information from the lsnode command output in Table 1. This information identifies the node, the I/O group to which it belongs, and iSCSI information.
      Tip: If one of the nodes that you want to replace is the system configuration node (config_node:yes), replace it last.
      Table 1. Configuration information about the nodes to be replaced
      lsnode command output lsnodevpd command output
      id name WWNN IO_group_id IO_group_name config_node iscsi_name front_panel_id
                     
                     
                     
                     
    3. Find the front panel ID of the node you want to replace. Use this ID to determine the physical location of the node. Issue the lsnodevpd command, where node_name or node_id is the name or ID of the node. (If you already know the physical location of the node that you want to replace, you can go to the next step.)
      lsnodevpd node_name_or_id
      The system displays detailed information about the node.
    4. Record the value in the front_panel_id column in Table 1.
  2. Confirm that no hosts have dependencies on the node that you are replacing. Use either the management GUI or issue the lsdependentvdisks command.
    • If you use the management GUI, select Monitoring > System. Right-click the node and select Show Dependent Volumes to display all the volumes that depend on the node.
    • If you use the command line interface, issue the lsdependentvdisks command to view dependent volumes. Specify the node parameter, where node_id_or_name is the ID or name of the node.
      lsdependentvdisks -node node_id_or_name
    1. If dependent volumes exist, determine whether the volumes are being used. If the volumes are being used, either restore the redundant configuration or suspend the host application.
    2. If a dependent quorum disk is reported, repair the access to the quorum disk or modify the quorum disk configuration.
  3. Issue the lsservicestatus command to display information about the FC ports of the node to be replaced.
    sainfo lsservicestatus
  4. Record the fc_io_port_id and fc_io_port_WWPN for each port in Table 2. This information is required to check the port mapping when you add the new node.
    Table 2. Information about the FC ports of the node to be replaced
    lsservicestatus command output
    fc_io_port_id fc_io_port_WWPN
       
       
       
       
  5. If Ethernet port IP addresses are configured on the system, enter the lsportip command to display the current settings so that they can be applied to the replacement nodes.
    lsportip -delim : 
    The system displays information about the Ethernet ports that are defined on the specified node.
  6. Record the information about the Ethernet ports on the node that you want to replace in Table 3.
    Table 3. Information about the Ethernet ports of the node to be replaced
    lsportip command output
    node_id node_name IP_address subnet_mask IP_address_6 prefix gateway_port_id
                 
                 
                 
                 
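
The information gathered in this section can also be captured with a small script. The following Python sketch parses delimiter-separated command output into records and orders the nodes so that the configuration node is replaced last, as the tip for Table 1 recommends. The sample output is a trimmed, hypothetical example with made-up WWNN values; real lsnode output has more columns. The sketch also assumes that no value contains the delimiter. iSCSI names can contain a colon, so choose a different -delim character if you include that column.

```python
def parse_delimited(output: str, delim: str = ":") -> list[dict[str, str]]:
    """Turn header-plus-rows output into a list of dictionaries keyed by column name."""
    lines = [line for line in output.strip().splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split(delim)
    rows = []
    for line in lines[1:]:
        values = line.split(delim)
        if len(values) != len(header):
            # A value probably contains the delimiter; fail loudly rather than misalign.
            raise ValueError(f"unexpected column count in line: {line!r}")
        rows.append(dict(zip(header, values)))
    return rows


def replacement_order(nodes: list[dict[str, str]]) -> list[dict[str, str]]:
    """Keep the listed order but move the configuration node to the end."""
    # sorted() is stable and False sorts before True.
    return sorted(nodes, key=lambda node: node["config_node"] == "yes")


# Hypothetical, trimmed sample in the style of 'lsnode -delim :'.
SAMPLE = """\
id:name:WWNN:IO_group_id:IO_group_name:config_node
1:node1:500507680100A1B2:0:io_grp0:yes
2:node2:500507680100C3D4:0:io_grp0:no
"""

for node in replacement_order(parse_delimited(SAMPLE)):
    print(node["name"], node["WWNN"], node["IO_group_name"], node["config_node"])
```

The same parse_delimited function can be pointed at other delimiter-separated output, such as the Ethernet port settings, provided that the no-delimiter-in-values assumption holds.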

Remove the node from the system

  1. Record and mark the order of the Fibre Channel or Ethernet cables with the node port number before you remove the cables from the back of the node.
    Important: Do not connect the replacement node to different ports on the switch or to a different switch.

    You must reconnect the cables in the exact order on the replacement node to avoid issues when the replacement node is added to the system. If the cables are not connected in the same order, the port IDs can change. If the port IDs change, the host system might not be able to access volumes. See the hardware documentation specific to your model to determine how the ports are numbered.

  2. If the node has 10 Gbps Ethernet IP addresses configured, note the current settings and then delete them by using the rmportip command.
    rmportip -node node_ID_or_name port_ID
  3. If encryption is active on the node you are replacing, enter the following command to deactivate the feature.
    deactivatefeature feature_id

    Issue the lsfeature command to determine the correct feature_id value. See Disabling encryption for more details.

  4. Issue the rmnode command to delete this node from the system and I/O group. The node_ID_or_name value identifies the node that you want to delete.
    rmnode node_ID_or_name
  5. Enter the lsnode command to ensure that the node is no longer a member of the system:
    lsnode
    The system displays a list of nodes. Before you continue to the next step, ensure that the removed node is not listed in the command output.
  6. Optional: If you want to use the removed node as a spare node, change the WWNN and iSCSI name of each node that you deleted to 1FFFF.
    • For a 2145-DH8 or 2145-SV1 node.
      1. Power on the node.
      2. Issue the following chvpd command.
        satask chvpd -wwnn FFFFFFFFFFFFFFFF
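
The check that follows the rmnode command (confirming that the removed node no longer appears in the lsnode output) can be sketched in Python as shown here. The sketch assumes that the output was captured with a -delim : option and contains id and name columns; the sample text is hypothetical.

```python
def node_still_listed(lsnode_output: str, node_id_or_name: str, delim: str = ":") -> bool:
    """Return True if the given ID or name appears in the id or name column."""
    lines = [line for line in lsnode_output.strip().splitlines() if line.strip()]
    if not lines:
        return False
    header = lines[0].split(delim)
    id_column = header.index("id")
    name_column = header.index("name")
    for line in lines[1:]:
        values = line.split(delim)
        if node_id_or_name in (values[id_column], values[name_column]):
            return True
    return False


# Hypothetical output captured after node1 was removed.
AFTER_REMOVAL = """\
id:name:IO_group_id:IO_group_name:config_node
2:node2:0:io_grp0:yes
"""

if node_still_listed(AFTER_REMOVAL, "node1"):
    print("node1 is still a member of the system; do not continue.")
else:
    print("node1 is no longer listed.")
```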

Prepare the replacement 2145-SV1 node

  1. Install the replacement node in the rack. For more information, see Installing the SAN Volume Controller 2145-SV1 hardware.
    Important: Do not connect the Fibre Channel or Ethernet cables during this step.
  2. Power on the replacement node.
  3. Use a CAT 5 Ethernet cable to directly attach a computer with a web browser to the technician port of the replacement node.
    1. If DHCP is configured on the computer, the installation GUI automatically displays when the new web page opens. For more information, see Technician port for node access.

      To access the service assistant GUI, select the wrench (spanner) icon in the installation GUI.

    2. If Secure Shell (SSH) software is installed on the computer, you can also access the command line interface at 192.168.0.1.

      You can then log on as superuser, where the default superuser password is passw0rd.

  4. Find the WWNN of the replacement 2145-SV1 node. This name can be reused by another 2145-SV1 node.

    To find the WWNN, use the service assistant GUI or enter the following command.

    sainfo lsservicestatus
  5. Assign the WWNN and a hardware location in the new 2145-SV1 node for each FC port that is defined on the node you are replacing.

    To do so, use the service assistant GUI or enter the appropriate chvpd command for the port mapping information.

    satask chvpd -wwnn wwnn -fcportmap AB-CD,AB-CD,AB-CD,AB-CD
    Note: You must create the port mapping before you can add the new node to the system. For more information, see Setting the Fibre Channel port mapping: 2145-SV1.
    When the command completes, the system creates the new port mappings on the replacement 2145-SV1 node. The node then reboots to apply the new settings.
  6. Attach the Fibre Channel and Ethernet cables to the replacement node.
  7. Verify that the last 5 characters of the WWNN are correct. To do so, use the management GUI or enter the lsnodecandidate command on the system command line.
    lsnodecandidate
  8. If encryption is active on the system, it must also be installed and active on the replacement node. To activate the feature, issue the following command, where key is the encryption license key.
    activatefeature -licensekey key

    If you do not activate the license on the new node, you will receive message CMMVC8784E.

  9. Enter the lsservicestatus command to verify that the fc_io_port_id and fc_io_port_WWPN on the 2145-SV1 node match the values that are recorded from the lsservicestatus output from the original node.
    sainfo lsservicestatus
    1. If there are differences, review Setting the Fibre Channel port mapping: 2145-SV1 and correct the mapping, as needed.
    2. If the values match, connect the Fibre Channel or Ethernet cables to the host adapters.
  10. Add the new 2145-SV1 replacement node to the system. You can use the management GUI or enter the addnode command, where WWNN and iogroup_name_or_id are the values that you recorded for the original node.
    addnode -wwnodename WWNN -iogrp iogroup_name_or_id

    Ensure that the new node has the same name as the original node and is in the same I/O group as the original node. Refer to the data that you recorded in Table 1.

    The system reassigns the 2145-SV1 node with the name that was used originally for the node that was replaced. If the original name of the node was automatically assigned by the system, it is not possible to reuse that name. If the name starts with node, it was automatically assigned. In this case, either specify a different name that does not start with node or do not use the name parameter so that the system automatically assigns a new name to the node.

    Important: Ensure that all other nodes in the cluster are running system software level 7.7.1 or later. Otherwise, the replacement 2145-SV1 node will not be recognized. For more information, see Updating the system.
  11. If Ethernet IP addresses were previously configured on the replaced node, configure the Ethernet ports on the new node to reuse those settings. Ethernet port IP addresses can be configured by using the management GUI or the cfgportip command. Specify the appropriate values that you noted in Table 3.
    • For IPv4 IP addresses
      cfgportip -node node_name_or_ID -ip IPv4_addr
      -mask subnet_mask -gw gateway port_ID
    • For IPv6 IP addresses
      cfgportip -node node_name_or_ID -ip_6 IPv6_addr
      -prefix_6 prefix -gw_6 gateway port_ID
    Important:
    1. Both nodes in the I/O group cache data; however, the cache sizes are asymmetric. The replacement node is limited by the cache size of the partner node in the I/O group. Therefore, it is possible that the replacement node does not use the full cache size until you replace the other node in the I/O group.
    2. You do not need to reconfigure the host multipathing device drivers because the replacement node uses the same WWNN and WWPN as the previous node. The multipathing device drivers detect the recovery of paths that are available to the replacement node.
    3. The host multipathing device drivers take approximately 30 minutes to recover the paths. Do not update the other node in the I/O group for at least 30 minutes after you successfully update the first node in the I/O group. If you have other nodes in different I/O groups to update, you can do those updates while you wait.
    4. If you are unable to check that the Fibre Channel device driver of every host is set to time out a Fibre Channel path in 3 seconds or less, reboot the new 2145-SV1 node now to ensure that the fibre path becomes active when the node becomes active again.
  12. Important: Ask the host administrator to query the paths on each host to ensure that all paths to the replacement node are active before you proceed to the next step. If you are using the IBM Multipath Subsystem Device Driver (SDD), issue the datapath query device command to query the paths. Documentation that is provided with your multipathing device driver shows how to query paths. Force the multipath driver to rescan for paths if the expected paths are not active.
  13. Repeat this procedure, from collecting information about the node through verifying the host paths, for each node that you replace.
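
The port comparison in this section (checking that each fc_io_port_id on the replacement node reports the same fc_io_port_WWPN that was recorded in Table 2) can be sketched in Python as follows. The dictionaries stand in for the values that you recorded by hand; the WWPN values shown are made up, and the function name is an assumption for illustration.

```python
from typing import Optional


def port_map_differences(
    original: dict[str, str], replacement: dict[str, str]
) -> dict[str, tuple[Optional[str], Optional[str]]]:
    """Return {port_id: (original_wwpn, replacement_wwpn)} for every port that differs."""

    def normalize(wwpn: Optional[str]) -> Optional[str]:
        # WWPNs are hexadecimal strings, so compare them without regard to case.
        return wwpn.lower() if wwpn is not None else None

    differences = {}
    for port_id in sorted(set(original) | set(replacement)):
        old_wwpn = original.get(port_id)
        new_wwpn = replacement.get(port_id)
        if normalize(old_wwpn) != normalize(new_wwpn):
            differences[port_id] = (old_wwpn, new_wwpn)
    return differences


# Hypothetical values: Table 2 for the original node, and the replacement node's report.
table_2 = {"1": "500507680140a1b2", "2": "500507680130a1b2"}
replacement_ports = {"1": "500507680140A1B2", "2": "500507680110A1B2"}

for port_id, (old_wwpn, new_wwpn) in port_map_differences(table_2, replacement_ports).items():
    print(f"port {port_id}: expected {old_wwpn}, found {new_wwpn}")
```

Any port that is reported here indicates that the port mapping needs to be reviewed before the node is added to the system. A port that is missing on either side is reported with None in place of the WWPN.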