PowerVM Remote Restart Enhancements

By HARIGANESH MURALIDHARAN posted Fri June 05, 2020 05:30 AM

  

Overview

PowerVM’s Remote Restart technology allows VMs to be restarted on another system when a Power system has failed.  This functionality requires PowerVM Enterprise Edition and accelerates recovery times by allowing VMs to be restarted on other systems without needing manual processes to recreate the VM definitions, VM states and storage configurations.  Remote Restart handles all this for you.  When deployed with PowerVC, you can initiate a restart of all VMs on a failed host with just a few clicks of the mouse.

 

PowerVM Remote Restart is not a replacement for PowerHA, which is an application clustering solution with a much faster recovery time objective than Remote Restart.  With Remote Restart, the VM is rebooted on the new server, the file systems are checked, and the application recovers just as it would after a power failure.  These steps take much longer than a PowerHA failover, which can typically be done in seconds.

 

Unlike Live Partition Mobility (LPM), Remote Restart restarts the partition on the target system, which results in downtime for the applications. LPM cannot be used in this situation because the source system is down.

 

There are two types of Remote Restart:

  • Legacy Remote Restart with Reserved Storage Device
    • Introduced in Power 7 and supported with HMC V8R8.1.0 and firmware 760 or later
  • Simplified Remote Restart
    • Introduced in Power 8 with firmware 820, HMC V8R8.2.0 and VIOS 2.2.3.4

This blog only provides details on the enhancements for the Simplified Remote Restart (SRR) function (unless explicitly specified as legacy remote restart).

Simplified Remote Restart Configuration

[Figure: SRR configuration]

Simplified Remote Restart Enhancements

The following enhancements are being introduced with HMC V8R8.5.0.  Remote Restart operations can be done via the CLI or REST API.

  • Remote MC Remote Restart
    • Source and target systems managed by different HMCs
  • Remote Restart with no connection to system ("dead host")
    • Complete server outage including the service processor.  One example is that the system has completely lost power.
  • Live Partition Mobility Override
    • Migrating SRR-capable partitions between Power 7 & Power 8
  • Manage Partition UI & Template
    • Templates for creating partition with SRR capability
    • Manage Partition (enhanced+ UI) to enable/disable SRR capability
  • Auto cleanup on source system after successful RR
  • User Specifications/Overrides
    • Shared Processor Pool
    • Virtual FC Mappings
  • Increase in number of concurrent Remote Restart operations to 32 per target system.

Remote MC Remote Restart

  • Source and target systems are managed by two different HMCs.  Both HMCs should be V8R8.5.0 or later.
  • Set up authentication between the managing HMCs before the remote restart
    • mkauthkeys --ip <target hmc ip/host name> -u <target hmc user name> --passwd <password>
  • CLI :
    • rrstartlpar -o validate -m <source system> -t <target system> -p <partition name> | --id <partition id> --ip <target hmc ip/host name> [-u <target hmc user name>]
    • rrstartlpar -o restart -m <source system> -t <target system> -p <partition name> | --id <partition id> --ip <target hmc ip/host name> [-u <target hmc user name>]
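The validate-then-restart sequence above can be sketched as a small dry-run helper. This is only an illustration: it prints the two rrstartlpar commands rather than running them, the function name and all argument values are hypothetical, and it assumes authentication between the HMCs has already been set up with mkauthkeys.

```shell
#!/bin/sh
# Dry-run sketch: print the HMC commands for a cross-HMC remote restart.
# Nothing is executed against an HMC; every value below is a placeholder.

# $1 source system, $2 target system, $3 partition name,
# $4 target HMC ip/host name, $5 target HMC user name
srr_remote_hmc_plan() {
    for op in validate restart; do
        printf 'rrstartlpar -o %s -m %s -t %s -p %s --ip %s -u %s\n' \
            "$op" "$1" "$2" "$3" "$4" "$5"
    done
}

# Example with made-up names
srr_remote_hmc_plan src-sys tgt-sys lpar01 hmc2.example.com hscroot
```

Printing the validate command first keeps the order explicit, so the restart is only issued once validation has been reviewed.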

Remote Restart without Service Processor Connection

  • System state as seen in HMC is “No Connection”
  • The source system must have been connected and in Operating state before the server outage occurred or the connection was lost.
  • A new override, --noconnection, is introduced.  Use it only after confirming that there is actually a server outage.
  • CLI :
    • rrstartlpar -o validate -m <source system> -t <target system> -p <partition name> | --id <partition id> --noconnection [--ip <target hmc ip/host name>] [-u <target hmc user name>]
    • rrstartlpar -o restart -m <source system> -t <target system> -p <partition name> | --id <partition id> --noconnection [--ip <target hmc ip/host name>] [-u <target hmc user name>]
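Because the override should only be used once the outage is confirmed, a wrapper can make that confirmation explicit. The sketch below is a dry run under stated assumptions: the function name and the "yes" confirmation argument are hypothetical, and the commands are printed, not executed.

```shell
#!/bin/sh
# Dry-run sketch: emit --noconnection commands only after an explicit confirmation.

# $1 operator confirmation ("yes" only once the server outage has been verified)
# $2 source system, $3 target system, $4 partition name
srr_dead_host_plan() {
    if [ "$1" != "yes" ]; then
        echo "refusing: confirm the source server outage first" >&2
        return 1
    fi
    for op in validate restart; do
        printf 'rrstartlpar -o %s -m %s -t %s -p %s --noconnection\n' \
            "$op" "$2" "$3" "$4"
    done
}

# Example with made-up names
srr_dead_host_plan yes src-sys tgt-sys lpar01
```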

LPM Override for SRR Partition Migration

  • Override to allow migration of SRR-capable partitions between P7 and P8 systems
  • For cross-HMC LPM operations, both HMCs need to be V8R8.5.0 or later if the override is specified
  • CLI :
    • migrlpar -o v -m <source system> -t <target system> -p <partition name> | --id <partition id> --requirerr 1|2
    • migrlpar -o m -m <source system> -t <target system> -p <partition name> | --id <partition id> --requirerr 1|2
    • --requirerr values: 1 = Yes, 2 = If possible

SRR Partition Migration Override Options

| Source system partition SRR capable | Target system supports SRR | LPM Override | LPM Success | Partition SRR capable on target |
|---|---|---|---|---|
| Yes | Yes | Yes/If Possible | Yes | Yes |
| Yes | No | Yes/Override not specified | No | NA |
| Yes | No | If Possible | Yes | No |
| No | Yes | Yes/If Possible | Yes | Yes |
| No | No | Yes | No | NA |
| No | No | If Possible/Override not specified | Yes | No |
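The override options listed above can be read as a lookup. The following sketch encodes only the six listed combinations; the function name, the yes/no inputs and the output strings are hypothetical, and combinations that are not listed are reported as such instead of being guessed.

```shell
#!/bin/sh
# Sketch: look up the LPM outcome for an SRR partition from the listed combinations.

# $1 source partition SRR capable: yes | no
# $2 target system supports SRR:   yes | no
# $3 --requirerr value: 1 (Yes) | 2 (If possible) | none (override not specified)
lpm_srr_outcome() {
    case "$1/$2/$3" in
        yes/yes/1|yes/yes/2) echo "lpm=success srr_on_target=yes" ;;
        yes/no/1|yes/no/none) echo "lpm=fail srr_on_target=na" ;;
        yes/no/2)            echo "lpm=success srr_on_target=no" ;;
        no/yes/1|no/yes/2)   echo "lpm=success srr_on_target=yes" ;;
        no/no/1)             echo "lpm=fail srr_on_target=na" ;;
        no/no/2|no/no/none)  echo "lpm=success srr_on_target=no" ;;
        *)                   echo "combination not listed" ;;
    esac
}

# Example: SRR-capable partition, target without SRR support, "If possible"
lpm_srr_outcome yes no 2
```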

Manage Partition GUI

  • SRR capability can be enabled/disabled when partition is not active
    • If system supports SRR, only the Simplified RR option is shown in the UI even if the partition is enabled with legacy Remote Restart.
  • Remote Restart State is displayed
  • Option to refresh configuration data stored for SRR

Partition Templates

  1. Starter/pre-defined partition template with Simplified Remote Restart enabled
  2. Capture a partition enabled with SRR as a template
  3. Deploy partition with SRR capability from templates
    • Enabled
      • Partition is deployed with SRR capability if system supports Simplified RR
      • Template Deploy fails if system doesn’t support Simplified RR
    • Disabled
      • Partition is deployed without SRR capability
    • Enable If Possible
      • Partition is deployed with SRR capability if system supports SRR.
      • Partition is deployed without SRR capability if system doesn’t support SRR
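The three template settings map to deploy outcomes as follows. This is a sketch of the rules as listed, with hypothetical names for the function, the setting values and the result strings.

```shell
#!/bin/sh
# Sketch: expected template deploy result for each SRR setting.

# $1 template SRR setting: enabled | disabled | if_possible
# $2 target system supports Simplified RR: yes | no
template_deploy_outcome() {
    case "$1/$2" in
        enabled/yes|if_possible/yes)             echo "deployed with SRR" ;;
        enabled/no)                              echo "deploy fails" ;;
        disabled/yes|disabled/no|if_possible/no) echo "deployed without SRR" ;;
        *) echo "unknown input" >&2; return 1 ;;
    esac
}

# Example: "Enable If Possible" on a system without SRR support
template_deploy_outcome if_possible no
```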

Auto Cleanup

  • Auto Cleanup of a remote restarted partition on source system is performed when
    • Source system state comes back to operating state
    • Partition remote restart status is “Remote Restarted”
    • RMC for the VIOS partitions serving the clients is active
  • Auto Cleanup runs without the force option.  If any step fails, for example the server adapter for a client adapter is not found or the RMC command sent to the VIOS fails, the cleanup fails and the LPAR is left on the source system with a remote restart status of “Source Side Cleanup Failed”.
  • The user can trigger a manual cleanup using the rrstartlpar command
  • When PowerVC is used to orchestrate Remote Restart
    • Auto cleanup can be disabled (by default, auto cleanup is enabled)
    • The setting is maintained across upgrades, but not on a fresh install.
    • CLI :
      • rrstartlpar -o set -r mc -i "auto_cleanup_enabled=0|1"
      • lsrrstartlpar -r mc
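The three auto cleanup conditions and the setting command can be sketched together. Everything here is illustrative: the function names are hypothetical, the state strings are taken from the wording above and may not match what the HMC reports verbatim, and 0 is presumed to mean disabled.

```shell
#!/bin/sh
# Sketch: auto cleanup preconditions and a dry-run builder for the setting command.

# $1 source system state, $2 partition remote restart status,
# $3 RMC state of the serving VIOS partitions (active | inactive)
auto_cleanup_would_run() {
    [ "$1" = "Operating" ] && [ "$2" = "Remote Restarted" ] && [ "$3" = "active" ]
}

# $1 desired value for auto_cleanup_enabled: 0 | 1
auto_cleanup_setting_cmd() {
    case "$1" in
        0|1) printf 'rrstartlpar -o set -r mc -i "auto_cleanup_enabled=%s"\n' "$1" ;;
        *)   echo "value must be 0 or 1" >&2; return 1 ;;
    esac
}

# Example with made-up states
if auto_cleanup_would_run "Operating" "Remote Restarted" "active"; then
    echo "auto cleanup expected"
fi
auto_cleanup_setting_cmd 0
```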

User Overrides/Specifications

  • User can specify the Shared Processor Pool to be used on the target system
  • User can specify the Virtual FC Mappings
    • For example: target VIOS LPAR, target VIOS server adapter slot number, target FC physical port
  • CLI :
    • rrstartlpar -o validate -m <source system> -t <target system> -p <partition name> | --id <partition id> -i "shared_proc_pool_id=<target spp id>|shared_proc_pool_name=<target spp name>" [--ip <IP Address>] [-u <user id>]
    • rrstartlpar -o restart -m <source system> -t <target system> -p <partition name> | --id <partition id> -i "shared_proc_pool_id=<target spp id>|shared_proc_pool_name=<target spp name>" [--ip <IP Address>] [-u <user id>]
    • rrstartlpar -o validate -m <source system> -t <target system> -p <partition name> | --id <partition id> [--ip <IP Address>] [-u <user id>] -i|-f "virtual_fc_mappings=slot_num/vios_lpar_name/vios_lpar_id/[vios_slot_num]/[vios_fc_port_name]"
    • rrstartlpar -o restart -m <source system> -t <target system> -p <partition name> | --id <partition id> [--ip <IP Address>] [-u <user id>] -i|-f "virtual_fc_mappings=slot_num/vios_lpar_name/vios_lpar_id/[vios_slot_num]/[vios_fc_port_name]"
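Assembling the virtual_fc_mappings value by hand is error-prone, so a small helper may be useful. The sketch below is a dry run with hypothetical function names and values; it handles a single mapping only and assumes optional fields may be left empty with the slashes retained, as the bracket notation above suggests.

```shell
#!/bin/sh
# Dry-run sketch: build one virtual_fc_mappings value and print a validate command.

# $1 slot_num, $2 vios_lpar_name, $3 vios_lpar_id,
# $4 vios_slot_num (optional), $5 vios_fc_port_name (optional)
vfc_mapping() {
    printf '%s/%s/%s/%s/%s\n' "$1" "$2" "$3" "${4:-}" "${5:-}"
}

# $1 source system, $2 target system, $3 partition name, $4 mapping value
srr_validate_with_vfc() {
    printf 'rrstartlpar -o validate -m %s -t %s -p %s -i "virtual_fc_mappings=%s"\n' \
        "$1" "$2" "$3" "$4"
}

# Example with made-up names
mapping=$(vfc_mapping 3 vios1 1 12 fcs0)
srr_validate_with_vfc src-sys tgt-sys lpar01 "$mapping"
```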

Increase in number of concurrent Remote Restart

  • Number of concurrent remote restart operations supported per destination system is increased to 32
  • Command to list the system-level and LPAR-level remote restart details
  • CLI :
    • lsrrstartlpar -r sys | lpar
    • lsrrstartlpar -r sys -m <system name>
    • lsrrstartlpar -r lpar -m <system name>
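When scripting many restarts toward one target, the limit of 32 concurrent operations can be checked before issuing the next one. This is a minimal sketch: the function and variable names are hypothetical, and the count of operations in progress is passed in as an input because the output format of lsrrstartlpar is not described here.

```shell
#!/bin/sh
# Sketch: how many more remote restarts can be started toward one target system.

SRR_MAX_CONCURRENT=32   # per-target limit stated above

# $1 number of remote restart operations currently in progress on the target
srr_remaining_slots() {
    remaining=$((SRR_MAX_CONCURRENT - $1))
    if [ "$remaining" -lt 0 ]; then
        remaining=0
    fi
    echo "$remaining"
}

# Example: 30 operations already running on the target
srr_remaining_slots 30
```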

Remote Restart Feature Support/Matrix

| Function | HMC Level | Firmware Level | VIOS Level | PowerVC Level |
|---|---|---|---|---|
| HMC CLI for RR | V8R8.1.0 or later | FW760.00 or later | 2.2.2.0 or later | NA |
| Toggle RR with Reserved Storage Device | V8R8.1.0 or later | FW810.00 or later | NA | NA |
| Simplified Remote Restart | V8R8.2.0 or later | FW820.00 or later | 2.2.3.4 or later | 1.2.3 or later |
| SRR with Shared Storage Pool Storage | V8R8.4.0 or later | FW820.00 or later | 2.2.4.0 or later | 1.3.1 or later |
| SRR Enhancements (refer to the list mentioned above) | V8R8.5.0 or later | FW820.00 or later | 2.2.3.4 or 2.2.4.0 or later | NA or Not yet supported |

Contacting the PowerVM Team

Have questions for the PowerVM team or want to learn more?  Follow our discussion group on LinkedIn IBM PowerVM or IBM Community Discussions


