Protocols support overview: Integration of protocol access methods with GPFS
IBM Spectrum Scale™ provides additional protocol access methods. Providing these additional file and object access methods and integrating them with GPFS™ offers several benefits: it enables users to consolidate various sources of data efficiently in one global namespace, it provides a unified data management solution, and it enables efficient space utilization while avoiding unnecessary data moves simply because access methods differ.
The additional protocol access methods that are integrated with GPFS are file access using NFS and SMB, and object access using OpenStack Swift. While each of these server functions (NFS, SMB, and Object) uses open source technologies, this integration adds value by providing scaling and high availability through the clustering technology in GPFS.
The CES BLOCK service provides iSCSI protocol support. You can export a file resident in a GPFS™ file system as a virtual iSCSI volume to an iSCSI initiator by using the iSCSI protocol.
- The integration of file and object serving with GPFS makes it possible to create NFS exports, SMB shares, and OpenStack Swift containers that have data in GPFS file systems, for access by client systems that do not run GPFS.
- Some nodes (at least two are recommended) in the GPFS cluster must be designated as protocol nodes (also called CES nodes), from which (non-GPFS) clients can access data that resides in, and is managed by, GPFS by using the appropriate protocol artifacts (exports, shares, or containers).
- The protocol nodes need to have GPFS server license designations.
- The protocol nodes must be configured with external network addresses that are used to access the protocol artifacts from clients. These (external) network addresses are different from the GPFS cluster addresses that were used to add the protocol nodes to the GPFS cluster.
- The integration allows the artifacts to be accessed from any of the protocol nodes through the configured network addresses. It also allows network addresses that are associated with a protocol node to fail over to other protocol nodes when that protocol node fails.
All the protocol nodes must run a supported operating system, and the protocol nodes must all be Power® (in either big or little endian mode) or all Intel, although the other nodes in the GPFS cluster can be on other platforms and operating systems.
For information about supported operating systems for protocol nodes and their required minimum kernel levels, see IBM Spectrum Scale FAQ in IBM® Knowledge Center.
Like GPFS, the protocol serving functionality is also delivered only as software.
- The intent of the functionality is to provide access to data managed by GPFS through additional access methods.
- While the protocol function provides several aspects of NAS file serving, the delivery is not a NAS appliance. In other words, the GPFS-style command line interface, which requires root access, is still available, and from an administrative management perspective it is not like an appliance.
- Role-based access control of the command line interface is not offered.
- Further, the types of workloads suited for this delivery continue to be workloads that require the scaling and consolidation aspects that are associated with traditional GPFS.
Note: Some NAS workloads might not be suited for this delivery in the current release, for instance, extensive use of snapshots or support for a large number of SMB users.
For more information, see IBM Spectrum Scale FAQ in IBM Knowledge Center.
Along with the protocol-serving function, the delivery includes the installation toolkit as well as some performance monitoring infrastructure.
- The GPFS code, including the server function for the three protocols (NFS, SMB, and Object) along with the installation toolkit and performance monitoring infrastructure, is delivered as a self-extracting archive package, just like traditional GPFS.
- The use of the protocol server function requires additional licenses that must be accepted. A GPFS package without protocols continues to be provided for users who do not want to accept these additional license terms.
Several commands are introduced or enhanced to enable the use of the described functions.
- The commands for managing these functions include spectrumscale, mmces, mmuserauth, mmnfs, mmsmb, mmobj, and mmperfmon.
- In addition, mmdumpperfdata and mmprotocoltrace are provided to help with data collection and tracing.
- Existing GPFS commands that are expanded with options for protocols include mmlscluster, mmchnode, and mmchconfig.
- The gpfs.snap command is enhanced to gather data about the protocols to help with problem determination.
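As a sketch of how these commands fit together, the following sequence enables two protocol services and creates example exports. The file system path (/gpfs/fs1/projects), share name, and client subnet are assumptions for illustration; substitute values from your own environment.

```shell
# Enable the NFS and SMB services on the CES nodes.
mmces service enable NFS
mmces service enable SMB

# Create an NFS export of a GPFS directory for a client subnet.
mmnfs export add /gpfs/fs1/projects --client "198.51.100.0/24(Access_Type=RW)"

# Create an SMB share over the same GPFS directory.
mmsmb export add projects /gpfs/fs1/projects

# List the configured exports and shares.
mmnfs export list
mmsmb export list
```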
For information on the use of CES including administering and managing the protocols, see Implementing Cluster Export Services.
IBM Spectrum Scale enables you to build a data ocean solution to eliminate silos, improve infrastructure utilization, and automate data migration to the best location or tier of storage anywhere in the world. You can start small with just a few commodity servers fronting commodity storage devices and then grow to a data lake architecture or even an ocean of data. IBM Spectrum Scale is a proven solution in some of the most demanding environments, with massive storage capacity under a single global namespace. Furthermore, your data ocean can store either files or objects, and you can run analytics on the data in place, which means that there is no need to copy the data to run your jobs.
The installation toolkit is provided to help with the installation and configuration of GPFS as well as protocols.
- While it is designed for a user who might not be familiar with GPFS, it can help ease the installation and configuration process of protocols even for experienced GPFS administrators.
- The installation toolkit can help with prechecks to validate the environment, with distribution of the RPMs from one node to the other nodes, and with multiple GPFS administrative steps. The spectrumscale deploy command can be used to configure protocols on an existing GPFS cluster with an existing GPFS file system.
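A deployment on an existing cluster might look like the following sketch. The node names, file system name (fs1), and mount point are placeholders, and the exact set of subcommands should be checked against the installation toolkit documentation for your release.

```shell
# Tell the toolkit which nodes will serve protocols (-p = protocol node).
./spectrumscale node add prt001 -p
./spectrumscale node add prt002 -p

# Point the toolkit at the existing file system for protocol data.
./spectrumscale config protocols -f fs1 -m /gpfs/fs1

# Choose which protocol services to enable.
./spectrumscale enable nfs
./spectrumscale enable smb

# Validate the environment first, then configure the protocols.
./spectrumscale deploy --precheck
./spectrumscale deploy
```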
In addition to the installation toolkit, IBM Spectrum Scale also includes a performance monitoring toolkit. Sensors to collect performance information are installed on all protocol nodes, and one of these nodes is designated as a collector node. The mmperfmon query command can be used to view the performance counters that have been collected.
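For example, collected counters can be inspected as follows. The metric names shown are assumptions for illustration; the set of available metrics and predefined queries depends on the installed sensors and can be listed with the command itself.

```shell
# Show the last 10 samples of an example CPU metric.
mmperfmon query cpu_user -n 10

# Show an example usage query aggregated into 60-second buckets.
mmperfmon query usage -b 60 -n 5
```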
The mmhealth command can be used to monitor the health of the node and the services hosted on that node.
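A brief sketch of health monitoring with mmhealth; these invocations assume a node that is already part of a CES-enabled cluster.

```shell
# Overall state of the local node and the services it hosts.
mmhealth node show

# Detailed state of a single service on this node, for example NFS.
mmhealth node show NFS -v

# Cluster-wide health summary across all nodes.
mmhealth cluster show
```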
- At the time of release, several features in GPFS are not explicitly tested with the protocol functionality. These include Local Read Only Cache, Multicluster, Encryption, File Placement Optimizer, and Hierarchical Storage Management. These features are expected to work with protocols and will be tested with protocols over time. However, if you use one of these features before IBM claims support for it, ensure that it is tested with the expected workloads before you put it into production.
- Use of Clustered NFS (CNFS) is not compatible with use of Cluster Export Services (CES); you must choose one or the other. CNFS (which uses kernel NFS, a different NFS stack than the one used by the CES infrastructure) continues to be supported; however, if you choose CNFS, you cannot take advantage of the integrated SMB and Object server functionality. If you choose to migrate from CNFS to CES, note that the CES infrastructure does not support a complete equivalent of the CNFS group feature to control failover of IP addresses.
- For information about specific limitations of protocols and their integration with GPFS, see the IBM Spectrum Scale FAQ and the IBM Spectrum Scale documentation in IBM Knowledge Center.