AI on IBM Z & IBM LinuxONE

Accelerating TensorFlow Inference on IBM z16™ with IBM-zDNN-Plugin


IBM-zDNN-Plugin

Today we are excited to announce the release of IBM-zDNN-Plugin, a device plugin for TensorFlow that enables accelerated inferencing on IBM z16. The release is available on the Python Package Index (PyPI) as ibm-zdnn-plugin, and a tutorial is available at https://github.com/IBM/ibm-zdnn-plugin.


Accelerated Model Inference with IBM z16™

With IBM z16, IBM has brought the Telum processor to market. First announced in 2021, IBM Telum features a dedicated on-chip AI accelerator focused on delivering high-speed, real-time inferencing at scale. This accelerator is designed to speed up the compute-intensive operations commonly found in deep learning models. You can read more about the Integrated Accelerator for AI in our blog and in our recent IBM z16 announcement letter.

While the IBM Integrated Accelerator for AI is clearly an amazing technology, the story would be incomplete without enabling the AI software ecosystem to utilize it.

The TensorFlow community proposed the PluggableDevice architecture, a plug-in mechanism that lets new devices be registered with TensorFlow without any changes to the TensorFlow code, so that accelerators and TensorFlow integrate seamlessly. Building on that architecture, we developed IBM-zDNN-Plugin, a high-performance device plugin for TensorFlow that enables accelerated deployment of TensorFlow models on IBM z16 hardware. The plugin leverages the IBM z Deep Neural Network (zDNN) library, a set of primitives that support deep neural networks and transparently target the IBM Integrated Accelerator for AI on IBM z16 hardware.
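As a quick illustration of the PluggableDevice mechanism: once ibm-zdnn-plugin is installed, TensorFlow discovers and registers the plugin's device at import time, and listing the visible devices is one way to confirm that registration. This is a minimal sketch; the exact device-type string the plugin reports is not stated here, so consult the plugin's documentation for the name to look for.

```python
import tensorflow as tf

# With ibm-zdnn-plugin installed, TensorFlow loads the plugin at import
# time and registers its device alongside the CPU. Listing physical
# devices shows every device type TensorFlow can place operations on.
for dev in tf.config.list_physical_devices():
    print(dev.device_type)
```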

With IBM-zDNN-Plugin, we are enabling one of the most popular open-source frameworks for AI to leverage the IBM z16 Integrated Accelerator for AI. TensorFlow is an end-to-end open AI platform with a rich and robust ecosystem. We have observed strong adoption of TensorFlow in our enterprise clients due to the wide range of capabilities it provides - in particular, the focus on deployment of AI assets.

On IBM zSystems and LinuxONE, TensorFlow has the same ‘look and feel’ as any other platform. You can continue to build and train your TensorFlow models on the platform of your choice – whether x86, cloud, or IBM zSystems. TensorFlow models trained on other platforms are portable to zSystems.  

With IBM-zDNN-Plugin, you can now bring TensorFlow models to IBM z16 and exploit the Integrated Accelerator for AI - with no changes to the model. IBM-zDNN-Plugin detects the operations in your model that are supported by the Integrated Accelerator for AI and transparently runs them on the device.
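One way to observe this transparent targeting is TensorFlow's device-placement logging, which prints the device each operation runs on. The sketch below uses only standard TensorFlow APIs; which operations actually move to the accelerator depends on your model and on the plugin version, so treat the matmul here as an illustrative example rather than a guarantee.

```python
import tensorflow as tf

# Log the device each op is placed on. On IBM z16 with ibm-zdnn-plugin
# installed, supported ops appear on the plugin's device in the log
# instead of the CPU; on other systems everything stays on the CPU.
tf.debugging.set_log_device_placement(True)

x = tf.random.uniform((2, 3))
w = tf.random.uniform((3, 4))
y = tf.matmul(x, w)  # a compute-intensive op of the kind the accelerator supports
print(y.shape)  # (2, 4)
```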


Model Usage

IBM-zDNN-Plugin enables IBM's strategy of training anywhere and deploying on IBM Z. We’ve provided detailed documentation on deployment, model validation, execution on the Integrated Accelerator for AI, modifying default execution paths, and more.

Details are in the document, but the general procedure is as follows:

  • Build and train the TensorFlow model using the platform of your choice.
  • Install TensorFlow 2.9 and IBM z Deep Neural Network Library (zDNN >= 0.4.0).
  • Install IBM-zDNN-Plugin from The Python Package Index (PyPI).
  • On an IBM z16 system, TensorFlow will then transparently target the Integrated Accelerator for AI for several compute-intensive operations during inferencing, with no changes needed to your TensorFlow models.
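The steps above can be sketched end to end. The toy model below stands in for a real trained model (in practice you would load one, e.g. with tf.keras.models.load_model); nothing in the script is plugin-specific, which is the point - no model changes are needed for the accelerator to be used.

```python
import numpy as np
import tensorflow as tf

# A tiny stand-in model; in practice you would load a model trained on
# any platform (x86, cloud, or IBM zSystems) instead of building one.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(8,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# No model changes are required: on IBM z16 with ibm-zdnn-plugin
# installed, supported operations are routed to the Integrated
# Accelerator for AI transparently during inference.
batch = np.random.rand(4, 8).astype(np.float32)
scores = model.predict(batch)
print(scores.shape)  # (4, 1)
```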

We’ve provided sample scripts and a detailed tutorial that includes download and setup instructions. For samples and tutorials, please visit the GitHub repository at https://github.com/IBM/ibm-zdnn-plugin.

 
Model Validation

We have found that the following AI models, among others, benefit significantly from leveraging the IBM Integrated Accelerator for AI on IBM z16 for inferencing.

  • BERT
  • Biomedical Image Segmentation
  • Credit Card Fraud
  • DenseNet121
  • DenseNet169
  • DenseNet201
  • InceptionResNet
  • InceptionV3
  • NASNetLarge
  • NMT
  • ResNet101
  • ResNet152
  • ResNet50
  • VGG16
  • VGG19
  • Xception
  • YOLOV3
  • YOLOv4

Many of these models can be found here.


Becoming a sponsor user

Sponsor users are active participants who work alongside our product teams to help refine IBM products by providing feedback, ideas, and domain expertise. As a sponsor user, you will take part in research studies that help our team understand your current approaches to AI so we can build future products that benefit you.

If you have questions on getting started with AI on IBM Z, reach out to us at aionz@us.ibm.com!