InfoSphere Streams is an advanced computing platform that can quickly ingest, analyze and correlate information as it arrives from thousands of real-time sources.
IT professionals are being asked to do more with less, and highly skilled resources are in demand. As streaming applications play a growing role in critical systems, the need for simplicity grows with them. InfoSphere Streams empowers IT users of all types and skill levels to gain deeper insight into operations and performance. In today’s engaged world, a five-minute delay means business goes elsewhere. A new administration console, a Java Management Extensions (JMX) management and monitoring application programming interface (API), simpler security, and adoption of Apache ZooKeeper are now available in InfoSphere Streams.
This presentation is an introduction to InfoSphere Streams. First, we position current market challenges in the area of big data. Then we discuss how context-aware stream computing from IBM InfoSphere Streams addresses these challenges. Finally, we present how InfoSphere Streams provides unique value across a range of industries. You can get started now with our InfoSphere Streams Quick Start program and new open source project.
Quick Start: http://www-01.ibm.com/software/data/infosphere/streams/quick-start/
Open Source: https://github.com/IBMStreams
Clients need to move from data management to action based on real-time insight. Speed isn’t just about how fast data is produced or changed, but also about the speed at which data must be received, understood, and processed. This presentation will outline how to harness fast-moving data inside and outside of your organization.
Your organization needs to shift from management of data to action. Organizations should:
Select valuable data and insights to be stored for further processing
Process and analyze perishable data to take real-time action
Harness and process streaming data such as video, acoustic, thermal, geospatial or sensors
InfoSphere Streams is a development platform using a scale-out architecture. It includes comprehensive tools for development and management of the environment. The development environment also includes a set of toolkits that provide high-level functionality to accelerate development of solutions.
Since InfoSphere Streams processes data in memory, it has high velocity: it can respond to events in microseconds (a microsecond is 1/1000 of a millisecond). It is orders of magnitude faster than databases, which must first store data on disk drives. InfoSphere Streams can analyze and correlate any type of data (variety): audio, video, network logs, sensors, and social media such as Twitter, in addition to structured data. InfoSphere Streams is designed to scale to process any volume of data, from terabytes to zettabytes per day. InfoSphere Streams can run a large variety of analytics, from historic analysis like data mining to predictive analytics and custom analytics such as image analysis and voice recognition. InfoSphere Streams also provides tremendous agility. With the ability to dynamically add new applications that can tap into existing data streams and applications, businesses can respond more quickly to a changing world.
What is InfoSphere Streams?
Platform: InfoSphere Streams is not a solution or application, nor is it a limited-purpose tool. Instead, it is a platform. It comes with the tools, language, and building blocks that let you build programs for it, and with a runtime environment that lets you run those programs.
Real-time: The InfoSphere Streams programs you create do their processing and analysis in as close to real-time as it is possible to get on a standard IT platform. In this case, real-time means very low latency, where latency is the delay from the time a packet of data arrives to the time the result is available. A key factor here is that InfoSphere Streams does everything in memory; it has no concept of mass storage (disk).
Analytics: Because InfoSphere Streams is fast, scalable, and programmable, the kinds of analysis you can apply range from the simple to the extremely sophisticated. You are not limited to simple averages or if-then-else rules.
BIG data: Actually, make that infinite data. For purposes of program and algorithm design, streaming data has no beginning and no end and is therefore by definition infinite in volume. In practical terms, this means that InfoSphere Streams can process any kind of data feed, including those that would be much too slow or expensive to capture and store in their entirety.
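To make the "infinite data" point concrete, here is a minimal sketch in plain Python (not SPL and not the InfoSphere Streams API). It assumes a hypothetical endless source, `sensor_readings`, and computes a sliding-window average while holding only the last few values in memory, so the analysis never needs the whole stream to be stored.

```python
from collections import deque
from itertools import count, islice


def sensor_readings():
    """Hypothetical unbounded source: yields 0.0, 1.0, 2.0, ... forever."""
    for i in count():
        yield float(i)


def sliding_average(stream, window_size):
    """Yield the mean of the most recent `window_size` values on each arrival.

    Only `window_size` values are ever held in memory, regardless of how
    long the stream runs.
    """
    window = deque(maxlen=window_size)
    total = 0.0
    for value in stream:
        if len(window) == window.maxlen:
            total -= window[0]  # oldest value is about to be evicted
        window.append(value)
        total += value
        yield total / len(window)


# The stream never ends, so we take only the first five results here.
first_five = list(islice(sliding_average(sensor_readings(), window_size=3), 5))
print(first_five)
```

The window size and the source are illustrative assumptions; a real application would read from a live feed and choose a window that fits the analysis.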
This chart provides a very detailed architecture for the smart grid. InfoSphere Streams fits into this picture by analyzing high volume, high velocity data; it acts as a pre-processing filter to various landing zones.
IBM offers your choice of deployment methods: on premises or on cloud.
Stream computing in the cloud allows organizations to tap into data in motion easily and cost-effectively. The shift toward cloud computing is also a response to the realization that big data and analytics must take a more central role in today’s businesses, becoming an engine that helps drive the business forward.
Organizations need to transition from passive, siloed “systems of record” designed around discrete pieces of information to “systems of engagement,” which are more decentralized, incorporate technologies that encourage peer interactions, and often leverage cloud technologies to enable those interactions. To accomplish this transition to systems of engagement, integrate their systems and support enhanced collaboration, companies need to deploy appropriate platform technologies, and cloud-based formats are an ideal fit.
Organizations need optimized analytics to address unique industry requirements at the right time, such as the holiday season for retailers, weather conditions for insurers, and marketing campaigns for telcos. The InfoSphere Streams cloud offerings make this possible, while also delivering faster and simplified systems management for the developer. Tap into data in motion easily without massive investments in infrastructure or additional staff time. Scale up or down quickly and easily to meet demand.
IBM Geospatial Analytics on Bluemix - Leverage real-time geospatial analytics to monitor IoT/devices
Expand applications to include analytics on fast moving, high volume, streaming geospatial data without huge investments
Both cloud solutions enable organizations to:
Prevent data overload
Respond in real time to business requirements, speed decisions and get ahead faster
Make better decisions by mining all ubiquitous data sources and fusing data
Continuously adapt to changes
Improve developer agility to respond to all data
IBM Streams on IBM Cloud marketplace - Deploy real-time analytics on cloud to prevent data overload and lower development, storage and administrative costs
Respond to business requirements with real-time analytics
Deploy real-time analytics with flexible cloud deployment options
With cloud-based real-time analytic offerings, IBM® InfoSphere® Streams allows you to quickly ingest, analyze and correlate information as it arrives from thousands of sources without the burden of managing all infrastructure operations in house. It enables you to rapidly develop real-time analytic applications in the cloud and respond quickly to changing business environments by analyzing larger volumes of data more cost-effectively. And it’s fast. The era of big data requires sub-millisecond response times and extremely high throughput rates to enable insight and action on millions of events or messages per second. InfoSphere Streams is able to pull any amount of data, from any number of sources, scaling up and down as needed. It also enables a breadth of deep analytics including text, geospatial, sensor, video and more.
Context-aware stream computing is a different paradigm – the left shows the traditional way data is accessed using queries to pull the data from a data storage device such as a data warehouse or database – which is still valid for many requirements.
The new context-aware stream computing paradigm brings data to the query – data is pushed or flows through the analytics.
Common drivers for those new use cases include:
When you need an immediate response/action and persisting and analyzing stored data isn’t fast enough.
When it is too expensive to store the data to be analyzed, e.g. most of it is throw-away and it's more efficient to analyze/filter as you receive it and store only the filtered results.
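The contrast between the two paradigms can be sketched in a few lines of plain Python. This is an illustration only, not InfoSphere Streams code; the event shape, the `temp` field, and the threshold are hypothetical. Both functions find the same alerts, but the first stores every event before querying, while the second filters each event as it arrives and keeps only the results.

```python
def store_then_query(events, threshold):
    """Traditional pull model: persist everything, then query the store."""
    stored = list(events)  # every event is kept before any analysis
    alerts = [e for e in stored if e["temp"] > threshold]
    return alerts, len(stored)


def analyze_in_motion(events, threshold):
    """Push model: each event flows through the filter as it arrives."""
    kept = []
    for event in events:
        if event["temp"] > threshold:
            kept.append(event)  # only filtered results are stored
    return kept, len(kept)


# Hypothetical temperature events, most of which are throw-away.
events = [{"id": i, "temp": t} for i, t in enumerate([20, 95, 22, 101, 19])]

pull_alerts, pull_stored = store_then_query(events, threshold=90)
push_alerts, push_stored = analyze_in_motion(iter(events), threshold=90)

print(pull_stored, push_stored)
```

With this sample input the pull model stores all five events and the push model stores two, while both report the same alerts.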
As discussed above, InfoSphere Streams is a development platform for limitless applications of real-time analytics. However, there is a pattern to how InfoSphere Streams applications are designed.
Ingest data from many sources & prepare it for analysis
Transform, filter, correlate, aggregate and enrich the data for analysis
Detect & predict events and patterns in the data
Decide how the results should be handled and act on them
Store any data that is of longer term value
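The five steps above can be sketched as a chain of Python generators, one per stage. This is a minimal illustration under stated assumptions, not the InfoSphere Streams programming model: the record format (`device,value` text lines), the site lookup table, the alert limit, and all function names are hypothetical.

```python
def ingest(raw_lines):
    """Ingest: parse raw 'device,value' lines into records."""
    for line in raw_lines:
        device, value = line.split(",")
        yield {"device": device, "value": float(value)}


def transform(records, site_by_device):
    """Transform: drop invalid readings and enrich with a site name."""
    for r in records:
        if r["value"] < 0:
            continue  # filter out invalid readings
        yield {**r, "site": site_by_device.get(r["device"], "unknown")}


def detect(records, limit):
    """Detect: flag records whose value exceeds the limit."""
    for r in records:
        yield {**r, "alert": r["value"] > limit}


def decide_and_act(records, notify):
    """Decide: send a notification for each flagged record."""
    for r in records:
        if r["alert"]:
            notify(f"{r['device']} at {r['site']} exceeded limit: {r['value']}")
        yield r


def store(records, sink):
    """Store: keep only the records of longer-term value (the alerts)."""
    for r in records:
        if r["alert"]:
            sink.append(r)


raw = ["pump-1,40.0", "pump-2,-1.0", "pump-1,120.5", "pump-3,75.0"]
sites = {"pump-1": "north", "pump-3": "south"}
notifications, archive = [], []

pipeline = decide_and_act(
    detect(transform(ingest(raw), sites), limit=100.0),
    notifications.append,
)
store(pipeline, archive)
print(notifications)
```

Because each stage is a generator, a record passes through all five stages before the next one is read, which mirrors the idea of data flowing through the analytics rather than being collected first.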
The goal of InfoSphere Streams Quick Start Edition is to allow clients to experiment on their own terms with stream computing. InfoSphere Streams Quick Start Edition provides an alternative to open source because clients can experiment without a capital investment and without spending time researching the open source options.
NOTE – The scale out architecture is available in the native installation option, not the VMware image. The VMware image is restricted to where the VM is running.
IBM context-aware stream computing gives you the ability to analyze massive data volumes quickly, often in real time, and turn data into actionable insight. InfoSphere Streams is an advanced computing platform that can quickly ingest, analyze and correlate information as it arrives from thousands of real-time sources. Because it can handle high throughput rates, InfoSphere Streams can analyze millions of events per second, enabling sub-millisecond response times and instant decision-making. Now you can get your hands on this technology with InfoSphere Streams Quick Start Edition, a no charge, downloadable, non-production version. With InfoSphere Streams Quick Start Edition, there is no data capacity and no time limitation, so you can experiment with streaming data and work with different use cases, on your own timeframe.
NOTE: InfoSphere Streams Quick Start Edition does not come with a support option. To explore support options, visit the InfoSphere Streams product page. - http://www-03.ibm.com/software/products/us/en/infosphere-streams
A place for developers by developers. It is your direct channel to the InfoSphere Streams development team and a place to discuss, learn and share ideas.
IBM has decided to create an open source project for some InfoSphere Streams components to speed development of applications and harness the energies of the development community. Other developers can now extend the IBM source with new capabilities. In future releases, we expect to incorporate new function from the projects into InfoSphere Streams. We also expect other developers to contribute new InfoSphere Streams native functions, operators and toolkits into the new community to further accelerate adoption.
We believe a mix of open source and closed source is the best way to drive adoption in the marketplace, as seen by success with open source offerings like Apache Web Server and Eclipse. Having the full support of a vendor like IBM can lower risk while open source can help achieve customer requirements.
There are many resources for additional reading. Explore both business and technical resources. All resources are publicly accessible.