Another exciting conference week in Las Vegas has come to an end. (Unless you count the flight home… then it still has a few more hours to go.) This time it was the 1st IBM Interconnect conference, combining 3 previous events into a mega event focused on innovations for the Cloud era. From what I saw and heard from others it was a terrific event – kudos to all the teams involved.
The conference was a great opportunity for me to speak with clients, partners, prospects and IBMers about Spectrum Storage and Platform Computing. It also turned out to be a great opportunity to brainstorm a bit with our new marketing VP, Eric Herzog, about how to communicate the value of this Software Defined Infrastructure portfolio in a way that is clear and concise.
Ideas crystallized and were tested live with press and analysts, clients and sellers. My top 2 are condensed in the title of this post.
IBM Software Defined Infrastructure provides a unique set of capabilities that help our clients:
- Avoid cluster sprawl by using all compute and storage resources as a single pool that is efficiently shared among a broad set of large scale, high performance applications and analytics
- Safely ride the wave in an ocean of data by deploying highly efficient and deeply integrated storage solutions, with unprecedented deployment flexibility: as software, as a service, as a system – on-premises, in the Cloud and across hybrid environments
What the heck is Cluster Sprawl and who cares?
Remember the days when folks put each new application on its own physical server with its own storage? Remember how large those inefficient server and storage “farms” grew? And how much money was spent on underutilized resources? And how much more money was spent on evolving them to a virtualized compute and storage environment where a physical resource could be shared among many apps? (Following the well proven lead of IBM z Systems, a.k.a. the mainframe)
Well it is starting to happen again. New generation apps and analytics are increasingly looking like traditional high performance / supercomputing workloads that rely on compute and storage clusters to handle large volumes of data at high speed through parallel processing. As each new scale-out application appears, a new cluster appears to run it. With the expected result: a growing number of underutilized clusters that are costing their owners more than they should. What’s worse is that the apps and analytics are often running slower than they could if unused resources in other clusters could temporarily pitch in to help.
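To make the cost of sprawl concrete, here is a minimal sketch that compares dedicated clusters with a single shared pool. The demand and capacity numbers are hypothetical, chosen only for illustration, and the model ignores scheduling overhead and data locality.

```python
# Hypothetical numbers: three scale-out apps, each with its own 40-node cluster.
demands = [90, 10, 20]      # nodes each app could use right now
capacities = [40, 40, 40]   # nodes in each dedicated cluster


def served_in_silos(demands, capacities):
    """Each app can use only its own cluster, so unmet demand stays unmet."""
    return sum(min(d, c) for d, c in zip(demands, capacities))


def served_in_shared_pool(demands, capacities):
    """All nodes form one pool, so idle nodes can help the busiest app."""
    return min(sum(demands), sum(capacities))


silo_nodes = served_in_silos(demands, capacities)
pool_nodes = served_in_shared_pool(demands, capacities)
print(f"Busy nodes with dedicated clusters: {silo_nodes} of {sum(capacities)}")
print(f"Busy nodes with a shared pool:      {pool_nodes} of {sum(capacities)}")
```

In this made-up case the first app is starved while the other two clusters sit mostly idle; pooling the same hardware lets the idle nodes pitch in.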
One of our clients, a global financial services provider, has prevented cluster sprawl by implementing a software defined infrastructure with IBM Platform Computing and Spectrum Storage – reducing costs and increasing performance of some workloads by 100x. (That is not a typo: 100 times, not 100%, which would be only 2x.)
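The difference between "100x" and "100%" is easy to blur, so here is a small worked example of the arithmetic. The function names are my own and exist only for this illustration.

```python
def speedup_from_percent_increase(percent):
    """A performance increase of `percent` % means running (1 + percent/100) times as fast."""
    return 1 + percent / 100


def percent_increase_from_speedup(factor):
    """Running `factor` times as fast is an increase of (factor - 1) * 100 percent."""
    return (factor - 1) * 100


print(speedup_from_percent_increase(100))   # a 100% increase is a 2x speedup
print(percent_increase_from_speedup(100))   # a 100x speedup is a 9900% increase
```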
In a world where faster business processes and deeper business insights are increasingly run on platforms such as Hadoop, Spark, Cassandra as well as traditional scale-out databases and data warehouses, maybe the better follow-on question to “What the heck is cluster sprawl?” should be: “Is there any organization that can afford to not care?”
When did data pools overflow data lakes and become a Data Ocean?
I will save this answer for my next blog entry. If you can’t wait, watch Eric Herzog, Live from Interconnect 2015.
I have been fortunate in my career to have been involved in a number of significant technology transformations that have impacted both IBM and the clients we serve. I was on the team that worked with Sun Microsystems and across IBM to establish Java as a “write once, run anywhere” application platform; on the team that worked with Microsoft to launch the early Web Services standards and Service Oriented Architecture; and on the IBM team that launched the Eclipse open source community and led the transformation to an open application development environment. It is exciting to be part of a new generation of information technology that enables IBM clients around the world to better serve their customers, patients, and citizens.
My good fortune continues as I am now part of the IBM team leading a transformation to Software Defined Storage as we launch IBM Spectrum Storage.
We live in a world that is increasingly data driven. The growth of Mobile and Social apps used by people all around the world; growth of meters, sensors and cameras on practically everything capturing data to be analyzed in near real time; and growth of laws and regulations requiring long term data retention, are all factors driving an explosive growth of stored data. Traditional storage solutions were designed for a different set of applications and usage patterns. They are too rigid and inefficient to address today’s needs cost effectively.
A more agile storage environment is required to cost effectively handle a broad scope of data whose business value may change over short periods of time. The insatiable “need for speed” must be answered with breakthroughs that do much better than throwing more inefficient systems at the problem. That is why there is growing market buzz about the difference Software Defined Storage can make.
IBM Spectrum Storage is a comprehensive set of storage intelligence that is delivered with unmatched flexibility – as software, as Cloud services or pre-integrated in systems and appliances. These capabilities can be used to optimize storage of files, objects and data on storage-rich-servers and 100s of storage systems from IBM and other companies. And can be used on premises, in the cloud, and across hybrid cloud environments to optimize both performance and cost.
While IBM Spectrum Storage is new and includes a number of innovations such as cloud based storage analytics, it is also based on a proven set of technologies that include more than 700 IBM patented innovations used by thousands of clients around the world. This combination of innovation and proven reliability is a compelling value for solutions that manage and protect the data that are among an organization’s most valuable assets.
Now that we are launched, and I am back on the blogging bandwagon, I look forward to sharing stories about how clients are redefining data economics with IBM Spectrum Storage.
I have to admit – I had hit a period of burn out. I had been leading IBM Database Software & Systems Marketing & Strategy for some time, and it was time for a change.
A little over a year ago I moved to a new role – leading Strategy for Software Defined Environments in IBM Systems & Technology Group. It was a wild first year, with a lot of changes – including picking up business line responsibilities for Elastic Storage, our scale-out software defined file & object storage offering – known to most as General Parallel File System, and our software defined computing portfolio, Platform Computing software. It was a busy year and I had a lot to learn.
We are now at a very exciting point in IBM, having worked through a challenging transformation year in 2014. I am now business line VP for Software Defined Infrastructure in the new IBM Systems team, and am fired up about our new organization, our product portfolio and the plans we have for 2015. As such, I figure it is time for me to get back to posting regularly.
That’s enough about me and the transition of topic for this blog. As a tease for the next post... I will have a lot to talk about next month as we head into the IBM InterConnect conference. I invite you to follow the link and register to join us in Vegas at the end of February. Travel safe.
I would imagine you are thinking that headline is a pretty bold statement. And when I tell you that BLU Acceleration is an exciting capability being introduced in the new DB2 this quarter, you may think it bolder still.
If you have not read any of my past blogs, you may be asking “what does database software have to do with Big Data?” The most important thing to remember is that meeting today’s “big data” challenges requires different types of systems that use different technologies for managing and analyzing different data in different ways. This is why the world now has a diverse set of NoSQL systems that have been added to the traditional SQL database systems. And this is why IBM has added new systems (e.g., for Stream and Hadoop processing) as well as new NoSQL capabilities to SQL systems (e.g., XML and RDF Graph database additions to DB2, and TimeSeries and Spatial database capabilities in Informix).
In a recent discussion with an industry analyst, I was surprised to learn that he considers in-memory, columnar management of a SQL relational database to also be NoSQL. He revised my definition of NoSQL to be – Not Only traditional row-based relational data management via SQL. And so with the introduction of BLU Acceleration in the new DB2, it becomes a NoSQL data system for another reason. BLU Acceleration is dramatically easier and faster for analytics on terabytes of data. For many organizations, this enables cost effective analytics of more data and for more users.
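For readers who have not worked with column stores, here is a minimal sketch of the difference between row-based and columnar layouts. It is a toy illustration with made-up data, not a description of how BLU Acceleration is implemented.

```python
# Made-up sample data in a row-based layout: one record per order.
rows = [
    {"order_id": 1, "region": "east", "amount": 120.0},
    {"order_id": 2, "region": "west", "amount": 80.5},
    {"order_id": 3, "region": "east", "amount": 42.25},
]


def to_columns(rows):
    """Pivot row records into a columnar layout: one list per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def total_amount_row_store(rows):
    """Row layout: every record is visited even though only one field is needed."""
    return sum(row["amount"] for row in rows)


def total_amount_column_store(columns):
    """Columnar layout: the query reads only the one column it needs."""
    return sum(columns["amount"])


columns = to_columns(rows)
print(total_amount_row_store(rows))
print(total_amount_column_store(columns))
```

Both queries return the same total. The columnar version simply never looks at `order_id` or `region`, which is the basic reason analytic scans over a few columns of a wide table tend to favor a columnar layout.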
In his blog, consultant and IBM Champion Dave Beulke called BLU Acceleration – Best yet for Big Data! He asserts that there are cases where Hadoop systems are being used or considered for analyzing data, where using BLU Acceleration will be a simpler and lower cost solution. (Note: neither he nor I am asserting this is true for all Hadoop use cases. The point is – no one technology, including Hadoop, is the best answer for all needs.)
Speaking of User Groups, my thanks to the International Informix User Group team that hosted their conference this past week in San Diego. It was great meeting with members of this community and seeing both new and familiar faces among the attendees. There was a lot of positive feedback about the enhanced capabilities in the new Informix 12. This includes extending the use of Dynamic In-memory (technology shared with BLU Acceleration) for TimeSeries data – simplifying and accelerating operational analysis and reporting of growing smart meter and sensor data.
For more Big Data stories and to add your thoughts, I encourage you to join the conversation at the Big Data Hub.
Big data is all about scaling the use of data beyond the norms of the current era of information technology.
You could reasonably argue that the first big data era began more than a half-century ago. On May 25, 1961, President John F. Kennedy gave a speech to the U.S. Congress in which he declared the goal of landing a man on the moon, and returning him safely to Earth. The amount of data generated and managed throughout the program quickly outgrew data systems of the time. A brand new “Information Management System” (IMS) was created by IBM and other members of the Apollo team to tackle this new big data challenge.
Now, fast forward more than 50 years and we have ushered in a new era of big data, ignited by the global “Internet of things,” mobile, social and cloud computing, and instrumented systems of all kinds. Now every transaction, tweet or meter reading has potential value to enhance or destroy a customer relationship; to drive a new business opportunity; or to catch a bad guy. New types of data systems are needed to handle more data and more types of data, faster and more cost effectively than systems that were state of the art just a few years ago.
The key to making big data work for business is using systems that are designed for workload optimized performance and simplicity. In some cases that means completely new systems to handle challenges like analyzing data in motion, or spreading complex work among a large number of distributed systems. In other cases, new capabilities are added to proven systems such as IBM DB2 and Informix, to provide a new mix of production grade capabilities – e.g., for both SQL and NoSQL databases.
Solving today’s big data challenges often requires combining the structured, optimized approach of traditional database systems with the less structured, exploratory approach of new systems. In fact, modern versions of technology created decades ago may be the best choice for new enterprise challenges; ones that also benefit from their time-proven stability, maturity, and manageability.
So what’s the role of a relational data system in this big data era?
Some IT professionals may take relational and pre-relational database technologies for granted, but they remain the trusty workhorse in most data centers. These proven platforms continue to handle the growing volume of data and faster transactions from applications that conduct business every second of every day. They also enable deep analysis of that data to help organizations make better decisions with the speed needed to affect business operations as they execute.
Organizations leading the pack in big data ingenuity are the ones using the best combination of systems – traditional or new – for each need. For many organizations building complex systems, running global banking networks, or delivering millions of packages around the world every day, that includes using the modern descendant of the data system that played a small role in a giant leap for mankind.
Look for more thoughts about Big Data at the speed of business from me and other followers of database technology in the coming weeks.
And if you’re interested in IBM’s next Big Data event, go to this link for details. http://ibm.co/BigDataEvent
Information on Demand 2012 was another great week this year, with a record number of attendees – over 12,000 IBM clients, partners, analysts, reporters and IBMers from around the world.
For those of you who did not join us last month, here is a summary of announcements made at the event. Also, the folks at Wikibon have assembled a nice set of videos and articles you should check out. Actually, those of you that were there would also find these summaries valuable.
A few interviews to highlight given the subject of this blog:
- Tim Vincent: Rolling Your Own Database Distracts from Delivering Big Data Business Value
- Nancy Kopp-Hensley: PureData Helps Customers Transition from Planning to Executing on Big Data
- Nancy Pearson: We’re Changing the Economics of IT
- Jason Gartner: PureSystems is Innovative, Not Just Repackaging
- Pete McCaffrey: PureSystem Removes Admin Burdens for Customers
- and shameless plug for my interview: Big Data Requires Mix of Technologies
For me it was a particularly exciting year as it also marked the end of “launch month” for our new PureData System. But the real excitement of this event is the in person interaction with clients, partners, IBM Information Champions, and analysts. In a job dominated by conference calls and video chats, having the opportunity to participate in less formal conversations is a welcome change. It is particularly interesting to listen to exchanges among different clients about the challenges they face, and how they are using IBM technologies to meet them.
Speaking of clients and IBM technology... the InfoSphere, Data Management, and System z product demo rooms and hands-on lab sessions were packed all week. This conference continues to be a nice mix of technical details and strategic discussions about the application of technology to improve business results. I spoke to several clients who each had a large group at the conference made up of business and IT leaders as well as architects, developers and data professionals.
On a final note, Barenaked Ladies and One Republic both put on great shows. A great mid-week break from the very full days of business and technical talk.
I hope we see all of you next year… November 3-7, 2013
For those who may have noticed, I should explain my long absence from this blog. For the better part of this year my team and I have been “heads down” on preparing for and executing the introduction of the new IBM PureData System. Not having much time to spare was only a part of my excuse. The real reason was lack of energy and inspiration to write even one more piece beyond what was needed for the launch and for the IOD 2012 Conference last week.
Now that both events are behind us, it is time for me to get back on track….
PureData System is the newest member of the IBM PureSystems family of expert integrated systems I wrote about in April. It is offered in 3 models that deliver optimized performance for transactional, analytic and reporting, and operational analytic workloads. As an expert integrated system, each PureData System model is integrated software, hardware and built-in expertise that simplify the entire system life cycle – from procurement through retirement.
PureData System provides an efficient, high-performance and high-scale data platform – delivering data services needed for different types of transactional and analytic application workloads. Providing these values for data services needed for different types of applications requires software and hardware that are designed, integrated and tuned specifically for each type. Typically, organizations spend their valuable time and resources to design systems of general purpose components and then procure, integrate, configure, tune, manage and maintain each system for its specific use. PureData System dramatically reduces time, cost and risk when deploying and maintaining these systems.
- PureData for Transactions: integrates DB2 pureScale to deliver highly available, high-throughput transaction database clusters that easily scale without the need to tune the application or database. This PureData System is available in 3 size configurations and can be used to consolidate more than 100 database servers.
- PureData for Analytics: is powered by Netezza technology and is the newly enhanced replacement to the Netezza 1000 (formerly known as TwinFin). It is optimized for simplicity and performance for analytics and reporting data warehouses. This new model delivers 20x concurrency and throughput for tactical queries compared to the previous version of Netezza technology, and offers the industry’s richest library of in-database analytics functions.
- PureData for Operational Analytics: integrates InfoSphere Warehouse software for operational data warehousing that can support continuous data ingest and more than 1000 concurrent operational queries, while balancing resources for predictable analytics performance. It also delivers DB2’s adaptive compression which has been used by clients to achieve up to 10x storage space savings. This PureData System model is a new generation that replaces the Smart Analytics System 7700.
And if that were not enough, we have also integrated the power and simplicity of Netezza technology with the reliability and security of System z to deliver cost efficient, high-performance analytics and operational analytics on data managed by DB2 for z/OS. System z clients now have the opportunity to greatly simplify and reduce the cost of analyzing their most critical business data.
- DB2 Analytics Accelerator: The same Netezza technology that powers the PureData System for Analytics, also powers the newly enhanced DB2 Analytics Accelerator which integrates with DB2 for z/OS for high performance analytics – without modifying applications or the database. The new High-performance Storage Saver capability reduces demand on System z storage space without sacrificing performance.
- zEnterprise Analytics System: combines the new zEnterprise EC12 and DB2 Analytics Accelerator for a hybrid system that merges capabilities optimized for different workloads in a single, highly reliable, and secure system. The zEnterprise Analytics System 9700 and 9710 models have now replaced the Smart Analytics System 9700 and 9710.
That’s a good (re-)start… I will save my IOD 2012 recap for next week to make sure I get back on my weekly pace.
PS. My thoughts and prayers are with all those still suffering the effects of Sandy.