February 1, 2012 / berniespang

“The Cloud” for data management and Hadoop-based Big Data analytics

One of my colleagues pointed out to me this week that I have not yet commented much on the topic of Cloud computing.  I have to be honest – it could be because I have a bit of an eye-rolling reaction to “the Cloud.”

Value behind the marketing hype

I do think there is a great deal of client value in the agility and cost efficiency of implementing an IT environment that takes advantage of “Cloud technologies” – public and/or private.  My eye-rolling is provoked by the hype-driven use of Cloud as a marketing buzzword.

For example, Microsoft’s “To the Cloud” TV commercials.  These showcase what I have for many years called an Internet app.  It reminds me of when an IBM distinguished engineer explained Enterprise Java Beans to me years ago and I said, “OK, it’s a data structure – I learned about those in high school.”  Then a few years later he educated me on Web Services – “Again, I learned about remote procedure calls in high school.”  His response both times was along the lines of, “Basically they are the same concepts – but this is a new evolution that merges in important advances in technology standards.”

I similarly consider Cloud computing to be an evolution of Internet applications and services that merges in advances in virtualization standards.   This important evolution automates capacity growth – dramatically saving time and cost.  What is even more interesting is how this advance enables computing infrastructure and platform services via the Internet – in addition to the applications themselves.

That’s the real value here – not just the marketing buzzword value of saying a Web app is on the Cloud.

Data management and Big Data analysis via the Cloud

So how does that relate to the topic of this blog?  Database software and other information systems, such as the Hadoop-based InfoSphere BigInsights, are now being used more easily and cost effectively by those accessing them as either private or public Cloud services.  IBM offers DB2, Informix and InfoSphere software as services deployable in a private Cloud environment, or accessible as a service from the IBM SmartCloud or via Cloud service partners.

For those interested in understanding the value of a private cloud environment, I would suggest reading: A study on reducing labor costs through the use of IBM Workload Deployer.  And for an understanding of how the competition stacks up, Roman Kharkovski does a nice job in: Comparison of two private cloud tools from IBM and Oracle.

While I am at it, Roman has another related post: Comparison of IBM and Oracle public cloud offerings (SaaS, PaaS, IaaS).  You may also want to check out the client success stories highlighted at the IBM SmartCloud.

It’s all about improving IT Economics

So what do you think?  Is all this Cloud talk just over hype, or are you using Cloud technologies to improve IT economics – enabling you to actually do more with less?

January 23, 2012 / berniespang

In today’s economy, budgets are for rapid value, not misleading hype

I read and heard a few seemingly unrelated items this week that led me to the title of this week’s post.  First were the IBM 4Q and full-year 2011 results and our CFO’s presentation and Q&A with financial analysts; next, an analysis of those results in the context of other technology providers; and finally, Conor O’Mahony’s analysis of recent Oracle benchmarks.

Positive results

IBM results were positive and accompanied by other encouraging indicators of IT demand, while Oracle had recently suggested that its latest disappointing results were  largely due to a general market slowdown.   I was particularly happy to hear IBM CFO Mark Loughridge say:

“Information Management grew 9 percent and again gained share. Our Distributed Database grew double digits led by strong performances from our Netezza offerings, which were up nearly 70 percent. …. For the quarter, almost a third of the transactions were with new Netezza clients.  Since acquiring Netezza, IBM has expanded its customer base by over 40 percent. And when we go head-to-head against competition in Proof of Concepts we had a win rate of over 80 percent this quarter. Our business analytics software offerings, most of which are part of Information Management, continue to outpace the market with double-digit growth.”

But it was an answer he gave to an analyst question that really stuck with me.  An analyst asked something along the lines of, “Do you see a general tightening of CIO budgets?”  Mark’s reply included the point that clients are focusing their spending on things that will rapidly deliver value to the business – and that IBM results reflect our ability to bring that value to our clients.  (You can find a replay of the call at the link above.)

Business Value vs. misleading hype

I thought of Mark’s answer as I read Conor’s analysis of the Oracle marketing spin on its benchmark results.  A 3x-faster claim is based on a meaningless comparison of its current offering with IBM results from 2007, and a claim of 60% faster is based on a comparison with a more current IBM system that used half the number of processors.  When you look at the apples-to-apples comparison that matters – current price for performance – IBM is 39% less expensive.  This is the real business value that we see our clients paying close attention to these days.
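To make the metric concrete, here is a minimal sketch of how a price-for-performance comparison works.  The numbers are invented for illustration only – they are not the actual benchmark figures discussed above:

```python
# Hypothetical illustration of comparing two systems on price for
# performance. These figures are made up; the point is the arithmetic,
# not the specific vendors or results.
def price_per_perf(system_price_usd: float, perf_units: float) -> float:
    """Dollars spent per unit of benchmark throughput (lower is better)."""
    return system_price_usd / perf_units

a = price_per_perf(1_000_000, 500)   # system A: $2,000 per unit of work
b = price_per_perf(1_220_000, 500)   # system B: same throughput, higher price

# Fraction by which A is cheaper than B at equal performance.
savings = 1 - a / b
print(f"{savings:.0%}")  # → 18%
```

The same calculation, applied to current list prices and current benchmark results, is what makes a comparison meaningful – unlike pitting a new system against a competitor’s 2007 result.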

The bottom line is that most organizations are demanding a smarter use of their IT investments.   They no longer accept business as usual as the automatic answer.  It is true among solution architects and developers who are considering NoSQL / Big Data management systems in addition to relational database systems; and among IT leaders who are seriously evaluating the most effective and cost efficient systems for their business.

January 14, 2012 / berniespang

I forgot to make my 2012 prediction: Growth of operational analytics

Recently I explained operational analytics to a colleague, who suggested I write more about it.  Thinking about it made me recall an article by Philip Russom, senior manager of TDWI Research, in which he captured a collection of 25 tweets he had recently shared explaining operational data warehousing.

Real-time does not equal operational

Too often I have heard people equate the idea of operational warehousing or operational analytics with real-time loading of data into a warehouse.  While that can be one attribute, the more important aspect is real-time access to the data and the insights generated within a data warehouse by the applications that support business operations.  The use of these insights during operations such as sales, service and customer support – at the point of each business transaction – can dramatically elevate an organization’s performance.

100s or even 1000s of answers per second

A data warehouse system that supports such operational applications must not only handle complex analytics, it must also support concurrent access rates in the thousands per second.  I know of one client that uses the IBM Smart Analytics System (powered by InfoSphere Warehouse software) to support operational analytics for a solution that executes over 10,000 transactions per second.  This is a great example of using a system that is both designed for data and tuned for the task.
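The shape of that access pattern can be sketched in a few lines.  This is a toy illustration with invented table and column names, using an in-memory SQLite database as a stand-in for a real warehouse: the operational application performs a low-latency lookup of a precomputed insight on every transaction, rather than running the heavy analytics itself.

```python
import sqlite3

# Stand-in for the warehouse; a precomputed per-customer insight table
# (names and data are hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_insight ("
    "customer_id INTEGER PRIMARY KEY, churn_risk REAL, next_best_offer TEXT)"
)
conn.execute("INSERT INTO customer_insight VALUES (42, 0.83, 'loyalty-discount')")

def insight_at_point_of_contact(customer_id):
    """The indexed lookup a call-center app would make on every call."""
    row = conn.execute(
        "SELECT churn_risk, next_best_offer FROM customer_insight"
        " WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()
    return {"churn_risk": row[0], "offer": row[1]} if row else None

print(insight_at_point_of_contact(42))
```

The warehouse’s job in this pattern is to keep such insight tables fresh with complex analytics in the background, while serving thousands of these point lookups per second to operational applications.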

Organizations and technology are both now ready

Today, analytics applied directly in operations with a large number of concurrent users makes up a minority of analytic applications.  But based on data like those cited by Philip, and on feedback over the last six months from colleagues who work on solutions with clients around the world, I believe 2012 will bring recognition that operational analytics growth is accelerating – and that organizations that have realized the power of using information for competitive advantage will pull further away from the pack by achieving that advantage in many more aspects of their business operations.

January 14, 2012 / berniespang

Is Quality of Service as important as Qualities of Service?

I had the opportunity this week to talk to Laura Didio after reading her article at E-Commerce Times: Oracle’s Downward Spiral.  I’ll save specific comment on the results of a recent ITIC survey re: Database Reliability and Deployment Trends until after the report is published.  However, the article and our discussion reminded me of an important addition to the “workload optimized” concepts I wrote about in earlier posts.

In addition to cost-efficiently optimizing a system to meet a solution’s required Qualities of Service (performance, reliability, security…), organizations also need to cost-efficiently optimize the Quality of Service required of their technology partner(s).  As I wrote earlier, when considering the old axiom “if it isn’t broken, don’t fix it,” you need to think about the total cost you are spending as well as the function and qualities of service.  While the latter two may be fine, if you are paying a lot more to deliver that service than you could be paying using alternatives… then it is “broken.”

Quality of Service is a factor in Cost Efficiency

As the ITIC survey indicates, clients are increasingly aware of the imbalance between the price they are paying and the value they are receiving – especially as they experience declining value of service and support.  Those in the IT game can easily imagine the high costs added to operations when expensive human intervention and poorly performing workarounds are required to keep business solutions running until proper support, fixes and updates are delivered by a technology provider.

So obviously I believe the answer to my question is: “Yes, Quality of Service is as important.”

Anyone think otherwise and up for a debate?

January 9, 2012 / berniespang

Have we entered a new era of data management? I believe so.

I recently read a TechTarget article based on an interview with Michael Stonebraker, a computer scientist who has been in the database software game for some time.  While the title of the article (“Michael Stonebraker predicts trouble for relational databases in 2012”) serves as a provocative attention grabber, I do not agree that it is the correct, substantive conclusion.  As my earlier posts outline, I agree with the substance of Michael’s points – that we are no longer in an era where solution developers believe “the answer to all data challenges is a relational database.”  But that only means trouble for providers of relational database management software if they do not recognize this reality and offer the market a broader set of capabilities for the new generation of solutions.

NoSQL, Big Data and Cloud, while each important to the new generation of solutions, are likewise not the answer to all needs – individually or collectively.  And as Michael points out in his interview, ACID [atomicity, consistency, isolation, durability] qualities of service can be just as important when using these technologies as they are for business-critical data best managed in a relational database.  In this new era of data management it is critical that we optimize for both functional and quality-of-service requirements.

The Information Management team at IBM certainly recognizes this market need.  We offer a Hadoop-based “Big Data” management system, InfoSphere BigInsights, in both a no-charge Basic Edition and a more robust Enterprise Edition – just as we do with the DB2 and Informix relational data systems.  Contrary to how I read Oracle’s introduction of its Big Data system, IBM does not consider Hadoop merely a mechanism for filtering data into a relational database system for subsequent analysis.  Sure, that is a valid use case, but perhaps more important is the use of such a system itself for information analysis.  Most client engagements we have seen involve a very complementary integration of the two types of systems – each supporting the analysis it does best and sharing the resulting insights appropriately with the other for use in further analysis.  The same complementary relationship holds when InfoSphere Streams is used for analyzing information as it flows in greater volume and/or velocity than can be cost effectively stored, or when Informix TimeSeries is used to save and analyze instrumentation data with greater performance and efficiency than a relational database system can.
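A minimal sketch of that complementary pattern, with all names and data invented: a map-reduce-style pass distills raw data down to an insight, which is then loaded into a relational system for further analysis.  A real deployment would use Hadoop/BigInsights and a database like DB2, not the in-process stand-ins below.

```python
import sqlite3
from collections import Counter

# Raw, semi-structured input of the kind a Hadoop-style system digests
# (hypothetical log lines).
raw_events = [
    "2012-01-09 click product=widget",
    "2012-01-09 click product=gadget",
    "2012-01-09 click product=widget",
]

# "Map" phase: extract a key from each record; "reduce" phase: sum per key.
mapped = (line.rsplit("product=", 1)[1] for line in raw_events)
reduced = Counter(mapped)

# Share the distilled insight with a relational system for further analysis.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE product_clicks (product TEXT PRIMARY KEY, clicks INTEGER)")
db.executemany("INSERT INTO product_clicks VALUES (?, ?)", reduced.items())
top = db.execute(
    "SELECT product, clicks FROM product_clicks ORDER BY clicks DESC"
).fetchall()
print(top)  # → [('widget', 2), ('gadget', 1)]
```

The flow can also run in the other direction – insights computed in the relational system feeding further analysis over the raw data – which is what makes the two types of systems complementary rather than competing.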

The growth of something new does not mean the death of something old

In the late ’80s and early ’90s the death of the mainframe was predicted because of the new era of “distributed computing.”  In reality, the new types of systems opened new uses of computing to drive business growth, while the volume of business computing best done on the System z “mainframe” also continued to grow.  IBM recognized this reality and is the leader in business computing servers because it offers leading products across the Power Systems (POWER/UNIX), System x (x86/Linux) and System z (z/OS) lines.

I believe this new era of data management similarly does not mean the “death of the relational database.”  It is fair to say, however, that it could mean trouble for software providers that ignore this new wave of data management needs and instead tell customers to fit the new square pegs into their existing round holes.  I can assure you that IBM is not one of those.

January 9, 2012 / berniespang

My Christmas presents from Steve Jobs

I have to offer my thanks to Steve Jobs for my favorite gifts this Christmas.

First, for my new iPhone.  A very exciting change from my 4-year-old “other device.”  My hat’s off to Steve and the team at Apple for creating such a great product.

But more importantly, I am thankful for the gift of his biography.  While it was bought and wrapped for me by my daughter, this gift started with Steve’s decision to share his story with all of us while he was able to contribute to it personally.

Reading it over the holiday break was not only entertaining, it gave me several insights and a renewed inspiration for my role at IBM.  It is that last part for which I am most grateful.  It is a heck of a gift as we start the new year.

Assuming there is no off switch…  Thank you, Steve

December 16, 2011 / berniespang

Social media has changed the game for marketing professionals

Data management software and systems

I’m several posts into this new blog now, and it feels like I am overdue for a statement clarifying what the blog is about and why I started it.  The purpose of this blog is to share my perspectives about data management software and systems – both generally and specifically about the IBM portfolio.  I will not be making general sales pitches about IBM products with no broader context.  But I will also not be apologizing for writing about them or trying to cleverly hide them within an “industry generic” perspective.

Which brings me to the topic on my mind this week… why I started this blog.

Bringing your audience to you… or going to them?

I have been in the business software marketing game for more than 10 years – and it has changed quite a bit.  The difference between the early days of the Internet – initial product websites and traditional media publishing articles online – and today’s use of social media is dramatic.  Those early days were about extending the reach of traditional models.  That meant attracting the audience to a place where we could share information that we wanted them to know – a site, a publication, an event, etc.  That hasn’t gone away, of course.  But with the evolution of social media, there is a new wrinkle that requires us to reverse the thinking.

We now have the opportunity – and frankly the imperative – to bring our message to where our audience is already engaged on topics that matter to them.  And to bring our messages into their community in a way that is helpful to their discussions.  Done well, this can make us helpful members of the community where our information adds value.

Connecting to developer communities

I had an interesting discussion about this just yesterday with Stephen O’Grady from Redmonk.   Building on a discussion we started at the Information on Demand conference this year, he has really helped me see the importance of this point with respect to introducing the value of our portfolio to application developers.  There are various developer communities that are increasingly aware that data management choices for their needs go well beyond “the answer = Relational Database system only.”

What I plan on doing is to have this blog serve as a source of information about the IBM portfolio and its strategic direction that is helpful to the many communities where our clients and future clients are already engaged.

December 9, 2011 / berniespang

Workload Optimized Systems need to be optimized for required qualities of service, too

I really, really was planning to get onto a different topic for this week’s post.  But I was fortunate to be invited to a consulting session this week with a group of analysts who focus on the server market (among other things).  The discussion helped expand my thinking re: Workload Optimized Systems.

Achieving simplicity and cost efficiency by using systems optimized for each type and mix of workload means considering required qualities of service along with the functional and information characteristics of the workload.  For example, the functional and information characteristics of a workload could be the same – but the qualities of service needed are likely different for running it in a research sandbox, a development environment, a test environment, or a production environment.  Different requirements for performance, reliability, security, scalability, etc. affect your choice of server – even if the software function and information structures remain consistent.

For those of you who have ever asked why IBM offers three different server lines – System x, Power Systems and System z – the above is one way to answer that question.  IBM systems offer a range of quality-of-service levels so that our clients can meet their service-level needs – from modest to the most extreme – without overpaying for more than is needed.  In fact, I just recalled a discussion with a client that standardized on DB2 database software so they could use System x in development, deploy initially on Power Systems, and then move the workload to System z when ready to expand to a global deployment.

Anyone disagree with this expanded view of workload optimization – or have an additional dimension we should also consider?

November 11, 2011 / berniespang

The convergence of Big Data, NoSQL and Smart Consolidation

Thanks to a number of recent blogs and discussions with colleagues and analysts, I have realized that many of the topics I am involved in are converging.  Big Data, NoSQL, workload optimized systems and smart consolidation/logical data warehouse for an analytics ecosystem are typically considered separate topics and dealt with in isolation.  But as I have observed the similarities in how they are discussed, I realize it makes a lot of sense to view them as different aspects of a common idea.  (I am not sure yet what a good label might be, so I am not offering one here.)

A recent example is Philip Howard’s Bloor Research blog post, IBM and Big Data.  Philip writes about the different systems for addressing different “big data” challenges in a similar way to how I wrote about different analytics systems.  And discussions about the value of new (and in some cases not so new) data management approaches under the label “NoSQL” are additional examples of matching workload requirements to the proper system to optimize for top performance, efficiency and simplicity.  Watch this interview with IBM Fellow Curt Cotner on NoSQL and Hadoop at the recent IOD 2011 conference for more detail on what IBM is up to on this topic.

In 2005, when I joined the IBM Information Management team, many of my friends questioned a move to a segment of the business that was considered “done.”  The answer to every data management challenge was the Relational Database Management System; the major and minor RDBMS players were firmly established, and the market had achieved a steady state.

One of the reasons I love this business is that there is no such thing as a permanent steady state.  I believe that since 2005 the Information Management and Analytics advances have been among the most dynamic in the industry and impactful for our clients.  The current set of  hot topics indicates we are not nearly done with this current period of advancement.

How do you see these various threads melding together?  Do you see others that are as well?

November 4, 2011 / berniespang

Smart consolidation to a logical data warehouse

For the first of what I am hoping will become a weekly habit of sharing what is top of mind in my role at IBM, I am essentially adding to my previous post re: Workload Optimized Systems.

Single Data Warehouse -> Logical Data Warehouse

There is a growing recognition that the notion of a single “data warehouse” (one that incorporates all available data and supports all types of analytics workloads) is outdated given the explosive growth of information and types of analytics systems.   Of course I am not the first to make this observation.

I have been discussing this concept with my team at IBM and with analysts who spend a great deal of time with clients.  While I could not tell you who coined the term Logical Data Warehouse until I read Mark Beyer’s guest post on Merv Adrian’s blog, I can tell you the concept is becoming fairly well recognized.  As an example, it is a fundamental part of the game plan at IBM, given our growing portfolio of systems that are optimized for different types of analysis.  For more details, read The Logical Data Warehouse: Smart Consolidation for Smarter Warehousing by my colleague Phil Francisco.

Different systems for different analytics

To give you a brief idea why this is important, consider systems optimized for a variety of analytics:

1) Operational Analytics (IBM Smart Analytics Systems)

What is it? Balanced performance for complex analytic queries and a high volume of concurrent operational transactions

How it can be used: To support a call center with operational insights at time of contact

2) Deep Analytics (Netezza appliances)

What is it? Optimized performance and simplicity for analytics workloads that do not include operational transactions

How it can be used: Complex data mining and predictive analytics

3) Time Series Analytics (Informix TimeSeries)

What is it?  Using a time-series-specific data structure instead of a typical relational one to dramatically reduce the storage space required and to speed data loads, analytics and operational reports.

How it can be used:  For unlocking hidden insights among the growing volume of data from Smart Meters and sensors in all kinds of “smarter” systems

12/14/11 add:  I just read this article, which does a fine job explaining this value – and it covers the spatial data management optimization of Informix software, too.  Efficient Vehicle Tracking System Software Solution with Informix

4) Streams Analytics (InfoSphere Streams)

What is it?   Ultra low latency analysis of information flowing through a system before it is even stored – if ever

How it can be used:  Telemetry from medical devices in an Intensive Care Unit

5) “Map-Reduce” (Hadoop) Analytics (Infosphere BigInsights)

What is it?   Sometimes used as a synonym for Big Data, these systems enable analysis over a very broad and diverse set of information, such as that available to us via the Internet.

How it can be used:   To analyze petabytes of structured data including weather reports, tidal phases, geospatial and sensor data, satellite images, deforestation maps and weather modelling research, in an effort to pinpoint the right installation location for new Wind Turbines… all within one hour!  (Read more about how Vestas is using BigInsights)
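To make item 3 above more concrete, here is a toy sketch of the time-series idea, with invented names and data: when readings arrive at a fixed interval, a meter’s data can be stored as an origin timestamp, an interval, and a compact array of values – one (meter, timestamp, value) row per reading is no longer needed, and a reading is found by index arithmetic instead of a timestamp scan.

```python
from datetime import datetime, timedelta

# Compact per-meter series: timestamps are computed, not stored
# (hypothetical smart-meter data).
series = {
    "meter-7": {
        "origin": datetime(2011, 11, 4, 0, 0),
        "interval": timedelta(minutes=15),
        "values": [1.2, 1.3, 1.1, 1.4],  # kWh per 15-minute interval
    }
}

def reading_at(meter, ts):
    """Fetch a reading by index arithmetic rather than scanning timestamps."""
    s = series[meter]
    idx = int((ts - s["origin"]) / s["interval"])
    return s["values"][idx]

print(reading_at("meter-7", datetime(2011, 11, 4, 0, 45)))  # → 1.4
```

This is only the storage intuition; a product like Informix TimeSeries layers indexing, irregular-series handling and SQL access on top of the same basic idea.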

Do you know of other kinds of data analytics systems, or of how organizations are using various types in concert to gain new insights and a greater competitive edge?