Update: The articles are once again behind the IBM paywall. Going to assume that is the policy from now on until something official from IBM says otherwise. Lame.
I hate the way that when newspapers have to publish an apology, they cram it into a tiny space somewhere on the latter pages. So, a new blog post, rather than a footnote to the old one, is deserved. Although not really an apology, the spirit of redress is the same.
It’s an excellent decision and made with some alacrity, given IBM’s size and doubtless spools of internal red tape.
So now, those interested in the Godfather of Business Intelligence (Hans Peter Luhn) and the (often unsung) pioneers of Data Warehousing (Barry Devlin & Paul Murphy) can read their seminal articles without let or hindrance from IBM.
April 28, 2009
Yesterday Jos Van Dongen (@JosVanDongen) discovered that H.P. Luhn’s seminal paper on Business Intelligence, dating from 1958 was no longer accessible from IBM. (Edit: See also the ur-post from Seth Grimes on this – not the first time I’ve been a johnny-come-lately to a topic!) Not only that, but any attempt to read Barry Devlin’s work on Data Warehousing was also thwarted by the ominous-sounding “IBM Journal of R & D | IP Filtering Page”.
“The IBM Journals are now only available online for a fee,” barked the page.
Instead of the prescient words of Hans Peter, one is now greeted by the announcement that you have to pay to read the words of wisdom of the godfather of BI, who happened to work at IBM.
I know these are tough economic times, and IBM need to extract every last cent from their assets too, but the benefits of associating IBM with BI giants such as Luhn and Devlin far outweigh the meagre revenue they will gain from those who are forced to subscribe just for the few BI-related IBM articles.
They should be shouting about their BI bona fides, not locking them up in a subscription to a journal that most people have never even heard of and are unlikely to spring $1000 for.
Maybe the best way is to have some kind of ‘Heritage’ collection, featuring the superstars of the IBM back catalogue which are made available for free. These might even be promoted to improve IBM’s image as an innovator and not a staid old behemoth, associated with mainframe monopoly and expensive services engagements.
The other issue is the multitude of links out there from a wide variety of people including analysts, business intelligence practitioners, academics, students, even Wikipedia. The Wikipedia definition of the term ‘Business Intelligence’ even includes a link to the paper. Over time, these links will either get removed, leaving Luhn’s work unread, just a name in a history of BI, or just serve to annoy those who come across them whilst researching and reading about BI, wondering why IBM is nickel and diming them.
Just in the small Twitter business intelligence community, there are quite a few people who have linked to Luhn’s paper:
There are thousands more links back to this paper. After all, he is the godfather of Business Intelligence, not just any old IBM researcher.
Edited April 28, 2009 5:44:29 pm GMT – preferred Mark’s suggestion of ‘godfather’.
Edited May 11, 2009 5:06:30 GMT – Link to Seth Grimes’ earlier post on this topic.
February 11, 2009
There is a lot of talk about how <insert letter(s) here>AAS, especially in BI, is going to dominate 2009, mainly due to low startup costs and the hope of expert analysis of any organization’s complex business model by outsiders.
This is all well and good, but as a cynic and a slightly paranoid one at that, I can see certain risks that others with a more sunny disposition may not entertain.
I’m not alone though, and in good company at that: for example, Larry Ellison (“It’s insane”), Richard Stallman (“It’s worse than stupidity”) and Bill McDermott (“It just won’t work”). Admittedly they have their own agendas, but they give good quote.
The top tier providers do have a pretty good record here, but there is still the odd outage or two, even for Google Apps and Salesforce. I know that it is fairly rare for internal IT to be more reliable, but you can be more granular. For example, if you have a critical campaign or similar event, then you can step up the level of investment in disaster recovery with more hardware/software/staffing etc for the critical event and then ramp down again. In addition, some of these stats don’t take into account an internal IT’s PLANNED downtime, which when done correctly should have very minimal impact on the business. With SaaS, you’re in the pool with everyone else, no special treatment, no DEFCON 1 SLA on demand. Same as disaster recovery – no 80/20 option of just getting something up and running or a small amount of data to be going on with while the whole thing is fixed, it’s all or nothing.
And what happens if you do suffer problems with business continuity? In most cases you can get your money back for a specific period (or a portion of it). Some of the stories I have heard regarding downtime have ended up with much larger business impact costs than a month of SaaS payments, that’s for sure.
Who can you trust?
I started drafting this post even before the Satyam business (Yes, I know that’s a long time ago, but I’ve been busy). The answer is you can’t really trust anyone, but you just have to make an informed decision and live with the compromise.
If you are in the UK, then Sage would certainly be a name you could trust, but their recent security faux-pas with their Sage Live beta would likely make any consumers of a future service from them think twice.
A third party can certainly lose your data.
This is not so much about losing the data forever, in some kind of data disaster, where it cannot be retrieved by backups, it’s losing it outside the realms of who should be allowed to see it. This happens all the time, as shown by the British Government’s suppliers, unknown small outfits like PA Consulting, EDS, Atos Origin, etc etc. I could go on and on, but you can read about countless others here.
This can lead to it falling into the hands of those you don’t want to have the data, but in a passive way. As we know, august organizations like SAP have allegedly filched data in a less passive way as well.
Another very recent one where they did actually lose it completely was magnolia.com, not really a business critical service, but certainly affected those users that had invested their IP for up to 3 years.
Your data can be easily converted into cash. For someone else.
For data that has been lost or stolen, there is almost certainly a ready market for that data if it is in the hands of a less ethical organization. Of course, it requires an unethical organization to purchase and use the data, but I don’t think they are in short supply either, especially if the data can severely hurt a competitor or dramatically help the business. In these lean times, it may be the case that the moral high bar is lowered even more.
This may be the unethical company itself, or far more likely, some disgruntled employee that wants to make a quick buck.
New territory in the Sarbox / Gramm-Leach-Bliley world.
Data bureaux are nothing new, industry has been outsourcing data processing for years, but this has been mainly in administrative areas such as payroll, or transactional such as SWIFT. This stuff is pretty tedious and not easy to get any kind of edge on your competitors with.
Salesforce.com are the SAAS darlings, but they have already had their data loss moment. And that’s only the one that was public. One might say that the information held on Salesforce.com is not that critical, but it certainly might be very useful to your competitors. However, you’re not likely to get hauled over the coals in the court of Sarbox for a competitor poaching your deals.
Once you start handing over key financial data to a third party, then the CEO and CFO are signing off on their deeds too, since you are responsible for the data, not the third party.
You probably need to think about buying insurance for this eventuality.
Another consideration is where in the world your data is stored, in the nebulous cloud, as not all geographic locations are equal, as regards privacy.
Under new management.
To use Salesforce as an example, they have Cognos as a customer. I don’t know if that’s still true, but let’s say it is. Now, our old friends SAP decide to buy Salesforce.com. Allegedly no strangers to a bit of data voyeurism, it would not be beyond the realm of the imagination (hypothetically, of course) that they may let the Business Objects folks (sorry, SAP Business Objects) take a sly peek.
On the more mundane side, should a more high quality vendor divest a SAAS business to a smaller, less blue-chip organization, you have a review and possible migration on your hands. See the Satyam debacle for the sort of ructions switching an outsourcer creates, especially in the context of a disastrous event.
Who pays the integration costs?
The fly in the ointment in the nirvana of throwing the problem over the side and getting the low capital outlay, useful BI within weeks etc etc is the dirty old job of integration. It’s generally one of the most painful aspects of the BI stack even working within the organization, but then dealing with the issues of feeding an external provider makes it even hairier.
In the case of Salesforce or other outsourced data, it’s far less of a problem, since theoretically, the outsourcer can just easily suck that data using clean, documented APIs. However, there are costs involved in moving the data to two sites, the usual operational use of the customer and the BI use of the outsourcer. That could be bandwidth or other charges for data exporting etc, or when the SAAS fraternity wake up and start creating a new license and premium for providing your data to external entities. Kind of like the oil companies keeping the price of diesel high (in the UK anyway), so those folks trying to save money by buying a car with better economy end up paying roughly the same anyway.
So what’s the mood?
I observed a very interesting straw poll at the 2009 Gartner BI Conference in the Hague. At a large session, Donald Feinberg of Gartner asked the audience how many were considering SaaS BI. The show of hands was either non-existent or maybe just one. The reason, trust. I imagine the attendees at this type of conference are more at the larger end of the enterprise spectrum, so there may be more interest in the lower leagues.
January 29, 2009
As promised in the Mini-Summary, which was written in some haste to appease those who weren’t enjoying the 24-hour party city that isn’t The Hague, a little (in fact, rather a lot) more on what went on at the Gartner BI Summit. In the Mini-Summary, I covered the keynote, in somewhat light detail. It was probably enough to give a flavour.
I’ll outline the sessions that I attended so you know I wasn’t in the Hague for the crack. It’ll serve as an aide-memoire for me too. It was great to meet up with some of the folks I met on Twitter and also others that I first met at the conference. On with the summit.
Data Integration Architecture and Technology: Optimizing Data Access, Alignment and Delivery – Ted Friedman – Gartner
This is an area of interest to me, as one of the products I look after is firmly in this space. A very good presentation containing plenty of survey-based facts, and a case study on Pfizer, who have a ‘data integration shared services’ organization. I suppose this is a DI version of the Business Intelligence Competency Centre.
ETL is still by far the largest area of DI, with replication, federation and EAI following. In addition, standardization of DI tools/architecture within organizations is still some way off.
The high-level message was that Data Integration is an absolute foundation of any data operations, whether BI or otherwise. Without good DI, you just end up with the old GIGO scenario. Not too much new for me, as to be expected, but Ted did put the kibosh on BI as a service by reflecting my own personal view that in most cases, these data environments are ‘too diverse’ to lend themselves easily to the SAAS model, being hamstrung by the data integration piece of the puzzle. Narrow, specialized solutions can work, as well as simple data environments. However, as was pressed home later in the conference, that’s not the main reason BIaaS will not be as popular as many are projecting.
This session started with Timo mashing up some Obama data in Xcelsius and was generally designed to show that SAP Business Objects still has some innovation to show, even now it is part of the Walldorf Borg. The main highlight (from their point of view) was Polestar. I took a very quick look at the site, but was diverted by the typos “quentity” and “dectect” as well as noting it was not tested on IE8, so I left it for another day. Looks interesting though.
SAP generously conceded that less than 50% of business data exists in SAP. I am assuming they mean within organizations running SAP. Even then, that’s probably an underestimation. To that end SAP are introducing federation capabilities.
The Role of BI in Challenging Economic Conditions – Panel Discussion
The panel consisted of some large customers from around Europe. They were giving their views on how the climate affected their BI activities. Key points here include reducing the number of tools and vendors in the BI stack and squeezing licence costs, either by forcing prices down via negotiation, redeploying seldom used licenses or other BI ROI audit activities. Naturally, I imagine some licenses will become available as headcount shrinks this year.
The customers were focusing their BI efforts more on costs than on revenue and margins, which were previously the focus. In this uncertain environment, the speed of decision making is critical and some of the selection criteria for BI tools and initiatives have changed a lot. One of the customers noted that they used to talk about the look and feel and get down to details such as fonts; now it’s “how much, how fast to implement?”
BI is going to be more tactical for the short term, with small-scope projects targeted at answering key questions quickly.
Emerging Technologies and BI: Impact and Adoption for Powerful Applications – Kurt Schlegel – Gartner
This session looked at the macro trends in BI, which were as follows:
- Interactive Visualization (didn’t DI-Diver do this back in the late 90’s?)
- In-Memory Analytics
- BI Integrated Search (they showed Cognos here, but strange there was no mention of MOSS 2007 / BDC which does this quite nicely)
- SaaS (showed a good example where the SaaS provider had a ton of industry information that could be leveraged for decision making, rather than just some generic solution shipping in-house data back and forth)
- SOA / Mashups
- Predictive Modelling
- Social Software
None of this was new to me, but there were some good case studies to illustrate them and the SaaS example was the most realistic I’d seen from a business benefits point of view.
Using Corporate Performance Management to understand the drivers of Profitability and Deliver Success – Nigel Rayner – Gartner
This was an area I wasn’t too familiar with, but Nigel Rayner did an extremely good job of pitching the information and delivery so as not to overwhelm novices, while not oversimplifying and thus boring the teeth off seasoned practitioners.
Kicked off with increasing CEO turnover, then how the market measures CEO performance. Most organizations don’t have a handle on what actually drives profitability, which is where CPM can help with profitability modelling and optimization. The whale curve was discussed and Activity Based Costing.
A key point that was made is that BI is very often separate from financial systems and CPM links the two together.
Driving Business Performance in a Turbulent Economy with Microsoft BI – Kristina Kerr – Microsoft , Ran Segoli – Falck Healthcare
MS BI case study, focusing on the cost-effectiveness and speed to implement of Microsoft BI. I have had a lot of exposure to the stack and other case studies, so didn’t make notes. Sorry.
Does BI=Better Decision Making – Gareth Herschel – Gartner
Really enjoyed this session, for the main reason that this was a welcome step back from BI per se, and looking at decision making in general. He looked more at the theory of decision-making first, then linked that to BI.
The first area was predicting (root cause) or managing events. If this can be done effectively, then the increased speed of detection can allow more time to make appropriate decisions, especially as the more time you have, the more options you have available. This ties in to CEP (complex event processing) and BAM (business activity monitoring). In addition, data mining can assist in predicting events and scenarios.
This is a discipline that must be constantly reviewed, as what happens when prediction and analysis disagree with reality? Either the situation has changed, or you didn’t understand it correctly in the first place.
He went through 4 key approaches to decision making and their rating of explicable vs instinctive and experience required.
- Recognition Primed
- Thin-slicing (“Blink”)
This fed in to information delivery methods. This would be selective displays such as dashboards, alerts/traffic lights, or holistic displays such as visualization, which are more ‘decision-primed’ than data-centric displays such as tabular representations.
It was clear that he saw visualization and very narrow, selective displays as the best way to aid decision-making.
In my opinion, all that’s fine and dandy, if you’re measuring and delivering the right targeted data 100% of the time, otherwise it is very easy to be blindsided.
Would certainly seek him out at other Gartner events for some thought-provoking content.
Gareth made some good book recommendations:
Various Dan Gilbert stuff on Emotional Decision Making – This is his TED talk.
A very good session, surprising at least one of the open source advocates in the audience with its upbeat message. A highlight was Donald Feinberg’s prediction that Unix is dead and the funeral is in 30 years. This is in response to Unix ceding to Linux in the DBMS world. It appears Gartner have relaxed their usual criteria in order to give OSS a chance to be evaluated based on support subscription revenue.
Feinberg also strongly recommended that anyone using Open Source must get a support subscription, to do otherwise being tantamount to lunacy.
On to the BI side of OSS, where market penetration is low, with less than 2% of Gartner-surveyed customers using it. However, it is a growth area, with small ISVs using it as an OEM strategy for their BI requirements.
The functionality gaps are getting smaller between commercial and OSS, with Reporting, Analysis, Dashboarding and Data Mining all now being offered, but still no Interactive Visualization, Predictive Modelling, Mobile play, Search or Performance Management.
On the DI side, other than the Talend/Bitterer argument, it’s not hotting up too quickly. DI is mostly limited to straight ETL of fairly meagre data volumes, daily batches of around 100K records.
Functionality gaps here are in the following areas: Metadata management, Federation/EII, Replication/Sync, Changed Data Capture, Unstructured Content, Native App Connectivity and Data Profiling/Quality.
An overarching issue to adoption in all areas is the lack of skills.
An interesting scenario that was floated was the creation of an open source stack vendor, namely Sun, snapping up some of the OSS BI players.
The Right Information to the Right Decision-Makers — Metadata Does It – Mark Beyer – Gartner
This was a useful presentation for me, as I am familiar with metadata, but not the systems and processes used to manage it. So the definition of metadata as per Gartner is data ABOUT data, not data about data. Metadata describes and improves data, unlocking the value of data.
I knew some classic metadata SNAFUs such as projects where measurements across country-separated teams were in metric or imperial, leading to untold costs.
Some others that Mark mentioned were very amusing, such as the data members of Gender. I can’t recall the exact figures, but one government organization had 21.
On to why metadata matters in decision making – it can be an indicator of data quality, it can indicate data latency and can provide a taxonomy of how to combine data from different areas.
In addition, metadata can help provide a business context of the data, in addition to mapping importance, user base and various other elements to give an idea of how critical data may be and the effects of improving that data or the impact of any changes in the generation or handling of the data.
Obviously SOX and Basel II also put increased pressure in managing metadata for the purposes of compliance, governance and lineage.
I think the takeaway for me was this, in terms of key questions that metadata should seek to answer.
- What are the physical attributes of the data (type, categorization etc) ?
- Where does it come from?
- Who uses it?
- Why do they use it?
- How often do they use it?
- What is its quality level?
- How much is it worth?
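To make those questions a bit more concrete, here is a minimal sketch of a metadata record that holds one answer per question and reports which ones are still open. This is my own illustration, not anything Mark presented: the class name `DatasetMetadata`, its field names and the sample values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DatasetMetadata:
    """One record per data element; every name here is an illustrative assumption."""
    name: str
    data_type: str                                      # physical attributes (type, categorization)
    source_system: str                                  # where does it come from?
    consumers: List[str] = field(default_factory=list)  # who uses it?
    purpose: str = ""                                   # why do they use it?
    uses_per_month: Optional[int] = None                # how often do they use it?
    quality_score: Optional[float] = None               # quality level, 0.0 to 1.0
    estimated_value: Optional[float] = None             # how much is it worth?

    def unanswered(self) -> List[str]:
        """Return the key questions this record does not yet answer."""
        gaps = []
        if not self.consumers:
            gaps.append("who uses it")
        if not self.purpose:
            gaps.append("why they use it")
        if self.uses_per_month is None:
            gaps.append("how often")
        if self.quality_score is None:
            gaps.append("quality level")
        if self.estimated_value is None:
            gaps.append("worth")
        return gaps


# Hypothetical example: usage is known, quality and value are not.
revenue = DatasetMetadata(
    name="monthly_revenue",
    data_type="decimal",
    source_system="general ledger",
    consumers=["finance"],
    purpose="month-end reporting",
    uses_per_month=4,
)
print(revenue.unanswered())
```

The point of the `unanswered` method is simply that the gaps are as informative as the answers: a data element nobody can put a quality level or a value on is a candidate for attention.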
Stupidly, I ran out of paper, so had to take some notes on the phone. I don’t like doing that as it looks like you’re texting people, or Twittering. So, I limited myself to the bare minimum.
PerformancePoint is weak with respect to the competition. I guess it’s even weaker now they’ve ditched planning.
Donald Feinberg is not a fan of SaaS BI. A view I agree with, partly due to the data integration issues in the real world, as highlighted by Ted Friedman, earlier in the week. So, Donald decided to do a straw poll on who would be interested in/ implementing SaaS BI. I think there might have been 1 person, but possibly zero. There goes a bunch of predictions for 2009. The reason for this reticence was one of trust, they just don’t want to throw all this over the firewall.
Another straw poll was the consolidation to a single vendor, most are doing this and very few said they were going to buy from a pure play vendor. I suppose you have to take into account the self-selecting group of end users at a Gartner BI summit though.
BI Professional – Caught in a Cube? – Robert Vos – Haselhoff
Entertaining presentation, but I was suffering with a bad cold and insufficient coffee, so didn’t get the full benefit. He did help me wake up fully for the next presentation, so can’t have been all bad. No talk of vendors and technology per se here, more stepping back and looking at strategy, organizational elements and making BI work from a people perspective.
This was an interactive session. Like a mock exam for BI folks where a bunch of people were randomly put in groups and asked to design a BI strategy. The results were pretty good and Andy Bitterer’s wish that they didn’t start naming vendors was fulfilled. However, I did note an issue with people really thinking details first, rather than strategy first. I also found it slightly strange that the CEO did not tend to come up as a contender for involvement. I saw more of this in Nigel Rayner’s CPM presentation, with CPM giving the CEO insight into profitability, so it seems to me to make absolute sense to have CEO involvement in the BI strategy, since the BI goals need to be aligned with the business goals. Some others did pick up on the alignment, but still saw it as in the CIO remit. All in all a pretty good showing, but the IT and ‘the business’ lines were still visible, if somewhat more hazy than before.
I took a LOT of notes in this session, so I’ll try and boil it down. Typical situation is a bunch of folks in the boardroom, all claiming different numbers. This leads to risky decision-making if unnoticed and a huge time sink reconciling when it is noticed.
Once again, there is a turf aspect involved, with data being considered IT’s problem, so they should be responsible for data quality. However, IT is not the customer for the data, so don’t really feel the pain that the business feels from data quality issues. In addition, IT don’t know the business rules or the domain expertise. It’s not a pure technology problem, but IT need to be involved to make it work.
There were some examples of the costs of bad data quality, leading to working out ROI for investment. With Sarbox et al, of course there is a new cost involved for the CEO/CFO, the one of going to jail if the numbers are wrong.
Another aspect of the ROI was based on the level of data quality. It may be that 80% is enough, especially when the move to 100% is astronomically expensive. The return on incremental improvements needs to be assessed.
So, who’s on the hook for DQ then? Data stewards, who are seen as people that take care of the data, rather than owning it (the organisation owns it). They should know the content and be part of the business function, rather than IT.
An example of exposing data quality within an organisation was a DQ ‘scorecard’. This gives an idea of the quality, in terms of completeness, duplication, audited accuracy etc. A problem that I see with this is a kind of data quality hubris versus a data quality cynicism. If it works well, then the scorecard can give the right level of confidence to the decision makers, but if not, then it could lead to overconfidence and less auditing.
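For the curious, a scorecard of this kind can be roughed out in a few lines. This is a minimal sketch under my own assumptions, not the one shown in the session: the function `dq_scorecard`, the field names and the sample customers are hypothetical, and it only covers completeness and duplication (audited accuracy needs a human).

```python
from collections import Counter


def dq_scorecard(records, required_fields, key_fields):
    """Return completeness and duplication rates for a list of dict records."""
    total = len(records)
    if total == 0:
        return {"records": 0, "completeness": None, "duplication": None}

    # A record is complete when every required field is present and non-blank.
    complete = sum(
        1 for r in records
        if all(str(r.get(f) or "").strip() for f in required_fields)
    )

    # Records sharing the same normalised key count as duplicates
    # (every copy after the first).
    keys = Counter(
        tuple(str(r.get(f) or "").strip().lower() for f in key_fields)
        for r in records
    )
    duplicates = sum(n - 1 for n in keys.values())

    return {
        "records": total,
        "completeness": complete / total,
        "duplication": duplicates / total,
    }


# Made-up sample data for illustration only.
customers = [
    {"name": "Acme Ltd", "phone": "020 7946 0000"},
    {"name": "acme ltd ", "phone": ""},
    {"name": "Globex", "phone": "0161 496 0000"},
    {"name": "Initech", "phone": None},
]
scorecard = dq_scorecard(customers, ["name", "phone"], ["name"])
print(scorecard)
```

Note that the numbers are only as good as the rules behind them: a naive key like a lower-cased name will miss "Acme Limited", which is exactly the overconfidence risk mentioned above.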
So, operationally the key elements are:
- Data Profiling / Monitoring – e.g. how many records are complete.
- Cleansing – de-duping & grouping
- Standardization – rationalizing SSN formats, phone nos etc
- Identification & Matching (not 100% sure here, I see some of this in cleansing)
- Enrichment – bringing in external data sources, e.g. D&B to add more value to the data.
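A few of the elements above (standardization, de-duping and enrichment) can be sketched as small, reusable functions. This is a minimal illustration under my own assumptions: the function names, the digits-only phone format and the `sectors` lookup standing in for an external source are all hypothetical, and real matching is far fuzzier than an exact key comparison.

```python
import re


def standardize_phone(raw):
    """Standardization: keep digits only, so differently punctuated numbers compare equal."""
    return re.sub(r"\D", "", raw or "")


def dedupe(records, key):
    """Cleansing: keep the first record seen for each key value."""
    seen = set()
    unique = []
    for r in records:
        k = key(r)
        if k not in seen:
            seen.add(k)
            unique.append(r)
    return unique


def enrich(records, reference, key, field):
    """Enrichment: add `field` from an external lookup table where the key matches."""
    return [{**r, field: reference.get(key(r), r.get(field))} for r in records]


# Made-up sample data for illustration only.
contacts = [
    {"name": "Ann Lee", "phone": "(555) 010-9999"},
    {"name": "Ann Lee", "phone": "555.010.9999"},
    {"name": "Bob Roy", "phone": ""},
]

# Standardize first, otherwise the two Ann Lee rows would not match.
standardized = [{**c, "phone": standardize_phone(c["phone"])} for c in contacts]
unique = dedupe(standardized, key=lambda c: (c["name"].lower(), c["phone"]))

# Hypothetical external reference data keyed on lower-cased name.
sectors = {"ann lee": "Retail"}
enriched = enrich(unique, sectors, key=lambda c: c["name"].lower(), field="sector")
print(enriched)
```

The ordering matters: standardizing before de-duping is what lets the two differently formatted phone numbers collapse into one record.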
Ideally DQ should be services, which are then reusable and repeatable – used by many different data sources. SOA model, although SOA is supposed to be dead isn’t it? Who knows, maybe the term has died – the technology and approach certainly lives on.
Lastly DQ ‘Firewalls’ were discussed. This is a set of controls used to stop people/systems from poisoning the well. Inbound data is analyzed and given the elbow if it isn’t up to snuff. It even incorporates a ‘grass’ element, where DQ criminals are identified and notified.
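As a rough idea of what such a firewall might look like, here is a minimal sketch, assuming inbound records arrive as dicts carrying a `source` field. The function `dq_firewall`, the rule names and the sample records are my own hypothetical choices, not anything shown in the session.

```python
def dq_firewall(inbound, rules):
    """Split inbound records into accepted and rejected lists.

    `rules` maps a rule name to a function that returns True when a record passes.
    Each rejection keeps the record's source and the names of the failed rules,
    so the offending person or system can be notified.
    """
    accepted, rejected = [], []
    for record in inbound:
        failed = [name for name, check in rules.items() if not check(record)]
        if failed:
            rejected.append(
                {"source": record.get("source"), "failed": failed, "record": record}
            )
        else:
            accepted.append(record)
    return accepted, rejected


# Hypothetical rules for illustration only.
rules = {
    "has_customer_id": lambda r: bool(r.get("customer_id")),
    "amount_not_negative": lambda r: isinstance(r.get("amount"), (int, float))
    and r["amount"] >= 0,
}

inbound = [
    {"source": "crm", "customer_id": "C1", "amount": 120.0},
    {"source": "web_form", "customer_id": "", "amount": -5},
]
accepted, rejected = dq_firewall(inbound, rules)

for item in rejected:
    print(item["source"], "failed", item["failed"])
```

The rejected list is the ‘grass’ element: it records who sent the bad data and which rules it broke, rather than silently dropping it.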
The conference was starting to take its toll by this point, with a flu-like cold and no more tablets left. Add that to a few pretty late nights, notably with folks from Sybase, Kognitio, the BeyeNetwork, end-users and even Gartner (not analysts, I hasten to add) and the writing is on the wall. Deciphering my handwriting is like translating hieroglyphics written by a 3 year old.
So, the summary of this session is ultra-short.
- BI MQ SAP/BO moved down a little, counterintuitive to some.
- DI MQ Data services / SOA capability is key. Tools need to supply and potentially consume metadata to play well in a real world environment. Currently 54 vendors ‘eligible’ for this MQ
- DQ MQ Pace of convergence between DI and DQ is increasing, it will become critical. Acquisitions will increase from DI vendors having to fill out their feature sets.
Overcoming The BIg Discrepancy: How You Can Do Better With BI – Nigel Rayner
I made a herculean effort to stay conscious in this session, mainly because I had enjoyed Nigel’s CPM session and he proved also to be a very nice chap when we chatted over a cup of coffee earlier in the week. In addition, I had paid for the 3rd day, so was going to extract every drop of value ;-)
Nigel kicked off with “the downturn”, of course. The message was do not hit the panic button. BI and PM will play a key role in navigating the downturn:
- Restoring business trust
- Understanding why, what and where to cut
- Identifying opportunities of business growth, or which parts of the business to protect
There was some reality also, in that it is unlikely that “Big BI” projects will be approved in the short term and you will need to do more with what you already have.
The plan of attack is the 3 I’s – Initiatives, Investments and Individual Actions
- BI/PM Competency Centre
- BI and PM Platform Standards
- Enterprisewide Metrics Framework
- Inject Analytics into Business Processes
Prioritization of investments is critical. Targeted short-term, cost-effective investments are the order of the day. Some suggestions include:
- Data Quality
- Data Mining
- Interactive Dashboards
- CPM Suites
There was a mention of ‘Spreadsheet hell’ being addressed by CPM.
- Take advantage of key skills as companies undertake knee-jerk cost-cutting, AKA get good laid-off people on the cheap.
- Redeploy key employees to tactical, short term roles rather than RIF-ing them.
- Respect “conspicuous frugality” but don’t be defined by it.
- Learn from others (e.g. BI award winners, case studies, social networks)
- Evangelize BI
Then, it was a mad rush for the taxi to the station.
For more, detailed coverage of the event, check out Timo Elliott’s blog post.
January 21, 2009
Not going to be a long post, this, but wanted to get a few things down. Will edit/append later with some info from other sessions. Look at my Twitter feed also for some snippets.
Keynote: ‘The BIg Discrepancy: Strategic BI, but no BI Strategy’
BI Analyst Techno with Andy Bitterer and Nigel Rayner AKA Star Schema. BI-related refrains to the sound of banging techno with an ‘Every Breath You Take’ melody.
Once again, BI is #1 on CIO agenda. This has been the case since 2006, but not much further along. This is more due to human factors than BI tools. Many organizations don’t appear to have a strategy for BI and there are still problems in the following areas:
Still a lot of silo thinking and a proliferation of tools. Adding that to internal politics leads to a heady mix.
A straw poll revealed only 15 hands from customers that had a formal BI strategy. Some BI from Gartner is needed about how many customers were in the keynote, even an approximation would help.
Another key point was the ability of BI to support change, as well as the effects of making changes to variables. BI must be able to adapt quickly to external conditions, such as the ability to optimize to cost reduction instead of revenue growth, for example.
I was left with the impression that the biggest problem is not the tools, but the bad craftsmen (in the nicest possible sense).
Another idea was that BI Competency Centres would help to make BI initiatives succeed, since they would likely be tasked with addressing the problem areas above explicitly.
The next musing was why does IT often sell BI to the business, when it really should be the business users driving it. One possible problem with business users creating BI requirements is that they may not know what is or isn’t possible, resorting to the comfort of reports when asked to define their own requirements. I suppose this is another plus point for a BI competency centre, which could serve the function of business user training and/or demonstration of the techniques and technologies available.
As a follow-on to this, the point was made that IT building BI systems in isolation, away from business users will very likely lead to failure.
This all reminds me of the work I did on ‘Expert Systems’ shells back in the early 90s, with the Knowledge Engineer (read IT BI person) and Domain Expert (read business user) working in conjunction. This was a pre-requisite of the approach, not a nice-to-have, as it seems to be with BI.
Unfortunately the web seems to have failed me for any really good references and solid examples of this, certainly none as detailed, iterative or collaborative as the processes we were using at IBIES back in 1992.
From Engelmore & Feigenbaum:
‘A knowledge engineer interviews and observes a human expert or a group of experts and learns what the experts know, and how they reason with their knowledge. The engineer then translates the knowledge into a computer-usable language, and designs an inference engine, a reasoning structure, that uses the knowledge appropriately’
Then they went through the 2009 Predicts, which are available here.
I’ll probably add updates later, just wanted to get some info on the keynote down.
December 19, 2008
Business Intelligence is a term that covers a multitude of sins. It is also a term which is extremely open to interpretation, depending on your viewpoint, technology mastery, user skillset and information environment.
Creating new terms, especially acronyms, is what the technology industry does best. They delight in it, but it does serve some purpose other than the amusement of marketing folks and analysts.
To go back to an old paradigm, creating labels or categories is an essential part of the market. Not just the BI market, but any village market, or supermarket.
Categories help consumers navigate quickly to the types of products they are interested in, like finding the right aisle to browse by looking up at the hanging signs in the supermarket, or the area in the village market where the fruit vendors gather. Labels give more information, such as pricing, size etc and then it is down to the product packaging and the rest of the marketing the consumer has been exposed to in terms of advertising, brand awareness and so on.
Business intelligence is a pretty long aisle. At one end, the labels are pretty narrow but at the other, very very wide, to accommodate the zeros after the currency symbol and ‘days to implement’ information.
The problem is the long aisle – vendors need to break that aisle up into manageable (walkable) segments to help the customer navigate quickly to the solution they need.
The other problem is that in this case, the supermarket is not in charge of the category names, not even the vendors or analysts are – it’s a free for all.
This means chaos for the poor consumer, all capering around in the aisle like some kind of Brownian motion.
Thinking about this, after being bombarded with a panoply of BI terms lately, I thought of INCOTERMS, which is a standard set of terms used in international sales contracts. These terms are strictly defined and published by an independent organization so that both parties in the transaction know exactly where they stand.
According to Boris Evelson of Forrester, Business Intelligence is “a set of processes and technologies that transform raw, meaningless data into useful and actionable information”
Not sure about that one myself – who acquires and stores meaningless information? Other than maybe Twitter. Other suggestions most welcome. It might help show the possible technologies Forrester are referring to.
This certainly excludes my product, since we work with data that theoretically, people are probably already making decisions from. They just need to slice and dice it differently.
The concept of transforming raw data is easier to work with (in the Forrester BI definition sense anyway) as it could refer to something like a web log, which is pretty difficult to gain any insight from by looking at it in a text editor, unless you have an eidetic memory and the ability to group and summarize the various keys and measures in your head.
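To make the web log point concrete, here is a minimal sketch of the "group and summarize" step that most of us cannot do in our heads. It assumes the log is in Common Log Format; the sample lines, the `summarize_by_path` name and the choice of path as the key are all made up for illustration and are not taken from any particular product.

```python
import re
from collections import defaultdict

# Hypothetical sample lines in Common Log Format (not from a real server).
SAMPLE_LOG = """\
192.0.2.1 - - [19/Dec/2008:10:00:01 +0000] "GET /index.html HTTP/1.1" 200 1024
192.0.2.2 - - [19/Dec/2008:10:00:05 +0000] "GET /pricing.html HTTP/1.1" 200 2048
192.0.2.1 - - [19/Dec/2008:10:01:12 +0000] "GET /index.html HTTP/1.1" 200 1024
192.0.2.3 - - [19/Dec/2008:10:02:40 +0000] "GET /missing.html HTTP/1.1" 404 512
"""

# One named group per field we care about; the rest of the line is skipped.
LINE_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\d+|-)$'
)


def summarize_by_path(log_text):
    """Group log lines by requested path (the key) and total hits and bytes (the measures)."""
    summary = defaultdict(lambda: {"hits": 0, "bytes": 0})
    for line in log_text.splitlines():
        match = LINE_PATTERN.match(line.strip())
        if match is None:
            continue  # skip anything that does not look like Common Log Format
        row = summary[match.group("path")]
        row["hits"] += 1
        size = match.group("size")
        row["bytes"] += 0 if size == "-" else int(size)
    return dict(summary)


if __name__ == "__main__":
    for path, row in sorted(summarize_by_path(SAMPLE_LOG).items()):
        print(path, row["hits"], row["bytes"])
```

Swapping the key from path to host or status code is a one-line change, which is really all "slice and dice" means at this level.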
Now, as often is the case when you start writing about a topic, the research you do unearths people who have written pretty much the same thing before you.
Going back to definitions, finding Todd Fox’s decent definition of BI – “A generic term to describe leveraging the organizations’ internal and external information assets for making better business decisions.” from a define:Business Intelligence search on Google, leads to Todd’s own attempts from a Data Warehousing perspective, which, in turn, were prompted by James Taylor’s post on the confusion around the term analytics (in the context of BI). In addition, even Kimball was involved, with his “Slowly Changing Vocabulary” section in one of his books.
This at least tells me I’m on the right track, if not entirely original.
In 1989 Gartner’s Howard Dresner defined BI as “a set of concepts and methods to improve business decision making by using fact-based support systems”
More definitions can be found from Larry English and probably ad infinitum, or at the least, ad nauseam.
The depressing thing here is that we have only got as far as the “umbrella term”, as BI is becoming popularly known.
<Aside> A Dutch student at the University of Amsterdam even wrote a paper titled “Business Intelligence – The Umbrella Term” complete with an umbrella graphic on the cover page. (It’s a doc file, so I won’t include the link. Google it if you’re interested)</Aside>
When we start to address even Forrester’s BI buzzword hoard, never mind the others out there, it begins to lead to a total breakdown in the tried and tested categorization mechanism.
To revisit the source of the proliferation, it appears that analysts (likely as a proxy for large vendors) and vendors themselves are the main culprits. The analysts, by virtue of some level of independence and a cross-vendor view can be seen to be the arbiters of the terms. The problem here is that the analysts often use slightly different terms or at least different meanings for the same terms.
Naturally, both vendors and analysts want to proliferate and blur terms to aid in differentiation, or to try to give the perception of innovation and progress.
This is very seldom the case, though, as new terms are often just fracturing or rehashing existing categories and terms.
However, in some cases, drilling down into more narrow categories or updating terms due to changes in technologies or approaches is not necessarily a bad thing, if the terms/categories still aid in establishing a contract of understanding between vendor and consumer.
If we want to accommodate this, the ability to establish a common understanding, based on input from across the board – analysts and vendors, would be beneficial to all. The problem is, you need a real independent organization that can accommodate the horse-trading, as well as maintaining an authoritative definition of terms which is acceptable to all parties.
Some amusing aspects of this I can foresee would be “Active Data Warehouse” – would you then have to create a new term, “Passive Data Warehouse”, to group the applications that did not fit the criteria of “Active”? I imagine a semantic arms race that would have to be kept in check – IBMCognoStrategyFire pushes for a “Smart ETL” category, which forces the other ETL vendors into the “Dumb ETL” pigeonhole. Dealing with this is what standards bodies do.
This is more musing than actually being stupid enough to think this is ever going to happen. I do have sympathy with the poor customer trying to navigate the shelves of the BI supermarket though. As someone just trying to keep a lazy eye on the machinations of the industry, it can be overwhelming.
Here’s a short quiz.
What BI term does this refer to?
“centralized repository of information about data such as meaning, relationships to other data, origin, usage, and format.”
No. Much earlier.
Maybe we could just provide a thesaurus, so when someone is puzzling over the latest buzzword, they can look it up and say ahh, I know what that is, we tried to implement something like that back in the early nineties.
UPDATE: Read this excellent article from Colin White – I didn’t see it before I wrote this – I promise!
November 26, 2008
Following an exchange with some BI folks on Twitter, in addition to the various articles on spreadmarts and compliance issues with Excel usage by those pesky ‘users’, there is definitely an element of the old ‘high priest’ model in all of this. However, it is most likely economics that calls the tune.
In essence, when something new is introduced, whether that be technology, or religion, there is always a period of control by the early adopters (and indeed charlatans), who are naturally keen to implement a hierarchy to benefit themselves. They become the required intermediaries in order for the masses to get what they want. In the IT world, I don’t necessarily claim it’s full of charlatans, or that the hierarchy is there for job security, but the traditional model is what it is.
Quakers, for example, broke away from this and decided to cut out the middle men.
In their case, it was more a theological gap, but with BI, the gap is the one between supply and demand.
In real world BI today, this is reflected by the use of familiar and available tools, primarily Excel, to bridge requirement/delivery gaps. To an extent, Microsoft have historically recognized this and provided plenty of rope for users to hang themselves with.
To backtrack a little and revisit what I consider to be BI – some folks associate BI solely with the more complex (and/or esoteric) analysis found with data mining and heavy statistics on massive data sets, for example the (sadly untrue) beer and nappies (diapers) story. I am a little more catholic than that.
To lift an idea from Michael Gentle, here’s Karl Marx on complex BI:
“A commodity appears at first sight an extremely obvious, trivial thing. But its analysis brings out that it is a very strange thing, abounding in metaphysical subtleties and theological niceties.”
BI is all that, but covers much more of the mundane as well. If you cannot easily find out how many of a certain widget you sell in the week before a holiday, or even how much of a certain widget you sell in a given geographic area, then when you eventually get your hands on this information, it’s business intelligence, or decision support, depending on your vintage.
“From each, according to his ability; to each, according to his need”
Therein lies the rub.
The maths just doesn’t add up in the arms race of IT vs information workers. IT’s ability is completely swamped by the needs of users. Nothing new here; I would contend that this is accepted wisdom.
To give you an example. Of 50 finance users, every week, 5 will require new custom information. Sounds reasonable. However, I am talking about a real world example here. So, each piece of new information could come directly, or from a combination of 20 different systems. In addition, some systems are not in-house, so a short specification has to be written, sent to the provider, a quote comes back, is examined, eventually approved and implemented. So we are talking about 250 requests per year, but as you can see, these requests can be pretty costly and/or difficult to fulfill. <potential_product_plug_warning>I know about this stuff, as our product is often used in a guerrilla-style way to sidestep these issues</potential_product_plug_warning>.
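For anyone who wants to check the sums, here is a rough back-of-envelope sketch. The 5 requests a week and the 20 systems come from the example above; the 50 working weeks is my assumption to arrive at the 250 figure, and counting every non-empty subset of systems is only an upper bound on how many ways a request could be sourced, not a claim that all of them occur.

```python
# Figures from the example above.
REQUESTS_PER_WEEK = 5
SOURCE_SYSTEMS = 20

# Assumption: roughly 50 working weeks in a year, which gives the 250 quoted.
WORKING_WEEKS_PER_YEAR = 50

requests_per_year = REQUESTS_PER_WEEK * WORKING_WEEKS_PER_YEAR

# Upper bound: every non-empty subset of the 20 systems is a possible source mix.
possible_source_combinations = 2 ** SOURCE_SYSTEMS - 1

if __name__ == "__main__":
    print("Requests per year:", requests_per_year)
    print("Possible source combinations:", possible_source_combinations)
```

The request count is modest; it is the number of places each answer might have to come from that swamps a central team.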
The planned economy just don’t work here my friends, we’re going to have to consult Adam Smith.
Information needs come from many sources, many levels of seniority and also have different profiles in terms of how time-critical they are. The internal market for information, when in a planned economy scenario (i.e. IT-centric) is often dictated by how important the information requestor is and how easy the request is to fulfill – to meet internal SLAs and so forth.
“The monopolists, by keeping the market constantly understocked, by never fully supplying the effectual demand, sell their commodities much above the natural price.”
When the market is opened up to users, who generally know their data and exactly what to do with it – look – there’s the benevolent figure of Adam Smith again!
“The greatest improvement in the productive powers of labour, and the greater part of the skill, dexterity and judgement with which it is any where directed, or applied, seem to have been the effects of the division of labour”
And the users rejoice (whilst mangling a Smith quote):
“The natural effort of every individual to do their own analysis … is so powerful, that it is alone, and without any assistance, not only capable of carrying on the society to wealth and prosperity, but of surmounting a hundred impertinent obstructions with which the folly of IT too often encumbers its operations.”
However, much as users would rejoice in having these freedoms, there is a downside.
“The property which every man has in his own labour; as it is the original foundation of all other property, so it is the most sacred and inviolable… To hinder him from employing this strength and dexterity in what manner he thinks proper without injury to his neighbour is a plain violation of this most sacred property.”
The downside is the small matter of the injury to his neighbour, which in this case is probably a whole street of neighbours – Database Team(s), Apps Team(s), Data Quality folks, Internal Audit/Compliance etc etc.
So how do you resolve the conundrum of creativity/productivity vs control/compliance?
Microsoft’s stab at this is Gemini, using an analogy of the twin aspects of IT control, with ETL processes feeding into SQL/DATAllegro/Analysis Services, and the user empowerment of the ubiquitous, familiar Excel client plus the “social” aspect of sharing their creative works through SharePoint.
I have only limited information on Gemini at the moment, so my summary is probably sketchy, although I hope, still accurate.
The hope is, as I see it, to prevent the users from poisoning the well, while still allowing them to drink deeply with the paternalistic arms of IT around them. Kind of like socialism, rather than rabid, laissez-faire capitalism.
Hat tips & further reading
Sean O’Grady (Control vs Creativity)
Michael Gentle (Good guy/Bad guy)
Watch MS BI Conference Keynotes (Mosha Pasumansky)