Data warehousing, like Topsy on speed, is growing at a phenomenal rate. So much so that, while only a fifth of organisations have such an architecture in place at the moment, soon after the turn of the century four out of five firms will be using it.
That's the prediction of David Wells, senior consultant at Ovum, and he's not alone in predicting explosive growth. Equally illustrious lights such as Gartner, IDC, Meta and IBM also now forecast a sizeable data warehouse market, ranging in value up to $30 billion by the year 2000.
Meta alone estimates that the 400,000 users of business intelligence information in the US will explode to 10 million "knowledge workers" by the turn of the century.
Why all the excitement? Well, as we enter the knowledge age, many believe organisations will exploit a previously unmined asset and that the user will finally win the data access battle.
A data warehouse is a flexible environment made up of technologies that take an organisation's operational, historical and external data, consolidate it into a separately designed relational database, manage it and then mould it into a subject-oriented format for users to access and analyse.
The US pioneered the early data warehouse projects while UK customers were implementing early pilots and reserving judgement, but that has changed in the last 12 to 18 months.
According to Wells, this rise in interest has forced platform suppliers, software vendors and consultants to create a "virtuous circle" of interest. "While the idea of the data warehouse has been around for 10 years or so, the current explosion in the popularity of the data concept is due to technology push and user pull," Wells says.
Certainly, the worldwide chopping of middle management has meant that users can no longer rely on large support staffs to assemble the data they need to make a critical decision. Because competition is so tough now, mistakes can be expensive and are not tolerated as much as before.
"Relational database management systems have also improved significantly in performance over the past five years, partly through exploitation of parallel hardware," says Wells. "Numerous graphical query tools have also come to market, promising to make it possible to navigate large stores of relationally-structured data without having to master the subtleties of SQL."
A company's ultimate objective should be to use the data warehouse to offer better customer service, create greater customer activity, focus customer acquisition and retention on its most profitable customers, achieve customer intimacy, increase revenue and reduce operating costs. In other words, the warehouse should help it become more profitable and competitive.
Another key aspect of data warehousing that is often overlooked is the importance of business process re-engineering and its effect on the user. As senior executives develop strategy, managers should co-ordinate change and redirect resources from mainframes to networked distributed computing and review the quality of the information.
Currently there are three major groups of data warehouse users, each with specific skills and data demands. The first is the casual or executive user who is comfortable with what used to be called executive information systems, or EIS.
The second is the business analyst who is spreadsheet-literate and knows how to manipulate and present data for maximum impact. The last is the power user with the sophisticated skills to manage meta data (data about data).
The goal of data warehousing is to resolve data access difficulties that have been recognised for many years, but have remained insurmountable because suitable technology was lacking.
The key problems include: data that is unavailable because it is hidden in transaction systems; delays as under-powered systems try to perform complex queries; user-unfriendly software interfaces; the difficulty of discerning patterns in large amounts of data; the costs and complexity of supporting remote users; and competition for computer resources between transaction systems and decision support systems.
Now, however, users have the storage capacity, along with fast, inexpensive parallel processors and new analytical software, enabling them to fashion thousands of data warehouses. More user-friendly query and analysis tools let more professionals exploit data warehousing. The data warehouse is integrated with popular tools customised for Web use and data mining applications.
A case in point is NatWest Insurance Services, a wholly-owned subsidiary of the high street bank. It is one of the largest independent insurance intermediaries in the UK, selling mainly household, business, travel and credit protection insurance.
The organisation has 30 different operational systems of varying sizes. "Executives were complaining that information was never timely and never what they really wanted," says Julie Pratten, the NWIS manager of its management information project. "If the managing director wanted the number of new policies put on the books last week, with their premium value and commission value, we didn't have a central source of information to provide it," she says.
An EIS system had previously met reporting requirements of NWIS, but as the company repositioned its business and took greater control in the pricing of its products, the limitations of the system became apparent.
"We didn't know we needed a data warehouse at the time. We were about to embark on a project to replace some of the legacy systems. We wanted to exploit the new information we would capture, so management information was the key," says Pratten.
The company's new head of operations commissioned a consultancy project to define the pricing analysts' and the marketing department's information requirements. Pratten, with a team of five, went off to build a larger, more complex management information system catering for their needs.
"We didn't just want statistics, but information about the key drivers of the business," Pratten explains. "We needed a large database to hold all the data in a common format. It would then be available to a range of user tools sitting on top. We wanted a structured system, like an EIS but on a larger scale, with more detail and more complex statistical models."
To fit in with NatWest's architecture, Microsoft's SQL Server 6.0 database was chosen, running under Microsoft Windows NT Server 4.0 on an NCR server. The team didn't have a way of moving data relatively painlessly to the warehouse from the other SQL Server (operational) systems. Resources were limited and Pratten didn't want to employ a huge team of programmers and experts copying databases, so she used SAS/Warehouse Administrator to automate the process.
The team then built a prototype using data from the company's household claims department to prove the concept of migrating data from one system to another and then interrogating it.
Automating the process shifted the team's focus from moving data to ensuring its quality. As in most companies, improving the quality of data is a challenge. An important first step is ensuring that users understand the data they are using. For example, "gross premium" may mean one thing in one business area and something different in another. So the team has helped the business to agree a standard set of data definitions and terminology that everybody understands.
Validating data is another important aspect of data quality. Each week the team gives operational departments an extract of data to validate.
"It helps them to be involved and maintain ownership of their data," says Pratten. "Data input people have ways of short-cutting the system which we pick up. We have had instances of people entering the year 1066 when they don't know when a property was built. The implications of this otherwise only come to light when data modelling and analysis take place."
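A check of this kind can be sketched in a few lines of Python. This is an illustration only, not the NWIS team's actual validation routine: the record layout, the `year_built` field name, the plausible year range and the 25 per cent threshold are all assumptions made for the example. It flags build years that fall outside a plausible range, or that appear so often in an extract that they look like a stand-in for "unknown".

```python
from collections import Counter

# Hypothetical weekly extract: one record per insured property.
# Field names and values are invented for illustration.
records = [
    {"policy_id": "H001", "year_built": 1066},
    {"policy_id": "H002", "year_built": 1935},
    {"policy_id": "H003", "year_built": 1066},
    {"policy_id": "H004", "year_built": 1988},
    {"policy_id": "H005", "year_built": 1066},
    {"policy_id": "H006", "year_built": 2150},
]


def flag_suspect_years(rows, earliest=1500, latest=1998, max_share=0.25):
    """Return policy IDs whose build year looks like a placeholder.

    A year is suspect if it lies outside [earliest, latest], or if it
    accounts for more than max_share of all rows in the extract.
    The thresholds are assumptions, to be agreed with the business.
    """
    counts = Counter(row["year_built"] for row in rows)
    total = len(rows)
    suspects = []
    for row in rows:
        year = row["year_built"]
        out_of_range = year < earliest or year > latest
        too_common = counts[year] / total > max_share
        if out_of_range or too_common:
            suspects.append(row["policy_id"])
    return suspects


suspect_ids = flag_suspect_years(records)
print(suspect_ids)
```

The flagged IDs would go back to the operational department with the weekly extract, so that the people who own the data correct it at source rather than having it patched in the warehouse.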
The team is also providing feedback to new operational systems, which makes data extraction easier, and is trying to introduce standards across the company: "It is recognised that our team can identify a lot of problems and add value back to new projects," says Pratten.
The data warehouse development team has set up a cross-company information management group, originally as a communications forum to bring together people who provided, supported or used data. But its focus is shifting towards best practice in information usage, as people use the information in the warehouse more and develop their analytical skills.
The team's goal is to structure the warehouse in an uncomplicated way. The challenge is to make people who are not computer literate feel comfortable using the warehouse and understand the information in it. The project team is to become a business unit in 1998.
"The company has recognised the investment it has made in the warehouse and is committed to exploiting the data in it," says Pratten. "We are seen as the experts in cleaning and managing data."
Building the data warehouse will involve a continuous process of extracting more data from more source systems. The marketing department is the key driver as it works hard to understand customers. The team is currently investigating a link to NatWest UK's customer information database which will provide data to a Maestro marketing database.
To make the data more meaningful, the team is about to embark on a data mining prototype for the business insurance department. "We need to move beyond reporting into modelling and analysis," says Pratten. "We must train our staff and develop our modelling capabilities."
The department already knows how many quotations are issued and how many policies result from them, and can calculate a conversion rate. It now needs to understand why people don't convert and to see what underlying factors can be changed to improve the conversion rate. This is one example of how data modelling techniques will be applied to the warehouse.
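As a rough illustration of the first step from reporting towards analysis, the Python sketch below computes the overall conversion rate and then breaks it down by one candidate factor. The sales channel used as the factor, the channel names and the figures are all invented for the example. The article does not say which factors NWIS intends to examine.

```python
from collections import defaultdict

# Hypothetical quotation records: (sales_channel, became_policy).
# Channels and outcomes are invented for illustration.
quotes = [
    ("broker", True), ("broker", False), ("broker", False), ("broker", False),
    ("branch", True), ("branch", True), ("branch", False), ("branch", True),
]


def conversion_rates(rows):
    """Return policies issued divided by quotations issued, per channel."""
    issued = defaultdict(int)
    converted = defaultdict(int)
    for channel, became_policy in rows:
        issued[channel] += 1
        if became_policy:
            converted[channel] += 1
    return {channel: converted[channel] / issued[channel] for channel in issued}


# Overall rate: the single figure a reporting system already provides.
overall = sum(1 for _, became_policy in quotes if became_policy) / len(quotes)

# Per-channel rates: the breakdown that starts to suggest why quotes fail.
rates = conversion_rates(quotes)

print(f"overall: {overall:.0%}")
for channel, rate in sorted(rates.items()):
    print(f"{channel}: {rate:.0%}")
```

With these invented figures, an overall rate of 50 per cent hides a wide gap between the two channels. A difference of that kind is what modelling would then try to explain.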
"We recognise that within NWIS we have been data rich, but information poor," says Pratten. "If we are to achieve our strategic objectives, we need to understand more about our customers and more about our key business drivers and where we can make changes. The data warehouse has brought us more business focus. It has highlighted the awareness of data and how it is being used."