In its report, Raising the Corporate IQ, Forrester Research concluded that in order to gain competitive advantage, businesses today need to provide all employees with quick access to comprehensive information about the business, its customers and its rivals.
Data warehousing can be of considerable help when turning structured data into useful information. However, Forrester found that traditional database vendors are slow off the mark when it comes to dealing with non-structured data such as office documents, Email messages and reports.
A considerable amount of corporate information is still stored in this way - it has been suggested the figure is as high as 90% in many companies.
Search engines are an obvious and ideal way to provide quick and effective access to a variety of data types. The success of search engines on the Internet itself has brought this sort of technology very much into public view, and the increase in intranet Web servers has naturally turned attention to using search engines internally. However, search engines are by no means restricted to searching HTML pages. They can index literally hundreds of different document types and most also offer some sort of connectivity to structured data sources.
Search engines can do more than save time by speeding access to information.
They can also save work by avoiding duplication - you can use them to find work that people have already done.
Microsoft has recently entered the search engine marketplace with Microsoft Index Server. This is available free from Microsoft's Web site, and is designed to integrate closely with Windows NT and Microsoft's Internet Information Server.
Although Microsoft Index Server is free, users cite many other reasons to be happy with it. Ian Moran, network manager at Unibol, says: "I've tried most of the available search engines, and as far as I'm concerned I'm using the best available software - price was not a consideration.
If there was something better, I'd buy it."
At Unibol, Index Server is used to index some 5Mb to 10Mb of data on the company intranet. "Index Server will index Microsoft Word documents, and will also index on any document property, like author, date and title," says Moran. He found it very easy to use: "It's just plug-and-play and then forget about it. The results are good, and it automatically updates its index."
Although Microsoft Index Server is free, and clearly aimed at one end of a very wide market, it's nevertheless fully featured. In particular, Bull Information Systems makes use of the multilingual capabilities, as technical support consultant Mark Rowlands explains. "It's reasonably smart at sorting and indexing documents, carrying out contents searches and producing synopses, in seven or eight different languages. We work in three languages - French, Swedish and English - so this is particularly valuable for us."
As you move upmarket, vendors such as Verity, Fulcrum and Muscat have a very wide range of offerings. For example, there are 52 different items on Verity's price lists. This alone should indicate that there's a lot more work involved in selecting and using these products - but of course, the results you get should be commensurately better. Verity, Fulcrum and Muscat all expect some of their users to have to carry out customisation work in order to get the best business value from stored information.
But Muscat also offers a free version, capable of indexing up to 1,000 pages. This provides users with an ideal opportunity to experiment with the product, before deciding which of the hugely complex range of offerings is most appropriate for their full system.
There are a number of interesting features to be found on the more elaborate products. How much each of these matters to you can help you choose between the different vendors.
Some vendors, such as Verity, offer a personal version of their search software. This can start out by searching your Email folders (public, private and shared) and your own hard disk, and then move on to look in the main Verity indexes on the corporate intranet.
Both Verity and Fulcrum offer intranet "spiders". A spider can go out and search sites in your local intranet, and also the Internet itself, to compile its indexes. This can be particularly useful for indexing a particular company's site (for example, a competitor's) or industry sector.
For Muscat, you provide a list of the sites (internal and external) that you want indexed.
Most products will index a huge range of document types. Since there are relatively few vendors from whom to license the required technology, it's quite normal to find competitors quoting exactly the same list of 200+ document formats that they can read. Where there is a difference, though, is in document delivery.
All the products use a web browser front-end, and they'll all produce a synopsis of the documents they find as a result of a query (including Microsoft's Index Server). These synopses are delivered as an HTML page.
However, Muscat and Microsoft require every desktop machine to be equipped with viewers for all these document types, whereas Fulcrum and Verity perform automatic translation of the documents to HTML on-the-fly.
Performance is one area in which all products seem to do well, at least in the eyes of their users. Martin Bayton, marketing manager at PAFEC, says: "The Fulcrum search engine is embedded in our document management systems. Performance has been more than adequate - indeed, one of the things our users like is that it can do away with the need to fill in an index card at all when filing a document. Instead, they can rely on the Fulcrum index."
Muscat's strong point in the performance arena is the small size of its kernel - it requires just 200Kb of RAM, and can happily handle 10 searches per second on an index of 17 million documents. And Muscat performs some clever tricks to keep the size of the index at manageable levels, and speed up searches. Instead of indexing every word within a document, Muscat will only select the 60 most important ones.
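The idea of indexing only a document's most important words can be illustrated with a minimal sketch. How Muscat actually decides which words are important is not described here, so the sketch simply assumes that the most frequent non-trivial words win. The function names, the stopword list and the sample documents are all hypothetical.

```python
import re
from collections import Counter, defaultdict

# Assumption: a tiny stopword list stands in for real linguistic filtering.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "for", "on"}

def top_terms(text, limit=60):
    """Return up to `limit` of the most frequent non-stopword terms in the text."""
    words = [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS]
    return [term for term, _ in Counter(words).most_common(limit)]

def build_index(documents, limit=60):
    """Map each selected term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in top_terms(text, limit):
            index[term].add(doc_id)
    return index

docs = {
    "memo": "Search engines index documents. Search engines save time.",
    "note": "The intranet server hosts documents for the sales team.",
}
# A limit of 3 keeps the example small; the article quotes 60 for Muscat.
index = build_index(docs, limit=3)
```

With the limit set to 3, the word "documents" does not make the cut for either document, which shows the trade-off: the index stays small, but a search for a less prominent word finds nothing.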
From the user's point of view, though, what counts most is the ability to actually find something useful. Many people have become disillusioned with Internet search engines, after being notified that there are some 200,000 documents containing their search term. Most index systems expect you to go off and master some arcane language to hone your query to produce better results, but there are other approaches.
Muscat provides an easy way for ordinary users to refine their search.
All you have to do is look at the returned list of documents, and then select those that are most interesting. When you click on the "improve" button, Muscat will automatically suggest other search terms that will help refine the search. This process can be repeated over and over again, and avoids the user having to painfully construct complex Boolean expressions.
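This kind of refinement can be sketched in a few lines. The sketch below is not Muscat's algorithm, which is not described here. It assumes one simple approach: suggest words that appear in many of the documents the user selected but in few documents overall. All names and the sample corpus are hypothetical.

```python
import re
from collections import Counter

def tokenize(text):
    """Split text into lower-case words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def suggest_terms(query, selected_docs, all_docs, limit=3):
    """Suggest extra search terms typical of the selected documents."""
    query_terms = set(tokenize(query))
    selected = Counter()
    for text in selected_docs:
        selected.update(set(tokenize(text)))  # count documents, not occurrences
    overall = Counter()
    for text in all_docs:
        overall.update(set(tokenize(text)))
    scores = {}
    for term, count in selected.items():
        if term in query_terms:
            continue  # the user has already typed this word
        # Favour terms common among the selected documents and rare elsewhere.
        scores[term] = count * count / max(overall[term], count)
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:limit]

corpus = [
    "jaguar car engine review",
    "jaguar car dealer prices",
    "jaguar cat habitat rainforest",
    "jaguar cat diet",
]
# The user searched for "jaguar" and marked the first two results as interesting.
suggestions = suggest_terms("jaguar", corpus[:2], corpus)
```

Here the top suggestion is "car", the word that separates the documents the user picked from the rest, and the user never has to write a Boolean expression to get it.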
Furthermore, Muscat attempts to be cleverer than most when finding documents in the first place. Instead of relying on word frequency alone, Muscat automatically applies a weighting to each word in your query. Frequency and proximity are still taken into account, but weighting is given greater importance when searches are carried out. The process of refining the search will change the weighting used, as well as introduce new search terms.
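To show what weighting query words means in practice, here is a minimal sketch. The weighting scheme Muscat uses is not given here, so the sketch assumes a common one: a word that appears in fewer documents gets a higher weight. The function names and sample documents are hypothetical.

```python
import math
import re

def tokenize(text):
    """Split text into lower-case words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def term_weights(query, documents):
    """Give rarer query terms a higher weight (an assumed, inverse-frequency scheme)."""
    doc_terms = [set(tokenize(text)) for text in documents.values()]
    weights = {}
    for term in set(tokenize(query)):
        containing = sum(1 for terms in doc_terms if term in terms)
        weights[term] = math.log((len(doc_terms) + 1) / (containing + 1)) + 1.0
    return weights

def rank(query, documents):
    """Order matching documents by the summed weight of the query terms they contain."""
    weights = term_weights(query, documents)
    scores = {}
    for doc_id, text in documents.items():
        terms = set(tokenize(text))
        score = sum(w for t, w in weights.items() if t in terms)
        if score > 0:
            scores[doc_id] = score
    return sorted(scores, key=lambda d: (-scores[d], d))

library = {
    "a": "annual report sales",
    "b": "annual report muscat pricing",
    "c": "annual holiday schedule",
}
weights = term_weights("annual muscat", library)
ranking = rank("annual muscat", library)
```

Every document contains "annual", so that word carries little weight. Only document "b" contains "muscat", so it is ranked first.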
Finally, a few words on a technology that is just around the corner: agents. On the Internet itself there is now a trend away from expecting users to have to carry out their own searches, and towards the customised and automated delivery of personalised information via Email. This trend is starting to appear in search engines for internal use, such as Verity's Search'97 automated agents. An agent will automatically Email you with new documents that appear which match your chosen search terms. Agents are also starting to appear in other products such as Fujitsu's groupware product, TeamWare.
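The behaviour of such an agent can be sketched as a stored query that is re-run whenever the document collection changes. This is an illustration only and not how Search'97 or TeamWare is implemented. The Agent class, its fields and the sample documents are hypothetical, and the sketch returns the matching document ids rather than actually sending Email.

```python
import re
from dataclasses import dataclass, field

@dataclass
class Agent:
    user_email: str                          # where notifications would be sent
    terms: set                               # the user's chosen search terms
    seen: set = field(default_factory=set)   # documents already checked

    def check(self, documents):
        """Return ids of not-yet-seen documents that contain every search term."""
        matches = []
        for doc_id, text in documents.items():
            if doc_id in self.seen:
                continue
            self.seen.add(doc_id)
            words = set(re.findall(r"[a-z0-9]+", text.lower()))
            if self.terms <= words:
                matches.append(doc_id)
        return matches

agent = Agent("user@example.com", {"pricing", "verity"})
store = {"d1": "Verity pricing update"}
first_run = agent.check(store)    # d1 is new and matches

store["d2"] = "Fulcrum pricing update"
store["d3"] = "Revised Verity pricing for resellers"
second_run = agent.check(store)   # d1 is skipped, d2 lacks "verity", d3 matches
```

Because the agent remembers what it has already checked, each document is reported at most once, which is what keeps the resulting Email useful rather than repetitive.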
Microsoft: www.microsoft.com or telephone 0345 002000
Verity: www.verity.com or telephone 01372 747076
Fulcrum: www.fulcrum.com or telephone 01293 419940
Muscat: www.muscat.co.uk or telephone 01223 421222