As client-server technology developed through the early 1990s, it became clear that there was a problem. While workstations and PCs could make users more productive, and their use of computers more enjoyable, many of the most important applications, and the data associated with them, resided on proprietary mainframe computers. To make client-server really useful, it needed to be linked in with these legacy applications.
Building a server application to run on a mainframe was tricky. Specialist skills were needed that the client-server vendors did not have, the mainframe was a jealously guarded and regulated environment, and some applications were so old that nobody really knew how they worked anyway. But they did work, and produced data that populated the green screens of visual display units that older users of IT will remember.
The solution that the client-server vendors came up with was screen scraping. This involved grabbing a screen produced by a mainframe application and, knowing the co-ordinates of the useful data, working through the screen image, extracting the bits required and displaying them in a fancy new client application.
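To make this concrete, here is a minimal sketch in Python of coordinate-based extraction from a captured terminal screen. The screen layout, field positions and field names are all invented for illustration; a real scraper would be driven by the layout of the particular mainframe application.

    # A hypothetical captured 80x24 green screen, held as a list of
    # fixed-width strings, one string per row.
    SCREEN_WIDTH = 80

    def extract_field(screen_rows, row, col, length):
        # Pull a fixed-position field out of the captured screen image.
        line = screen_rows[row].ljust(SCREEN_WIDTH)  # pad short rows
        return line[col:col + length].strip()

    def scrape_account(screen_rows):
        # The co-ordinates here are assumptions for this sketch: say the
        # account name sits at row 2, column 10 (20 characters wide) and
        # the balance at row 5, column 60 (12 characters wide).
        return {
            "account_name": extract_field(screen_rows, 2, 10, 20),
            "balance": extract_field(screen_rows, 5, 60, 12),
        }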
All well and good. But the problem was that many mainframe applications were a bit more complicated than this, requiring a level of user interaction before the final output was produced. To solve this, more complex mainframe adapters were built that could manage the required interaction between the user and the mainframe, and eventually display the results on the client.
This potted history has been repeated recently in a slightly different guise with the web. There are a number of reasons for automating the reading of web sites: search companies want to know what content is where; price comparison sites need to keep themselves up to date; and the web can provide a wealth of competitive information about constantly changing markets.
Static web sites are one thing, and the search companies' crawlers mine such content on a daily basis. Extracting useful information from dynamic web sites is another matter, as it presents the same problem faced by the client-server vendors on the mainframe. Enter web scraping. As with screen scraping, web scraping could only go so far, handling perhaps a one-off query that produced a dynamic page with no further interaction required. Anyone who has booked a flight online knows that the reality is more complex than this.
The best way to determine the cost and availability of a given flight is to proceed with the booking process as far as you can without actually committing to a purchase. This requires a number of inputs: departure point and destination, preferred dates, composition of the party and so on.
This data is often requested through a series of interactions that entice you down the path to purchase. And, of course, it is not just air travel. It is the same with hotel bookings, offers on a retailer's web site, the latest prices of hardware from IT dealers and so on. Getting to pricing information can be a highly interactive experience and, most importantly, prices are dynamic. Just making a booking for an airline ticket might mean that the price is higher for the next customer.
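To illustrate, here is a minimal Python sketch of how such an interactive session might be automated, using the requests and BeautifulSoup libraries. The site, URLs, form field names, token handling and CSS selector are all invented for this example; every real site has its own flow, and many actively resist this kind of automation.

    import requests
    from bs4 import BeautifulSoup

    def check_fare(origin, destination, date):
        session = requests.Session()  # carries cookies across the steps

        # Step 1: load the search page so the server establishes a
        # session and, on many sites, issues a hidden anti-forgery token
        # that must be echoed back with the form.
        search_page = session.get("https://example-airline.test/search")
        soup = BeautifulSoup(search_page.text, "html.parser")
        token = soup.find("input", {"name": "csrf_token"})["value"]

        # Step 2: submit the search form, mimicking a user clicking
        # "Find flights".
        results = session.post(
            "https://example-airline.test/results",
            data={
                "csrf_token": token,
                "from": origin,
                "to": destination,
                "depart": date,
                "passengers": "1",
            },
        )

        # Step 3: scrape the quoted price from the results page.
        price = BeautifulSoup(results.text, "html.parser").select_one(".fare-price")
        return price.get_text(strip=True) if price else None

The session object is the crux: the price only emerges after a stateful, multi-step conversation with the server, which is why a simple one-shot page fetch is not enough.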