As the Net matures, two key issues are converging. One is the increase in traffic; the other is that Web sites are becoming a key marketing tool for businesses.
Companies which are serious about the way customers 'experience' their Web sites shouldn't ignore this interdependence. Content and aesthetics aside, ask yourself two simple questions:
What is the experience that I deliver to visitors via my Web site?
How will that experience be affected if the traffic on my site doubles?
Many people do spot-checks on the performance of their Web site locally. They fire up Netscape, enter their home page URL and time how long the page takes to appear. The results can be encouraging but they can also be deceptive for a number of reasons.
First, many users forget to flush their browser's cache. Caching is a mechanism used by a browser to speed up your Web surfing. The browser simply stores frequently requested HTML pages and images, such as your company's home page and logo, locally on your computer. If you frequent a certain site, what you see in your browser may well have been pulled completely from data cached locally.
Most spot-checks don't dig deep enough, either. It's worth testing more than the speed of accessing your home page. Imagine how a typical customer would travel around your Web site, then follow the same path yourself.
Depending on how your site is set up, the time it takes to view a series of pages will probably differ considerably from the time it takes to access the same pages individually. Many Web servers also cache pages and images, which will affect response times.
Local spot-checks can also be misleading because the path you take from within your company's intranet to a resource on your own Web site is probably shorter than the route your customers take, most likely through a busy Internet service provider (ISP). If your visitors reside within a corporate intranet, their requests are probably coming through a proxy server as well, which results in even more hops.
Proxy servers sit between a corporate intranet and the public Internet. When a user wants to visit your Web site, the request goes first to the proxy server, which makes the request to your Web site. The proxy server collects the data from your Web server and forwards it to the user who requested it. Some ISPs use proxy servers to route Internet requests.
A third factor is the speed of your own network connection. If your corporate intranet is connected to a leased line, you can retrieve data from your Web server much faster than an end-user can using a 28.8Kbps modem over a noisy phone line. Multiplying the time it takes you to retrieve your home page on a leased line by the difference in network connection speeds will only give you a rough estimate of the difference in performance. And if your Web site attracts a diverse group of users, you'll have to provide acceptable performance for all of them, whether they have 28.8Kbps, 14.4Kbps or 9.6Kbps modems.
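The arithmetic behind that rough estimate can be sketched in a few lines of Python. The page size and the T1 leased-line speed below are assumptions chosen for illustration, and the calculation ignores latency, protocol overhead and line noise, so the results are best-case figures only.

```python
# Best-case time to move one page over links of different speeds.
# PAGE_BYTES and the T1 figure are illustrative assumptions, not measurements.

def transfer_seconds(page_bytes, link_bps):
    """Return the ideal number of seconds needed to send page_bytes over a link."""
    return page_bytes * 8 / link_bps

PAGE_BYTES = 60_000  # hypothetical home page: HTML plus images

LINKS = [
    ("T1 leased line (assumed 1.544Mbps)", 1_544_000),
    ("28.8Kbps modem", 28_800),
    ("14.4Kbps modem", 14_400),
    ("9.6Kbps modem", 9_600),
]

for label, bps in LINKS:
    print(f"{label}: {transfer_seconds(PAGE_BYTES, bps):.1f} s")
```

Real users will see longer times than these, which is why the multiplication gives only a lower bound on what a modem user experiences.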
The good news is that some Web server vendors have the foresight to record visitors' timing information for you. The bad news is that it's usually hard to get at the information. If your Web server records the information in its logs, you will see for each log entry the time it took to transfer the data for that request. Keep in mind that timing data collected by your Web server is raw information. To make sense of it, especially on a busy site, you'll probably need a log analysis tool which breaks apart the information in a more meaningful way.
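As a rough illustration of what a log analysis tool does with that raw timing data, the sketch below averages the transfer time for each requested path. The log layout is hypothetical: it assumes a Common Log Format line with the elapsed time in seconds appended as the last field, and the sample entries are invented. Your server's format may well differ.

```python
from collections import defaultdict

# Invented sample entries in an assumed format:
# Common Log Format plus elapsed seconds as the final field.
SAMPLE_LOG = """\
10.0.0.1 - - [01/Jan/1997:10:00:00 +0000] "GET /index.html HTTP/1.0" 200 10240 1.8
10.0.0.2 - - [01/Jan/1997:10:00:05 +0000] "GET /index.html HTTP/1.0" 200 10240 4.2
10.0.0.3 - - [01/Jan/1997:10:00:09 +0000] "GET /products/search HTTP/1.0" 200 2048 9.0
"""

def mean_time_by_path(log_text):
    """Return {path: mean elapsed seconds} for every path in the log."""
    totals = defaultdict(lambda: [0.0, 0])  # path -> [total seconds, hits]
    for line in log_text.splitlines():
        if not line.strip():
            continue
        request = line.split('"')[1]              # e.g. GET /index.html HTTP/1.0
        path = request.split()[1]
        elapsed = float(line.rsplit(None, 1)[1])  # last whitespace-separated field
        totals[path][0] += elapsed
        totals[path][1] += 1
    return {path: total / hits for path, (total, hits) in totals.items()}

for path, seconds in sorted(mean_time_by_path(SAMPLE_LOG).items()):
    print(f"{path}: {seconds:.1f} s on average")
```

Grouping by path rather than averaging across the whole log helps show whether slow responses are concentrated in one area of the site.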
If you really want to find out how your end-user experiences your site, call a friend at the other end of the country and have him or her call up your site at different times of the day and time how long it takes to retrieve your home page and a couple of pages after that. You may be surprised.
Hopefully, your friend confirms that your Web site is performing at acceptable speeds. But things on the Net have an odd way of not scaling linearly. Will performance still be acceptable when twice as many people are visiting your site?
Fortunately, there are tools on the market to help you analyse your setup, simulate possible increases in usage, pinpoint scaling bottlenecks in your system and even help you build a site with the ability to scale up to meet ever-increasing traffic. First, you should study how your site is being used. The more you know, the better equipped you will be to fine-tune it and prepare for upgrades. The log analysis tools currently on the market, such as EG Software's AuditTrack, Interse's Market Focus 2, I/Pro's NetLine and net.Genesis's net.Analysis Pro 2.0, can help you understand usage patterns and tell you which areas of your Web site attract the most traffic. They will also give a proportional view of how different parts of your site are being used.
This last function is important because your Web server's reaction to twice as many users searching your product database may be different from its response to the same number requesting the home page.
These tools enable you to look at trends in usage and performance at a site. Analysis of usage trends for the past three or six months can tell you whether your traffic is doubling every quarter, month or week, and how usage patterns are changing over time.
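To turn two traffic figures into a doubling time, one simple approach is to assume steady exponential growth between them. The sketch below does that; the function name and the hit counts are hypothetical, and real traffic is rarely this smooth.

```python
import math

def doubling_time(first_count, last_count, periods):
    """Periods needed for traffic to double, assuming steady exponential growth.

    first_count and last_count are hit totals for the first and last period,
    and periods is the number of periods that separate them.
    """
    if first_count <= 0 or last_count <= first_count:
        raise ValueError("traffic must be positive and growing")
    growth_per_period = (last_count / first_count) ** (1 / periods)
    return math.log(2) / math.log(growth_per_period)

# Hypothetical figures: 100,000 hits per month rising to 400,000 six months later.
months = doubling_time(100_000, 400_000, 6)
print(f"Traffic doubles roughly every {months:.1f} months")
```

With those example figures the traffic quadruples in six months, so it doubles about every three.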
Simulate to accumulate
Armed with a thorough understanding of the current picture, you can use one of the load-testing tools to simulate different growth and change scenarios. Most provide a good degree of flexibility in creating realistic scripts of how users move around your site. Once you've recorded a few common scenarios, they can simulate any number of simultaneous users performing the same actions.
The more sophisticated simulation tools show you in realtime the way your server's performance degrades as more simulated users visit your site. By intentionally overloading your Web site and carefully watching the realtime meters, you can identify the number of users at which your site's performance falls below reasonable levels. Couple this information with trend data from an analysis tool to estimate when it will be time to upgrade.
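The core idea of a load test, running the same scripted visit for many simultaneous users and timing each one, can be sketched with Python's standard library. This is a minimal illustration, not how any commercial tool works: `fake_visit` is a stand-in that merely sleeps, and in practice you would replace it with code that fetches a realistic series of pages from a test server you control.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def timed_visit(visit):
    """Run one scripted visit and return how long it took, in seconds."""
    start = time.perf_counter()
    visit()
    return time.perf_counter() - start

def run_load(visit, users):
    """Run `visit` once per simulated user, concurrently; return all timings."""
    with ThreadPoolExecutor(max_workers=users) as pool:
        return list(pool.map(timed_visit, [visit] * users))

def fake_visit():
    """Stand-in for a scripted path through the site (an assumption)."""
    time.sleep(0.01)

if __name__ == "__main__":
    for users in (1, 10, 50):
        timings = run_load(fake_visit, users)
        mean = sum(timings) / len(timings)
        print(f"{users} users: mean {mean:.3f} s, worst {max(timings):.3f} s")
```

Stepping the user count upwards and watching where the mean and worst-case times start to climb gives a crude version of the realtime meters described above.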
Load-testing software is also invaluable in locating the bottlenecks in your Web server. These can occur in many areas - server software, memory, CPU, disk speed and network bandwidth. If memory appears to be the constraint, you can avert problems by installing more. If network bandwidth is the limiting factor, you may have to upgrade your network connection.
Balancing the load
At some point, traffic to a Web server may be too heavy for one computer to handle effectively. The obvious solution is to put another computer to work. A mirrored site is a simple solution - it takes all the Web site resources on the existing server, mirrors them onto another computer and then splits the traffic. This is how many of the larger sites handle millions of hits per day. There are a number of ways to split the traffic but two common methods are round-robin DNS (Domain Name System) and dynamic IP (Internet Protocol) redirection.
IP addresses (e.g. 184.108.40.206) are based on a hierarchy that helps one computer find another computer on the network quickly. In round-robin DNS systems, the Web server name is associated with a list of IP addresses. Each address on the list maps to a different computer and each computer contains a mirrored version of the Web site. When a request is received, the Web server name is translated to the next IP address on the list.
By translating Web server names to IP addresses in this fashion, requests can be load-balanced to multiple computers.
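The rotation itself is simple enough to model in a few lines. The sketch below is a toy resolver, not a real DNS server: the class name, host name and addresses are all hypothetical (the addresses come from a range reserved for documentation).

```python
from itertools import cycle

class RoundRobinResolver:
    """Toy model of round-robin DNS: each lookup returns the next address."""

    def __init__(self, records):
        # records maps a server name to its list of mirror addresses.
        self._rotations = {name: cycle(addrs) for name, addrs in records.items()}

    def resolve(self, name):
        """Return the next IP address on the list for this server name."""
        return next(self._rotations[name])

resolver = RoundRobinResolver({
    "www.example.com": ["192.0.2.1", "192.0.2.2", "192.0.2.3"],
})

for _ in range(4):
    print(resolver.resolve("www.example.com"))
# Prints 192.0.2.1, 192.0.2.2, 192.0.2.3 and then 192.0.2.1 again.
```

Note that this scheme hands out addresses in strict turn, so it spreads requests evenly but takes no account of how busy each mirror actually is.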
Dynamic IP redirection is the second common method. The main Web server takes all requests for the site's home page but the visitor's browser is redirected to another URL to satisfy the request. The magic of this method is that the redirected URL could be on the same computer as the main server or any one of several back-end mirror computers. The main server redirects the traffic to back-end Web servers based on their current loads.
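A minimal sketch of the redirection decision follows. It assumes the main server already knows each back-end's current load as a number, which the article does not specify; the server names, load figures and function names are hypothetical.

```python
def choose_backend(loads):
    """Return the back-end base URL with the lowest current load."""
    return min(loads, key=loads.get)

def redirect_response(loads, path):
    """Build a bare HTTP redirect that sends the browser to the least busy server."""
    location = choose_backend(loads) + path
    return f"HTTP/1.1 302 Found\r\nLocation: {location}\r\n\r\n"

# Hypothetical load figures: fraction of capacity in use on each mirror.
current_loads = {
    "http://www1.example.com": 0.70,
    "http://www2.example.com": 0.20,
    "http://www3.example.com": 0.45,
}

print(redirect_response(current_loads, "/index.html"))
```

Unlike round-robin DNS, the choice here depends on the load at the moment of the request, at the cost of an extra round trip for the redirect.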
Many original megasites had to create their own load-balancing software. But as load-balancing becomes an issue for more Web sites, out-of-the-box solutions have come to market. A tool from HydraWEB Technologies (www.hydraweb.com) provides fault-tolerant load-balancing across multiple servers, intelligently routeing large volumes of HTTP requests to the server best equipped to optimise performance at the time. Cisco Systems also offers Internet scaling solutions (www.cisco.com/warp/public/751/lodir/swww_wp.html).
The explosion in Web traffic bodes well for Internet commerce. To keep pace with the growth, serious Web site owners must understand current usage, traffic and performance trends to ensure their customers' Web experience remains as satisfying as their experience of any other facet of their business.
Niel Robertson ([email protected]) is a software engineer at net.Genesis, an Internet and Web software development firm based in Cambridge, Massachusetts.