Apis Networks

adaptive service monitor


No two servers are alike, and no two servers will ever experience the same conditions. For those evolving servers, we have created the Adaptive Service Monitor™ ("asm"), a statistical monitor that collects and analyzes server data, then adapts the server to the changing needs of its users.

How it works

asm repeatedly collects samples from selected services at arbitrary intervals. These data are then compared against historical trends, with a focus on significant changes between samples. If the difference between samples is statistically significant (α = 0.05), further analysis is done to determine the cause of the bottleneck. asm analyzes process information, memory utilization, disk I/O, and network throughput, then adapts the server by lessening the burden imposed by the conflicting service. After exploratory tuning is performed, server information is recorded and reanalyzed to determine a success rate. This information is used for future decision making. As a result, the server stays healthy during peak hours and constantly retunes itself as Web sites grow.
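The significance check described above can be sketched as follows. This is only an illustration, not asm's actual implementation: the text does not say which statistical test asm uses, so the two-sided z-test against the historical mean, the function name `is_significant`, and the sample load averages are all assumptions.

```python
from statistics import NormalDist, mean, stdev

ALPHA = 0.05  # significance level quoted in the text


def is_significant(history, sample, alpha=ALPHA):
    """Return True if `sample` deviates significantly from `history`.

    Assumption: a two-sided z-test against the historical mean.
    The source does not specify the test asm applies.
    """
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # No historical variation at all: any change counts as significant.
        return sample != mu
    z = (sample - mu) / sigma
    critical = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    return abs(z) > critical


# Hypothetical load averages gathered during normal operation.
load_history = [0.8, 1.1, 0.9, 1.0, 1.2, 0.95, 1.05]

print(is_significant(load_history, 1.1))  # within normal variation
print(is_significant(load_history, 6.5))  # a spike that warrants further analysis
```

Only samples that clear this threshold would trigger the more expensive analysis of processes, memory, disk I/O, and network throughput.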

An example

To better understand how asm operates, let us step through a brief example: a guided tour of the basic process in asm, given two variables.

  1. Consider a MySQL server and a Web server running in tandem on a machine with 2 GB of RAM. MySQL is consuming 1 GB of RAM, and the Web server the remaining 1 GB. Any memory demand beyond those 2 GB is paged to disk through a swap file. Paging reduces server performance by shifting memory from the faster RAM to the slower hard disk.

  2. Under an extraordinary circumstance, a site suffers from the "Slashdot Effect", meaning that it receives an influx of page requests beyond normal tolerance levels. The Web server scales up the number of concurrent connections in an attempt to cope with this drastic change.

  3. An average system would buckle under the pressure and become progressively slower during this period. asm, by contrast, analyzes the server loads and discovers a spike in load averages. After exploratory analysis, it determines that the Web server has grown in memory utilization from 1 GB to 1.5 GB, with 0.5 GB of memory being paged to disk, thus significantly reducing performance.

  4. asm analyzes other services and notes that MySQL's query and table caches are underutilized, meaning that an unnecessary chunk of memory is allocated to MySQL's buffers for future caching that never occurs and is unlikely to occur in the near future in a normal environment.

  5. MySQL is retuned to reduce the caching buffers, thereby freeing up memory to be used by other applications, such as the Web server. The Web server now has an extra 512 MB of RAM available, which it uses to eliminate paging by shifting memory processing back to the much faster RAM. Load averages subside to normal tolerance levels, and asm records the response for future decision making.
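The memory arithmetic in the steps above can be modeled with a short sketch. It is a simplified illustration under stated assumptions, not asm's real logic: the `Service` class, the `retune` function, the 768 MB cache size, the 25% cache utilization, and the 50% idle threshold are all hypothetical, and no actual MySQL settings are changed.

```python
from dataclasses import dataclass

TOTAL_RAM_MB = 2048  # 2 GB of physical memory, as in the example


@dataclass
class Service:
    name: str
    used_mb: int                    # memory currently allocated to the service
    cache_mb: int = 0               # part of used_mb reserved for caches
    cache_utilization: float = 1.0  # fraction of the cache actually in use


def paged_mb(services, total_ram_mb=TOTAL_RAM_MB):
    """Memory demand that exceeds physical RAM and spills to swap."""
    return max(0, sum(s.used_mb for s in services) - total_ram_mb)


def retune(services, total_ram_mb=TOTAL_RAM_MB, idle_threshold=0.5):
    """Shrink underutilized caches until paging stops.

    Returns the paging deficit (in MB) that could not be reclaimed.
    The idle threshold is a hypothetical tuning knob.
    """
    deficit = paged_mb(services, total_ram_mb)
    for s in services:
        if deficit == 0:
            break
        if s.cache_utilization < idle_threshold:
            idle = int(s.cache_mb * (1 - s.cache_utilization))
            reclaimed = min(idle, deficit)
            s.cache_mb -= reclaimed
            s.used_mb -= reclaimed
            deficit -= reclaimed
    return deficit


# Step 3: the Web server has grown from 1 GB to 1.5 GB.
mysql = Service("mysql", used_mb=1024, cache_mb=768, cache_utilization=0.25)
web = Service("web", used_mb=1536)
services = [mysql, web]

before = paged_mb(services)  # 512 MB is being paged to disk
after = retune(services)     # steps 4 and 5: reclaim idle MySQL cache

print(before, after)   # 512 0
print(mysql.used_mb)   # 512
```

In this model MySQL gives up 512 MB of idle cache, which is exactly the amount the Web server was paging to disk, so total demand fits in RAM again.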