Hyperic HQ and MySQL Backend Performance Study
- Executive Summary
- Test Environment
- Test Results
- Test Interpretations
In this series of backend performance studies, Hyperic's performance engineers ran high-load tests showing that Hyperic HQ with MySQL as its backend database delivers outstanding performance when gathering and storing large volumes of metrics. In this test, we first measured 675 nodes under an above-average metric collection load to establish baseline statistics, then simulated a 5-hour outage to record peak database usage. We found the following:
- Average metrics collected per minute – 220,000
- Peak metrics collected per minute (after simulated outage) – 2.3 million
- Server load and CPU utilization remained relatively low, suggesting that system resources were not fully utilized
The results demonstrate Hyperic HQ’s capacity for monitoring and processing rules at scale, and they offer a basis for estimating performance in larger deployments. The roughly tenfold increase from the 220,000 metrics per minute collected in the actual-load test to the peak of 2.3 million metrics per minute in the backlog test suggests that there is substantial room for growth beyond the test environment.
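The "tenfold" figure follows directly from the two rates reported above. A minimal check, with variable names chosen here purely for illustration:

```python
# Rates reported in this summary (metric insertions per minute).
average_rate = 220_000
peak_rate = 2_300_000

# Ratio of the backlog-test peak to the steady-state average.
headroom_ratio = peak_rate / average_rate
print(f"peak / average = {headroom_ratio:.1f}x")  # prints "peak / average = 10.5x"
```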
In the fall of 2006, demand for Hyperic software was accelerating, and so was customer demand for MySQL as a backend database. At the time, Hyperic supported PostgreSQL and Oracle. In adding a third option to the mix, Hyperic Engineering identified two significant obstacles that commonly keep software development teams from supporting additional databases:
- Lack of database abstraction layers
- Hard-coded pieces to account for different databases
To add support while isolating complexity, building database independence into Hyperic HQ was a critical first step. For Hyperic HQ 3.0, the team introduced a database abstraction layer by adding Hibernate to the core of the Hyperic HQ server.
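The general idea of a database abstraction layer can be sketched as follows. This is not Hyperic's code and does not use Hibernate. It is a minimal Python illustration in which the class names, the `metric_data` table, and its columns are all hypothetical, and SQLite stands in for a real backend so the example runs without a database server.

```python
import sqlite3
from abc import ABC, abstractmethod


class MetricStore(ABC):
    """Hypothetical storage interface; application code depends only on this."""

    @abstractmethod
    def insert_metrics(self, rows):
        """Store (measurement_id, timestamp, value) tuples."""

    @abstractmethod
    def count(self):
        """Return the number of stored data points."""


class SqliteMetricStore(MetricStore):
    """One backend; a MySQL or PostgreSQL class would implement the same interface."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS metric_data "
            "(measurement_id INTEGER, ts INTEGER, value REAL)"
        )

    def insert_metrics(self, rows):
        # Batch insert; backend-specific SQL stays inside this class.
        self.conn.executemany("INSERT INTO metric_data VALUES (?, ?, ?)", rows)
        self.conn.commit()

    def count(self):
        return self.conn.execute("SELECT COUNT(*) FROM metric_data").fetchone()[0]


def record_sample(store: MetricStore) -> int:
    # Application code: no database-specific branches here.
    store.insert_metrics([(1, 1000, 0.5), (2, 1000, 42.0)])
    return store.count()


store = SqliteMetricStore()
print(record_sample(store))  # prints 2
```

Because `record_sample` only sees the interface, supporting another database means adding one class rather than editing hard-coded, per-database branches throughout the application.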
Next, for Hyperic HQ 3.1, the team added beta support for MySQL as an optional database backend. This was no simple task, as queries and inserts in MySQL behave quite differently than in the previously supported databases. Because Hyperic HQ was the first commercial, data-intensive enterprise application to support MySQL, tuning the application to leverage MySQL's full potential required the expertise of both MySQL and Hyperic engineering, along with sustained, large-scale customer beta testing.
In Hyperic HQ, the server collects millions of metrics sent by HQ agents across networks. These metrics are, in turn, stored in a relational database management system (RDBMS), in this case, MySQL. Thus, the Hyperic HQ systems management suite is a robust, data intensive platform with heavy data throughput requirements. Hyperic also requires heavy data analysis and the ability to conduct fast queries and data joins.
The MySQL database server has long been known for its speed and simplicity, as well as its ability to scale with web infrastructure. Traditionally, it was not often used as a backend data store for enterprise applications. However, as the benchmark results will show, MySQL has evolved beyond its traditional role and can perform exceptionally well in an enterprise setting.
This benchmark report is intended to demonstrate how Hyperic HQ and MySQL can perform in an enterprise setting. It focuses on the performance results that both Hyperic and MySQL users can expect from large-scale implementations. We use two forms of performance testing to ground our results and interpretations:
- Simulate an actual large-scale deployment monitoring 32,000 services across 675 discrete managed platforms, or nodes.
- Simulate a maximum-scale deployment using the same setup and force an HQ Server outage for 5 hours to pool a backlog of monitoring metrics. Clearing the backlog queue indirectly reveals the maximum throughput that the HQ Server and the MySQL database allow.
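To give a feel for the size of the backlog in the second scenario, here is a rough estimate. It assumes that agents keep collecting at the normal rate throughout the outage and lose nothing, and that the peak insertion rate reported in this study is sustained while the backlog drains. Neither assumption is confirmed by the report, so treat the drain time as a lower bound.

```python
# Figures from this report.
steady_rate = 220_000         # metrics per minute under normal load
outage_minutes = 5 * 60       # simulated HQ Server outage
peak_insert_rate = 2_300_000  # peak insertions per minute reported in this study

# Assumption: agents keep collecting at the steady rate and lose nothing.
backlog = steady_rate * outage_minutes

# Assumption: new metrics keep arriving at steady_rate while the backlog drains,
# so the net drain rate is the peak rate minus the steady rate.
drain_minutes = backlog / (peak_insert_rate - steady_rate)

print(f"backlog: {backlog:,} metrics")                       # 66,000,000
print(f"estimated drain time: {drain_minutes:.1f} minutes")  # 31.7
```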
Before we begin, since the systems management industry offers us many different definitions for the same term, it's important to establish the Hyperic definitions of common terms so you can better understand the information provided in this report. The Hyperic definitions are as follows:
- Platform – A machine/operating system combination or any network or storage device, also referred to as a node. Platforms are the lowest level of management, and include components such as CPUs, Network Interfaces, and File Systems. As the HQ Server depends on agents for metric collection and management operations, each platform has its own designated agent.
- Server – Any software that is installed on a platform under management. Databases, middleware, virtualization, application and web servers are all examples of servers. Servers run on platforms. Platforms host multiple Servers. Examples of servers include any installations of JBoss, Tomcat, or MySQL on a given platform.
- Service – A component of a server dedicated to a specific purpose. Typically services are represented by the units of work of a given server. Different types of servers each define a list of one or more types of services they provide. Examples of services include Webapps deployed in Tomcat, or Virtual Hosts configured in Apache. Services can also be attached directly to a platform in the case of CPU, Network Interfaces and Filesystems.
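The three definitions above form a simple containment hierarchy. The sketch below models it with Python dataclasses. The class and field names are illustrative and are not taken from the Hyperic HQ code base.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Service:
    name: str


@dataclass
class Server:
    name: str
    services: List[Service] = field(default_factory=list)


@dataclass
class Platform:
    name: str
    servers: List[Server] = field(default_factory=list)
    # Services such as CPU or file systems attach directly to the platform.
    platform_services: List[Service] = field(default_factory=list)

    def service_count(self) -> int:
        """Count services hosted by servers plus those attached to the platform."""
        hosted = sum(len(s.services) for s in self.servers)
        return hosted + len(self.platform_services)


node = Platform(
    name="web01",
    servers=[Server("Tomcat", [Service("webapp-a"), Service("webapp-b")])],
    platform_services=[Service("CPU 0"), Service("File System /")],
)
print(node.service_count())  # prints 4
```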
The total setup includes four physical machines: one for the MySQL database, one for the HQ Server, and two machines that each spawned 375 HQ agents. All agents monitor the same filesystems and report their metrics to the HQ Server. Each machine is connected via Gigabit Ethernet.
|HQ Server:
|CPU||2 Quad Core 2 GHz|
|HQ Heap||4 GB|
|NIC||Gigabit, with 1 GB interconnect between HQ and MySQL machines|
|MySQL Server:
|CPU||2 Quad Core 1596 MHz|
|MySQL InnoDB buffer pool||4.5 GB|
|Max tmp tables||48|
|Tmp table size||192M|
|Storage||RAID-1 146 GB SAS 3G Hard Drives|
|Machine 3 and 4: Hyperic HQ Agents
|CPU||2 Quad Core 1596 MHz, 64-bit|
The test environment was set up to monitor 675 agents and 32,000 services. The two machines hosting the agents spawned multiple agent processes to simulate a network with 675 nodes.
Additionally, to simulate real-life management activity across these nodes, the team created 54,600 active alert definitions: 27,000 multi-condition file service alerts, 27,000 file service value change alerts, and 600 platform availability alerts. The team also tracked 26,000 event logs.
|Number of platforms/agents||675|
|Number of servers||1,000|
|Number of services||32,000|
|Number of active alert definitions||54,600 total: 600 Value Change Platform Availability; 27,000 Value Change on file service; 27,000 Multicondition on file service|
|Number of eventlogs tracked||26,000|
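The inventory figures in this table can be cross-checked with a few lines of arithmetic. The per-platform and per-service densities below are derived here for illustration and do not appear in the original report.

```python
platforms = 675
services = 32_000
alert_definitions = {
    "platform availability (value change)": 600,
    "file service value change": 27_000,
    "file service multi-condition": 27_000,
}

# The three alert categories should add up to the reported total of 54,600.
total_alerts = sum(alert_definitions.values())

# Derived densities (not stated in the report).
services_per_platform = services / platforms
alerts_per_service = total_alerts / services

print(total_alerts)
print(round(services_per_platform, 1))
print(round(alerts_per_service, 2))
```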
Test Scenario #1: Simulate Actual Load
HQ averaged a metrics collection rate of 220,000 metric insertions per minute, with minimal load on the servers supporting the test.
|HQ Server Statistics:
The average metric collection per agent was 325 / minute.
Load average – peaked at about 8 and averaged about 2.5
CPU utilization – peaked at 20% and averaged 10%
JVM Free memory – peaked at 3GB with a low of 0.1GB and averaged 1.5GB
|MySQL Server Statistics:
CPU utilization – peaked at 20% and averaged 5%
Note: See appendix for actual HQ metric charts monitoring the HQ Server performance during this test. Unfortunately, metric charts for the maximum throughput were not preserved for the purposes of this report. Hyperic will provide charts from a future test of similar nature shortly.
Test Scenario #2: Simulate Maximum Load
HQ achieved a peak of 2.3 million metric insertions per minute, with the maximum load being placed on the HQ Server itself, and moderate load on the MySQL Server.
|HQ Server Statistics:
Total metric collection peaked at 2.3 million metric insertions per minute, or roughly 3,400 per agent per minute across the 675 agents.
|MySQL Server Statistics:
CPU utilization – peaked at 33% and averaged 10.6%
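Per-agent rates for both scenarios can be derived by dividing the reported totals by the 675 agents. The helper below is an illustrative sketch, and the function name is not part of any Hyperic tooling.

```python
AGENTS = 675


def per_agent_rate(total_per_minute: int, agent_count: int = AGENTS) -> float:
    """Average metrics per minute contributed by each agent."""
    return total_per_minute / agent_count


scenario_1 = per_agent_rate(220_000)    # actual load
scenario_2 = per_agent_rate(2_300_000)  # maximum load, while clearing the backlog

print(f"scenario 1: {scenario_1:.1f} metrics per agent per minute")  # 325.9
print(f"scenario 2: {scenario_2:.1f} metrics per agent per minute")  # 3407.4
```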
Interpretations for MySQL
As the test results show, MySQL handled the load in both tests easily. While the tests demonstrated the maximum load that a single Hyperic HQ server can deliver to a single MySQL database instance, they did not establish a theoretical maximum for multiple HQ Servers running against a single MySQL instance, as is typically required in high-availability or large-scale deployments.
Interpretations for Hyperic
Hyperic HQ is a data-intensive enterprise monitoring application, tailored for a web operations audience. For growing web-driven companies, scaling their web applications is of paramount importance to staying in business. Scale comes in two varieties:
- Horizontal scalability – monitoring and managing across tens of thousands of nodes.
- Vertical scalability – concentrating monitoring and management capabilities by increasing the number of metrics and data being monitored on fewer nodes.
Given the limited physical hardware available for the test, Hyperic chose to focus on the latter: managing the maximum number of metrics on a finite set of physical platforms. However, the results also translate to the former, since the same metric volume can be spread across a larger number of platforms in a less concentrated form.
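The translation from vertical to horizontal scale amounts to holding the total metric volume constant and varying the per-node collection rate. The sketch below shows that arithmetic. The per-node rates are hypothetical, and the model ignores any per-agent overhead on the HQ Server, which this study did not measure.

```python
def equivalent_nodes(total_metrics_per_minute: float, metrics_per_node: float) -> int:
    """Number of nodes that would generate the same total load at a given per-node rate."""
    return int(total_metrics_per_minute // metrics_per_node)


measured_total = 220_000  # scenario 1 average from this report

# Hypothetical per-node collection rates (metrics per minute), for illustration only.
for per_node in (100, 50):
    print(per_node, equivalent_nodes(measured_total, per_node))
```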
Of course, each implementation is unique, with collection rates, additional data manipulation, and deployment characteristics behaving differently. High rates of metric collection do incur small performance penalties on the systems targeted for collection. However, some metrics are so critical that failing to monitor those processes carefully can have dramatic consequences. Hyperic Enterprise customers capture an average of 50 metrics per minute.
Additionally, these tests do not demonstrate the effects that clustering of the HQ Servers would have. Given that the load on the HQ Server outweighed the load on the MySQL Server, theoretically there is substantial room for additional metric collection across a wider array of resources.
This test demonstrates both Hyperic HQ’s ability to scale to meet the demands of even the most sophisticated web shops in the world and MySQL’s ability to handle enterprise-grade, transaction-based applications. While each individual implementation of Hyperic HQ and of MySQL is unique, this report sets a high baseline with which to predict your success in using either product in an enterprise setting. At Hyperic, we believe these results help explain why our customers are showing increasing demand to pair Hyperic HQ with MySQL as a backend database store.
Metric Results during 24 hour actual load test.
Figure 1: Hyperic HQ Load Average
Figure 2: Hyperic HQ CPU Usage
Figure 3: Hyperic HQ JBoss JVM Free Memory
Figure 4: MySQL CPU Usage
Figure 5: MySQL Metric Inserts per Minute
Figure 6: MySQL CPU Utilization at Maximum Load