Hi everyone,
Hope you’re doing good!
Ensuring high availability of your Oracle databases is crucial for maintaining optimal performance and meeting Service Level Agreements (SLAs). Oracle Enterprise Manager (OEM) provides a comprehensive way for monitoring and managing database uptime. With that said, I received a call from a client and he was asking this:
- Is there any way I can calculate the DB uptime in the last 30 days using a simple and fast approach?
Well, the answer is: yes!
OEM has the uptime info from the Graphical Interface, where you can navigate to the targets and check it. I’m not a huge expert on OEM, but I’d say that you may can create some report (or maybe already exists) about availability.
OK, but we are DBA’s, we like to work using SSH and SQL*Plus, isn’t? So, the question is:
- Is there any way I can calculate the DB uptime in the last 30 days using SQL statements?
Again, the answer is: yes! Definitely!
If you are using Oracle Enterprise Manager, you can query run a SELECT statement directly against the OEM views to calculate uptime manually.
This blog post will guide you through the process of calculating database uptime and SLA percentage using SQL queries on OEM views.
Below, we’ll walkthrough the SQL query used to calculate database uptime for both individual databases and RAC (Real Application Clusters) databases over the last 30 days.
You must connect to OEM DB repository.
The complete query is here:
COLUMN target_name FORMAT a11
COLUMN uptime_hours FORMAT 9999
SET LINESIZE 200
SET PAGESIZE 500
WITH availability AS (
SELECT
target_name,
target_type,
CASE
WHEN INSTR(target_name, '_') > 0 THEN SUBSTR(target_name, 1, INSTR(target_name, '_') - 1)
ELSE target_name
END AS cluster_name,
availability_status,
start_timestamp,
NVL(end_timestamp, SYSTIMESTAMP) AS end_timestamp
FROM
SYSMAN.MGMT$AVAILABILITY_HISTORY
WHERE
target_type = 'oracle_database'
AND start_timestamp >= TRUNC(SYSDATE) - 30
),
downtime AS (
SELECT
cluster_name,
SUM(CASE WHEN availability_status != 'Target Up' THEN (end_timestamp - start_timestamp) * 24 ELSE 0 END) AS downtime_hours
FROM
availability
GROUP BY
cluster_name
),
adjusted_uptime AS (
SELECT
cluster_name,
ROUND(720 - SUM(downtime_hours), 0) AS uptime_hours
FROM
downtime
GROUP BY
cluster_name
)
SELECT
cluster_name AS target_name,
uptime_hours,
ROUND((uptime_hours / 720) * 100, 2) AS sla_percentage
FROM
adjusted_uptime
ORDER BY
cluster_name;
Let’s now break down the query. The SQL query consists of three main parts:
- Extracting and Normalizing Intervals
- Calculating Downtime
- Calculating Uptime and SLA Percentage
Let’s start with:
- Extracting and Normalizing Intervals:
This part of the query extracts relevant data from the MGMT$AVAILABILITY_HISTORY view and normalizes the cluster name for RAC databases.
WITH availability AS (
SELECT
target_name,
target_type,
CASE
WHEN INSTR(target_name, '_') > 0 THEN SUBSTR(target_name, 1, INSTR(target_name, '_') - 1)
ELSE target_name
END AS cluster_name,
availability_status,
start_timestamp,
NVL(end_timestamp, SYSTIMESTAMP) AS end_timestamp
FROM
SYSMAN.MGMT$AVAILABILITY_HISTORY
WHERE
target_type = 'oracle_database'
AND start_timestamp >= TRUNC(SYSDATE) - 30
)
Explanation:
- target_name: The name of the database or instance;
- target_type: The type of the target, which is ‘oracle_database’;
- cluster_name: Normalized name for RAC clusters by extracting the part of target_name before the first underscore. For single-instance databases, the target_name itself is used;
- availability_status: The status of the database (e.g., ‘Target Up’, ‘Target Down’);
- start_timestamp: The start time of the status interval;
- end_timestamp: The end time of the status interval, defaulting to the current timestamp if it is null.
Let’s now go to:
2. Calculating Downtime
This part aggregates downtime hours for each cluster.
downtime AS (
SELECT
cluster_name,
SUM(CASE WHEN availability_status != 'Target Up' THEN (end_timestamp - start_timestamp) * 24 ELSE 0 END) AS downtime_hours
FROM
availability
GROUP BY
cluster_name
)
Explanation:
- downtime_hours: Total hours the database was down (I’m assuming that DB is down if availability_status is different than ‘Target Up’).
This part of query will sum the “downtime” hours for each cluster.
Let’s now go to:
3. Calculating Uptime and SLA Percentage
This part calculates the adjusted uptime and SLA percentage based on the total possible hours in a month, 30 days (720 hours):
adjusted_uptime AS (
SELECT
cluster_name,
ROUND(720 - SUM(downtime_hours), 0) AS uptime_hours
FROM
downtime
GROUP BY
cluster_name
)
SELECT
cluster_name AS target_name,
uptime_hours,
ROUND((uptime_hours / 720) * 100, 2) AS sla_percentage
FROM
adjusted_uptime
ORDER BY
cluster_name;
Explanation:
- uptime_hours: Total possible hours (720) minus the downtime hours;
- sla_percentage: The SLA percentage calculated as (uptime_hours / 720) * 100.
By following this approach, you can accurately calculate the database uptime and SLA percentage for your Oracle databases using OEM views. This method is particularly useful for generating uptime reports and ensuring that your databases meet the required availability standards.
Below we have an output as example. The DB names were replaced by security reasons:
TARGET_NAME UPTIME_HOURS SLA_PERCENTAGE
----------- ------------ --------------
DB01 701 97.36
DB02 681 94.58
DB03 708 98.33
DB04 711 98.75
DB05 705 97.92
DB06 692 96.11
DB07 711 98.75
DB08 701 97.36
DB09 705 97.92
DB10 703 97.64
DB11 689 95.69
DB12 701 97.36
DB13 713 99.03
DB14 714 99.17
DB15 714 99.17
DB16 714 99.17
DB17 714 99.17
DB18 714 99.17
DB19 714 99.17
DB20 714 99.17
DB21 708 98.33
DB22 547 75.97
Hope this helps.
Peace!
Vinicius
Related posts
Disclaimer
My postings reflect my own views and do not necessarily represent the views of my employer, Accenture.