DEV Community

Abhilash Kumar Bhattaram for Nabhaas Cloud Consulting


OCI ExaCS Metrics - now uses AHF for all its Database and Cluster Metrics

{ Abhilash Kumar Bhattaram : Follow on LinkedIn }

OCI Metrics is a large part of Oracle Cloud ExaCS monitoring, and it has migrated all of its metrics collection to AHF (Autonomous Health Framework), known in earlier terms as TFA (Trace File Analyzer).

Let's zoom out a little and understand what kinds of metrics are actually provided.

The following metrics are provided for ExaCS Databases:

  • CPU Utilization
  • Storage Utilization
  • Execute Counts
  • Block Changes
  • Parse Count
  • Current Logons

The complete list of health metrics is documented here:
https://docs.oracle.com/en-us/iaas/exadatacloud/exacs/manage-vm-clusters.html

That brings us to the question: how was OCI reporting these metrics so far, and why is it essential to understand this change to AHF?

  • Until now, Oracle has collected these metrics with its own tooling and populated the graphs from database internals. My best guess is that this meant AWR, v$sysmetric, dba_hist_sysmetric and other associated database views. By moving to AHF, Oracle no longer needs to query the CDBs and PDBs directly to gather these metrics. They are already collected automatically in the AHF telemetry data, so it makes sense to reuse as many of these pre-collected metrics as possible.

  • This is all good to hear, but there is a catch. AHF (or rather TFA) has a known history of causing CPU load by itself, and the only options were to stop TFA on the running nodes or upgrade to the latest AHF; that is the best-case answer any Oracle MOS SR would provide. In my experience, recent versions of AHF (> 22.4) seem to be pretty stable, but you never know, so DBAs need to be cautious. If a future AHF CPU load issue leads to AHF being disabled, that will directly impact the Metrics and Alarms, so Oracle has added pressure to make future versions of AHF more resilient.

  • All Metrics are in UTC, and this is a major issue for ExaCS implementations in non-UTC time zones running AHF < 22.4, where the metrics simply would not work. A MOS SR will provide a clean hack to get the metrics working again, depending on the versions in use (there are no documented MOS notes on this either). The usual fix is to upgrade to the latest AHF, or to point the telemetry at converting the data points to UTC as it gathers them.
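Since every data point is stored in UTC, any time window picked in local time has to be converted before it is compared against the metrics. The following is a minimal sketch of that conversion using only the Python standard library. The fixed UTC+05:30 offset and the function names are assumptions for illustration; this is not how AHF itself performs the conversion.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical local zone for illustration: a fixed UTC+05:30 offset.
LOCAL_TZ = timezone(timedelta(hours=5, minutes=30))


def to_utc(local_naive: datetime, local_tz: timezone = LOCAL_TZ) -> datetime:
    """Attach the local zone to a naive timestamp and convert it to UTC."""
    return local_naive.replace(tzinfo=local_tz).astimezone(timezone.utc)


def utc_window(local_start: datetime, local_end: datetime,
               local_tz: timezone = LOCAL_TZ) -> tuple:
    """Return (start, end) as ISO 8601 strings in UTC for a local time window."""
    return (to_utc(local_start, local_tz).isoformat(),
            to_utc(local_end, local_tz).isoformat())


if __name__ == "__main__":
    # A one-hour window expressed in local wall-clock time.
    start, end = utc_window(datetime(2022, 10, 1, 9, 0),
                            datetime(2022, 10, 1, 10, 0))
    print(start, end)
```

A fixed offset keeps the sketch free of time zone database lookups; a zone with daylight saving would need a proper zone definition instead.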

New Namespaces used by OCI Metrics

oracle_oci_database - the default namespace in use all this while
Until October 2022, all database metrics were collected in oracle_oci_database, and they will continue to be. Alarms that are based on the databases will continue to work from this namespace. As per my understanding, Oracle will continue to gather metrics in this namespace until AHF is streamlined.

oci_database_cluster
This is the new namespace into which the AHF telemetry is pulled, and Oracle will generate the ExaCS Database home page graphs from the data collected here. For all ExaCS (and, I assume, DBCS bare metal RAC) systems, the generated metrics graphs will come from this namespace. oci_database_cluster essentially provides the GI/ASM-related metrics gathered by TFA.

oci_database
This is a similar new namespace that also gathers its data from AHF telemetry, but it holds the database-related metrics.
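To make the three namespaces concrete, here is a rough sketch that builds a Monitoring Query Language (MQL) expression for each of them. It only assembles the namespace and query strings; it does not call OCI. The metric name CpuUtilization is my assumed spelling of the "CPU Utilization" metric listed above, and the helper names are hypothetical, so check the metrics reference for the exact names available in each namespace.

```python
# Namespaces named in this post.
NAMESPACES = ("oracle_oci_database", "oci_database_cluster", "oci_database")


def build_mql(metric: str, interval: str = "5m", statistic: str = "mean",
              dimensions: dict = None) -> str:
    """Build an MQL expression: Metric[interval]{dim = "value"}.statistic()."""
    dims = ""
    if dimensions:
        # Sort the keys so the generated query string is deterministic.
        pairs = ", ".join(f'{key} = "{value}"'
                          for key, value in sorted(dimensions.items()))
        dims = "{" + pairs + "}"
    return f"{metric}[{interval}]{dims}.{statistic}()"


def build_request(namespace: str, metric: str, **kwargs) -> dict:
    """Pair a namespace with an MQL query, rejecting namespaces not listed above."""
    if namespace not in NAMESPACES:
        raise ValueError(f"unknown namespace: {namespace}")
    return {"namespace": namespace, "query": build_mql(metric, **kwargs)}


if __name__ == "__main__":
    # Assumed metric name; the same query is prepared for every namespace.
    for ns in NAMESPACES:
        print(build_request(ns, "CpuUtilization", interval="1m"))
```

Comparing the same metric across the old and new namespaces is a quick way to see whether existing Alarms and the console graphs are reading from the same source.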

There is no official documentation on this at this point in the OCI Metrics Documentation Page

https://docs.oracle.com/en-us/iaas/Content/Database/References/databasemetrics.htm

But there is a white paper released by Oracle in October 2022 which has details about the oci_database_cluster and oci_database namespaces. This document is for Oracle Dedicated Infrastructure on OCI, but the metric details it provides hold good for ExaCS infrastructure.

https://docs.oracle.com/en/engineered-systems/exadata-cloud-service/ecscm/exadata-database-service-dedicated-infrastructure-administrators-guide.pdf

At this point only ExaCS is using AHF as its default repository for Metrics, and I expect DBCS to follow the same pattern.

I will continue to fill in more information on this as I see more official documentation.

#oracle #oci #oracledatabase #exadata #exacs #curl #api #oraclecloud #oracledba #nabhaas #metrics #ahf #tfa
