FusionCloud 6.3.1.1 Solution Description 04

Routine Monitoring

The ManageOne OM plane provides all-round and hierarchical monitoring functions. O&M personnel can monitor resources, alarms, performance, capacity usage, and other information of the entire network, and learn the health status of network elements (NEs) and ICT resources in real time, which reduces IT costs, increases O&M efficiency, and improves user experience.

NOTE:

ManageOne does not support scenarios in which a host belongs to multiple host groups. If a host belongs to multiple host groups, the queried host or VM data may be duplicated.

All-round and hierarchical monitoring includes object monitoring and comprehensive monitoring. Figure 7-26 shows the logical architecture for monitoring.

Figure 7-26 Logical architecture of all-round and hierarchical monitoring

Table 7-5 describes the logical architecture of all-round and hierarchical monitoring.

Table 7-5 Logical architecture of all-round and hierarchical monitoring

Monitoring Type

Description

Object monitoring

  • Physical device monitoring: monitors the alarms, topologies, and performance of servers, network devices, and storage devices.
  • Resource pool monitoring: monitors the capacities, performance, and load of computing, storage, and network resources.
  • Service monitoring: monitors the alarms and performance of the ManageOne system services and cloud service systems.
  • Cloud service instance (tenant resource) monitoring: monitors the alarms and performance of resource instances (such as computing, storage, network, and security resource instances) in VDCs or under resource management tenants.
  • Tenant application monitoring: monitors and collects statistics on resources of tenant applications.

Comprehensive monitoring

  • Centralized alarm monitoring: centrally monitors the alarms of system services or third-party systems.
  • Overall DC monitoring: collects data about resources, alarms, and capacity of DCs in different regions, and displays the overall running status of the DCs on different Dashboard pages.

Centralized Alarm Monitoring

Introduction

Alarm Monitoring of the ManageOne OM plane centrally monitors the alarms of system services and third-party systems, facilitating quick locating and handling of network faults and ensuring normal services. Alarm Monitoring is dedicated to monitoring and O&M of ever-evolving complex networks. Alarm Monitoring can be used to monitor faults on traditional networks and next-generation networks, which shortens fault recovery durations and improves network O&M efficiency.

Logical Architecture

Alarm Monitoring provides a unified alarm model. Third-party systems have their own drivers and report alarms using the interfaces provided by Alarm Monitoring to achieve unified alarm management. Figure 7-27 shows the logical architecture of Alarm Monitoring.

Figure 7-27 Logical architecture of Alarm Monitoring
Table 7-6 Logical architecture of Alarm Monitoring

Third-Party System | Description
Physical devices | Uses eSight or ZOHO to collect the alarms of servers, storage devices, and network devices and report the alarms to the ManageOne OM plane.
Resource pools | Uses FusionSphere OpenStack to collect the alarms of computing, storage, and network resource pools and report the alarms to the ManageOne OM plane.
Cloud services | Uses the service monitoring agent, FusionSphere Service OM, or FusionInsight Manager to collect the alarms and report the alarms to the ManageOne OM plane.

Alarm Handling Mechanisms

Alarm Monitoring provides three alarm handling mechanisms. Alarm merging rules help users improve alarm monitoring efficiency. The processing rules of the full current alarm cache are used to control the number of current alarms. Alarm dump rules are used to control the storage capacity of databases to prevent impact on system performance. Table 7-7 describes the alarm handling mechanisms.

Table 7-7 Alarm handling mechanisms

Mechanism

Description

Alarm merging rule

To help users improve the efficiency of monitoring and handling alarms, Alarm Monitoring provides alarm merging rules. Alarms with the same specified fields (nativeMoDn, moi, alarmGroupId, alarmId, reasonId, and specialAlarmStatus) are merged into one alarm. This rule is used only for monitoring and viewing alarms on the Current Alarms page and takes effect only for current alarms.

The specific implementation scheme is as follows:

  • If a newly reported alarm does not correspond to any previously reported alarm that meets the merging rule, the newly reported alarm is displayed as a merged alarm and the value of Occurrence Times is 1.
  • If the newly reported alarm B and the previously reported alarm A meet the merging rule, alarm B and alarm A are merged into one alarm record and are sorted by clearance status and occurrence time.

    If alarm A is displayed on top, it is still regarded as a merged alarm, and the Occurrence Times value of the merged alarm increases by one. Alarm B is regarded as an individual alarm.

    If alarm B is displayed on top, it is regarded as a merged alarm, and the Occurrence Times value of the merged alarm increases by one. Alarm A is regarded as an individual alarm.

    In the alarm list, click the Occurrence Times value of an alarm to view detailed information about the merged alarm and its individual alarms.

  • If a merged alarm is cleared, it will be converted into an individual alarm. The previous individual alarms will be sorted by clearance status and occurrence time. The first one becomes a merged alarm.
  • If a merged alarm or individual alarm is cleared and acknowledged, the alarm will be converted to a historical alarm and the value of Occurrence Times decreases by one.
  • On the Current Alarms page, every alarm that is not a merged alarm is an individual alarm.

Processing rule of the full current alarm cache

To prevent excessive current alarms from affecting system performance, Alarm Monitoring provides a processing rule for a full current alarm cache. When the number of current alarms stored in the database reaches 30,000, Alarm Monitoring applies the following two rules to move alarms to the historical-alarm list until the number of current alarms falls back within the proper range:

  • The cleared alarms, acknowledged but uncleared ADMC alarms, acknowledged but uncleared ADAC alarms, and unacknowledged and uncleared alarms are added to the historical-alarm list.
  • The earliest reported alarms are added to the historical-alarm list first.

Alarm dump rule

To keep the alarm database from growing too large, the system processes events, masked alarms, and historical alarms every 2 minutes according to the following rules:

  • If the database tablespace usage reaches 80%, Alarm Monitoring dumps the data in the database to files according to the sequence of occurrence time and the data table type (event, masked alarm, or historical alarm).
  • The dumped file will be deleted after 180 days.
  • If the size of the dumped file exceeds 1024 MB or the total number of files exceeds 1000, the system deletes the earliest files.
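The alarm merging rule in Table 7-7 can be illustrated with a minimal Python sketch. Only the six merge field names come from the text above. The `cleared` and `occurred_at` keys, the function names, and the sort direction (uncleared first, then most recent first) are assumptions made for illustration, because the text states only that alarms are sorted by clearance status and occurrence time. This is not the ManageOne implementation.

```python
from collections import defaultdict

# Merge fields named in the alarm merging rule.
MERGE_FIELDS = ("nativeMoDn", "moi", "alarmGroupId", "alarmId",
                "reasonId", "specialAlarmStatus")


def merge_key(alarm):
    """Return the tuple of merge-field values that identifies a merge group."""
    return tuple(alarm.get(name) for name in MERGE_FIELDS)


def merge_current_alarms(alarms):
    """Group current alarms that share all merge fields.

    Each group yields one merged alarm (the first after sorting), the
    remaining individual alarms, and an Occurrence Times count.
    """
    groups = defaultdict(list)
    for alarm in alarms:
        groups[merge_key(alarm)].append(alarm)

    merged_view = []
    for members in groups.values():
        # Assumed ordering: uncleared before cleared, then newest first.
        ordered = sorted(members,
                         key=lambda a: (a["cleared"], -a["occurred_at"]))
        merged_view.append({
            "merged": ordered[0],
            "individual": ordered[1:],
            "occurrence_times": len(ordered),
        })
    return merged_view


def make_alarm(mo, occurred_at, cleared=False):
    """Build a hypothetical alarm record for the example below."""
    return {"nativeMoDn": mo, "moi": "cpu", "alarmGroupId": 1,
            "alarmId": 100, "reasonId": 0, "specialAlarmStatus": 0,
            "cleared": cleared, "occurred_at": occurred_at}


if __name__ == "__main__":
    view = merge_current_alarms([
        make_alarm("host-1", 10),
        make_alarm("host-1", 20),
        make_alarm("host-2", 15),
    ])
    for row in view:
        print(row["merged"]["nativeMoDn"], row["occurrence_times"])
```

In this sketch the two `host-1` alarms collapse into one merged record with an Occurrence Times value of 2, while the `host-2` alarm stays a merged alarm with a value of 1, matching the first two bullets of the rule.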

Related Concepts

Alarm Management enables network maintenance personnel to monitor and manage alarms or events reported by the system or managed objects (MOs). Alarm Management provides various monitoring and handling rules and notifies O&M personnel of faults. In this way, network faults can be efficiently monitored, quickly located, and handled, ensuring proper service running. MOs refer to the objects or NEs connected to Alarm Management.

Alarm and Event

If the system or MOs detect an exception or a significant status change, an alarm or event will be displayed on the GUI of Alarm Management. Table 7-8 describes the definitions of the alarm and event.

Table 7-8 Alarm and event

Name

Description

Differences Between Alarms and Events

Similarities

Alarm

Indicates a notification generated when the system or an MO is faulty.

  • An alarm indicates that an exception or fault occurs in the system or MO. An event is a notification generated when the system or MO is running properly.
  • Alarms must be handled. Otherwise, services will be abnormal due to these exceptions or faults. Events do not need to be handled and are used for analyzing and locating problems.
  • Users can acknowledge and clear alarms on the GUI. Users cannot acknowledge or clear events.

Alarms and events are presented to users as notifications.

Event

Indicates a notification of status changes generated when the system or an MO is running properly.

Alarm Severity

The alarm severity indicates the severity, importance, and urgency of a fault. It helps O&M personnel quickly identify the importance of an alarm and apply the corresponding handling policy. You can also change the severity of an alarm as required.

Table 7-9 lists the alarm severities.

Table 7-9 Alarm severities

Alarm Severity

Color

Description

Handling Policy

Critical

Services are affected. Corrective measures must be taken immediately.

The fault must be rectified immediately. Otherwise, services may be interrupted or the system may break down.

Major

Services are affected. If the fault is not rectified in a timely manner, serious consequences may occur.

Major alarms need to be handled in time. Otherwise, important services will be affected.

Minor

Indicates a minor impact on services. Problems of this severity may result in serious faults, and therefore corrective actions are required.

You need to find out the cause of the alarm and rectify the fault.

Warning

Indicates that a potential or imminent fault that affects services is detected, but services are not affected.

Warning alarms are handled based on network and NE running status.

Alarm Status

Table 7-10 lists the alarm statuses.

Table 7-10 Alarm statuses

Status Name

Alarm Status

Description

Acknowledgement status

Acknowledged and unacknowledged

The initial acknowledgment status is Unacknowledged. A user who views an unacknowledged alarm and plans to handle it can acknowledge the alarm. When an alarm is acknowledged, its status changes to Acknowledged. Acknowledged alarms can be unacknowledged. When an alarm is unacknowledged, its status is restored to Unacknowledged. You can also configure auto acknowledgment rules to automatically acknowledge alarms.

Clearance status

Cleared and uncleared

The initial clearance status is Uncleared. When a fault that causes an alarm is rectified, a clearance notification is automatically reported to Alarm Management and the clearance status changes to Cleared. For some alarms, clearance notifications cannot be automatically reported. You need to manually clear these alarms after corresponding faults are rectified. The background color of cleared alarms is green.

Maintenance status

Normal and maintenance

The initial maintenance status is Normal. You can configure an identification rule to identify alarms generated during commissioning as Maintenance. When monitoring or querying alarms, you can set filter criteria to filter out maintenance alarms.

NOTE:
  • The maintenance status corresponding to Normal is NORMAL.
  • The maintenance status corresponding to Maintenance is INSTALL, EXPAND, UPGRADE, or TESTING.

Validity

Valid and invalid

The initial validity status is Valid. You can configure an identification rule to identify alarms that do not concern you as invalid alarms. When monitoring or querying alarms, O&M personnel can set the filter criteria to filter out invalid alarms.
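The relationship between the displayed maintenance status and the underlying values in the NOTE of Table 7-10 can be sketched as a small lookup. The five status values come from the text. The `maintenance_status` key and the function names are hypothetical, chosen only for this example.

```python
# Underlying values that correspond to the Maintenance status (from the NOTE).
MAINTENANCE_VALUES = {"INSTALL", "EXPAND", "UPGRADE", "TESTING"}


def maintenance_category(raw_status):
    """Map an underlying status value to Normal or Maintenance."""
    if raw_status == "NORMAL":
        return "Normal"
    if raw_status in MAINTENANCE_VALUES:
        return "Maintenance"
    raise ValueError(f"unknown maintenance status: {raw_status!r}")


def exclude_maintenance(alarms):
    """Keep only alarms whose maintenance status is Normal.

    Each alarm is a dict with a hypothetical 'maintenance_status' key.
    """
    return [alarm for alarm in alarms
            if maintenance_category(alarm["maintenance_status"]) == "Normal"]


if __name__ == "__main__":
    sample = [
        {"name": "link down", "maintenance_status": "NORMAL"},
        {"name": "board reset", "maintenance_status": "UPGRADE"},
    ]
    print([alarm["name"] for alarm in exclude_maintenance(sample)])
```

A filter of this kind mirrors the option described in the table of filtering out maintenance alarms when monitoring or querying.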

Event Status

Table 7-11 lists the classification of event statuses.

Table 7-11 Event status classification

Status Name

Alarm Status

Description

Maintenance Status

Normal and maintenance

The maintenance status of an event is fixed and cannot be set using the identification rule. When monitoring or querying events, you can set filter criteria to filter out maintenance events.

NOTE:
  • The Normal event is displayed as NORMAL in the Maintenance Status column of the list on the Event Logs page.
  • The Maintenance event is displayed as INSTALL, EXPAND, UPGRADE, or TESTING in the Maintenance Status column of the list on the Event Logs page.

Current Alarms and Historical Alarms

Table 7-12 describes current alarms and historical alarms.

Table 7-12 Current alarms and historical alarms

Name | Description
Current alarms | Current alarms include uncleared and unacknowledged alarms, acknowledged but uncleared alarms, and unacknowledged but cleared alarms. When monitoring current alarms, you can identify faults in time, operate accordingly, and notify maintenance personnel of these faults.
Historical alarms | Acknowledged and cleared alarms are historical alarms. You can analyze historical alarms to optimize system performance.
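The split described in Table 7-12 reduces to one condition: an alarm is historical only when it is both acknowledged and cleared. A minimal sketch follows, with a hypothetical function name:

```python
def alarm_list_for(acknowledged, cleared):
    """Return the list an alarm belongs to, following Table 7-12."""
    return "historical" if (acknowledged and cleared) else "current"


if __name__ == "__main__":
    # Enumerate all four combinations of acknowledgement and clearance status.
    for acknowledged in (False, True):
        for cleared in (False, True):
            print(acknowledged, cleared, alarm_list_for(acknowledged, cleared))
```

The three combinations that return "current" correspond to the three kinds of current alarms listed in the table.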

Alarm and Event Types

Alarm and event types facilitate query, analysis, and processing of alarms and events. You can select types when filtering alarms and events.

Table 7-13 describes the types of alarms and events.

Table 7-13 Alarm and event types

Type

Description

Communications alarm

Alarms caused by communication failures within an NE, between NEs, between an NE and a management system, or between management systems. Example: device communication interruption alarm.

Quality of service alarm

Alarms caused by service quality deterioration. Example: device congestion alarm.

Processing error alarm

Alarms caused by software or processing errors. Example: version mismatch alarm.

Equipment alarm

Alarms caused by physical resource faults. Example: board fault alarm.

Environment alarm

Alarms caused by problems related to the location of a device. Example: temperature alarm generated when the hardware temperature is too high.

Integrity alarm

Alarms generated when requested operations are denied. Example: alarms caused by unauthorized modification, addition, and deletion of user information.

Operation alarm

Alarms generated when the required services cannot run properly due to problems such as service unavailability, faults, or incorrect invocation. Example: service rejection, service exit, and procedural errors.

Physical resource alarm

Alarms generated when physical resources are damaged. Example: alarms caused by cable damage and intrusion into an equipment room.

Security alarm

Alarms generated when security issues are detected by a security service or mechanism. Example: authentication failures, disclosure of confidential information, and unauthorized access.

Time domain alarm

Alarms generated when an event occurs at an improper time. Example: alarms caused by information delay, an invalid key, or resource access at an unauthorized time.

Property change

Events generated when MO attributes change. Example: events caused by addition, reduction, and change of attributes.

Object creation

Events generated when an MO instance is created.

Object deletion

Events generated when an MO instance is deleted.

Relationship change

Events generated when MO relationship attributes change.

State change

Events generated when MO status attributes change.

Route change

Events generated when routes change.

Protection switching

Alarms or events caused by a protection switchover.

Over limit

Alarms or events reported when the performance counter reaches the threshold.

File transfer status

Alarms or events reported when the file transfer succeeds or fails.

Backup status

Events generated when MO backup status changes.

Heart beat

Events generated when heartbeat notifications are sent.

Overall DC Information Monitoring

The homepage of the ManageOne OM plane displays information such as resources, alarms, capacities, and network status of data centers (DCs) on different dashboard tab pages. The O&M Maps tab page centrally manages O&M functions and provides a unified O&M portal.

  • Dashboard

    Dashboards collect statistics on DCs in different regions along multiple dimensions, such as resources, alarms, and capacities. The collected data is displayed on different dashboards to help administrators understand the overall running status of the DCs.

    Administrators can customize a dashboard tab page based on the characteristics of different monitoring indicators, display key indicator data of a DC using a suitable chart type (such as a pie chart or column chart), and add the dashboard tab page to favorites on the homepage. This helps administrators monitor the running status of the DC more clearly and intuitively. In addition, various WebUIs can be used to improve the display effect and user experience.

  • O&M Maps page

    The O&M Maps page provides a unified O&M portal for alarm monitoring, resource configuration, and assurance analysis of ManageOne DCs in different regions. From this page, users can jump to other service systems in single sign-on (SSO) mode and quickly access common tasks. The page also provides access statistics for common tasks, O&M maps, and quick links.

    On the O&M Maps page, administrators can set quick links for common tasks, third-party systems, and functions and O&M services. Administrators can click Access Statistics to obtain the frequently accessed O&M functions and services and add quick links for these services. In this way, administrators can obtain information from the O&M Maps page more efficiently.

Concepts

  • Dashboard: A dashboard is a data-visualized tab page. It consists of one or more visualized components and displays DC metering information and key service indicators. Visualized components are widgets of a dashboard. Visualized components include charts (including line charts, area charts, and column charts) and various data indicators, and display data in different dimensions, such as performance, capacity, and resources.
  • O&M Maps: displays the functions and services on the ManageOne OM plane in a centralized manner.

Logical Architecture

  • Dashboard

    Figure 7-28 shows the logical architecture of Dashboard.

    Figure 7-28 Logical architecture of Dashboard

    Table 7-14 describes the logical architecture of Dashboard.

    Table 7-14 Descriptions of the Dashboard logical architecture

    Dashboard Category

    Description

    Preconfigured dashboard

    Preconfigured dashboards include Data Center Overview, Resource Pool Overview, Multi-Level Cloud Resource Overview, and VDC Resource Details and are displayed on the homepage of the ManageOne OM plane by default.

    • Data Center Overview: displays the number of physical devices on the entire network, number of devices, server quantity collected by status, cloud service provisioning statistics, and resource allocation in each region.
    • Resource Pool Overview: displays the number of resources on the entire network, number of resources, and resource allocation in each region.
    • Multi-Level Cloud Resource Overview: displays information about physical devices, resource usage, cloud service provisioning, and current alarm quantity and distribution in the cloud of the current level.
    • VDC Resource Details: displays the cloud DC data, such as the number of first-level VDCs, scale and resource distribution for each first-level VDC.

    Customized dashboard

    If the preconfigured dashboards cannot meet centralized monitoring requirements, administrators can create dashboards, analyze monitoring data characteristics, configure the data and layout on the Dashboard Management page, and add a dashboard to favorites so that it is displayed on the homepage, meeting monitoring and demonstration requirements.

  • O&M Maps

    The O&M Maps page centrally displays O&M functions and services of ManageOne. Administrators can directly redirect to services of third-party systems using Quick Links. Administrators can create tasks through Common Tasks to quickly process these tasks. Administrators can view the statistics on the number of access times of Common Tasks, O&M Maps, and Quick Links by clicking Access Statistics.

    Figure 7-29 shows the principles of O&M Maps.

    NOTE:

    Items displayed on Quick Links, Common Tasks, and O&M services can be set based on O&M requirements.

    Figure 7-29 Logical architecture of O&M Maps

    Table 7-15 describes the logical architecture of O&M Maps.

    Table 7-15 Logical architecture of O&M Maps

    Function

    Description

    Benefits

    Access Statistics

    Collects the access times of the following items:

    • Common Tasks
    • O&M Maps
    • Quick Links

    Administrators can click Access Statistics to obtain the frequently accessed O&M functions and services and add quick links for these services. In this way, administrators can obtain information from the O&M Maps page more efficiently.

    Common Tasks

    Allows administrators to set common tasks as required.

    Administrators can set common tasks to display frequently used O&M tasks in the Common Tasks area on the O&M Maps page, facilitating quick O&M task operations.

    O&M Services

    Allows administrators to set O&M services as required. By default, O&M services are classified into Monitoring, Configuration, and Assurance services.

    Administrators can set O&M services to display frequently used O&M services in the O&M Services area on the O&M Maps page, implementing quick O&M service redirection.

    Quick Links

    Allows administrators to add quick links for frequently accessed third-party systems.

    Administrators can add quick links for frequently accessed third-party systems in the Quick Links area on the O&M Maps page, implementing quick third-party system redirection.

Physical Device Monitoring

The Physical Devices function centrally monitors and manages hardware devices such as data center servers, storage devices, network devices, and equipment room devices, and provides comprehensive monitoring capabilities such as alarms, components, topologies, and performance, helping O&M personnel quickly locate and rectify hardware faults.

Physical Devices obtains resource data from:

  • Interconnected systems: Physical Devices is interconnected with eSight, ZOHO OPM, and ZOHO APM using System Access. Physical Devices automatically and periodically synchronizes base resources and location resources from these interconnected systems. The default synchronization period is 180 minutes.
  • Self-planning: Administrators can manually add base resources and location resources based on self-planning.

Logical Architecture

Figure 7-30 shows the logical architecture of Physical Devices monitoring.
Figure 7-30 Logical architecture of Physical Devices monitoring
  • Physical Devices is interconnected with eSight, ZOHO OPM, and ZOHO APM using System Access. Physical Devices automatically and periodically synchronizes system resources, including base resources, from these interconnected systems. The default synchronization period is 180 minutes.
  • Table 7-16 lists the types and data sources of physical devices.
    Table 7-16 Types and data sources of physical devices

    Physical Device Type

    Base Resource Type

    Base Resource Subtype

    Data Source

    Base Resource

    Server

    Rack server

    eSight, ZOHO OPM, and ZOHO APM; Self-planning

    High-density server

    Heterogeneous server

    Storage server

    Third-party server

    Blade

    Blade server

    KunLun Server

    Network device

    Switch

    eSight, ZOHO OPM, and ZOHO APM; Self-planning

    Router

    Firewall

    Load balancer

    Storage device

    Storage

    eSight, ZOHO OPM, and ZOHO APM; Self-planning

    FC switch

    Equipment room device

    Cabinet

    Self-planning

  • Physical Devices allows administrators to manually add base resources, data centers, and equipment room location resources based on self-planning.
  • The Alarms, Monitoring Configuration, and Resource Pools functions obtain physical device data from Physical Devices and use the data for business analysis.

Resource Pool Monitoring

Resource Pool Monitoring helps administrators monitor the overall status of various resources in ManageOne. It supports real-time tracing of the resource data and performance of multi-level cloud resources, two-level clouds, VRM clouds, IaaS resource pools, and big data resource pools, and automatically generates statistics tables. This helps administrators predict resource capacity trends, identify risks, and take preventive measures in a timely manner to ensure normal service running.

Table 7-17 lists the resources that can be monitored.

Table 7-17 Resources that can be monitored

Type

Application Scenario

Multi-level cloud

When multiple ManageOne systems need to be centrally managed, multi-level cloud management allows administrators to interconnect the clouds with ManageOne and configure the logical relationships among ManageOne systems to implement unified multi-cloud management and collect resource data at different logical locations. Multi-level cloud management monitors the scale, capacity, resources, and performance of each resource pool from the cloud dimension.

IaaS resource pool

When you need to monitor the basic capacity, cloud resource load, and resources, you can monitor the IaaS resource pool and trace resource data in real time by region, resource pool, AZ, and cluster.

Big data resource pool

When you need to monitor the usage of big data resources accessed from FusionInsight and synchronize big data clusters on the ManageOne OM plane, you can monitor the big data resource pool to obtain real-time and historical monitoring indicators of clusters, obtain the status and configuration data of services and hosts, and perform a series of function operations on clusters, services, and hosts.

Related Concepts

  • Dimensions and icons of different cloud types are as follows:
    • Private cloud: (two-level cloud), (Region), (resource pool), (AZ), (cluster or host group)
    • Public cloud: (Region)
  • Resource Pools manages the following types of data: performance data, capacity data, and resource data.
  • Private cloud: A cloud built for the internal use of an enterprise. It is an extension and optimization of a traditional data center and provides storage capacity and processing capabilities for various functions. It provides effective control over and guarantees for data confidentiality, data security, and quality of service (QoS). Its defining features are security and private ownership, which form the foundation of customized solutions.
  • Public cloud: A cloud in which an Internet Data Center (IDC) or third-party service provider supplies resources such as applications and storage devices. A public cloud offers strong scalability and low cost, but gives users less control over cloud resources, lower data security, and a poorer match to specific requirements.
  • Two-level cloud: A mode in which resources are requested from a peer FusionCloud through interconnection with the FusionCloud API Gateway at the peer end. This ensures that resources can be borrowed quickly from the peer DC when resources in the local DC are insufficient.
  • Multi-level cloud: A logical relationship tree of cloud systems, formed by interconnecting and configuring ManageOne cloud service systems in different regions and services. It implements unified multi-cloud management and monitors the scale, capacity, resources, and performance of each resource pool from the cloud dimension.

Logical Architecture

With the logical structure of Resource Pools, administrators can better understand the unified multi-level cloud monitoring model and configure and manage multi-level cloud relationships based on service requirements in actual O&M scenarios. By learning the data source and display content of the IaaS resource pool, administrators can adjust resource allocation in time and provide optimal service policies.

The following uses a two-level police cloud as an example to describe the physical model and the multi-level cloud logical model, as shown in Figure 7-31.

Figure 7-31 Logical architecture of Resource Pool Monitoring

Each blue rectangle in the physical model represents a ManageOne OM system. The physical model can show only the physical interconnection between the public security network cloud (provincial police department) and several other ManageOne OM systems. Multi-level Cloud Monitoring transforms the physical model into an integrated multi-level cloud model. In the logical model, each yellow rounded rectangle represents a cloud node. Administrators define a number of cloud nodes (for example, the provincial police department cloud) and attach the public security network cloud (provincial police department) and several ManageOne OM systems to them. Each cloud node displays the resource data of the ManageOne OM systems attached to it, together with data statistics and comparisons.

  • Physical model:
    • In the first-level cloud model, the public security network cloud (provincial police department) is the upper-level cloud, and the Internet cloud (provincial police department), the video network cloud (provincial police department), and the public security network cloud (city A) are lower-level clouds.
    • In the second-level cloud model, the public security network cloud (city A) is the upper-level cloud, and the Internet cloud (city A) and the video network cloud (city A) are lower-level clouds.
  • Logical model:
    • Cloud nodes are created in the two upper-level clouds (ManageOne OM systems) in the physical model.
      • Create two cloud nodes in the public security network cloud (provincial police department): provincial and municipal integrated cloud and provincial police department cloud.
      • Create a cloud node in the public security network cloud (city A): cloud in city A.
    • The public security network cloud (provincial police department), Internet cloud (provincial police department), and video network cloud (provincial police department) are attached to the provincial police department cloud, and the public security network cloud (provincial police department) is the local cloud under the provincial police department cloud.
    • The public security network cloud (city A), Internet cloud (city A), and video network cloud (city A) are attached to the cloud in city A. The public security network cloud (city A) is the local cloud under the cloud in city A.
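The logical model above can be sketched as a small tree in Python. The cloud node names and the systems attached to each node follow the list above. Everything else is an assumption made for illustration: the class and function names, the `vcpus` counter and its values, and in particular the choice to make the provincial and municipal integrated cloud the parent of the other two cloud nodes, which the text does not state explicitly.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class OmSystem:
    """A ManageOne OM system from the physical model."""
    name: str
    vcpus: int  # hypothetical resource counter with illustrative values


@dataclass
class CloudNode:
    """A cloud node in the logical model."""
    name: str
    systems: List[OmSystem] = field(default_factory=list)
    children: List["CloudNode"] = field(default_factory=list)

    def total_vcpus(self) -> int:
        """Sum the counter over attached systems and all child nodes."""
        own = sum(system.vcpus for system in self.systems)
        return own + sum(child.total_vcpus() for child in self.children)


def build_example() -> CloudNode:
    """Build the two-level example with made-up resource numbers."""
    provincial = CloudNode(
        "provincial police department cloud",
        systems=[
            OmSystem("public security network cloud (provincial)", 400),
            OmSystem("Internet cloud (provincial)", 100),
            OmSystem("video network cloud (provincial)", 200),
        ],
    )
    city_a = CloudNode(
        "cloud in city A",
        systems=[
            OmSystem("public security network cloud (city A)", 120),
            OmSystem("Internet cloud (city A)", 30),
            OmSystem("video network cloud (city A)", 60),
        ],
    )
    # Assumed hierarchy: the integrated cloud aggregates both cloud nodes.
    return CloudNode("provincial and municipal integrated cloud",
                     children=[provincial, city_a])


if __name__ == "__main__":
    root = build_example()
    for node in [root] + root.children:
        print(node.name, node.total_vcpus())
```

Each node reports the data of the systems attached to it, and a parent node adds up its children, which is one plausible way to obtain the per-node statistics and comparisons that the logical model displays.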
NOTE:

ElasticSearch is a search server that provides the capability of storing, querying, and calculating data.

Cloud Resource Monitoring

Cloud Resource Monitoring monitors cloud resource usage in real time in terms of computing, storage, network, database, and security resources. It collects monitoring indicators of each cloud resource module and detects resource module availability. Administrators can learn about the status of cloud resources, analyze the running status and health status of services, and handle alarms in a timely manner to ensure smooth running of applications.

Logical Architecture

Cloud Resource Monitoring is interconnected with the ManageOne OM plane, FusionSphere, cloud services, Alarm Monitoring, and Monitoring Configuration to obtain information about all resources and resource instances in the current database. Administrators can view resource information and status in terms of computing, storage, network, database, and security resources.

Figure 7-32 shows the logical architecture of Cloud Resource Monitoring.

Figure 7-32 Logical architecture of Cloud Resource Monitoring

Table 7-18 lists the resource types and subtypes displayed in Cloud Resource Monitoring.

Table 7-18 Information about displayed resource types

| Resource Type | Resource Subtype |
| --- | --- |
| Computing resource | Elastic Cloud Server, Bare Metal Server, and Image |
| Storage resource | Elastic Volume Service |
| Network resource | Virtual Private Cloud, Elastic IP Address, Elastic Load Balance, Bandwidth, and Virtual Private Network |
| Database resource | Relation Database and Oracle Database |
| Security resource | Virtual Firewall and Database Security Service |
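
As an illustration of how the type and subtype grouping in Table 7-18 could be used, the following sketch stores the table as a lookup structure and counts resource instances per resource type. The names `RESOURCE_SUBTYPES`, `resource_type_of`, and `count_by_type` are hypothetical and are not part of any ManageOne interface.

```python
from typing import Dict, List, Optional

# Resource types and subtypes as listed in Table 7-18.
RESOURCE_SUBTYPES: Dict[str, List[str]] = {
    "Computing resource": ["Elastic Cloud Server", "Bare Metal Server", "Image"],
    "Storage resource": ["Elastic Volume Service"],
    "Network resource": [
        "Virtual Private Cloud",
        "Elastic IP Address",
        "Elastic Load Balance",
        "Bandwidth",
        "Virtual Private Network",
    ],
    "Database resource": ["Relation Database", "Oracle Database"],
    "Security resource": ["Virtual Firewall", "Database Security Service"],
}


def resource_type_of(subtype: str) -> Optional[str]:
    """Return the resource type that a subtype belongs to, or None if unknown."""
    for resource_type, subtypes in RESOURCE_SUBTYPES.items():
        if subtype in subtypes:
            return resource_type
    return None


def count_by_type(instance_subtypes: List[str]) -> Dict[str, int]:
    """Count instances per resource type; unknown subtypes are ignored."""
    counts = {resource_type: 0 for resource_type in RESOURCE_SUBTYPES}
    for subtype in instance_subtypes:
        resource_type = resource_type_of(subtype)
        if resource_type is not None:
            counts[resource_type] += 1
    return counts
```

For example, `count_by_type(["Elastic Cloud Server", "Image", "Bandwidth"])` reports two computing resources and one network resource.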

VDC Monitoring

VDC Monitoring centrally manages VDC resources by tenant. When handling resource query requests from users or performing routine maintenance, administrators can query resources as required to help users use resources properly. Administrators can view VDC information such as resource statistics, resource details, resource associations, and resource topologies. VDC Monitoring lets administrators monitor the running status of resources in VDCs at each level and determine whether resources are normal based on resource topologies, performance indicators, and alarm information. It also helps administrators maintain VDCs and improve resource utilization.

Concepts

A virtual data center (VDC) is a new form of data center that applies cloud computing to the Internet data center (IDC). A VDC is a resource allocation unit that matches the hierarchy of enterprises and organizations. By default, the system creates a first-level VDC for each tenant. A VDC supports user management, quota management, project management, product definition, resource provisioning, and service assurance.
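
The multi-level VDC structure can be sketched as a tree in which each VDC owns some resources and may contain lower-level VDCs. This is a minimal illustration under stated assumptions: the `VDC` class, its fields, and the rollup rule (a VDC's total is its own resources plus those of all lower-level VDCs) are hypothetical and are not taken from the ManageOne data model.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class VDC:
    """One VDC in a tenant's hierarchy (illustrative model only)."""
    name: str
    level: int = 1  # the first-level VDC is created per tenant
    resources: Dict[str, int] = field(default_factory=dict)  # owned directly
    children: List["VDC"] = field(default_factory=list)

    def add_child(self, name: str,
                  resources: Optional[Dict[str, int]] = None) -> "VDC":
        """Create a lower-level VDC one level below this one."""
        child = VDC(name, self.level + 1, dict(resources or {}))
        self.children.append(child)
        return child

    def total_resources(self) -> Dict[str, int]:
        """Resource counts for this VDC plus all of its lower-level VDCs."""
        totals = Counter(self.resources)
        for child in self.children:
            totals.update(child.total_resources())
        return dict(totals)


tenant = VDC("tenant-a", resources={"ECS": 2})
dept = tenant.add_child("dept-1", {"ECS": 3, "EVS": 5})
team = dept.add_child("team-1", {"ECS": 1})
```

Calling `total_resources()` on any node gives the statistics for that level and everything below it, which is the view an administrator would need when monitoring VDCs at each level.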

Logical Architecture

VDC Monitoring obtains VDC and tenant information from the ManageOne OM plane, and interconnects with FusionSphere and cloud services to obtain resource information and centrally monitor VDC resources.

Figure 7-33 shows the logical architecture of VDC Monitoring.

Figure 7-33 Logical architecture of VDC Monitoring

Table 7-19 describes the sources of VDC Monitoring information.

Table 7-19 Sources of VDC Monitoring information

| Resource Source | Required Information |
| --- | --- |
| ManageOne OM plane | VDC and tenant information. The VDC information can be obtained from the ManageOne OM plane. Administrators need to monitor resources in VDCs at all levels. |
| FusionSphere | Virtual resource instance information |
| Cloud Services | Cloud service resource instance information. Cloud services, such as ECS, BMS, IMS, EVS, VPC, EIP, ELB, VPN, Bandwidth, Relation Database, Oracle Database, VFW, and Database Security Service (DBSS), are supported. |
| Alarms | Alarm information |
| Monitoring Configuration | Performance information |

Tenant Application Monitoring

Tenant Applications monitors accessed service resources from the perspective of applications, accurately measures the quality of services provided by the big data platform, and continuously evaluates application resource usage to detect exceptions during service running and ensure stable service running.

Logical Architecture

Figure 7-34 shows the logical architecture of Tenant Applications.

Figure 7-34 Logical architecture of Tenant Applications

Table 7-20 describes the logical architecture of Tenant Applications.

Table 7-20 Description of the logical architecture of Tenant Applications

| Category | Description |
| --- | --- |
| Stores data. | After a tenant applies for services on FusionInsight, the service data is stored on the ElasticSearch server. |
| Reports data. | The ElasticSearch server reports the usage of big data assets to Tenant Applications in a timely manner and continuously monitors the data assets of each service. |
| Provides tags. | Tag Management provides tags for big data applications so that administrators can associate tags with users on the Big Data Application Management page and use the classified tags to monitor big data assets used by tenants. |

NOTE:

ElasticSearch is a search server that provides the capability of storing, querying, and calculating data.
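
The tag-based view described in Table 7-20 amounts to grouping reported asset usage by the tags associated with each user. The sketch below shows that grouping in isolation. The record fields (`user`, `service`, `used_gb`), the tag names, and the `usage_by_tag` function are assumptions made for illustration; the actual report format is not described in this document.

```python
from collections import defaultdict
from typing import Dict, List


def usage_by_tag(usage_records: List[dict],
                 user_tags: Dict[str, List[str]]) -> Dict[str, float]:
    """Sum reported asset usage per tag.

    A record counts toward every tag associated with its user.
    Records from users without tags are skipped.
    """
    totals: Dict[str, float] = defaultdict(float)
    for record in usage_records:
        for tag in user_tags.get(record["user"], []):
            totals[tag] += record["used_gb"]
    return dict(totals)


# Hypothetical usage reports and tag associations.
records = [
    {"user": "alice", "service": "HBase", "used_gb": 10.0},
    {"user": "alice", "service": "Hive", "used_gb": 5.0},
    {"user": "bob", "service": "HBase", "used_gb": 2.5},
    {"user": "carol", "service": "Hive", "used_gb": 1.0},
]
tags = {"alice": ["analytics"], "bob": ["analytics", "billing"]}
usage = usage_by_tag(records, tags)
```

Here `carol` has no tags, so her usage does not appear in any tag total.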

Related Concepts

HBase is a column-based distributed storage system that features high reliability, performance, and scalability. HBase is suitable for storing big table data (a table containing billions of rows and millions of columns) and allows real-time data access.

LibrA is an enterprise-level relational database for large-scale parallel processing.

Hive is an open-source data warehouse built on Hadoop. It provides batch computing capability for the big data platform and is able to batch analyze and summarize structured and semi-structured data for data calculation.

Cloud Service System Monitoring

Service Monitoring monitors node and process performance metrics in real time, records the change trends of key metrics, and displays the alarm data of services running on ManageOne as well as cloud services. It displays detailed monitoring data of monitored services from multiple dimensions, such as service, node, and instance, helping administrators prevent potential risks in service running in a timely manner.

Concepts

  • Node: A node is a unit, such as a host or container, that has a certain amount of disk space and a unique IP address on the network server.
  • Instance: An instance is a monitored unit on a single node and is configured based on application scenarios and monitoring requirements. For example, a node can be associated with a monitoring template to form a monitoring instance. Each monitoring instance has multiple processes.
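
The relationship between nodes, templates, and instances can be sketched with a few small types. This is an illustrative model only: the class names, fields, and the `make_instances` helper are hypothetical, and the uniqueness check simply mirrors the statement that each node has a unique IP address.

```python
import ipaddress
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Node:
    """A host or container with disk space and a unique IP address."""
    name: str
    ip: str
    disk_gb: int


@dataclass(frozen=True)
class Template:
    """A monitoring template: a named set of metrics (hypothetical shape)."""
    name: str
    metrics: Tuple[str, ...]


@dataclass
class Instance:
    """A monitored unit on a single node: node + template + processes."""
    node: Node
    template: Template
    processes: List[str] = field(default_factory=list)


def make_instances(nodes: List[Node], template: Template,
                   processes: List[str]) -> List[Instance]:
    """Associate each node with the template, rejecting duplicate IPs."""
    seen = set()
    instances = []
    for node in nodes:
        ip = ipaddress.ip_address(node.ip)  # ValueError if malformed
        if ip in seen:
            raise ValueError(f"duplicate IP address: {node.ip}")
        seen.add(ip)
        instances.append(Instance(node, template, list(processes)))
    return instances
```

Each resulting `Instance` belongs to exactly one node and carries its own list of processes, matching the definitions above.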

Logical Architecture

Administrators can create monitoring tasks and configure monitoring metric templates for the services to be monitored. They can then view service monitoring data to learn about the alarm information of each monitored object, as well as its performance metrics and their change trends, so that they can quickly identify exceptions and take measures to keep the system running properly.

Figure 7-35 shows the logical architecture of Service Monitoring.

Figure 7-35 Logical architecture of Service Monitoring

Table 7-21 describes the logical architecture of Service Monitoring.

Table 7-21 Logical architecture of Service Monitoring

| Function | Description | Benefits |
| --- | --- | --- |
| Creating a service monitoring task | To create a service monitoring task, you need to configure the following information: basic service information, service running node, service monitoring template, and macro variable. | Administrators can create monitoring tasks for services to be monitored and configure service monitoring metric templates to monitor performance metrics of service running nodes and processes. |
| Configuring thresholds in a monitoring template | Monitoring templates are provided by the system. Administrators select a template based solely on the monitoring metrics it contains. | Administrators can set alarm thresholds in the monitoring template. |
| Viewing service monitoring information | Administrators can view the following monitoring information: summary, monitoring metric, and alarm information. | Administrators can view the summary, alarm information, and monitoring metric change trends of monitored services to determine the health status of running services, prevent risks, and improve the proactive O&M capability. |
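
The three functions in Table 7-21 can be illustrated with a small sketch: a task that holds the four configuration items, a threshold check, and a simple change-trend calculation. Everything here is an assumption for illustration, including the `MonitoringTask` fields, the "at or above the threshold" rule, and the trend definition (newest sample minus oldest). How ManageOne interprets macro variables is not described in this document, so they are kept as plain key/value pairs.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class MonitoringTask:
    """Configuration items of a service monitoring task (hypothetical shape)."""
    service_name: str                  # basic service information
    nodes: List[str]                   # service running nodes
    template: str                      # name of a system-provided template
    macro_variables: Dict[str, str] = field(default_factory=dict)
    thresholds: Dict[str, float] = field(default_factory=dict)  # metric -> limit


def check_thresholds(task: MonitoringTask,
                     samples: Dict[str, float]) -> List[Tuple[str, float, float]]:
    """Return (metric, value, limit) for each sample at or above its limit."""
    breaches = []
    for metric, limit in sorted(task.thresholds.items()):
        value = samples.get(metric)
        if value is not None and value >= limit:
            breaches.append((metric, value, limit))
    return breaches


def change_trend(values: List[float]) -> float:
    """Newest sample minus oldest sample; positive means the metric is rising."""
    if len(values) < 2:
        return 0.0
    return values[-1] - values[0]


task = MonitoringTask(
    service_name="example-service",
    nodes=["10.0.0.11", "10.0.0.12"],
    template="os-basic",
    thresholds={"cpu_usage": 80.0, "mem_usage": 90.0},
)
breaches = check_thresholds(task, {"cpu_usage": 85.0, "mem_usage": 40.0})
```

In this example only `cpu_usage` exceeds its limit, so `breaches` contains a single entry that could be turned into an alarm.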

Updated: 2019-10-23

Document ID: EDOC1100063247
