ALM-38016 TRM Unavailable
Description
This alarm is generated when a service fails to call TRM because of a network connectivity issue or a fault in the TRM service.
Attribute
Alarm ID | Alarm Severity | Alarm Type
---|---|---
38016 | Major | Communications alarm
Parameters
Parameter Name | Parameter Description
---|---
ServiceName | Indicates the name of the service that reports the alarm.
Namespace | Indicates the namespace of the service that reports the alarm.
InstanceName | Indicates the name of the service instance for which the alarm is reported.
SrvAddr | Indicates the access address of the service that reports the alarm.
Impact on the System
The service that reports the alarm cannot call TRM, and functions that depend on TRM are unavailable.
System Actions
None
Possible Causes
- The network between applications and TRM is disconnected.
- The TRM service is faulty.
Procedure
- Check the pod status of TRM.
- Use PuTTY to log in to the manage_lb1_ip node.
The default username is paas, and the default password is QAZ2wsx@123!.
- Run the following command and enter the password of the root user to switch to the root user:
su - root
Default password: QAZ2wsx@123!
- Run the following command to check whether the pod status of TRM is normal.
kubectl get pod -n fst-manage | grep trm
- Check whether the TRM process is running properly.
If the TRM process status is Running, the pod status of TRM is normal. For example:
trm-apiserver-3102815562-7b0jt 1/1 Running 0 2d
trm-apiserver-3102815562-x4snz 1/1 Running 0 2d
In the preceding information, the two values in the left-most column are the pod names corresponding to the TRM process.
- If the pod status of TRM is abnormal, perform either of the following operations:
- Wait for the health check service to restart the abnormal pods. This process takes approximately 10 minutes.
- Run the following command to manually restart the pods in which the abnormal TRM service processes run:
kubectl delete pod podname -n fst-manage
Replace podname with the pod name obtained in 1.d.
- Check whether the pod status of TRM is normal.
- Check whether this alarm is cleared.
- If yes, no further action is required.
- If no, go to 2.
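Steps 1.d through 1.f can be sketched as a small shell script that flags TRM pods whose status is not Running and prints the corresponding restart commands. This is a minimal sketch: the kubectl output below is a hypothetical sample (capture the real listing with `kubectl get pod -n fst-manage | grep trm` at your site), and the delete commands are printed rather than executed so they can be reviewed first.

```shell
#!/bin/sh
# Hypothetical sample of `kubectl get pod -n fst-manage | grep trm` output;
# at a live site, capture it with:
#   pods=$(kubectl get pod -n fst-manage | grep trm)
pods='trm-apiserver-3102815562-7b0jt   1/1   Running            0   2d
trm-apiserver-3102815562-x4snz   1/1   CrashLoopBackOff   5   2d'

# In `kubectl get pod` output, the pod name is column 1 and the status is column 3.
abnormal=$(printf '%s\n' "$pods" | awk '$3 != "Running" {print $1}')

# Print (do not run) the manual restart command for each abnormal pod.
for pod in $abnormal; do
    echo "kubectl delete pod $pod -n fst-manage"
done
```

Deleting a pod is safe here because the TRM deployment recreates it automatically; the script only prints the commands so an operator can confirm the pod names before running them.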
- Use PuTTY to log in to the manage_lb1_ip node.
- Check whether the network is connected.
- Run the following command to obtain the TRM container IP addresses.
kubectl get pod -o wide -n fst-manage | grep trm
trm-apiserver-3102815562-7b0jt 1/1 Running 0 2d 172.16.3.4 manage-cluster2-53a1198e-mfngz
trm-apiserver-3102815562-x4snz 1/1 Running 0 2d 172.16.9.6 manage-cluster1-53a1198e-czx68
In the result above, 172.16.3.4 and 172.16.9.6 are the TRM container IP addresses. The IP addresses may vary at your site.
- Run the following command to check whether network connectivity to the TRM service is available:
curl -k https://${TRM container IP address obtained in 2.a}:9763
curl -k https://172.16.3.4:9763
{"message":{"errcode":"SVCSTG.TRM.4011005","message":"There is no token in header","extend":""}}
If the result is not empty, network connectivity to the TRM service is available:
- If yes, go to 3.
- If no, contact the network administrator to rectify the fault.
- After the fault is rectified, check whether this alarm is cleared.
- If yes, no further action is required.
- If no, go to 3.
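Steps 2.a and 2.b can be sketched together: extract the container IP addresses from the wide kubectl listing, then treat any non-empty HTTPS response (even a token error) as proof that the network path is reachable. The kubectl output below is a hypothetical sample, and the live curl call is shown only as a comment so the response check can be exercised without network access.

```shell
#!/bin/sh
# Hypothetical sample of `kubectl get pod -o wide -n fst-manage | grep trm` output.
pods_wide='trm-apiserver-3102815562-7b0jt   1/1   Running   0   2d   172.16.3.4   manage-cluster2-53a1198e-mfngz
trm-apiserver-3102815562-x4snz   1/1   Running   0   2d   172.16.9.6   manage-cluster1-53a1198e-czx68'

# In `kubectl get pod -o wide` output, the container IP is column 6.
trm_ips=$(printf '%s\n' "$pods_wide" | awk '{print $6}')

# A non-empty response body, even an error message, means the path is reachable.
classify_response() {
    if [ -n "$1" ]; then echo reachable; else echo unreachable; fi
}

for ip in $trm_ips; do
    # At a live site, replace the sample below with:
    #   resp=$(curl -k -s --max-time 5 "https://${ip}:9763")
    resp='{"message":{"errcode":"SVCSTG.TRM.4011005","message":"There is no token in header","extend":""}}'
    echo "$ip: $(classify_response "$resp")"
done
```

If any address prints `unreachable`, involve the network administrator as described above before escalating to technical support.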
- Contact technical support for assistance.
Alarm Clearing
This alarm is automatically cleared when the TRM service recovers.
Related Information
None