This topic describes how to troubleshoot faults in a full liquid cooling cabinet.
It covers the following types of faults:
- The tubing is faulty.
- The water flow is faulty.
- The monitoring is faulty.
For details about parts replacement involved in troubleshooting, see Parts Replacement.
Tubing Faults
Fault
|
Symptom
|
Procedure
|
Water leakage
|
The water pressure and flow rate are reduced.
|
- Locate the leak based on the information provided by the water leakage detection sensor.
  - Check whether condensation has occurred.
  - Check whether the leak is inside a liquid-cooled compute node and whether it has spread to other compute nodes.
  - Check whether a quick connect coupling on a liquid-cooled compute node leaks water.
  - Check whether the pipe connectors of the full liquid cooling cabinet leak water.
- Have maintenance personnel rectify the leak according to its source.
  - Condensation: After the cabinet is disconnected from water and power supplies, adjust the equipment room temperature or CDU temperature setting.
  - Leakage inside a liquid-cooled compute node: Replace the compute node after it is powered off.
  - Leakage from a quick connect coupling of a liquid-cooled compute node: After the cabinet is disconnected from water and power supplies, replace the quick connect coupling of the compute node.
  - Leakage from the pipe connectors of the full liquid cooling cabinet: Power off all components, including the chassis. After the cabinet is disconnected from water and power supplies, drain the water collecting tray through the drain pipe, wipe up the remaining water, dry the water leakage detection sensors, and replace the cabinet.
- After the troubleshooting, reinstall the liquid cooling system and ensure that no impurities are introduced into it.
- Check whether the pipes still leak water.
- After ensuring that the pipes do not leak water, power on the cabinet, keep it running for more than 4 hours, and check whether the water leakage detection sensors work properly.
- Ensure that no water sensor alarm is generated on the CCU and that the CCU is running properly.
|
An overtemperature alarm about a liquid-cooled compute node is generated.
|
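The mapping from leak source to corrective action in the procedure above can be kept as a small lookup table, for example in a maintenance checklist tool. The following is a minimal sketch under stated assumptions: the names `LeakSource`, `CORRECTIVE_ACTIONS`, and `corrective_action` are hypothetical and are not part of any CCU or CDU software, and the action text is paraphrased from the Tubing Faults procedure.

```python
from enum import Enum
from typing import List


class LeakSource(Enum):
    """Leak sources distinguished by the procedure (hypothetical names)."""
    CONDENSATION = "condensation"
    COMPUTE_NODE = "inside a liquid-cooled compute node"
    QUICK_CONNECT = "quick connect coupling of a compute node"
    CABINET_PIPE_CONNECTOR = "pipe connector of the cabinet"


# Corrective actions per leak source, paraphrased from the procedure text.
CORRECTIVE_ACTIONS = {
    LeakSource.CONDENSATION: [
        "Disconnect the cabinet from water and power supplies.",
        "Adjust the equipment room temperature or CDU temperature setting.",
    ],
    LeakSource.COMPUTE_NODE: [
        "Power off the compute node.",
        "Replace the compute node.",
    ],
    LeakSource.QUICK_CONNECT: [
        "Disconnect the cabinet from water and power supplies.",
        "Replace the quick connect coupling of the compute node.",
    ],
    LeakSource.CABINET_PIPE_CONNECTOR: [
        "Power off all components, including the chassis.",
        "Disconnect the cabinet from water and power supplies.",
        "Drain the water collecting tray through the drain pipe.",
        "Wipe up the remaining water and dry the leakage detection sensors.",
        "Replace the cabinet.",
    ],
}


def corrective_action(source: LeakSource) -> List[str]:
    """Return the ordered corrective steps for the given leak source."""
    return CORRECTIVE_ACTIONS[source]


if __name__ == "__main__":
    for step in corrective_action(LeakSource.QUICK_CONNECT):
        print("-", step)
```

Whatever the leak source, the follow-up checks in the procedure still apply: confirm that the pipes no longer leak, run the cabinet for more than 4 hours, and confirm that the CCU reports no water sensor alarm.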
Water Flow Faults
Fault
|
Symptom
|
Procedure
|
Water leakage
|
The connection is loose.
|
- Tighten the connection at the leakage point if it is loose.
- Check whether the water supply device (such as the water bag) of the secondary side has sufficient water. Add water if necessary. For details, see related documents provided by the CDU vendor.
|
A component is damaged.
|
- Locate the leak based on the information provided by the water leakage detection sensor.
  - Check whether condensation has occurred.
  - Check whether the leak is inside a liquid-cooled compute node and whether it has spread to other compute nodes.
  - Check whether a quick connect coupling on a liquid-cooled compute node leaks water.
  - Check whether the pipe connectors of the full liquid cooling cabinet leak water.
- Have maintenance personnel rectify the leak according to its source.
  - Condensation: After the cabinet is disconnected from water and power supplies, adjust the equipment room temperature or CDU temperature setting.
  - Leakage inside a liquid-cooled compute node: Replace the compute node after it is powered off.
  - Leakage from a quick connect coupling of a liquid-cooled compute node: After the cabinet is disconnected from water and power supplies, replace the quick connect coupling of the compute node.
  - Leakage from the pipe connectors of the full liquid cooling cabinet: After the cabinet is disconnected from water and power supplies, drain the water collecting tray through the drain pipe, wipe up the remaining water, dry the water leakage detection sensors, and replace the cabinet.
- After the troubleshooting, reinstall the liquid cooling system and ensure that no impurities are introduced into it.
- Check whether the pipes still leak water.
- After ensuring that the pipes do not leak water, power on the cabinet, keep it running for more than 4 hours, and check whether the water leakage detection sensors work properly.
- Ensure that no water sensor alarm is generated on the CCU and that the CCU is running properly.
|
Condensation
|
The water temperature of the secondary side is too low.
|
- Check the CDU for condensation. If condensation exists, refer to related documents provided by the CDU vendor and try turning off the proportional valve of the primary side.
- Check whether the temperature and humidity sensors of the CDU are faulty.
- Lower the internal temperature or humidity of the cabinet until the condensation disappears.
- Verify that condensation does not occur when the cabinet is powered on.
|
Solenoid valve fault
|
The solenoid valve does not work.
|
Perform the following procedure and check whether alarms are cleared.
- Check whether the solenoid valve is properly connected to its cable.
- Check whether +, T, U, and - of the TCU correspond to the red, white, orange, and white wires of the solenoid valve respectively. If the connection is incorrect, power off the CCU and reconnect the wires.
- Open the side panel of the cabinet. Press the switch of the electric actuator and rotate its handle. If the handle rotates properly, try replacing the electric actuator.
- If the handle does not rotate properly, the ball of the solenoid valve is stuck. Replace the solenoid valve.
|
Quick connect coupling fault
|
The quick connect coupling collar does not slide back.
|
- If the quick connect coupling collar does not slide back, remove and re-insert the quick connect coupling several times.
- If the collar still does not slide back, re-insert the quick connect coupling. After the cabinet is disconnected from water and power supplies, replace the quick connect coupling.
|
Water leakage from a quick connect coupling
|
- If the quick connect coupling collar does not slide back, remove and re-insert the quick connect coupling several times.
- If the collar still does not slide back, re-insert the quick connect coupling. After the cabinet is disconnected from water and power supplies, replace the quick connect coupling.
- If the collar slides back but the leakage persists, the O-ring may be damaged. Re-insert the quick connect coupling. After the cabinet is disconnected from water and power supplies, replace the quick connect coupling.
|
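The solenoid valve wiring check in the table above compares each TCU terminal with an expected wire color. A minimal sketch of that comparison follows, assuming the observed colors are entered by hand during inspection. The names `EXPECTED_WIRING` and `miswired_terminals` are hypothetical, and the color mapping is copied as stated in the procedure.

```python
from typing import Dict, List

# Expected TCU terminal to solenoid valve wire color, as stated in the procedure.
EXPECTED_WIRING: Dict[str, str] = {
    "+": "red",
    "T": "white",
    "U": "orange",
    "-": "white",
}


def miswired_terminals(observed: Dict[str, str]) -> List[str]:
    """Return the TCU terminals whose observed wire color differs from the
    expected one. A terminal missing from `observed` counts as miswired."""
    return [
        terminal
        for terminal, color in EXPECTED_WIRING.items()
        if observed.get(terminal, "").strip().lower() != color
    ]


if __name__ == "__main__":
    # Example inspection result with the T and U wires swapped.
    inspection = {"+": "red", "T": "orange", "U": "white", "-": "white"}
    wrong = miswired_terminals(inspection)
    if wrong:
        print("Power off the CCU and reconnect terminals:", ", ".join(wrong))
    else:
        print("Wiring matches the expected mapping.")
```

An empty result means only that the colors match the mapping. The cable connection and the electric actuator checks in the procedure still have to be performed separately.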
Monitoring Faults
LCS Alarm
|
Description
|
Procedure
|
Leak0 Alarm: Alarm
|
Alarm reported by the first water leakage detection sensor
|
- Check whether the cabinet leaks water and whether both water leakage detection sensors report alarms.
- Check the yellow wires of the water leakage detection sensors. If the wires are wet, power off the system, rectify the leak, and wipe the water leakage detection sensors dry. Check whether the alarms are cleared after 1 minute.
- If only one water leakage detection sensor reports the alarm, the other sensor may be faulty. Try replacing the other sensor.
- If the alarm persists after the preceding troubleshooting, check the water leakage detection sensor.
- If the power indicator of the water leakage detection sensor is not lit, check whether the network cable of the sensor is loose or disconnected. Connect the network cable properly. Then check whether the alarm is cleared after 1 minute.
- Check whether the water leakage detection sensor is properly connected to the ALM-IN port on the EEU of the CCU and whether the wire connections match the silkscreens. If the wire connection is incorrect, power off the CCU, reconnect the wires, power on the CCU, and check for alarms again.
- Check the EEU of the CCU. If it is loosely inserted, install it properly and check for alarms 1 minute later.
- If the alarm persists after the preceding troubleshooting, power off the CCU and replace the water leakage detection sensor.
|
Leak1 Alarm: Alarm
|
Alarm reported by the second water leakage detection sensor
|
Valve Alarm: Fault
|
Electric valve fault alarm
|
- Check the connection between the solenoid valve and the TCU of the CCU. If the wires are disconnected or loosely connected, power off the CCU, reconnect the wires, and power on the CCU again.
- Check whether +, T, U, and - of the TCU correspond to the red, white, orange, and white wires of the solenoid valve respectively. If the connection is incorrect, power off the CCU and reconnect the wires.
- After the preceding troubleshooting, check for alarms 5 minutes later. If the alarms persist, power off the CCU and replace the valve.
|
Temperature Alarm: Temperature High
|
Overtemperature alarm
|
- Check whether the inlet water temperature of the cabinet is the same as that set by the CDU. If they are different, check whether the CDU is faulty.
- Check the upper and lower temperature thresholds and correct them if necessary.
|
Temperature Alarm: Temperature Low
|
Undertemperature alarm
|
Temperature Alarm: Fault
|
Sensor fault alarm
|
- Check the connections between the temperature sensors and the TEM0, TEM1, and TEM2 ports on the TCU of the CCU. If wires are disconnected or loosely connected, connect them properly and check for alarms 1 minute later.
- If the alarms persist after the preceding troubleshooting, power off the CCU and replace the three temperature sensors.
|
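The overtemperature and undertemperature checks in the table above come down to two comparisons: the inlet water temperature against the configured upper and lower thresholds, and the inlet water temperature against the CDU setpoint. The sketch below illustrates both under stated assumptions. The names `TempThresholds`, `classify_inlet_temperature`, and `deviates_from_cdu_setpoint`, the 1.0 degree tolerance, and all numeric values are hypothetical and are not taken from the product.

```python
from dataclasses import dataclass


@dataclass
class TempThresholds:
    """Lower and upper alarm thresholds in degrees Celsius (hypothetical)."""
    lower_c: float
    upper_c: float


def classify_inlet_temperature(inlet_c: float, thresholds: TempThresholds) -> str:
    """Return "high", "low", or "normal" for the measured inlet temperature."""
    if thresholds.lower_c >= thresholds.upper_c:
        # Misconfigured thresholds are themselves a reason to correct them.
        raise ValueError("lower threshold must be below upper threshold")
    if inlet_c > thresholds.upper_c:
        return "high"
    if inlet_c < thresholds.lower_c:
        return "low"
    return "normal"


def deviates_from_cdu_setpoint(inlet_c: float, setpoint_c: float,
                               tolerance_c: float = 1.0) -> bool:
    """Return True if the inlet temperature differs from the CDU setpoint by
    more than the tolerance (the tolerance value is an assumption)."""
    return abs(inlet_c - setpoint_c) > tolerance_c


if __name__ == "__main__":
    limits = TempThresholds(lower_c=20.0, upper_c=45.0)  # illustrative values
    print(classify_inlet_temperature(47.5, limits))
    print(deviates_from_cdu_setpoint(47.5, setpoint_c=35.0))
```

If the classification is "high" or "low" while the inlet temperature matches the CDU setpoint, the thresholds are the more likely cause. If the inlet temperature deviates from the setpoint, check the CDU first, as the procedure describes.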