Troubleshooting Interface Down Problems
An interface may go down in many situations. To check the interface status, run the display interface command and focus on the current state field. The following symptoms are possible indications of this problem:
- An interface is in error-down state.
- An interface is in down state and a prompt message is displayed.
- An interface is in down state and no prompt message is displayed.
Based on the preceding symptoms, this document analyzes the possible causes of the interface down problem and provides the corresponding troubleshooting methods and solutions.
An Interface Is in Error-Down State
When an interface is in error-down state, it cannot receive or send packets, its indicator is off, and the device generates the ERROR-DOWN_1.3.6.1.4.1.2011.5.25.257.2.1 hwErrordown alarm. To check the status of a specific interface, run the display interface command. In the command output, if the current state field displays ERROR DOWN, the interface is in error-down state. The information enclosed in parentheses () following ERROR DOWN indicates the reason why the interface enters the error-down state.
<HUAWEI> display interface 10ge 1/0/1
10GE1/0/1 current state : ERROR DOWN(link-flap) (ifindex: 53)
Line protocol current state : DOWN
Description:
Route Port,The Maximum Transmit Unit is 1500,The Maximum Frame Length is 9216
Internet protocol processing : disabled
IP Sending Frames' Format is PKTFMT_ETHNT_2, Hardware address is 04f9-388d-e682
Port Mode: AUTO, Port Split/Aggregate: -
Speed: AUTO, Loopback: NONE
Duplex: FULL, Negotiation: -
Input Flow-control: DISABLE, Output Flow-control: DISABLE
Mdi: -, Fec: -
Last physical up time : -
Last physical down time : 2019-03-24 18:28:31
Current system time: 2019-05-15 03:07:30
Statistics last cleared:never
...
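When many interfaces need to be checked, the current state line can be parsed by a script instead of being read manually. The following Python sketch is an illustration only and is not part of the device software. The function name parse_current_state and the regular expression are assumptions derived from the sample outputs in this document, so other software versions may need a different pattern.

```python
import re

# Pattern inferred from the sample outputs in this document (an assumption):
#   10GE1/0/1 current state : ERROR DOWN(link-flap) (ifindex: 53)
#   10GE1/0/1 current state : DOWN(ifindex: 53)
STATE_RE = re.compile(
    r"(?P<ifname>\S+) current state : (?P<state>[A-Za-z ]+?)\s*"
    r"(?:\((?P<reason>(?!ifindex)[^)]*)\))?\s*\(ifindex"
)


def parse_current_state(output):
    """Return (interface, state, reason) from display interface output.

    reason is None when no prompt message follows the state.
    Returns None when no current state line is found.
    """
    match = STATE_RE.search(output)
    if match is None:
        return None
    return (
        match.group("ifname"),
        match.group("state").strip(),
        match.group("reason"),
    )
```

For the sample output above, this sketch yields the interface name 10GE1/0/1, the state ERROR DOWN, and the reason link-flap.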
Causes and Recovery Measures for Interface Error-Down Events
Causes for Interface Error-Down Events
The error-down mechanism is a protection mechanism provided by CloudEngine series data center switches and involves multiple features including interface, stacking, super virtual fabric (SVF), and security. If any of these features is configured on an interface of such a device and the device detects an exception on the interface or interface-related service, the device shuts down the interface and sets the interface state to error-down to prevent the exception from affecting the entire network.
Table 1-1 describes the causes for interface error-down events on CloudEngine series data center switches.
Feature | Error-Down Event Type | Cause |
---|---|---|
Reliability | auto-defend | When an interface receives packets from attack sources, the interface is automatically shut down. |
Reliability | portsec-reachedlimit | When the number of MAC addresses learned by an interface reaches the limit, the interface goes down. |
Reliability | monitor-link | When the uplink interface in a Monitor Link group is down or all interfaces in an upstream Smart Link group are down, the associated downlink interfaces go down. |
Reliability | storm-control | When storm control is configured on an interface, the interface is shut down if the average rate at which the interface receives broadcast packets, multicast packets, or unknown unicast packets is larger than the upper threshold in each of the three consecutive intervals for detecting storms. |
Ethernet | bpdu-protection | When bridge protocol data unit (BPDU) protection is enabled on the device, an edge interface is shut down when it receives configuration BPDUs. |
Ethernet | m-lag-peer-error | When M-LAG is used for dual-homing to an Ethernet, VXLAN network, or IP network and the peer-link fails but the heartbeat status is normal, all physical interfaces except the Ethernet management interface, peer-link interface, and stack interface on the M-LAG backup device enter the error-down state. When the peer-link recovers, by default, M-LAG interfaces in error-down state automatically go up after 4 minutes, while other physical interfaces go up immediately. |
Ethernet | m-lag-consistency-check-error | When M-LAG consistency check is enabled and the strict mode is specified, if type 1 configurations of two devices in the M-LAG are inconsistent, M-LAG member interfaces on the M-LAG backup device enter the error-down state and the alarm indicating type 1 configuration inconsistency is generated on the device. |
Ethernet | mac-address-flapping | When the MAC address learned by an interface flaps, the interface goes down. |
Ethernet | loopback-detect | When loopback is detected on an interface (the interface that sends a loopback detection packet receives the packet), the device takes the configured action on the interface. Shutting down the interface is one of the configurable actions. |
Interface | link-flap | When a link flaps, the associated interface goes physically down. |
Interface | crc-statistics | When the number of CRC error packets received by an interface exceeds the threshold, the interface is shut down. |
Interface | fabric-link-failure | When traffic exceptions occur due to faults on the Serdes links between LPUs and SFUs, interfaces on the LPUs are shut down (applicable only to the CE12800 series switches). |
Interface | forward-engine-buffer-failed | If a large number of outgoing packets are discarded on an interface due to an interface buffer exception, the interface is shut down. |
Interface | forward-engine-interface-failed | When the system detects that a fault occurs in the forwarding engine and an interface is unstable, the interface is shut down because packet loss may occur or traffic may fail to be forwarded. |
Interface | transceiver-power-low | When the optical power of an interface falls below the default lower threshold, the interface is shut down. |
Stacking | no-stack-link | If there is no forwarding link between two stack members, all service ports of one member device go down. |
Stacking | resource-mismatch | When the resource mode or interface split configuration of a stack member that is starting is inconsistent with that of the master switch, all service interfaces except stack member interfaces on the stack member go down. Resource modes include the low-latency network mode, Eth-Trunk quantity, tunnel mode, and EEDB resource mode. |
Stacking | stack-config-conflict | During the setup of a stack, if other stack members have the stack configuration that conflicts with the master switch, the stack may fail to be set up, and all interfaces of these stack members (excluding the master switch) go down. |
Stacking | dual-active-fault-event | |
Stacking | stack-member-exceed-limit | When the number of member switches exceeds the limit, interfaces of excess switches go down. |
Stacking | stack-packet-defensive | A stack member interface receives a large number of stack protocol packets or stack error packets in a short time. |
SVF | spine-type-unsupported | The CE6850-48T4Q-EI, CE6850-48T6Q-HI, or CE6855-48T6Q-HI cannot function as a parent switch to join an SVF system. When a CE6850-48T4Q-EI, CE6850-48T6Q-HI, or CE6855-48T6Q-HI attempts to join the stack set up by an SVF parent switch, the interfaces of the CE6850-48T4Q-EI, CE6850-48T6Q-HI, or CE6855-48T6Q-HI are shut down. |
SVF | stack-member-exceed-limit | During the setup of an SVF system of fixed switches, if the number of parent switches is greater than 2 due to incorrect configurations or cable connections, excess devices cannot join the SVF system and interfaces of these excess devices are shut down. |
SVF | leaf-mstp | In an SVF system, the downlink interfaces of leaf switches that receive BPDUs are shut down. |
Interface Error-Down Detection Mechanism
If interfaces on a device enter the error-down state, the device has detected an exception. When does the device detect an exception?
- After the device is started properly, exception detection starts, for example, link flapping detection.
- After basic functions of a feature are configured, the system detects exceptions related to the feature, for example, stack-related exceptions: resource-mismatch and stack-config-conflict.
- After you configure an independent exception detection function or sub-function, the system starts to detect exceptions, such as BPDU protection and MAC address flapping detection.
After a device detects an exception on an interface and sets the interface to the error-down state, the interface can be recovered from the error-down state automatically or manually.
- Before recovering the interface from the error-down state, you need to rectify the service fault based on the cause of the interface error-down event. This prevents the interface from entering the error-down state again.
- Deleting the configuration that triggered the error-down event does not recover the interface from the error-down state.
- For details about possible causes of an interface error-down event and corresponding recovery methods, see CloudEngine Series Switches Error-Down Mechanism.
Automatic Recovery from the Error-Down State
The automatic recovery function enables an interface in error-down state to automatically return to the up state after a specified delay. This function must be configured before the device detects an exception.
This function takes effect simultaneously on all interfaces that enter the error-down state due to the same cause, which makes it more efficient than manual recovery.
- The automatic recovery function does not take effect on interfaces that are already in error-down state. It takes effect only on interfaces that enter the error-down state after it is configured. Therefore, you are advised to configure the automatic recovery function when configuring services.
- Rectifying a service fault after an interface enters the error-down state takes time. Therefore, you are advised to set a long recovery delay, for example, 1 hour.
Procedure
- Run the system-view command to enter the system view.
- Run the error-down auto-recovery cause { auto-defend | bpdu-protection | crc-statistics | dual-active | fabric-link-failure | forward-engine-buffer-failed | forward-engine-interface-failed | link-flap | loopback-detect | m-lag | m-lag-consistency-check | mac-address-flapping | no-stack-link | portsec-reachedlimit | stack-config-conflict | stack-member-exceed-limit | stack-packet-defensive | storm-control | transceiver-power-low } interval interval-value command to set the delay after which an interface in error-down state automatically goes up.
Different device models support different parameters. To obtain all the parameters supported by a device, enter the error-down auto-recovery cause command followed by a question mark (?) in the system view.
- Run the display error-down recovery command to check information about interfaces in error-down state, including the interface name, cause of the error-down event, recovery delay, and remaining time for the up event.
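The procedure above can be sketched as a small helper that assembles the command sequence for one cause. This is a hedged illustration, not a Huawei tool: the function name auto_recovery_commands is hypothetical, the cause list is copied from the command syntax above, and the sketch assumes that interval-value is expressed in seconds (so that a delay of 1 hour is 3600). Confirm the supported causes and the value range on the device by entering a question mark (?) after the command.

```python
# Causes copied from the command syntax in this document.
# Actual support differs by device model.
AUTO_RECOVERY_CAUSES = {
    "auto-defend", "bpdu-protection", "crc-statistics", "dual-active",
    "fabric-link-failure", "forward-engine-buffer-failed",
    "forward-engine-interface-failed", "link-flap", "loopback-detect",
    "m-lag", "m-lag-consistency-check", "mac-address-flapping",
    "no-stack-link", "portsec-reachedlimit", "stack-config-conflict",
    "stack-member-exceed-limit", "stack-packet-defensive",
    "storm-control", "transceiver-power-low",
}


def auto_recovery_commands(cause, interval_value):
    """Build the command sequence that configures automatic recovery.

    interval_value is assumed to be in seconds (an assumption, not
    stated in this document).
    """
    if cause not in AUTO_RECOVERY_CAUSES:
        raise ValueError(f"unknown error-down cause: {cause}")
    if not isinstance(interval_value, int) or interval_value <= 0:
        raise ValueError("interval_value must be a positive integer")
    return [
        "system-view",
        f"error-down auto-recovery cause {cause} interval {interval_value}",
        "display error-down recovery",
    ]
```

For example, auto_recovery_commands("link-flap", 3600) produces the three commands needed to set a one-hour delay for link-flap events under the stated assumption about the unit.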
Manual Recovery from the Error-Down State
To manually recover interfaces from the error-down state, you need to run commands on the interfaces one by one, which is time-consuming and error-prone. Use manual recovery when the automatic recovery function was not configured during service configuration. You can manually recover an interface from the error-down state using either of the following methods:
- Run the shutdown and undo shutdown commands in sequence in the interface view to restart the interface.
- Run the restart command in the interface view to restart the interface.
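Because manual recovery has to be repeated for each interface, generating the command sequence with a script can reduce typing errors. The following Python sketch is illustrative only. The function name manual_recovery_commands is hypothetical, and the sketch assumes that the interface view is entered with system-view followed by interface and the interface type and number, which this document does not show explicitly.

```python
def manual_recovery_commands(interface, use_restart=False):
    """Build the commands that manually recover one interface.

    interface: interface type and number, for example "10ge 1/0/1".
    use_restart: use the restart command instead of
                 shutdown followed by undo shutdown.
    """
    # Entering the interface view this way is an assumption.
    commands = ["system-view", f"interface {interface}"]
    if use_restart:
        commands.append("restart")
    else:
        commands.extend(["shutdown", "undo shutdown"])
    return commands


def manual_recovery_plan(interfaces, use_restart=False):
    """Return one command list per interface, processed one by one."""
    return {name: manual_recovery_commands(name, use_restart) for name in interfaces}
```

Rectify the fault that caused the error-down event before applying the generated commands. Otherwise, the interface may enter the error-down state again.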
Example: What If an Interface Is in ERROR DOWN(link-flap) State?
Fault Symptom
If the indicator of a connected interface on CloudEngine series switches is off, the interface is not up. Run the display interface command on the device to check the interface status.
<HUAWEI> display interface 10GE 1/0/1
10GE1/0/1 current state : ERROR DOWN(link-flap) (ifindex: 5)
Line protocol current state : DOWN
---- More ----
According to the current state field, the physical status of the interface is ERROR DOWN(link-flap), indicating that the interface cannot work properly because the link flaps.
Possible Causes
If the interface status displays ERROR DOWN(link-flap), the interface has link flapping protection enabled and has frequently alternated between up and down states. By default, if an interface alternates between up and down states five times within 10 seconds, the device shuts down the interface and records the interface status as ERROR DOWN(link-flap). If the device detects frequent up/down state changes on an interface, the device shuts down the interface to prevent frequent network topology changes, or triggers an active/standby link switchover if there is a standby link to prevent service interruptions.
By default, link flapping protection is enabled on CloudEngine series switches in V100R002C00 with the patch V100R002SPH006 loaded as well as in V100R003C00 and later versions.
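The default rule described above (five state alternations within 10 seconds) behaves like a sliding-window counter. The following Python sketch models that idea so that the effect of different threshold and interval values can be explored offline. It is not the device implementation: the class name LinkFlapDetector is hypothetical, each call is assumed to represent one up/down alternation, the window boundary is assumed to be inclusive, and a threshold of 0 is assumed to disable detection.

```python
from collections import deque


class LinkFlapDetector:
    """Sliding-window model of link flapping protection (illustrative)."""

    def __init__(self, threshold=5, interval=10.0):
        self.threshold = threshold    # flaps that trigger error-down
        self.interval = interval      # window length in seconds
        self.events = deque()         # timestamps of recent flaps
        self.error_down = False

    def record_flap(self, timestamp):
        """Record one flap and return True if the interface is error-down."""
        if self.threshold == 0:       # assumption: 0 disables detection
            return False
        self.events.append(timestamp)
        # Drop flaps that are older than the detection interval.
        while timestamp - self.events[0] > self.interval:
            self.events.popleft()
        if len(self.events) >= self.threshold:
            self.error_down = True
        return self.error_down
```

With the default values, five flaps at 0, 2, 4, 6, and 8 seconds set the interface to error-down on the fifth flap, while flaps that are 5 seconds apart never accumulate five events inside one 10-second window.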
Troubleshooting Procedure
If an interface is in ERROR DOWN(link-flap) state, you are advised to perform the following operations:
- Check historical alarms. If many up/down alarms were generated for the interface over a long period, the interface status is unstable. You are advised to check whether the optical modules and optical fibers on the local and remote interfaces are normal.
- Recover the interface. By default, an interface in error-down state cannot automatically recover. You need to run the shutdown and undo shutdown commands to manually recover the interface. In V200R002C50 and later versions, you can remove and reinstall the transmission medium on an optical interface to manually recover the interface. If an interface is currently not in error-down state, run the error-down auto-recovery cause link-flap interval interval-value command to configure the interface to automatically go up when it is in ERROR DOWN(link-flap) state.
- Disable link flapping detection on the interface. To disable link flapping detection on a specific interface, enter the interface view first.
  - In V100R002C00 and earlier versions, run the undo port link-flap trigger error-down command to disable link flapping detection.
  - In V100R002C00 with the patch V100R002SPH006 loaded as well as in V100R003C00 and later versions, run the port link-flap threshold 0 command to disable link flapping detection.
  If link flapping detection is disabled on an interface, the system does not detect faults caused by link flapping on the interface in real time. Exercise caution when deciding to perform this operation.
- Adjust the link flapping detection threshold. If the network adapter of a server frequently experiences an intermittent disconnection during server startup, the device may incorrectly determine the interface status. In this case, you can run the port link-flap { interval interval-value threshold threshold-value | interval interval-value | threshold threshold-value } command to increase the link flapping detection threshold.
An Interface Is in Down State and a Prompt Message Is Displayed
When an interface is in down state, its indicator is off and it cannot send or receive packets. To check the status of a specific interface, run the display interface command. In the command output, if the current state field displays DOWN, the interface is in down state. The information enclosed in parentheses () following DOWN indicates the reason why the interface enters the down state.
<HUAWEI> display interface 10ge 1/0/1
10GE1/0/1 current state : DOWN(Transceiver type mismatch) (ifindex: 53)
Line protocol current state : DOWN
Description:
Route Port,The Maximum Transmit Unit is 1500,The Maximum Frame Length is 9216
Internet protocol processing : disabled
...
Causes and Recovery Measures for Interface Down Events
Table 1-2 describes the causes and recovery measures for interface down events.
Interface Down Event | Current State Display | Cause | Recovery Measures |
---|---|---|---|
An interface is physically down. | DOWN(Transceiver speed mismatch) | The rate of the optical module does not match the interface rate. | Run the speed command to manually adjust the interface rate or replace the optical module with one that matches the interface. |
An interface is physically down. | DOWN(Transceiver type mismatch) | Cause 1 (general cause): The optical module type does not match the interface. Cause 2 (special cause): When the QSFP+ to QSFP+ high-speed cable or QSFP28 to QSFP28 high-speed cable on the CE6875EI is not used as a stack cable or an M-LAG peer-link interface cable, the device reports the transceiver type mismatch. | Measure 1: Manually adjust the interface rate or replace the optical module. Measure 2: Replace the cable with one that matches the interface. |
An interface is physically down. | DOWN(The optical power is too low) | The optical power is low. | Replace the optical module with one that matches the interface. |
An interface is physically down. | DOWN(Transceiver loose) | The optical module is not properly installed. | Remove and reinstall the optical module. |
An interface is physically down. | DOWN(Negotiation unsupported) | Auto-negotiation is not supported. | Run the negotiation disable command to disable the auto-negotiation function. |
An interface is physically down. | DOWN(Port mode mismatch) | The interface mode does not match. | Run the port mode command to change the interface mode. |
An interface is physically down. | DOWN(Cable for stack or peer-link interface only) | When QSFP+ to QSFP+ or QSFP28 to QSFP28 high-speed cables are used, only stack ports or peer-link ports can be up. | Replace the cable with one that matches the interface. |
An interface is physically down. | DOWN(Fast-up configuration mismatch) | The fast up function configuration does not match. | Adjust the fast up function configuration. |
An interface is physically down. | DOWN(Trunk error down) | The Eth-Trunk interface to which the interface belongs is in error-down state. | Recover Eth-Trunk member interfaces from the error-down state. For details, see An Interface Is in Error-Down State. |
An interface is physically down. | DOWN(Port unavailable) | The interface is unavailable. | Reduce the interface rate because this fault is typically caused by a rate increase on a variable-rate interface. |
Link faults occur. | TRIGGER DOWN(1AG auto recover) | CFM detects link faults. | Run the shutdown command on the interface, and 7 seconds later, run the undo shutdown command on the interface. |
The interface is forced administratively down. | Administratively DOWN | A network administrator has run the shutdown command on the interface. | The network administrator needs to run the undo shutdown command on the interface. |
Traffic is down. | Flow DOWN | Traffic status of the interface is down. | This status is determined by the status of the management Virtual Router Redundancy Protocol (mVRRP) bound to the interface. If the mVRRP status is Backup or Initialize, the traffic status of the interface is down. After the mVRRP recovers, the traffic status of the interface automatically goes up. |
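When the prompt message is collected by a script, Table 1-2 can serve as a lookup table that maps the text in parentheses to a recovery measure. The following Python sketch shows one possible encoding of the physically down rows. The names RECOVERY_MEASURES and suggest_recovery are hypothetical, and the measures are abbreviated from the table, so refer to the table itself for the full wording.

```python
# Keys are the prompt messages in parentheses after DOWN, in lowercase.
# Values are abbreviated from Table 1-2 in this document.
RECOVERY_MEASURES = {
    "transceiver speed mismatch":
        "Run the speed command to adjust the interface rate, or replace the optical module.",
    "transceiver type mismatch":
        "Adjust the interface rate, or replace the optical module or cable.",
    "the optical power is too low":
        "Replace the optical module.",
    "transceiver loose":
        "Remove and reinstall the optical module.",
    "negotiation unsupported":
        "Run the negotiation disable command.",
    "port mode mismatch":
        "Run the port mode command to change the interface mode.",
    "cable for stack or peer-link interface only":
        "Replace the cable with one that matches the interface.",
    "fast-up configuration mismatch":
        "Adjust the fast up function configuration.",
    "trunk error down":
        "Recover the Eth-Trunk member interfaces from the error-down state.",
    "port unavailable":
        "Reduce the interface rate.",
}


def suggest_recovery(reason):
    """Return the recovery measure for a prompt message, or None if unknown."""
    return RECOVERY_MEASURES.get(reason.strip().lower())
```

A result of None means that the prompt message is not covered by this sketch, in which case the table and the product documentation remain the reference.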
Example: What If an Interface Is in DOWN(Transceiver type mismatch) State?
Fault Symptom
If the indicator of a connected interface on CloudEngine series switches is off, the interface is not up. Run the display interface command on the device to check the interface status.
<HUAWEI> display interface 10ge 1/0/5
10GE1/0/5 current state : DOWN(Transceiver type mismatch) (ifindex: 198)
Line protocol current state : DOWN
---- More ----
The current state field shows that the physical status of the interface is DOWN(Transceiver type mismatch), indicating that the optical module type does not match the interface. As a result, the interface cannot work properly.
Possible Causes
When an interface is in DOWN(Transceiver type mismatch) state, the possible causes are:
- The optical module type does not match the interface configuration.
  - If the optical module or card for the interface is replaced after the current configurations take effect and the new optical module or card does not support the original configurations, the interface may change to the DOWN(Transceiver type mismatch) state. The interface goes up after the configurations that are not supported by the current module are deleted or the current module is replaced with a module that supports the original configurations.
  - If an optical module with an MPO connector is installed on an optical interface that is not split or an optical module with an LC connector is installed on an optical interface that is split, the interface is in DOWN(Transceiver type mismatch) state after the interface split command is run.
  - If the training disable command is configured on two 40GE or 100GE optical interfaces that have 40GE copper cables connected and a 40GE optical module is installed on one interface, this interface changes to the DOWN(Transceiver type mismatch) state. The interface goes up after the undo training disable command is run.
- The transmission medium is incorrectly used.
  - When a QSFP+ to QSFP+ high-speed cable or QSFP28 to QSFP28 high-speed cable is not used as a stack cable or M-LAG peer-link interface cable, the CE6875EI reports the transceiver type mismatch.
  - On the CE6863, CE6863K, and CE6881E, when a 25GE interface works at the rate of 25 Gbit/s, 1 m, 3 m, and 5 m SFP28 high-speed cables can be connected to the interface in V200R020C00 and later versions. When a non-1 m copper cable is connected to the interface, the Reed-Solomon Forward Error Correction (RS-FEC) function must be enabled on the interface. Otherwise, the interface is in DOWN(Transceiver type mismatch) state.
- The optical module type does not match the quad synchronous adapter (QSA).
  On the CE6855HI, CE6856HI, and CE7855EI, a QSA adapter can be installed on a 40GE interface that has the interface split function configured. Installing a transmission medium whose rate is 10 Gbit/s on the QSA adapter enables the 40GE interface to function as a 10GE interface. Only the first 10GE interface that is split from the 40GE interface works, while the other three 10GE interfaces are unavailable. If a QSA adapter is installed on an interface that does not have the interface split function configured, or a transmission medium whose rate is not 10 Gbit/s is installed on the QSA adapter on an interface that has the interface split function configured, the interface enters the DOWN(Transceiver type mismatch) state.
Troubleshooting Procedure
- Check whether the transmission medium is incorrectly used. If the transmission medium does not match the interface, replace the optical module, optical fiber/cable, or copper cable to ensure that the transmission medium is correctly used.
- Check whether the interface configuration matches the optical module type. If not, modify the interface configuration to rectify the interface fault.
- If a QSA adapter is used on the interface, check whether the interface split function is correctly configured and whether the matching transmission medium is used. If not, modify the interface split configuration or replace the medium to rectify the interface fault.
If the fault persists, see "Interface Troubleshooting" in Troubleshooting - Hardware.
An Interface Is in Down State and No Prompt Message Is Displayed
When an interface is in down state, its indicator is off and it cannot send or receive packets. To check the status of a specific interface, run the display interface command. In the command output, if the current state field displays DOWN, the interface is in down state.
<HUAWEI> display interface 10ge 1/0/1
10GE1/0/1 current state : DOWN(ifindex: 53)
Line protocol current state : DOWN
Description:
Route Port,The Maximum Transmit Unit is 1500,The Maximum Frame Length is 9216
Internet protocol processing : disabled
...
Checking Whether the Prerequisites for Bringing an Interface to Up Are Met
If an interface is down and no prompt message is displayed, check whether the prerequisites for bringing the interface up are met. Whether an interface is enabled is determined by a combination of multiple actions, including the [ undo ] shutdown command. An interface is enabled only when all the conditions are met. To check diagnostic information about whether interfaces are enabled, run the display system internal device port command.
# Check diagnostic information about 10GE1/0/1.
<HUAWEI> display system internal device port 10ge 1/0/1
Port create related check:
--------------------------------------------------------------------------------
Item LogicCfg PhyCfg Picm IsPass
--------------------------------------------------------------------------------
board module 0x3000001f 0x3000001f N/A YES
board device 0x14000207 0x14000207 N/A YES
lfe device 0x80000000 0x80000000 N/A YES
pic module 0x50000020 0x50000020 N/A YES
pic device 0x43000006 0x43000006 N/A YES
panelport 1 0x50000004 0x50000004 N/A YES
media type 1 -- -- N/A NO
port device 0x6000002f 0x6000002f N/A YES
--------------------------------------------------------------------------------
Port enable related check:
--------------------------------------------------------------------------------
DevType AttrName AttrValue ExpectValue IsPass
--------------------------------------------------------------------------------
board isFastUpgrade 0 == 0 YES
PhyLpuBrd isIssuUpgrade 0 == 0 YES
PhyLfe Status 0x10001 != 0 YES
port isAvailable 0x1 == 1 YES
port isshut 0x1 == 1 YES
port portlfeisup 0x1 == 1 YES
port portissuup 0 == 0 YES
port triggerShut 0x1 == 1 YES
port port12x100gDown 0 == 0 YES
port phyportisshut 0x1 == 1 YES
--------------------------------------------------------------------------------
Port physical related check:
--------------------------------------------------------------------------------
Link Enable Speed Negotiation Loopback
--------------------------------------------------------------------------------
DOWN DISABLE 100000 DISABLE PHY
--------------------------------------------------------------------------------
In the preceding command output, if any value of the IsPass field in Port enable related check is not YES, the interface does not go up.
Check Item | Description |
---|---|
isFastUpgrade: fast stack upgrade state | If the IsPass field for this item does not display YES, the interface finds that the current system is in fast stack upgrade state and will not go up. Check whether a fast stack upgrade is in progress. |
isIssuUpgrade: ISSU upgrade | If the IsPass field for this item does not display YES, the interface finds that the current system is in ISSU upgrade state and will not go up. |
Status: synchronization finish flag of the LFE component | If the IsPass field for this item does not display YES, the interface finds that the current system is still in synchronization state and will not go up. You can run the display fei frame boot state slot 1 component feisw command to check the startup status of the FEISW component. |
isAvailable: interface offline state | If the IsPass field for this item does not display YES, the interface is offline. |
isshut: whether the shutdown command is run on the interface | Check whether the shutdown command is run on the interface. |
portlfeisup: whether the forwarding engine instance (FEI) to which the interface belongs is up | If the IsPass field for this item does not display YES, the FEI to which the interface belongs is not up, that is, the Status attribute of PhyLfe is invalid. |
portissuup: ISSU state | Check whether the system is in ISSU upgrade state. |
triggerShut: whether the interface is triggered to go down | Check whether the related configuration triggers the shutdown of the interface. |
port12x100gDown: whether an unregistered SFUC is installed on the CE12800 that has a 12*100G card installed | If the IsPass field for this item does not display YES, 12*100G cards are installed on the CE12800 but SFUCs are not registered. Check whether there are registered SFUCs. |
phyportisshut: The value of this attribute is determined by the preceding checks. If all the preceding checks succeed but this check fails, an internal logic error occurs. | Collect log information, and contact Huawei technical support. For details, see Collecting Information and Seeking Technical Support. |
For details about the possible causes and solutions of the failed check items, see Troubleshooting Guide or collect logs and contact Huawei technical support.
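Scanning the IsPass column by eye is error-prone when many interfaces are checked. The following Python sketch extracts the names of the failed items from the Port enable related check section. It is an illustration only: the function name failed_enable_checks is hypothetical, and the parsing assumes the section headings and whitespace-separated column layout shown in the sample output in this document.

```python
def failed_enable_checks(output):
    """Return AttrName values whose IsPass is not YES in the
    'Port enable related check' section of the diagnostic output."""
    failed = []
    in_section = False
    for raw in output.splitlines():
        line = raw.strip()
        # Section headings end with "related check:".
        if line.endswith("related check:"):
            in_section = line.startswith("Port enable")
            continue
        # Skip other sections, blank lines, and separator lines.
        if not in_section or not line or set(line) == {"-"}:
            continue
        fields = line.split()
        # Skip the column header and malformed rows.
        if len(fields) < 2 or fields[0] == "DevType":
            continue
        if fields[-1] != "YES":
            failed.append(fields[1])
    return failed
```

An empty list means that every item in the section passed. Any returned name can be looked up in the check item table above.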
Checking the Interface Transmission Medium Status
If the interface is enabled but remains in down state, check the transmission medium status of the interface to determine whether the fault is caused by hardware or environment factors.
- For an Ethernet optical interface
- Check whether the optical module is properly installed.
Run the display interface transceiver command. If no command output is displayed or the command output is abnormal, the optical module is not properly installed. In this case, remove and reinstall the module.
To check in a batch whether the optical modules on a device are properly installed, run the display device slot <slot-id> command.
- Check whether the optical module matches.
Run the display interface interface-type interface-number transceiver verbose command to check optical module information.
<HUAWEI> display interface 10ge 1/0/1 transceiver verbose
10GE1/0/1 transceiver information:
-------------------------------------------------------------------
Common information:
Transceiver Type :1000BASE_SX //Indicates the optical module type.
Connector Type :LC //Indicates the type of the optical fiber connector.
Wavelength (nm) :850 //Indicates the optical wavelength.
Transfer Distance (m) :150(62.5um/125um OM1) 300(50um/125um OM2) 400(50um/125um OM4)
Digital Diagnostic Monitoring :YES
Vendor Name :SumitomoElectric
Vendor Part Number :SCP6F86-GL-CWH
Ordering Name :
-------------------------------------------------------------------
Manufacture information:
Manu. Serial Number :7YK056C08623
Manufacturing Date :2007-11-13
Vendor Name :SumitomoElectric
-------------------------------------------------------------------
Alarm information:
Non-Huawei-Ethernet-Switch-Certified Transceiver
LOS alarm
-------------------------------------------------------------------
View the Transceiver Type field to check whether the local optical module type matches the remote optical module type. For example, if a GE optical module is installed on the remote interface and a 10GE optical module is installed on the local interface, the interfaces will not go up. You can replace the optical module on the local or remote interface based on actual requirements to ensure that local and remote optical modules have the same rate.
View the Manu. Serial Number field to check whether the serial number of the copper cable connected to the local interface is the same as that connected to the remote interface. If the serial numbers are inconsistent, you can replace the copper cable connected to the local or remote interface based on actual requirements to ensure that the serial numbers of the copper cables at both ends are consistent.
View the Transfer Distance field to check the transmission distance of the optical module, and determine whether the optical fiber length is within the allowed transmission distance range of the optical module based on the optical fiber type. For example, the transmission distance supported by OM1 optical fibers in the preceding command output is 150 m. If the actual transmission distance exceeds 150 m, use an optical fiber with a longer transmission distance.
View the Alarm information field to check whether the optical module is a non-Huawei-certified optical module. If the alarm message "Non-Huawei-Certified Transceiver" or "Non-Huawei-Ethernet-Switch-Certified Transceiver" is displayed, the optical module is a non-Huawei-certified optical module. Replace it with a Huawei-certified optical module.
If Alarm information displays LOS alarm, the local optical module does not receive optical signals. The remote interface, optical fiber, or optical module may be abnormal. Run the display this command in the interface view at both ends to check whether the interfaces are shut down. If so, run the undo shutdown command. If not, check whether the optical modules and optical fibers at both ends are normal.
- Check whether the receive optical power and transmit optical power are normal.
Run the display interface interface-type interface-number transceiver verbose command to check whether the receive optical power and transmit optical power are within the normal range.
<HUAWEI> display interface 10ge 1/0/1 transceiver verbose
......
-------------------------------------------------------------------
Diagnostic information:
  Temperature (Celsius) :33.68
  Voltage (V) :3.29
  Bias Current (mA) :7.97
  Bias High Threshold (mA) :13.20
  Bias Low Threshold (mA) :4.00
  Current RX Power (dBm) :-2.15
  Default RX Power High Threshold (dBm) :1.00
  Default RX Power Low Threshold (dBm) :-11.90
  Current TX Power (dBm) :-2.07
  Default TX Power High Threshold (dBm) :1.00
  Default TX Power Low Threshold (dBm) :-9.30
-------------------------------------------------------------------
If the receive optical power is low (Current RX Power has a smaller value than Default RX Power Low Threshold), the transmit signal strength of the remote optical module is poor. As a result, the local interface may go down or discard packets after it goes up. Check whether the distance between the local and remote devices exceeds the maximum transmission distance of the remote optical module. If not, check whether the optical module and optical fiber on the remote interface are damaged. If they are damaged, replace them.
If the receive optical power is high (Current RX Power has a larger value than Default RX Power High Threshold), the transmit signal strength of the remote optical module is too high. The possible reason is that the distance between the local and remote devices is short but a long-distance optical module is used. In this case, install an optical attenuator on the remote optical module to reduce the transmit power.
If the transmit optical power is low (Current TX Power has a smaller value than Default TX Power Low Threshold), the transmit signal strength of the local optical module is poor or the local optical module is faulty. As a result, the local interface may go down or discard packets after it goes up. Replace the local optical module or contact Huawei technical support.
If the transmit optical power is high (Current TX Power has a larger value than Default TX Power High Threshold), the transmit signal strength of the local optical module is too high. This may cause a continuous high receive power on the remote optical module and the remote optical module may be burnt. The possible cause is that the local optical module is faulty. You are advised to replace the local optical module.
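The four threshold comparisons above can be sketched as follows. This is a minimal example that assumes the transceiver output has been saved as text with the field labels shown in this section; the helper names read_power and classify_power are hypothetical.

```python
import re


def read_power(output: str, direction: str) -> tuple:
    """Return (current, low_threshold, high_threshold) in dBm for 'RX' or 'TX'."""
    def field(label: str) -> float:
        # The unit is matched case-insensitively because the sample outputs
        # in this guide show both "dBm" and "dBM".
        match = re.search(rf"{label} \(dBm\)\s*:\s*(-?\d+(?:\.\d+)?)",
                          output, re.IGNORECASE)
        if match is None:
            raise ValueError(f"field not found: {label}")
        return float(match.group(1))

    return (
        field(f"Current {direction} Power"),
        field(f"Default {direction} Power Low Threshold"),
        field(f"Default {direction} Power High Threshold"),
    )


def classify_power(current: float, low: float, high: float) -> str:
    """Classify a power reading against its low and high thresholds."""
    if current < low:
        return "low"
    if current > high:
        return "high"
    return "normal"
```

For the sample output above, classify_power(*read_power(text, "RX")) evaluates to "normal", because -2.15 dBm lies between -11.90 dBm and 1.00 dBm.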
- Check whether the optical module matches the optical fiber.
- Check the optical module type.
Generally, an optical module has a label attached, identifying the rate, center wavelength, and mode (single-mode or multimode) of the optical module, as shown in Figure 1.
To obtain more information about an optical module, install it on an interface of a matching type, and then run the display interface transceiver command to check the optical module information.
- Check whether the optical module matches the optical fiber.
A single-mode optical module (typically with a center wavelength of 1310 nm or 1550 nm) must be used with single-mode optical fibers (typically yellow).
A multimode optical module (typically with a center wavelength of 850 nm) must be used with multimode optical fibers (typically orange).
- DWDM optical module: If a DWDM optical module is used, run the wavelength-channel channel-number command to manually configure the wave channel number corresponding to the center wavelength of the DWDM optical module and check whether the wavelengths of the optical modules on the interfaces at both ends are the same.
The center wavelength of a DWDM optical module can be configured only on a 10GE or 25GE interface that has a 10GE DWDM optical module installed or has a 10GE DWDM optical module pre-configured.
- Perform an external loopback test on the optical module.
Use an optical fiber to connect the Tx and Rx ends of the optical module to perform an external loopback test on the optical module (use an optical attenuator for a long-distance optical module). If the interface indicator is steady on and the interface can go up, the local interface and optical module are normal. Otherwise, the local interface or optical module may be abnormal. In this case, repeat the test with a known-good optical module or interface to identify which one is faulty.
Avoid loops when performing the external loopback test. The optical module will work abnormally if the receive optical power is too high. Therefore, avoid high receive optical power on the optical module. In most cases, a short-distance optical module and a multimode optical fiber are used in an external loopback test. In addition, run a command to check the receive optical power and ensure that the receive optical power is lower than the upper threshold.
- Perform a replacement test.
Perform a replacement test for the optical module and optical fiber link, and locate the failure point based on the test result. You can replace the local interface, local optical module, optical fiber link (including the optical fiber, optical distribution frame, fiber splice point, optical splitter, and intermediate wavelength division multiplexing (WDM) transmission device), remote optical module, and remote interface.
If an intermediate WDM transmission device exists, bypass or replace the intermediate device. If the local and remote interfaces go up normally, the intermediate device is faulty.
Replace the optical fiber and optical distribution frame. If the local and remote interfaces go up normally, the optical fiber link is faulty.
If the local and remote interfaces go up normally after the local optical module is replaced, the local optical module is faulty.
If the local and remote interfaces go up normally after the local interface is replaced, the local interface is faulty.
If the local and remote interfaces go up normally after the remote optical module is replaced, the remote optical module is faulty.
If the local and remote interfaces go up normally after the remote interface is replaced, the remote interface is faulty.
Replace the optical module and optical fiber preferentially. If a WDM or other transmission device exists on the optical fiber link, remove the intermediate device to directly connect the local and remote devices or replace the intermediate device. Then replace the interfaces and devices to check whether the fault occurs on an interface or a device.
- Check whether the optical module is properly installed.
- For an Ethernet electrical interface
- Check the cable status. Run the virtual-cable-test command in the interface view to check the cable status. If the cable status is displayed as Open or Short rather than OK, replace the cable.
- Check whether the local interface is connected to the correct remote interface.
Checking Whether the Interface Configuration Is Correct
If the transmission medium status of the interface is normal, check whether the fault is caused by the interface configuration. The following factors will affect the state of an interface:
- Negotiation status
The negotiation status of two connected interfaces must be the same. Otherwise, the interfaces may go down. In this case, ensure that the negotiation status of the two interfaces is the same.
Run the display interface interface-type interface-number command on both devices to check whether the negotiation status of the two connected interfaces is the same.
<HUAWEI> display interface 10ge 1/0/1
10GE1/0/1 current state : DOWN (ifindex: 52)
Line protocol current state : DOWN
Description:
Switch Port, PVID : 1, TPID : 8100(Hex), The Maximum Frame Length is 9216
Internet protocol processing : disabled
IP Sending Frames' Format is PKTFMT_ETHNT_2, Hardware address is e468-a357-cbc1
Port Mode: COMMON COPPER, Port Split/Aggregate: DISABLE
Speed: 1000, Loopback: NONE
Duplex: FULL, Negotiation: ENABLE
Input Flow-control: DISABLE, Output Flow-control: DISABLE
Mdi: AUTO, Fec: NONE
Last physical up time : -
...
After an SFP-GE copper or optical module is installed on a GE or 10GE optical interface, or a high-speed cable is connected to a 40GE optical interface, the interface works in auto-negotiation mode by default and can be configured to work in non-auto-negotiation mode using the negotiation disable command. If you replace the module after the non-auto-negotiation mode takes effect and the new module does not support the negotiation disable command configuration, the interface goes down. For example, if an SFP-FE optical module is installed on a GE optical interface, an SFP+ optical module is installed on a 10GE optical interface, or a QSFP+ optical module is installed on a 40GE optical interface, the interface goes down. You can run the undo negotiation disable command in the interface view to delete the original non-auto-negotiation mode configuration to make the interface go up again.
A 40GE optical interface works in auto-negotiation mode by default when it has a high-speed cable connected. If the interface on the peer device does not support auto-negotiation, run the negotiation disable command to configure the local 40GE interface to work in non-auto-negotiation mode.
When a 40GE interface on the CE-L36LQ-EG card is connected to an interface on some types of network adapters (such as Mellanox network adapters) using a passive QSFP+ high-speed cable, the two interfaces may fail to go up due to different 40GBASE-CR4 negotiation mechanisms used on the interfaces. In this case, disable the auto-negotiation function on the 40GE interface so that the two interfaces can go up and communicate properly.
# Disable auto-negotiation on 10GE1/0/1.
<HUAWEI> system-view
[~HUAWEI] interface 10ge 1/0/1
[~HUAWEI-10GE1/0/1] negotiation disable
[*HUAWEI-10GE1/0/1] commit
CloudEngine series switches provide the following methods to check the default auto-negotiation status, interface rate, and whether attributes such as auto-negotiation and flow control auto-negotiation can be configured, based on the interface type and transmission medium:
- Use the Interface Query Tool (https://info.support.huawei.com/network/ptmngsys/Web/DC/en/interface_query.html).
- See "Which Ethernet Interfaces of CE Switches Support Auto-Negotiation?" in FAQs - Interface Management - Ethernet Interface in Troubleshooting Guide.
- Interface rate
If the rates of two connected interfaces are different, negotiation between the two interfaces fails. In this case, set the interface rate to ensure that the rates of the two interfaces are the same. If the rate displays auto, the lower-layer link is not established. Check other possible causes or connect the interface to other interfaces to locate the fault.
- FEC status
If the FEC status of two connected interfaces is different, the interfaces cannot go up. To check the FEC status of interfaces, run the display interface command. If the FEC status of the two interfaces is different, disable the auto-negotiation function and run the fec mode command to configure the FEC mode.
- Duplex mode
An Ethernet interface works in either half-duplex or full-duplex mode. Interfaces on both ends must use the same duplex mode. Otherwise, the interfaces cannot go up.
Only 10GE electrical interfaces on the CE6850-48T4Q-EI in V100R003C10 or later versions support the duplex mode configuration when the interface rate is 100 Mbit/s. All Ethernet interfaces on other CloudEngine series switches work in full-duplex mode and do not support the half-duplex mode.
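The negotiation, rate, FEC, and duplex checks described above all compare the same fields on the two ends of a link. The following sketch assumes that the display interface output of both interfaces has been saved as text; the helper names link_attributes and mismatches are hypothetical, and the field labels are taken from the sample output in this section.

```python
import re

# Fields of the display interface output that must agree on both ends.
FIELDS = ("Speed", "Duplex", "Negotiation", "Fec")


def link_attributes(display_output: str) -> dict:
    """Extract the link attributes as strings, for example {"Speed": "1000"}."""
    attrs = {}
    for name in FIELDS:
        # Values end at a comma or whitespace, as in "Speed: 1000, Loopback: NONE".
        match = re.search(rf"\b{name}\s*:\s*([^,\s]+)", display_output)
        if match:
            attrs[name] = match.group(1)
    return attrs


def mismatches(local_output: str, remote_output: str) -> dict:
    """Return {field: (local_value, remote_value)} for every field that differs."""
    local = link_attributes(local_output)
    remote = link_attributes(remote_output)
    return {name: (local[name], remote[name])
            for name in FIELDS
            if name in local and name in remote and local[name] != remote[name]}
```

An empty dictionary means only that the listed fields agree. A value of AUTO or a hyphen (-) on either end still needs to be interpreted manually, as described above.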
- Interface split configuration
If interface split is not correctly configured on the interface or an inappropriate cable is used after interface split, the interface may go down. When a 40GE interface that is not split is connected to four remote 10GE interfaces over a one-to-four cable, the 40GE interface cannot go up and its indicator is off, whereas the remote 10GE interfaces may be up and their indicators are on but they cannot work properly. You need to correctly configure interface split or replace the cable used after interface split with an appropriate one.
For details about whether interfaces on a device support interface split, interface split types, and cables used after interface split, see Interface Split Query Tool.
- Fault signal detection and filtering
The fault signal filtering function prevents frequent transitions between up and down due to reasons such as link signal jitter, improving link stability. After the fault signal filtering function is enabled on an interface, the interface may go up and the remote interface may go down for a short period of time. If two connected devices on both ends support the fault signal filtering function, you are advised to enable this function on the two devices. If one end supports the function but the other end does not, you are advised to disable the function.
For more details about the fault signal filtering function, see "When Two CE Switches Are Connected, Why Does the Interface on One End Go Up and the Interface on the Other End Go Down Temporarily?" in FAQs - Interface Management - Ethernet Interface in Troubleshooting Guide.
- Loopback detection
If the loopback detection function is configured on the remote interface, the local interface may go down. Disable the loopback detection function on the remote interface.
- Before testing some special functions such as locating an Ethernet fault, enable loopback detection on the desired Ethernet interface to check whether the interface is working properly.
The internal loopback detection function affects other functions and may prevent the interface or link from working properly. When the test is complete, run the undo loopback command to disable loopback detection. The original configuration is restored after loopback detection is disabled.
You are not advised to configure the auto-negotiation, flow control, flow control auto-negotiation, interface rate, FEC, and EEE of electrical interfaces together with the loopback detection function on the interface. Otherwise, the configuration may not take effect.
- Unidirectional single-fiber communication
A single fiber means that two optical modules are connected by only one fiber, and unidirectional communication means that packets can be sent in only one direction from the sender to the receiver.
An optical module provides a TX end and an RX end. The TX and RX ends of one optical module are respectively connected to the RX and TX ends of another module. A device transmits and receives packets through two independent fibers. If the unidirectional single-fiber communication function is disabled, two devices cannot communicate with each other through a single fiber. After the single-fiber enable command is run on interfaces, the devices can communicate with each other unidirectionally.
After the unidirectional single-fiber communication function is configured on an interface, the interface will be in down state if no optical module is inserted, or if a single-fiber bidirectional optical module or high-speed cable is inserted into the interface. In this case, replace the optical module (use an optical module that is not a single-fiber bidirectional optical module) to rectify the fault. If no optical module is inserted into the interface, run the device transceiver fiber command to pre-configure the transmission medium type of the interface as fiber.
For details about the types of single-fiber bidirectional optical modules, see "Do the CloudEngine Series Switches Support BIDI Optical Modules?" in FAQs - Interface Management - Optical Module in Troubleshooting Guide. Optical modules except single-fiber bidirectional optical modules support unidirectional single-fiber connections.
- Energy saving protocols
- Automatic laser shutdown (ALS): can be configured for optical interfaces to control the pulse of an optical module's laser. If a fiber is not properly installed on an optical interface or an optical fiber link fails, ALS shuts down the laser to save energy and prevent eye damage.
- Energy Efficient Ethernet (EEE): is supported only on electrical interfaces (except the Ethernet management interface). After EEE is configured on an interface, the system adjusts the power supply of the interface when the interface is idle. The interface then enters the low power mode (sleeping mode), which reduces the total power consumption of the system. When the interface starts to transmit data, its power supply is restored to the normal state.
- Interface dormancy: can be configured only on GE electrical interfaces (except the Ethernet management interface). After interface dormancy is configured on an interface, the interface is automatically shut down to save power when it is idle.
If one or more of the preceding energy saving protocols are configured on an interface, check whether the configuration is correct.
Example: Why Does an Interface Not Go Up After a Supported GE Optical Module Is Installed?
Fault Symptom
A 10GE optical interface supports GE optical modules. However, after a GE optical module is installed on the interface, the physical status of the interface is not up. The interface status is as follows:
<HUAWEI> display interface 10ge 1/0/1
10GE1/0/1 current state : DOWN (ifindex: 9)
Line protocol current state : DOWN
Description:
Switch Port, PVID : 1, TPID : 8100(Hex), The Maximum Frame Length is 9216
Internet protocol processing : disabled
IP Sending Frames' Format is PKTFMT_ETHNT_2, Hardware address is 0025-9e01-0204
Port Mode: COMMON FIBER, Port Split/Aggregate: DISABLE
Speed : 10000, Loopback: NONE
Duplex: FULL, Negotiation: -
Input Flow-control: DISABLE, Output Flow-control: DISABLE
Mdi: -, Fec: -
Last physical up time : -
Last physical down time : 2020-04-10 03:12:43
Current system time: 2020-04-13 10:42:27
......
Possible Causes
- The prerequisites for bringing the interface up are not met.
- The optical module is not securely installed or the optical fiber fails.
- The interface does not support the optical module model.
- The transmission mode and length of the optical fiber are improper.
- Negotiation between the local and remote interfaces fails.
- The optical module fails.
Troubleshooting Procedure
- Ensure that the interface is not shut down forcibly and is not in the process of a fast stack upgrade or smooth upgrade.
Run the display system internal device port command to view internal diagnostic information of 10GE1/0/1.
<HUAWEI> display system internal device port 10ge 1/0/1
Port create related check:
--------------------------------------------------------------------------------
Item            LogicCfg      PhyCfg        Picm    IsPass
--------------------------------------------------------------------------------
board module    0x3000001f    0x3000001f    N/A     YES
board device    0x14000207    0x14000207    N/A     YES
lfe device      0x80000000    0x80000000    N/A     YES
pic module      0x50000020    0x50000020    N/A     YES
pic device      0x43000006    0x43000006    N/A     YES
panelport 1     0x50000004    0x50000004    N/A     YES
media type 1    --            --            N/A     NO
port device     0x6000002f    0x6000002f    N/A     YES
--------------------------------------------------------------------------------
Port enable related check:
--------------------------------------------------------------------------------
DevType      AttrName           AttrValue    ExpectValue    IsPass
--------------------------------------------------------------------------------
board        isFastUpgrade      0            == 0           YES
PhyLpuBrd    isIssuUpgrade      0            == 0           YES
PhyLfe       Status             0x10001      != 0           YES
port         isAvailable        0x1          == 1           YES
port         isshut             0x1          == 1           YES
port         portlfeisup        0x1          == 1           YES
port         portissuup         0            == 0           YES
port         triggerShut        0x1          == 1           YES
port         port12x100gDown    0            == 0           YES
port         phyportisshut      0x1          == 1           YES
--------------------------------------------------------------------------------
Port physical related check:
--------------------------------------------------------------------------------
Link    Enable     Speed     Negotiation    Loopback
--------------------------------------------------------------------------------
DOWN    DISABLE    100000    DISABLE        PHY
--------------------------------------------------------------------------------
In the preceding command output, if any value of the IsPass field in Port enable related check is not YES, the interface does not go up.
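Scanning the Port enable related check table for rows that did not pass can be sketched as follows. The sketch assumes that the command output has been saved as text and that each row has the five columns shown above; the helper name failed_enable_checks is hypothetical.

```python
import re

# One table row: DevType, AttrName, AttrValue, comparison operator,
# expected value, and the IsPass result.
ROW = re.compile(r"(?<!\S)(\S+)\s+(\S+)\s+(\S+)\s+(==|!=)\s+(\S+)\s+(YES|NO)(?!\S)")


def failed_enable_checks(output: str) -> list:
    """Return (DevType, AttrName, AttrValue) for rows whose IsPass is not YES."""
    _, found, rest = output.partition("Port enable related check:")
    if not found:
        return []
    # Ignore everything from the next table onward.
    section = rest.partition("Port physical related check:")[0]
    return [(dev, attr, value)
            for dev, attr, value, _op, _expected, is_pass in ROW.findall(section)
            if is_pass != "YES"]
```

For the sample output above, the function returns an empty list because every row in the Port enable related check table shows YES. Rows in the other two tables are intentionally not evaluated.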
- Run the display interface [ interface-type interface-number ] transceiver verbose command to check whether the optical module information is normal.
- If alarm information is displayed, securely install the optical module, replace the optical fiber, and run the restart command to restart the interface.
-------------------------------------------------------------------
Alarm information:
  LOS Alarm
-------------------------------------------------------------------
- If no alarm information is displayed, check whether the optical power of the optical module is within the allowed range. That is, the optical power of the optical module must meet the following requirements: Default RX Power Low Threshold < Current RX Power < Default RX Power High Threshold and Default TX Power Low Threshold < Current TX Power < Default TX Power High Threshold.
-------------------------------------------------------------------
Diagnostic information:
  Temperature (°C) :34.77
  Voltage (V) :3.29
  Bias Current (mA) :7.19
  Bias High Threshold (mA) :10.50
  Bias Low Threshold (mA) :2.50
  Current RX Power (dBM) :-2.19
  Default RX Power High Threshold (dBM) :3.01
  Default RX Power Low Threshold (dBM) :-15.02
  Current TX Power (dBM) :-2.57
  Default TX Power High Threshold (dBM) :3.01
  Default TX Power Low Threshold (dBM) :-9.00
-------------------------------------------------------------------
Ensure that the local and remote optical modules have the same rate (10 Gbit/s or 1 Gbit/s) and wavelength. The remote optical module must transmit optical signals properly.
Table 1-4 describes the rate matching between 10GE/GE optical interfaces.
Run the display interface transceiver verbose command in the user view, system view, or interface view to check detailed optical module information and verify that the wavelengths of the local and remote optical modules are the same.
<HUAWEI> display interface transceiver verbose
10GE1/0/1 transceiver information:
-------------------------------------------------------------------
Common information:
  Transceiver Type :10GBASE_Passive_Copper_Cable
  Connector Type :-
  Wavelength (nm) :850
  Transfer Distance (m) :1(Copper)
  Digital Diagnostic Monitoring :NO
  Vendor Name :TIME
  Vendor Part Number :D09181-4A
  Ordering Name :
-------------------------------------------------------------------
Manufacture information:
  Manu. Serial Number :D132810062
  Manufacturing Date :2013-10-08
  Vendor Name :TIME
-------------------------------------------------------------------
Alarm information:
-------------------------------------------------------------------
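Comparing the wavelengths reported at both ends can be sketched as follows. The sketch assumes that the transceiver output of each device has been saved as text with the field labels shown above; the helper names transceiver_summary and wavelengths_match are hypothetical.

```python
import re


def transceiver_summary(output: str) -> dict:
    """Extract the transceiver type and wavelength (as strings) when present."""
    patterns = {
        "type": r"Transceiver Type\s*:\s*(\S+)",
        "wavelength_nm": r"Wavelength \(nm\)\s*:\s*(\d+)",
    }
    summary = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, output)
        if match:
            summary[key] = match.group(1)
    return summary


def wavelengths_match(local_output: str, remote_output: str) -> bool:
    """Return True if both outputs report the same wavelength."""
    local = transceiver_summary(local_output).get("wavelength_nm")
    remote = transceiver_summary(remote_output).get("wavelength_nm")
    if local is None or remote is None:
        raise ValueError("wavelength not found in one of the outputs")
    return local == remote
```

A ValueError is raised instead of a guess when either output lacks a numeric Wavelength field, so a missing value is never reported as a match.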
Ensure that the local and remote optical modules have the same optical fiber transmission mode (single-mode or multimode) and use optical fibers of proper length.
Optical signals of different wavelengths can travel different distances. Transmission distances of optical modules are affected by attenuation and dispersion of optical signals during transmission. Generally, a distance of less than 2 km is considered a short transmission distance, a distance of 10 km to 20 km is considered a medium transmission distance, and a distance beyond 20 km is considered a long transmission distance. The optical modules used on CE series switches support a transmission distance of up to 80 km.
The commonly used wavelengths of optical modules are as follows:
850 nm: Optical modules with this wavelength are multimode optical modules. They are mainly used for short-distance transmission.
1310 nm: Most optical modules with this wavelength are single-mode optical modules, and some are multimode optical modules. They are mainly used for medium- and long-distance transmission.
1550 nm: Optical modules with this wavelength are single-mode optical modules. They are mainly used for long-distance transmission.
- Generally, a single-mode optical fiber is yellow, and a multimode optical fiber is orange.
- Generally, the pull tab of a multimode optical module is black and that of a single-mode optical module is blue. You can also view the label attached to an optical module to check whether it is a single-mode or multimode optical module. SM and MM indicate single-mode and multimode, respectively.
When using optical modules and optical fibers, pay attention to the following points:
Multimode optical modules must be used with multimode optical fibers. Single-mode optical modules are generally used with single-mode optical fibers, and can also be used with multimode optical fibers. If a single-mode optical module is used with a single-mode optical fiber, the transmission distance is often longer than 10 km.
If the transmission distance is long, use an optical attenuator to prevent excessively high optical power.
If the transmission distance is short, do not use long optical fibers.
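The pairing rules in this step can be written down as a small lookup. This is an illustrative sketch only: the table contents repeat the typical values given in this step, the names and verdict strings are hypothetical, and the label on the optical module remains the authoritative source.

```python
# Typical values quoted in this step; real modules and fibers can deviate
# (for example, some 1310 nm modules are multimode), so treat these as hints.
TYPICAL_MODE_BY_WAVELENGTH_NM = {850: "multimode", 1310: "single-mode", 1550: "single-mode"}
TYPICAL_MODE_BY_FIBER_COLOR = {"yellow": "single-mode", "orange": "multimode"}

_MODES = {"single-mode", "multimode"}


def pairing_verdict(module_mode: str, fiber_mode: str) -> str:
    """Judge a module/fiber pairing according to the points listed in this step."""
    if module_mode not in _MODES or fiber_mode not in _MODES:
        raise ValueError("mode must be 'single-mode' or 'multimode'")
    if module_mode == fiber_mode:
        return "ok"
    if module_mode == "multimode":
        # Multimode optical modules must be used with multimode optical fibers.
        return "mismatch"
    # Single-mode module on multimode fiber: permitted here, but not the usual pairing.
    return "allowed, not typical"
```

For example, pairing_verdict(TYPICAL_MODE_BY_WAVELENGTH_NM[850], TYPICAL_MODE_BY_FIBER_COLOR["yellow"]) evaluates to "mismatch".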
- Generally, when a GE optical module or GE copper module is installed on a 10GE interface, auto-negotiation can be configured on the interface. However, 10GE/25GE optical interfaces on some switch models do not support auto-negotiation after GE optical modules are installed. You can use either of the following methods to check whether a 10GE/25GE optical interface supports auto-negotiation and constraints on using the interface. If a GE optical module or GE copper module is used, ensure that the local and remote optical interfaces have the same negotiation mode and rate.
CloudEngine series switches provide the following methods to check the default auto-negotiation status, interface rate, and whether attributes such as auto-negotiation and flow control auto-negotiation can be configured, based on the interface type and transmission medium:
- Use the Interface Query Tool (https://info.support.huawei.com/network/ptmngsys/Web/DC/en/interface_query.html).
- See "Which Ethernet Interfaces of CE Switches Support Auto-Negotiation?" in FAQs - Interface Management - Ethernet Interface in Troubleshooting Guide.
- Use an optical fiber to connect the receive and transmit ends of the same optical module to perform a loopback test and check whether the interface can go up. If the interface goes up, the local interface and optical module are normal, and the original optical fiber is likely faulty. Replace the optical fiber. If the interface cannot go up, the interface may be faulty. Contact technical support personnel.
Collecting Information and Seeking Technical Support
If the fault persists after the preceding operations are performed, collect information and seek technical support.
- Collect operation results of the preceding steps and record the results in a file.
- Collect all diagnostic information and export the information to a file.
- Run the display diagnostic-information file-name command in the user view to collect diagnostic information and save the information to a file.
<HUAWEI> display diagnostic-information dia-info.txt
Now saving the diagnostic information to the device
100%
Info: The diagnostic information was saved to the device successfully.
The text file is stored in the flash:/ directory by default. You can run the dir command in the user view to check whether the file is generated.
- After diagnostic information is saved to a file, you can export it from the device through SFTP or SCP. For details, see Local File Management.
You can run the display diagnostic-information command to display diagnostic information and save terminal logs in a diagnostic file on a disk. For details, see Diagnostic File Obtaining Guide.
- Collect the log and alarm information on the device and export the information to files.
- Run the following commands to save the log and alarm information in the buffer to a file.
<HUAWEI> save logfile //Collect common user logs.
<HUAWEI> system-view
[~HUAWEI] diagnose
[~HUAWEI-diagnose] save logfile diagnose-log //Collect diagnostic logs.
[~HUAWEI-diagnose] collect diagnostic information //Collect diagnostic information of the operating system.
- When log files are generated, you can export the files from the device using SFTP or SCP. For details, see Local File Management.
You can also run the display logbuffer and display trapbuffer commands to check the log and alarm information on the device, and save terminal logs in a diagnostic file on a disk. For details, see Diagnostic File Obtaining Guide.
- Seek technical support. Visit https://e.huawei.com/en/how-to-buy/contact-us to seek technical support.
Technical support personnel will provide instructions for you to submit all the collected information and files, so that they can locate faults.