Remote Fault and OPU2_CSF alarms on OSN1800

Publication Date: 2018-03-20
Issue Description

The LDX unit of the OptiX OSN 1800V at Site 1 reports REMOTE_FAULT and OPU2_CSF alarms. Oddly, the peer device at Site 2, which is connected to a third-party device, reports no alarm at all. (See the topology in the picture below.)

Alarm Information

 

Handling Process

1. Check the meaning of the alarms:

The OPU2_CSF alarm indicates a client signal fail. It is generated when the client-side signal at the remote end fails. This alarm is usually accompanied by R_LOS and PORT_MODULE_OFFLINE alarms on the client side of the remote end.

The REMOTE_FAULT alarm indicates a fault at the remote end. It is usually accompanied by a LOCAL_FAULT alarm at the peer station.

2. Because the peer station, Site 2, did not report any alarm, we asked the customer to check whether any of these alarms was suppressed there. Result: no alarm was suppressed.

3. Check the history alarms of Site 2: there is no alarm in the same period as the OPU2_CSF and REMOTE_FAULT alarms of Site 1.

4. Collect DC logs for both sites

Logs analysis:

The log analysis shows that, for every client-side signal failure recorded in the logs, an R_LOS alarm was raised at the same time.

Analysis of the Site_1 NE logs at 21:34 on February 26th:

248807     REMOTE_FAULT   MN         end         2018-02-26    21:34:16+08:00      2018-02-26    21:34:23+08:00      SA     SDH_PATH        board=5,subcard=255,port=5,path=1;;

248806     OPU2_CSF       MN         end         2018-02-26    21:34:16+08:00      2018-02-26    21:34:23+08:00      SA     ODU2_PATH       board=5,subcard=255,port=5,path=0x1;;


The peer NE (Site_2) reported an R_LOS alarm on port 5 of board 2 at the same time:

261426     R_LOS      CR         end         2018-02-26    21:34:16+08:00      2018-02-26    21:34:22+08:00      SA     LASER_GROUP        board=2,subcard=255,port=5,group=1;;


Several other correlated alarm records:

1.
Site_1
248768     OPU2_CSF       MN     end     2018-02-06 09:13:04+08:00     2018-02-06 13:09:00+08:00     SA     ODU2_PATH       board=5,subcard=255,port=6,path=0x1;;
248769     REMOTE_FAULT   MN     end     2018-02-06 09:13:04+08:00     2018-02-06 13:09:00+08:00     SA     SDH_PATH        board=5,subcard=255,port=6,path=1;;
Site_2
261416     R_LOS          CR     end     2018-02-06 09:13:04+08:00     2018-02-06 13:08:59+08:00     SA     LASER_GROUP     board=2,subcard=255,port=6,group=1;;

2.
Site_1
248770     OPU2_CSF       MN     end     2018-02-07 00:52:16+08:00     2018-02-07 06:08:37+08:00     SA     ODU2_PATH       board=5,subcard=255,port=6,path=0x1;;
248771     REMOTE_FAULT   MN     end     2018-02-07 00:52:16+08:00     2018-02-07 06:08:37+08:00     SA     SDH_PATH        board=5,subcard=255,port=6,path=1;;
Site_2
261417     R_LOS          CR     end     2018-02-07 00:52:17+08:00     2018-02-07 06:08:37+08:00     SA     LASER_GROUP     board=2,subcard=255,port=6,group=1;;

3.
Site_1
248797     OPU2_CSF       MN     end     2018-02-12 17:41:02+08:00     2018-02-12 22:40:43+08:00     SA     ODU2_PATH       board=5,subcard=255,port=5,path=0x1;;
248798     REMOTE_FAULT   MN     end     2018-02-12 17:41:02+08:00     2018-02-12 22:40:43+08:00     SA     SDH_PATH        board=5,subcard=255,port=5,path=1;;
Site_2
261423     R_LOS          CR     end     2018-02-12 17:41:02+08:00     2018-02-12 22:40:42+08:00     SA     LASER_GROUP     board=2,subcard=255,port=5,group=1;;

4.
Site_1
248806     OPU2_CSF       MN     end     2018-02-26 21:34:16+08:00     2018-02-26 21:34:23+08:00     SA     ODU2_PATH       board=5,subcard=255,port=5,path=0x1;;
248807     REMOTE_FAULT   MN     end     2018-02-26 21:34:16+08:00     2018-02-26 21:34:23+08:00     SA     SDH_PATH        board=5,subcard=255,port=5,path=1;;
Site_2
261426     R_LOS          CR     end     2018-02-26 21:34:16+08:00     2018-02-26 21:34:22+08:00     SA     LASER_GROUP     board=2,subcard=255,port=5,group=1;;
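The manual comparison above could also be scripted. The following is a minimal Python sketch, not a vendor tool. It assumes the whitespace-separated field order seen in the excerpts (ID, alarm name, severity, state, start date, start time, end date, end time, flag, object type, location). It also assumes that matching alarms carry the same port number on both NEs, as they do in the four cases listed. The function names `parse_alarm` and `correlate` and the 2-second tolerance are illustrative choices.

```python
from datetime import datetime, timedelta

# Records copied from cases 2 and 4 above, with runs of spaces collapsed.
SITE_1 = [
    "248770 OPU2_CSF MN end 2018-02-07 00:52:16+08:00 2018-02-07 06:08:37+08:00 SA ODU2_PATH board=5,subcard=255,port=6,path=0x1;;",
    "248806 OPU2_CSF MN end 2018-02-26 21:34:16+08:00 2018-02-26 21:34:23+08:00 SA ODU2_PATH board=5,subcard=255,port=5,path=0x1;;",
]
SITE_2 = [
    "261417 R_LOS CR end 2018-02-07 00:52:17+08:00 2018-02-07 06:08:37+08:00 SA LASER_GROUP board=2,subcard=255,port=6,group=1;;",
    "261426 R_LOS CR end 2018-02-26 21:34:16+08:00 2018-02-26 21:34:22+08:00 SA LASER_GROUP board=2,subcard=255,port=5,group=1;;",
]


def parse_alarm(line):
    """Split one alarm record into a dict (field order is an assumption)."""
    f = line.split()
    # Location looks like "board=5,subcard=255,port=5,path=0x1;;".
    location = dict(kv.split("=") for kv in f[10].strip(";").split(","))
    return {
        "id": f[0],
        "name": f[1],
        "start": datetime.fromisoformat(f"{f[4]}T{f[5]}"),
        "end": datetime.fromisoformat(f"{f[6]}T{f[7]}"),
        "port": int(location["port"]),
    }


def correlate(local_alarms, peer_alarms, tolerance=timedelta(seconds=2)):
    """Pair each local alarm with a peer alarm on the same port that
    started within `tolerance`; the peer ID is None when nothing matches."""
    pairs = []
    for alarm in local_alarms:
        match = next(
            (p for p in peer_alarms
             if p["port"] == alarm["port"]
             and abs(p["start"] - alarm["start"]) <= tolerance),
            None,
        )
        pairs.append((alarm["id"], match["id"] if match else None))
    return pairs


if __name__ == "__main__":
    local = [parse_alarm(line) for line in SITE_1]
    peer = [parse_alarm(line) for line in SITE_2]
    for local_id, peer_id in correlate(local, peer):
        print(local_id, "->", peer_id or "no peer alarm found")
```

A local alarm paired with `None` would flag an occurrence that has no matching R_LOS record in the peer log and therefore needs a separate explanation.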



       



Root Cause

Based on the log analysis, an abnormal client-side signal is the cause of the reported OPU2_CSF alarm, and the Site_1 NE also receives a REMOTE_FAULT indication from the peer device. Something is wrong on the transmitting side; a fiber or an optical module may have failed.

However, on 2018-02-28 at 21:05, when the OPU2_CSF and REMOTE_FAULT alarms occurred on Site_1, no alarms were reported on Site_2. (See the alarm picture in the Alarm Information section.)

       

Solution

When we query the alarms of the remote NE for that time, no alarm is found, and the interrupt log has already been flushed. (Alarm logs are stored in memory in a container of limited capacity; because there are many logs in this case, entries that no longer fit in the container are deleted.)
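The flushing behaviour described above resembles a bounded buffer that discards its oldest entries. The Python sketch below only illustrates that idea. The capacity, the event names, and the helper `surviving_entries` are hypothetical and do not describe the actual log implementation on the NE.

```python
from collections import deque


def surviving_entries(events, capacity):
    """Return the entries still held after all events are logged in order."""
    log = deque(maxlen=capacity)  # when full, the oldest entry is dropped
    for event in events:
        log.append(event)
    return list(log)


# Hypothetical sequence: one early entry followed by several later ones.
events = ["short alarm"] + [f"later event {i}" for i in range(1, 4)]
remaining = surviving_entries(events, capacity=3)
print(remaining)
```

With a capacity of 3, the early entry is pushed out by the three later ones, so a query made afterwards no longer finds it.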

There are two possible explanations:

1. An abnormal alarm was received on the tributary side of the remote NE, but it lasted only a short time and was automatically filtered.

2. At that moment, the CPU was busy or the board alarm task was preempted by other higher-priority tasks. By the time the alarm query task was called, the tributary-side alarm had already cleared.
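Both explanations depend on timing. The following Python sketch shows how a short defect could go unreported in each case. It is a simplified model under assumed numbers: the hold-off time, the polling instants, and the function names are hypothetical, and times are plain integers in milliseconds.

```python
def passes_holdoff(raised_at, cleared_at, holdoff):
    """Explanation 1: report only if the defect persists for at least `holdoff`."""
    return (cleared_at - raised_at) >= holdoff


def seen_by_polling(raised_at, cleared_at, poll_times):
    """Explanation 2: report only if some query runs while the defect is present."""
    return any(raised_at <= t < cleared_at for t in poll_times)


# Hypothetical defect lasting 400 ms.
raised, cleared = 10_000, 10_400

# A 1000 ms hold-off filter would drop it.
print(passes_holdoff(raised, cleared, holdoff=1_000))

# Regular polling every 250 ms would catch it.
print(seen_by_polling(raised, cleared, poll_times=[9_750, 10_000, 10_250, 10_500]))

# If the polls at 10 000 and 10 250 are delayed by a busy CPU, it is missed.
print(seen_by_polling(raised, cleared, poll_times=[9_750, 10_500]))
```

In this model the peer would not log the alarm in either failing case, even though the far end still sees the resulting OPU2_CSF and REMOTE_FAULT.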

For now, based on the analysis above, the first step is to investigate on site, at the physical level, why port 5 of board 2 on the Site_2 NE has reported R_LOS so many times.



