U2000 VERITAS Cannot Start and Monitoring Is Lost

Publication Date: 2012-07-25
Issue Description
Users cannot log in to the U2000 system.
VERITAS cannot be started, and monitoring has been lost for one day. The application was running on the secondary server before the issue occurred.
The primary server is working abnormally: telnet and FTP are both unavailable, and clients cannot log in.

Alarm Information
Cannot log in to either server, and monitoring is lost.

Handling Process

1. First, confirm the working state of both servers with the customer. Before the issue occurred, the application was running on the secondary server and the primary server was already faulty, so troubleshooting should focus on the secondary (standby) server.
2. Because the server cannot be reached by ping or telnet, log in through the COM port on the back of the server. To connect the laptop to the server over the console port on the ALOM board, first connect the COM port on the ALOM board of the SUN server to the COM port of the laptop by using the appropriate serial port cable, and then perform the following steps:
Step 1 Start any terminal management software on the laptop, such as hypertrm.exe in the Windows OS, SecureCRT, netterm, teraterm, or CRT.
Step 2 Retain the default values of Port, Baud Rate, Data Bits, Parity, and Stop Bits shown in the Communications Setup dialog box, and then click OK.
Step 3 Check whether information is displayed in the terminal management software window. If no information is displayed, check the server, serial port cable, terminal management software, and laptop.
Step 4 Enable telnet by using the following commands:
       1. Open /etc/inetd.conf and remove the # in front of the telnet line:
              # vi /etc/inetd.conf
          Change:
              #telnet stream  tcp6    nowait  root    /usr/sbin/in.telnetd    in.telnetd
          to:
              telnet stream  tcp6    nowait  root    /usr/sbin/in.telnetd    in.telnetd

       2. Restart the inetd daemon with the following commands:
              # pkill inetd
              # /usr/sbin/inetd -s
3. After the secondary server was rebooted, it worked normally and could be reached by telnet. This indicates that the secondary system itself was not faulty at that time.
 
4. Telnet to the secondary server and check the logs. The logs show that the DCN was interrupted and the servers ran in the double-working (dual-active) state, and that the secondary server could not start after the DCN was restored. The logs are as follows:
2011/02/25 22:51:43 VCS NOTICE V-16-1-50983 Resource NMSServer is offline on system Primary in cluster primaryCluster
2011/02/26 00:16:25 VCS NOTICE V-16-1-50983 Resource NMSServer is offline on system Primary in cluster primaryCluster
2011/02/26 00:38:41 VCS ERROR V-16-3-18313 (Secondary) TCP/IP connection via connector from cluster secondaryCluster to cluster primaryCluster is hung; intentionally disconnecting. Auto-reconnect will occur however you may wish to examine the wac resource on system 192.168.202.1
2011/02/26 00:38:41 VCS ERROR V-16-3-18311 (Secondary) secondaryCluster lost connection to cluster primaryCluster
2011/02/26 03:19:00 VCS ERROR V-16-2-13027 (Secondary) Resource(csgnic) - monitor procedure did not complete within the expected time.
2011/02/27 16:12:52 VCS NOTICE V-16-20013-16 (Secondary) SybaseBk:BackupServer:monitor:Setting cookie for proc = /opt/sybase/ASE-12_5/bin/backupserver -SDBSVR_back -e/opt/sybase/ASE-12_5/insta, PID = /proc/18880
2011/02/27 16:12:52 VCS INFO V-16-1-10299 Resource BackupServer (Owner: unknown, Group: AppService) is online on Secondary (Not initiated by VCS)
2011/02/27 16:12:52 VCS NOTICE V-16-1-10233 Clearing Restart attribute for group AppService on all nodes
2011/02/27 16:12:52 VCS ERROR V-16-1-50921 CONCURRENCY VIOLATION:Group AppService is online on the following clusters [primaryCluster, secondaryCluster]

5. After the DCN was restored, VERITAS changed to the recovered state. The secondary server was locked with its application offline, and an attempt was made to start the primary server. Because the fault still existed on the primary server, VERITAS remained in the recovered state.
6. At 2011/02/27 16:12:52, the MAC port on the secondary server was disabled and the application was started on the secondary server. Monitoring was restored.
7. A check of the primary server showed that its OS was abnormal. The server was reinstalled and fully synchronized, after which VERITAS was restored.
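
The telnet change in step 2 can also be scripted instead of being edited by hand in vi. The following is a minimal sketch and not part of the original procedure: the function name `enable_telnet_entry` is hypothetical, and it assumes the entry is commented out with a single leading `#` followed by `telnet` and a space, exactly as shown in step 2. It prints the modified file to standard output so that the original can be kept as a backup.

```shell
#!/bin/sh
# Hypothetical helper: print the given inetd.conf with the telnet entry
# uncommented. Assumption: the entry begins with "#telnet " (hash, the
# word telnet, one space), as in the line shown in this case.
enable_telnet_entry() {
    conf_file="$1"
    # Remove the leading '#' only from lines that start with "#telnet ".
    # All other lines, including other commented services, pass through
    # unchanged.
    sed 's/^#telnet /telnet /' "$conf_file"
}

# Example use on the server (run as root; paths are the ones in this case):
#   cp /etc/inetd.conf /etc/inetd.conf.bak
#   enable_telnet_entry /etc/inetd.conf.bak > /etc/inetd.conf
#   pkill inetd
#   /usr/sbin/inetd -s
```

Working from a backup copy means a failed edit can be undone over the serial console by copying the backup into place again.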

 

Root Cause
Both servers were not working. Possible causes:
1. The operating systems of both servers crashed (dumped), so the application and VERITAS stopped working, monitoring was lost, and clients could not log in.
2. The primary server system was faulty and the secondary server changed to safe mode, so login failed and the processes were not running.
3. The standby system was working normally, but VERITAS was not working (not online).

Suggestions
The DCN was interrupted and the primary server was faulty.
1. Keep the DCN reliable at all times and provide it with protection.
2. Check VERITAS every day. If the primary server is found to be working abnormally, fix it as soon as possible.
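
Part of the daily check suggested above could be automated by scanning the VCS log for the message IDs that appeared in this case. The following is a rough sketch under stated assumptions: the function name `scan_vcs_log` is hypothetical, the three message IDs are taken from the log excerpt in the handling process, and the log path in the usage comment is an assumption that should be adjusted to the actual installation.

```shell
#!/bin/sh
# Hypothetical helper: print the lines of a VCS log file that match the
# failure signatures seen in this case.
scan_vcs_log() {
    log_file="$1"
    # V-16-3-18311: lost connection to the remote cluster
    # V-16-1-50921: CONCURRENCY VIOLATION (group online on both clusters)
    # V-16-2-13027: resource monitor procedure did not complete in time
    grep -e 'V-16-3-18311' -e 'V-16-1-50921' -e 'V-16-2-13027' "$log_file"
}

# Example daily use (the log path is an assumption, not from this case):
#   scan_vcs_log /var/VRTSvcs/log/engine_A.log
```

The function returns a non-zero exit status when nothing matches, so a scheduled job could raise a notification only when matching lines are found.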

END