Providing System Die or OOPS Information
This topic describes the exception information that the kbox records when the system encounters a die or OOPS exception.
Function Introduction
In some cases, the product software, the platform service software, or the OS may access a null pointer or corrupt memory, resulting in an OOPS event. The kbox records the OOPS event information, such as the time of occurrence, information about the process that encountered the OOPS event, and the call stack of the abnormal process, in the storage device to facilitate fault locating.
Information List
The recorded die or OOPS information consists of three parts:
- OOPS information. Up to 128 KB of information can be recorded. This area mainly records the following information:
- Time of occurrence (UTC time)
- PID and name of the current process
- Call trace and stack information of the abnormal process (up to 150 lines of stack information can be output)
- Start address and size of each module listed in the output (up to 384 module records can be output)
If a large number of modules exist in the system and the name of each module is long, log shock may occur.
- Message output logs of the kernel. Up to 64 KB of information can be recorded. This area mainly records the following information:
- Time of occurrence (UTC time)
- The latest 64 KB of logs at the time the exception occurs, that is, the last 64 KB of logs that the kernel output to its circular buffer
- Message output logs of the console. Up to 32 KB of information can be recorded. This area mainly records the following information:
- Time of occurrence (UTC time)
- Messages that other processes output to the console
If other processes do not print logs to the console, the collected logs are empty.
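The size limits above can be summarized in a small sketch. The constant and function names are hypothetical and are not part of the kbox. The "keep the newest bytes" behavior is stated in this topic only for the kernel message area; applying it to the other areas is an assumption made for illustration.

```python
# Capacities taken from the list above. All names here are hypothetical.
KIB = 1024
AREA_LIMITS = {
    "die": 128 * KIB,      # OOPS information
    "message": 64 * KIB,   # message output logs of the kernel
    "console": 32 * KIB,   # message output logs of the console
}
MAX_STACK_LINES = 150      # stack information lines in the die area
MAX_MODULE_RECORDS = 384   # module records in the die area


def keep_latest(data: bytes, area: str) -> bytes:
    """Return the newest bytes of `data` that fit into the named area.

    This mirrors the documented kernel message behavior (the last 64 KB
    of the circular buffer). For other areas it is only an assumption.
    """
    limit = AREA_LIMITS[area]
    return data[-limit:] if len(data) > limit else data
```

For example, passing 70 KB of kernel log data with the area `"message"` returns only the final 64 KB.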
Examples
A log file includes the following information:
- OOPS exception information
*****area type:die - location in die area:0*****
//Error type and error number
die info:Oops:0002
---------KBOX_START----------
//Time when the die event occurs.
die time:20131109041507-e9ea9
//Number of the CPU where the process runs, and process name
CPU 0 Pid: 7664, comm: bash
//Register information
RIP: 0010:[<ffffffffa0adbcdf>] [<ffffffffa0adbcdf>] dev_wr_handler+0xa3f/0xce0 [kpgen]
RSP: 0018:ffff88005ea73eb8 EFLAGS: 00010292
RAX: 0000000000000045 RBX: 0000000000000000 RCX: 0000000000006161
RDX: 0000000000000062 RSI: 0000000000000096 RDI: 0000000000000246
RBP: ffff88005ea73f08 R08: ffff88007b001000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff810f10f0
R13: 0000000000000005 R14: 00007f8bcf81a000 R15: 0000000000000000
FS: 00007f8bcf97f700(0000) GS:ffff880069600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000000035c43000 CR4: 00000000001407f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process bash (pid: 7664, threadinfo ffff88005ea72000, task ffff8800373ba6c0)
Stack:
0000000a32303031 0000000000000000 0000000000000000 0000000000000000
ffff88005ea73f08 ffffffff81146a08 0000000000000002 0000000000000005
ffff88004f3b8e80 ffff88005ea73f48 ffff88005ea73f38 ffffffff81146e9b
//Call Stack
Call Trace:
[<ffffffffa0003e43>] kbox_show_registers+0x4c3/0xa30 [kbox]
[<ffffffffa000e91b>] ? kbox_buffer_write+0xcb/0x110 [kbox]
[<ffffffffa001012c>] kbox_die_callback+0x15c/0x220 [kbox]
[<ffffffff81445e0f>] notifier_call_chain+0x3f/0x80
[<ffffffff81445e5d>] __atomic_notifier_call_chain+0xd/0x10
[<ffffffff81445e71>] atomic_notifier_call_chain+0x11/0x20
[<ffffffff81445eae>] notify_die+0x2e/0x30
[<ffffffff814430c8>] __die+0x88/0x100
[<ffffffff8102e883>] no_context+0xf3/0x200
[<ffffffff8102eabd>] __bad_area_nosemaphore+0x12d/0x220
[<ffffffff8102ec09>] bad_area+0x49/0x60
[<ffffffff81445d6c>] do_page_fault+0x44c/0x4b0
[<ffffffff8103aef6>] ? console_unlock+0x246/0x2a0
[<ffffffff810f10f0>] ? oom_kill_process+0x2a0/0x2a0
[<ffffffff81442675>] page_fault+0x25/0x30
[<ffffffff810f10f0>] ? oom_kill_process+0x2a0/0x2a0
[<ffffffffa0adbcdf>] ? dev_wr_handler+0xa3f/0xce0 [kpgen]
[<ffffffffa0adbcdf>] ? dev_wr_handler+0xa3f/0xce0 [kpgen]
[<ffffffff81146a08>] ? rw_verify_area+0x58/0x100
[<ffffffff81146e9b>] vfs_write+0xcb/0x130
[<ffffffff81146ff0>] sys_write+0x50/0x90
[<ffffffff8144a079>] system_call_fastpath+0x16/0x1b
Code: 00 00 0f 85 12 f7 ff ff 48 c7 c7 08 d5 ad a0 31 c0 e8 58 2e 96 e0 cd 04 e9 0b f7 ff ff 48 c7 c7 98 d4 ad a0 31 c0 e8 43 2e 96 e0 <c6> 04 25 00 00 00 00 00 e9 f0 f6 ff ff 48 c7 c7 d0 d4 ad a0 31
mod_name mod_start core_size
os_test 0xffffffffa0015000 0x3df6
bioshmem_driver 0xffffffffa000c000 0x454d
kbox 0xffffffffa0b81000 0x49a53
agetty_query 0xffffffffa0b7c000 0x3442
cpufreq_powersave 0xffffffffa0b77000 0x314b
signo_catch 0xffffffffa0b6d000 0x30fa
sysalarm_agent_netlink_k0xffffffffa0b72000 0x316d
af_packet 0xffffffffa0b63000 0x8a5e
......
//Actions performed after the die exception occurs (0: no action; 1: calling the panic function; 2: reboot)
action after die is:0
---------KBOX_END------------
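When locating a fault, it can help to pull the key fields out of a die record programmatically. The following is a minimal sketch, not a kbox tool: the function name and the returned dictionary layout are hypothetical, and the field formats are inferred only from the sample record above. The 14 digits of `die time` are read as `YYYYMMDDhhmmss` in UTC, and the suffix after the hyphen is kept as an opaque string because this topic does not explain it.

```python
import re
from datetime import datetime, timezone

# Action codes as annotated in the sample record.
DIE_ACTIONS = {0: "no action", 1: "call panic", 2: "reboot"}

# A call-trace frame: "[<addr>] [? ]symbol+0xoff/0xsize [module]".
FRAME_RE = re.compile(
    r"\[<([0-9a-f]+)>\]\s+(\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(\w+)\])?"
)


def parse_die_record(text: str) -> dict:
    """Extract a few fields from a die-area record (sketch, not a full parser)."""
    record = {}
    m = re.search(r"die time:(\d{14})-(\w+)", text)
    if m:
        stamp = datetime.strptime(m.group(1), "%Y%m%d%H%M%S")
        record["time"] = stamp.replace(tzinfo=timezone.utc)
        record["time_suffix"] = m.group(2)  # meaning not documented here
    m = re.search(r"CPU (\d+)\s+Pid: (\d+), comm: (\S+)", text)
    if m:
        record["cpu"] = int(m.group(1))
        record["pid"] = int(m.group(2))
        record["comm"] = m.group(3)
    m = re.search(r"action after die is:(\d+)", text)
    if m:
        record["action"] = DIE_ACTIONS.get(int(m.group(1)), "unknown")
    # Only look between "Call Trace:" and "Code:" so the RIP line is skipped.
    trace = text.partition("Call Trace:")[2].partition("Code:")[0]
    record["frames"] = [
        {
            "addr": int(addr, 16),
            "symbol": symbol,
            "module": module or None,   # None for core kernel symbols
            "reliable": not question,   # "?" marks an unverified frame
        }
        for addr, question, symbol, module in FRAME_RE.findall(trace)
    ]
    return record
```

Filtering `record["frames"]` on `reliable` leaves the frames the kernel printed without a `?`, which is usually the quickest way to read the actual call chain.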
- Message output logs of the kernel
message: *****area type:message - location in message area:4*****
die time:20131109041507-e9ea9
<3>[ 246.509001] INFO: task sched_work:253 blocked for more than 120 seconds.
<3>[ 246.509005] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
<6>[ 246.509009] sched_work D 0000000000000000 5544 253 2 0x00000000
<4>[ 246.509018] ffff88005eb0bdd0 0000000000000046 ffffffff81075026 ffff88005eb0a010
<4>[ 246.509025] 00000000000159c0 00000000000159c0 00000000000159c0 00000000000159c0
<4>[ 246.509032] ffff88005eb0bfd8 ffff88005eb0bfd8 00000000000159c0 00000000000159c0
<4>[ 246.509038] Call Trace:
<4>[ 246.509048] [<ffffffff81075026>] ? update_curr+0x186/0x1c0
<4>[ 246.509055] [<ffffffff8107607a>] ? dequeue_task_fair+0x6a/0x170
<4>[ 246.509060] [<ffffffff8106ba29>] ? dequeue_task+0x89/0xa0
<4>[ 246.509082] [<ffffffffa00bf7fe>] ? DBG_Log+0x3e/0x340 [vos]
<4>[ 246.509090] [<ffffffff81440499>] schedule+0x29/0x90
<4>[ 246.509095] [<ffffffff8143edad>] schedule_timeout+0x21d/0x2c0
<4>[ 246.509102] [<ffffffff810c1392>] ? call_rcu_sched+0x12/0x20
<4>[ 246.509107] [<ffffffff8106261a>] ? __put_cred+0x3a/0x50
<4>[ 246.509111] [<ffffffff81062d2e>] ? commit_creds+0x12e/0x1e0
<4>[ 246.509117] [<ffffffff8143f54d>] __down+0x6d/0xb0
<4>[ 246.509125] [<ffffffff81061aa7>] down+0x47/0x50
<4>[ 246.509142] [<ffffffffa00c6009>] LVOS_sema_down+0x9/0x10 [vos]
<4>[ 246.509157] [<ffffffffa00c64cc>] LVOS_SchedWorkThread+0x5c/0x190 [vos]
<4>[ 246.509165] [<ffffffff8144b3d4>] kernel_thread_helper+0x4/0x10
<4>[ 246.509180] [<ffffffffa00c6470>] ? LVOS_SchedWorkInit+0x60/0x60 [vos]
<4>[ 246.509186] [<ffffffff8144b3d0>] ? gs_change+0x13/0x13
<3>[ 246.509190] INFO: task sched_work:254 blocked for more than 120 seconds.
<3>[ 246.509192] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
<6>[ 246.509196] sched_work D 0000000000000000 5544 254 2 0x00000000
<4>[ 246.509203] ffff88005eb07dd0 0000000000000046 ffffffff81075026 ffff88005eb06010
<4>[ 246.509209] 00000000000159c0 00000000000159c0 00000000000159c0 00000000000159c0
<4>[ 246.509214] ffff88005eb07fd8 ffff88005eb07fd8 00000000000159c0 00000000000159c0
<4>[ 246.509220] Call Trace:
<4>[ 246.509225] [<ffffffff81075026>] ? update_curr+0x186/0x1c0
<4>[ 246.509231] [<ffffffff8107607a>] ? dequeue_task_fair+0x6a/0x170
<4>[ 246.509236] [<ffffffff8106ba29>] ? dequeue_task+0x89/0xa0
<4>[ 246.509249] [<ffffffffa00bf7fe>] ? DBG_Log+0x3e/0x340 [vos]
<4>[ 246.509256] [<ffffffff81440499>] schedule+0x29/0x90
<4>[ 246.509260] [<ffffffff8143edad>] schedule_timeout+0x21d/0x2c0
<4>[ 246.509265] [<ffffffff810c1392>] ? call_rcu_sched+0x12/0x20
<4>[ 246.509269] [<ffffffff8106261a>] ? __put_cred+0x3a/0x50
<4>[ 246.509274] [<ffffffff81062d2e>] ? commit_creds+0x12e/0x1e0
<4>[ 246.509279] [<ffffffff8143f54d>] __down+0x6d/0xb0
<4>[ 246.509286] [<ffffffff81061aa7>] down+0x47/0x50
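Each kernel message entry in this area starts with a severity in angle brackets, followed by a timestamp in seconds since boot. The sketch below splits entries into these parts so that they can be filtered by severity. The function name is hypothetical. The level names are the standard Linux printk severities 0 to 7, and the sketch assumes one entry per line.

```python
import re

# Standard Linux printk severity names for levels 0 to 7.
LEVELS = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]

# "<level>[ seconds.microseconds] text"
LINE_RE = re.compile(r"<([0-7])>\[\s*(\d+\.\d+)\]\s?(.*)")


def parse_kmsg(text: str) -> list:
    """Return (level_name, seconds_since_boot, message) for each log line.

    Lines that do not start with a "<n>[time]" prefix, such as the area
    header and the die time line, are skipped.
    """
    entries = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            level = LEVELS[int(m.group(1))]
            entries.append((level, float(m.group(2)), m.group(3)))
    return entries
```

Selecting only the entries whose level is `"err"` from the sample above would leave the hung-task messages for `sched_work:253` and `sched_work:254`.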
- Message output logs of the console
console: *****area type:console - location in console area:4*****
die time:20131109041507-e9ea9
die time:20131109041507-e9ea9
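The three examples above all begin with an area header of the form `*****area type:<type> - location in <type> area:<n>*****`. A log file can therefore be split into its areas by searching for that header. The following is a minimal sketch with a hypothetical function name. It treats the location number as an opaque integer and assumes that an optional `message:` or `console:` label may precede a header, as in the samples.

```python
import re

# Area header, optionally preceded by a "message:" or "console:" label.
HEADER_RE = re.compile(
    r"(?:\w+:\s*)?\*{5}area type:(\w+) - location in \1 area:(\d+)\*{5}"
)


def split_areas(text: str) -> list:
    """Return (area_type, location, body) tuples in file order."""
    matches = list(HEADER_RE.finditer(text))
    areas = []
    for i, m in enumerate(matches):
        # The body runs up to the next header, or to the end of the file.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        areas.append((m.group(1), int(m.group(2)), text[m.end():end].strip()))
    return areas
```

Each body can then be handled according to its area type, for example by checking whether the `console` body contains anything beyond the `die time` lines.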