Forum Topic - qnx kernel crash analysis and qsymoops: (8 Items)
   
qnx kernel crash analysis and qsymoops  
It is too quiet here.

I wrote a utility, "qcrash/qsymoops", to analyze and decode QNX kernel dumps in 2006, and in 2008 I added features such as backtraces and disassembly of the dumped machine code.

Here is the link:
http://sourceforge.net/projects/qcrash/
Currently it can show backtraces for ppcbe, x86, shle, armle and armbe. There is no mips support yet, because I have never tried mips on QNX, although I used to use the sb1250 and 1125 with 32/64-bit mips Linux.
If you have a QNX kernel crash dump and you are OK with publishing it here, I can help a little (I am not a kernel developer, so the help will be limited), or you can help yourself with qcrash/qsymoops.
There are still cases where it can't help much: sometimes the kernel crashes in a very strange way that I did not anticipate (I can add more code to handle this), and sometimes the printed kernel stack is too small to show a backtrace, so you will only see the pc.

If you have ever done Linux kernel development, you probably know why I named it qsymoops, although recent Linux kernels no longer need an external tool to analyze a kernel dump; the kernel decodes it itself.

A kernel dump is plain text, so sometimes qsymoops may not parse it correctly, and you may have to tidy up the dump so that qsymoops understands it better. I am not very familiar with regexps, so you can probably improve it.
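
To make the regexp remark concrete, here is a minimal sketch of parsing the first line of a dump. It is not taken from qsymoops; the pattern is modelled only on the "Shutdown[...]" line of the armle dump posted later in this thread, the names HEADER_RE and parse_header are hypothetical, and treating S/C/F as decimal numbers is an assumption.

```python
import re

# Pattern modelled on the "Shutdown[...]" header line posted later in this
# thread; other kernel versions may print the line differently.
HEADER_RE = re.compile(
    r"Shutdown\[(?P<cpu>\d+),(?P<kernel_cpu>\d+)\]\s+"
    r"S/C/F=(?P<signal>\d+)/(?P<code>\d+)/(?P<fault>\d+)\s+"
    r"C/D=(?P<text>[0-9a-fA-F]+)/(?P<data>[0-9a-fA-F]+)\s+"
    r"state\((?P<state>[0-9a-fA-F]+)\)=\s*(?P<flags>.*)"
)

SAMPLE = ("Shutdown[0,0] S/C/F=10/1/0 C/D=fe0156cc/fe0783e8 "
          "state(f0)= now lock exit specret")


def parse_header(line):
    """Return the header fields as a dict, or None if the line does not match."""
    m = HEADER_RE.search(line)
    if m is None:
        return None
    d = m.groupdict()
    return {
        "cpu": int(d["cpu"]),
        "kernel_cpu": int(d["kernel_cpu"]),
        # Assumption: signal, code and fault are printed in decimal.
        "signal": int(d["signal"]),
        "code": int(d["code"]),
        "fault": int(d["fault"]),
        # The C/D addresses and the state value are printed in hex.
        "text": int(d["text"], 16),
        "data": int(d["data"], 16),
        "state": int(d["state"], 16),
        "flags": d["flags"].split(),
    }


if __name__ == "__main__":
    print(parse_header(SAMPLE))
```

Using search() rather than match() lets the line carry leading junk such as a console prompt, which is one of the things that otherwise forces you to tidy the dump by hand.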


Re: qnx kernel crash analysis and qsymoops  
Hi Yao,

It doesn't link on linux - no __bswap_32

Also, is it 64bit safe?  I had to make the following change on x86_64 linux

Index: Makefile
===================================================================
--- Makefile	(revision 16)
+++ Makefile	(working copy)
@@ -4,6 +4,10 @@
  # on newer version it will return long machine name for -p.
  HOSTARCH := $(shell uname -m | sed -e 's/i.86/x86/g' -e 's/sh.*/sh/' \
  				-e 's/x86pc/x86/' )
+ifeq ($(HOSTARCH),x86_64)
+	HOSTARCH=x86
+	CFLAGS+=-m32
+endif
  ifeq ($(ARCH),)
  ARCH := $(HOSTARCH)
  CROSS_COMPILE :=

-- 
cburgess@qnx.com
Re: qnx kernel crash analysis and qsymoops  
You probably have to define __USE_BSD when you compile; __bswap_32 is defined in /usr/include/bits/byteswap.h, which is included by endian.h.
I am not sure why __USE_BSD is not defined on x86_64.

It should be safe: I grepped for "long", and only one place in my code uses unsigned long for an address. On 64-bit Linux I know long is 64 bits and int is still 32 bits. I defined the address as unsigned long so that it would still work when QNX supports 64 bits, but I haven't had a chance to try that yet.
So it should be safe on 64 bits, but I didn't test on 64-bit Linux.
What will our definition of long be when 64 bits arrives? I heard Windows still uses 32 bits for it.

Yao
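
As an illustration of the two points above, here is a small sketch of what a 32-bit byte swap does and of how the native size of long can be checked on the host. It is not code from qcrash; the function name bswap32 is hypothetical and only mirrors the effect of a macro like __bswap_32.

```python
import struct


def bswap32(value):
    """Reverse the byte order of a 32-bit unsigned value."""
    # Pack as big-endian, then reinterpret the same four bytes as little-endian.
    return struct.unpack("<I", struct.pack(">I", value & 0xFFFFFFFF))[0]


def native_sizes():
    """Return the host C sizes of int and long in bytes (platform dependent)."""
    return struct.calcsize("@i"), struct.calcsize("@l")


if __name__ == "__main__":
    print(hex(bswap32(0x12345678)))
    int_size, long_size = native_sizes()
    print("int:", int_size, "bytes, long:", long_size, "bytes")
```

On 64-bit Linux (LP64) the second line normally reports a 4-byte int and an 8-byte long, while 64-bit Windows (LLP64) keeps long at 4 bytes, so code that stores addresses in unsigned long behaves differently on the two.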



Re: qnx kernel crash analysis and qsymoops  
I just read my code, and it is not safe: when I read the data I use le32_to_cpu to read an unsigned long. That is the only bad spot, and it shows up if you recompile the gen subdirectory to generate a new demo_dat.c.
I have now changed it to unsigned int, so the default demo_dat.c, which I generated on a 32-bit platform, should work on a 64-bit platform.
The default demo_dat.c supports most platforms (only ppcle and shbe are missing), so you don't need to recompile the "gen" subdirectory to regenerate demo_dat.c.
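
A minimal sketch of the kind of mismatch described here, assuming a record that stores consecutive 32-bit little-endian fields. The record layout and both helper names are made up for illustration and are not taken from demo_dat.c.

```python
import struct

# Two consecutive 32-bit little-endian fields, as a 32-bit host would store
# them. The values are only examples.
RECORD = struct.pack("<II", 0xFE0156CC, 0xFE0783E8)


def read_u32_le(buf, offset):
    """Read one field with a fixed 32-bit width, independent of the host."""
    return struct.unpack_from("<I", buf, offset)[0]


def read_as_8_byte_long(buf, offset):
    """Read the same field as if it were an 8-byte unsigned long."""
    return struct.unpack_from("<Q", buf, offset)[0]


if __name__ == "__main__":
    print(hex(read_u32_le(RECORD, 0)))          # first field only
    print(hex(read_as_8_byte_long(RECORD, 0)))  # first and second field merged
```

Reading with a fixed 32-bit type gives the same answer on 32-bit and 64-bit hosts, which is the effect of switching from unsigned long to unsigned int.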
Re: qnx kernel crash analysis and qsymoops  
I was wrong!
I had not read the pointer conversion part. The code is updated now, but I have not tested it.
Re: qnx kernel crash analysis and qsymoops  
I ran into a problem on my own board, an ATMEL9260 running QNX 6.4.0. The kernel dump is below.
Thanks in advance.

Weiguo

Shutdown[0,0] S/C/F=10/1/0 C/D=fe0156cc/fe0783e8 state(f0)= now lock exit specret
QNX Version 6.4.0 Release 2008/10/21-10:59:12EDT
[0]PID-TID=1-7  P/T FL=00019001/05020000 "proc/boot/procnto"
[0]ASPACE PID=2 PF=00018012
armle context[fc45691c]:
0000: 00000000 fc46977c fffff9e0 fc4696dc fc4696d8 fc460060 fc4696dc fe06410c
0020: fc7d3fc0 00000000 fc460360 fc7d3f7c 0000000e fc7d3f5c fe061118 fe06111c
0040: 0000001f
instruction[fe06111c]:
00 00 84 e5 e3 ff ff 1a 48 31 00 eb 00 30 90 e5 41 0f 53 e3 00 40 a0 13 d6 ff 
stack[fc7d3f5c]:
0000: fc4696d8 fc4696d8 fc460db8 fc4696d8 fc460364 fc7d3fbc fc7d3f80 fe060920
0020: fe061050 00000000 fe060584 fc7d3f8c fc460db8 fc4696d8 00000001 00000000
0040: 00000000 00000000 00000000 00000000 00000000 00000000 fc7d3fc0 fe072710
0060: fe060878 fe072710 fc460db8 fc7d3fcc 00000003 00000000 00000001 00000007
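
For readers who want to see which register each word of the "armle context" block belongs to, here is a rough sketch. It assumes the words are simply r0 to r15 followed by spsr, which matches the register map qsymoops prints for this dump further down; the names ARM_REGS, parse_words and arm_context are hypothetical.

```python
# Assumed order of the words in an "armle context" block: r0..r15, then spsr.
ARM_REGS = ["r%d" % i for i in range(16)] + ["spsr"]

CONTEXT_LINES = [
    "0000: 00000000 fc46977c fffff9e0 fc4696dc fc4696d8 fc460060 fc4696dc fe06410c",
    "0020: fc7d3fc0 00000000 fc460360 fc7d3f7c 0000000e fc7d3f5c fe061118 fe06111c",
    "0040: 0000001f",
]


def parse_words(lines):
    """Collect the hex words of a dump block, ignoring the 'offset:' prefix."""
    words = []
    for line in lines:
        _, _, rest = line.partition(":")
        words.extend(int(w, 16) for w in rest.split())
    return words


def arm_context(lines):
    """Map the context words to register names."""
    return dict(zip(ARM_REGS, parse_words(lines)))


if __name__ == "__main__":
    for name, value in arm_context(CONTEXT_LINES).items():
        print("%5s: 0x%08x" % (name, value))
```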

Re: qnx kernel crash analysis and qsymoops  
Sorry for the late reply, I have been too busy.

qsymoops -f ./dump.txt -s $QNX_TARGET/armle/boot/sys/procnto > dump1ee.txt
The faulting instruction is
str r0, [r4], but r4 is 0xfc4696d8.
I don't understand why that is reported as an alignment fault, since that address is 4-byte aligned.

Sometimes it could be a CPU fault (I can't remember which ARM board it was, but its CPU had a prefetch problem). I didn't read this function's disassembly to see whether the other registers really make sense.
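
The check above can be reproduced by hand. The sketch below decodes the first four bytes of the "instruction[fe06111c]" line as a little-endian ARM word and tests the alignment of r4. It only handles the immediate-offset form of the ARM load/store word/byte encoding, and decode_single_transfer is a hypothetical helper, not part of qsymoops.

```python
def decode_single_transfer(word):
    """Decode an ARM LDR/STR with immediate offset; return None otherwise."""
    if (word >> 26) & 0b11 != 0b01:
        return None          # not a single data transfer instruction
    if (word >> 25) & 1:
        return None          # register-offset form, not handled in this sketch
    sign = 1 if (word >> 23) & 1 else -1      # U bit: add or subtract offset
    return {
        "op": "ldr" if (word >> 20) & 1 else "str",   # L bit
        "byte": bool((word >> 22) & 1),               # B bit
        "rn": (word >> 16) & 0xF,                     # base register
        "rd": (word >> 12) & 0xF,                     # source/destination
        "offset": sign * (word & 0xFFF),
    }


# First four bytes from the instruction[] line of the dump, little-endian.
WORD = int.from_bytes(bytes.fromhex("00 00 84 e5"), "little")
R4 = 0xFC4696D8

if __name__ == "__main__":
    print(hex(WORD), decode_single_transfer(WORD))
    print("r4 is 4-byte aligned:", R4 % 4 == 0)
```

The decoded fields agree with the "str r0, [r4]" that qsymoops reports, and r4 is a multiple of 4, so the reported BUS_ADRALN is not explained by the register values alone.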

armle context[fc45691c]:
 Platform:armle      endian:little
=========================================
QNX Version 6.4.0 Release 2008/10/21-10:59:12
 QNX Version:6.4.0 Release timestamp:2008/10/21-10:59:12EDT.
=========================================
Shutdown[0,0]
 Shutdown run on cpu 0, kernel cpu:0
=========================================
S/C/F=10/1/0
 Signal    :10  SIGBUS          bus error
 Code      :1   BUS_ADRALN      Invalid address alignment
 Fault code:0   Unknown         There is no this fault code
=========================================
S/C/F=10/1/0
=========================================
C/D=fe0156cc/fe0783e8
 Location of the kernel's code(main):0xfe0156cc data(actives[]):0xfe0783e8
=========================================
state(f0)= now lock exit
state(f0)= now lock exit specret
  Explanation of state of the kernel:
   * now -- in the kernel
   * lock -- nonpreemptible
   * exit -- leaving kernel
   * specret -- special return processing
   * any number -- the interrupt nesting level.
=========================================
[0]PID-TID=1-7
 The thread 7 of process 1 run on CPU 0 when the crash occurred.
 =========================================
 P/T FL=00019001/05020000 "proc/boot/procnto"

  Process flags:0x19001.
   0x00000001: _NTO_PF_NOCLDSTOP
   0x00001000: _NTO_PF_CHECK_INTR Only set by interrupt_attach_*
   0x00008000: _NTO_PF_RING0 Ring0 access(only process manager process)
   0x00010000: _NTO_PF_SLEADER
    +========
   0x00019001
  Thread flags :0x05020000.
   0x00020000: _NTO_TF_DETACHED
   0x01000000: _NTO_TF_ALIGN_FAULT
   0x04000000: _NTO_TF_ALLOCED_STACK
    +========
   0x05020000
  Process      :proc/boot/procnto".
=========================================
[0]ASPACE PID=2
 On CPU 0, the address space for process 2 was active.
 This line appears only when the process is different from the one in the PID-TID line.
=========================================
armle context[fc45691c]:
 Platform:armle          Stored register map address:fc45691c
    r0: 0x00000000     r1: 0xfc46977c     r2: 0xfffff9e0     r3: 0xfc4696dc
    r4: 0xfc4696d8     r5: 0xfc460060     r6: 0xfc4696dc     r7: 0xfe06410c
    r8: 0xfc7d3fc0     r9: 0x00000000    r10: 0xfc460360  r11(fp): 0xfc7d3f7c
 r12(ip): 0x0000000e  r13(sp): 0xfc7d3f5c  r14(lr): 0xfe061118  r15(pc): 0xfe06111c
  spsr: 0x0000001f
=========================================
instruction[fe06111c]:
 Current instruction address(pc):0xfe06111c and machine code:
 00 00 84 e5 e3 ff ff 1a 48 31 00 eb 00 30 90 e5 41 0f 53 e3 00 40 a0 13 d6 ff

 Current ip(0xfe06111c) in image is address (0x5211c), it is in function: dispatch_block_receive_all(0x52040) [dispatch_block_receive_all+0xdc]
00052040 <dispatch_block_receive_all>:
   52040:       e1a0c00d        mov     ip, sp
   52044:       e92dd8f0        push    {r4, r5, r6, r7, fp, ip, lr, pc}
   52048:       e24cb004        sub     fp, ip, #4      ; 0x4
   5204c:       e24dd004        sub     sp, sp, #4      ; 0x4
   52050:       e5905038        ldr     r5, [r0, #56]
   52054:       e1a04000        mov     r4, r0
   52058:       e5953018        ldr     r3, [r5, #24]
   5205c:       e1a07001        mov     r7, r1
   52060:       e3130004        tst     r3, #4  ; 0x4
   52064:       1a000038        bne     5214c <dispatch_block_receive_all+0x10c>
   52068:       e595302c        ldr     r3, [r5, #44]
   5206c:       e3530000        cmp     r3, #0  ; 0x0
   52070:       0a000009        beq     5209c...
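
Two of the steps in the output above are plain arithmetic and can be sketched as follows: splitting a flags word into its set bits, and computing the offset of the pc inside the function. The flag names are the ones printed in the output above; decode_flags and symbol_offset are hypothetical helper names, not qsymoops functions.

```python
# Process flag names exactly as listed in the qsymoops output above.
PROCESS_FLAGS = {
    0x00000001: "_NTO_PF_NOCLDSTOP",
    0x00001000: "_NTO_PF_CHECK_INTR",
    0x00008000: "_NTO_PF_RING0",
    0x00010000: "_NTO_PF_SLEADER",
}


def decode_flags(value, names):
    """Return (bit, name) pairs for every bit set in value, lowest bit first."""
    out = []
    bit = 1
    while bit <= value:
        if value & bit:
            out.append((bit, names.get(bit, "unknown")))
        bit <<= 1
    return out


def symbol_offset(image_addr, func_addr):
    """Offset of an address from the start of the function that contains it."""
    return image_addr - func_addr


if __name__ == "__main__":
    for bit, name in decode_flags(0x19001, PROCESS_FLAGS):
        print("0x%08x: %s" % (bit, name))
    # pc in the image and start of dispatch_block_receive_all, from the output.
    print("dispatch_block_receive_all+0x%x" % symbol_offset(0x5211C, 0x52040))
```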
Re: qnx kernel crash analysis and qsymoops  
I get the message below in a telnet session when procnto starts.

Can you please help me understand it, or get the call stack, to see why it is crashing?

The message below appears at the time of the crash, as soon as procnto starts:

Shutdown[0,0] S/C/F=10/1/5 C/D=fe01a964/fe0a1340 state(4c00)= now lock
QNX Version 6.6.0 Release 2011/06/08-14:50:25EDT KSB:effe7000
[0]PID-TID=1-1? P/T FL=00018001/08000040
[0]ASPACE=>NULL
armle context[fe09fab4]:
0000: 00000000 004de329 400001d3 ee100e15 400001d3 00000004 fe09ec04 fe0
0020: fe0a6968 fe09fb24 00000000 fe0a1234 fe0a1894 fe09faf8 fe047bf0 fc4
0040: 400001d3
instruction[fc404930]:
00 40 93 e5 04 50 93 e5 04 00 50 e0 05 10 c1 e0 10 80 bd e8 b8 c0 a0 e3
stack[fe09faf8]:
0000: 400001d3 fe047bf0 fe0a1748 fe065db0 fe0a1748 fe04d420 00010000 fe0
0020: fe0a11ec fe09ec18 00000000 00000000 fe09eb70 fe09eb70 fe0a6138 fe0
0040: fe09ec04 00000000 00000000 00000000 00000000 fe063f58 00000000 fe0
0060: 00000000 fe047cdc 00000000 fc404000 00000000 10c55878 00000004 000