FreeBSD 7.2-RELEASE/amd64 page fault while in kernel mode

Post by leres » Sun Jul 04, 2010 10:37 pm

I have a 4-core Intel system with one hdaudio device and one sbpci device:
Code:
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Core(TM)2 Quad CPU    Q9650  @ 3.00GHz (3000.02-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0x1067a  Stepping = 10
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x408e3fd<SSE3,RSVD2,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,<b19>,XSAVE>
  AMD Features=0x20100800<SYSCALL,NX,LM>
  AMD Features2=0x1<LAHF>
  Cores per package: 4
usable memory = 4279726080 (4081 MB)
avail memory  = 4099280896 (3909 MB)
ACPI APIC Table: <PTLTD          APIC  >
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs

[...]
Jun 29 20:12:21 hot.ee.lbl.gov kernel: oss_hdaudio0: [ITHREAD]
Jun 29 20:12:21 hot.ee.lbl.gov kernel: oss_hdaudio0: <Intel HD Audio> mem 0xd0520000-0xd0523fff irq 16 at device 27.0 on pci0
Jun 29 20:12:21 hot.ee.lbl.gov kernel: oss_sbpci0: [ITHREAD]
Jun 29 20:12:21 hot.ee.lbl.gov kernel: oss_sbpci0: <Sound Blaster PCI128> port 0x3000-0x303f irq 20 at device 0.0 on pci17

Occasionally, the system panics with "page fault while in kernel mode" from do_outputintr()/oss_memset(). This seems to happen when I'm using xmms to stream audio and stop the stream.

I'm using v4.2-build2003 (which seems to be the same as v4.2-build2002 except for a few new device ID codes).

I (painfully) built oss with kernel symbols and dmap->dmabuf is zero in the call to memset():
Code:
hot 17 # kgdb /boot/kernel/kernel.symbols vmcore.4
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:
kernel trap 12 with interrupts disabled


Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 03
fault virtual address   = 0xfffffffef7f12000
fault code              = supervisor write data, page not present
instruction pointer     = 0x8:0xffffffff80e564f3
stack pointer           = 0x10:0xfffffffe800c7b00
frame pointer           = 0x10:0xfffffffe800c7b10
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 26 (irq16:  uhci0 drm0)
trap number             = 12
panic: page fault
cpuid = 3
Uptime: 6d5h26m30s
Physical memory: 4081 MB
[...]
(kgdb) bt
#0  doadump () at pcpu.h:195
#1  0x0000000000000004 in ?? ()
#2  0xffffffff80556f59 in boot (howto=260) at ../../../kern/kern_shutdown.c:418
#3  0xffffffff80557362 in panic (fmt=0x104 <Address 0x104 out of bounds>)
    at ../../../kern/kern_shutdown.c:574
#4  0xffffffff808238b3 in trap_fatal (frame=0xffffff00014536e0, eva=Variable "eva" is not available.
)
    at ../../../amd64/amd64/trap.c:757
#5  0xffffffff8082453f in trap (frame=0xfffffffe800c7a50)
    at ../../../amd64/amd64/trap.c:290
#6  0xffffffff808086ce in calltrap () at ../../../amd64/amd64/exception.S:209
#7  0xffffffff80e564f3 in oss_memset (t=0xfffffffef7f12000, val=0, l=4096)
    at osscore.c:50
#8  0xffffffff80e327b0 in do_outputintr (dev=Variable "dev" is not available.
) at oss_audio_core.c:4508
#9  0xffffffff80e3296a in audio_outputintr (dev=0, intr_flags=1)
    at oss_audio_core.c:244
#10 0xffffffff80eb6e93 in hdaintr (osdev=0xfffffffef7f12000)
    at oss_hdaudio.c:211
#11 0xffffffff80e567d6 in ossintr (arg=Variable "arg" is not available.
) at osscore.c:118
#12 0xffffffff80536290 in ithread_loop (arg=0xffffff00015dc2a0)
    at ../../../kern/kern_intr.c:1088
#13 0xffffffff80533103 in fork_exit (
    callout=0xffffffff80536120 <ithread_loop>, arg=0xffffff00015dc2a0,
    frame=0xfffffffe800c7c80) at ../../../kern/kern_fork.c:810
#14 0xffffffff80808a8e in fork_trampoline ()
    at ../../../amd64/amd64/exception.S:455
#15 0x0000000000000000 in ?? ()
[...]
(kgdb) f 8
#8  0xffffffff80e327b0 in do_outputintr (dev=Variable "dev" is not available.
) at oss_audio_core.c:4508
4508                memset (dmap->dmabuf, dmap->neutral_byte, dmap->bytes_in_use);
(kgdb) p dmap->dmabuf
$1 = (unsigned char *) 0x0
(kgdb)

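So this looks like a race: oss_memset() was handed a non-NULL pointer (t=0xfffffffef7f12000 in frame 7), yet dmap->dmabuf is NULL by the time of the dump, so the buffer was apparently released after the interrupt path had already tested the pointer. The attached patch moves the lock acquisition ahead of the dmap->dmabuf test so that the test is protected. Since the NULL case should never (or at least rarely) happen, the change adds no overhead in the normal path.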

Actually, I couldn't figure out how to attach a file (both extensions "c" and "txt" are not allowed?!?!) so here it is inline.
Code:
--- kernel/framework/audio/oss_audio_core.c.virgin      2010-07-04 15:12:40.000000000 -0700
+++ kernel/framework/audio/oss_audio_core.c     2010-07-04 15:14:09.000000000 -0700
@@ -5668,13 +5668,15 @@
   adev = audio_engines[dev];
   dmap = adev->dmap_out;

+  MUTEX_ENTER_IRQDISABLE (dmap->mutex, flags);
+
   if (dmap->dmabuf == NULL)
     {
+      MUTEX_EXIT_IRQRESTORE (dmap->mutex, flags);
       cmn_err (CE_WARN, "Output interrupt when no buffer is allocated\n");
       return;
     }

-  MUTEX_ENTER_IRQDISABLE (dmap->mutex, flags);

   if (!(intr_flags & AINTR_NO_POINTER_UPDATES))
     {
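
For anyone following along, the check-under-lock idea in the patch can be sketched in user space. This is only an illustration, not OSS code: the demo_* names and the struct are hypothetical, pthread mutexes stand in for MUTEX_ENTER_IRQDISABLE/MUTEX_EXIT_IRQRESTORE, and it assumes the close path releases the buffer while holding the same mutex (the patch itself only touches the interrupt side).

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the OSS dmap; only the fields needed here. */
struct demo_dmap {
    pthread_mutex_t mutex;
    unsigned char *dmabuf;       /* NULL once the device has been closed */
    size_t bytes_in_use;
    unsigned char neutral_byte;
};

/* Allocate the buffer and set up the lock. Returns 0 on success. */
int demo_dmap_open(struct demo_dmap *d, size_t size)
{
    assert(d != NULL);
    if (pthread_mutex_init(&d->mutex, NULL) != 0)
        return -1;
    d->dmabuf = malloc(size);
    if (d->dmabuf == NULL) {
        pthread_mutex_destroy(&d->mutex);
        return -1;
    }
    d->bytes_in_use = size;
    d->neutral_byte = 0;
    return 0;
}

/* Close path (assumption): the buffer is released under the same lock. */
void demo_dmap_close(struct demo_dmap *d)
{
    pthread_mutex_lock(&d->mutex);
    free(d->dmabuf);
    d->dmabuf = NULL;
    d->bytes_in_use = 0;
    pthread_mutex_unlock(&d->mutex);
}

/* Interrupt-handler analogue: the NULL test and the memset happen inside
 * one critical section, so the buffer cannot vanish between them.
 * Returns 1 if the buffer was filled, 0 if there was no buffer. */
int demo_output_intr(struct demo_dmap *d)
{
    int filled = 0;

    pthread_mutex_lock(&d->mutex);
    if (d->dmabuf != NULL) {
        memset(d->dmabuf, d->neutral_byte, d->bytes_in_use);
        filled = 1;
    }
    pthread_mutex_unlock(&d->mutex);
    return filled;
}
```

Calling demo_output_intr() after demo_dmap_close() simply returns 0 instead of writing through a stale pointer. Note that taking the lock before the test only helps if whatever frees the buffer also holds that lock.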
leres
New Member
 
Posts: 3
Joined: Sat Jun 12, 2010 6:02 pm

Re: FreeBSD 7.2-RELEASE/amd64 page fault while in kernel mode

Post by cesium » Mon Jul 05, 2010 5:55 am

Thanks for debugging this! As you can see in the forum, some oss4/freebsd users have crashes. I tried to help them, but I'm just a mod... (Also I use Linux, and debugging via VM is painful when your CPU doesn't have VT). I'll ask the others to test this diff - maybe this will help them too.

Now, I'm not sure this is the most correct place for a patch - I never saw any similar crash reports under Linux, so perhaps the real problem might be in oss4's FreeBSD wrapper** (or maybe there are subtle differences between FBSD/Linux which trigger this only on FBSD?). Nonetheless, as you said, the cost here is very small...

** kernel/OS/FreeBSD, setup/FreeBSD/oss/build/osscore.c .

As for attachments, I've enabled plain text files now. I prefer inline though if it's a small text.
cesium
Developer
 
Posts: 902
Joined: Sun Aug 12, 2007 12:51 am

Re: FreeBSD 7.2-RELEASE/amd64 page fault while in kernel mode

Post by leres » Mon Jul 05, 2010 6:32 am

cesium wrote:Now, I'm not sure this is the most correct place for a patch - I never saw any similar crash reports under Linux, so perhaps the real problem might be in oss4's FreeBSD wrapper** (or maybe there are subtle differences between FBSD/Linux which trigger this only on FBSD?). Nonetheless, as you said, the cost here is very small...

** kernel/OS/FreeBSD, setup/FreeBSD/oss/build/osscore.c .

I agree; this particular crash is in a FreeBSD-only OSS module, as is my patch; this code is not compiled or used under Linux.

Regarding the performance/latency cost of my patch: in the normal case, the same number of instructions is executed. It's only when an interrupt happens during the close of a device that dmap->dmabuf can be NULL. This is fairly rare (it has only happened 4 times in the month I've been running OSS on this system).

cesium wrote:As for attachments, I've enabled plain text files now. I prefer inline though if it's a small text.

Thanks!

Re: FreeBSD 7.2-RELEASE/amd64 page fault while in kernel mode

Post by cesium » Mon Jul 05, 2010 7:00 am

leres wrote:I agree; this particular crash is in a FreeBSD only oss module, as is my patch; this code is not compiled or used under linux.

If it's applied it will be. Most OSS code is target-agnostic, with build-time wrappers (the kernel/OS/... subdir) and sometimes link-time wrappers (setup/...../osscore.c). kernel/framework/audio/oss_audio_core.c is shared across targets; I think it's compiled into the osscore module. So this will be used on Linux/Solaris/etc. targets. Anyhow, I'm not opposing the patch: once it has had a bit more testing, I'll try to send the diff to the devs (you may wish to do so yourself, of course).
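
To illustrate what I mean by build-time wrappers, here is a rough sketch of the general pattern: shared code calls a macro, and a per-target header decides what the macro expands to. Everything here is hypothetical (the DEMO_* macros, DEMO_TARGET_KERNEL, and the demo_engine struct are made up for the example and are not the real OSS definitions); only a user-space pthread mapping is filled in.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* --- Hypothetical per-target wrapper layer ------------------------------ */
#if defined(DEMO_TARGET_KERNEL)
/* A kernel target would map these onto its native locking primitives. */
#error "kernel mapping intentionally left out of this sketch"
#else
/* User-space mapping: 'flags' is unused because there is no IRQ state. */
typedef pthread_mutex_t demo_mutex_t;
#define DEMO_MUTEX_INIT(m)         pthread_mutex_init(&(m), NULL)
#define DEMO_MUTEX_ENTER(m, flags) \
    do { (void)(flags); pthread_mutex_lock(&(m)); } while (0)
#define DEMO_MUTEX_EXIT(m, flags) \
    do { (void)(flags); pthread_mutex_unlock(&(m)); } while (0)
#endif

/* --- Shared, target-agnostic code ---------------------------------------
 * This part never names a specific OS; it only uses the wrapper macros,
 * so a change here affects every target that compiles it. */
struct demo_engine {
    demo_mutex_t mutex;
    int interrupts_seen;
};

int demo_engine_init(struct demo_engine *e)
{
    assert(e != NULL);
    e->interrupts_seen = 0;
    return DEMO_MUTEX_INIT(e->mutex);
}

/* Count one interrupt under the lock and return the running total. */
int demo_shared_intr(struct demo_engine *e)
{
    unsigned long flags = 0;
    int total;

    DEMO_MUTEX_ENTER(e->mutex, flags);
    total = ++e->interrupts_seen;
    DEMO_MUTEX_EXIT(e->mutex, flags);
    return total;
}
```

The point is just that demo_shared_intr() is written once and picks up whatever the target's wrapper provides, which is why a patch to shared code ends up on all targets.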

Re: FreeBSD 7.2-RELEASE/amd64 page fault while in kernel mode

Post by adamk » Thu Jul 29, 2010 7:09 pm

cesium asked me to test this on IRC. I'm not sure if it's related, but I did just get a lockup, even with this patch:

Code:
bt
Tracing pid 12 tid 100033 td 0xc57c2580
oss_memset(ea39d000,0,10000,4,e000,...) at oss_memset+0x14
do_outputintr(c6915408,0,1000,0,e000,...) at do_outputintr+0x278
audio_outputintr(0,0,0,1000,c52abc44,...) at audio_outputintr+0xd0
sbliveintr(c6767c08,1b3c7a,c6beca40,c55c6a00,c52abcb4,...) at sbliveintr+0x413
ossintr(c6ead840,0,109,d0ef0acb,3f4,...) at ossintr+0x20
intr_event_execute_handlers(c55807f8,c55c6a00,c0cce422,52d,c55c6a70,...) at intr_event_execute_handlers+0x14b
ithread_loop(c57b21e0,c52abd28,0,e2141244,3a,...) at ithread_loop+0x6b
fork_exit(c08862a0,c57b21e0,c52abd28) at fork_exit+0x91
fork_trampoline() at fork_trampoline+0x8
--- trap 0, eip = 0, esp = 0xc52abd60, ebp = 0 ---




Adam
adamk
Member
 
Posts: 78
Joined: Fri Jun 11, 2004 1:50 pm

