USB port hangs using Coral AI accelerator

Yuk! That's unfortunate…

If you give me some ideas/direction, I can be your hands - or I can offer remote connectivity (SSH access to a host which the Khadas is attached to) - I will connect the serial console and network.

Update -

I can make the USB/Coral go bad by writing data to the NVME disk and doing AI tasks at the same time (this also happens on USB storage!)

Hello @RichardPar

When the error occurs could you provide the dmesg log?

There is no change in the logs when the errors start… below is what is in the log -

When the Coral test is started, the reset SuperSpeed line appears (along with the 3 other lines).
It does not change when the error occurs.

[66642.027349] usb 2-1: reset SuperSpeed Gen 1x2 USB device number 2 using xhci-hcd
[66642.047234] usb 2-1: LPM exit latency is zeroed, disabling LPM.
[66642.047489] xhci-hcd xhci-hcd.0.auto: ##### crg set max_burst 0
[66642.048162] xhci-hcd xhci-hcd.0.auto: ##### crg set max_burst 0
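
One way to check that no further resets are logged during a longer run is to count the reset lines per USB port. This is only a sketch: the helper name `count_usb_resets` is made up here, and it assumes the kernel messages keep the `usb <bus>-<port>: reset ...` form shown above.

```shell
# Count "reset" messages per USB device path in a kernel log read from stdin.
# Prints one line per device, e.g. "2-1 1".
count_usb_resets() {
    awk '/usb [0-9]+-[0-9.]+: reset / {
             # the device path is the field right after the word "usb"
             for (i = 1; i <= NF; i++) if ($i == "usb") { dev = $(i + 1); break }
             sub(/:$/, "", dev)
             n[dev]++
         }
         END { for (d in n) print d, n[d] }'
}
```

Running `dmesg | count_usb_resets` before the test and again after the Coral starts failing would show whether the count changed.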

Any assistance? Or should I just take this as ‘wasted money’ and go back to Raspberry Pi? The current Linux kernel is not fit for purpose. I have 5 VIM4s and planned to get 20 more -

It's a shame it doesn't run HomeAssistant+AI properly.

As I understand your situation,
the VIM4 supplies power for NVMe + WiFi + Coral AI USB etc., right?

The reason can be anything, but most of the time it is the power supply.

Maybe the problem is trivial: not enough power for everything at peak load?

What about testing just the Coral alone?

Or providing separate additional power for the USB device?

Thanks…
It's just VIM4+NVME+Coral (no WiFi) -

The USB reset occurs as libusb gets the device into a known state to upload firmware.

It's not the power supply - it happens on an isolated PSU - the VIM4 is powered by a 25 W USB-C PD PSU.
Coral alone works… (as far as I can see)
Adding load on the CPU/disk (NVMe) seems to cause the USB stack handler to slow down. The same problem occurs on the USB 3 and USB 2 sockets (both through independently powered USB hubs). Other devices seem to work, but I don't have another USB device that uses the same libusb bulk I/O functionality, so I can't remove the Coral from the setup.

I took out the NVMe and replaced it with a powered USB hard disk - the problem persisted. (VIM4+USB-HDD+Coral)

I set the disk I/O throttle in cgroups to limit writes to disk to 10 MB/s - the problem persisted. To me, it looks like the problems occur when Linux writes back the dirty cache (but that is an observation; nothing scientific)
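
For anyone repeating the throttling test, here is a rough sketch of what such a limit looks like, assuming cgroup v2 mounted at `/sys/fs/cgroup` with the `io` controller enabled (the thread does not say which cgroup version was used). The helper name `io_max_write_limit`, the cgroup name `ddtest` and the device number `259:0` are placeholders. On cgroup v1 the throttle generally does not apply to buffered writeback, which may be one reason a limit appears to have no effect.

```shell
# Build a cgroup v2 io.max entry that caps write bandwidth on one block device.
# $1 = device as MAJOR:MINOR (see `lsblk -o NAME,MAJ:MIN`), $2 = limit in MB/s.
io_max_write_limit() {
    dev=$1
    mb_per_s=$2
    echo "$dev wbps=$((mb_per_s * 1000 * 1000))"
}

# Hypothetical usage as root (names and numbers are examples only):
#   echo "+io +memory" > /sys/fs/cgroup/cgroup.subtree_control
#   mkdir /sys/fs/cgroup/ddtest
#   io_max_write_limit 259:0 10 > /sys/fs/cgroup/ddtest/io.max
#   echo $$ > /sys/fs/cgroup/ddtest/cgroup.procs   # then start dd from this shell
```

Watching `Dirty` and `Writeback` in `/proc/meminfo` during the run would show whether writeback bursts still happen under the limit.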

I changed the PSU to a 5 V/20 A supply - there is no reason for power to be a problem…

After 2 minutes of tests, it just rebooted itself

My suggestion for narrowing down the problem: test only a minimal configuration, such as VIM4 with eMMC + USB Coral + original power adapter!

After those results we can move on to the next step.

There is no ‘original power adapter’ - all the boards come in a tiny box with antennas.
I will repeat with eMMC

I used dd to create a 10GB file …

to NVME - Coral fails after 5GB+ transferred to drive

to eMMC - Coral fails after
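
A sketch of how this test could be repeated while recording memory counters, so the failure point can be lined up with `MemFree`, `Cached` and `Dirty`. The helper names `meminfo_kb` and `watch_mem` are invented for this example; the original dd command line was not posted, so the one in the note below is an assumption.

```shell
# Print one field (in kB) from a meminfo-style file; defaults to /proc/meminfo.
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

# Log MemFree, Cached and Dirty once per second until interrupted.
watch_mem() {
    while :; do
        echo "$(date +%s) free=$(meminfo_kb MemFree) cached=$(meminfo_kb Cached) dirty=$(meminfo_kb Dirty)"
        sleep 1
    done
}
```

Start `watch_mem > mem.log &`, then write the file with something like `dd if=/dev/zero of=<path on the target disk> bs=1M count=10240` and run the Coral test in a loop alongside it.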

After the failure, the device takes a long time to recover… the example below shows it going from the error state to working.

lsusb shows the USB device as present…and dmesg does not show any USB plugging events.
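
Since lsusb still lists the device, it can help to note which USB ID it shows. As far as I know, the Coral USB Accelerator enumerates as `1a6e:089a` before the runtime uploads its firmware and as `18d1:9302` afterwards; treat those IDs as an assumption to verify against your own lsusb output. The helper name `coral_state` is made up for this sketch.

```shell
# Classify the Coral USB Accelerator state from `lsusb` output on stdin.
# Assumed IDs: 1a6e:089a = before firmware upload, 18d1:9302 = after.
coral_state() {
    awk '
        /ID 18d1:9302/ { state = "runtime" }
        /ID 1a6e:089a/ { if (state == "") state = "bootloader" }
        END { print (state == "" ? "absent" : state) }
    '
}
```

`lsusb | coral_state` run during the error state and again after recovery shows whether the device dropped back to its pre-firmware ID in between.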

Memory Plot

Red arrow shows the place where the USB/Coral errors start happening (Writing to eMMC)

Writing to NVME - Same thing, just quicker :smiley:

UPDATE:

When Coral is not working, dropping the cache makes it work again…

root@Khadas:/home/khadas# echo 3 > /proc/sys/vm/drop_caches

I saw messages like ‘resources exhausted’ - that means something is wrong with software memory allocation etc.

Looks like it's not a USB port problem? :wink: What do you think?

I think it's a memory issue inside the kernel - I just don't know where to start looking. I don't know why the cache is causing the problem - below ~1GB of MemFree it doesn't behave well.

The same problem happens on USB2 and USB3 - which are different drivers in the kernel (one is xhci and the other is DWC3)

I am running a test now with min_free_kbytes set to 2GB … seeing if it survives

UPDATE:

echo 500000 > /proc/sys/vm/min_free_kbytes

The Coral/USB has not gone bad yet … (yes! it's a totally silly number!) MemFree is hovering around 1.4GB
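
For reference, `min_free_kbytes` is given in kB, so 500000 works out to roughly 488 MiB rather than 2 GB. A small sketch of the conversion, with the helper names `gib_to_kb` and `kb_to_mib` invented for the example:

```shell
# Convert a size in GiB to the kB value that vm.min_free_kbytes expects.
gib_to_kb() {
    echo $(( $1 * 1024 * 1024 ))
}

# Convert a min_free_kbytes value back to MiB (integer division).
kb_to_mib() {
    echo $(( $1 / 1024 ))
}
```

Here `kb_to_mib 500000` gives 488 and `gib_to_kb 2` gives 2097152, so a real 2 GiB reserve would be `sysctl -w vm.min_free_kbytes=$(gib_to_kb 2)` as root. Very large values can themselves cause out-of-memory trouble, so this is something to experiment with, not a recommended setting.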

Thanks for the exploration! I will check it on my side and try to provide a solution and suggestions for similar problems.

PS: please check the system logs for oom-killer matches

PPS: please share your logs as plain text
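
A minimal sketch of such a check, assuming the usual kernel wording for OOM events ("invoked oom-killer", "Out of memory: Killed process ..."). The helper name `oom_lines` is made up here.

```shell
# Print kernel log lines that suggest the OOM killer ran; log text on stdin.
oom_lines() {
    grep -E -i 'out of memory|oom-killer|oom_reaper|killed process'
}
```

`dmesg | oom_lines` (or `journalctl -k | oom_lines` on systemd systems) prints nothing if no OOM kill was logged, and `dmesg > dmesg.txt` gives a plain-text log to attach.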

No OOM tasks have been executed…

My application just starts, loads the AI model, runs a picture of a parrot and exits

Early boot memory looks different - is it meant to?

Android

earlycon: aml-uart0 at MMIO 0x00000000fe078000 (options '')
[ 0.000000@0] printk: bootconsole [aml-uart0] enabled
[ 0.000000@0] 08400000 - 08500000, 1024 KB, ramoops@0x07400000
[ 0.000000@0] CMA pool @0x0000000005000000, size 52 MiB need clear mmu map
[ 0.000000@0] 05000000 - 08400000, 53248 KB, linux,secmon
[ 0.000000@0] 40000000 - 41000000, 16384 KB, linux,dsp_fw
[ 0.000000@0] 3f800000 - 40000000, 8192 KB, linux,meson-fb
[ 0.000000@0] CMA pool @0x00000000c0400000, size 508 MiB need clear mmu map
[ 0.000000@0] c0400000 - e0000000, 520192 KB, linux,codec_mm_cma
[ 0.000000@0] CMA pool @0x00000000a5400000, size 432 MiB need clear mmu map
[ 0.000000@0] a5400000 - c0400000, 442368 KB, linux,nvme_ssd
[ 0.000000@0] node linux,di_cma compatible matching fail
[ 0.000000@0] Reserved memory: created DMA memory pool at 0x00000000e0000000, size 0 MiB
[ 0.000000@0] e0000000 - e0000000, 0 KB, linux,ppmgr
[ 0.000000@0] 9d400000 - a5400000, 131072 KB, linux,isp_cma
[ 0.000000@0] 99400000 - 9d400000, 65536 KB, linux,adapt_cma
[ 0.000000@0] 91400000 - 99400000, 131072 KB, linux,cam_cma
[ 0.000000@0] 87c00000 - 91400000, 155648 KB, linux,ion-dev
[ 0.000000@0] 7ac00000 - 87c00000, 212992 KB, linux,ion-fb
[ 0.000000@0] 79800000 - 7ac00000, 20480 KB, linux,vdin1_cma

Ubuntu Server
[ 0.000000@0] Machine model: Khadas VIM4
[ 0.000000@0] earlycon: aml-uart0 at MMIO 0x00000000fe078000 (options '')
[ 0.000000@0] printk: bootconsole [aml-uart0] enabled
[ 0.000000@0] 08400000 - 08500000, 1024 KB, ramoops@0x07400000
[ 0.000000@0] CMA pool @0x0000000005000000, size 52 MiB need clear mmu map
[ 0.000000@0] 05000000 - 08400000, 53248 KB, linux,secmon
[ 0.000000@0] 40000000 - 41000000, 16384 KB, linux,dsp_fw
[ 0.000000@0] 3f800000 - 40000000, 8192 KB, linux,meson-fb
[ 0.000000@0] CMA pool @0x00000000c5000000, size 432 MiB need clear mmu map
[ 0.000000@0] c5000000 - e0000000, 442368 KB, linux,codec_mm_cma
[ 0.000000@0] node linux,di_cma compatible matching fail
[ 0.000000@0] Reserved memory: created DMA memory pool at 0x00000000e0000000, size 0 MiB
[ 0.000000@0] e0000000 - e0000000, 0 KB, linux,ppmgr
[ 0.000000@0] bd000000 - c5000000, 131072 KB, linux,isp_cma
[ 0.000000@0] bb800000 - bd000000, 24576 KB, linux,adapt_cma
[ 0.000000@0] b2000000 - bb800000, 155648 KB, linux,ion-dev
[ 0.000000@0] node linux,ion-fb compatible matching fail
[ 0.000000@0] b0c00000 - b2000000, 20480 KB, linux,vdin1_cma
[ 0.000000@0] 21fc00000 - 220000000, 4096 KB, linux,ldc_mem
[ 0.000000@0] cma: Reserved 8 MiB at 0x00000000b0400000

Debian 10
[ 0.000000@0] Machine model: Khadas VIM4
[ 0.000000@0] earlycon: aml-uart0 at MMIO 0x00000000fe078000 (options '')
[ 0.000000@0] printk: bootconsole [aml-uart0] enabled
[ 0.000000@0] swiotlb,default value: noforce
[ 0.000000@0] swiotlb,dts value: normal
[ 0.000000@0] 08400000 - 08500000, 1024 KB, ramoops@0x07400000
[ 0.000000@0] CMA pool @0x0000000005000000, size 52 MiB need clear mmu map
[ 0.000000@0] 05000000 - 08400000, 53248 KB, linux,secmon
[ 0.000000@0] 40000000 - 41000000, 16384 KB, linux,dsp_fw
[ 0.000000@0] 3f800000 - 40000000, 8192 KB, linux,meson-fb
[ 0.000000@0] CMA pool @0x00000000c5000000, size 432 MiB need clear mmu map
[ 0.000000@0] c5000000 - e0000000, 442368 KB, linux,codec_mm_cma
[ 0.000000@0] node linux,di_cma compatible matching fail
[ 0.000000@0] Reserved memory: created DMA memory pool at 0x00000000e0000000, size 0 MiB
[ 0.000000@0] e0000000 - e0000000, 0 KB, linux,ppmgr
[ 0.000000@0] bd000000 - c5000000, 131072 KB, linux,isp_cma
[ 0.000000@0] bb800000 - bd000000, 24576 KB, linux,adapt_cma
[ 0.000000@0] b2000000 - bb800000, 155648 KB, linux,ion-dev
[ 0.000000@0] node linux,ion-fb compatible matching fail
[ 0.000000@0] b0c00000 - b2000000, 20480 KB, linux,vdin1_cma
[ 0.000000@0] 21fc00000 - 220000000, 4096 KB, linux,ldc_mem
[ 0.000000@0] cma: Reserved 8 MiB at 0x00000000b0400000
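
To compare the three boots more easily, the reserved regions can be totalled per log. This sketch assumes the region lines keep the `START - END, <N> KB, <name>` layout shown above and ignores the separate "CMA pool" and "cma: Reserved" lines; the helper name `sum_reserved_kb` is invented for the example.

```shell
# Sum the sizes (in KB) of the reserved regions in an early-boot log on stdin.
# Only lines of the form "START - END, <N> KB, <name>" are counted.
sum_reserved_kb() {
    awk -F', ' '/ - [0-9a-f]+, [0-9]+ KB, / { split($2, a, " "); total += a[1] }
                END { print total + 0 }'
}
```

Saving each boot log to a file and running `sum_reserved_kb < android-boot.txt` (file name is just an example) gives one number per image to compare.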

I don't believe it's the memory anymore - I ran memtester on all the RAM and things work.

I am now thinking there is a kernel option causing this - I think the buffer/cache flushing is taking priority and causing userland/IRQs to not be serviced.

I am currently doing a criminal hack… and the problem has not occurred (as the page cache is never allowed to build up) - but this is not a fix!

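#!/bin/sh
# Workaround, not a fix: drop clean caches every 10 seconds (run as root).

while :
do
    # 3 = free the page cache plus dentries and inodes
    echo 3 > /proc/sys/vm/drop_caches
    sleep 10
done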
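
A slightly less aggressive variant, as a sketch only: drop caches just when free memory is low, and sync first so dirty pages get written out and can then be freed. The helper names and the 1 GiB default threshold are assumptions, the latter loosely based on the "~1GB" observation earlier in the thread. It is still a workaround.

```shell
# Succeeds when the first value (kB) is below the second (kB).
below_threshold() {
    [ "$1" -lt "$2" ]
}

# One iteration: drop caches only if MemFree is under the threshold.
# Must run as root. $1 = threshold in kB (default 1048576 = 1 GiB, a guess).
drop_if_low() {
    threshold_kb=${1:-1048576}
    free_kb=$(awk '$1 == "MemFree:" { print $2 }' /proc/meminfo)
    if below_threshold "$free_kb" "$threshold_kb"; then
        sync                                 # write dirty pages out first
        echo 3 > /proc/sys/vm/drop_caches
    fi
}
```

It could replace the body of the loop above: `while :; do drop_if_low; sleep 10; done`.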


Hello @RichardPar

Could you reproduce this issue with devices other than the Coral?

Thank you for your temporary solution, it works for me. I have almost the same problem with Edge2 + Gemini2 (Orbbec depth camera). When running a ROS node on the Edge2 for a while, the USB seems to be reset and I get messages like these in dmesg:

[ 3118.406616] usb 8-1: reset SuperSpeed Gen 1 USB device number 2 using xhci-hcd
[ 3118.424211] uvcvideo: Unknown video format 20343159-0000-0010-8000-00aa00389b71
[ 3118.424231] uvcvideo: Found UVC 1.10 device Orbbec(R) DaBai DCL(TM) (2bc5:0701)
[ 3118.426814] uvcvideo: Found UVC 1.10 device Orbbec(R) DaBai DCL(TM) (2bc5:0701)
[ 3118.429363] uvcvideo: Found UVC 1.10 device Orbbec(R) DaBai DCL(TM) (2bc5:0701)
[ 3120.375584] usb 8-1: usbfs: process 20472 (component_conta) did not claim interface 0 before use
[ 3125.375904] usb 8-1: usbfs: process 20498 (component_conta) did not claim interface 0 before use
[ 3130.376150] usb 8-1: usbfs: process 20524 (component_conta) did not claim interface 0 before use