Which system do you use? Android, Ubuntu, OOWOW or others?
Ubuntu
Which version of system do you use? Khadas official images, self built images, or others?
vim3-ubuntu-20.04-gnome-linux-4.9-fenix-1.1.1-220725-emmc.img, and “apt full-upgrade” last week
Please describe your issue below:
I set up the 4G module according to the Khadas documentation. The network works, but I see errors in the dmesg log, and after several hours the connection drops and reconnects. After that, the network speed is slower. I tried adding “vm.min_free_kbytes = 32768” to “/etc/sysctl.conf”; this delays the error but does not permanently solve the problem.
This error occurs immediately when the memory is occupied. It looks a bit like a memory leak in the network.
@ivan.li, I think this error only occurs when the RAM is fully occupied. When plenty of memory is free, the network module has a lot of memory available, so even if there is a “memory leak” problem, we do not see the error. Did you test with the memory occupied? What is your current kernel version? I will try to update to the same version. Thank you.
I tried the version you sent (kernel version 4.9.241 after updating), but the error still occurs when the RAM is occupied. You can use the command in a tmux session to occupy the memory:
Hello @ivan.li, I need to run an AI inference program which uses most of the memory. While the program is running, it has no effect on the use of other units or other network modules such as Wi-Fi, Ethernet or other 4G USB sticks.
I tried setting “vm.min_free_kbytes = 32768” in “/etc/sysctl.conf”, which forces the kernel’s memory manager to keep at least 32 MB of memory free so that the system itself keeps functioning properly. The reserved memory will not be allocated to other programs.
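To check that the setting is actually in effect and how close the system is to the reserve, something like the following might help. It is only a rough sketch: the helper names `mem_free_kb` and `headroom_kb` are made up for illustration, and it simply compares the `MemFree` line of `/proc/meminfo` with the value in `/proc/sys/vm/min_free_kbytes`.

```python
import re
from pathlib import Path


def mem_free_kb(meminfo_text):
    """Return the MemFree value (in kB) from /proc/meminfo-style text."""
    match = re.search(r"^MemFree:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemFree line not found")
    return int(match.group(1))


def headroom_kb(meminfo_text, min_free_kb):
    """kB of free memory above the reserve; a negative value means below it."""
    return mem_free_kb(meminfo_text) - min_free_kb


if __name__ == "__main__":
    # Both paths are standard Linux procfs entries.
    meminfo = Path("/proc/meminfo").read_text()
    reserve = int(Path("/proc/sys/vm/min_free_kbytes").read_text())
    print(f"min_free_kbytes = {reserve} kB ({reserve / 1024:.0f} MB)")
    print(f"MemFree headroom above reserve = {headroom_kb(meminfo, reserve)} kB")
```

If the first line does not print 32768 kB, the value from “/etc/sysctl.conf” has probably not been loaded yet.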
In this case, the error with the Quectel 4G module did not appear immediately, but only after about 2 days. That is to say, even if the memory does not run out, it is slowly occupied by the 4G module over time.
As in the previous stress test, we only occupied the memory; there was no memory leak or other memory problem on our side, yet the 4G module still had an error. This is not normal.
I think the driver or the kernel has a memory problem with the 4G module that gradually fills up the reserved memory. Currently I can only resolve this error temporarily by rebooting. I will do more tests to see where the memory leak is.
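For those tests, one option is to log a few `/proc/meminfo` fields over time and see whether any of them grows steadily. Below is a minimal sketch, not a proven diagnostic: the function names and the one-minute interval are arbitrary choices, and it assumes that a kernel-side leak would show up as growth in `SUnreclaim` (unreclaimable slab), which is common but not guaranteed, since allocations made directly from the page allocator do not appear there.

```python
import time
from pathlib import Path

# Fields to track; all are standard /proc/meminfo entries reported in kB.
FIELDS = ("MemFree", "MemAvailable", "Slab", "SUnreclaim")


def read_fields(meminfo_text, fields=FIELDS):
    """Extract the selected fields (kB) from /proc/meminfo-style text."""
    values = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        if name in fields:
            values[name] = int(rest.split()[0])
    return values


def growth_kb_per_hour(samples):
    """Least-squares slope of (seconds, kB) samples, converted to kB/hour."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("samples must span more than one timestamp")
    return num / den * 3600


if __name__ == "__main__":
    history = []
    start = time.time()
    while True:  # stop with Ctrl+C
        values = read_fields(Path("/proc/meminfo").read_text())
        history.append((time.time() - start, values["SUnreclaim"]))
        line = " ".join(f"{k}={v}kB" for k, v in values.items())
        if len(history) >= 2:
            line += f" SUnreclaim_trend={growth_kb_per_hour(history):.1f}kB/h"
        print(line, flush=True)
        time.sleep(60)
```

A trend that stays clearly positive over many hours, with the inference program running at a stable memory level, would point towards the kernel side rather than the application.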
Hello @ivan.li, thanks a lot. I will wait for the test results on your side.
Is it possible to reboot/reset the 4G module without rebooting the VIM3? For example, by sending AT commands, or by restarting the driver or the network unit? At the moment we can only fix this problem temporarily by rebooting the VIM3 each day, which is inconvenient.
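On the AT command idea: `AT+CFUN=1,1` is the standard 3GPP command that asks a modem to restart itself, so a first attempt might look like the sketch below. This is untested on the VIM3 and rests on several assumptions: the port path `/dev/ttyUSB2` is only a guess at where the module exposes its AT interface, the `send_at` helper is made up for illustration, the port is assumed to be already configured by the system, and no other service (a modem manager, for example) is assumed to be holding it.

```python
import os
import time

# Standard 3GPP "set full functionality and reset" command, CR-terminated.
RESET_COMMAND = b"AT+CFUN=1,1\r"


def send_at(port_path, command=RESET_COMMAND, wait_s=1.0):
    """Write one AT command to the port and return whatever bytes come back."""
    fd = os.open(port_path, os.O_RDWR | os.O_NOCTTY | os.O_NONBLOCK)
    try:
        os.write(fd, command)
        time.sleep(wait_s)  # give the modem a moment to answer
        try:
            return os.read(fd, 256)
        except BlockingIOError:
            return b""  # nothing available yet on a non-blocking port
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Hypothetical AT port; adjust to the device node the module really uses.
    reply = send_at("/dev/ttyUSB2")
    print(reply.decode(errors="replace"))
```

Whether a module reset actually clears the error is an open question here; if the memory is held by the kernel driver, the reset may not release it.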
Hi~ @JJ1997
Sorry for the late reply. This problem is in my work plan for this week. I have run this test a few times before. The most illustrative run was a two-day, two-night test. According to the data, the 4G module does not seem to be the real culprit.
Can I offer you something that will allow me to actually recreate your usage scenario? Let me look for more clues.
Hi @ivan.li
Thanks for your efforts. You have reproduced the problem but did not find the cause, do I understand correctly?
I think you meant that we should offer you something, rather than you offering us something. If so: our program involves our IP (code, algorithms, AI models, etc.), it is not easy to install and configure by yourselves, and it is difficult for us to provide it to you so that you can recreate our usage scenario.
We will try to provide as much of the information you need as we can. Our program runs inference on the NPU 24/7. The previous stress test, which exhausted the memory, can also reproduce the problem. In the idle case (as in your screenshot, I think), we have not observed this issue. Could you tell us more about your current progress and the information you need?
Thanks a lot!
We haven’t been able to reproduce this issue with only the 4G module running and no other applications.
So the issue occurs only when memory is low, right? That is, it occurred while your AI inference program was using most of the memory. Can you try to optimize your program to use less memory? Have you checked your program’s memory usage?
If you suspect a memory leak in the 4G module driver, then we need another way to reproduce this issue.
@numbqq, we previously ran our program with 4G USB sticks, Wi-Fi, and Ethernet, and we didn’t experience any issues like this. Now that we are switching to the 4G M.2 module, we have this issue.
When the memory is occupied by our program, the program doesn’t cause system freezes or crashes. As the 4G module is part of the basic operation of the system, I don’t think this behaviour is normal.
Even if we limit the memory used by our program, the error doesn’t appear right away, but it does appear later, after 24 hours for example. Our program restarts each day, so if it were the cause, the error should either appear within the 24-hour window between restarts or not at all; instead, the error is always there after the first 24 hours. The only solution we have found so far is to reboot Ubuntu each day.
So far we have seen the issue only when memory is low. We haven’t checked our program’s memory usage, but a normal stress test can cause this problem too. Even if we optimize our program’s memory usage, I’m afraid we will end up in the same situation as when limiting memory: the error doesn’t appear right away, but after a while it will.
Hello @numbqq, based on other discussions on the Internet and our previous experience running the program with other network modules, we suspect there is a problem with the driver or the kernel. Do you have any official examples for running AI inference on the VIM3, so that the issue can be reproduced without our program? As we don’t have much experience, it will be difficult for us to do a memory analysis of our program, and it would take a long time. Thanks a lot.