[Khadas WiP] VIM4 NVMe IO Errors

Unfortunately, @ps23Rick has even had problems with a Kingston A2000 https://forum.khadas.com/t/khadas-wip-vim4-nvme-io-errors/16572/23

So that brings Nick’s count of working SSDs down to zero and all SSDs tested fail. Well, that makes searching for SSD differences (as tried above with APST) 100% pointless.

Regardless of whether you VIM4 users run Android, Ubuntu, Debian or Arch/Manjaro, the kernel you are all using is Amlogic’s most recent forward-ported version (from 4.9, and before that from at least 3.14, 3.10 and whatnot), which is 5.4.125. There’s nothing else and there won’t be anything else anytime soon or at all.

One huge problem with these vendor kernels is that the SoC vendor’s employees have been forward porting the code base since forever and likely simply skipped patches here and there when merge conflicts occurred.

Nobody knows how much code and which areas this affects unless somebody takes the time and effort to rebase this Amlogic kernel on a clean 5.4.125 LTS. This is not an Amlogic problem but one of ‘ARM SoCs originating from the Android world’ in general, and as such it applies to e.g. Allwinner or Rockchip as well: The radxa bsp kernel patches : from 5.10.67 to 5.10.123 - ROCK 5 Series - Radxa Forum

Well, 26 days ago Nick said they identified the issue and are working on it…

I won’t hold my breath then.

It is a shame because it was the only reason I bought the thing.

Wow… you’ve all been busy since I checked in last. I guess I’m just wondering if the next reasonable step to try and figure this out would be to do what was suggested above and take the time to re-baseline on a vanilla/fresh kernel source tree and slowly apply the respective bits of SOC specific code, test — lather, rinse, repeat.

Unfortunately it seems like this path would require assistance (perhaps lots) from the SOC vendor who has the intimate knowledge of the innards of the parts and that Khadas is just a middleman that can only try to fix things and not the actual source of the problems so to speak… Interesting … all of this… It’s too bad that Khadas is stuck in the middle. I just wonder how much help the SOC vendor would be willing to provide in an effort to make more future sales…?

Maybe I’m off on these assumptions — I’ll have to go back and re-read these posts again. I think it’s sad that Khadas and Nick have to try their best to try to fix the problems that have been put in their lap because of not so great things being done by the SOC supplier. (My interpretation)

These are all the issues with SOCs and vendors that Armbian was created to address. I’ve seen Armbian and Igor, Balbes, etc. criticized for their zealous adherence to the mainline kernel. This seems misplaced. Mainline kernel support is CORE to the mission of Armbian’s existence.

The position that Khadas straddles, as a producer of amazing SOC hardware, in this case a practical NVMe SSD HW interface, perfectly illustrates the value of pursuing that mission.

I believe that much of the first bootstrap of Fenix images is inspired by the Armbian build and install system. These contributions help everyone.

I look forward to a future date where these issues have been solved for the Amlogic A311D2 with Armbian, and we can wring the full potential from the Khadas efforts.

— Jeremiah

That would be Amlogic. Why would they care about anything that doesn’t result in selling millions of SoCs?

Here are Christian’s insights on what to expect from Amlogic and this latest version of their BSP kernel mess (and why A311D2 isn’t just an A311D with an added 2 but something new and entirely different):

What’s at the heart of VIM4 will soon be selling in the millions as this thing. That’s Amlogic’s market and not a few thousand SBCs here or there.

Given the crappy I/O capabilities of this Amazon box, most probably the whole PCIe thing inside this SoC has only been tested with Wi-Fi chipsets and not storage/NVMe. But who knows? At least it should be well understood that Amlogic only does and cares about TV boxes and the like, and that I/O with Amlogic was always crap and will most probably remain crap for the foreseeable future. After all, why would a TV box need decent I/O capabilities?

You have made a number of very valid points there @tkaiser.

I was really keen on getting Debian installed on mine but another thread on here about that subject mentions that BayLibre don’t consider the A311D2 to be a priority and I suspect other OS providers would be of the same opinion.

So, if Amlogic and the OS providers can’t or won’t sort it, where does that leave us?

I guess it rests on Khadas, seeing as they sold the device as having NVMe capability.

I’m sure it will be rather expensive for them to get it sorted but maybe even more so if it can’t be resolved.

I live in hope that the problem is resolved before the product becomes irrelevant.

I believe the USB3/PCIe IP blocks in A311D2 are the same as before. Amlogic had only USB Hi-Speed (USB2) prior to the G12B/SM1 families, but starting with S922X/A311D/S905X3/S905D3/S905Y3 there was this single PCIe Gen2 lane pinmuxed with USB3. Since I/O is zero priority for Amlogic (STB / TV boxes), most probably it’s still the same controllers in T7 (A311D2).

As such, hopefully the Khadas guys are busy bisecting stuff (comparing what changed between Amlogic’s 4.9 and 5.4 BSP kernels), and if it’s the same PCIe controller (licensed from Synopsys, like everyone else’s) then chances are good that there will be a fix making NVMe eventually usable.
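To illustrate the ‘comparing what changed’ part, here is a minimal sketch in Python that lists which files differ between two checked-out source trees. It is only an illustration, not what Khadas actually does: the function name `compare_trees` and both directory arguments are hypothetical, and the PCIe controller sources may well live in different paths in the 4.9 and 5.4 BSP trees, so you would point it at whichever subdirectories hold the driver in each tree.

```python
import filecmp
from pathlib import Path


def compare_trees(old_root, new_root):
    """Compare two directory trees file by file.

    Returns a dict with three sorted lists of relative paths:
    files only in the old tree, files only in the new tree,
    and files present in both whose contents differ.
    """
    old_root, new_root = Path(old_root), Path(new_root)

    # Collect every regular file, keyed by its path relative to the tree root.
    old_files = {p.relative_to(old_root) for p in old_root.rglob("*") if p.is_file()}
    new_files = {p.relative_to(new_root) for p in new_root.rglob("*") if p.is_file()}

    # shallow=False forces a byte-by-byte comparison instead of trusting
    # file size and modification time alone.
    changed = sorted(
        rel.as_posix()
        for rel in old_files & new_files
        if not filecmp.cmp(old_root / rel, new_root / rel, shallow=False)
    )

    return {
        "only_old": sorted(rel.as_posix() for rel in old_files - new_files),
        "only_new": sorted(rel.as_posix() for rel in new_files - old_files),
        "changed": changed,
    }
```

Calling `compare_trees("bsp-4.9/path/to/pcie", "bsp-5.4/path/to/pcie")` (both paths made up here) would give a starting list of files to read through with a proper diff tool. It says nothing about which change actually causes the I/O errors.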

As for help from Amlogic: another board maker told me recently that there’s ‘no such thing as Amlogic support’ if you’re an SBC vendor only selling tens of thousands of SoCs. Which makes perfect sense: why should Amlogic waste resources? But Khadas might be in a way better position since they’re doing reference implementations for Android!

We’re not talking about upstream mainline kernel here. With mainline and ARM SoCs unfortunately there’s one eternal rule: ‘Software support is only ready once the hardware is (close to) obsolete’, since the whole upstreaming process is somehow broken.

Even slight changes like adding support for a USB controller can take months. For details search the web for ‘apritzel one image to rule them all’.

Ask @chewitt for details about Amlogic since due to my focus on use cases Amlogic is not on my list.

And for the Armbian advertisement placed here in the meantime: nope, that’s BS. Armbian today is only about repackaging the work of others. And those mainlining Amlogic SoCs aren’t part of Armbian (though Oleg – called ‘Balbes’ by someone – did a great job easing booting and consolidating kernel stuff).

Maybe the closest relationship between Armbian and those guys doing the real work of upstreaming Amlogic SoCs was Martin Blumenstingl even coming to my flat years ago to buy an ODROID-C1+ from me, idiot that I am: back then I was still part of ‘team Armbian’ and didn’t know who he was. I charged him maybe 30 bucks for the board instead of donating it to the Amlogic community and adding a few hundred bucks on top :frowning:

BULLSHIT, c0rnelious.

So I guess in the meantime we just need to wait… For the moment I guess my VIM4 will be shelved, as the NVMe was the big selling point for me: I wanted considerably better file storage than any RPI could have (and faster performance, obviously)… We’ve already worn out an RPI microSD card within a year and expect that will continue, which is why I wanted to switch to the VIM4. Perhaps I’ll see if there’s a USB-to-NVMe enclosure of some sort that I can use in the meantime with better stability than the NVMe has right now.

I still believe that Khadas will try their best to weed through things to get the NVMe natively working, but it will take time.

I’ve got a FIDECO M.2 NVME SATA SSD Enclosure, PCIe USB 3.1, Gen2 SSD Adapter.

It works on every device I have but the VIM4 doesn’t even see it. I don’t think it is a power issue because the VIM4 works fine with a 1TB 2.5" SSD in a USB3 caddy.

Have you considered using some sort of USB drive on the RPi as a stopgap? I’ve got two RPi4s that have been running as web servers with SQL backends 24/7/365 for a couple of years on small HDDs in USB3 caddies.

Thanks for the suggestion… For me I’d honestly like something a bit faster than the RPI I’ve got… I did read a review on Amazon today regarding these enclosures. The writer was saying that it really depends on which chipset is used to interface with the NVMe module, and that there are 3 top chipsets to look for from the mainstream makers. With those you’re likely to have less trouble than with the smaller fish. Perhaps that enclosure you’ve got is one of the smaller fish brands?

I ordered an enclosure this afternoon from Amazon… we’ll see if it works or not on the VIM4… If not then it’ll get returned.

I’ll admit that if Khadas fixed the Wi-Fi issue (causing laggy ssh and other issues) and this NVMe issue I’d be pretty happy with the VIM4… I’m not looking to do anything esoteric, just a nice speedy SBC that can interface with some decent sized storage drives, whether SSD or otherwise.

Likewise.

I was a bit concerned by what @tkaiser said about Amlogic’s poor I/O though, and am rather disappointed by the throughput, especially when reading and writing NTFS.

Once the device takes up its intended roles as a replacement for some of my RPis, that won’t be such an issue.

Since the A311D2 chipset will have no upstream support anytime soon I stopped looking into the BSP codebase (and work has been busy) so I’m not 100% sure, but I believe the USB IP in the A311D2 is new (home-grown) and not the same licensed IP block as previous generations, so needs an all-new driver alongside HDMI.

Thanks for that @chewitt. Hopefully the people at Khadas are making progress.

I hope so too… I might have to suck it up and keep running the RPI4b for a while longer while these issues get ironed out. :pensive::pensive::pensive:.

How is the failure induced?

I have been trying to break mine and it’s not happening using stress-ng --hdd testing.

What is the configuration and OS?
Testing here is on jammy-server.

Will be live testing our VIM4 soon and it would be nice to break it before we start testing.

I can’t speak for others, but for mine I installed PhotoPrism on Ubuntu 22.04 Server and, using the WebDAV feature, copied about 30 GB of photos over the network from my laptop to the WebDAV folder, and that was enough. The WebDAV folder was residing on the ext4 partition on the NVMe board. Pretty vanilla in my case…
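For anyone who wants to try a similar ‘copy lots of files onto the NVMe’ workload without setting up PhotoPrism, here is a rough sketch in Python. It is not what was actually run above and it is not stress-ng: the function name `write_and_verify`, its defaults and the file naming are all made up for illustration. It writes files of random data into a directory on the drive, forces them out with fsync, then reads them back and compares SHA-256 digests.

```python
import hashlib
import os
from pathlib import Path


def write_and_verify(target_dir, file_count=8, file_size=64 * 1024 * 1024,
                     chunk_size=1024 * 1024):
    """Write random files into target_dir, read them back, compare digests.

    Returns a list of file names that either raised an OSError during
    write/read or whose read-back SHA-256 digest did not match.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    expected = {}
    failures = []

    # Write phase: stream random data to each file and hash it on the way out.
    for index in range(file_count):
        path = target / f"verify_{index:04d}.bin"
        digest = hashlib.sha256()
        try:
            with open(path, "wb") as fh:
                remaining = file_size
                while remaining > 0:
                    chunk = os.urandom(min(chunk_size, remaining))
                    fh.write(chunk)
                    digest.update(chunk)
                    remaining -= len(chunk)
                fh.flush()
                os.fsync(fh.fileno())  # push the data to the device
        except OSError:
            failures.append(path.name)
            continue
        expected[path] = digest.hexdigest()

    # Verify phase: re-read every file and compare against the stored digest.
    for path, want in expected.items():
        digest = hashlib.sha256()
        try:
            with open(path, "rb") as fh:
                for chunk in iter(lambda: fh.read(chunk_size), b""):
                    digest.update(chunk)
        except OSError:
            failures.append(path.name)
            continue
        if digest.hexdigest() != want:
            failures.append(path.name)

    return failures
```

One caveat: the read-back can be served from the page cache rather than the drive, so to really exercise the NVMe you would want the total amount written to be well above the board’s RAM (or remount the filesystem between the two phases). The files are left in place, so clean up afterwards.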

I will try moving a few directories of .jpg files and see if that does anything.

Is yours on the M.2 connector or using a USB-to-NVMe adapter?

How is yours configured? Mine has the OS in eMMC and uses mounted partitions for the critical files.

Initially on the M.2… but the other day I bought a no-name USB enclosure and it regularly disconnects from USB on the Khadas (mentioned in another thread here) – which made it just as bad… The drive didn’t get corrupted but would just drop off the USB bus… making it, in my case, impossible to add a big drive to the VIM4… I’ve moved it over to an RPI4b using the USB enclosure and it’s working OK there – but that’s certainly a more mature platform, albeit much slower unfortunately.
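To tell the two failure modes apart (NVMe I/O errors on the M.2 slot versus the enclosure dropping off the USB bus), it can help to sort the kernel log lines into buckets. Below is a small Python sketch of that idea. The function name `classify_log_lines` and the three patterns are my own assumptions about how such kernel messages are typically worded, not something taken from the VIM4 logs in this thread, so adjust them to whatever your dmesg actually shows.

```python
import re

# Assumed wording of typical kernel messages; tune these to your own logs.
PATTERNS = {
    "io_error": re.compile(r"I/O error", re.IGNORECASE),
    "nvme_timeout": re.compile(r"nvme.*timeout", re.IGNORECASE),
    "usb_disconnect": re.compile(r"USB disconnect", re.IGNORECASE),
}


def classify_log_lines(log_text):
    """Group kernel log lines by which of the assumed patterns they match.

    Returns a dict mapping each pattern name to the list of matching lines.
    A line matching several patterns is listed under each of them.
    """
    hits = {name: [] for name in PATTERNS}
    for line in log_text.splitlines():
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits[name].append(line.strip())
    return hits
```

You could feed it the kernel log with something like `classify_log_lines(subprocess.run(["dmesg"], capture_output=True, text=True).stdout)` (after `import subprocess`; dmesg may need root on some systems). Lots of entries under `usb_disconnect` and none under the other two would point at the enclosure or USB side rather than the drive itself.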