Underwhelming performance of the Khadas Vim2 Max in video rendering with Kdenlive

I think the results in the last list are mixed up?

It would be interesting to see the results of the Odroid C2 without overclocking (in 1.5 GHz mode).

By the way, on the Odroid C2 and Khadas you can easily install LXDE instead of MATE. Then the results will depend less on how much memory the desktop occupies.

I tried twice with the big project at 1.5GHz; it crashed both times at 1h50m (with only a few minutes left). I didn’t want to wait that long again, so I simplified the test by rendering only 1 minute of the Big Buck Bunny video. Again strange results with the Khadas. It doesn’t seem to like Kdenlive. Here are the results.
1 minute render, 1080p
C2
1.75GHz: 8m09
1.54GHz: 9m19
1.30GHz: 10m29
1GHz: 13m18

Khadas
1st time: 9m24
2nd time: 9m38
3rd time: 9m23

The Khadas again didn’t use 100% of its capabilities, and this was without any effects. I did it 3 times. The 2nd time was even a lot slower. I truly don’t know what’s happening here.
Update
I have now tried with another 1080p video file, again 1 minute long and with all the same parameters. Again the same behaviour: 1st result 9m51s, 2nd result 9m16s. It’s all over the place.
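
To put numbers on the 1-minute renders above, here is a minimal Python sketch that converts the posted times to seconds. For the C2 it multiplies render time by clock speed (if the render were purely CPU-bound, that product would stay roughly constant), and for the Khadas it reports the gap between the fastest and slowest run. The function names are made up for illustration; the timings are the ones posted above.

```python
import re


def to_seconds(text):
    """Convert a time such as '8m09', '9m51s' or '1h25m27s' to seconds."""
    match = re.fullmatch(r"(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s?)?", text)
    if not match or not any(match.groups()):
        raise ValueError(f"unrecognised time: {text!r}")
    hours, minutes, seconds = (int(part or 0) for part in match.groups())
    return hours * 3600 + minutes * 60 + seconds


# Timings copied from the post above (clock in GHz -> render time).
c2_runs = {1.75: "8m09", 1.54: "9m19", 1.30: "10m29", 1.00: "13m18"}
khadas_runs = ["9m24", "9m38", "9m23", "9m51", "9m16"]


def ghz_seconds(runs):
    """Clock speed times render time; roughly constant if purely CPU-bound."""
    return {ghz: round(ghz * to_seconds(t), 1) for ghz, t in runs.items()}


def spread(times):
    """Difference in seconds between the slowest and the fastest run."""
    secs = [to_seconds(t) for t in times]
    return max(secs) - min(secs)


if __name__ == "__main__":
    print("C2 GHz x seconds:", ghz_seconds(c2_runs))
    print("Khadas spread in seconds:", spread(khadas_runs))
```

This only restates the posted numbers in a comparable form; it does not explain why the Khadas behaves the way it does.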

Did I understand correctly that when you tried to run the earlier task (described in the 1st post of this topic) on the Odroid C2 in 1.5 GHz mode (without overclocking), the task failed?

Sorry, I didn’t realize what they meant?

Indeed. It wasn’t a hard crash, but a "Render failed" message. I’ve reinstalled the C2; I hope that didn’t mess anything up. It did complete the long benchmark at 1.75GHz.

I ran the same test on the Khadas 3 times and didn’t get the same result every time. I’m out of ideas. I thought I had found it when removing the effects, but with a short 1 minute 1080p video it also doesn’t work well. I don’t see the logic.

Any idea on how to test with a swap file or zram? Or are there any other Linux distros for the S912?
Thank you.

I reinstalled Ubuntu. This time I could install the swap without any issues.
I tried Kdenlive again. But no difference. Still so damn slow.

I think the problem has to do with how the CPU is given tasks.
I can’t find any other explanation for this behaviour.

Today I ordered a NanoPC T3+. Also Octa core, but one GB less memory. I hope that will do better for me.

I did this on the Odroid C2. It used about 200MB less memory, so that’s great. But the render wasn’t faster. I then found that nothing from the project was moved to swap, only OS data. So it had no influence on the render time of the Odroid. Again a good thing to know.

Thank you.

I disassembled both versions of the program (aarch64 and armhf). During a quick inspection of the code (perhaps I missed important nuances), I found some interesting differences that could explain the strange behavior of the program (this is not a fact, only my guess). I don’t have time to study everything. Perhaps in the future I will be able to return to this issue and find the reason.

That is a shame. Not sure if my pet theory has been eliminated yet. It is somewhat useful that you have a faster experiment than 2 hours now.

My theory is that the temporary filesystems are full. The way to monitor this is to open a separate shell window and watch the output from

df -h

before any problems, as soon as the problems start, and then again after a couple of minutes of problems. If any of the tmp filesystems are getting full, this will show which one, and you can then re-tune them.
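
If repeatedly typing the command is inconvenient, the same check can be scripted. The following is a minimal Python sketch, not a replacement for df: it takes timestamped snapshots of how full a few filesystems are. The list of paths is an assumption; replace it with whatever df -h reports as tmpfs on your board. The percentages may differ slightly from what df prints.

```python
import os
import shutil
import time

# Assumed candidates; adjust to the tmpfs mount points that `df -h` lists.
CANDIDATE_PATHS = ["/tmp", "/var/tmp", "/dev/shm", "/run"]


def usage_percent(path):
    """Return how full the filesystem holding `path` is, in percent."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total if usage.total else 0.0


def snapshot(paths=CANDIDATE_PATHS):
    """Map each existing path to its usage percentage."""
    return {p: round(usage_percent(p), 1) for p in paths if os.path.exists(p)}


def watch(interval_s=30, rounds=3):
    """Print a timestamped snapshot every `interval_s` seconds."""
    for _ in range(rounds):
        print(time.strftime("%H:%M:%S"), snapshot())
        time.sleep(interval_s)
```

Calling watch() in a second terminal before starting the render, with a larger rounds value, gives the before/during/after picture described above.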

@balbes150
Hello again, sorry but I again need your advice.
I tried to use “Armbian_5.41.1_S9xxx_Ubuntu_xenial_4.17.0-rc2-next-20180426_mate_KODI.img” +“Armbian_5.41.1_S9xxx_Ubuntu_xenial_4.17.0-rc2-next-20180426_mate_KODI.img” and “KVIM2-emmc”.
When using USB_Burning_Tool to install to eMMC I get “Parse burning image fail” (you said not to use this).
When using an SD card I can’t boot from SD; pressing the function button doesn’t do anything. I installed Android on eMMC and installed the file via system update, but that bricks the OS. It restarts the Vim2, but it doesn’t boot anymore. It stays on the Khadas boot screen (I did this 2 times).

I tried following your instructions here

And on other threads. But I can’t make it work. It’s all a bit too complicated for me.

I’ll try again tomorrow. I’m reading all the posts about this, but there’s too much to read, and I can’t find anyone having the same issue.

It’s not only to test Kdenlive that I want to try Armbian, but also because I’m preparing a new video about Armbian on most of my SBCs (coming in a few weeks).

Thank you. I’ll try that when I install Ubuntu again.

Please note that the MultiOS_3_in_1 system and Armbian use fundamentally different technology. For your case, I recommend starting with Armbian.

  1. Restore the native Android to eMMC.
  2. Download and unpack the image, then write it to an SD card or USB drive with a dedicated image-writing program. Please note: do not simply copy the resulting "*.img" file to the media; write the image with a dedicated program (a list of suitable programs is in the Armbian topics).
  3. Copy the file kvim2_android.dtb from the /dtb directory on the written media into the root directory of that media and rename it to "dtb.img".
  4. Boot Android and activate multi-boot (details are in the Armbian topic).
    For your tests with video, I recommend checking out three different images: with kernel 3.14, 4.9, and 4.17. The behavior of the system can vary significantly. If you have any questions about launching Armbian, I recommend reading the Armbian topics on this forum and on the Armbian forum. By the way, there are several dedicated topics for beginners with step-by-step instructions and pictures on how to activate multi-boot and how to run different systems from external media.
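
Step 3 above can also be scripted. This is only a sketch under assumptions: it presumes the written media is already mounted and that you pass its mount point yourself; the function name and the example mount point are hypothetical. The file names are the ones given in the step.

```python
import shutil
from pathlib import Path


def install_dtb(media_root, dtb_name="kvim2_android.dtb"):
    """Copy <media_root>/dtb/<dtb_name> to <media_root>/dtb.img."""
    root = Path(media_root)
    source = root / "dtb" / dtb_name
    target = root / "dtb.img"
    if not source.is_file():
        # Fail loudly rather than leave the media without a dtb.img.
        raise FileNotFoundError(source)
    shutil.copyfile(source, target)
    return target
```

For example, install_dtb("/media/user/BOOT"), where /media/user/BOOT stands in for wherever your system mounted the card.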

I know! The thing I do not understand is the low level boot sequence of these ARM chips in general, and Amlogic in particular. Fortunately other folks around here do!

My 6p: your VIM2 is now in some weird unknown state and who knows what is trying to boot from where. So, the way I get mine back to a known state is as follows:

  1. Flash a Khadas image to emmc using a windoze PC and the Amlogic burning tool. My preference for Khadas image is the 4.9.40 Ubuntu server image. The Amlogic burning tool instructions are here. The Ubuntu server image is here.
  2. At this stage your VIM2 will always load the kernel and DTB from emmc which seems like a problem. But in fact it (or at least mine) will load a rootfs from an SDcard! So I also have an SDcard with balbes150 Armbian 5.37 xenial server 20171226.img.xz on it. (Need to unzip, then dd the image to an SDcard). If you get this right then, with the SDcard in the VIM2, when you boot you should get Armbian and not Khadas Ubuntu.
  3. Last, you have a choice when running the rootfs from the SDcard of what you want to install on the VIM2 emmc. You can EITHER
    a) Use the /root/install.sh script which will copy the SDcard OS to the emmc, OR
    b) You can use the kvim2-update script which will install the KVIM2-emmc.img.gz image in the /ddbr directory into emmc. You get a suitable Mate/Server/3in1 image from Yandex here.

After step 3 your VIM2 will boot from emmc with the kernel/DTB and rootfs there. Or, if you have an SDcard in, it will boot with the kernel/DTB and rootfs on the card. Thanks all to balbes150 magic.
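
The "unzip" part of step 2 can be done with Python's standard library if no xz tool is at hand. This is a minimal sketch that only unpacks an .img.xz file into an .img file; the function name is made up, and writing the image to the SDcard (the dd part) is deliberately left out because picking the wrong target device destroys data.

```python
import lzma
import shutil


def unpack_xz(xz_path, img_path, chunk_size=1024 * 1024):
    """Decompress `xz_path` into `img_path`, streaming in chunks."""
    with lzma.open(xz_path, "rb") as src, open(img_path, "wb") as dst:
        # Stream so a multi-gigabyte image never has to fit in RAM.
        shutil.copyfileobj(src, dst, chunk_size)
    return img_path
```

Streaming matters on these boards: the unpacked image can be larger than the 2 to 3 GB of RAM they have.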

I got it to work. Thank you.
I tried installing that file via system update in Android again, this time with an Armbian image on the SD card, and it worked. I think it didn’t work before because I used the wrong image.

I tried:

  1. Armbian_5.37_S9xxx_Debian_stretch_3.14.29_server_20171226.img
    I could not see the whole desktop there. I use a 1080p display, and it was in 1080p, but I could only see the upper left part, not the right part nor the lower taskbar.
  2. Armbian_5.37_S9xxx_Debian_stretch_3.14.29_icewm_20171226.img
    I didn’t know what to do with that. I was put off by its very old look.
  3. Armbian_5.37_S9xxx_Ubuntu_xenial_3.14.29_mate_20171226.img
    I cannot log in. When I type my password it reloads the login page and asks for my password again.

I’ll check out some more things at the weekend.
What is the best alternative distro I should try?
Thanks a lot. Greetings

Start by reading the first post in this topic.

There is no answer to this. To be pedantic, what would be best for you? My view on this is there are 3 layers to the cake:

  1. The kernel: main choices are 3.14 (good if you want video playing, games, …) and 4.9.40 (later version, …) There are some experimental 4.14 and 4.17 options but they are lacking some core functionality (e.g. wifi). My choice is 4.9.40 as best for me… There are 2 choices here - Khadas release and balbes150 version: I use Khadas as it has RNDIS support which I need when building my system. Technically you can build your own with the fenix scripts - I tried and failed!
  2. There is some VIM specific glue in terms of configuring/partitioning the emmc etc. I like the balbes150 options to boot from SDcard so use his partitioning, but of course use the Khadas Bluetooth & LED (pulse) options.
  3. The rootfs: here is where you have in theory a nearly infinite choice - OpenSUSE, Arch, Armbian, Fedora, Debian, Ubuntu, … My main consideration was I want a large repository of aarch64 applications - to the best of my knowledge that left me with Ubuntu (/Debian/Armbian) and SUSE. (e.g. there is no gnucash application in the Arch/ALARM repository). My current install is a homebrew from Ubuntu base 18.04 because I don’t like stuff in any of the standard distros. At a basic level there are only 2 distros in this list for which there is a simple image/install option - Ubuntu and Armbian (arguably SUSE and Arch images from balbes150 as well) so for newbies a much shorter list to choose from.

Hello again.
Another update.
I’ve now got the NanoPC T3+.

I’ve done the same project on it, and I’ve got the same problem with it as on the Khadas Vim2. It’s a bit faster, but that’s expected even with the problem.
It’s 1h25m27s for the NanoPC versus 1h43m46s for the Khadas.
So it’s better but not what I want.

This may not be good for me, but it does give more information.
Now we know this problem shows up on 64-bit octa-core SBCs. On the 32-bit XU4 it works fine…

On the NanoPC T3+ I used:
Armbian bionic next 4.14.40
Kdenlive version 17.12.3
MLT version 6.6.0

On the Khadas Vim2:
Ubuntu 16.04.3 LTS with Mate desktop
Kdenlive version 15.12.3

I’m leaving on a cycling trip tomorrow. I’ll use the C2 for now. When I’m back I’ll do a lot more investigations. I must find a solution for this.

Thank you, greetings.
NicoD

Here is a screenshot of the problem:

Here is the result:

That’s not a very scientific approach :wink:

Things to keep in mind:

  • You used different swap/zram settings and memory utilization is pretty different. On your last screenshot above from the Vim2 no swap was configured and only 1.4 GB of memory was used (so the task was able to cope with 1.4 GB of memory). Now on the NanoPC you seem to use 1 GB of swap (if it’s an Armbian Ubuntu that’s most probably zram – can you confirm this by providing ‘armbianmonitor -u’ output please?) that has been fully used, and the main memory is also completely utilized. Now we’re talking about the task needing 3 GB of memory.
  • The number of available CPU cores can directly influence the amount of memory needed, so if the task in question always needs the same amount of memory per CPU core then an octa-core device with 2 GB RAM (NanoPC) has a disadvantage compared to a quad-core device (C2).
  • 32-bit software might need considerably less memory compared to 64-bit software on ARM. I’ve compared a 32-bit and a 64-bit version of the same software running on a 64-bit SoC with a 64-bit kernel a while ago: almost twice as much memory was needed when running 64-bit code, while it did not provide any significant performance wins.

To summarize:

  • the same task running on 32-bit ODROID-XU4 might need a lot less memory so 2 GB DRAM might be fine for the job
  • If a task (or an application) needs a certain amount of memory per active CPU core then SoCs with twice as much CPU cores would need also twice as much RAM to perform in a linearly scaling fashion
  • Once swapping occurs performance can drop a lot depending on whether zram is used or a swap file/partition on physical storage. If physical storage is used the random IO of the storage is important. So an ODROID-C2 that swaps a little bit on its ultra fast eMMC module can perform orders of magnitude faster compared to a system swapping on an ultra slow SD card
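
The per-core argument in the points above can be illustrated with a few lines of arithmetic. This is a sketch only: the 0.4 GiB per core figure is invented purely for illustration and was not measured on any board in this thread, and the function names are hypothetical.

```python
def ram_needed_gib(active_cores, per_core_gib, base_gib=0.0):
    """Memory demand if every active core needs the same amount."""
    return base_gib + active_cores * per_core_gib


def fits_in_ram(active_cores, per_core_gib, ram_gib, base_gib=0.0):
    """True if the estimated demand fits without swapping."""
    return ram_needed_gib(active_cores, per_core_gib, base_gib) <= ram_gib


if __name__ == "__main__":
    PER_CORE_GIB = 0.4  # hypothetical value, not a measurement
    for cores in (4, 8):
        need = ram_needed_gib(cores, PER_CORE_GIB)
        print(cores, "cores need", need, "GiB; fits in 2 GiB:",
              fits_in_ram(cores, PER_CORE_GIB, 2.0))
```

With that made-up figure a quad-core fits in 2 GiB while an octa-core does not, which is the kind of situation in which swapping would start.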

Some possible insights wrt count of CPU cores vs. memory demand, zram/swap performance, and also, on the 2nd page of the thread, how to temporarily disable CPU cores (maxcpus=4 in kernel cmdline):

As already suggested: without monitoring system behaviour with iostat and vmstat in the background it’s a bit hard to draw further conclusions.
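
Where iostat and vmstat are not installed, a rough stand-in for the memory side is to sample /proc/meminfo during the render. The sketch below is a simplified alternative, not what iostat or vmstat do internally: it parses the standard "Key: value kB" lines and reports available memory and swap in use. Function names are made up for illustration.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {key: integer value}."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def summarize(values):
    """Pick out the numbers relevant to the swapping question."""
    swap_used = values.get("SwapTotal", 0) - values.get("SwapFree", 0)
    return {
        "mem_available": values.get("MemAvailable"),
        "swap_used": swap_used,
    }


def sample(path="/proc/meminfo"):
    """Read and summarize the current state (Linux only)."""
    with open(path) as handle:
        return summarize(parse_meminfo(handle.read()))


if __name__ == "__main__":
    if os.path.exists("/proc/meminfo"):
        print(sample())
```

Logging sample() every few seconds alongside the render progress would show whether the slow stretches coincide with swap use.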

2 Likes

Hi Thomas, thank you for the insights. Here is the armbianmonitor output. All small letters.
http://ix.io/1byt
Indeed it used more RAM than the Khadas; I’ll look into why that is. I’ll also try with 4 cores enabled.
I can say that the Odroid C2 also needs a lot of its swap. But there is only 1.7GB of RAM available there. Most of the data moved to swap is not from the video project, so I don’t think it slows down the process much.

It is zram that’s being used.
nicod@nanopct3plus:~$ cat /proc/swaps
Filename Type Size Used Priority
/dev/zram0 partition 127816 0 5
/dev/zram1 partition 127816 0 5
/dev/zram2 partition 127816 0 5
/dev/zram3 partition 127816 0 5
/dev/zram4 partition 127816 0 5
/dev/zram5 partition 127816 0 5
/dev/zram6 partition 127816 0 5
/dev/zram7 partition 127816 0 5
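
For reference, the listing above can be totalled with a short Python sketch. It assumes the Size and Used columns are in KiB, as /proc/swaps normally reports them; the function names are made up for illustration.

```python
SWAPS_TEXT = """\
Filename Type Size Used Priority
/dev/zram0 partition 127816 0 5
/dev/zram1 partition 127816 0 5
/dev/zram2 partition 127816 0 5
/dev/zram3 partition 127816 0 5
/dev/zram4 partition 127816 0 5
/dev/zram5 partition 127816 0 5
/dev/zram6 partition 127816 0 5
/dev/zram7 partition 127816 0 5
"""


def parse_swaps(text):
    """Parse /proc/swaps-style text, skipping the header line."""
    entries = []
    for line in text.splitlines()[1:]:
        fields = line.split()
        if len(fields) >= 5:
            entries.append({
                "name": fields[0],
                "type": fields[1],
                "size_kib": int(fields[2]),
                "used_kib": int(fields[3]),
                "priority": int(fields[4]),
            })
    return entries


def totals(entries):
    """Return (total size, total used) in KiB."""
    size = sum(entry["size_kib"] for entry in entries)
    used = sum(entry["used_kib"] for entry in entries)
    return size, used


if __name__ == "__main__":
    size_kib, used_kib = totals(parse_swaps(SWAPS_TEXT))
    print(f"{size_kib / 1024:.0f} MiB of zram swap, {used_kib} KiB in use")
```

Eight devices of 127816 KiB each add up to a little under 1 GiB, which matches the 1 GB of swap mentioned earlier; at the time of this listing none of it was in use.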

It’s a scientific observation I’ve made :wink: , not a conclusion. I don’t know if it’s on all 64-bit octa-cores. I only know that I have seen this behaviour in 64-bit octa-core SBCs.
Indeed it needs a lot more research to be able to draw a conclusion.

It is over the whole project that it appears to have the same problem as the Khadas. I know this video render so well from doing it on so many different SBCs that I recognize the patterns (also not scientific).

In video rendering it seems to use as much memory. But 32-bit seems quite a bit faster than 64-bit.
XU4 and Tinker Board as examples.
But it’s like comparing raspberries with oranges (I hate Apple’s…)
XU4 = 46m23s
Tinker = 1h12m15s
C2 = 1h43m46

That’s a big gap with the 64-bits.
Too bad the XU4 isn’t power efficient, and the Tinker is just a big pile of shht.

I’ll do more tests now, thanks.

First test results.
It didn’t use as much RAM this time. Last time I had done a lot of tweaking before doing the render; maybe because of that more RAM was used. Also, in the last image you see that the big bump at the end is while finishing the file. Here it is before I started Kdenlive, with all the stats running. I hadn’t turned on the fan yet there.


The first minutes.

Here it’s halfway. There it slows down just like the Khadas, and then goes on strong again.

Last 10%. Here it starts going very slow. This part takes a very long time.


Here almost finished.

This is the big bump where it writes the file. That’s what you’ve seen on that earlier picture. But it did use less memory.

Lastly, here you see how long it took over that last bit. When it’s running well it’s about 53C. On the last part it’s only 42C. What should take 7 minutes took 37 minutes. I don’t know how to interpret all these numbers. I see that the load is lower, but I don’t know why.

I have now done it with 6 threads (but all cores on). The same behaviour.
Time was: 1h24m29s
Only 38 seconds slower. That seems weird.

Thanks for the help

Nope. As already said: there are fast ARM cores and slow (energy efficient) ones. The A15/A17 used on ODROID-XU4 and Tinkerboard are fast cores, the A53 used on almost all 64-bit SBCs are slow ones. That’s all.

If you repeat the same with a 64-bit ARM thingy that uses 4 A72/A73/A75 cores it will be a lot faster.

I’ve now tested with 4 threads in Kdenlive. 1h23m49s
Everything is exactly the same as with 8 threads.

Then checked with 1 thread, and only 1 core is used to 100%.
Now 2 threads and it’s using all cores at around 50% and will take longer. 1h41m46s

I’ll test further with only 6 cores enabled and with 4 cores. I’ll see if it does better with tasking the threads.

Too bad those things are way more expensive. I only know of the HiKey 960/970 boards with A73 cores, at 200+ dollars.
And the coming Rock3399’s. But these use way too much power for my use. I will buy the Odroid N1. Can’t wait to check that out.

Any other boards you know of? (sorry for bugging you with this, but I can’t think of anybody with more knowledge about these boards, probably 25% of all forum threads about SBC’s that I read have you commenting on them…)

Did you get any useful information from iostat, vmstat and armbianmonitor?
Thanks and have a nice day.

ODROID-N1 has a high idle consumption due to the ASM1061 PCIe SATA adapter. Since you don’t need this why not giving any other RK3399 design a try? There are a lot and even Khadas will present one soon.

Don’t they all use 12V? RockPro64, Odroid N1, Orange Pi 3399, HiKey Rock960, …
Am I missing some?
I hope the Khadas one will be power efficient. But I don’t know if I would buy another Khadas.
I would like to use the N1 as main desktop. So all those goodies are good for that. Only hope is that it’s stable.
@tkaiser Update
I’ve just done the 1080p BMW Blender project. Great result.
1h10m27s NanoPC T3+
1h12m43s Odroid XU4
1h18m55s Khadas Vim2
2h00m38s Odroid C2
2h29m42s Tinker Board

The NanoPC beats even the XU4. In Blender all cores are always at max. It also works better on 64-bit systems. In Kdenlive we see the total opposite.
I would like to see the Kdenlive results match the Blender results. But I guess it will be something in Kdenlive. I used to be a programmer (a long time ago). I’ll download the source tomorrow and try to decipher what’s happening.
Cheers
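
The contrast between the two benchmarks is easier to see when every board is expressed relative to the XU4. The Python sketch below does that with the times posted in this thread; the function names are made up, the C2 Kdenlive time is taken as posted (1h43m46), and note that the Kdenlive runs were not made with identical OS and Kdenlive versions on every board, so this is only a rough comparison.

```python
import re


def to_seconds(text):
    """Convert a time such as '46m23s' or '1h43m46' to seconds."""
    match = re.fullmatch(r"(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s?)?", text)
    if not match or not any(match.groups()):
        raise ValueError(f"unrecognised time: {text!r}")
    hours, minutes, seconds = (int(part or 0) for part in match.groups())
    return hours * 3600 + minutes * 60 + seconds


# Times copied from the posts in this thread.
KDENLIVE = {
    "NanoPC T3+": "1h25m27s",
    "Odroid XU4": "46m23s",
    "Khadas Vim2": "1h43m46s",
    "Odroid C2": "1h43m46",
    "Tinker Board": "1h12m15s",
}
BLENDER = {
    "NanoPC T3+": "1h10m27s",
    "Odroid XU4": "1h12m43s",
    "Khadas Vim2": "1h18m55s",
    "Odroid C2": "2h00m38s",
    "Tinker Board": "2h29m42s",
}


def relative_to(times, baseline="Odroid XU4"):
    """Render time of each board as a multiple of the baseline board."""
    base = to_seconds(times[baseline])
    return {board: round(to_seconds(t) / base, 2) for board, t in times.items()}


if __name__ == "__main__":
    print("Kdenlive:", relative_to(KDENLIVE))
    print("Blender: ", relative_to(BLENDER))
```

A value above 1 means slower than the XU4 and below 1 means faster: in Blender the NanoPC comes out slightly below 1, while in Kdenlive it is well above 1.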