CPU frequency up to 2GHz?

Yes, it looks like marketing blah-blah.
But HardKernel did push Amlogic to release an unrestricted firmware (bl30.bin), and they got a new one from Amlogic with all frequencies unlocked up to 2 GHz.
So maybe “bl30.bin” for VIM2 already has a complete table of frequencies, or Khadas staff can push Amlogic for a new “bl30.bin” like HardKernel did.

Andreas

Nope, it’s 1448 MHz max and also as soon as all 4 big cores are busy at the same time this will be decreased to just 1200 MHz.

Pretty easy to test this out BTW with benchmarks like sysbench, which scale linearly with the number of CPU cores. So simply try out these three tests and compare the results to see what the real clockspeeds look like:

sysbench --test=cpu run --num-threads=1 --cpu-max-prime=20000
sysbench --test=cpu run --num-threads=2 --cpu-max-prime=20000
sysbench --test=cpu run --num-threads=4 --cpu-max-prime=20000

And here is a nice tool from the guy who discovered Amlogic cheating on us wrt cpufreq: Pine H64 Development Board Features Allwinner H6 processor, Gigabit Ethernet, USB 3.0 and PCIe for $26 and Up - CNX Software

So the interesting question is not how to get 2 GHz but how to get even 1.5 GHz in reality :slight_smile:

Tested the mhz tool on a VIM2 Pro (with the performance cpufreq governor, cpufreq-info reporting 1.51 GHz on cores 0-3 and 1000 MHz on cores 4-7). Here’s the result:

root@vim2p:~# taskset -c 0 ./mhz
count=645643 us50=22806 us250=114044 diff=91238 cpu_MHz=1415.294
root@vim2p:~# taskset -c 7 ./mhz
count=413212 us50=20676 us250=103364 diff=82688 cpu_MHz=999.449

and for comparison, the same on a rk3399 sapphire ref board:

root@sapphire4:~# taskset -c 0 ./mhz
count=645643 us50=22808 us250=114053 diff=91245 cpu_MHz=1415.185
root@sapphire4:~# taskset -c 5 ./mhz
count=807053 us50=22425 us250=112149 diff=89724 cpu_MHz=1798.968

So it appears this tool is indeed quite accurate (cpufreq-info on the rk3399 reports cores 0-3 at 1.42 GHz and cores 4-5 at 1.80 GHz).
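For what it's worth, the cpu_MHz column can be recomputed from the other fields. The sketch below assumes the relation cpu_MHz = count × 200 / (us250 − us50), which is inferred from the numbers posted above and not taken from the tool's source. `recompute_mhz` is a hypothetical helper name.

```shell
# Recompute cpu_MHz from one line of mhz output (POSIX sh + awk).
# Assumption (inferred from the posted numbers, not from the tool's source):
#   cpu_MHz = count * (250 - 50) / (us250 - us50), times in microseconds.
recompute_mhz() {
    # $1: one output line, e.g.
    #     "count=645643 us50=22806 us250=114044 diff=91238 cpu_MHz=1415.294"
    printf '%s\n' "$1" | awk '{
        # Split every key=value pair into the array v[key] = value
        for (i = 1; i <= NF; i++) {
            split($i, kv, "=")
            v[kv[1]] = kv[2]
        }
        printf "%.3f\n", v["count"] * 200 / (v["us250"] - v["us50"])
    }'
}
```

Feeding it the first VIM2 Pro line from above reproduces the 1415.294 the tool printed, and the core 7 line gives 999.449.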

What’s the result with the cores loaded up? Same frequency or lowered to 1.2 as the post above suggests?

About the same result… it’s certainly lower than the advertised 1.5 GHz, but not as low as 1200 MHz.

Good that it isn’t dropping to 1.2 GHz.

@Gouwa have you got any thoughts on this thread? Don’t suppose you have talked to Amlogic about it? Is it possible to get a different bit of code from them to adjust it? Thanks!

So when you run sysbench --test=cpu run --num-threads=8 --cpu-max-prime=2000000 in another shell what are the results?

Did you see Underwhelming performance Khadas Vim2 Max in video rendering kdenlive - #15 by numbqq ? Performance when running on the big cluster below RPi 3. That’s another ‘1200 MHz indication’ :wink:

I’ve got another process eating up all other cores, I think that’d be equivalent to sysbench, but I can give sysbench a try.

I’ve seen the other thread as well. I’ve been observing various eyebrow-raising results when using various S912 boards, and this does indeed raise some questions…

Well, while sysbench is a pretty lousy tool to measure hardware performance of different architectures it’s really great when doing these sorts of tests since the whole job is done inside the CPU’s caches so not influenced by memory bandwidth/latency and also scaling linearly with count of CPU cores (so you can compare --num-threads=1 with --num-threads=8 and if the latter number is not 8 times lower you know there’s something wrong)
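The scaling check described above can be sketched as a small shell helper. `scaling_efficiency` is a hypothetical name, and the timings in the usage line are made-up placeholders, not measurements. The idea is to feed it the total execution time sysbench prints for the 1-thread and the N-thread run.

```shell
# Scaling check for a cache-resident benchmark that should scale linearly.
# $1: execution time with 1 thread
# $2: execution time with N threads
# $3: N (number of threads)
# Prints the achieved speedup as a percentage of the ideal N-fold speedup.
scaling_efficiency() {
    awk -v t1="$1" -v tn="$2" -v n="$3" \
        'BEGIN { printf "%.1f\n", 100 * t1 / (tn * n) }'
}

# Usage with placeholder timings (seconds):
#   scaling_efficiency 80 10 8   # ideal scaling
#   scaling_efficiency 80 12 8   # 8 threads only 6.67 times faster
```

A value well below 100 would suggest that the cores do not all run at the clock the single-threaded run had.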

I posted over there a simple script able to repeat @balbes150’s tests from last year that clearly showed back then that with multithreaded CPU loads clockspeeds further decrease. Should be just a matter of minutes to repeat…

Hi tkaiser,

With my latest build, when I run sysbench --test=cpu run --num-threads=8 --cpu-max-prime=2000000, I found the freq is still 1.41GHz.

root@Khadas:~# uname -a
Linux Khadas 3.14.29 #8 SMP PREEMPT Thu May 24 18:25:14 CST 2018 aarch64 aarch64 aarch64 GNU/Linux

root@Khadas:~# cat /etc/lsb-release 
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=16.04
DISTRIB_CODENAME=xenial
DISTRIB_DESCRIPTION="Ubuntu 16.04.4 LTS"

root@Khadas:~# echo performance >/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
root@Khadas:~# echo 1512000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq

root@Khadas:~/mhz# taskset -c 0 ./mhz 20
count=645643 us50=22930 us250=114372 diff=91442 cpu_MHz=1412.137
count=645643 us50=22995 us250=114442 diff=91447 cpu_MHz=1412.059
count=645643 us50=22952 us250=114392 diff=91440 cpu_MHz=1412.168
count=645643 us50=23040 us250=114440 diff=91400 cpu_MHz=1412.786
count=645643 us50=22984 us250=114728 diff=91744 cpu_MHz=1407.488
count=645643 us50=22955 us250=114447 diff=91492 cpu_MHz=1411.365
count=645643 us50=23016 us250=114328 diff=91312 cpu_MHz=1414.147
count=645643 us50=22942 us250=114348 diff=91406 cpu_MHz=1412.693
count=645643 us50=22977 us250=114305 diff=91328 cpu_MHz=1413.899
count=645643 us50=23013 us250=114244 diff=91231 cpu_MHz=1415.403
count=645643 us50=22880 us250=114391 diff=91511 cpu_MHz=1411.072
count=645643 us50=23023 us250=114425 diff=91402 cpu_MHz=1412.755
count=645643 us50=23166 us250=114405 diff=91239 cpu_MHz=1415.279
count=645643 us50=22975 us250=114402 diff=91427 cpu_MHz=1412.368
count=645643 us50=22953 us250=114371 diff=91418 cpu_MHz=1412.507
count=645643 us50=23005 us250=114420 diff=91415 cpu_MHz=1412.554
count=645643 us50=22973 us250=114443 diff=91470 cpu_MHz=1411.704
count=645643 us50=22960 us250=114422 diff=91462 cpu_MHz=1411.828
count=645643 us50=23074 us250=114402 diff=91328 cpu_MHz=1413.899
count=645643 us50=23017 us250=114589 diff=91572 cpu_MHz=1410.132

And when I run sysbench --test=cpu run --num-threads=8 --cpu-max-prime=2000000 in another shell, the result is the same:

root@Khadas:~/mhz# taskset -c 0 ./mhz 20
count=330570 us50=11769 us250=68486 diff=56717 cpu_MHz=1165.682
count=330570 us50=11830 us250=68486 diff=56656 cpu_MHz=1166.937
count=330570 us50=11758 us250=58477 diff=46719 cpu_MHz=1415.142
count=330570 us50=11760 us250=58519 diff=46759 cpu_MHz=1413.931
count=330570 us50=11753 us250=58497 diff=46744 cpu_MHz=1414.385
count=330570 us50=11733 us250=58760 diff=47027 cpu_MHz=1405.873
count=330570 us50=11798 us250=58496 diff=46698 cpu_MHz=1415.778
count=330570 us50=11778 us250=58568 diff=46790 cpu_MHz=1412.994
count=330570 us50=11748 us250=58507 diff=46759 cpu_MHz=1413.931
count=330570 us50=11722 us250=58498 diff=46776 cpu_MHz=1413.417
count=330570 us50=11733 us250=58533 diff=46800 cpu_MHz=1412.692
count=330570 us50=11758 us250=58486 diff=46728 cpu_MHz=1414.869
count=330570 us50=11752 us250=58478 diff=46726 cpu_MHz=1414.930
count=330570 us50=11747 us250=58493 diff=46746 cpu_MHz=1414.324
count=330570 us50=11719 us250=58546 diff=46827 cpu_MHz=1411.878
count=330570 us50=11739 us250=58481 diff=46742 cpu_MHz=1414.445
count=330570 us50=11749 us250=58478 diff=46729 cpu_MHz=1414.839
count=330570 us50=11738 us250=58504 diff=46766 cpu_MHz=1413.719
count=330570 us50=11730 us250=58538 diff=46808 cpu_MHz=1412.451
count=330570 us50=11730 us250=58639 diff=46909 cpu_MHz=1409.410

I’m not sure how you got 1200 MHz. Maybe I did something wrong?

Thanks.
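Twenty samples are easier to compare as minimum, mean and maximum. A minimal sketch, assuming the output format shown above with cpu_MHz= as the last field; `summarize_mhz` is a hypothetical helper name.

```shell
# Summarise cpu_MHz samples from mhz output read on stdin (POSIX sh + awk).
# Prints the number of samples plus minimum, mean and maximum in MHz.
summarize_mhz() {
    awk -F'cpu_MHz=' 'NF > 1 {
        v = $2 + 0                      # numeric value after "cpu_MHz="
        if (n == 0 || v < min) min = v
        if (n == 0 || v > max) max = v
        sum += v
        n++
    }
    END {
        if (n) printf "n=%d min=%.3f mean=%.3f max=%.3f\n", n, min, sum / n, max
    }'
}

# Usage on the board:
#   taskset -c 0 ./mhz 20 | summarize_mhz
```

The minimum is worth a look in the loaded run above, since its first two samples sit near 1166 MHz before the readings settle around 1414 MHz.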


Back then, when @balbes150 thankfully tested for me, I was not aware that S912’s boot blob wants to play little.LITTLE. It’s an octa-core A53 design with two little clusters, so there’s no big.LITTLE here. Amlogic simply ships this SoC with a firmware that artificially limits 4 CPU cores to 1.0 GHz and 4 CPU cores to 1.4 GHz while faking the clockspeed readouts of the faster cluster, for whatever reasons.

So when a load like sysbench is running, one that depends neither on external memory bandwidth nor on anything else happening outside the CPU cores, the result with an 8-thread load is an average clockspeed of 1.2 GHz, and that was exactly what sysbench reported back then.
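The 1.2 GHz figure is just the core-weighted mean of the two cluster clocks. A minimal sketch of that arithmetic (`avg_cluster_mhz` is a hypothetical helper name):

```shell
# Average clock across two clusters when every core is busy.
# $1: cores in cluster A   $2: clock of cluster A in MHz
# $3: cores in cluster B   $4: clock of cluster B in MHz
avg_cluster_mhz() {
    awk -v na="$1" -v fa="$2" -v nb="$3" -v fb="$4" \
        'BEGIN { printf "%.0f\n", (na * fa + nb * fb) / (na + nb) }'
}

# Usage:
#   avg_cluster_mhz 4 1000 4 1400   # nominal 1.0 GHz and 1.4 GHz clusters
#   avg_cluster_mhz 4 1000 4 1414   # with the ~1414 MHz measured by mhz
```

With the nominal clocks this gives 1200, and with the measured ~1414 MHz it gives 1207, so an 8-thread sysbench run behaving like a 1.2 GHz octa-core is what one would expect.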

In reality the situation with S912 is much worse. With a full load on all 8 cores, at least all 4 ‘fast’ cores at 1.4 GHz are utilized, but with normal single-threaded workloads it can easily happen that a demanding task ends up on one of the bottlenecked CPU cores and is then limited to 1000 MHz.

On average, single-threaded tasks are slower on Vim2 than on Vim, since on the latter all 4 CPU cores are allowed to clock at up to 1.4 GHz, while on Vim2 the scheduler for whatever funny reasons keeps tasks on the artificially bottlenecked CPU cores, limiting single-threaded loads to 1 GHz.

See S912 limited to 1200 MHz with multithreaded loads - #71 by dukla2000


It seems to me that what probably happened is that AML set out with the intention of releasing a real big.LITTLE design, but under testing discovered that the SoC die was way too flaky at these speeds, with throttling and unacceptably high failure rates. It’s bad enough that their chip can reach 90C as currently configured. Performance = heat, that’s basic physics, and the cost of providing an adequate cooling solution is simply beyond the margins of their target market. Instead of admitting that their design was flawed, they simply lied and made their software lie.

However, none of this really matters to 99.9% of their user base, since the chip is well capable of running Android as a media player at performance way beyond just about all similarly priced products. Also, in the target market of media players the performance of each CPU core is totally unimportant, but the ability to do multiple tasks in the background competently and in parallel is critical. Hence eight low-spec CPUs make perfect sense and high-performance, high-speed cores make none.

This just shows what an absolutely shitty company Amlogic is. They have shat on their reputation in the western market and it seems that they have abandoned most plans to stay within that market. Why? Because they have spotted an opportunity to sell tightly controlled TV boxes to the domestic Chinese market, with the software locked down and the hardware optimised for this sole purpose.

Hobby SoC users are about as important to AML as a gnat on the arse of an elephant.
If you want a decent product then move along and spend a bit more money on one of the Samsung based chips.

It’s all a bit disgusting really. It’s bad enough that I have sworn off ever buying an AML SoC product again and am fairly leery of touching any ARM-based product that isn’t an Android phone (where they excel). Intel-based SoC products are infinitely better performance and peripheral support wise, and are getting to be cost and energy competitive with the ARM-based SoCs, and that is where my money will go in the future.

Shoog


That’s why Google and others are choosing Amlogic (a US-based company, BTW) SoCs these days?

Out of curiosity: which cpufreqs does 7-zip report when running on the Vim2? On a Debian/Ubuntu-based distro it should be sufficient to do

apt install p7zip
taskset -c 5-7 7zr b
taskset -c 0-3 7zr b
root@vim2p:~# taskset -c 5-7 7zr b

7-Zip (a) [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=C.UTF-8,Utf16=on,HugeFiles=on,64 bits,8 CPUs LE)

LE
CPU Freq:   999   999   999   999   999   999   999   999

RAM size:    3014 MB,  # CPU hardware threads:   8
RAM usage:   1765 MB,  # Benchmark threads:      8

                       Compressing  |                  Decompressing
Dict     Speed Usage    R/U Rating  |      Speed Usage    R/U Rating
         KiB/s     %   MIPS   MIPS  |      KiB/s     %   MIPS   MIPS

22:       1740   297    571   1693  |      33586   299    958   2865
23:       1680   295    580   1713  |      32992   299    954   2855
24:       1599   291    590   1720  |      32095   298    947   2817
25:       1554   294    603   1775  |      30599   293    928   2723
----------------------------------  | ------------------------------
Avr:             294    586   1725  |              297    947   2815
Tot:             296    766   2270
root@vim2p:~# taskset -c 0-3 7zr b

7-Zip (a) [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=C.UTF-8,Utf16=on,HugeFiles=on,64 bits,8 CPUs LE)

LE
CPU Freq:  1410  1415  1415  1411  1415  1415  1415  1415  1414

RAM size:    3014 MB,  # CPU hardware threads:   8
RAM usage:   1765 MB,  # Benchmark threads:      8

                       Compressing  |                  Decompressing
Dict     Speed Usage    R/U Rating  |      Speed Usage    R/U Rating
         KiB/s     %   MIPS   MIPS  |      KiB/s     %   MIPS   MIPS

22:       2799   385    708   2724  |      61871   399   1323   5277
23:       2751   397    706   2804  |      60679   399   1315   5251
24:       2627   396    713   2826  |      59275   399   1303   5203
25:       2517   396    726   2875  |      56530   396   1270   5031
----------------------------------  | ------------------------------
Avr:             393    713   2807  |              398   1303   5190
Tot:             396   1008   3999

Thanks a bunch. So even 7-zip’s benchmark mode is sufficient to confirm Amlogic cheating with clockspeeds :slight_smile:

On the other hand for whatever reasons the ‘per core per GHz’ performance also differs a lot between both clusters:

  • 0-3: 3999 7-zip MIPS at 1415 MHz → 707 per core @ 1 GHz
  • 4-7: 2270 7-zip MIPS at 1000 MHz → 568 per core @ 1 GHz

I fail to interpret the numbers since especially decompression speed is almost twice as fast on cores 0-3 (see comments at the bottom of https://www.7-cpu.com)
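The normalisation behind the two bullets can be sketched as below (`mips_per_core_ghz` is a hypothetical helper name). One caveat, offered only as a possibility: the slower run above was started with taskset -c 5-7 and shows about 296% usage, so it may have had three cores available, not four. The helper takes the core count as a parameter so both readings can be compared.

```shell
# 7-zip total MIPS normalised to a single core at 1 GHz.
# $1: total MIPS   $2: number of cores used   $3: clock in MHz
mips_per_core_ghz() {
    awk -v mips="$1" -v cores="$2" -v mhz="$3" \
        'BEGIN { printf "%.1f\n", mips * 1000 / (cores * mhz) }'
}

# Usage:
#   mips_per_core_ghz 3999 4 1415   # cores 0-3
#   mips_per_core_ghz 2270 4 1000   # slow cluster, assuming 4 cores
#   mips_per_core_ghz 2270 3 1000   # slow cluster, assuming 3 cores (5-7)
```

These print 706.5, 567.5 and 756.7. If the three-core reading is the right one, the per-core, per-GHz gap between the clusters largely disappears.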

@g4b42 In case time permits are you able to run sbc-bench neon on your Vim2?

We started to assemble some benchmarks over there https://github.com/ThomasKaiser/sbc-bench and included a lot of monitoring to get a clue what’s going on on platforms that behave somewhat strange (RPi and Amlogic S905X/S912 as best examples).

Hi tkaiser,

I ran sbc-bench on a Khadas VIM2 Pro (2GB DDR).

This is our own build image; I will try an Armbian image later.

System info:

khadas@Khadas:~$ cat /etc/lsb-release 
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04.1 LTS"

Kernel 4.17.3:

khadas@Khadas:~$ uname -a
Linux Khadas 4.17.3 #1 SMP PREEMPT Mon Jul 30 03:06:44 UTC 2018 aarch64 aarch64 aarch64 GNU/Linux

Result:

khadas@Khadas:~$ sudo /bin/bash ./sbc-bench.sh neon

Average load is above 0.1. Way too much background activity.

System too busy for benchmarking: 06:14:56 up  1:55,  3 users,  load average: 0.14, 0.19, 0.43
System too busy for benchmarking: 06:15:01 up  1:55,  3 users,  load average: 0.12, 0.19, 0.42
System too busy for benchmarking: 06:15:06 up  1:55,  3 users,  load average: 0.11, 0.19, 0.42
System too busy for benchmarking: 06:15:11 up  1:55,  3 users,  load average: 0.10, 0.18, 0.42
System too busy for benchmarking: 06:15:16 up  1:55,  3 users,  load average: 0.10, 0.18, 0.42

sbc-bench v0.4

Installing needed tools. This may take some time... Done.
Checking cpufreq OPP... Done.
Executing tinymembench. This will take a long time... Done.
Executing OpenSSL benchmark. This will take 3 minutes... Done.
Executing 7-zip benchmark. This will take a long time... Done.
Executing cpuminer. This will take 5 minutes... Done.
Checking cpufreq OPP... Done.

Memory performance (big.LITTLE cores measured individually):
memcpy: 1922.6 MB/s (1.0%)
memset: 5917.9 MB/s (0.4%)
memcpy: 1756.5 MB/s (0.9%)
memset: 5112.5 MB/s 

Cpuminer total scores (5 minutes execution): 8.61,8.60,8.59,8.58,8.57 kH/s

7-zip total scores (3 consecutive runs): 5421,5483,5460

OpenSSL results (big.LITTLE cores measured individually):
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-128-cbc     126769.69k   374826.41k   716033.19k   955438.42k  1059151.87k  1064255.49k
aes-128-cbc      89687.37k   265021.03k   506280.02k   675587.75k   748544.00k   753303.55k
aes-192-cbc     120395.16k   331951.45k   583681.45k   736646.83k   797349.21k   800828.07k
aes-192-cbc      85160.92k   235401.39k   412691.03k   520572.25k   563516.76k   566127.27k
aes-256-cbc      92078.63k   260123.78k   468108.63k   601614.68k   655682.22k   659603.46k
aes-256-cbc      82717.74k   216091.39k   357307.82k   435378.86k   465046.19k   466780.16k

Full results uploaded to http://ix.io/1iJ7. Please check the log for anomalies (e.g. swapping
or throttling happenend) and otherwise share this URL.

You can find the full result here.

Thanks.


Thank you!

Some interesting stuff! :slight_smile:

  • The ‘big’ cluster with this kernel is cluster 0, unlike all real big.LITTLE implementations out there where the little cluster is 0. Therefore the big.LITTLE column in the sbc-bench monitoring output shows the wrong frequencies
  • Willy Tarreau’s tool again confirms that the 1512 MHz the kernel is talking about are just 1414 MHz in reality
  • tinymembench numbers for both clusters are slightly different, but that’s most probably just related to the different clockspeeds the 2 A53 clusters are allowed to run at. I wonder how it would look when doing echo 1000000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq ; taskset -c 0 /tmp/tinymembench/tinymembench (forcing the ‘big’ cluster down to 1 GHz as well and then executing tinymembench there)
  • Little throttling happened when running the 7-zip benchmark single threaded on each cluster. But since your image relies on zram for swap this should’ve not affected results
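To put a number on the gap between what cpufreq claims and what mhz measures, a minimal sketch (`clock_shortfall_pct` is a hypothetical helper name; the sysfs value is in kHz, the measured one in MHz):

```shell
# Shortfall of the measured clock relative to the value cpufreq reports.
# $1: reported frequency in kHz (as found in the cpufreq sysfs files)
# $2: measured frequency in MHz (e.g. from the mhz tool)
# Prints the shortfall in percent of the reported value.
clock_shortfall_pct() {
    awk -v khz="$1" -v mhz="$2" \
        'BEGIN { rep = khz / 1000; printf "%.1f\n", 100 * (rep - mhz) / rep }'
}

# Usage on the board:
#   clock_shortfall_pct "$(cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq)" 1414
```

For the reported 1512000 kHz against the measured 1414 MHz this prints 6.5, so the fast cluster runs roughly 6.5% slower than the kernel says.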

I added your numbers to the results. It would be very interesting to repeat the test with Debian Stretch and/or a 4.9 kernel (if that one is still in use – I have not kept track of the meson64 kernel situation)