S912 limited to 1200 MHz with multithreaded loads


#21

Did a reboot, no change in results (10s “constant”). Did observe that the temperature barely moves, FWIW. Shut down the desktop, same result (although I couldn’t watch gkrellm). Did put a stopwatch on part of the test - the results lines really do appear every 10 seconds!


#22

Gee numbqq, how come you can get the right answer and I can’t!! Fortunately g4b42 has joined me in the idiots corner so I don’t feel so stupid!! :grinning:


#23

Hi dukla2000,

Have you tried this script provided by @tkaiser ?

#!/bin/bash
echo performance >/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
echo performance >/sys/devices/system/cpu/cpu4/cpufreq/scaling_governor
for o in 1 4 8 ; do
	for i in $(cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies) ; do
		echo $i >/sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq
		echo $i >/sys/devices/system/cpu/cpu4/cpufreq/scaling_max_freq 2>/dev/null
		case $o in
			1)
				TasksetParm="-c 0"
				;;
			4)
				TasksetParm="-c 0-3"
				;;
			*)
				TasksetParm="-c 0-7"
				;;
		esac
		echo -e "$o cores, $(( $i / 1000)) MHz: \c"
		taskset ${TasksetParm} sysbench --test=cpu --cpu-max-prime=20000 run --num-threads=$o 2>&1 | grep 'execution time'
		cat /sys/devices/virtual/thermal/thermal_zone0/temp
	done
done

#24

Wow, that makes a real difference since now cpufreq settings and reality match very closely:

fake      1        4       8
 100      96      97      97
 250     247     247     247
 500     498     498     498
1000    1000    1000    1000
1512    1417    1417    1216

So we have only the mismatch between 1512 MHz and the ~1420 MHz in reality and when running on all 8 cores the results pretty much describe a system running with 4 cores at 1.0 GHz and 4 at 1.4 GHz.
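As a rough sanity check on that reading of the 8-core column, the mean of the two cluster speeds can be compared with the 8-thread figure. This is only a sketch: it assumes sysbench spreads its work evenly over all eight cores, which nothing in this thread confirms, and the variable names are made up for illustration.

```shell
#!/bin/bash
# Values taken from the table above (MHz).
big_mhz=1417      # measured on 1 and 4 threads at the 1512 MHz setting
little_mhz=1000   # assumed ceiling of the other four cores (1.0 GHz)
measured_8=1216   # 8-thread result at the 1512 MHz setting

# Assumption: the load is split evenly between both clusters,
# so the "effective" clock is the plain average of all eight cores.
expected_8=$(( (4 * big_mhz + 4 * little_mhz) / 8 ))
echo "expected: ${expected_8} MHz, measured: ${measured_8} MHz"
```

The two numbers land within about 10 MHz of each other, which fits the "4 cores at 1.0 GHz and 4 at 1.4 GHz" picture.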

I just totally fail to understand what’s happening, since others are reporting totally weird results. Kernel, ATF and bl30.bin version could matter, and also HMP settings in the kernel config, since with fixed CPU affinity at least your board starts to behave ‘sane’…


#25

? sysbench version? I have

$ sysbench -i
sysbench 1.0.8 (using system LuaJIT 2.0.4)

#26

Oh yes, sysbench version is the most important factor, and also the compiler version the binary has been built with (related to distro). All my testing so far happened with sysbench 0.4.2, which behaves somewhat consistently wrt thread count and so on (sysbench 0.5, which I compiled myself last year, showed lower numbers, and sysbench 1.0.8 is obviously broken).


#27

sysbench version 1.0.14 here


#28

Can one of you guys please provide full output? Obviously the measurement mode has changed: tests now seem to be limited to 10 seconds of execution and then report a performance indicator or something like that (while with earlier versions only execution time was interesting).
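If the newer releases really do stop after a fixed 10 seconds, something like the following might make runs comparable again. A sketch only: it assumes sysbench 1.0.x, where to my knowledge `--threads`, `--events` and `--time` replaced `--num-threads`, `--max-requests` and `--max-time`, and where `--time=0` removes the time limit. The event count of 10000 is assumed to match the old default, and the helper name `sysbench_cpu_cmd` is made up.

```shell
#!/bin/bash
# Hypothetical helper: print a CPU test command line suited to the
# sysbench generation given as $1, using $2 threads.
sysbench_cpu_cmd() {
	local version=$1 threads=$2
	case $version in
		0.*)
			# Old syntax, as used in the script earlier in this thread.
			echo "sysbench --test=cpu --cpu-max-prime=20000 run --num-threads=${threads}"
			;;
		*)
			# Assumed 1.0.x syntax: fixed event count, no time limit.
			echo "sysbench cpu --cpu-max-prime=20000 --threads=${threads} --events=10000 --time=0 run"
			;;
	esac
}

# Show the command for two of the versions mentioned in this thread.
sysbench_cpu_cmd 0.4.12 8
sysbench_cpu_cmd 1.0.8 8
```

The function only prints the command, so it can be checked by eye before running it on a board.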


#29

root@amlogic:~# cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=16.04
DISTRIB_CODENAME=xenial
DISTRIB_DESCRIPTION="Ubuntu 16.04.4 LTS"
root@amlogic:~# 
root@amlogic:~# sysbench --version
sysbench 0.4.12
root@amlogic:~# 

#30

# sysbench --test=cpu --cpu-max-prime=20000 run --num-threads=8 2
WARNING: the --test option is deprecated. You can pass a script name or path on the command line without any options.
Unrecognized command line argument: 2

have to leave, back later


#31

You forget that a lot of time has passed since then. The system has changed. The kernel configuration has changed, and the options responsible for the correct operation of big.LITTLE may not have been enabled previously for the number of cores used. The kernel sources have changed. The compiler used for the build has changed. Many utilities and their settings have changed (different sources, patches, build options, etc.). The dtb files have changed.

If you want to get the right results, you should use a single image (system) with all settings.


#32

Sorry, it’s about the simple issue whether the cpufreq code living in the kernel controls CPU clockspeeds or something else. With Raspberry Pi and Amlogic it’s something else while on most other SoCs it’s the kernel controlling clockspeeds.

If I adjust /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq with the performance cpufreq governor and set 1512 MHz, then this should happen, especially if the kernel reports having done this. There are some situations where the kernel might disagree (throttling), but then querying /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq should return the real clockspeed and not just some bogus number, as is always the case with Amlogic SoCs (S905 on ODROID-C2 being the only exception since Hardkernel got a special BLOB from Amlogic allowing real clockspeed control).
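The values the kernel claims can be dumped like this and then held against what a benchmark suggests. A sketch, assuming the standard cpufreq sysfs layout used in the script earlier in this thread; `report_cpufreq` is a made-up helper name, and on Amlogic the numbers it prints are exactly the ones in question, not a measurement.

```shell
#!/bin/bash
# Hypothetical helper: print what the kernel claims for one cpufreq policy.
# $1 is the cpufreq directory, e.g. /sys/devices/system/cpu/cpu0/cpufreq
report_cpufreq() {
	local dir=$1 f val
	echo "governor: $(cat "$dir/scaling_governor")"
	for f in scaling_max_freq scaling_cur_freq; do
		val=$(cat "$dir/$f")            # sysfs reports kHz
		echo "$f: $(( val / 1000 )) MHz"
	done
}

# Only query real sysfs if it exists on this machine.
if [ -r /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq ]; then
	report_cpufreq /sys/devices/system/cpu/cpu0/cpufreq
fi
```

Passing the directory as an argument keeps the same function usable for cpu4, the second cluster.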

We’re dealing here with a platform implementing bogus cpufreq adjustment. Isn’t there a Cortex-M core inside the SoC dealing with this stuff?

At least on the Raspberries the real cpufreq scaling happens solely on the VideoCore implemented in the proprietary ThreadX RTOS and they just like Amlogic chose to return bogus values to the kernel. Their excuse can be read here: https://github.com/raspberrypi/linux/issues/2512#issuecomment-382703153


#33

btw, I’ve been wondering if there’s any way to control CPU DVFS on amlogic SoCs without passing through the SCPI firmware… (maybe not since there is an internal PMIC)


#34

According to http://events17.linuxfoundation.org/sites/events/files/slides/elcna-2017-amlogic.pdf there’s an SCM firmware running on an embedded Cortex-M3 and all communication goes through a Mailbox interface. So I would assume it’s not only about ATF but also about the ‘firmware’ loaded on the M3 core…

Maybe @narmstrong knows a bit more?


#35

it’s not clear to me where this M3’s firmware comes from, if it’s loaded by BL2/ATF or is present in maskrom. feedback from @narmstrong would indeed be great


#36

According to the Libre Computer guys it’s part of a BLOB, see 2nd post https://forum.armbian.com/topic/7042-s905912-faulty-clockspeed-arm-trusted-firmware/ (S905X is also affected and based on similar tests a few months ago the 1512 MHz there are also lower in reality)


#37

The first 4 cores are limited to 1512MHz, and the 4 last cores are limited to 1GHz.

And yes, you can only control DVFS using SCPI since it’s in control of the M3 co-processor.

The logic is in the M3 firmware. The DVFS tables are built with U-Boot and loaded by the ATF firmware, but you won’t be able to go beyond these frequencies.

If you run code across all 8 cores, you won’t get max performance since 4 of them are limited to 1GHz.


#38

We do not even reach the frequencies defined by the DVFS table. That’s my only problem. With current kernel/ATF combination the ‘big’ cores max out at 1416 MHz. And I wonder which component is responsible for this.


#39

is that firmware signed using an amlogic private key ? (ie. to which extent could it be modified, even if it’s just by poking around through reverse engineering)


#40

BTW: With S905X and ‘default’ bl30.bin BLOB this SoC is limited to 1200 MHz (while reporting 1512 MHz via /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq) and only with a modified BLOB it is able to reach higher clockspeeds (~1470MHz while still reporting 1512 MHz via sysfs): https://forum.armbian.com/topic/5570-some-basic-benchmarks-for-le-potato/

IMO a pretty annoying situation wrt Amlogic SoCs when we can neither trust the reported clockspeeds nor set them like we want.