Which is more powerful: the Mali-G52 MP4 or the Mali-T860 MP4? According to information on the Internet, they differ only in power consumption and die area. Or is that not so?
G52 has higher GFLOPs per core even in 2EE configuration. As both have 4 cores, the G52 is going to blow past the T860.
In different sources I see that both have approximately the same performance of 106-108 Gflops for float32.
I based mine on Wikipedia’s table, I don’t know any other sources.
In theory the G52 is faster, but due to memory bandwidth issues the T860 performs faster than the G52 in real-world tests done by Alyssa, who has worked on the Panfrost drivers.
It's actually 2 shaders with 3EE per core.
How many EEs per core does the VIM3's GPU actually have? Two or three? Where is this information documented?
What is the bandwidth issue? Where can I find out more about this?
Check the panfrost git to see if they have any benchmarking done.
Please refer to this thread:
But there she is talking about the S922, and the VIM3 uses the A311D.
To summarize:
According to this table, Mali (GPU) - Wikipedia, the A311D GPU delivers 40.8 GFLOPS per core, assuming one core has 3EE. According to the datasheet https://dl.khadas.com/Hardware/VIM3/Datasheet/A311D_Datasheet_01_Wesion.pdf the A311D has 2 GPU cores, for a total of 2 * 40.8 = 81.6 GFLOPS. By comparison, the Mali-T860 MP4 at 700 MHz delivers 23.8 * 4 = 95.2 GFLOPS.
Conclusion: by these figures, the Mali-T860 MP4 at 700 MHz is faster than the Mali-G52 6EE at 850 MHz.
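The arithmetic in the summary above can be checked with a minimal sketch. The per-core figures and core counts are the ones quoted in the post; the function name `total_gflops` is made up for illustration, and the result is only a theoretical FP32 peak, not a measured value.

```python
def total_gflops(per_core_gflops: float, cores: int) -> float:
    """Theoretical peak: per-core GFLOPS times the number of cores."""
    return per_core_gflops * cores

# Figures as quoted in the post (Wikipedia table + A311D datasheet core count).
g52_a311d = total_gflops(40.8, 2)   # Mali-G52, 2 cores with 3EE each, 850 MHz
t860_mp4 = total_gflops(23.8, 4)    # Mali-T860 MP4, 700 MHz

print(f"Mali-G52 (A311D): {g52_a311d:.1f} GFLOPS")
print(f"Mali-T860 MP4:    {t860_mp4:.1f} GFLOPS")
print(f"T860 / G52 ratio: {t860_mp4 / g52_a311d:.2f}")
```

On these paper numbers the T860 MP4 comes out roughly 17% ahead; the comparison says nothing about memory bandwidth or driver efficiency.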
The A311D is the same as the S922X, with an NPU and some other small changes.
Both fall under the same Meson G12B family as well.
What do these `mbw` numbers represent? -t0: memcpy() test, -t1: dumb (b[i]=a[i] style) test, -t2: memcpy() with arbitrary block size
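To make the difference between the three test types concrete, here is a rough Python sketch of the three copy styles. It is not `mbw` itself: the function names are hypothetical, and Python interpreter overhead means the absolute numbers are not comparable to `mbw` or `tinymembench` results. It only illustrates why the same memory can report very different "bandwidth" depending on how the copy is done.

```python
import time

def copy_memcpy(src: bytearray, dst: bytearray) -> None:
    """Whole-buffer copy, analogous to the -t0 memcpy() test."""
    dst[:] = src

def copy_dumb(src: bytearray, dst: bytearray) -> None:
    """Element-by-element copy, analogous to the -t1 b[i]=a[i] test."""
    for i in range(len(src)):
        dst[i] = src[i]

def copy_blocks(src: bytearray, dst: bytearray, block_size: int) -> None:
    """Copy in fixed-size chunks, analogous to -t2 with a block size."""
    for start in range(0, len(src), block_size):
        end = min(start + block_size, len(src))
        dst[start:end] = src[start:end]

def throughput_mib_s(copy_fn, size: int, *args) -> float:
    """Time one copy of `size` bytes and return MiB/s."""
    src = bytearray(b"\xa5" * size)
    dst = bytearray(size)
    t0 = time.perf_counter()
    copy_fn(src, dst, *args)
    elapsed = max(time.perf_counter() - t0, 1e-9)  # avoid division by zero
    assert dst == src  # the copy must actually have happened
    return size / (1024 * 1024) / elapsed

if __name__ == "__main__":
    size = 1024 * 1024  # 1 MiB, kept small because the dumb loop is slow in Python
    print("memcpy-style:", round(throughput_mib_s(copy_memcpy, size)), "MiB/s")
    print("dumb loop:   ", round(throughput_mib_s(copy_dumb, size)), "MiB/s")
    print("4 KiB blocks:", round(throughput_mib_s(copy_blocks, size, 4096)), "MiB/s")
```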
This is a quick search through my sbc-bench results collection. The numbers were not generated by `mbw` but by `tinymembench`.
Spoiler alert: S922/S922X numbers are much lower than A311D numbers. Guess why: some smelly boot BLOB doing DRAM initialization.
VIM3/A311D:
Kernel | Clockspeeds | memcpy | memset |
---|---|---|---|
4.9 | 2208/1800 MHz | 4600 MB/sec | 8990 MB/sec |
4.9 | 2208/1800 MHz | 4660 MB/sec | 9230 MB/sec |
4.9 | 2208/1800 MHz | 4660 MB/sec | 9280 MB/sec |
4.9 | 2208/1800 MHz | 4690 MB/sec | 9280 MB/sec |
4.9 | 2400/2100 MHz | 5080 MB/sec | 9350 MB/sec |
5.10 | 2400/2016 MHz | 4370 MB/sec | 6720 MB/sec |
5.10 | 2400/2016 MHz | 4420 MB/sec | 6640 MB/sec |
5.10 | 2400/2016 MHz | 4770 MB/sec | 6580 MB/sec |
5.10 | 2400/2016 MHz | 4770 MB/sec | 6580 MB/sec |
5.10 | 2400/2016 MHz | 4840 MB/sec | 8260 MB/sec |
5.10 | 2400/2016 MHz | 4850 MB/sec | 7370 MB/sec |
5.10 | 2400/2016 MHz | 4850 MB/sec | 7380 MB/sec |
5.10 | 2400/2016 MHz | 4850 MB/sec | 8100 MB/sec |
5.16 | 2208/1800 MHz | 5000 MB/sec | 9560 MB/sec |
5.17 | 2208/1800 MHz | 4800 MB/sec | 9330 MB/sec |
5.17 | 2208/1800 MHz | 4860 MB/sec | 9150 MB/sec |
5.18 | 2208/1800 MHz | 5000 MB/sec | 9840 MB/sec |
5.18 | 2208/1800 MHz | 5020 MB/sec | 9650 MB/sec |
5.18 | 2208/1800 MHz | 5070 MB/sec | 9460 MB/sec |
ODROID-N2/S922:
Kernel | Clockspeeds | memcpy | memset |
---|---|---|---|
5.10 | 1992/1908 MHz | 3740 MB/sec | 7500 MB/sec |
5.10 | 1992/1908 MHz | 4250 MB/sec | 9090 MB/sec |
5.10 | 1992/1908 MHz | 4260 MB/sec | 9080 MB/sec |
5.10 | 1992/1908 MHz | 4260 MB/sec | 9080 MB/sec |
5.10 | 1992/1908 MHz | 4270 MB/sec | 7670 MB/sec |
5.15 | 1908/1800 MHz | 3900 MB/sec | 7440 MB/sec |
5.15 | 1992/1908 MHz | 3910 MB/sec | 7700 MB/sec |
5.15 | 1992/1908 MHz | 3990 MB/sec | 7970 MB/sec |
5.15 | 2004/1992 MHz | 3820 MB/sec | 7790 MB/sec |
5.15 | 2004/1992 MHz | 3850 MB/sec | 7630 MB/sec |
5.15 | 2004/1992 MHz | 3850 MB/sec | 7710 MB/sec |
5.17 | 1992/1908 MHz | 4190 MB/sec | 8690 MB/sec |
ODROID-N2/S922-X:
Kernel | Clockspeeds | memcpy | memset |
---|---|---|---|
4.9 | 2400/2016 MHz | 3850 MB/sec | 5970 MB/sec |
5.10 | 2400/2016 MHz | 3770 MB/sec | 7610 MB/sec |
5.10 | 2400/2016 MHz | 3770 MB/sec | 7620 MB/sec |
5.10 | 2400/2016 MHz | 3910 MB/sec | 7220 MB/sec |
5.10 | 2400/2016 MHz | 3980 MB/sec | 7670 MB/sec |
5.10 | 2400/2016 MHz | 3990 MB/sec | 7460 MB/sec |
5.10 | 2400/2016 MHz | 4000 MB/sec | 6980 MB/sec |
5.10 | 2400/2016 MHz | 4000 MB/sec | 7030 MB/sec |
5.10 | 2400/2016 MHz | 4020 MB/sec | 7140 MB/sec |
5.10 | 2400/2016 MHz | 4020 MB/sec | 7320 MB/sec |
5.10 | 2400/2016 MHz | 4030 MB/sec | 7120 MB/sec |
5.10 | 2400/2016 MHz | 4030 MB/sec | 7690 MB/sec |
5.10 | 2400/2016 MHz | 4030 MB/sec | 7690 MB/sec |
5.10 | 2400/2016 MHz | 4070 MB/sec | 7220 MB/sec |
5.10 | 2400/2016 MHz | 4090 MB/sec | 7170 MB/sec |
5.10 | 2400/2016 MHz | 4140 MB/sec | 7410 MB/sec |
5.10 | 2400/2016 MHz | 4140 MB/sec | 7710 MB/sec |
5.10 | 2400/2016 MHz | 4160 MB/sec | 7680 MB/sec |
5.10 | 2400/2016 MHz | 4180 MB/sec | 7700 MB/sec |
5.10 | 2400/2016 MHz | 4190 MB/sec | 7690 MB/sec |
5.10 | 2400/2016 MHz | 4200 MB/sec | 7680 MB/sec |
5.10 | 2400/2016 MHz | 4210 MB/sec | 7730 MB/sec |
5.10 | 2400/2016 MHz | 4220 MB/sec | 7730 MB/sec |
5.10 | 2400/2016 MHz | 4220 MB/sec | 7730 MB/sec |
5.10 | 2400/2016 MHz | 4240 MB/sec | 7740 MB/sec |
5.10 | 2400/2016 MHz | 4290 MB/sec | 7730 MB/sec |
5.14 | 2400/2016 MHz | 4030 MB/sec | 7120 MB/sec |
5.15 | 2400/2016 MHz | 4000 MB/sec | 7660 MB/sec |
5.15 | 2400/2016 MHz | 4010 MB/sec | 7680 MB/sec |
5.15 | 2400/2016 MHz | 4030 MB/sec | 7700 MB/sec |
5.15 | 2400/2016 MHz | 4040 MB/sec | 7680 MB/sec |
5.15 | 2400/2016 MHz | 4100 MB/sec | 7730 MB/sec |
5.15 | 2400/2016 MHz | 4140 MB/sec | 7720 MB/sec |
5.16 | 2400/2016 MHz | 3960 MB/sec | 7610 MB/sec |
5.16 | 2400/2016 MHz | 4160 MB/sec | 7460 MB/sec |
5.16 | 2400/2016 MHz | 4190 MB/sec | 7470 MB/sec |
5.16 | 2400/2016 MHz | 4200 MB/sec | 7470 MB/sec |
5.16 | 2400/2016 MHz | 4200 MB/sec | 7470 MB/sec |
5.16 | 2400/2016 MHz | 4200 MB/sec | 7480 MB/sec |
5.16 | 2400/2016 MHz | 4210 MB/sec | 7410 MB/sec |
5.16 | 2400/2016 MHz | 4210 MB/sec | 7420 MB/sec |
5.16 | 2400/2016 MHz | 4210 MB/sec | 7460 MB/sec |
5.16 | 2400/2016 MHz | 4210 MB/sec | 7470 MB/sec |
5.16 | 2400/2016 MHz | 4210 MB/sec | 7480 MB/sec |
5.16 | 2400/2016 MHz | 4210 MB/sec | 7480 MB/sec |
5.16 | 2400/2016 MHz | 4210 MB/sec | 7480 MB/sec |
5.16 | 2400/2016 MHz | 4220 MB/sec | 7450 MB/sec |
5.16 | 2400/2016 MHz | 4220 MB/sec | 7460 MB/sec |
5.16 | 2400/2016 MHz | 4220 MB/sec | 7460 MB/sec |
5.17 | 2400/2016 MHz | 4020 MB/sec | 7690 MB/sec |
But 'measured in the wild', at least with `tinymembench`, the A311D shows much higher memory bandwidth than the S922X. Below I take the highest scores from above and compare the `mbw` numbers given by Alyssa with the `tinymembench` memcpy values.
SoC | Clockspeeds | tinymembench | mbw |
---|---|---|---|
A311D | 2210 MHz | 5070 MB/s | ? |
S922X | ? MHz | 4220 MB/s | 4.8 GiB/s |
RK3399 | ? MHz | 3700 MB/s | 6.6 GiB/s |
M1 Pro | 3000 MHz | 27000 MB/s | 30.2 GiB/s |
Does anybody have a clue what exactly Alyssa measured (the `mbw` command line)? The numbers look really weird, giving the old and boring RK3399 almost 40% higher memory bandwidth than the S922X (which should itself be slower here than the A311D).
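The comparison table mixes MB/s and GiB/s, so a small sketch helps put both columns on the same scale. It assumes MB means 10^6 bytes and GiB means 2^30 bytes (an assumption about how the two tools label their output), and uses only the rows where both values are given; the helper names are made up for illustration.

```python
MB = 10**6    # assumed meaning of "MB" in the tinymembench column
GIB = 2**30   # assumed meaning of "GiB" in the mbw column

# (tinymembench memcpy in MB/s, mbw in GiB/s), copied from the table above.
results = {
    "S922X": (4220, 4.8),
    "RK3399": (3700, 6.6),
    "M1 Pro": (27000, 30.2),
}

def gib_to_mb(gib_per_s: float) -> float:
    """Convert GiB/s to MB/s under the assumptions above."""
    return gib_per_s * GIB / MB

def mbw_over_tinymembench(soc: str) -> float:
    """How many times larger the mbw figure is than the tinymembench one."""
    tiny_mb, mbw_gib = results[soc]
    return gib_to_mb(mbw_gib) / tiny_mb

for soc, (tiny_mb, mbw_gib) in results.items():
    print(f"{soc:7s} tinymembench {tiny_mb:6d} MB/s | "
          f"mbw {gib_to_mb(mbw_gib):8.0f} MB/s | "
          f"ratio {mbw_over_tinymembench(soc):.2f}")

# The "almost 40%" remark: RK3399 vs S922X in the mbw column.
rk_vs_s922x = results["RK3399"][1] / results["S922X"][1] - 1
print(f"RK3399 over S922X (mbw): {rk_vs_s922x:.1%}")
```

Under these assumptions the `mbw` figure is roughly 1.2 times the `tinymembench` value for the S922X and the M1 Pro, but about 1.9 times for the RK3399, which is one way to see why the RK3399 number looks out of line.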