Which is more powerful: Mali-G52 MP4 or Mali-T860 MP4?

Which is more powerful, the Mali-G52 MP4 or the Mali-T860 MP4? According to information on the Internet, they differ only in power consumption and chip area. Or is that not so?

The G52 has higher GFLOPS per core even in the 2EE configuration. As both have 4 cores, the G52 is going to blow past the T860.

In different sources I see that both have approximately the same performance of 106-108 GFLOPS for float32.

I based mine on Wikipedia's table; I don't know of any other sources.

In theory the G52 is faster, but due to memory bandwidth issues the T860 performs faster than the G52 in real-world tests done by Alyssa, who has worked on the Panfrost drivers.

It's actually 2 shaders with 3EE per core 🙂

How many EEs per core does the VIM3's GPU actually have? Two or three? Where is this information documented?

What is the bandwidth issue? Where can I find out more about this?

Check the Panfrost git repository to see whether they have done any benchmarking.

Please refer to this thread:

But there she is talking about the S922, and the VIM3 uses the A311D.

To summarize:
According to this table Mali (GPU) - Wikipedia, the A311D GPU has a performance of 40.8 GFLOPS per core, assuming one core has 3EE. According to the datasheet https://dl.khadas.com/Hardware/VIM3/Datasheet/A311D_Datasheet_01_Wesion.pdf the A311D has 2 GPU cores, for a total of 2 * 40.8 = 81.6 GFLOPS. By comparison, the performance of the Mali-T860 MP4 at 700 MHz is 23.8 * 4 = 95.2 GFLOPS.

Conclusion: on paper, the Mali-T860 MP4 at 700 MHz is faster than the Mali-G52 6EE at 850 MHz.
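For anyone who wants to redo the arithmetic with other core counts or clocks, here is a minimal sketch in Python. The per-core figures are the ones quoted above from the Wikipedia table; the function name `peak_gflops` is made up for illustration, and the result is only a theoretical peak.

```python
def peak_gflops(per_core_gflops, cores):
    """Theoretical FP32 peak: per-core GFLOPS times the number of cores."""
    return per_core_gflops * cores

# Figures quoted in this thread (Wikipedia table plus the A311D datasheet core count).
g52_a311d = peak_gflops(40.8, 2)   # Mali-G52, 2 cores with 3EE each, 850 MHz
t860_mp4 = peak_gflops(23.8, 4)    # Mali-T860 MP4, 700 MHz

print(f"Mali-G52 (A311D): {g52_a311d:.1f} GFLOPS")
print(f"Mali-T860 MP4:    {t860_mp4:.1f} GFLOPS")
print(f"T860 / G52 ratio: {t860_mp4 / g52_a311d:.2f}")
```

Keep in mind that these are paper numbers only. As mentioned earlier in the thread, memory bandwidth and driver maturity can change the real-world ranking.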

The A311D is the same as the S922X, with an NPU and some other small changes.
Both fall under the same Meson G12B family as well.

What do these mbw numbers represent? -t0: memcpy() test, -t1: dumb (b[i]=a[i] style) test, -t2: memcpy() with arbitrary block size
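To make the three copy styles in that question concrete, here is a rough sketch in Python. It is not mbw's code and its numbers will not match mbw's (the interpreter overhead dominates the element-wise loop); all function names are made up for illustration. It only shows the difference between a bulk copy, a `b[i] = a[i]` loop, and a bulk copy in fixed-size blocks.

```python
import time

def copy_memcpy(dst, src):
    # Whole-buffer copy in one slice assignment (a bulk copy inside the interpreter).
    dst[:] = src

def copy_dumb(dst, src):
    # Element-by-element copy, the b[i] = a[i] style.
    for i in range(len(src)):
        dst[i] = src[i]

def copy_blocks(dst, src, block_size):
    # Bulk copy, but split into fixed-size blocks.
    view = memoryview(src)
    for start in range(0, len(src), block_size):
        end = min(start + block_size, len(src))
        dst[start:end] = view[start:end]

def bandwidth_mib_per_s(copy_fn, size_bytes, repeats=3, **kwargs):
    """Best-of-N copy bandwidth in MiB/s for the given copy function."""
    src = bytearray(b"\xaa" * size_bytes)
    dst = bytearray(size_bytes)
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        copy_fn(dst, src, **kwargs)
        best = min(best, time.perf_counter() - t0)
    assert dst == src  # sanity check: the copy really happened
    return size_bytes / max(best, 1e-9) / 2**20

if __name__ == "__main__":
    size = 8 * 2**20  # 8 MiB, small enough for the slow loop to finish quickly
    print("memcpy-style:", round(bandwidth_mib_per_s(copy_memcpy, size)), "MiB/s")
    print("dumb loop:   ", round(bandwidth_mib_per_s(copy_dumb, size)), "MiB/s")
    print("block copy:  ", round(bandwidth_mib_per_s(copy_blocks, size, block_size=4096)), "MiB/s")
```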

This is a quick search through my sbc-bench results collection. The numbers were generated not by mbw but by tinymembench.

Spoiler alert: S922/S922X numbers are much lower than A311D numbers. Guess why: some smelly boot BLOB doing DRAM initialization.

VIM3/A311D:

| Kernel | Clockspeeds | memcpy | memset |
|---|---|---|---|
| 4.9 | 2208/1800 MHz | 4600 MB/sec | 8990 MB/sec |
| 4.9 | 2208/1800 MHz | 4660 MB/sec | 9230 MB/sec |
| 4.9 | 2208/1800 MHz | 4660 MB/sec | 9280 MB/sec |
| 4.9 | 2208/1800 MHz | 4690 MB/sec | 9280 MB/sec |
| 4.9 | 2400/2100 MHz | 5080 MB/sec | 9350 MB/sec |
| 5.10 | 2400/2016 MHz | 4370 MB/sec | 6720 MB/sec |
| 5.10 | 2400/2016 MHz | 4420 MB/sec | 6640 MB/sec |
| 5.10 | 2400/2016 MHz | 4770 MB/sec | 6580 MB/sec |
| 5.10 | 2400/2016 MHz | 4770 MB/sec | 6580 MB/sec |
| 5.10 | 2400/2016 MHz | 4840 MB/sec | 8260 MB/sec |
| 5.10 | 2400/2016 MHz | 4850 MB/sec | 7370 MB/sec |
| 5.10 | 2400/2016 MHz | 4850 MB/sec | 7380 MB/sec |
| 5.10 | 2400/2016 MHz | 4850 MB/sec | 8100 MB/sec |
| 5.16 | 2208/1800 MHz | 5000 MB/sec | 9560 MB/sec |
| 5.17 | 2208/1800 MHz | 4800 MB/sec | 9330 MB/sec |
| 5.17 | 2208/1800 MHz | 4860 MB/sec | 9150 MB/sec |
| 5.18 | 2208/1800 MHz | 5000 MB/sec | 9840 MB/sec |
| 5.18 | 2208/1800 MHz | 5020 MB/sec | 9650 MB/sec |
| 5.18 | 2208/1800 MHz | 5070 MB/sec | 9460 MB/sec |

ODROID-N2/S922:

| Kernel | Clockspeeds | memcpy | memset |
|---|---|---|---|
| 5.10 | 1992/1908 MHz | 3740 MB/sec | 7500 MB/sec |
| 5.10 | 1992/1908 MHz | 4250 MB/sec | 9090 MB/sec |
| 5.10 | 1992/1908 MHz | 4260 MB/sec | 9080 MB/sec |
| 5.10 | 1992/1908 MHz | 4260 MB/sec | 9080 MB/sec |
| 5.10 | 1992/1908 MHz | 4270 MB/sec | 7670 MB/sec |
| 5.15 | 1908/1800 MHz | 3900 MB/sec | 7440 MB/sec |
| 5.15 | 1992/1908 MHz | 3910 MB/sec | 7700 MB/sec |
| 5.15 | 1992/1908 MHz | 3990 MB/sec | 7970 MB/sec |
| 5.15 | 2004/1992 MHz | 3820 MB/sec | 7790 MB/sec |
| 5.15 | 2004/1992 MHz | 3850 MB/sec | 7630 MB/sec |
| 5.15 | 2004/1992 MHz | 3850 MB/sec | 7710 MB/sec |
| 5.17 | 1992/1908 MHz | 4190 MB/sec | 8690 MB/sec |

ODROID-N2/S922-X:

| Kernel | Clockspeeds | memcpy | memset |
|---|---|---|---|
| 4.9 | 2400/2016 MHz | 3850 MB/sec | 5970 MB/sec |
| 5.10 | 2400/2016 MHz | 3770 MB/sec | 7610 MB/sec |
| 5.10 | 2400/2016 MHz | 3770 MB/sec | 7620 MB/sec |
| 5.10 | 2400/2016 MHz | 3910 MB/sec | 7220 MB/sec |
| 5.10 | 2400/2016 MHz | 3980 MB/sec | 7670 MB/sec |
| 5.10 | 2400/2016 MHz | 3990 MB/sec | 7460 MB/sec |
| 5.10 | 2400/2016 MHz | 4000 MB/sec | 6980 MB/sec |
| 5.10 | 2400/2016 MHz | 4000 MB/sec | 7030 MB/sec |
| 5.10 | 2400/2016 MHz | 4020 MB/sec | 7140 MB/sec |
| 5.10 | 2400/2016 MHz | 4020 MB/sec | 7320 MB/sec |
| 5.10 | 2400/2016 MHz | 4030 MB/sec | 7120 MB/sec |
| 5.10 | 2400/2016 MHz | 4030 MB/sec | 7690 MB/sec |
| 5.10 | 2400/2016 MHz | 4030 MB/sec | 7690 MB/sec |
| 5.10 | 2400/2016 MHz | 4070 MB/sec | 7220 MB/sec |
| 5.10 | 2400/2016 MHz | 4090 MB/sec | 7170 MB/sec |
| 5.10 | 2400/2016 MHz | 4140 MB/sec | 7410 MB/sec |
| 5.10 | 2400/2016 MHz | 4140 MB/sec | 7710 MB/sec |
| 5.10 | 2400/2016 MHz | 4160 MB/sec | 7680 MB/sec |
| 5.10 | 2400/2016 MHz | 4180 MB/sec | 7700 MB/sec |
| 5.10 | 2400/2016 MHz | 4190 MB/sec | 7690 MB/sec |
| 5.10 | 2400/2016 MHz | 4200 MB/sec | 7680 MB/sec |
| 5.10 | 2400/2016 MHz | 4210 MB/sec | 7730 MB/sec |
| 5.10 | 2400/2016 MHz | 4220 MB/sec | 7730 MB/sec |
| 5.10 | 2400/2016 MHz | 4220 MB/sec | 7730 MB/sec |
| 5.10 | 2400/2016 MHz | 4240 MB/sec | 7740 MB/sec |
| 5.10 | 2400/2016 MHz | 4290 MB/sec | 7730 MB/sec |
| 5.14 | 2400/2016 MHz | 4030 MB/sec | 7120 MB/sec |
| 5.15 | 2400/2016 MHz | 4000 MB/sec | 7660 MB/sec |
| 5.15 | 2400/2016 MHz | 4010 MB/sec | 7680 MB/sec |
| 5.15 | 2400/2016 MHz | 4030 MB/sec | 7700 MB/sec |
| 5.15 | 2400/2016 MHz | 4040 MB/sec | 7680 MB/sec |
| 5.15 | 2400/2016 MHz | 4100 MB/sec | 7730 MB/sec |
| 5.15 | 2400/2016 MHz | 4140 MB/sec | 7720 MB/sec |
| 5.16 | 2400/2016 MHz | 3960 MB/sec | 7610 MB/sec |
| 5.16 | 2400/2016 MHz | 4160 MB/sec | 7460 MB/sec |
| 5.16 | 2400/2016 MHz | 4190 MB/sec | 7470 MB/sec |
| 5.16 | 2400/2016 MHz | 4200 MB/sec | 7470 MB/sec |
| 5.16 | 2400/2016 MHz | 4200 MB/sec | 7470 MB/sec |
| 5.16 | 2400/2016 MHz | 4200 MB/sec | 7480 MB/sec |
| 5.16 | 2400/2016 MHz | 4210 MB/sec | 7410 MB/sec |
| 5.16 | 2400/2016 MHz | 4210 MB/sec | 7420 MB/sec |
| 5.16 | 2400/2016 MHz | 4210 MB/sec | 7460 MB/sec |
| 5.16 | 2400/2016 MHz | 4210 MB/sec | 7470 MB/sec |
| 5.16 | 2400/2016 MHz | 4210 MB/sec | 7480 MB/sec |
| 5.16 | 2400/2016 MHz | 4210 MB/sec | 7480 MB/sec |
| 5.16 | 2400/2016 MHz | 4210 MB/sec | 7480 MB/sec |
| 5.16 | 2400/2016 MHz | 4220 MB/sec | 7450 MB/sec |
| 5.16 | 2400/2016 MHz | 4220 MB/sec | 7460 MB/sec |
| 5.16 | 2400/2016 MHz | 4220 MB/sec | 7460 MB/sec |
| 5.17 | 2400/2016 MHz | 4020 MB/sec | 7690 MB/sec |

But "measured in the wild", at least with tinymembench, the A311D shows a lot higher memory bandwidth than the S922X. The following takes the highest scores from above and compares the mbw numbers given by Alyssa with the tinymembench memcpy values.

| SoC | Clockspeeds | tinymembench | mbw |
|---|---|---|---|
| A311D | 2210 MHz | 5070 MB/s | ? |
| S922X | ? MHz | 4220 MB/s | 4.8 GiB/s |
| RK3399 | ? MHz | 3700 MB/s | 6.6 GiB/s |
| M1 Pro | 3000 MHz | 27000 MB/s | 30.2 GiB/s |

Does anybody have a clue what exactly Alyssa measured (mbw command line)? The numbers look really weird, giving the old and boring RK3399 almost 40% higher memory bandwidth than the S922X (which in turn should be slower here than the A311D).
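Since the two columns use different units, a quick sanity check helps. The sketch below converts the mbw figures to MB/s and compares them with the tinymembench memcpy values from the table. It assumes that tinymembench's MB means 10^6 bytes and that GiB means 2^30 bytes; the names are made up for illustration.

```python
MB = 10**6    # assumption: tinymembench "MB" is 10^6 bytes
GIB = 2**30   # assumption: "GiB" in the mbw column is 2^30 bytes

# (tinymembench memcpy in MB/s, mbw in GiB/s), copied from the table above.
results = {
    "S922X": (4220, 4.8),
    "RK3399": (3700, 6.6),
    "M1 Pro": (27000, 30.2),
}

def gib_to_mb_per_s(gib_per_s):
    """Convert GiB/s to MB/s under the unit assumptions above."""
    return gib_per_s * GIB / MB

def mbw_over_tinymembench(soc):
    """How many times larger the mbw figure is than the tinymembench one."""
    tiny_mb, mbw_gib = results[soc]
    return gib_to_mb_per_s(mbw_gib) / tiny_mb

rk_vs_s922x = results["RK3399"][1] / results["S922X"][1]
print(f"RK3399 vs S922X in mbw: {rk_vs_s922x:.3f}x")
for soc in results:
    print(f"{soc}: mbw / tinymembench = {mbw_over_tinymembench(soc):.2f}")
```

The first ratio works out to 1.375, which is the "almost 40%" above. Under these unit assumptions the mbw figure is roughly 1.2 times the tinymembench one for the S922X and the M1 Pro, but about 1.9 times for the RK3399, so the RK3399 mbw number is the one that stands out.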

Her archived Twitter messages.