I’m planning to send this to the kernel: while it is 100% a ‘hack’, it also seems to be needed, and it is the only change I’ve found that solves the stall issues. Amlogic does something similar in the vendor kernel, and various experiments with removing fewer OPP points, boosting voltages, etc. never seemed to work.
I’d noticed the stalls, while a retro-gaming derivative of the distro I work on (LibreELEC) wasn’t seeing them despite using the same kernel sources. One of the consistent differences is that the gaming folks force the performance governor on the CPU/GPU, so I tried that change and the stalls stopped. That prompted some device-tree research in the vendor kernel, where I noticed Amlogic had deleted the 100/250MHz nodes, and then I saw another distro that uses the vendor kernel deleting the 500/667MHz nodes. Testing with those removed from the upstream kernel gave the same (no stalls) result. One of the upstream kernel maintainers suggested CPU OPP voltage tweaks (another thing the Amlogic sources fiddle with), but those never resolved the issue. So the working change was found through a typical combination of luck, educated guesswork, research, and a lot of trial-and-error testing. I’d love to have a more impressive story about analysis and diagnostics, but I’m not a coding developer, so I have to fall back on the old-school engineering method: test incremental changes and observe the system to see if you can identify differences.
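
For anyone wanting to repeat the governor test before touching the device-tree, here is a minimal sketch of forcing the performance governor from userspace. It is not what the gaming distro actually ships. It assumes the standard Linux cpufreq and devfreq sysfs layouts, and that the GPU driver on your board registers a devfreq device at all (if it doesn't, only the CPU policies are changed). The function name `force_governor` and the `sysfs_root` parameter are hypothetical, added only so the sketch can be pointed at a test directory.

```python
from pathlib import Path


def force_governor(governor="performance", sysfs_root="/sys"):
    """Write `governor` to every cpufreq policy and devfreq device found.

    Returns the list of sysfs files that were updated. Needs root on a
    real system; files that cannot be written are reported and skipped.
    """
    root = Path(sysfs_root)
    # CPU frequency policies (standard cpufreq sysfs layout).
    targets = sorted(root.glob("devices/system/cpu/cpufreq/policy*/scaling_governor"))
    # devfreq devices, e.g. a GPU (assumption: the GPU driver uses devfreq).
    targets += sorted(root.glob("class/devfreq/*/governor"))

    changed = []
    for path in targets:
        try:
            path.write_text(governor)
        except OSError as err:
            print(f"skipped {path}: {err}")
            continue
        changed.append(str(path))
    return changed
```

Calling `force_governor()` as root and then re-running whatever workload triggers the stalls gives the same comparison described above. The setting does not survive a reboot, which makes it a low-risk first test.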