I forgot to update this post: I compared the required module versions against my currently installed versions, updated all of them, and made progress with the install (still about 5 hours to go for the libraries). I would still like to connect with other users who are trying to use the NPU and related processes. Thanks!
Just to add that I was able to get it all running; the problem was corrupted files in the download. I have a good set of instructions for getting it working, but I'm still surprised there aren't more people here using the NPU for AI modeling and queries. Not enough interest to get a topic set up?
I’m very interested in running LLMs locally on RK3588s. Currently I’m trying online models and frameworks (LangChain, Whisper, OpenAI), and I want to play with local models as well.
Cool! I have my LLM installed locally and modded to run off a 128 GB SD card; I followed these instructions with some minor tweaks. Let me know if you get it going!
Hey, thanks for the guides @TechnoTarzan! I have a couple of Edge2s and was looking at getting them running through Docker containers. Does your setup leverage the NPU? Do you see any performance improvements? Any chance you could share your numbers here?
Greets @goosems! I haven’t done any benchmarking per se, just explored different ways to integrate the NPU. I have found some links that helped me a lot, and I'm surprised there aren’t more Edge users taking advantage of this incredible feature. I guess I can establish a baseline for where I am now and tweak from there to see if I get faster queries; let me look into it. Here are the links I used to create my setup… Be warned, it’s a lot of reading, but it's VERY specific and filled in a lot of gaps in my knowledge base…
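For a quick-and-dirty baseline, wall-clock timing of a query and a rough tokens-per-second figure is usually enough to compare before/after tweaks. Here's a minimal sketch; `generate` is a placeholder for whatever inference call your setup actually exposes, and the whitespace-split token count is only an approximation:

```python
import time

def benchmark(generate, prompt, runs=3):
    """Time a text-generation callable and return average tokens/sec.

    `generate` is a stand-in for the real inference call (e.g. a
    llama.cpp or RKNN wrapper); it should return the generated text.
    """
    tps_samples = []
    for _ in range(runs):
        start = time.perf_counter()
        output = generate(prompt)
        elapsed = time.perf_counter() - start
        # Rough token count: whitespace-split words as a proxy.
        tokens = len(output.split())
        tps_samples.append(tokens / elapsed)
    return sum(tps_samples) / len(tps_samples)

# Dummy generator standing in for the real model call:
dummy = lambda prompt: "word " * 100
print(f"{benchmark(dummy, 'hello'):.1f} tokens/sec")
```

Run the same prompt set before and after each change (NPU offload on/off, quantization level, etc.) so the numbers are comparable.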
Keep us posted on your progress! You’re one of the only people on this forum actually focusing on this part of the Edge! Cheers!!!
Hey, thanks for getting back so quickly @TechnoTarzan!
Also, thanks for the resources here, they really help. I’m surprised as well, since model sizes are coming down fast and I’ve seen some efforts to get things running on Pis. I’ll have a go at setting up the demos and playing around, and I’ll definitely post updates on my progress!
Hey @TechnoTarzan, I haven’t had much time to dive in deep and my C++ knowledge is limited, but I found this fork of llama.cpp that someone is experimenting with.