Run the Qwen 1.8B Chat large language model on Edge2

This is just a testing version.

Download the conversion tool

$ git clone

Install the conversion environment

Follow the Conda documentation to install Conda on your Linux PC.

After installing, create a new conda environment.

$ conda create -n RKLLM-Toolkit python=3.8
$ conda activate RKLLM-Toolkit     #activate
$ conda deactivate                 #deactivate
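The wheel installed in the next step is a cp38 build, so the interpreter inside the new environment must be Python 3.8. As a quick sanity check after activating the environment:

```python
import sys

# Inside the RKLLM-Toolkit environment this should report major/minor
# version 3.8, matching the cp38 tag of the rkllm_toolkit wheel.
print(sys.version_info[:2])
```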

Install dependencies

$ cd rknn-llm/rkllm-toolkit/packages
$ pip3 install rkllm_toolkit-1.0.0-cp38-cp38-linux_x86_64.whl

Check whether the installation succeeded by importing the toolkit in a Python shell:

$ python
>>> from rkllm.api import RKLLM

If the import returns without errors, the toolkit is installed correctly.


Download the Qwen-1.8B-Chat model into rknn-llm/rkllm-toolkit/examples/huggingface

$ cd rknn-llm/rkllm-toolkit/examples/huggingface
$ git lfs install
$ git clone

Modify the example conversion script as follows.

diff --git a/rkllm-toolkit/examples/huggingface/ b/rkllm-toolkit/examples/huggingface/
index c253fe4..406ad37 100644
--- a/rkllm-toolkit/examples/huggingface/
+++ b/rkllm-toolkit/examples/huggingface/
@@ -5,7 +5,7 @@
 Download the Qwen model from the above website.

-modelpath = '/path/to/your/model'
+modelpath = './Qwen-1_8B-Chat'
 llm = RKLLM()

 # Load model
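For reference, the conversion script around the modified line roughly follows the shape below. This is only a sketch assuming the rkllm-toolkit 1.0.0 example API (load_huggingface, build, export_rkllm); the build parameters shown (quantized_dtype, target_platform) are assumptions, so check the script in the repository for the exact values.

```python
from rkllm.api import RKLLM

# Path to the Hugging Face model downloaded in the previous step
modelpath = './Qwen-1_8B-Chat'
llm = RKLLM()

# Load the Hugging Face model
ret = llm.load_huggingface(model=modelpath)
if ret != 0:
    raise SystemExit('Load model failed!')

# Quantize and build for the target NPU (parameter values are
# assumptions; use the ones from the example script in the repo)
ret = llm.build(do_quantization=True, quantized_dtype='w8a8',
                target_platform='rk3588')
if ret != 0:
    raise SystemExit('Build model failed!')

# Export the converted model
ret = llm.export_rkllm('./qwen.rkllm')
if ret != 0:
    raise SystemExit('Export model failed!')
```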

Run the conversion script to generate the rkllm model.

$ python

The model qwen.rkllm will be generated in rknn-llm/rkllm-toolkit/examples/huggingface.

Run on Edge2

The runtime code is in rknn-llm/rkllm-runtime. You can clone the repository again on Edge2 or copy it over from your PC. Then place qwen.rkllm in rknn-llm/rkllm-runtime/example.

Modify the build script under rknn-llm/rkllm-runtime/example/ as follows.

diff --git a/rkllm-runtime/example/ b/rkllm-runtime/example/
index 712b3be..bc5c575 100644
--- a/rkllm-runtime/example/
+++ b/rkllm-runtime/example/
@@ -4,7 +4,7 @@ if [[ -z ${BUILD_TYPE} ]];then



Then, compile and run.

$ cd rknn-llm/rkllm-runtime/example
$ bash
$ export LD_LIBRARY_PATH=/home/khadas/rkllm-runtime/runtime/Linux/librkllm_api/aarch64/
$ cd build/build_linux_aarch64_Release
$ ulimit -n 10240
$ ./llm_demo ../../qwen.rkllm
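A note on the ulimit -n 10240 step above: it raises the process's soft limit on open file descriptors, since the shell's default (commonly 1024) can be too low for the runtime and cause "too many open files" errors. As a sketch, the current limits can be inspected from Python:

```python
import resource

# Soft and hard limits on open file descriptors for this process;
# the soft limit is what "ulimit -n" adjusts in the shell.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"soft={soft}, hard={hard}")
```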