6. Model Deployment Advanced Guide
6.1. Overview
This section provides code examples for running joint models generated by Pulsar compilation. All examples come from the ax-samples project, whose purpose is to provide sample deployment code for the industry's leading open-source algorithm models and to help the community quickly evaluate and adapt AXera's chips.
6.1.1. Access
Hint
The offline version is taken from GitHub at the time this document is released and therefore lags behind; choose the GitHub version if you want to experience the latest features.
6.1.2. ax-samples Introduction
The current ax-samples project has validated (but is not limited to) the following open-source models:
Classification Models
SqueezeNetv1.1
MobileNetv1
MobileNetv2
ResNet18
ResNet50
VGG16
Others…
Detection Models
PP-YOLOv3
YOLOv3
YOLOv3-Tiny
YOLOv4
YOLOv4-Tiny
YOLOv5m
YOLOv5s
YOLOv7-Tiny
YOLOX-S
YOLO-Fastest-XL
Human Detection
YOLO-Fastest-Body
Face Detection
scrfd
Obstacle detection (sweeper scene)
Robot-Obstacle-Detect
3D Monocular Vehicle Detection
Monodlex
Human Body Key Points
HRNet
Human Segmentation
PP-HumanSeg
Semantic Segmentation
PP-Seg
Pose Model
HRNet
Validated hardware platforms
AX630A
AX620A/U
ax-samples directory description
$ tree -L 2
.
├── CMakeLists.txt
├── LICENSE
├── README.md
├── README_EN.md
├── benchmark
│ └── README.md
├── cmake
│ ├── check.cmake
│ └── summary.cmake
├── docs
│ ├── AX620A.md
│ ├── AX620U.md
│ ├── body_seg_bg_res.jpg
│ ├── compile.md
│ ├── seg_res.jpg
│ └── yolov3_paddle.jpg
├── examples
│ ├── CMakeLists.txt
│ ├── README.md
│ ├── ax_classification_accuracy.cc
│ ├── ax_classification_nv12_resize_steps.cc
│ ├── ax_classification_steps.cc
│ ├── ax_crop_resize_nv12.cc
│ ├── ax_hrnet_steps.cc
│ ├── ax_ld_model_mmap.cc
│ ├── ax_models_load_inspect.cc
│ ├── ax_monodlex_steps.cc
│ ├── ax_nanodet_steps.cc
│ ├── ax_paddle_mobilehumseg_steps.cc
│ ├── ax_paddle_mobileseg.cc
│ ├── ax_paddle_yolov3_steps.cc
│ ├── ax_robot_obstacle_detect_steps.cc
│ ├── ax_scrfd_steps.cc
│ ├── ax_yolo_fastest_body_steps.cc
│ ├── ax_yolo_fastest_steps.cc
│ ├── ax_yolov3_accuracy.cc
│ ├── ax_yolov3_steps.cc
│ ├── ax_yolov3_tiny_steps.cc
│ ├── ax_yolov4_steps.cc
│ ├── ax_yolov4_tiny_3l_steps.cc
│ ├── ax_yolov4_tiny_steps.cc
│ ├── ax_yolov5s_620u_steps.cc
│ ├── ax_yolov5s_steps.cc
│ ├── ax_yolov7_steps.cc
│ ├── ax_yoloxs_steps.cc
│ ├── base
│ ├── cv
│ ├── middleware
│ └── utilities
└── toolchains
├── aarch64-linux-gnu.toolchain.cmake
└── arm-linux-gnueabihf.toolchain.cmake
The above directory contains console demos for demonstration purposes; on Linux systems, run them from the console.
6.2. Compilation examples
There are currently two ways to compile the ax-samples source code:
Native compilation on AX-Pi, which is simple to operate because AX-Pi integrates a complete software development environment.
Embedded Linux cross-compilation.
6.2.1. Environment preparation
cmake version greater than or equal to 3.13
The AX620A matching cross-compilation toolchain arm-linux-gnueabihf-gxx added to the environment variables
6.2.1.1. Install cmake
There are several ways to install cmake. In an Anaconda virtual environment, you can install it with the following command:
pip install cmake
In a non-virtual environment on Ubuntu, you can install it with:
sudo apt-get install cmake
If the installed version is too low, you can also build cmake from source, as follows:

Step 1: Download cmake from the cmake official website and unzip it.
Step 2: Go to the unpacked folder and execute:

./configure
make -j4    # 4 is the number of cores; you can omit it
sudo make install

Step 3: After installation, check the version information with:

cmake --version
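The 3.13 minimum can be checked mechanically. The sketch below compares version strings with sort -V; the value of `version` is a hypothetical stand-in for what `cmake --version` would report on your machine:

```shell
# Sketch: check a cmake version string against the 3.13 minimum using sort -V.
# "version" is a hypothetical example; in practice take it from:
#   cmake --version | head -n1 | awk '{print $3}'
required="3.13"
version="3.22.1"
lowest=$(printf '%s\n%s\n' "$required" "$version" | sort -V | head -n1)
if [ "$lowest" = "$required" ]; then
  echo "cmake $version satisfies the >= $required requirement"
else
  echo "cmake $version is too old; need >= $required"
fi
```

sort -V orders version strings component by component, so "3.13" sorting first means the installed version is at least 3.13.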
6.2.1.2. Install the cross-compilation tool arm-linux-gnueabihf-gxx
There are various cross-compilers, but we recommend the Linaro cross-compiler. Among the downloadable files, gcc-linaro-7.5.0-2019.12-x86_64_arm-linux-gnueabihf.tar.xz is the build for 64-bit x86 hosts.
# Create a new folder and move the archive
mkdir -p ~/usr/local/lib
mv gcc-linaro-7.5.0-2019.12-x86_64_arm-linux-gnueabihf.tar.xz ~/usr/local/lib
# Unpack
cd ~/usr/local/lib
xz -d gcc-linaro-7.5.0-2019.12-x86_64_arm-linux-gnueabihf.tar.xz
tar -xvf gcc-linaro-7.5.0-2019.12-x86_64_arm-linux-gnueabihf.tar
# Configure environment variables
vim ~/.bashrc
export PATH=$PATH:~/usr/local/lib/gcc-linaro-7.5.0-2019.12-x86_64_arm-linux-gnueabihf/bin
# Make the environment take effect
source ~/.bashrc
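After sourcing ~/.bashrc, it is worth confirming the toolchain is actually reachable. A minimal sketch (assuming the Linaro package above ships the compiler as arm-linux-gnueabihf-g++):

```shell
# Sketch: confirm the cross-compiler is reachable after "source ~/.bashrc".
if command -v arm-linux-gnueabihf-g++ >/dev/null 2>&1; then
  status="found"
  arm-linux-gnueabihf-g++ --version | head -n1
else
  status="missing"
  echo "arm-linux-gnueabihf-g++ not found; re-check the PATH export above"
fi
```

If the compiler is reported missing, the PATH entry most likely does not match the actual unpacked directory name.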
6.2.2. Cross-compiling
Download source code
git clone https://github.com/AXERA-TECH/ax-samples.git
3rdparty directory preparation
Download the pre-compiled OpenCV library file
Create a 3rdparty folder in the ax-samples root directory and extract the downloaded OpenCV library zip file into that folder.
Dependent Library Preparation
After obtaining the AX620 BSP development package, do the following:
Download the ax-samples cross-compilation dependency package and extract it to the specified path ax_bsp; it can be obtained as follows:
$ wget https://github.com/AXERA-TECH/ax-samples/releases/download/v0.3/arm_axpi_r1.22.2801.zip
$ unzip arm_axpi_r1.22.2801.zip -d ax_bsp
Source compilation
Go to the ax-samples root directory and create the cmake compilation task:
$ mkdir build
$ cd build
$ cmake -DCMAKE_TOOLCHAIN_FILE=../toolchains/arm-linux-gnueabihf.toolchain.cmake -DBSP_MSP_DIR=${ax_bsp}/ ..
$ make install
After compilation, the generated executable examples are stored under the ax-samples/build/install/bin/ path.
ax-samples/build$ tree install
install
└── bin
├── ax_classification
├── ax_classification_accuracy
├── ax_classification_nv12
├── ax_cv_test
├── ax_hrnet
├── ax_models_load_inspect
├── ax_monodlex
├── ax_nanodet
├── ax_paddle_mobilehumseg
├── ax_paddle_mobileseg
├── ax_paddle_yolov3
├── ax_robot_obstacle
├── ax_scrfd
├── ax_yolo_fastest
├── ax_yolo_fastest_body
├── ax_yolov3
├── ax_yolov3_accuracy
├── ax_yolov3_tiny
├── ax_yolov4
├── ax_yolov4_tiny
├── ax_yolov4_tiny_3l
├── ax_yolov5s
├── ax_yolov5s_620u
├── ax_yolov7
└── ax_yoloxs
6.2.3. Local compilation
6.2.3.1. Hardware requirements
AX-Pi (based on AX620A, a cost-effective development board for community developers)
6.2.3.2. Compilation process
Download the source code with git clone, go to the ax-samples root directory, and create the cmake compilation task:
$ git clone https://github.com/AXERA-TECH/ax-samples.git
$ cd ax-samples
$ mkdir build
$ cd build
$ cmake ..
$ make install
After compilation, the resulting executable examples are stored under the ax-samples/build/install/bin/ path.
ax-samples/build$ tree install
install
└── bin
├── ax_classification
├── ax_classification_accuracy
├── ax_classification_nv12
├── ax_cv_test
├── ax_hrnet
├── ax_models_load_inspect
├── ax_monodlex
├── ax_nanodet
├── ax_paddle_mobilehumseg
├── ax_paddle_mobileseg
├── ax_paddle_yolov3
├── ax_robot_obstacle
├── ax_scrfd
├── ax_yolo_fastest
├── ax_yolo_fastest_body
├── ax_yolov3
├── ax_yolov3_accuracy
├── ax_yolov3_tiny
├── ax_yolov4
├── ax_yolov4_tiny
├── ax_yolov4_tiny_3l
├── ax_yolov5s
├── ax_yolov5s_620u
├── ax_yolov7
└── ax_yoloxs
6.3. Run example
Run preparation
Warning
The examples in this section only demonstrate how to run ax-samples; the mobilenetv2 and yolov5s models are not provided with it, and the following logs are for reference only.
Log in to the AX620A development board and create the ax-samples folder under the /root path.

Copy the compiled executable examples from build/install/bin/ to the /root/ax-samples/ path;
Copy the mobilenetv2.joint or yolov5s.joint model generated by Pulsar to the /root/ax-samples/ path;
Copy the test images to the /root/ax-samples/ path.
Attention
Note: the sample code does not provide models such as mobilenetv2.joint; you need to convert them from the open-source onnx models yourself.
/root/ax-samples # ls -l
total 40644
-rwx--x--x 1 root root 3805332 Mar 22 14:01 ax_classification
-rwx--x--x 1 root root 3979652 Mar 22 14:01 ax_yolov5s
-rw------- 1 root root 140391 Mar 22 10:39 cat.jpg
-rw------- 1 root root 163759 Mar 22 14:01 dog.jpg
-rw------- 1 root root 4299243 Mar 22 14:00 mobilenetv2.joint
-rw------- 1 root root 29217004 Mar 22 14:04 yolov5s.joint
If the board is running out of space, you can work around it by mounting a host folder on the board.
MacOS mount ARM development board example
Hint
Because space on the board is limited, testing often requires a shared folder, i.e. sharing a host directory with the ARM development board. Here is an example using macOS.
Mounting the ARM development board requires the NFS service on the development machine. macOS comes with the NFS service built in: simply create the /etc/exports file, and nfsd will start automatically and serve the exports.
/etc/exports
can be configured as follows:
/path/your/sharing/directory -alldirs -maproot=root:wheel -rw -network xxx.xxx.xxx.xxx -mask 255.255.255.0
Parameter Definition

parameter name | Meaning
---|---
alldirs | Share all subdirectories under the exported path (clients may mount any subdirectory)
network | IP address of the mounted ARM development board; can be a network-segment address
mask | Subnet mask, usually 255.255.255.0
maproot | Mapping rule: root on the client is mapped to the given user:group on the host
rw | Read and write operations, enabled by default
Modifying /etc/exports
requires restarting the nfsd
service
sudo nfsd restart
If the configuration is successful, you can view the mount information with the sudo showmount -e command, e.g. /Users/skylake/board_nfs 10.168.21.xx. After configuring the development machine, execute the mount command on the ARM side:
mount -t nfs -o nolock,tcp macos_ip:/your/shared/directory /mnt/directory
If you run into permission problems, check whether the maproot parameter is correct.
Hint
The network parameter can be configured as a network segment, e.g. 10.168.21.0; if Permission denied occurs when mounting with a single IP, try mounting with the network-segment form instead.
Classification Model
For the classification model, you can run it on the board by executing the ax_classification
program.
/root/ax-samples # ./ax_classification -m mobilenetv2.joint -i cat.jpg -r 100
--------------------------------------
model file : mobilenetv2.joint
image file : cat.jpg
img_h, img_w : 224 224
Run-Joint Runtime version: 0.5.10
--------------------------------------
[INFO]: Virtual npu mode is 1_1
Tools version: 0.6.1.14
59588c54
10.8712, 283
10.6592, 285
9.3338, 281
8.8770, 282
8.1893, 356
--------------------------------------
Create handle took 255.04 ms (neu 7.66 ms, axe 0.00 ms, overhead 247.37 ms)
--------------------------------------
Repeat 100 times, avg time 4.17 ms, max_time 4.83 ms, min_time 4.14 ms
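The result lines in the log are "score, class-index" pairs. As a sketch, the top-1 index can be pulled out of a saved log with standard shell tools; the heredoc below simply reuses the sample values from the log above:

```shell
# Sketch: extract the top-1 class index from saved ax_classification output.
# Each result line is "score, class-index"; sample values copied from the log above.
cat > /tmp/cls_log.txt <<'EOF'
10.8712, 283
10.6592, 285
9.3338, 281
8.8770, 282
8.1893, 356
EOF
top1=$(sort -t, -k1,1 -rn /tmp/cls_log.txt | head -n1 | awk -F', ' '{print $2}')
echo "top-1 class index: $top1"
# prints: top-1 class index: 283
```

The index can then be looked up in whatever label file matches the model's training set (e.g. an ImageNet class list for mobilenetv2).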
Detection models
For detection models, the post-processing program matching the model (e.g. ax_yolov5s) needs to be executed to obtain correct on-board results.
/root/ax-samples # ./ax_yolov5s -m yolov5s.joint -i dog.jpg -r 100
--------------------------------------
model file : yolov5s.joint
image file : dog.jpg
img_h, img_w : 640 640
Run-Joint Runtime version: 0.5.10
--------------------------------------
[INFO]: Virtual npu mode is 1_1
Tools version: 0.6.1.14
59588c54
run over: output len 3
--------------------------------------
Create handle took 490.73 ms (neu 22.06 ms, axe 0.00 ms, overhead 468.66 ms)
--------------------------------------
Repeat 100 times, avg time 26.06 ms, max_time 26.83 ms, min_time 26.02 ms
--------------------------------------
detection num: 3
16: 93%, [ 135, 219, 310, 541], dog
2: 80%, [ 466, 77, 692, 172], car
1: 61%, [ 169, 116, 566, 419], bicycle
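Each detection line has the form "class-id: confidence%, [x1, y1, x2, y2], label". As a sketch, low-confidence detections can be filtered from a saved log like this (sample lines copied from the output above; the 70% threshold is an arbitrary example):

```shell
# Sketch: keep only detections above a confidence threshold from saved ax_yolov5s output.
# Lines are "class-id: confidence%, [x1, y1, x2, y2], label"; values copied from the log above.
cat > /tmp/det_log.txt <<'EOF'
16: 93%, [ 135, 219, 310, 541], dog
2: 80%, [ 466, 77, 692, 172], car
1: 61%, [ 169, 116, 566, 419], bicycle
EOF
threshold=70
# Split on ':' and '%' so field 2 is the confidence; keep lines above the threshold.
kept=$(awk -F'[:%]' -v t="$threshold" '$2 + 0 > t' /tmp/det_log.txt)
echo "$kept"
```

With the sample log, only the dog (93%) and car (80%) detections survive a 70% threshold.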
More information about ax-samples is available on the official GitHub (https://github.com/AXERA-TECH/ax-samples), and more extensive content is provided in the accompanying ModelZoo:

Pre-compiled executable programs (e.g. ax_classification, ax_yolov5s)
joint models the sample programs depend on (e.g. mobilenetv2.joint, yolov5s.joint)
Test images (e.g. cat.jpg, dog.jpg)