Analytical Report

Architectural Paradigm "Cyclops-Hybrid"

Deep Optimization of Rockchip RV1106 SoC for FPV Auto-Targeting Tasks Under Strict Hardware Constraints

March 2025
R&D Department, LLC "NEUROTECH"
Hardware Optimization, SoC, Neural Networks, FPV

Key Achievement

Developed a hybrid video processing architecture enabling 25 FPS object recognition on the Rockchip RV1106 SoC despite limited computational resources (0.5 TOPS NPU, 800 MHz CPU). The solution detects FPV drones as small as 30×30 cm at ranges up to 300 m with power consumption under 3 W.

1. Problem Statement and Challenges

Modern FPV auto-targeting tasks require real-time video stream processing with high frame rates and minimal latency. Traditional approaches based on transmitting video to a remote server for processing have unacceptable latency for active protection systems.

The need to place computational power directly on the turret imposes strict constraints:

  • Limited power consumption (less than 5 W)
  • Compact dimensions of the computing module
  • Operation over a wide temperature range (-40°C to +85°C)
  • Cost compatible with mass production

RV1106 Hardware Limitations

NPU (neural processor)   0.5 TOPS
CPU (processor)          2×Cortex-A7 @ 800 MHz
Memory                   512 MB DDR3
Power consumption        1.5-3 W
Cost                     ~$15 (mass production)

For comparison: the NVIDIA Jetson Nano (about 0.47 TFLOPS FP16) consumes 5-10 W and costs from $99.

Key Challenge

The key challenge is to run a modern neural network object detector (YOLOv5-nano) on an SoC whose NPU delivers only 0.5 TOPS, while sustaining at least 25 FPS for effective tracking of high-speed targets.

2. "Cyclops-Hybrid" Architectural Paradigm

The "Cyclops-Hybrid" paradigm represents an innovative approach to distributing computational load across different SoC blocks. Instead of the traditional approach where the neural network runs exclusively on the NPU, we developed a hybrid model built on three principles:

  • Pipeline Splitting: Splitting the neural network into subtasks executed on different computing blocks
  • Parallel Processing: Simultaneous use of NPU, CPU, and DSP for different processing stages
  • Adaptive Filtering: Dynamic reduction of computational load based on scene analysis
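The report does not say how scene analysis reduces the load. One common way to do this, shown here purely as an illustrative assumption, is to skip the detector when consecutive frames barely differ. The function names, the flat-list frame representation, and the threshold value are all hypothetical.

```python
def mean_abs_diff(frame_a, frame_b):
    """Mean absolute difference between two equally sized grayscale frames.

    Frames are flat lists of 0-255 intensities (a hypothetical representation).
    """
    total = sum(abs(a - b) for a, b in zip(frame_a, frame_b))
    return total / len(frame_a)


def should_run_detector(prev_frame, cur_frame, threshold=2.0):
    """Return True when the scene changed enough to justify running inference.

    The threshold is an assumed tuning parameter, not a value from the report.
    """
    if prev_frame is None:  # first frame: always run the detector
        return True
    return mean_abs_diff(prev_frame, cur_frame) >= threshold


static = [10] * 16
changed = [10] * 12 + [90] * 4
print(should_run_detector(static, static))   # False: nothing moved
print(should_run_detector(static, changed))  # True: mean difference is 20.0
```

A gate like this trades a cheap per-frame comparison for skipped inference passes. A real system would also need to handle camera motion and very small targets, which this sketch ignores.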

"Cyclops-Hybrid" Architectural Diagram

[Diagram: Input (1080p @ 30 fps) → Preprocessing on CPU, 2×Cortex-A7 @ 800 MHz (scaling, normalization) → Detection on NPU, 0.5 TOPS (YOLO convolutional layers) → Postprocessing on Vision DSP, vector operations (NMS, tracking, filtering) → Output (25 FPS)]

Stage 1: CPU

Video frame preparation: scaling to 640×640, pixel normalization, color space conversion. These steps run most efficiently on the CPU, where optimized libraries are available.
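The geometry of the scaling step can be sketched as follows. The report does not state whether the frame is stretched or letterboxed to 640×640, so this sketch assumes an aspect-preserving (letterbox) resize and normalization to the [0, 1] range. The function names are hypothetical.

```python
def letterbox_params(src_w, src_h, dst=640):
    """Return (new_w, new_h, pad_x, pad_y) for an aspect-preserving resize.

    Assumes letterboxing; the source only says "scaling to 640x640".
    """
    scale = min(dst / src_w, dst / src_h)
    new_w = round(src_w * scale)
    new_h = round(src_h * scale)
    pad_x = (dst - new_w) // 2  # padding on the left (and right)
    pad_y = (dst - new_h) // 2  # padding on the top (and bottom)
    return new_w, new_h, pad_x, pad_y


def normalize_pixel(value):
    """Map an 8-bit channel value to the [0.0, 1.0] range."""
    return value / 255.0


# A 1080p input shrinks by a factor of 3 and gets 140 rows of padding
# above and below the image.
print(letterbox_params(1920, 1080))  # (640, 360, 0, 140)
```

Under this assumption a 1080p frame is downscaled threefold, so an object's size in pixels shrinks by the same factor before it reaches the detector.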

Stage 2: NPU

Execution of the YOLOv5-nano convolutional layers. The specialized NPU provides maximum efficiency for multiply-accumulate (MAC) operations with minimal power consumption.

Stage 3: DSP + CPU

Result processing: Non-Maximum Suppression (NMS), object tracking, false positive filtering. The DSP efficiently handles vector operations, while the CPU handles decision logic.
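For readers unfamiliar with the NMS step named above, here is a minimal pure-Python sketch of the standard greedy algorithm. It is a reference illustration, not the DSP implementation described in the report. The box format (x1, y1, x2, y2) and the 0.45 IoU threshold are assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.45):
    """Greedy NMS: return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # [0, 2]: the second box overlaps the first
```

The pairwise IoU computation is the part that vectorizes well, which fits the report's statement that the DSP handles vector operations in this stage.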

Innovative Approach

Instead of attempting to run the entire neural network on the NPU (which is impossible due to memory and performance constraints), we split the network into parts, executing initial and final layers on CPU and DSP. This enabled processing models 3 times larger than the nominal capabilities of the RV1106 NPU.

3. Implementation on Rockchip RV1106 SoC

Neural Network Model Adaptation

For implementation on RV1106, deep optimization of the YOLOv5-nano model was performed:

  • INT8 Quantization: Conversion of weights and activations to 8-bit integer format, retaining about 95% of FP32 accuracy (27.1% vs 28.5% mAP)
  • Prismatic Splitting: Separation of initial and final layers for execution on CPU/DSP
  • Memory Optimization: Reduction of memory consumption from 450 MB to 120 MB through stepwise weight loading
  • Pipeline Optimization: Overlapping I/O operations with computations to minimize idle time
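The INT8 quantization item above can be illustrated with a minimal sketch of symmetric per-tensor quantization. The report does not specify the scheme or toolchain used, so the symmetric scheme, the function names, and the sample weights are assumptions for illustration only.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization; returns (integer codes, scale).

    Assumed scheme: scale = max|x| / 127, codes clipped to [-127, 127].
    """
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int8(codes, scale):
    """Recover approximate floating-point values from the integer codes."""
    return [c * scale for c in codes]


weights = [-0.8, -0.1, 0.0, 0.25, 0.8]  # made-up sample values
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)
worst = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, worst)
```

The round-trip error per value is bounded by half the scale, which is why quantization costs some accuracy but not much when the value range is well covered.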

Pipeline Optimization Example

// Cyclops-Hybrid pipeline pseudocode
// (Frame, Tensor, Detections, and TrackedObjects are placeholder types)
TrackedObjects cyclops_hybrid_pipeline(Frame input_frame) {
    // Stage 1: CPU - preprocessing (scaling, normalization, color conversion)
    Frame preprocessed = cpu_preprocess(input_frame);

    // Stage 2: NPU - convolutional layers (overlaps with preparation of the next frame)
    Tensor features = npu_conv_layers(preprocessed);

    // Stage 3: DSP/CPU - postprocessing (overlaps with NPU work on the next frame)
    Detections detections = postprocess(features);

    // Stage 4: CPU - tracking and decision making
    TrackedObjects tracked = track_objects(detections);

    return tracked;
}

Pipelined processing achieves 25 FPS, that is, one result every 40 ms, with a reported latency of only 40 ms from frame capture to target coordinate acquisition.
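The overlap between stages described in the pseudocode comments can be sketched with one worker thread per stage connected by queues. This is a generic illustration of stage pipelining, not the authors' implementation. The function names, the queue depth, and the stand-in stage functions are hypothetical.

```python
import queue
import threading

_STOP = object()  # sentinel marking the end of the frame stream


def _worker(func, inbox, outbox):
    """Apply func to each item from inbox and forward the result."""
    while True:
        item = inbox.get()
        if item is _STOP:
            outbox.put(_STOP)
            return
        outbox.put(func(item))


def run_pipeline(frames, stages, depth=2):
    """Run each stage in its own thread so stages work on different frames at once."""
    # Bounded queues between stages; the final output queue is unbounded so
    # the feeding loop below cannot deadlock against a full pipeline.
    queues = [queue.Queue(maxsize=depth) for _ in stages] + [queue.Queue()]
    threads = [
        threading.Thread(target=_worker, args=(func, queues[i], queues[i + 1]))
        for i, func in enumerate(stages)
    ]
    for t in threads:
        t.start()
    for frame in frames:
        queues[0].put(frame)
    queues[0].put(_STOP)
    results = []
    while True:
        item = queues[-1].get()
        if item is _STOP:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results


# Stand-ins for preprocessing, NPU inference, and postprocessing.
stages = [lambda f: f + 1, lambda f: f * 2, lambda f: f - 3]
print(run_pipeline(range(5), stages))  # [-1, 1, 3, 5, 7]
```

Because each stage has a single worker and queues are first-in, first-out, results come out in frame order. In such a design the throughput is set by the slowest stage, while the latency of a single frame is the sum of the stage times plus any queueing delay.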

Optimization Approach Comparison

Optimization Method             Speed (FPS)   Memory (MB)   Accuracy (mAP, %)   Applicability
Baseline YOLOv5-nano (FP32)     2-3           450           28.5                Not applicable
INT8 Quantization (full NPU)    8-10          220           27.1                Limited
Cyclops-Hybrid (INT8)           25-28         120           26.8                Optimal
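The ratios implied by the comparison table can be re-derived with a few lines of arithmetic. All inputs below are taken from the table. The helper function names are only for illustration.

```python
def speedup(new_fps, old_fps):
    """How many times faster the new configuration is."""
    return new_fps / old_fps


def map_drop_points(reference_map, new_map):
    """Absolute accuracy loss in percentage points of mAP."""
    return reference_map - new_map


def map_retained(reference_map, new_map):
    """Fraction of the reference accuracy that is preserved."""
    return new_map / reference_map


# Baseline runs at 2-3 FPS, Cyclops-Hybrid at 25 FPS (lower bound of 25-28).
print(round(speedup(25, 3), 1), round(speedup(25, 2), 1))  # 8.3 12.5
# FP32 baseline 28.5% mAP vs Cyclops-Hybrid 26.8% mAP.
print(round(map_drop_points(28.5, 26.8), 1))               # 1.7
print(round(map_retained(28.5, 26.8) * 100, 1))            # 94.0
# Plain INT8 on the NPU: 27.1% mAP.
print(round(map_retained(28.5, 27.1) * 100, 1))            # 95.1
```

The accuracy loss is therefore 1.7 percentage points in absolute terms, or about 6% relative to the FP32 baseline.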

4. Results and Efficiency

  • Video Processing Speed: 25 FPS (at least 8× faster than the 2-3 FPS FP32 baseline)
  • Average Power Consumption: 2.7 W (45% lower TDP)
  • Detection Accuracy: 26.8% mAP (only 1.7 percentage points below FP32)

Performance in Real Conditions

During field tests, the system based on RV1106 with "Cyclops-Hybrid" architecture demonstrated stable operation in various conditions:

  • Daytime

    Detection of 30×30 cm drones at ranges up to 300 m in good lighting

  • Low Light

    Operation at dusk with detection range up to 150 m

  • Adverse Conditions

    Stable operation at temperatures from -20°C to +60°C and humidity up to 95%

Comparison with Alternatives

Platform                           Speed    Cost   Power
Rockchip RV1106 (Cyclops-Hybrid)   25 FPS   ~$15   2.7 W
NVIDIA Jetson Nano                 35 FPS   ~$99   10 W
Intel Movidius Myriad X            18 FPS   ~$75   4 W
Google Coral TPU                   30 FPS   ~$60   2 W*

*Coral TPU requires a separate host processor, increasing total system power consumption and cost.

5. Integration into "Hunter" APS

The "Cyclops-Hybrid" architecture became a key component of autonomous turrets in the "Hunter" Active Protection System. Each turret is equipped with a computing module based on RV1106, providing:

Complete Autonomy

The turret independently detects and tracks targets without constant connection to a central server, which is critically important when operating in electronic warfare conditions.

Economic Efficiency

A computing module cost under $50 makes large-scale deployment of protection systems possible without a significant increase in the deployment budget.

Cyclops-Hybrid Based Turret Architecture

Level 1: Sensors

8 MP Video Camera
Thermal Imager (in development)
Radio Sensor

Level 2: Processing

RV1106

Cyclops-Hybrid Architecture

25 FPS Video Processing
Detection of up to 10 targets

Level 3: Execution

Targeting Servos
Laser System
Kinetic Weaponry
Network Communication

The turret can operate autonomously for up to 72 hours from a 12 V / 100 Ah battery thanks to the low power consumption of the RV1106.
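As a sanity check on the runtime figure, the average power budget it implies can be worked out directly. The voltage, capacity, runtime, and 2.7 W compute draw come from the report. The sketch assumes the full nominal capacity is usable and ignores conversion losses, which a real battery would not allow.

```python
def battery_energy_wh(voltage_v, capacity_ah):
    """Nominal stored energy in watt-hours."""
    return voltage_v * capacity_ah


def average_power_w(energy_wh, hours):
    """Average power that drains the given energy in the given time."""
    return energy_wh / hours


energy = battery_energy_wh(12, 100)      # 1200 Wh nominal
budget = average_power_w(energy, 72)     # whole-system average over 72 h
compute_share = 2.7 / budget             # compute module at 2.7 W

print(energy)                     # 1200
print(round(budget, 1))           # 16.7
print(round(compute_share * 100)) # 16
```

Under these assumptions the whole system may draw about 16.7 W on average, of which the computing module at 2.7 W accounts for roughly one sixth. The rest is left for sensors, actuators, and communication.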

Strategic Importance

Development of a fully autonomous turret using domestic components has strategic importance for ensuring technological sovereignty in security systems. The "Cyclops-Hybrid" architecture enables creation of effective protection systems without dependence on imported high-performance computing platforms.

6. Conclusions and Prospects

Achieved Results

  • Overcoming Hardware Limitations

    Achieved processing rate of 25 FPS on SoC with nominal performance of only 0.5 TOPS

  • Energy Efficiency

    Consumption under 3 W makes the system suitable for autonomous operation from batteries

  • Economic Viability

    Computing module cost enables mass deployment of protection systems

  • Technological Sovereignty

    Use of domestic and market-available components without dependence on sanctioned platforms

Prospective Directions

Architecture Scaling

Adaptation of the "Cyclops-Hybrid" approach for more powerful SoCs (RK3588, Jetson Orin Nano) for solving more complex tasks, including UAV type classification and trajectory prediction.

Multimodal Detection

Integration of thermal imager and radio sensor data processing into a unified processing pipeline to increase detection reliability in complex conditions.

Project "Servitor"

Development of a specialized FPGA-based coprocessor for accelerating neural network computations, enabling complete independence from imported solutions in the "Hunter" APS.

Conclusion

The "Cyclops-Hybrid" architectural paradigm demonstrates that even with strict hardware limitations, it is possible to create effective computer vision systems for solving critically important tasks.

The developed solution not only provides required characteristics for the FPV auto-targeting system as part of the "Hunter" APS, but also opens new possibilities for creating mass, energy-efficient, and economically viable security systems based on domestic components.

Publication Date and Status

Report compiled as of March 2025. Development is in active testing and is being prepared for mass production. All technical characteristics are confirmed by laboratory and field tests.