Intelligent Driving AI Chips: Technological Evolution Overview
Feb 28, 2026
The iterative upgrade of intelligent driving technology is, in essence, a computing power competition centered on the full chain of “Perception – Decision – Execution.” As the “central brain” of intelligent driving systems, AI chips play a decisive role in determining how fast autonomous driving evolves from L2 assisted driving to L4 fully autonomous driving.
From the early adaptation of general-purpose processors to customized dedicated chip designs, and from single computing units to heterogeneous integrated architectures, the architectural evolution of intelligent driving AI chips has consistently revolved around three core objectives: enhanced computing power, optimized energy efficiency, and safety redundancy.
Initial Stage
General-Purpose Chip Adaptation and Distributed Architecture Exploration (2015–2019)
During the early development of intelligent driving, L1–L2 assisted driving functions primarily included basic features such as ACC (Adaptive Cruise Control) and LCC (Lane Centering Control). Computing power requirements were relatively low. Mainstream solutions relied on adapting general-purpose chips, with distributed ECUs (Electronic Control Units) forming a modular functional computing model.
At this stage, chip architectures had not yet developed into highly specialized designs. They largely depended on the redundant computing capacity of consumer-grade or industrial-grade processors. The core challenge was ensuring automotive-grade reliability and stability.
In terms of computing demand, early L2 systems required only 2–2.5 TOPS. Representative chip solutions included Mobileye’s EyeQ3 and NVIDIA’s Tegra X1.
Mobileye EyeQ3 adopted a simplified architecture combining a single-core CPU with a dedicated vision processor. Although its actual computing power was about 0.256 TOPS (256 GOPS), it captured a significant share of the early ADAS market thanks to highly optimized vision algorithms. During the 2010s, when autonomous driving was still in its infancy, Mobileye partnered with 27 automakers—including Volkswagen, BMW, and Ford—making its solution the de facto industry standard for L2 assisted driving, with global market share once reaching 70%–80%.
In 2015, NVIDIA launched the Drive PX autonomous driving platform based on Tegra X1. Unlike earlier driver-assistance chips, Drive PX introduced deep neural networks: it could ingest data from LiDAR, millimeter-wave radar, and cameras, and apply deep learning to dynamically recognize pedestrians, vehicles, road signs, and other objects.
Its successor, Drive PX 2, became the core computing unit of the Tesla Model S/X. It supported multi-sensor fusion (cameras plus millimeter-wave radar), laying the foundation for subsequent Autopilot functions.
Distributed Architecture
Distributed architecture was the defining feature of this phase. Different assisted driving functions were controlled by independent ECUs—for example, ACC processed radar data in one control unit, while LCC processed camera data in another.
This architecture offered short development cycles and manageable costs but had clear limitations:
● High data interaction latency between ECUs, limiting effective multi-sensor fusion
● Fragmented computing power, insufficient for complex scenario-based collaborative decision-making
● Hardware upgrade difficulties, restricting functional iteration
In 2019, an accident involving an L2 assisted driving system highlighted these limitations. L2 systems relied on camera and millimeter-wave radar fusion, but distributed architectures could introduce sensor synchronization delays. Under extreme weather conditions (e.g., slippery roads), misjudgments could be amplified, such as delayed or inaccurate obstacle recognition.
In April 2018, China’s Ministry of Industry and Information Technology, Ministry of Public Security, and Ministry of Transport jointly issued the Administrative Rules for Road Testing of Intelligent Connected Vehicles (Trial), providing regulatory guidance. Twenty-seven provinces and municipalities introduced supporting policies, 16 testing demonstration zones were established, more than 3,500 km of test roads were opened, over 700 test licenses were issued, and total road test mileage exceeded 7 million km.
These efforts accelerated the development of China’s intelligent connected vehicle industry. By 2020, L2 penetration in the passenger vehicle market reached 15%, increasing to around 20% in the first half of 2021. L3 autonomous models began validation testing in specific scenarios.
Growth Stage
Heterogeneous Integration Architecture and Scaled Computing Expansion (2020–2023)
As intelligent driving progressed toward L2+, features such as NOA (Navigation on Autopilot) and urban NOA drove exponential growth in computing demand—from 10 TOPS to over 100 TOPS. Single general-purpose chips could no longer satisfy multi-sensor fusion and complex algorithm inference requirements.
Heterogeneous System Architecture (HSA) became mainstream. The principle: “Let specialized units handle specialized tasks.”
● CPU: system control and task scheduling
● GPU: parallel computation, image rendering, feature extraction
● FPGA: low-latency sensor data preprocessing
● NPU: deep learning inference acceleration
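The division of labor above can be sketched as a simple routing table. This is an illustrative sketch only; the task names and the mapping are hypothetical, not any vendor's actual scheduling API:

```python
from enum import Enum, auto

class Unit(Enum):
    """Compute units in a heterogeneous intelligent-driving SoC."""
    CPU = auto()
    GPU = auto()
    FPGA = auto()
    NPU = auto()

# Hypothetical routing table reflecting "specialized units for specialized tasks"
ROUTING = {
    "task_scheduling": Unit.CPU,           # system control and scheduling
    "image_feature_extraction": Unit.GPU,  # parallel computation
    "sensor_preprocessing": Unit.FPGA,     # low-latency preprocessing
    "dnn_inference": Unit.NPU,             # deep learning acceleration
}

def dispatch(task: str) -> Unit:
    """Route a workload to the unit best suited for it."""
    return ROUTING[task]

print(dispatch("dnn_inference"))
```

In a real SoC this mapping is fixed by the hardware and driver stack rather than chosen at runtime, but the principle is the same: each workload class lands on the unit with the best performance-per-watt for it.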
A representative example is NVIDIA’s Orin series. The Orin-X variant integrates:
● 12-core Arm Cortex-A78AE CPU + 1 Arm Cortex-R52 safety core
● Ampere-based GPU
● Deep Learning Accelerator (DLA) supporting INT8/INT16/FP16
● Total computing power: 254 TOPS
● Energy efficiency: 5 TOPS/W
● Dual-chip redundancy up to 508 TOPS
It is widely adopted in new energy vehicles such as XPeng MONA M03 Max and the 2025 Zeekr 001 series. In 2024, NVIDIA Drive Orin-X shipments reached over 2.1 million units globally, accounting for 39.8% market share.
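The Orin-X headline figures can be sanity-checked against each other. A minimal sketch, assuming power draw ≈ peak TOPS divided by energy efficiency (TOPS/W), which ignores real-world utilization and thermal headroom:

```python
def power_watts(tops: float, tops_per_watt: float) -> float:
    """Estimate power draw implied by peak compute and energy efficiency."""
    return tops / tops_per_watt

# Single Orin-X: 254 TOPS at 5 TOPS/W
single = power_watts(254, 5)   # ~50.8 W
# Dual-chip redundant configuration: 508 TOPS total
dual = power_watts(508, 5)     # ~101.6 W

print(f"single chip ≈ {single:.1f} W, dual-chip ≈ {dual:.1f} W")
```

The implied ~50 W per chip is why a dual-Orin domain controller needs active liquid or forced-air cooling, a packaging constraint that did not exist in the sub-1 W EyeQ3 era.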
Domestic chipmakers also rose rapidly. Horizon Robotics’ Journey 5 adopted a CPU + GPU + BPU (self-developed deep learning processor) heterogeneous architecture:
● 128 TOPS single-chip performance
● Support for 16 camera inputs
● 30W typical power consumption
● End-to-end latency as low as 60 ms
● Scalable to 1024 TOPS with multi-chip solutions
It has been mass-produced for Li Auto L8/L7 Pro models and supports highway NOA and beyond.
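The Journey 5 multi-chip scaling quoted above is straightforward to work out. A minimal sketch, assuming linear scaling of compute and typical power across chips (interconnect overhead ignored):

```python
import math

def chips_needed(target_tops: float, per_chip_tops: float) -> int:
    """Number of chips required to reach a target compute budget."""
    return math.ceil(target_tops / per_chip_tops)

PER_CHIP_TOPS = 128    # Journey 5 single-chip performance
PER_CHIP_WATTS = 30    # typical power consumption

n = chips_needed(1024, PER_CHIP_TOPS)   # 8 chips for the quoted 1024 TOPS
total_power = n * PER_CHIP_WATTS        # ~240 W for the full board

print(f"{n} chips, ≈{total_power} W typical")
```

Linear scaling is optimistic; in practice chip-to-chip links add latency and power, which is one reason the next generation moved to larger single dies rather than ever-wider multi-chip boards.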
Maturity Stage
Specialized Architecture Upgrades and Intensified Computing Competition (2024–2025)
From 2024 onward, intelligent driving entered a critical commercialization phase for L3. Computing demand surged from 100 TOPS to 1000 TOPS and beyond. End-to-end large models and VLA (Vision-Language-Action) models forced deep architectural optimization. Chiplet technologies matured, and automaker self-developed chips became a major trend.
Dedicated AI accelerators became the core breakthrough. Traditional NPUs evolved into highly specialized AI accelerators, such as Horizon’s BPU 6.0 and Huawei’s Ascend architecture.
Huawei’s Ascend 610B (7nm, Da Vinci architecture):
● 8 AI cores + 8 CPU cores
● INT8 performance up to 200 TOPS
● FP16 performance around 100 TFLOPS
● Supports multi-task parallel processing
It is deployed in Avatr 11/12 and AITO M5/M7/M9 models, supporting L3-level features.
NVIDIA’s Thor chip (4nm, Blackwell architecture):
● Integrates Grace CPU cores, a Blackwell-generation GPU, and a Transformer Engine
● Up to 2000 TOPS per chip
● Supports autonomous driving and intelligent cockpit
● ASIL-D compliant
● NVLink-C2C enabling OS-level parallel execution
● Nearly 8× performance of Orin
Tesla’s FSD HW5 chip demonstrated the advantage of algorithm-hardware co-design:
● Fully self-developed instruction set
● Deep coupling with FSD software
● Over 3× computing utilization improvement
● Power consumption under 30W
NIO ET9 adopts two self-developed NX9031 chips (5nm), each delivering 1000 TOPS, totaling 2000 TOPS with redundancy.
Competitive Landscape
Global leaders such as NVIDIA, Intel, and AMD maintain dominance through ecosystem advantages. NVIDIA leverages CUDA to control the training market; Intel strengthened FPGA positioning via Altera acquisition and Gaudi3; AMD’s MI300 series gained traction in supercomputing.
Chinese AI chip enterprises achieved breakthroughs through scenario-driven specialization:
● Huawei Ascend: smart cities, industrial inspection
● Cambricon Siyuan: government cloud, finance
● Horizon Journey: mainstream autonomous driving
Automaker self-developed chips (Tesla, NIO, XPeng) increasingly break external supplier monopolies.
Future Trends
In-Memory Computing and Vehicle-Cloud Collaborative Architectures
As autonomous driving advances toward L4/L5, computing demand is expected to exceed 4000 TOPS. The traditional von Neumann architecture’s separation of storage and computation introduces latency and energy bottlenecks.
In-memory computing performs computation directly within memory arrays, sharply reducing data movement and potentially improving energy efficiency by 10–100×. Industry roadmaps project efficiencies of 300–1000 TOPS/W within five years.
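The efficiency gap can be made concrete by converting TOPS/W into energy per operation. A minimal sketch, using 5 TOPS/W as an Orin-class baseline and 500 TOPS/W as a mid-range in-memory target (both figures taken from this article, not measured data):

```python
def pj_per_op(tops_per_watt: float) -> float:
    """Energy per operation in picojoules.

    eff TOPS/W = eff * 1e12 ops per joule, so energy/op = 1/eff pJ.
    """
    return 1.0 / tops_per_watt

today = pj_per_op(5)      # 0.2 pJ/op  (digital SoC baseline)
cim = pj_per_op(500)      # 0.002 pJ/op (in-memory computing target)

print(f"improvement: {today / cim:.0f}x")
```

At 4000 TOPS sustained, that difference is the gap between an ~800 W compute budget and one small enough to fit a passenger vehicle's electrical and thermal envelope.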
Meanwhile, vehicle-cloud collaborative E2E architectures will become mainstream. Onboard chips will cooperate with edge cloud and data centers, enabling continuous system evolution. The AI-Defined Vehicle (ADV) paradigm—powered by large AI models and cloud computing—will accelerate automated iteration and reduce human intervention.
For example, Arm’s Zena Compute Subsystem (CSS) provides a pre-integrated platform that can shorten software development cycles by up to two years and accelerate automotive AI deployment.
From distributed general-purpose adaptation to heterogeneous integration, from specialized accelerators to vehicle-cloud synergy, the evolution of intelligent driving AI chips reflects not only technological advancement but also a structural transformation of the automotive industry itself.
The competition is no longer solely about computing power—it is about architecture, energy efficiency, ecosystem integration, and long-term scalability.
FAQs
1. What is the core difference between distributed and centralized autonomous driving architectures?
Distributed architectures (2015–2019) rely on multiple independent ECUs, each handling specific functions such as ACC or LCC. While cost-effective and easier to develop, they suffer from high latency and limited multi-sensor fusion capability.
Centralized architectures integrate multiple functions into a high-performance domain controller powered by AI SoCs from companies like NVIDIA and Huawei. This enables unified sensor fusion, lower latency, scalable computing, and support for advanced L3/L4 features.
2. Why has computing demand increased from 2 TOPS to over 1000 TOPS?
Early L2 systems mainly handled structured highway scenarios with rule-based algorithms. As intelligent driving evolved toward urban NOA and L3 autonomy, vehicles must process multi-camera, LiDAR, radar, and high-resolution perception data in real time.
In addition, end-to-end deep learning models and VLA (Vision-Language-Action) architectures significantly increase inference complexity, pushing computing demand from a few TOPS to 1000+ TOPS—and potentially 4000+ TOPS for L4/L5 systems.
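The raw sensor throughput behind this growth is easy to estimate. A minimal sketch for the camera stream alone, using a hypothetical L3 suite (11 cameras, 8 MP, 30 fps, 24-bit RGB; these parameters are illustrative, not a specific vehicle's configuration):

```python
def camera_data_rate_gbps(num_cams: int, megapixels: float,
                          fps: int, bits_per_pixel: int = 24) -> float:
    """Uncompressed camera data rate in gigabits per second."""
    pixels_per_s = num_cams * megapixels * 1e6 * fps
    return pixels_per_s * bits_per_pixel / 1e9

rate = camera_data_rate_gbps(11, 8, 30)
print(f"raw camera stream ≈ {rate:.1f} Gbit/s")
```

Every frame of that stream must be processed within a perception deadline of a few tens of milliseconds, which is why compute demand scales with sensor count and resolution rather than with vehicle speed.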
3. What is a heterogeneous computing architecture in autonomous driving chips?
Heterogeneous architecture (HSA) integrates different specialized computing units within one SoC:
● CPU for system control and scheduling
● GPU for parallel processing
● NPU/DLA for deep learning inference
● FPGA (in some cases) for low-latency preprocessing
For example, NVIDIA’s Orin and Horizon Robotics’ Journey series adopt heterogeneous designs to maximize performance and energy efficiency by assigning tasks to the most suitable hardware unit.
4. Why are automakers developing their own autonomous driving chips?
Self-developed chips allow automakers to achieve deep algorithm-hardware co-optimization. Companies such as Tesla and NIO design custom AI chips tightly integrated with their proprietary software stacks.
This approach improves computing utilization, reduces redundant hardware overhead, enhances supply chain control, and strengthens differentiation—especially as autonomous driving becomes a core competitive factor.
5. What are the future architectural trends for intelligent driving AI chips?
Two major trends are emerging:
1. In-memory computing to reduce data movement bottlenecks and dramatically improve energy efficiency.
2. Vehicle-cloud collaborative architectures, where onboard AI chips work with edge and cloud computing to enable continuous model updates and large-scale data training.
Together, these trends will support the transition toward L4/L5 autonomous driving and accelerate the realization of AI-Defined Vehicles (ADV).