Designing Computer Vision Systems for Self-Driving Cars

Dec 17, 2022

Self-driving cars and their computer vision systems have received a ton of press coverage, clearly showing how difficult it is to design a robust vision system that performs well in any environmental condition. There are several key elements that all designers should consider when developing these complex automotive vision systems. These include sensor redundancy, low-light vision processing, and faulty sensor detection.

The Importance of Sensor Redundancy

For each location around the car that uses a vision system for object detection, there should be multiple cameras (at least two) covering the same line of sight. This configuration should be used even if the vision algorithm requires data from only a single camera.

Sensor redundancy allows the system to detect a failure of the main camera by comparing its images with those of the auxiliary camera. The main camera feeds its data to the vision algorithms. If the system detects that the primary camera is failing, it should be able to reroute data from one of the secondary cameras to the vision algorithm.
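
As a rough illustration of this failover idea, the sketch below compares frames from the primary and auxiliary cameras and reroutes the feed after repeated disagreement. The correlation metric, thresholds, and class names are assumptions for illustration only; a production system would also need per-camera health checks to decide which of the two disagreeing cameras is actually at fault.

```python
import numpy as np

def frames_disagree(primary: np.ndarray, auxiliary: np.ndarray,
                    min_correlation: float = 0.5) -> bool:
    """Return True when the two grayscale frames are too dissimilar."""
    p = primary.astype(np.float32).ravel()
    a = auxiliary.astype(np.float32).ravel()
    p -= p.mean()
    a -= a.mean()
    denom = np.linalg.norm(p) * np.linalg.norm(a)
    if denom < 1e-6:          # one image is flat (e.g., sensor stuck or dark)
        return True
    correlation = float(np.dot(p, a) / denom)
    return correlation < min_correlation

class CameraSelector:
    """Route the primary feed to the vision algorithm; fall back on repeated failure."""
    def __init__(self, fail_limit: int = 10):
        self.fail_count = 0
        self.fail_limit = fail_limit
        self.use_primary = True

    def select(self, primary_frame, auxiliary_frame):
        if self.use_primary and frames_disagree(primary_frame, auxiliary_frame):
            self.fail_count += 1
            if self.fail_count >= self.fail_limit:
                self.use_primary = False   # reroute the auxiliary camera feed
        elif self.use_primary:
            self.fail_count = 0            # agreement resets the failure counter
        return primary_frame if self.use_primary else auxiliary_frame
```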

Since multiple cameras are available, vision algorithms should also take advantage of stereo vision. Collecting depth data at a lower resolution and lower frame rate will save processing power. Even if the processing is single-camera in nature, depth information can speed up object classification by reducing the number of scales that need to be processed according to the minimum and maximum distances of objects in the scene.
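
The following sketch shows one way depth could bound the detector's scale search, assuming an OpenCV stereo matcher and illustrative camera parameters; the focal length, baseline, and nominal object height below are made-up values, not figures from any real system.

```python
import cv2
import numpy as np

FOCAL_PX = 1000.0      # focal length in pixels (assumption)
BASELINE_M = 0.12      # stereo baseline in meters (assumption)
OBJECT_HEIGHT_M = 1.7  # nominal pedestrian height (assumption)
WINDOW_HEIGHT_PX = 128 # detector window height in pixels (assumption)

def detection_scales(left_bgr, right_bgr, num_scales=8):
    # Work at half resolution and (in practice) a lower frame rate to save processing power.
    left = cv2.resize(cv2.cvtColor(left_bgr, cv2.COLOR_BGR2GRAY), None, fx=0.5, fy=0.5)
    right = cv2.resize(cv2.cvtColor(right_bgr, cv2.COLOR_BGR2GRAY), None, fx=0.5, fy=0.5)

    matcher = cv2.StereoSGBM_create(minDisparity=1, numDisparities=64, blockSize=5)
    disparity = matcher.compute(left, right).astype(np.float32) / 16.0  # SGBM is fixed-point x16
    valid = disparity[disparity > 1.0]
    if valid.size == 0:
        return np.geomspace(0.25, 2.0, num_scales)   # no depth cue: fall back to a full sweep

    # depth = f * B / disparity (focal length halved to match the half-resolution images)
    depths = (FOCAL_PX * 0.5) * BASELINE_M / valid
    near, far = np.percentile(depths, 2), np.percentile(depths, 98)

    # Expected on-screen object height at the nearest and farthest observed depths
    h_near = FOCAL_PX * OBJECT_HEIGHT_M / near
    h_far = FOCAL_PX * OBJECT_HEIGHT_M / far

    # Image resize factors at which the detection window matches the expected object size,
    # covering only the depth range actually present in the scene.
    return np.geomspace(WINDOW_HEIGHT_PX / h_near, WINDOW_HEIGHT_PX / h_far, num_scales)
```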

For example, Texas Instruments' (TI) TDAx family of automotive processors is equipped with the technology needed to process at least eight camera inputs and perform state-of-the-art stereo vision processing through the Vision AcceleratorPac to meet such requirements. The Vision AcceleratorPac contains multiple Embedded Vision Engines (EVEs) that excel at the single-instruction, multiple-data (SIMD) operations used in the correspondence-matching algorithms of stereo vision systems.

Importance of Low-Light Vision, Reliance on Offline Maps, and Sensor Fusion

Low-light vision processing requires a different processing paradigm than daytime use. Images captured in low-light conditions have a low signal-to-noise ratio, and structural elements such as edges are hidden under noise. In low-light conditions, vision algorithms should therefore rely more on blobs or shapes than on edges. Typical computer vision algorithms, such as Histogram of Oriented Gradients (HOG)-based object classification, rely mainly on edges, since these features are dominant in daylight. But cars equipped with such vision systems have trouble detecting other vehicles or pedestrians on poorly lit roads at night.

If the system detects a low-light condition, the vision algorithm should switch to a low-light mode. This mode can be achieved using a deep learning network trained only on low-light images. Low-light modes should also rely on data from offline maps or offline worldviews. Low-light vision algorithms can provide cues to find the correct location on the map and reconstruct the scene from the offline worldview, which should be sufficient for navigation in static environments.
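
A minimal sketch of such a mode switch is shown below, using mean luminance and a crude noise estimate as the trigger. The thresholds and the detector interface are assumptions for illustration, not values from any production system.

```python
import cv2
import numpy as np

LOW_LIGHT_LUMA = 40.0   # mean luminance threshold on an 8-bit scale (assumption)
LOW_SNR_DB = 20.0       # rough signal-to-noise threshold in dB (assumption)

def is_low_light(frame_bgr: np.ndarray) -> bool:
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    mean_luma = float(gray.mean())

    # Crude noise estimate: the residual left after a small median filter.
    denoised = cv2.medianBlur(gray, 3).astype(np.float32)
    residual = gray.astype(np.float32) - denoised
    noise_power = float(np.mean(residual ** 2)) + 1e-6
    signal_power = float(np.mean(denoised ** 2)) + 1e-6
    snr_db = 10.0 * np.log10(signal_power / noise_power)

    return mean_luma < LOW_LIGHT_LUMA or snr_db < LOW_SNR_DB

def process(frame_bgr, day_detector, night_detector):
    # night_detector would be a network trained only on low-light images,
    # optionally backed by offline-map cues; both detectors are placeholders here.
    detector = night_detector if is_low_light(frame_bgr) else day_detector
    return detector(frame_bgr)
```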

However, in dynamic environments, with previously unrecorded movements or new objects, fusion with other sensors (lidar, radar, thermal, etc.) is required to ensure optimal performance by exploiting the respective strengths of each sensor modality. For example, lidar works equally well by day or night but cannot distinguish colors. Cameras perform poorly in low-light conditions but provide the color information needed by algorithms that detect red lights and traffic signs. Both sensors perform poorly in rain, snow, and fog. Radar can be used in bad weather, but it does not have enough spatial resolution to accurately detect the location and size of objects.

You can see that each sensor provides information that, if used alone, is incomplete or inconclusive. To reduce this uncertainty, fusion algorithms combine data collected from these different sensors.
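
As a simplified illustration of confidence-level fusion (not any particular vendor's fusion chain), the sketch below de-rates each sensor's vote in the conditions where that modality is weak. The weight table is an assumption chosen to mirror the strengths and weaknesses described above.

```python
# Per-sensor weights by operating condition (illustrative assumptions only).
SENSOR_WEIGHTS = {
    #            day   night  rain/fog
    "camera":  (1.0,   0.3,   0.4),
    "lidar":   (1.0,   1.0,   0.3),
    "radar":   (0.8,   0.8,   0.9),
}

def fused_confidence(detections: dict, night: bool, bad_weather: bool) -> float:
    """detections maps sensor name -> confidence in [0, 1] for one tracked object."""
    column = 2 if bad_weather else (1 if night else 0)
    num, den = 0.0, 0.0
    for sensor, confidence in detections.items():
        w = SENSOR_WEIGHTS[sensor][column]
        num += w * confidence
        den += w
    return num / den if den > 0 else 0.0

# Example: a pedestrian seen weakly by the camera at night but strongly by lidar.
print(fused_confidence({"camera": 0.35, "lidar": 0.9, "radar": 0.6},
                       night=True, bad_weather=False))
```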

This is the benefit of using a heterogeneous architecture design. For example, TI's TDA2x processors can handle the processing diversity required for sensor data acquisition, processing, and fusion thanks to three different types of cores (EVE, digital signal processor [DSP], and microcontroller [MCU]). See Figure 1 for details on the functional mapping.

Figure 1: TDA2x functional mapping.

Low-light conditions require the use of a high dynamic range (HDR) sensor. These HDR sensors output multiple images with different exposure/gain values for the same frame. To be usable, these images must be merged into one. This merging algorithm is usually computationally intensive, but TI's TDA2P and TDA3x processors have a hardware image signal processor (ISP) capable of processing HDR images.
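
For readers without access to a hardware ISP, the snippet below shows a software stand-in for the merge step using OpenCV's Mertens exposure fusion; on TDA2P and TDA3x devices this work is done by the ISP itself, and the exact hardware algorithm may differ.

```python
import cv2
import numpy as np

def merge_exposures(exposures):
    """exposures: list of 8-bit BGR frames of the same scene at different exposure/gain settings."""
    merger = cv2.createMergeMertens()
    fused = merger.process(exposures)               # float32 result, roughly in [0, 1]
    return np.clip(fused * 255.0, 0, 255).astype(np.uint8)
```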

In addition to offloading image signal processing tasks, integrating the ISP into the vision processor rather than the sensor has the advantage of limiting power consumption and heat generation in the camera, which helps improve image quality. The ISP is powerful enough to handle up to eight 1-MP cameras, making it ideal for surround view applications.

The goal of the ISP is to produce the highest-quality images possible before passing them to the computer vision algorithms, so that the latter can perform at their best. But even if image quality is not ideal, recent developments in computer vision can help mitigate optical effects caused by harsh operating environments.

In fact, over the past decade, the artificial intelligence technique known as deep learning has proven able to solve some of the most challenging computer vision problems. But because of its huge demand for computing power, deep learning has mostly been limited to cloud computing and data centers.

Researchers have therefore focused on developing deep learning networks that are light enough to run on embedded systems without loss of quality. To support this revolutionary technology, TI created the TI Deep Learning (TIDL) library. Implemented on the Vision AcceleratorPac, TIDL can take deep learning networks designed in the Caffe or TensorFlow frameworks and execute them in real time within a 2.5-W power envelope. Semantic segmentation and single-shot detection (SSD) networks are among those successfully demonstrated on TDA2x processors.
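
A back-of-the-envelope sketch (not TIDL-specific) helps show why lightweight network designs fit such a power envelope: replacing standard convolutions with depthwise-separable ones, as many embedded detection networks do, cuts multiply-accumulate (MAC) counts dramatically. The layer sizes below are arbitrary assumptions.

```python
def standard_conv_macs(h, w, c_in, c_out, k=3):
    # Every output channel convolves every input channel with a k x k kernel.
    return h * w * c_in * c_out * k * k

def separable_conv_macs(h, w, c_in, c_out, k=3):
    depthwise = h * w * c_in * k * k   # one k x k filter per input channel
    pointwise = h * w * c_in * c_out   # 1 x 1 convolution to mix channels
    return depthwise + pointwise

h, w, c_in, c_out = 128, 128, 64, 128  # hypothetical feature-map and channel sizes
std = standard_conv_macs(h, w, c_in, c_out)
sep = separable_conv_macs(h, w, c_in, c_out)
print(f"standard: {std:,} MACs, separable: {sep:,} MACs, ratio: {std / sep:.1f}x")
```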

To complement its vision technology, TI has been increasing its efforts to develop radar technology tailored for the advanced driver assistance systems (ADAS) and autonomous driving markets. These include:

The AWR1xx sensors for medium- and long-range radar integrate radio frequency (RF) and analog functions with digital control capability into a single chip. Including these functions in a single chip reduces the size of a typical radar system by 50%. Additionally, the smaller sensor requires less power and can withstand higher ambient temperatures.

A software development kit running on the TDAx processor implements a radar signal processing chain that can process up to four radar signals. The radar signal processing chain includes two-dimensional fast Fourier transform (FFT), peak detection, and beamforming stages to enable object detection. Due to its high configurability, the radar signal processing chain can support different automotive applications, including long range for adaptive cruise control and short range for parking assistance.
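
The sketch below outlines a generic FMCW range-Doppler processing step of that kind; it is not TI's SDK. Windowing, CFAR detection, and the beamforming stage across receive antennas are omitted for brevity.

```python
import numpy as np

def range_doppler_peaks(adc_cube: np.ndarray, threshold_db: float = 15.0):
    """adc_cube: complex ADC samples shaped (num_chirps, num_samples_per_chirp)."""
    range_fft = np.fft.fft(adc_cube, axis=1)                              # range bins
    doppler_fft = np.fft.fftshift(np.fft.fft(range_fft, axis=0), axes=0)  # velocity bins
    magnitude_db = 20.0 * np.log10(np.abs(doppler_fft) + 1e-12)

    # Simple threshold above the median noise floor stands in for CFAR detection.
    noise_floor = np.median(magnitude_db)
    peaks = np.argwhere(magnitude_db > noise_floor + threshold_db)
    return [(int(d), int(r)) for d, r in peaks]                           # (doppler_bin, range_bin)
```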

Importance of Faulty Sensor Detection and Fail-Safe Mechanisms

In a world moving toward autonomous driving, faulty sensors and even dirt can have life-threatening consequences, since noisy images can fool vision algorithms and lead to false classifications. Greater focus should therefore be placed on developing algorithms that can detect the invalid scenarios produced by faulty sensors. The system can then implement fail-safe mechanisms, such as activating the hazard lights or bringing the car to a gradual stop.

For example, TDAx devices can run learning-based algorithms that use sharpness statistics from the H3A engine to detect obstructions such as dirt on the lens. H3A provides block-by-block sharpness scores, giving the algorithm granular statistics.
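
The snippet below approximates such block-wise sharpness statistics in software using the variance of the Laplacian per block; the H3A engine computes its statistics in hardware, so this is only an illustration, and the grid size and thresholds are assumptions.

```python
import cv2
import numpy as np

def block_sharpness(frame_bgr: np.ndarray, grid=(8, 8)) -> np.ndarray:
    """Return a grid of per-block sharpness scores for one frame."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    lap = cv2.Laplacian(gray, cv2.CV_32F)
    h, w = gray.shape
    bh, bw = h // grid[0], w // grid[1]
    scores = np.zeros(grid, dtype=np.float32)
    for i in range(grid[0]):
        for j in range(grid[1]):
            block = lap[i * bh:(i + 1) * bh, j * bw:(j + 1) * bw]
            scores[i, j] = float(block.var())    # variance of the Laplacian as a sharpness proxy
    return scores

def looks_obstructed(frame_bgr, min_sharpness=5.0, max_dull_fraction=0.3):
    # Flag the camera when too many blocks stay below the sharpness threshold.
    scores = block_sharpness(frame_bgr)
    return float((scores < min_sharpness).mean()) > max_dull_fraction
```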

Conclusion

Engineers designing automotive vision systems for self-driving cars face many challenges as robustness requirements increase with each level of autonomy. Fortunately, the number of tools available to address all of these problems has been growing, and advances in sensor technology, processor architectures, and algorithms will eventually lead to self-driving cars.

TI offers a range of products targeting the automotive market, including its TDAx processor family, deep learning software libraries and radar sensors. These products enable the design of powerful front camera, rear camera, surround view, radar and fusion applications that are the cornerstone of today's ADAS technology and the foundation of tomorrow's autonomous vehicles.
