A picture says more than 1,000 words!
In recent years, neural networks (deep learning) have achieved many notable successes in recognition tasks. For example, healthcare providers use neural networks to predict
medical diagnoses and industry uses them to visually detect defects in manufacturing
materials and finished products. However, the images are almost always flattened and
projected in 2D, and therefore, the perception of depth is lost. Fortunately, thanks to LiDAR
sensors, 3D data can be made accessible. The use of LiDAR is therefore increasing rapidly.
A recent study by GLOBE NEWSWIRE predicted that the LiDAR market would increase by
22.7% by 2026.
Point Cloud's challenges in Deep Learning
LiDAR sensors use laser pulses to make
hundreds of thousands of highly accurate measurements per second. Measurements are
converted to points that are spatially defined by X, Y, and Z coordinates. Besides the spatial
coordinates, points can also be defined by additional features such as the intensity (I) and
(R, G, B) colors. However, Deep Learning on Point Clouds brings challenges because the
data has very different properties from an ordinary flat RGB image: a Point Cloud is unstructured, irregular, and unordered.
Typical Deep Learning models for RGB data rely on the regular XY pixel grid to process the data. RGB pixels cannot be arbitrarily reordered (permuted), because that would destroy the image. Points in a Point Cloud can be: the shapes of the objects they represent are invariant under such permutations. An architecture that works directly on Point Clouds and other unordered 3D datasets must therefore give the same prediction regardless of the order in which the points are listed. We say that the neural network must be permutation invariant.
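To make the permutation argument concrete, here is a minimal sketch in Python with NumPy. The array layout (one row per point, with columns X, Y, Z, intensity, R, G, B), the random values, and the function name `global_max_feature` are assumptions chosen for illustration; they are not taken from any particular LiDAR format or library.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical point cloud: 5 points, one row per point.
# Assumed column layout: X, Y, Z, intensity, R, G, B.
cloud = rng.random((5, 7))

def global_max_feature(points: np.ndarray) -> np.ndarray:
    """Column-wise maximum over all points: a symmetric, order-independent summary."""
    return points.max(axis=0)

# Reverse the row order: the same set of points, stored in a different order.
reordered = cloud[::-1]

# A symmetric function gives the same result for both orderings.
same_summary = np.array_equal(global_max_feature(cloud), global_max_feature(reordered))

# Flattening the rows into one long vector, as a grid-based model would,
# produces a different input when the order changes.
same_flat = np.array_equal(cloud.ravel(), reordered.ravel())

print(same_summary, same_flat)
```

The order-independent summary is identical for both orderings, while the flattened vector is not. That difference is what separates a model that needs a fixed grid from one that can handle an unordered set of points.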
PointNet was released in 2017 to address these challenges for classification and segmentation of Point Cloud data. It offers a unified architecture that can directly process Point Cloud
datasets and learn to classify them. It can produce a single prediction for the whole input, or
a prediction for each individual point (segmentation). The architecture is built to be robust to
permutations in the data. In addition, it is designed to be robust to data changes such as rotation. Finally, the technology also serves as a backbone, collecting information from each point and
converting the input into a higher dimensional vector. Thanks to PointNet, systems can be
developed that extract information from 3D data and recognize, understand, and
interpret it.
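The idea of collecting information from each point and turning it into one higher dimensional vector can be sketched roughly as follows. This is an illustrative toy in Python with NumPy, not the published PointNet implementation: the layer sizes, the random untrained weights, the four output classes, and the name `pointnet_like_forward` are all assumptions, and the learned alignment step that PointNet uses to handle transformations such as rotation is left out.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Assumed sizes for illustration only: 3 input coordinates,
# 64 features per point, 4 output classes. Weights are random, not trained.
W1 = rng.normal(size=(3, 16))
W2 = rng.normal(size=(16, 64))
W_cls = rng.normal(size=(64, 4))

def relu(x: np.ndarray) -> np.ndarray:
    return np.maximum(x, 0.0)

def pointnet_like_forward(points: np.ndarray) -> np.ndarray:
    """Map an (N, 3) array of XYZ points to a (4,) array of class scores."""
    # Shared MLP: the same weights are applied to every point independently.
    per_point = relu(relu(points @ W1) @ W2)   # shape (N, 64)
    # Symmetric aggregation: max over the point axis gives one global feature.
    global_feature = per_point.max(axis=0)     # shape (64,)
    # Simple linear head on top of the global feature.
    return global_feature @ W_cls              # shape (4,)

cloud = rng.random((100, 3))                   # hypothetical cloud of 100 points
scores = pointnet_like_forward(cloud)
scores_reordered = pointnet_like_forward(cloud[::-1])

print(scores.shape, np.allclose(scores, scores_reordered))
```

Because every point passes through the same weights and the points are combined with a maximum, feeding the points in a different order yields the same scores up to floating-point rounding. For per-point predictions, one common approach is to combine the global feature with each point's own features before a per-point classifier.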
A picture is more than 1,000 words
Computer Vision has grown enormously within the AI community, thanks to PointNet. We
increasingly see new AI solutions based on 3D data. Construction companies, in particular,
have opted for Point Cloud technology. For example, 3D technologies are used for drone
scans, eliminating the need for people on-site to take measurements. In addition, they can
use 3D for other visual inspection purposes. Think of automated quality control by
digital inspectors, so that maintenance employees carry out fewer inspection rounds. One
example is a solution that inspects road surfaces and automatically detects defects from
camera images. Thanks to new technology like this, maintenance companies more readily
see which assets, such as lighting, tile floors, smoke detectors, and surveillance cameras,
need maintenance. This enables them to manage assets more efficiently, save costs, and
better identify safety risks.
No wonder there is a growing demand for 3D analytics. Point Clouds are the future of Computer Vision. AI-based solutions can now consume data in its canonical form and interpret observations in 3D. Providing inspectors with extra valuable information will lead to better results and more robust performance. It will, therefore, not be long before more of the visual tasks currently performed by humans are performed by intelligent digital inspectors. It is now essential to think about the impact of Computer Vision on our social and economic structures. If we do this right, the benefits and possibilities are endless. After all, pictures say more than 1,000 words!
Read the full article in Dutch here.
Maarten Stol, Principal Scientific Adviser at BrainCreators, and Ghailen Ben Achour, researcher at BrainCreators.