BrainCreators | Sep 20, 2021 3:38:06 PM | 4 min read

A picture says more than 1000 words!

In recent years, neural networks (deep learning) have achieved many notable successes in recognition tasks. For example, healthcare providers use neural networks to predict medical diagnoses, and industry uses them to visually detect defects in manufacturing materials and finished products. However, the images are almost always flattened and projected in 2D, so the perception of depth is lost. Fortunately, LiDAR sensors make 3D data accessible, and the use of LiDAR is therefore increasing rapidly. A recent study by GLOBE NEWSWIRE predicted that the LiDAR market would grow by 22.7% by 2026.
‌
Point Cloud challenges in Deep Learning

LiDAR sensors use laser pulses to make hundreds of thousands of highly accurate measurements per second. The measurements are converted to points that are spatially defined by X, Y, and Z coordinates. Besides the spatial coordinates, points can also carry additional features such as intensity (I) and color (R, G, B). However, Deep Learning on Point Clouds brings challenges because the data has different properties compared to ordinary 2D images.
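
To make the data layout concrete, here is a minimal sketch of how such a Point Cloud could be held in memory. The `Point` type, its field names, and the `bounding_box` helper are illustrative assumptions, not a format prescribed by any LiDAR vendor or library.

```python
from typing import NamedTuple


class Point(NamedTuple):
    """One LiDAR measurement: position plus optional per-point features.
    Field names are assumptions chosen for this illustration."""
    x: float
    y: float
    z: float
    intensity: float = 0.0
    r: int = 0
    g: int = 0
    b: int = 0


# A Point Cloud is simply a collection of points. The number of points
# varies per scan, and the storage order carries no meaning.
cloud = [
    Point(0.0, 0.0, 0.0, intensity=0.8, r=200, g=10, b=10),
    Point(1.0, 0.5, 2.0, intensity=0.3, r=10, g=200, b=10),
    Point(-1.5, 2.0, 0.7, intensity=0.5, r=10, g=10, b=200),
]


def bounding_box(points):
    """Axis-aligned bounding box: ((min_x, min_y, min_z), (max_x, max_y, max_z))."""
    xs, ys, zs = zip(*[(p.x, p.y, p.z) for p in points])
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


lower, upper = bounding_box(cloud)
```

Unlike an image, there is no fixed grid here: each point stores its own coordinates, so reversing or shuffling `cloud` describes exactly the same scene.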
‌
A Point Cloud is unstructured, irregular, and unordered. Typical Deep Learning models for RGB data rely on the regular XY pixel grid to process the data. For example, RGB pixels cannot be arbitrarily reordered (permuted); that would destroy the image. The points in a Point Cloud can be, because the shapes of the objects they represent are invariant under such permutations. A neural network that makes predictions on Point Clouds must therefore give the same output regardless of the order of its input points: we say that the network must be permutation invariant. Only in this way can the architecture deal with Point Clouds and other unordered 3D datasets.
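
The difference between an order-sensitive and a permutation-invariant operation can be shown in a few lines. This is a minimal sketch: `max_pool` and `flatten` are hypothetical helper names, and the per-coordinate maximum is just one example of a symmetric function.

```python
def max_pool(points):
    """Symmetric aggregation: per-coordinate maximum over all points."""
    return tuple(max(axis) for axis in zip(*points))


def flatten(points):
    """Order-sensitive aggregation: concatenate coordinates in storage
    order, similar to reading pixels off a grid row by row."""
    return tuple(c for p in points for c in p)


points = [(0.0, 1.0, 2.0), (3.0, -1.0, 0.5), (1.5, 4.0, -2.0), (-0.5, 0.0, 1.0)]
permuted = points[::-1]  # same points, different order

same_when_pooled = max_pool(points) == max_pool(permuted)      # True
same_when_flattened = flatten(points) == flatten(permuted)     # False
```

A model built on `flatten` would treat the two orderings as different inputs, while one built on a symmetric aggregation such as `max_pool` cannot tell them apart.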

PointNet was released in 2017 to solve these challenges for classification and segmentation of Point Cloud data. It offers a unified architecture that can directly process Point Cloud datasets and learn to classify them. It can either process the whole input at once to produce a single label, or produce a label per point to segment the input. The architecture is robust to permutations of the points, and it is also designed to be robust to changes in the data such as rotation. Finally, the technology can serve as a backbone that collects information from each point and converts the input into a higher-dimensional vector. Thanks to PointNet, systems can be developed that extract information from 3D data and recognize, understand, and interpret it.
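
The backbone idea can be sketched as follows: the same small network is applied to every point, and a max over all points yields one global vector. This is a simplified, untrained illustration and not the published implementation. The layer sizes (3 to 16 to 64), the random weights, and all function names are assumptions, and the learned alignment step that handles rotation is left out.

```python
import random


def make_layer(n_in, n_out, rng):
    """Random weights and zero biases for one dense layer (untrained)."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense_relu(vec, layer):
    """One dense layer followed by ReLU."""
    weights, biases = layer
    return [max(0.0, sum(w * v for w, v in zip(row, vec)) + b)
            for row, b in zip(weights, biases)]


def point_features(points, layers):
    """Apply the same layers to every point independently (shared weights)."""
    feats = []
    for p in points:
        h = list(p)
        for layer in layers:
            h = dense_relu(h, layer)
        feats.append(h)
    return feats


def global_feature(feats):
    """Max-pool each feature channel across points; point order is irrelevant."""
    return [max(channel) for channel in zip(*feats)]


def segmentation_inputs(feats, g):
    """Per-point features concatenated with the global feature, one row per point."""
    return [f + g for f in feats]


rng = random.Random(42)
layers = [make_layer(3, 16, rng), make_layer(16, 64, rng)]
points = [(0.0, 1.0, 2.0), (3.0, -1.0, 0.5), (1.5, 4.0, -2.0), (-0.5, 0.0, 1.0)]

feats = point_features(points, layers)
g = global_feature(feats)            # one 64-value summary of the whole cloud
seg = segmentation_inputs(feats, g)  # 4 rows of 64 + 64 = 128 values
```

In this sketch a classifier would consume `g` to label the whole cloud, while a segmentation head would consume the rows of `seg` to label each point. Because the pooling is a maximum, feeding the points in any other order gives the same `g`.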
‌
A picture says more than 1,000 words

Computer Vision has grown enormously within the AI community, thanks in part to PointNet. We increasingly see new AI solutions based on 3D data. Construction companies, in particular, have opted for Point Cloud technology. For example, 3D technologies are used for drone scans, eliminating the need for people on-site to take measurements. In addition, they can use 3D for other visual inspection purposes. Think of automated quality control by digital inspectors, so that maintenance employees carry out fewer inspection rounds. One example is a solution that inspects road surfaces and automatically detects defects from camera images. Thanks to new technology like this, maintenance companies more readily see which assets, such as lighting, tile floors, smoke detectors, and surveillance cameras, need maintenance. This enables them to manage assets more efficiently, save costs, and better identify safety risks.

No wonder there is a growing demand for 3D analytics. Point Clouds are the future of Computer Vision. AI-based solutions can now consume data in its canonical form and interpret observations in 3D. Providing inspectors with extra valuable information will lead to better results and more robust performance. It will, therefore, not be long before more of the visual tasks currently performed by humans are performed by intelligent digital inspectors. It is now essential to think about the impact of Computer Vision on our social and economic structures. If we do this right, the benefits and possibilities are endless. After all, pictures say more than 1,000 words!

Read the full article in Dutch here.

Maarten Stol, Principal Scientific Adviser at BrainCreators, and Ghailen Ben Achour, researcher at BrainCreators.


