Computer Vision Complements AR!

By CIOReview | Friday, June 28, 2019

For the technology to function, the computer must also understand the context of the physical world, which is where computer vision comes into play in AR.

FREMONT, CA: Augmented reality (AR) has been a buzzword for the last few years. Mobile apps incorporating AR increasingly provide an enhanced user experience, allowing people to virtually try on new clothes and makeup, turn into various online characters, and visualize furniture in their drawing rooms.

At its core, AR superimposes computer-generated sensory elements, such as audio or images that resemble real-world entities, onto the user's view. However, there is a twist to the plot as computer vision enters the arena.

Computer vision is an interdisciplinary field concerned with how computers understand images and digital video. In a way, it aims to digitize the human visual system, acquiring, processing, and understanding images much as the human eye and brain do. Facebook, for instance, uses the same technology to recognize people and suggest photo tags.

Another example is the Snapchat filter, in which the app uses computer vision to understand and process each camera frame, for instance to locate a human face. Relying on a few near-universal truths, such as eye sockets being darker than the forehead and the eye region being darker than the upper cheek, it scans the image data and analyzes the contrast between darker and lighter pixels.
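
The contrast rules described above are the intuition behind classic Haar-cascade face detection. The following is a minimal sketch, not Snapchat's actual pipeline, assuming OpenCV and a hypothetical saved camera frame ("frame.jpg"), of how an app might locate a face before anchoring a filter to it.

```python
# Minimal sketch: face detection from darker/lighter pixel contrasts (Haar features).
import cv2

# OpenCV ships a pretrained frontal-face cascade that encodes contrast rules
# like "eye region darker than upper cheek".
cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
face_cascade = cv2.CascadeClassifier(cascade_path)

frame = cv2.imread("frame.jpg")                      # hypothetical camera frame
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

# Scan the frame at multiple scales, comparing sums of darker and lighter regions.
faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

for (x, y, w, h) in faces:
    # A filter overlay would be anchored to this bounding box.
    cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)

cv2.imwrite("faces.jpg", frame)
```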

For the technology to function, the computer must understand the real-world context of the images. For instance, when a pizza box is scanned to unlock AR content, the app captures frames through the phone's camera and processes the position of the box, the pictures printed on it, and the orientation of the frame. Once the box is recognized, the AR content can be overlaid on it.
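
One common way to recognize a printed marker such as the box artwork is feature matching. The sketch below uses OpenCV's ORB features as an illustration; the image file names and the match threshold are hypothetical, and a production AR engine would also estimate the marker's pose.

```python
# Minimal sketch of marker recognition via feature matching.
import cv2

reference = cv2.imread("pizza_box_art.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical marker image
frame = cv2.imread("camera_frame.jpg", cv2.IMREAD_GRAYSCALE)       # hypothetical camera frame

# Detect and describe keypoints in both images.
orb = cv2.ORB_create(nfeatures=1000)
kp_ref, des_ref = orb.detectAndCompute(reference, None)
kp_frame, des_frame = orb.detectAndCompute(frame, None)

# Match descriptors; enough good matches suggests the box is in view.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des_ref, des_frame), key=lambda m: m.distance)

if len(matches) > 40:  # arbitrary threshold for this sketch
    print("Marker recognized: unlock the AR content and estimate its orientation")
else:
    print("Marker not found in this frame")
```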

There is another AR variation, popularly referred to as position-based AR, in which the app also uses GPS data along with the device's accelerometer, compass, and gyroscope. The accelerometer and gyroscope determine direction and orientation, respectively. Position-based AR utilizes geolocation instead of an embedded marker to display the content. Rather than pointing the phone at the pizza box, a user can point it at a sign outside the outlet to fetch additional information such as the menu or the hours of operation.
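
To make the position-based idea concrete, here is a minimal sketch of the underlying logic. The GPS fix, compass heading, outlet coordinates, and field-of-view threshold are all hypothetical values; a real app would read them from the device's sensors.

```python
# Minimal sketch: decide whether a point of interest lies in the camera's view.
import math

def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial bearing from point 1 to point 2, in degrees from true north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    x = math.sin(dlon) * math.cos(phi2)
    y = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlon)
    return (math.degrees(math.atan2(x, y)) + 360) % 360

# Hypothetical sensor readings and outlet location.
device_lat, device_lon = 37.5485, -121.9886   # GPS fix
compass_heading = 80.0                        # degrees, from the compass
outlet_lat, outlet_lon = 37.5490, -121.9850   # known geolocation of the outlet

to_outlet = bearing_deg(device_lat, device_lon, outlet_lat, outlet_lon)

# If the outlet falls within roughly half the camera's field of view, show the overlay.
if abs((to_outlet - compass_heading + 180) % 360 - 180) < 30:
    print("Outlet in view: display menu and hours of operation")
else:
    print("Outlet outside the field of view")
```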