What is Computer Vision?

Computer vision is a branch of artificial intelligence (AI) that helps machines perceive the world. It depends heavily on video annotation, the practice of labeling the objects and elements in each frame of a video. Without annotations, machines can see through cameras, but they wouldn’t know what to do with that information; the footage would be nothing more than video recordings. Computer vision has applications in many industries, from law enforcement to healthcare. But how does it work? In this article, we explain the basics and applications of computer vision in today’s world.


Video annotation builds on image annotation; the main difference is that there are far more images (frames, in this case) to annotate. Videos can be annotated in two ways: the single-frame method or the streaming video method.

The single-frame method is the original way to annotate videos. The video is broken down into its individual frames, and each frame is annotated one by one. It is a long, grueling process that some may even call inefficient, but it is still a good option for videos in which the objects don’t move much, such as a video of a car parked in a garage or plants in a greenhouse.
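To make the single-frame idea concrete, here is a minimal Python sketch. The names `Box`, `annotate_single_frame`, and `label_parked_car` are hypothetical and the pixel coordinates are made up for illustration; a real annotation platform would have a person draw each box in a user interface. The point is only that every frame gets its own, independently stored annotation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """One labeled rectangle on a single frame (pixel units, illustrative)."""
    label: str
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


def annotate_single_frame(num_frames, label_frame):
    """Call label_frame(i) once for every frame and collect the results.

    label_frame stands in for the human annotator: it receives a frame
    index and returns the list of boxes drawn on that frame.
    """
    return {i: label_frame(i) for i in range(num_frames)}


def label_parked_car(frame_index):
    # A stationary object: the same box is drawn on every frame.
    return [Box("car", 120, 80, 200, 100)]


annotations = annotate_single_frame(5, label_parked_car)
print(len(annotations), "frames annotated")
```

Because nothing links one frame to the next, the effort grows with the number of frames, which is why this approach suits short clips or scenes where little changes.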

Streaming video is the more common method nowadays. It uses the tools of a video annotation platform to analyze the stream of frames and track an object’s movement across them, for example a pedestrian crossing the street or a baseball player swinging a bat. Compared to the single-frame method, this makes it easier for machines to learn: they pick up the pattern of the object’s movement and use it as a basis for self-improvement. Overall, it’s a more efficient, accurate, and convenient process.
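The article does not say how a platform follows an object from frame to frame, so the sketch below uses one simple stand-in technique: keyframe interpolation. It assumes the annotator draws a box on a few keyframes and the tool fills in the frames between them along a straight line. The function name `interpolate_boxes` and the coordinates are hypothetical; real tools may use more sophisticated trackers.

```python
def interpolate_boxes(keyframes):
    """Fill in boxes between annotated keyframes by linear interpolation.

    keyframes maps a frame index to an (x, y, w, h) tuple. The result
    contains one box for every frame from the first keyframe to the last.
    """
    if not keyframes:
        return {}
    frames = sorted(keyframes)
    result = {}
    # Walk over each consecutive pair of keyframes.
    for start, end in zip(frames, frames[1:]):
        a, b = keyframes[start], keyframes[end]
        span = end - start
        for f in range(start, end):
            t = (f - start) / span  # 0.0 at start, approaching 1.0 at end
            result[f] = tuple(p + t * (q - p) for p, q in zip(a, b))
    # The last keyframe is copied as-is.
    result[frames[-1]] = tuple(keyframes[frames[-1]])
    return result


# A pedestrian moving left to right: only frames 0 and 4 are labeled by hand.
pedestrian = interpolate_boxes({
    0: (10.0, 50.0, 20.0, 40.0),
    4: (50.0, 50.0, 20.0, 40.0),
})
print(pedestrian[2])
```

Here two hand-drawn boxes yield annotations for five frames, which gives a rough sense of why tracking an object across frames saves so much effort compared with labeling each frame separately.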


As mentioned earlier, computer vision can be used in various ways. Engineers, for example, recently developed a dog trainer robot that serves as a companion for pets while the owner is away, keeping them occupied and calm. With cameras and AI, the robot can assess the pet’s disposition and notify the owner if it notices something strange. It can also perform tasks like feeding or playing. Beyond training and monitoring pets, parents can use a similar system to watch infants or young children who are in another room.

Some Japanese convenience stores have adopted a “cashierless system” in which cameras are integrated into the shelves and refrigerators. The system knows what each customer took and uses that information to ring up their total at checkout. A system like this can also help other stores as a convenient way to monitor inventory and as a security measure against shoplifting.

These examples are just the tip of the iceberg. Computer vision also assists in scientific research and wildlife monitoring. The data these systems collect helps train future AI models, which in turn leads to more intelligent AI. By combining computer vision with other technologies like robotics and language processing, we might soon reach a time when robots live among us and we can’t even tell the difference.