Visual Odometry vs. Visual SLAM vs. Structure-from-Motion

3 min read · Oct 22, 2021

The main goal of SLAM (Simultaneous Localization and Mapping) is to obtain a global, consistent estimate of the robot path. The map of the environment is usually kept mainly to help localization. Map information is used in the Visual Odometry and Loop Closure blocks. When a loop closure is detected, this information is used to reduce the drift in both the map and the pose estimate. Besides localization, loop detection and loop closure are the two main issues in SLAM [1].

Visual Odometry (VO) aims at recovering the path incrementally, pose after pose, potentially optimizing only over the last n poses of the path (windowed bundle adjustment). In VO, the local consistency of the trajectory is the main concern, and the local map is used to obtain a more accurate estimate of the local trajectory. SLAM, in contrast, is concerned with global map consistency [1].
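To make "pose after pose" concrete, here is a minimal 2D sketch of how incremental estimation accumulates drift. It is not taken from [1]: the function names (`compose`, `integrate`, `square_loop`), the planar (x, y, theta) pose, and the constant per-step heading bias of 0.01 rad are all assumptions chosen for illustration. A real VO front end would estimate the relative motions from image features instead of generating them.

```python
import math

def compose(pose, delta):
    """Apply a relative motion (dx, dy, dtheta), given in the robot frame,
    to a global pose (x, y, theta)."""
    x, y, theta = pose
    dx, dy, dtheta = delta
    return (
        x + dx * math.cos(theta) - dy * math.sin(theta),
        y + dx * math.sin(theta) + dy * math.cos(theta),
        theta + dtheta,
    )

def integrate(deltas, start=(0.0, 0.0, 0.0)):
    """Chain relative motions into a trajectory, one pose at a time."""
    poses = [start]
    for delta in deltas:
        poses.append(compose(poses[-1], delta))
    return poses

def square_loop(side_steps=10, step=1.0, heading_bias=0.0):
    """Relative motions for driving once around a square (hypothetical data).
    heading_bias is an assumed constant error added to every heading increment."""
    deltas = []
    for _ in range(4):
        for i in range(side_steps):
            turn = math.pi / 2 if i == side_steps - 1 else 0.0
            deltas.append((step, 0.0, turn + heading_bias))
    return deltas

def end_error(poses):
    """Distance between the first and last estimated positions."""
    return math.hypot(poses[-1][0] - poses[0][0], poses[-1][1] - poses[0][1])

ideal = integrate(square_loop())
drifted = integrate(square_loop(heading_bias=0.01))

print(f"end-point error without bias: {end_error(ideal):.3f} m")
print(f"end-point error with bias:    {end_error(drifted):.3f} m")
```

Each pose is built only from the previous one, so a small error in any step is carried into every later pose. The trajectory stays locally plausible, but nothing in this scheme pulls the end of the loop back onto its start.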

In Figure-1, the map built by VO is on the left and the map built by SLAM is on the right. In SLAM, at point “B”, loop detection and loop closure let the robot recognize that it has passed through that place before, so it recovers the real topology of the environment. In VO, the robot behaves as if it were moving along an infinite corridor and keeps exploring new areas indefinitely [2].
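The effect of closing the loop can be sketched with a deliberately crude correction: once the robot knows that its current position is the same place it visited earlier, the gap between the two estimates is spread linearly over the trajectory. This is only a stand-in for the pose-graph or bundle-adjustment optimization a real SLAM back end would run, and the function name `close_loop` and the sample track are hypothetical.

```python
import math

def close_loop(positions, revisited):
    """Spread the end-point error linearly along the trajectory so that the
    last position lands on the revisited place. A crude illustration, not a
    real pose-graph optimizer."""
    n = len(positions) - 1
    if n < 1:
        return list(positions)
    ex = positions[-1][0] - revisited[0]
    ey = positions[-1][1] - revisited[1]
    return [
        (x - ex * i / n, y - ey * i / n)
        for i, (x, y) in enumerate(positions)
    ]

# Hypothetical drifted VO track around a square that should end where it began.
vo_track = [(0.0, 0.0), (1.0, 0.1), (1.1, 1.1), (0.2, 1.3), (0.3, 0.4)]
corrected = close_loop(vo_track, revisited=vo_track[0])

gap_before = math.dist(vo_track[-1], vo_track[0])
gap_after = math.dist(corrected[-1], corrected[0])
print(f"gap before loop closure: {gap_before:.3f}")
print(f"gap after loop closure:  {gap_after:.3f}")
```

The key point is that the correction needs the loop detection first: without knowing that the last pose and the first pose are the same place, there is no error to distribute.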

Visual Odometry is one of the main building blocks of Visual SLAM. The other main blocks are Loop Closure, Backend Optimization and Reconstruction (see Figure-2). The Reconstruction block may seem unnecessary in classical SLAM approaches, but some applications need a dense representation of the environment, which the Reconstruction block provides. The dense map can be used for navigation, obstacle avoidance and interaction in robotic applications [3].

Figure-2: Building blocks of Visual SLAM

Structure from Motion (SfM) is a more general concept than Visual SLAM, but the two have a lot in common. SfM is usually performed offline on unordered sets of images. It is mostly concerned with creating a map of the environment from several images taken from different perspectives, and the images can even come from different cameras. Visual SLAM is about solving the localization problem while constructing the map of the environment. In Visual SLAM the image sequence must be ordered, and the images are usually taken with the same camera. The relation between SfM, Visual SLAM and Visual Odometry is summarized in Figure-3.
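One practical consequence of ordered versus unordered input can be sketched by counting the image pairs that are candidates for feature matching. The sketch below assumes a simplified view: an ordered sequence is matched only between consecutive frames, while an unordered collection has to consider every pair. Both function names are hypothetical, and practical SfM pipelines typically prune the exhaustive set rather than match all of it.

```python
from itertools import combinations

def sequential_pairs(num_frames):
    """Candidate pairs for an ordered sequence: each frame with its successor."""
    return [(i, i + 1) for i in range(num_frames - 1)]

def exhaustive_pairs(num_images):
    """Candidate pairs for an unordered collection: every image with every other."""
    return list(combinations(range(num_images), 2))

n = 100
print(len(sequential_pairs(n)))  # grows linearly: n - 1
print(len(exhaustive_pairs(n)))  # grows quadratically: n * (n - 1) / 2
```

The linear growth is part of what makes the ordered case suitable for running online, while the quadratic growth fits the offline setting described above.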

References

[1] D. Scaramuzza and F. Fraundorfer, “Visual Odometry Part I: The First 30 Years and Fundamentals”

[2] C. Cadena, L. Carlone, H. Carrillo, Y. Latif, D. Scaramuzza, J. Neira, I. Reid, and J. J. Leonard, “Past, Present, and Future of Simultaneous Localization And Mapping: Towards the Robust-Perception Age”, IEEE Transactions on Robotics, 32(6), pp. 1309–1332, 2016.

[3] Xiang Gao, Tao Zhang, Yi Liu, and Qinrui Yan, “14 Lectures on Visual SLAM: From Theory to Practice”, Publishing House of Electronics Industry, 2017.

Written by GUVEN CETINKAYA
