Traditional feature detectors and trackers aggregate information over 2D patches to detect and match discriminative patches. However, this information changes at object boundaries when an object moves against a significantly varying background. In this paper, we propose a new approach for feature detection, tracking and re-detection that gives significantly improved results at object boundaries. We utilize level lines, or iso-intensity curves, which often remain stable and can be reliably detected even at the object boundaries, which they often trace. We detect stable portions of long level lines and take points of high curvature on these curves as corners. Further, the level line is used to separate the portions of the neighborhood belonging to the two objects, and this separation is then used for robust matching of such points. While such CoMaL (Corners on Maximally-stable Level Line Segments) points were found to be much more reliable in the object boundary regions, they also perform comparably in the interior regions. This is illustrated in extensive experiments on real-world datasets.
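The two ideas above, corners as high-curvature points on a curve and matching restricted to one side of a level line, can be illustrated with a minimal sketch. This is not the implementation evaluated in the paper: the function names (`turning_angles`, `corners_on_curve`, `masked_ssd`, `square_curve`), the turning-angle curvature proxy, the `min_angle` threshold, and the use of a simple intensity threshold to pick the side of the level line are all assumptions made for illustration.

```python
import numpy as np

def turning_angles(curve):
    """Absolute turning angle (radians) at each vertex of a closed polyline.

    `curve` is an (N, 2) array of points; the turning angle is used here
    as a simple stand-in for curvature (an illustrative assumption).
    """
    v_in = curve - np.roll(curve, 1, axis=0)     # segment arriving at each vertex
    v_out = np.roll(curve, -1, axis=0) - curve   # segment leaving each vertex
    cross = v_in[:, 0] * v_out[:, 1] - v_in[:, 1] * v_out[:, 0]
    dot = np.sum(v_in * v_out, axis=1)
    return np.abs(np.arctan2(cross, dot))

def corners_on_curve(curve, min_angle=np.pi / 4):
    """Indices of vertices whose turning angle is at least `min_angle` (hypothetical threshold)."""
    return np.flatnonzero(turning_angles(curve) >= min_angle)

def masked_ssd(patch_a, patch_b, mask):
    """Sum of squared differences over the pixels selected by the boolean `mask` only."""
    diff = (patch_a.astype(float) - patch_b.astype(float))[mask]
    return float(np.sum(diff ** 2))

def square_curve(side=4):
    """Closed square outline sampled at unit steps, standing in for a level line."""
    s = np.arange(side)
    zeros = np.zeros(side)
    full = np.full(side, side)
    bottom = np.stack([s, zeros], axis=1)
    right = np.stack([full, s], axis=1)
    top = np.stack([side - s, full], axis=1)
    left = np.stack([zeros, side - s], axis=1)
    return np.concatenate([bottom, right, top, left]).astype(float)

if __name__ == "__main__":
    # Corners of the toy level line: only the four square corners turn sharply.
    print(corners_on_curve(square_curve(4)))  # [ 0  4  8 12]

    # Same object (intensity 100) seen against two different backgrounds.
    patch_a = np.full((5, 5), 20.0)
    patch_a[:, :3] = 100.0
    patch_b = np.full((5, 5), 200.0)
    patch_b[:, :3] = 100.0

    # Pixels on the object side of the level line at intensity 60 (assumed level).
    object_side = patch_a >= 60

    print(masked_ssd(patch_a, patch_b, object_side))                # 0.0
    print(masked_ssd(patch_a, patch_b, np.ones_like(object_side)))  # 324000.0
```

In this toy case the full-patch SSD is dominated by the background change, while the SSD restricted to the object side of the level line is zero, which is the intuition behind matching only the portion that belongs to one object.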
Results are shown on the KITTI tracking dataset (Figure 10), CoMaL dataset (Figure 11), KITTI Stereo dataset (Figure 12), KITTI Flow dataset (Figure 13) and Oxford dataset (Figure 14).
Figure 10. Frames 88 and 93 of the sequence Car-B from the KITTI dataset [4]. CoMaL + SSD matches are shown in the first row, followed by the next-best-performing combinations, Hessian [5] + SIFT [6] and FAST [7] + NSD [8], in the second and third rows respectively. CoMaL points are matched more numerously and accurately in the object boundary regions in spite of a significant change in the background.
Figure 11. Top two rows: matches on a homogeneous object (Box). Bottom two rows: matches on a textured object (Hero). In each pair of rows, CoMaL + SSD matches are shown in the first row and Hessian + SIFT matches in the second.
Figure 12. Matches on stereo image pair 138 from the KITTI dataset. CoMaL + SSD matches are shown in the first row, while Hessian + SIFT matches are shown in the second. CoMaL matches are clearly more numerous.
Figure 13. Matches on flow image pair 63 from the KITTI dataset. CoMaL + SSD matches are shown in the first row, while Hessian + SIFT matches are shown in the second.
Figure 14. Repeatability and correspondences graphs for the Bikes, Leuven and UBC sequences from the Oxford dataset, in the first, second and third rows respectively.
References
[1] G. Zhang, Z. Dong, J. Jia, et al. Efficient non-consecutive feature tracking for structure-from-motion. In ECCV, 2010.
[2] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide baseline stereo from maximally stable extremal regions. In BMVC, 2002.
[3] C. Harris and M. Stephens. A combined corner and edge detector. In Alvey vision conference, 1988.
[4] A. Geiger, P. Lenz, C. Stiller, and R. Urtasun. Vision meets robotics: The KITTI dataset. In The International Journal of Robotics Research, 2013.
[5] K. Mikolajczyk and C. Schmid. Scale & affine invariant interest point detectors. In IJCV, 2004.
[6] D. Lowe. Distinctive image features from scale-invariant keypoints. In IJCV, 2004.
[7] E. Rosten and T. Drummond. Machine learning for high-speed corner detection. In ECCV, 2006.
[8] J. Byrne and J. Shi. Nested shape descriptors. In ICCV, 2013.
[9] B. Lucas and T. Kanade. An iterative image registration technique with an application to stereo vision. In IJCAI, 1981.