Visual Design and Engineering Laboratory

Carnegie Mellon University

Intermodal Image-Based Recognition of Planar Kinematic Mechanisms

Paper

[Paper PDF]

[Publisher Version]

Abstract

We present a data-driven exploratory study to investigate whether trained object detectors generalize well to test images from a different modality. We focus on the domain of planar kinematic mechanisms, which can be viewed as a set of rigid bodies connected by joints, and use textbook graphics and images of hand-drawn sketches as input modalities. The goal of our algorithm is to automatically recognize the underlying mechanical structure shown in an input image by leveraging well-known computer vision methods for object recognition with the optimizing power of multiobjective evolutionary algorithms. Taking a raw image as input, we detect pin joints using local feature descriptors in a support vector machine framework. Improving upon previous work, detection confidence depends on multiple context-based classifiers of varying image patch size and greedy foreground extraction. The likelihood of rigid body connections is approximated using normalized geodesic time, and NSGA-II is used to evolve optimal mechanisms using this data. The present work is motivated by the observation that textbook diagrams and hand-drawn sketches of mechanisms exhibit similar object structure, yet have different visual characteristics. We apply our method using various combinations of images for training and testing, and the results demonstrate a trade-off between solvability and accuracy.
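The abstract names NSGA-II as the optimizer and reports a trade-off between two competing criteria. As a rough illustration of the non-dominated sorting idea at the core of NSGA-II, the sketch below ranks candidate solutions into Pareto fronts. It is not the authors' implementation: the function names, the two-objective tuples, and the sample values are all hypothetical, and both objectives are assumed to be minimized.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is no worse than b in every objective and
    strictly better in at least one (all objectives minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated_sort(points: List[Sequence[float]]) -> List[List[int]]:
    """Group point indices into successive Pareto fronts.

    Front 0 holds the points that no other point dominates; front 1 holds
    the points that are non-dominated once front 0 is removed, and so on.
    This is a simple quadratic-per-front version, not the bookkeeping
    scheme used by the original NSGA-II algorithm.
    """
    remaining = set(range(len(points)))
    fronts: List[List[int]] = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(points[j], points[i])
                       for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


# Hypothetical candidates scored on two made-up objectives (lower is better).
candidates = [(0.1, 0.9), (0.4, 0.4), (0.9, 0.1), (0.5, 0.5), (0.9, 0.9)]
fronts = non_dominated_sort(candidates)
```

In a full NSGA-II loop, a ranking like this would be combined with a crowding-distance measure to select the parents of the next generation; that part is omitted here.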

Citation

Matthew Eicholtz, Levent Burak Kara. (2014). Intermodal Image-Based Recognition of Planar Kinematic Mechanisms. Journal of Visual Languages and Computing, 27(1):38-48. Also presented at the 2014 International Conference on Distributed Multimedia Systems, Workshop on Visual Languages and Computing (VLC), August 27-29, 2014, Pittsburgh, PA.

@article{EicholtzKara_2015_JVLC,
 title = "Intermodal image-based recognition of planar kinematic mechanisms",
 journal = "Journal of Visual Languages \& Computing",
 volume = "27",
 number = "1",
 pages = "38-48",
 year = "2014",
 issn = "1045-926X",
 doi = "http://dx.doi.org/10.1016/j.jvlc.2014.10.024",
 url = "http://www.sciencedirect.com/science/article/pii/S1045926X14001177",
 author = "Matthew Eicholtz and Levent Burak Kara",
 keywords = "Computer vision, Evolutionary multiobjective optimization, Kinematic simulation, Object recognition"
}