This video shows two demos: in the first, Qbo detects, recognizes and learns people; in the second, he detects, recognizes and learns objects. Switching between the demos is done through voice commands, as are the recognition and learning commands. Interaction with Qbo is therefore very natural and needs neither a display nor a keyboard. In both demos, the computer vision library OpenCV (version 2.2) was used for image processing and for the machine learning algorithms.
In the first demo, Qbo uses a Haar cascade classifier (inspired by the Viola-Jones method) to find faces in the image, and then CAMSHIFT with a skin color filter to track the face through the image. In the second demo, objects are selected using the robot’s stereoscopic vision: he focuses his attention on the nearest objects. For the best results, hold the object from behind and try to keep your hand(s) hidden from the robot.
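The idea of focusing on the nearest object can be illustrated with a small sketch. This is not Qbo's actual code: it assumes a disparity map is already available as a plain 2D list (larger disparity means closer, 0 means no stereo match), and the function names and the `tolerance` parameter are made up for illustration.

```python
def nearest_object_mask(disparity, tolerance=0.2):
    """Mark the pixels that belong to the nearest surface.

    disparity: 2D list of non-negative numbers (assumed: larger = closer,
               0 = no stereo match).
    tolerance: hypothetical parameter; fraction below the peak disparity
               that still counts as "near".
    """
    peak = max(max(row) for row in disparity)
    if peak <= 0:
        # No valid stereo match anywhere: nothing to select.
        return [[0] * len(row) for row in disparity]
    threshold = peak * (1.0 - tolerance)
    return [[1 if d >= threshold else 0 for d in row] for row in disparity]


def bounding_box(mask):
    """Return (top, left, bottom, right) of the marked pixels, or None."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [c for row in mask for c, v in enumerate(row) if v]
    return rows[0], min(cols), rows[-1], max(cols)


# Toy disparity map: a close object in the middle of a distant background.
disparity = [
    [2, 2, 2, 2, 2],
    [2, 9, 10, 2, 2],
    [2, 10, 9, 2, 0],
    [2, 2, 2, 2, 2],
]
print(bounding_box(nearest_object_mask(disparity)))  # (1, 1, 2, 2)
```

A selection rule of this kind also suggests why the hand matters: a hand held in front of the object would have the largest disparity and would be picked up as part of the nearest region.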
For recognition and learning, both demos use the “Bag of Words” approach together with SURF descriptors to represent each face or object. Due to the nature of SURF descriptors, recognition is invariant to colour, scale and rotation, and tolerates partial occlusion. This means there are few restrictions on how you show objects to Qbo: at 2:10 a rotating newspaper is shown to him and he still recognizes it. SVM classifiers are used for face/object recognition; they take as input the SURF descriptors extracted from the image and the generated “codewords” (see Bag of Words in Computer Vision). Learning is reinforced through repetition: the more often an object or a person is taught to Qbo, the better Qbo will recognize it, him or her.
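To make the Bag of Words step more concrete, here is a minimal sketch of how a set of local descriptors can be turned into a fixed-length histogram over codewords. It is a toy illustration, not Qbo's implementation: real SURF descriptors have 64 or 128 dimensions and the codebook would normally come from clustering many training descriptors, whereas the 2-D vectors, the codebook and the function names below are invented for the example.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_codeword(descriptor, codebook):
    """Index of the codeword closest to the given descriptor."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(descriptor, codebook[i]))


def bow_histogram(descriptors, codebook):
    """Normalized count of how many descriptors fall on each codeword."""
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[nearest_codeword(d, codebook)] += 1
    total = sum(counts)
    if total == 0:
        # No descriptors were extracted: return an all-zero histogram.
        return [0.0] * len(codebook)
    return [c / total for c in counts]


# Toy data (assumed): three codewords and four descriptors from one image.
codebook = [[0, 0], [10, 10], [0, 10]]
descriptors = [[1, 0], [9, 9], [0, 1], [1, 9]]
print(bow_histogram(descriptors, codebook))  # [0.5, 0.25, 0.25]
```

The histogram ignores where in the image each descriptor was found, so a few missing descriptors change it only slightly. A vector like this is the kind of fixed-length input an SVM classifier can be trained on.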
In the face recognition demo, all training data (images and SVM files) is stored on Qbo’s PC. In the object recognition demo, however, the SVM classifiers are trained in the cloud (on a server) and the generated files are stored there. Any other Qbo, anywhere in the world, can therefore access these files, download them, and locally recognize objects that were trained elsewhere.
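The download-then-recognize-locally pattern can be sketched as a small cache with a remote fallback. Everything here is hypothetical: the class name, the `fetch_remote` callable and the file name are placeholders, and the real server protocol used by Qbo is not described in this post.

```python
class ClassifierCache:
    """Local store of trained classifier files with a remote fallback."""

    def __init__(self, fetch_remote):
        # fetch_remote(name) is a hypothetical callable that returns the
        # file contents as bytes, or None if the server has no such file.
        self._fetch_remote = fetch_remote
        self._local = {}

    def get(self, name):
        """Return the classifier file, downloading it on first use only."""
        if name not in self._local:
            data = self._fetch_remote(name)
            if data is None:
                raise KeyError(name)
            self._local[name] = data
        return self._local[name]


# Stand-in for the server: a dict plus a log of download requests.
server_files = {"newspaper.svm": b"trained-model-bytes"}
downloads = []

def fake_fetch(name):
    downloads.append(name)
    return server_files.get(name)

cache = ClassifierCache(fake_fetch)
cache.get("newspaper.svm")   # first call downloads the file
cache.get("newspaper.svm")   # second call is served locally
print(len(downloads))        # 1
```

Once a file is cached, recognition no longer depends on the network, which matches the idea of recognizing locally what was trained elsewhere.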