Perceptual Computing Lab.



Director: Prof. Tetsunori Kobayashi

We are developing man-machine communication interfaces for humanoid robots, especially for communication in daily life. The main part of our study is the modeling and development of a multi-modal interface, which is needed to handle the communication styles of everyday life.

ROBITA (Real-world Oriented BI-modal Talking Agent)


Speech & Image Processing for Man-Machine Communication



We are developing a large-vocabulary continuous speech recognizer based on HMMs. Natural language processing techniques are used to understand the meaning of the spoken words. For image processing, the robot recognizes faces and face directions using statistical methods (PCA, ICA), and recognizes gestures by processing the video stream.
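The page gives no implementation details, but PCA-based face recognition is classically realized as "eigenfaces": project face images onto the leading principal components and match by nearest neighbor in that subspace. The Python sketch below is a minimal illustration under that assumption; the function names, toy data, and matching rule are ours, not the lab's code.

    # Minimal eigenfaces sketch: PCA-based face recognition (illustrative only).
    import numpy as np

    def fit_eigenfaces(faces, n_components=16):
        """faces: (n_samples, n_pixels) matrix of flattened face images."""
        mean = faces.mean(axis=0)
        centered = faces - mean
        # The right singular vectors of the centered data are the eigenfaces.
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        components = vt[:n_components]        # (n_components, n_pixels)
        weights = centered @ components.T     # gallery projected into face space
        return mean, components, weights

    def recognize(face, mean, components, weights, labels):
        """Return the label of the nearest gallery face in eigenface space."""
        w = (face - mean) @ components.T
        dists = np.linalg.norm(weights - w, axis=1)
        return labels[int(np.argmin(dists))]

    # Toy usage: four 8-pixel "images"; a gallery image matches itself.
    rng = np.random.default_rng(0)
    gallery = rng.normal(size=(4, 8))
    mean, comps, w = fit_eigenfaces(gallery, n_components=2)
    print(recognize(gallery[2], mean, comps, w, ["A", "B", "C", "D"]))  # "C"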

Demonstration scenes (A and B are human participants, R is ROBITA; "> X" marks the addressee):

Scene.1
A: Hello. > R
R: Hello. > A
Scene.2
A: It can talk like this. > B
Scene.3
R: (interest) > B
B: Is it all right to ask it any kind of question? > A
Scene.4
R: (interest) > A
A: Yes, please try asking it something. > B
Scene.5
R: (interest) > B
B: How old are you? > R
Scene.6
R: I am four years old. > B

Participation in Group Communication



Group conversation is a face-to-face, multiparty style of communication that occurs frequently in daily life. It raises many problems that conventional one-to-one conversation systems do not consider: perceiving the exchange of messages (recognizing who is speaking and to whom), recognizing the participants' expressions, and forming a strategy for taking part in the conversation. We address these problems with a multi-modal interface that combines face recognition, face-direction recognition, sound source estimation, speech recognition, and gestural output using the robot's real body.
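The page does not describe how these modalities are combined. As a rough illustration only, the Python sketch below fuses a sound-source direction estimate with tracked positions and face directions to guess who is speaking and to whom; the Person type, the shared robot-centered coordinate frame, and the 20-degree tolerance are all our assumptions, not ROBITA's design.

    # Illustrative fusion of sound-source direction and face directions.
    import math
    from dataclasses import dataclass

    @dataclass
    class Person:
        name: str
        x: float              # position in the room (meters, robot at origin)
        y: float
        face_dir_deg: float   # direction the person's face points (world frame)

    def angular_diff(a, b):
        """Smallest absolute difference between two angles, in degrees."""
        d = abs(a - b) % 360.0
        return min(d, 360.0 - d)

    def infer_exchange(people, sound_dir_deg, tol=20.0):
        """Guess (speaker, addressee) from sound and face directions."""
        def bearing(p):                       # direction of p as seen from robot
            return math.degrees(math.atan2(p.y, p.x))
        # Speaker: the person whose bearing best matches the sound source.
        speaker = min(people, key=lambda p: angular_diff(bearing(p), sound_dir_deg))
        if angular_diff(bearing(speaker), sound_dir_deg) > tol:
            return None, None                 # sound came from no tracked person
        # Addressee: whoever the speaker's face points toward most directly.
        def toward(p):
            return math.degrees(math.atan2(p.y - speaker.y, p.x - speaker.x))
        others = [p for p in people if p is not speaker]
        addressee = min(others,
                        key=lambda p: angular_diff(speaker.face_dir_deg, toward(p)))
        return speaker, addressee

    # Example: sound arrives from A's direction, and A is facing B, not C.
    A = Person("A", 1.0, 0.0, face_dir_deg=180.0)
    B = Person("B", -1.0, 0.0, face_dir_deg=0.0)
    C = Person("C", 0.0, 1.0, face_dir_deg=-90.0)
    spk, addr = infer_exchange([A, B, C], sound_dir_deg=0.0)
    print(spk.name, "->", addr.name)          # A -> B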

Multi-modal Interaction



Humans tend to use non-verbal forms of expression (such as saying "Take that box!" with a pointing gesture) rather than giving a full account of an instruction in words. ROBITA can understand human pointing gestures through image processing and can convey its own intentions through gestures of its arms.
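A common way to ground such a pointing gesture, sketched below purely as an illustration (the ray-casting rule, the object list, and the 0.3 m threshold are our assumptions, not ROBITA's documented method), is to cast a ray from the hand along the estimated pointing direction and select the known object closest to that ray.

    # Illustrative pointing resolution: pick the object nearest the pointing ray.
    import numpy as np

    def resolve_pointing(hand, direction, objects, max_offset=0.3):
        """hand, direction: 3-vectors; objects: dict of name -> 3-D position."""
        h = np.asarray(hand, float)
        d = np.asarray(direction, float)
        d = d / np.linalg.norm(d)                # unit pointing direction
        best, best_dist = None, max_offset
        for name, pos in objects.items():
            v = np.asarray(pos, float) - h
            t = float(v @ d)                     # distance along the ray
            if t <= 0:
                continue                         # object is behind the hand
            offset = np.linalg.norm(v - t * d)   # perpendicular distance to ray
            if offset < best_dist:
                best, best_dist = name, offset
        return best                              # None if nothing is close enough

    # Example: "Take that box!" with the hand pointing roughly at box_a.
    objects = {"box_a": [1.0, 0.1, 0.0], "box_b": [0.0, 1.0, 0.0]}
    print(resolve_pointing([0, 0, 0], [1.0, 0.05, 0.0], objects))  # box_a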






Copyright by Humanoid Robotics Institute, Waseda University. All rights reserved.