|Affiliation||Computer Science, Information Technology|
|Profile||My research fields are artificial intelligence, machine learning, and signal processing. My main research interest is knowledge acquisition from multimodal signals. Some representative studies are introduced below.
(1) Language acquisition from multimodal sensors such as microphones and cameras
We study how babies learn their native language through senses such as hearing and vision, and how linguistic symbols are grounded through multimodal interactions in the community. We have proposed several models that allow computers to capture the word concepts of nouns and adjectives using a multi-agent framework that contains no symbols. We are currently researching primitive-level agents that can replace deep neural networks.
(2) Audio chord estimation from music signals, including estimation of chord sequences
We study how humans listen to music and what they perceive and feel from it. Computers can now identify a song title using their own method, known as music fingerprinting. However, this way of listening is very different from the human way. When we listen to music, we perceive three primary elements: melody, harmony, and rhythm. These three elements are fundamental to listening to music, and computers need to use them to recognize music as humans do. Over the last several years, we have proposed automatic audio chord estimators based on deep neural networks that achieve state-of-the-art performance. Currently, we are studying rich musical features as inputs to neural networks. Using these abundant features, we are trying to get the neural network to learn musical chords by focusing on and/or selecting only the appropriate features.
(3) Semantic Image Segmentation
Segmenting objects from a scene is the most basic and important behavior of humans living in this world.
We humans assume that the visual information from our eyes lets us segment objects with detailed contours. In reality, however, the actual visual information has considerably low resolution, and the beautifully segmented map is created only inside the brain. We have therefore proposed an adaptive focusing neural module for the visual cortex, and achieved high performance in accurately segmenting objects.
|Research Field (Keyword & Summary)||
|Grant-in-Aid for Scientific Research Support: Japan Society for the Promotion of Science (JSPS)||https://nrid.nii.ac.jp/en/nrid/1000020212590/|
|Affiliated academic society (Membership type)||(1) IEEE (member)
(2) IEICE (member)
(3) IPSJ (member)
(4) ASJ (member)
|Education Field (Undergraduate level)||Digital Signal Processing, Pattern Recognition, Speech Processing|
|Education Field (Graduate level)||Advanced Pattern Processing|