We introduce a new problem domain for activity recognition: the analysis of children’s social and communicative behaviors based on video and audio data. We specifically target interactions between children aged 1-2 years and an adult. Such interactions arise naturally in the diagnosis and treatment of developmental disorders such as autism. We introduce a new publicly-available dataset containing over 160 sessions of a 3-5 minute child-adult interaction. In each session, the adult examiner followed a semi-structured play interaction protocol which was designed to elicit a broad range of social behaviors. We identify the key technical challenges in analyzing these behaviors, and describe methods for decoding the interactions. We present experimental results that demonstrate the potential of the dataset to drive interesting research questions, and show preliminary results for multi-modal activity recognition.
Full database available from http://www.cbi.gatech.edu/mmdb/