CIAIR built a corpus of simultaneous interpretation between Japanese and English over five years (1999 to 2003). A total of 182 hours of speech data were recorded, and CIAIR has already completed the transcription, the visualization of the speech data, and the spoken language analysis portions of the corpus. The transcribed speech data in the CIAIR simultaneous interpretation corpus amounts to about one million words (morphemes). The corpus is interactive and bilingual (Japanese and English), containing spoken language data from lectures on everyday topics and from conversations in travel-related settings. See the database heading for further information.




