These phone conversations were recorded using the phone conversation module of Summa Linguae’s Robson app. The data set contains 500 hours of time-stamped, transcribed, unscripted speech (i.e. natural conversation) between two speakers. Each conversation is at least 15 minutes long, and calls with more than two speakers have been removed from the corpus. The transcription is divided into time-stamped segments of at most 15 seconds each. Each segment indicates the speaker, the start and end of the segment, and additional information on the segment.
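The segment constraints described above can be checked mechanically once transcripts are loaded. The following is a minimal sketch only: the listing does not specify the transcript file format, so the `Segment` class, its field names, and the `check_segments` helper are hypothetical and would need to be mapped onto whatever fields the delivered transcripts actually use.

```python
from dataclasses import dataclass

MAX_SEGMENT_SECONDS = 15.0  # upper bound on segment length stated in the description


@dataclass
class Segment:
    # All field names are assumptions; the real transcript schema is not documented here.
    speaker: str      # hypothetical speaker label, e.g. "A" or "B"
    start: float      # segment start, in seconds from the beginning of the call
    end: float        # segment end, in seconds from the beginning of the call
    info: str = ""    # placeholder for the "additional information" on the segment


def check_segments(segments):
    """Return a list of human-readable problems found in one conversation."""
    problems = []
    speakers = {s.speaker for s in segments}
    if len(speakers) > 2:
        problems.append(f"more than two speakers: {sorted(speakers)}")
    for i, s in enumerate(segments):
        if s.end <= s.start:
            problems.append(f"segment {i}: end is not after start")
        elif s.end - s.start > MAX_SEGMENT_SECONDS:
            problems.append(f"segment {i}: longer than {MAX_SEGMENT_SECONDS} s")
    return problems
```

An empty list from `check_segments` means the conversation is consistent with the two-speaker and 15-second limits; it says nothing about transcription quality.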
Phone Conversations in Japanese
Dataset Details
Domain: General
Total Hours: 500
Number of Files: 787
Total File Size: 54 GB
Audio Sampling Rate: 8 kHz
Bit Rate: 128 kb/s (constant)
Bit Depth: 16 bits
Format: WAV
Encoding: pcm_s16le
Channels: 2 (1 for each speaker)
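Because each speaker is recorded on a separate channel, a file can be verified against the listed format and split into per-speaker streams using only Python's standard `wave` module. This is a rough sketch, not a vendor-supplied tool: the function names are illustrative, and the listing does not say which channel corresponds to which speaker. Note that uncompressed PCM at 8 kHz, 16 bits, and 2 channels works out to 256 kb/s, so the listed 128 kb/s presumably refers to a single channel; that reading is an assumption.

```python
import wave

# Values taken from the dataset details: 2 channels, 16-bit samples, 8 kHz.
EXPECTED = {"channels": 2, "sample_width_bytes": 2, "sample_rate_hz": 8000}


def read_format(path):
    """Read the basic format parameters from a WAV file header."""
    with wave.open(path, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": wav.getframerate(),
            "duration_s": wav.getnframes() / wav.getframerate(),
        }


def pcm_bit_rate(fmt):
    """Bit rate of uncompressed PCM: sample rate x bit depth x channel count."""
    return fmt["sample_rate_hz"] * fmt["sample_width_bytes"] * 8 * fmt["channels"]


def split_speakers(path):
    """Return (channel_0, channel_1) as raw 16-bit little-endian PCM bytes."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 2 or wav.getsampwidth() != 2:
            raise ValueError("expected 2-channel, 16-bit PCM")
        frames = wav.readframes(wav.getnframes())
    # Each 4-byte frame holds one 2-byte sample for channel 0, then one for channel 1.
    first = b"".join(frames[i:i + 2] for i in range(0, len(frames), 4))
    second = b"".join(frames[i + 2:i + 4] for i in range(0, len(frames), 4))
    return first, second
```

A file matches the listing when the first three keys returned by `read_format` equal the corresponding entries in `EXPECTED`. The byte-slicing split is simple rather than fast; for files of 15 minutes or more, an array library would be a reasonable substitute.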
Dataset Demographics
Country: Japan
Language: Japanese