# Welcome to the "Taiwanese Speech in the Wild (TSW)" Project

##### Yuan-Fu Liao, Taipei University of Technology, yfliao@mail.ntut.edu.tw

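#### This is a brief overview of the current status of the "Taiwanese-Speech-in-the-Wild" project corpora. If you have general questions about the corpora, feel free to ask them here (issues)!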

#### The first wave of TSW corpora consists of five subsets (beta versions, except MATBN) and will be released on April 9, 2018!

|Corpus|Abbreviation|Source|Hours|Remark|
|:---|:---|:---:|---:|:--|
|Mandarin Chinese Broadcast News corpus |MATBN|PTS|198.0|story and speaker boundaries|
|NER Phonetic Annotation corpus Vol. 1|NER-PhA-Vol1 |NER|6.5 | phone, syllable, speaker and code-switching|
|NER Manual Transcription corpus Vol. 1|NER-Trs-Vol1 |NER| 107.4 | manual, word sequences|
|NER Automatic Transcription corpus Vol. 1|NER-Auto-Vol1 |NER| 309.6 | auto, word sequences|
|PTS Manual Subtitling corpus Vol. 1 |PTS-MSub-Vol1 |PTS| 264.0 | manual subtitling with time code|
|Total|||879.0| excludes NER-PhA-Vol1|
  
* PTS: Taiwan Public Television Service
* NER: National Education Radio
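
The total in the table can be checked with a few lines of Python. This is only a sketch: the hours are copied from the table above, and the names `corpus_hours`, `excluded`, and `total_hours` are illustrative, not part of any TSW tooling.

```python
# Hours per corpus, copied from the table above.
corpus_hours = {
    "MATBN": 198.0,
    "NER-PhA-Vol1": 6.5,
    "NER-Trs-Vol1": 107.4,
    "NER-Auto-Vol1": 309.6,
    "PTS-MSub-Vol1": 264.0,
}

# The "Total" row leaves out NER-PhA-Vol1, per its remark.
excluded = {"NER-PhA-Vol1"}

total_hours = sum(
    hours for name, hours in corpus_hours.items() if name not in excluded
)

# Format to one decimal place to avoid floating-point noise.
print(f"{total_hours:.1f}")  # 879.0
```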