links

git
https://github.com/sephiroce/

TIMIT phone set (61 to 39) - Computer

ref: http://cdn.intechopen.com/pdfs/15948/InTech-Phoneme_recognition_on_the_timit_database.pdf
The TIMIT [1] data set originally contains 61 phones, but Graves's RNN-T paper [2] and much of the other literature use a 39-phoneme set.
Here is the mapping table from the 61 classes to the 39 classes, as proposed by Lee and Hon [3].
The table is below. Phones that are not listed map to themselves.

61-set phones -> 39-set phone
aa, ao -> aa
ah, ax, ax-h -> ah
er, axr -> er
hh, hv -> hh
ih, ix -> ih
l, el -> l
m, em -> m
n, en, nx -> n
ng, eng -> ng
sh, zh -> sh
uw, ux -> uw
pcl, tcl, kcl, bcl, dcl, gcl, h#, pau, epi -> sil
q -> (removed)
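
The folding in the table above can be sketched as a small lookup. This is only a minimal sketch: the names FOLD, REMOVED and fold_phones are made up for illustration, and the example label sequence is invented, not taken from a real TIMIT transcription.

```python
# Folding of the 61 TIMIT phone labels into 39 classes, transcribed from the
# table above. FOLD, REMOVED and fold_phones are hypothetical names.
FOLD = {
    "ao": "aa",
    "ax": "ah", "ax-h": "ah",
    "axr": "er",
    "hv": "hh",
    "ix": "ih",
    "el": "l",
    "em": "m",
    "en": "n", "nx": "n",
    "eng": "ng",
    "zh": "sh",
    "ux": "uw",
    # Closures and silences all collapse into a single "sil" class.
    "pcl": "sil", "tcl": "sil", "kcl": "sil",
    "bcl": "sil", "dcl": "sil", "gcl": "sil",
    "h#": "sil", "pau": "sil", "epi": "sil",
}

# The glottal stop has no target class and is dropped from the sequence.
REMOVED = {"q"}


def fold_phones(phones):
    """Map 61-set labels to 39-set labels, dropping removed phones."""
    return [FOLD.get(p, p) for p in phones if p not in REMOVED]


# Invented example sequence, for illustration only.
example = ["h#", "sh", "ix", "hv", "eh", "dcl", "jh", "q", "h#"]
print(fold_phones(example))
# ['sil', 'sh', 'ih', 'hh', 'eh', 'sil', 'jh', 'sil']
```

Labels that are not in FOLD pass through unchanged, so the same function works on sequences that are already partly folded.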


[1] TIMIT Acoustic-Phonetic Continuous Speech Corpus (LDC93S1). https://catalog.ldc.upenn.edu/LDC93S1
[2] Graves, A. Sequence Transduction with Recurrent Neural Networks. 2012.
[3] Lee, K. and Hon, H. Speaker-independent phone recognition using hidden Markov models. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1989.

written time : 2019-10-28 17:27:55.0

To-do list for RNN-T - Computer

1. Decoding module.
2. Input noise; after that, the RNN-T paper can be fully reproduced.
3. Peephole connections; after that, the CTC paper can be fully reproduced.

TIMIT data preparation

Prepare the TIMIT data set according to the RNN-T paper:
- train: 3512
- valid: 184 (drawn from the training set). Why?
- test: 192
But TIMIT consists of 4620 training and 1680 test utterances.
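
One possible explanation for the mismatch, sketched as arithmetic. It rests on assumptions that are not stated in these notes: that the two SA dialect sentences read by every speaker are dropped, and that the test figure refers to the 24-speaker core test set rather than the full test set. The speaker counts below are the standard TIMIT figures; the variable names are made up for illustration.

```python
# Assumptions (not from the notes above): 462 training speakers and 168 test
# speakers, 10 sentences each, of which 2 are the shared SA dialect sentences;
# the core test set has 24 speakers.
SENTENCES_PER_SPEAKER = 10
SA_PER_SPEAKER = 2
TRAIN_SPEAKERS = 462
TEST_SPEAKERS = 168
CORE_TEST_SPEAKERS = 24
VALID_UTTERANCES = 184  # held out from the training set


def utterances(speakers, drop_sa=False):
    """Number of utterances for a speaker count, optionally without SA sentences."""
    per_speaker = SENTENCES_PER_SPEAKER - (SA_PER_SPEAKER if drop_sa else 0)
    return speakers * per_speaker


full_train = utterances(TRAIN_SPEAKERS)                   # 4620
full_test = utterances(TEST_SPEAKERS)                     # 1680
train_no_sa = utterances(TRAIN_SPEAKERS, drop_sa=True)    # 3696
train = train_no_sa - VALID_UTTERANCES                    # 3512
core_test = utterances(CORE_TEST_SPEAKERS, drop_sa=True)  # 192

print(full_train, full_test, train_no_sa, train, core_test)
```

Under these assumptions the numbers line up: 3512 + 184 = 3696 training utterances without SA sentences, and 192 core test utterances.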

written time : 2019-10-06 02:14:34.0

LM for ASR - Computer

0. BOS vs. no BOS
1. Peephole: TF 2.0 may be a simple solution for this.
2. Generative accuracy, uni-LSTM vs. bi-LSTM: 10/1, using PTB?
- https://medium.com/@david.campion/text-generation-using-bidirectional-lstm-and-doc2vec-models-1-3-8979eb65cb3a
3. Review the NIPS paper: 10/1
- https://papers.nips.cc/paper/5651-bidirectional-recurrent-neural-networks-as-generative-models.pdf

written time : 2019-09-30 22:42:56.0