Adding a new Lambda layer for Keras models in a multi-GPU env.
2019-09-23 23:44:09

git: https://github.com/sephiroce/kmlm, commit id: 1578f99

To feed variable-length sequences into CuDNNLSTM layers, I needed to build a lambda function.

The return value of the lambda function was a log probability, which is a scalar.

I faced "Can't concatenate scalars (use tf.stack instead)".

The solution was to expand the scalar value with K.expand_dims inside the lambda function, and to use y_pred[0] instead of y_pred in the loss.

In the lambda function:

import tensorflow as tf
import keras.backend as K

# sum the masked per-token log probabilities into one scalar loss
loss = tf.reduce_sum(full_logprob * seq_mask)
# expand the scalar to shape (1,) so the multi-GPU wrapper can concatenate it
return K.expand_dims(loss, axis=0)

When compiling the model:

model.compile(loss={Constants.KEY_CCE: lambda y_true, y_pred: y_pred[0]},
              optimizer=optimizer)

With these two changes, the problem seems to be solved.
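Putting the pieces together, here is a minimal, self-contained sketch of the pattern. It is an illustration, not the actual kmlm code: the input tensors full_logprob and seq_mask, the output name "cce", the optimizer, and the 2-GPU setting are placeholders I made up, and in the real model those tensors are intermediate values computed from the word sequence rather than Inputs.

import tensorflow as tf
import keras.backend as K
from keras.layers import Input, Lambda
from keras.models import Model
from keras.utils import multi_gpu_model

def batch_loss(args):
    # sum the masked per-token log probabilities and return a (1,) tensor,
    # not a bare scalar, so the per-GPU outputs can be concatenated
    full_logprob, seq_mask = args
    loss = tf.reduce_sum(full_logprob * seq_mask)
    return K.expand_dims(loss, axis=0)

# placeholder inputs standing in for the model's real intermediate tensors
full_logprob = Input(shape=(None,), name="full_logprob")
seq_mask = Input(shape=(None,), name="seq_mask")
loss_out = Lambda(batch_loss, output_shape=(1,), name="cce")([full_logprob, seq_mask])

model = Model(inputs=[full_logprob, seq_mask], outputs=loss_out)
# multi_gpu_model concatenates the replicas' outputs along axis 0,
# which is exactly where the "Can't concatenate scalars" error came from
parallel_model = multi_gpu_model(model, gpus=2)  # needs >= 2 visible GPUs
parallel_model.compile(loss=lambda y_true, y_pred: y_pred[0],
                       optimizer="adam")

When fitting, a dummy target array is passed for the loss output, since the loss function only reads y_pred; with named outputs the loss can also be given as a dict keyed by the Lambda layer's name, as in the compile call above.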

Computing PPL w/ tensorflow
2019-09-02 21:11:18

related URL: https://github.com/sephiroce/kmlm/blob/master/kmlm/base/utils.py

import numpy as np
import tensorflow as tf

# ppl = exp((sum(logprob) * -1) / words)
# if the batch size is 1:
#   4 words, logprob= -27.016243, ppl= 222.126892
#   ppl = exp((-27.016243 * -1) / (4 + 1))  ==> the +1 accounts for eos

# example inputs
length = 5
# logits; after the softmax the target word may get (almost) zero probability
logit = [
    [0.0, -5, 10, 0.0],
    [15, 0, 20, 0.0],
    [0, 25, 1, 30],
    [100, -100.0, 35, 0],
    [0.00001, 0, 7, 0],
    [100, 200, 300, 400],
    [100, 200, 300, 400]
]
label = [1.0, 0.0, 3.0, 0.0, 0.0, 0.0, 0.0]

logits = []   # KEY_INPUT_SEQ: INPUT_SEQUENCE
labels = []   # KEY_TARGET_SEQ: TARGET_SEQUENCE
seq_len = []  # KEY_SEQ_LEN: SEQUENCE_LENGTH
batch_size = 5
for _ in range(batch_size):
  logits.append(logit)   # [batch_size, 7, 4], the last two steps are padding
  labels.append(label)   # [batch_size, 7]
  seq_len.append(length) # [batch_size]

# calculating the ppl with tensorflow
logits = tf.constant(logits)
seq_len = tf.constant(seq_len)

"""
# an equivalent dense formulation, kept for reference
logprobs = tf.nn.log_softmax(logits)
filtered_logprob = tf.multiply(logprobs,
                               tf.one_hot(labels, tf.shape(logits)[2]))
logprob = tf.reduce_sum(filtered_logprob, axis=2) * -1
"""

# per-token negative log probabilities for the whole batch
full_logprob = tf.nn.sparse_softmax_cross_entropy_with_logits(
    labels=tf.cast(labels, tf.int32), logits=logits)

# generate a sequence mask to handle the variable-length inputs
# (squeeze is not actually needed here, but my LM toolkit needs it;
#  I'll figure that out someday later..)
seq_mask = tf.squeeze(tf.sequence_mask(seq_len,
                                       maxlen=tf.shape(full_logprob)[1],
                                       dtype=tf.float32))

# accumulated negative log probability over the unmasked tokens
logprob = tf.reduce_sum(full_logprob * seq_mask)

# calculate the ppl; each seq_len counts 4 words + eos
ppl = tf.math.exp(logprob / tf.cast(tf.reduce_sum(seq_len), tf.float32))
expanded_words = tf.reduce_sum(seq_len)

with tf.compat.v1.Session() as sess:
  w, lp, p = sess.run([expanded_words, logprob, ppl])
  print("%d words, logprob= %.6f, ppl= %.6f" % (w - batch_size, -lp, p))

Something to immerse myself in
2019-08-06 22:22:19

When there's a sense of necessity behind it, I can be explosive, but these days, realistically, that kind of thing doesn't happen easily.

Maybe I'll catch my breath for a moment before moving on.

It isn't over,
2019-07-07 16:51:45

it has just become a bit more implicit and more varied.
