adding a new lambda layer for keras models in multi-gpu env. - Computer

git: https://github.com/sephiroce/kmlm, commit id: 1578f99

To feed variable-length sequences into CuDNNLSTM layers, I needed to build the loss inside a lambda function.
The return value of the lambda function was a log probability, which is a scalar.
In the multi-gpu env I faced "Can't concatenate scalars (use tf.stack instead)".

The solution was to expand the scalar with K.expand_dims so the loss has shape (1,), and to use y_pred[0] instead of y_pred when compiling (a full sketch follows below).

In the lambda function (full_logprob and seq_mask come from the surrounding layer):
import tensorflow as tf
import keras.backend as K

loss = tf.reduce_sum(full_logprob * seq_mask)  # scalar loss for the batch
return K.expand_dims(loss, axis=0)             # shape (1,) instead of a scalar

When compiling the model:
model.compile(loss={Constants.KEY_CCE: lambda y_true, y_pred: y_pred[0]},
              optimizer=optimizer)

the problem seems to be solved.
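
Putting it together, here is a minimal sketch of the whole pattern; the toy inputs and the helper name seq_loss are mine, not the actual kmlm code:

import tensorflow as tf
import keras.backend as K
from keras.layers import Input, Lambda
from keras.models import Model

def seq_loss(args):
  full_logprob, seq_mask = args
  loss = tf.reduce_sum(full_logprob * seq_mask)  # scalar loss for one replica's batch
  return K.expand_dims(loss, axis=0)             # shape (1,) so replica outputs can be concatenated

logprob_in = Input(shape=(None,), name="full_logprob")
mask_in = Input(shape=(None,), name="seq_mask")
loss_out = Lambda(seq_loss, name="loss")([logprob_in, mask_in])

model = Model(inputs=[logprob_in, mask_in], outputs=loss_out)
model.compile(loss=lambda y_true, y_pred: y_pred[0], optimizer="adam")

With keras.utils.multi_gpu_model, each replica's outputs are concatenated along axis 0; a scalar cannot be concatenated, but a (1,)-shaped tensor can, and y_pred[0] recovers the scalar loss.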

written time : 2019-09-23 23:44:09.0

Computing PPL w/ tensorflow - Computer

related URL: https://github.com/sephiroce/kmlm/blob/master/kmlm/base/utils.py

import numpy as np

# ppl = exp(-sum(logprob) / words)
# e.g. if batch size is 1:
# 4 words, logprob= -27.016243, ppl= 222.126892
# ppl = exp((-27.016243 * -1) / (4 + 1))  # +1 accounts for eos
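# quick check of the arithmetic: 27.016243 / 5 = 5.4032486,
# and exp(5.4032486) ~= 222.126892, matching the ppl above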

# example inputs
length = 5  # valid steps per sequence (incl. eos); the 7-step logit below has 2 padded steps

# logits (pre-softmax); after softmax the target word may get a near-zero probability
logit = [
    [0.0, -5.0, 10.0, 0.0],
    [15.0, 0.0, 20.0, 0.0],
    [0.0, 25.0, 1.0, 30.0],
    [100.0, -100.0, 35.0, 0.0],
    [0.00001, 0.0, 7.0, 0.0],
    [100.0, 200.0, 300.0, 400.0],
    [100.0, 200.0, 300.0, 400.0]
]

label = [1.0, 0.0, 3.0, 0.0, 0.0, 0.0, 0.0]  # target word ids (cast to int32 below)

logits = [] #KEY_INPUT_SEQ: INPUT_SEQUENCE
labels = [] #KEY_TARGET_SEQ: TARGET_SEQUENCE
seq_len = [] #KEY_SEQ_LEN: SEQUENCE_LENGTH

batch_size = 5

for _ in range(batch_size):
  logits.append(logit)   # [batch_size, 7, 4]
  labels.append(label)   # [batch_size, 7]
  seq_len.append(length) # [batch_size]
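
# sanity check: the batched shapes after the loop
assert np.asarray(logits).shape == (batch_size, 7, 4)
assert np.asarray(labels).shape == (batch_size, 7)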

# calculating ppl with tensorflow

import tensorflow as tf
logits = tf.constant(logits)
seq_len = tf.constant(seq_len)

"""
logprobs = tf.nn.log_softmax(logits)
filtered_logprob = tf.multiply(logprobs,
                               tf.one_hot(labels,
                                          tf.shape(logits)[2]))
logprob = tf.reduce_sum(filtered_logprob, axis=2) * -1

"""

# per-token negative log probability (cross entropy) for the whole batch
full_logprob = tf.nn.sparse_softmax_cross_entropy_with_logits(
    labels=tf.cast(labels, tf.int32),
    logits=logits)

# generating a sequence mask to handle variable-length inputs
# (squeeze is not actually needed here, but my lm toolkit needs it;
#  I'll figure that out someday..)
seq_mask = tf.squeeze(tf.sequence_mask(seq_len,
                                       maxlen=tf.shape(full_logprob)[1],
                                       dtype=tf.float32))
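# e.g. tf.sequence_mask([5], maxlen=7, dtype=tf.float32) -> [[1., 1., 1., 1., 1., 0., 0.]]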

logprob = tf.reduce_sum(full_logprob * seq_mask)

# calculates ppl
ppl = tf.math.exp(logprob / tf.cast(tf.reduce_sum(seq_len), tf.float32))

expanded_words = tf.reduce_sum(seq_len)

with tf.compat.v1.Session() as sess:
  w, lp, p = sess.run([expanded_words, logprob, ppl])
  # w - batch_size drops one eos per sequence; lp is the summed cross entropy, so -lp is the logprob
  print("%d words, logprob= %.6f, ppl= %.6f" % (w - batch_size, -lp, p))

written time : 2019-09-02 21:11:18.0

Something to immerse myself in - Daily life

When there is a real sense of necessity I can be explosive, but these days, realistically, that kind of thing doesn't happen easily.
Maybe I'll catch my breath for a moment before moving on.

written time : 2019-08-06 22:22:19.0