ValueError: Cannot convert a partially known TensorShape to a Tensor: (?, 256)

2

I'm working on a sequence-to-sequence model using an LSTM and a VAE with an attention mechanism.

p = np.random.permutation(len(input_data))
input_data = input_data[p]
teacher_data = teacher_data[p]
target_data = target_data[p]
BUFFER_SIZE = len(input_data)
BATCH_SIZE = 64
embedding_dim = 100
units = 256
vocab_in_size = len(input_lang.word2idx)
vocab_out_size = len(target_lang.word2idx)

# Create the Encoder layers first.

encoder_inputs = Input(shape=(None,))
encoder_emb = Embedding(input_dim=vocab_in_size, output_dim=embedding_dim)
encoder_lstm = LSTM(units=units, return_sequences=True, return_state=True)
encoder_outputs, state_h, state_c = encoder_lstm(encoder_emb(encoder_inputs))
encoder_states = [state_h, state_c]
#################### Adding VAE #######################
latent_dim = 256
# output layer for mean and log variance
z_mu = Dense(latent_dim)(encoder_outputs)  # replace h
z_log_var = Dense(latent_dim)(encoder_outputs)
def sampling(args):
     batch_size=1
     z_mean, z_log_sigma = args
     epsilon = K.random_normal(shape=(batch_size, latent_dim),
                          mean=0., stddev=1.)
     return z_mean + K.exp(z_log_sigma) * epsilon

z = Lambda(sampling, output_shape=(latent_dim,))([z_mu, z_log_var])
state_h = z
state_c = z
encoder_states = [state_h, state_c] 

def vae_loss(y_true, y_pred):
    # K.binary_crossentropy takes (target, output), so y_true goes first
    recon = K.sum(K.binary_crossentropy(y_true, y_pred), axis=-1)
    kl = 0.5 * K.sum(K.exp(z_log_var) + K.square(z_mu) - 1. - z_log_var,
                     axis=-1)
    return recon + kl[:, None]

 ##########################

# Now create the Decoder layers.
decoder_inputs = Input(shape=(None,))
decoder_emb = Embedding(input_dim=vocab_out_size, output_dim=embedding_dim)
decoder_lstm = LSTM(units=units, return_sequences=True, return_state=True)
decoder_lstm_out, _, _ = decoder_lstm(decoder_emb(decoder_inputs), 
    initial_state=encoder_states)

# Attention layer
attn_layer = AttentionLayer(name='attention_layer')
attn_out, attn_states = attn_layer([encoder_outputs, decoder_lstm_out])

When I execute this code I get this error at the last line:

ValueError: Cannot convert a partially known TensorShape to a Tensor: (?, 256)

I have tried all the solutions suggested for this error, but none of them solved my problem. If anyone can help, I would be very thankful.

Kahina

Posted 2019-04-28T23:24:10.627

Reputation: 524

Answers

1

The problem is inside the sampling function. I had the same problem and found the answer in the tutorial here.

My original code was:

def sampling(args):
     z_mean, z_log_sigma = args
     epsilon = K.random_normal(shape=z_mean.shape)
     return z_mean + K.exp(z_log_sigma) * epsilon

With this sampling method, I got the same error as you.

The code that fixed the problem is:

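import tensorflow as tf
from tensorflow.keras import layers

class Sampling(layers.Layer):
  def call(self, inputs):
    z_mean, z_log_var = inputs
    # tf.shape returns the dynamic shape, so the batch size is resolved at run time
    batch = tf.shape(z_mean)[0]
    dim = tf.shape(z_mean)[1]
    epsilon = tf.keras.backend.random_normal(shape=(batch, dim))
    # exp(0.5 * log variance) is the standard deviation
    return z_mean + tf.exp(0.5 * z_log_var) * epsilon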

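It uses tf.shape to get the required shape parameters. This fixes the problem because z_mean.shape is the static shape, whose batch dimension is still unknown (the ? in the error message) when the graph is built, so it cannot be converted to a tensor. tf.shape(z_mean) instead returns a tensor that is evaluated at run time, when the actual batch size is known.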
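
To make the difference between the static and the dynamic shape concrete, here is a minimal sketch, assuming TensorFlow 2.x. The function name sample and the constant LATENT_DIM are made up for illustration and are not from the tutorial. It traces a function with an unknown batch dimension, as Keras does when it builds the model, and draws the noise using tf.shape:

```python
import tensorflow as tf

LATENT_DIM = 256  # hypothetical constant, matches latent_dim in the question


# The input signature leaves the batch dimension unknown (None),
# which is what Keras does while it builds the model graph.
@tf.function(input_signature=[
    tf.TensorSpec(shape=[None, LATENT_DIM], dtype=tf.float32),
    tf.TensorSpec(shape=[None, LATENT_DIM], dtype=tf.float32),
])
def sample(z_mean, z_log_var):
    # Static shape: known at trace time, so the batch dimension is None here.
    assert z_mean.shape[0] is None
    # Dynamic shape: an int32 tensor that is filled in at run time.
    dynamic_shape = tf.shape(z_mean)
    epsilon = tf.random.normal(shape=dynamic_shape)
    # exp(0.5 * log variance) is the standard deviation.
    return z_mean + tf.exp(0.5 * z_log_var) * epsilon


# The same traced function works for any batch size.
z_small = sample(tf.zeros((3, LATENT_DIM)), tf.zeros((3, LATENT_DIM)))
z_large = sample(tf.zeros((7, LATENT_DIM)), tf.zeros((7, LATENT_DIM)))
print(z_small.shape, z_large.shape)
```

Passing z_mean.shape to tf.random.normal inside the same function would ask TensorFlow to build a tensor out of (None, 256), which is the conversion the error message complains about.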

Simon Ren

Posted 2019-04-28T23:24:10.627

Reputation: 11