I am trying to train a CNN-LSTM model. My images are 640x640, and I have a GTX 1080 Ti with 11 GB of memory. I am using Keras with the TensorFlow backend.
Here is the model:

```python
from keras.layers import (Input, TimeDistributed, Conv2D, MaxPooling2D,
                          Flatten, Dense, Dropout, LSTM)
from keras.models import Model
from keras import optimizers

img_input_1 = Input(shape=(1, n_width, n_height, n_channels))
conv_1 = TimeDistributed(Conv2D(96, (11, 11), activation='relu', padding='same'))(img_input_1)
pool_1 = TimeDistributed(MaxPooling2D((3, 3)))(conv_1)
conv_2 = TimeDistributed(Conv2D(128, (11, 11), activation='relu', padding='same'))(pool_1)
flat_1 = TimeDistributed(Flatten())(conv_2)
dense_1 = TimeDistributed(Dense(4096, activation='relu'))(flat_1)
drop_1 = TimeDistributed(Dropout(0.5))(dense_1)
lstm_1 = LSTM(17, activation='linear')(drop_1)
dense_2 = Dense(4096, activation='relu')(lstm_1)
dense_output_2 = Dense(1, activation='sigmoid')(dense_2)

model = Model(inputs=img_input_1, outputs=dense_output_2)
op = optimizers.Adam(lr=0.00001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.001)
model.compile(loss='mean_absolute_error', optimizer=op, metrics=['accuracy'])
model.fit(X, Y, epochs=3, batch_size=1)
```
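For reference, I did a rough count of where the memory goes. The `Flatten` → `Dense(4096)` step seems to dominate (this is only a back-of-the-envelope sketch, assuming `'same'` padding and the default pool stride of 3):

```python
# Rough weight count for the Flatten -> Dense(4096) step of the model above.
# Assumes 'same' padding on the convs and MaxPooling2D((3, 3)) with its
# default stride of 3, so the spatial size shrinks only once, by a factor of 3.
def flatten_dense_weights(height, width, filters=128, dense_units=4096, pool=3):
    pooled_h, pooled_w = height // pool, width // pool   # after the single pool
    flat = pooled_h * pooled_w * filters                 # Flatten() output size
    return flat * dense_units                            # Dense weight matrix

# At 640x640: 213 * 213 * 128 * 4096 ~= 23.8 billion weights,
# ~95 GB in float32 -- far beyond an 11 GB card.
big = flatten_dense_weights(640, 640)

# At 60x60: 20 * 20 * 128 * 4096 ~= 210 million weights (~0.8 GB),
# which is why the model only fits at that resolution.
small = flatten_dense_weights(60, 60)
```

If this arithmetic is right, the dense weight matrix (plus Adam's two moment buffers per weight) is what blows up with resolution, not the images themselves.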
Right now, with this model, I can only train when the images are resized to 60x60; any larger and I run out of GPU memory.
I want to use the largest possible size, as I want to retain as much discriminatory information as possible. (The labels will be mouse screen coordinates between 0 and 640.)
Among many others, I found this question: How to handle images of large sizes in CNN?
However, I am not sure how to "restrict your CNN" or "stream your data in each epoch", or whether either would actually help here.
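My understanding of "stream your data in each epoch" is something like the generator below, which loads only one batch of images at a time instead of the whole `X` array (a minimal sketch: `load_image`, `train_paths`, and `train_labels` are placeholders for my actual data pipeline, and I realize this mainly saves host RAM rather than GPU memory when `batch_size=1`):

```python
import numpy as np

# Placeholder for real image decoding (e.g. PIL or cv2); it just returns a
# blank array here so the sketch is self-contained.
def load_image(path, size=(640, 640), channels=3):
    return np.zeros(size + (channels,), dtype=np.float32)

def batch_generator(paths, labels, batch_size=1):
    """Yield (X, y) batches forever, loading images lazily per batch."""
    n = len(paths)
    while True:
        for start in range(0, n, batch_size):
            chunk = paths[start:start + batch_size]
            X = np.stack([load_image(p) for p in chunk])
            X = X[:, np.newaxis, ...]  # add the time axis: (batch, 1, H, W, C)
            y = np.asarray(labels[start:start + batch_size])
            yield X, y

# Usage with Keras (fit_generator in older Keras versions):
# model.fit_generator(batch_generator(train_paths, train_labels, batch_size=1),
#                     steps_per_epoch=len(train_paths), epochs=3)
```

Is this the intended approach, and would it make larger resolutions feasible?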
How can I reduce the amount of memory used so I can increase the image sizes?
Is it possible to sacrifice training time/computation speed in favor of higher resolution data whilst retaining model effectiveness?
Note: the above model is not final, just a basic outlay.