How to download dynamic files created during work on Google Colab?


I have two different notebooks. In the first, I save data to a file with:

np.save(open(Q1_TRAINING_DATA_FILE, 'wb'), q1_data)

In the second notebook, I'm trying to load it the same way using:

q1_data = np.load(open(Q1_TRAINING_DATA_FILE, 'rb'))

I then get the error:

FileNotFoundError: [Errno 2] No such file or directory: 'q1_train.npy'

I searched my Google Drive but couldn't find this file.

Platform: https://research.google.com

Edit: I'm trying to run the Kaggle problem below on the Colab platform. The author has two Jupyter notebooks: one to prepare the data and a second to train. The first notebook creates some files that the second notebook later consumes, and that step is where I'm stuck.

https://github.com/bradleypallen/keras-quora-question-pairs/blob/master/quora-question-pairs-training.ipynb

vikbehal

Posted 2018-02-18T19:31:35.513

Reputation: 175

What is your data format? – Media – 2018-02-18T19:32:15.977

I just edited the question to add more info. Concretely, the first notebook generates files that the second notebook will use. Colab doesn't give any error, but I'm unable to find where it saves them. – vikbehal – 2018-02-18T19:36:42.127

Type !ls in a notebook cell to see the current files. What do you see? – Media – 2018-02-18T19:56:50.853

You can set up automatic real-time sync from Colab to Google Drive using Clouderizer. No Python code is needed to upload files manually to Google Drive. Watch this: https://www.youtube.com/watch?v=9ntDy0H6D_I – Prakash Gupta – 2018-05-07T10:49:04.350

Answers


Based on what I've seen and experienced, the best way is to store your data in your Google Drive account and retrieve it from there. Your question is a bit unclear, but first try the following command to see the files currently in your working directory. Note that these files are temporary: I guess they are all deleted automatically about every 12 hours.

!ls
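
If you prefer to stay in Python, the same check can be sketched with the standard library. This is only an illustration: list_files is a hypothetical helper name, not something Colab provides, and it assumes the generated files sit in the notebook's current working directory.

```python
import os


def list_files(directory="."):
    """Return (name, size_in_bytes) pairs for regular files in `directory`, sorted by name."""
    entries = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # Skip sub-directories; only regular files are reported.
        if os.path.isfile(path):
            entries.append((name, os.path.getsize(path)))
    return entries


# Print every file in the current working directory with its size.
for name, size in list_files():
    print("{}\t{} bytes".format(name, size))
```

If a file such as q1_train.npy does not show up here, the second notebook is probably running on a different VM or in a different working directory than the one that created it.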

In any case, I recommend the following steps.

Use the following code to authorize access to your Drive account:

!pip install -U -q PyDrive

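# Optional and unrelated to Drive access: let TensorFlow grow GPU memory on demand.
# (On TensorFlow 2.x this class is available as tf.compat.v1.ConfigProto.)
import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.allow_growth = True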

from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials

# Authenticate and create the PyDrive client.
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

Use the following code to list the titles and ids of the files in the root of your Drive:

file_list = drive.ListFile({'q': "'root' in parents and trashed=false"}).GetList()
for file1 in file_list:
  print('title: %s, id: %s' % (file1['title'], file1['id']))
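
When the root folder holds many files, it can be easier to ask Drive for one title directly. The sketch below assumes the drive object created above and the Drive v2 query syntax that PyDrive passes through (the title field and backslash-escaped single quotes); build_title_query and find_file_ids are hypothetical helper names.

```python
def build_title_query(title, parent_id="root"):
    """Build a Drive query string matching non-trashed files named `title` in a folder."""
    # Escape backslashes first, then single quotes, so the title cannot break the query.
    safe_title = title.replace("\\", "\\\\").replace("'", "\\'")
    return "'{}' in parents and trashed=false and title='{}'".format(parent_id, safe_title)


def find_file_ids(drive, title, parent_id="root"):
    """Return the ids of all files called `title`; `drive` is a PyDrive GoogleDrive client."""
    query = build_title_query(title, parent_id)
    return [f["id"] for f in drive.ListFile({"q": query}).GetList()]


# Example (needs an authenticated client, so it is left commented out):
# print(find_file_ids(drive, "q1_train.npy"))
```

Drive allows several files with the same title, which is why the helper returns a list rather than a single id.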

Put the id of the desired file, e.g. a typical text file, into the following dictionary under the 'id' key:

downloaded = drive.CreateFile({'id': 'the id of typical text file'})
# GetContentString() returns the file's content decoded as text.
file = downloaded.GetContentString()
print('Downloaded {} characters'.format(len(file)))

At this point the file's content is only held in memory, so write it to the Colab disk using the following code:

# The with-block closes the file even if the write fails.
with open("your desired name.txt", "w") as text_file:
    text_file.write(file)
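
The text route above only suits text files. Binary files such as .npy or .h5 cannot be decoded as UTF-8, which is one plausible cause of a "UnicodeDecodeError: 'utf-8' codec can't decode byte 0x93 in position 0". The sketch below reproduces that with the standard library only; the demo file is a stand-in that merely starts with the .npy magic bytes, and looks_like_npy and is_utf8_text are hypothetical helper names.

```python
import os
import tempfile

NPY_MAGIC = b"\x93NUMPY"  # the first six bytes of every .npy file


def looks_like_npy(path):
    """Check whether the file starts with the .npy magic bytes."""
    with open(path, "rb") as fh:
        return fh.read(len(NPY_MAGIC)) == NPY_MAGIC


def is_utf8_text(path):
    """Return True if the whole file can be decoded as UTF-8 text."""
    try:
        with open(path, "r", encoding="utf-8") as fh:
            fh.read()
        return True
    except UnicodeDecodeError:
        return False


# Demo: a stand-in file that begins with the .npy magic (not a complete array file).
demo_path = os.path.join(tempfile.mkdtemp(), "demo.npy")
with open(demo_path, "wb") as fh:
    fh.write(NPY_MAGIC + b"\x01\x00")

print(looks_like_npy(demo_path), is_utf8_text(demo_path))  # prints: True False
```

For such files I would use downloaded.GetContentFile('q1_train.npy') instead of GetContentString(), since it writes the content straight to the Colab disk without decoding it as text.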

Create & upload a file.

uploaded = drive.CreateFile({'title': 'filename.csv'})
uploaded.SetContentFile('filename.csv')
uploaded.Upload()
print('Uploaded file with ID {}'.format(uploaded.get('id')))
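
A preparation notebook usually produces several files, so the same three calls can be wrapped in a loop. This is a minimal sketch assuming the drive client created earlier; upload_matching is a hypothetical helper name and the "*.npy" pattern is only an example.

```python
import glob
import os


def upload_matching(drive, pattern):
    """Upload every local file matching `pattern` to Drive and return the new file ids."""
    uploaded_ids = []
    for path in sorted(glob.glob(pattern)):
        # Use the bare file name as the Drive title, as in the single-file example.
        remote = drive.CreateFile({"title": os.path.basename(path)})
        remote.SetContentFile(path)
        remote.Upload()
        uploaded_ids.append(remote.get("id"))
    return uploaded_ids


# Example (needs an authenticated client, so it is left commented out):
# print(upload_matching(drive, "*.npy"))
```

The second notebook can then fetch the files by id or title instead of expecting them on its own disk.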

Downloading from Colab to your computer without uploading to Drive:

from google.colab import files
files.download('your typical h5 file or what ever.h5')
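
If the same notebook also has to run outside Colab, the download call can be guarded. The sketch below is an assumption-laden convenience, not part of the Colab API: download_if_in_colab is a hypothetical helper that checks the file exists and only calls files.download when google.colab can be imported.

```python
import os


def download_if_in_colab(path):
    """Trigger a browser download of `path` when running in Colab.

    Returns True if the download was requested, False outside Colab.
    """
    if not os.path.exists(path):
        # Fail early with a clear message instead of inside the download call.
        raise FileNotFoundError("No such file on the Colab disk: {}".format(path))
    try:
        from google.colab import files  # only importable inside Colab
    except ImportError:
        return False
    files.download(path)
    return True


# Example (the file name is a placeholder):
# download_if_in_colab("q1_train.npy")
```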

For more information about transferring other data formats, see the example notebook provided with Colab.

Media

Posted 2018-02-18T19:31:35.513

Reputation: 12 077

Apologies for the confusion, but the content that's created dynamically, like weights, is not stored directly in my Google Drive account. I guess it lives in the VM to which my account is linked. – vikbehal – 2018-02-18T22:53:49.917

!ls lists files such as the .h5 files that I intend to download. – vikbehal – 2018-02-18T22:54:23.597

I also modified the question title, in case that's helpful. – vikbehal – 2018-02-18T22:55:56.133

I got it working, so I updated your code with those changes. Thanks again! – vikbehal – 2018-02-19T00:17:14.283

@vikbehal I'm happy you solved it. – Media – 2018-02-19T05:31:22.963

I'm getting an error while loading the saved .npy files. Any guidance? "UnicodeDecodeError: 'utf-8' codec can't decode byte 0x93 in position 0: invalid start byte" – vikbehal – 2018-02-21T07:00:04.373

@vikbehal Have you used from google.colab import files; files.download('your typical h5 file or what ever.npy')? – Media – 2018-02-21T15:55:02.573

I think I'm jumping onto Stack Overflow before trying things myself. Again, thank you very much! I'll try this and make sure to do my homework before troubling you further. – vikbehal – 2018-02-21T18:24:46.277