How to load numerous files from Google Drive into Colab



I am trying to load 30k images (600 MB) from Google Drive into Google Colaboratory to process them further with Keras/PyTorch.

To do so, I first mounted my Google Drive using:

from google.colab import drive
drive.mount('/content/gdrive')

Next, I unzipped the image archive using:

!unzip -uq "/content/gdrive/My Drive/" -d "/content/gdrive/My Drive/path/"
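The shell `unzip` step can also be done from Python with `zipfile.ZipFile.extractall`. A minimal sketch, demonstrated on a scratch archive built in a temporary directory (the archive name and destination here are placeholders; on Colab the source would be the zip on Drive and the destination a folder such as a local path under /content):

```python
import tempfile
import zipfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Build a small demo archive with three empty "images"
    archive = Path(tmp) / "demo.zip"
    with zipfile.ZipFile(archive, "w") as zf:
        for i in range(3):
            zf.writestr(f"img_{i}.png", b"")

    # Extract every member into a destination folder;
    # extractall creates the folder and member paths as needed
    dest = Path(tmp) / "out"
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)

    print(len(list(dest.iterdir())))  # 3
```

Extracting to local Colab storage rather than back onto Drive avoids a second round trip through the Drive filesystem.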

When I count how many files are in the target directory, I only find 13k images (whereas I should find 30k). According to the output of unzip, the files appear to have been unzipped correctly.
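A recursive count from Python is one way to verify the result. A sketch using `pathlib`, demonstrated on a scratch directory (on Colab the root would be the mounted Drive folder; the file names here are placeholders):

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Build a demo tree: three files at the root, one in a subfolder
    root = Path(tmp)
    (root / "sub").mkdir()
    for i in range(3):
        (root / f"img_{i}.png").write_bytes(b"")
    (root / "sub" / "img_3.png").write_bytes(b"")

    # rglob("*") descends into subdirectories; is_file() skips folders
    count = sum(1 for p in root.rglob("*") if p.is_file())
    print(count)  # 4
```

Counting recursively matters here: `unzip` recreates the archive's folder structure, so files in subfolders are missed by a flat listing.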

Also, I found that there are known issues with loading many files from a Google Drive directory.

Does anyone know where I am going wrong, or whether there is a workaround?


Posted 2019-11-27T12:54:54.477




One possible option would be to operate directly on the zip file using zipfile.ZipFile.

Counting the number of items in a zip file:

from zipfile import ZipFile

# Open the uploaded zip archive on Drive; ZipFile is itself
# a context manager, so no wrapper is needed
with ZipFile("/content/gdrive/My Drive/") as zip_file:
    count = len(zip_file.infolist())
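Individual members can also be read straight from the archive without extracting anything. A minimal sketch using an in-memory zip (the member name `images/a.png` is a placeholder; on Colab the same calls work on a `ZipFile` opened from the archive on Drive):

```python
from io import BytesIO
from zipfile import ZipFile

# Build a small in-memory archive with one member
buf = BytesIO()
with ZipFile(buf, "w") as zf:
    zf.writestr("images/a.png", b"fake-bytes")

# Reopen it for reading: list the members, then pull one
# member's raw bytes without extracting to disk
with ZipFile(buf) as zf:
    names = zf.namelist()           # every member path in the archive
    data = zf.read("images/a.png")  # raw bytes of a single member

print(names, len(data))  # ['images/a.png'] 10
```

Those raw bytes can then be decoded in memory (e.g. passed to an image library) and fed to Keras/PyTorch, sidestepping the 30k-file extraction entirely.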

Brian Spiering
