Easily Import Kaggle Datasets in Google Colab with Python

Photo by Filiberto Santillán on Unsplash

When I first tried to import Kaggle datasets into Google Colab, I scoured Medium, Kaggle, and other online forums for instructions, and each answer was slightly different from the next. I also found it tedious to remember every step needed to import a dataset successfully.

Therefore, in this post, I will share my solution to these problems: a function I wrote to handle all the dirty work for you. All you have to do is provide the Kaggle dataset URL and your API credentials. If you would like to understand the script, please read the entire post. Otherwise, the script in its entirety can be found at the end.

First and foremost, we will handle the imports. In order to install kaggle into Google Colab’s environment through a script, we will use the subprocess module.
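A minimal sketch of this step, assuming the package is installed with pip through the running interpreter. The helper names `pip_install_command` and `install_package` are illustrative and not part of any library.

```python
import subprocess
import sys


def pip_install_command(package):
    """Build the pip command that installs `package` for the current interpreter."""
    # Using sys.executable keeps the install in the same environment as the notebook.
    return [sys.executable, "-m", "pip", "install", "--quiet", package]


def install_package(package):
    """Run pip in a child process and raise CalledProcessError if it fails."""
    subprocess.run(pip_install_command(package), check=True)
```

In a Colab cell, calling `install_package("kaggle")` would then perform the install.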

The function will have two parameters. The first is a string containing the URL of the Kaggle dataset. The second is a dictionary containing your Kaggle username and API key.
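As a sketch, the signature and its two arguments might look like the following. The function name `import_kaggle_dataset`, the placeholder URL, and the placeholder credential values are assumptions for illustration. The dictionary keys `username` and `key` match the fields of the `kaggle.json` file that Kaggle issues.

```python
def import_kaggle_dataset(dataset_url: str, credentials: dict) -> None:
    """Download a Kaggle dataset into the current Colab session.

    dataset_url: address of the dataset page on Kaggle.
    credentials: dictionary with the keys "username" and "key".
    """
    ...  # The body is filled in by the steps that follow.


# Placeholder arguments; replace them with your own values.
example_url = "https://www.kaggle.com/datasets/<owner>/<dataset-name>"
example_credentials = {"username": "your_kaggle_username", "key": "your_kaggle_api_key"}
```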

We then perform a check to ensure that the URL and API credentials are valid.
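One way such a check could look is sketched below. It is an assumption about what "valid" means here: it only confirms that the URL points at kaggle.com and that the dictionary holds non-empty `username` and `key` strings. It does not confirm with Kaggle that the credentials are accepted. The name `inputs_look_valid` is hypothetical.

```python
from urllib.parse import urlparse


def inputs_look_valid(dataset_url, credentials):
    """Return True if the URL and credentials have the expected shape."""
    if not isinstance(dataset_url, str) or not isinstance(credentials, dict):
        return False
    parsed = urlparse(dataset_url)
    url_ok = parsed.scheme == "https" and parsed.netloc in ("kaggle.com", "www.kaggle.com")
    # Both entries must be present and must be non-empty strings.
    creds_ok = all(
        isinstance(credentials.get(name), str) and credentials.get(name) != ""
        for name in ("username", "key")
    )
    return url_ok and creds_ok
```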

If the condition is met, the function will then create a kaggle.json file and dump the API credentials into it.
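A sketch of this step, assuming the file goes in `~/.kaggle`, which is where the Kaggle command-line tool looks for it by default. The function name `write_kaggle_credentials` and its optional `config_dir` parameter are illustrative additions; the parameter is there so the function can be tried against a scratch directory.

```python
import json
import os
from pathlib import Path


def write_kaggle_credentials(credentials, config_dir=None):
    """Write kaggle.json with the username and key, and return its path."""
    config_dir = Path(config_dir) if config_dir else Path.home() / ".kaggle"
    config_dir.mkdir(parents=True, exist_ok=True)
    config_path = config_dir / "kaggle.json"
    with open(config_path, "w") as f:
        json.dump({"username": credentials["username"], "key": credentials["key"]}, f)
    # Restrict the file to the current user so the API key is not readable by others.
    os.chmod(config_path, 0o600)
    return config_path
```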

Finally, the function will download and unzip the contents of the Kaggle dataset.
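A sketch of the download step, assuming it shells out to the Kaggle command-line tool with `kaggle datasets download -d <owner>/<dataset-name> -p <directory> --unzip`. The helper names are hypothetical. The URL parsing assumes a dataset page address of the form `https://www.kaggle.com/datasets/<owner>/<dataset-name>` (or the same without the `datasets` segment); competition URLs are not handled.

```python
import subprocess
from urllib.parse import urlparse


def dataset_slug(dataset_url):
    """Extract 'owner/dataset-name' from a Kaggle dataset URL."""
    parts = [p for p in urlparse(dataset_url).path.split("/") if p]
    if parts and parts[0] == "datasets":
        parts = parts[1:]
    if len(parts) < 2:
        raise ValueError(f"Cannot find owner and dataset name in {dataset_url!r}")
    return "/".join(parts[:2])


def download_command(dataset_url, target_dir="."):
    """Build the CLI command that downloads and unzips the dataset."""
    return ["kaggle", "datasets", "download", "-d", dataset_slug(dataset_url),
            "-p", str(target_dir), "--unzip"]


def download_dataset(dataset_url, target_dir="."):
    """Run the download; requires the kaggle package and kaggle.json to be in place."""
    subprocess.run(download_command(dataset_url, target_dir), check=True)
```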

Put together, the entire script is as follows:

I hope this post has helped you better understand how to import Kaggle datasets into Google Colab. If you encounter any issues with this code or the wording in the post, or have any other inquiries, please don’t hesitate to let me know. Thank you!

I leverage data to tell untold stories and model the future.