Kaggle supports code in Python and R, and provides the familiar Python notebook. Julia is not supported. Even if you run Julia from Python, for example with pip install julia, JuliaCall, or some other method (I tried the following blog), it is still difficult to submit a notebook to a competition that requires internet to be off.
The problem I faced with the approach in that blog post is that it needs internet access to download its dependencies, and then again to download the packages required by your own code. I therefore present a simple approach for submitting notebooks to Kaggle competitions, especially those that require internet off.
Set up an Ubuntu VM and have your Julia project running on it
- Download virtual machine software suitable for your PC.
- Download and install Ubuntu (preferably x86_64, but any other will do). Henceforth we refer to this Ubuntu installation as the guest. At the time of writing, julia-1.11.3 was the latest Julia version, so it is used throughout this tutorial. Replace it with your own version as needed.
- Your PC is the host. You can enable a shared folder between host and guest for easy file transfers.
- Install the Julia language on the guest.
- Copy your Julia project code and its data files to the guest (for example, into a folder called myproject).
- Now instantiate and update your Julia project on the guest. This downloads and installs all the project dependencies (the packages required by your code) into the ~/.julia folder.
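The instantiate-and-update step can be scripted. The following is a minimal sketch, assuming julia is on the guest's PATH and the guest has internet access. The helper names instantiate_command and instantiate_project are hypothetical; Pkg.instantiate() and Pkg.update() are the standard Julia package manager calls.

```python
import subprocess


def instantiate_command(project_dir: str, julia: str = "julia") -> list[str]:
    """Build the command that instantiates and updates a Julia project."""
    return [
        julia,
        f"--project={project_dir}",
        "-e",
        "using Pkg; Pkg.instantiate(); Pkg.update()",
    ]


def instantiate_project(project_dir: str, julia: str = "julia") -> None:
    """Run the command on the guest; raises CalledProcessError on failure."""
    subprocess.run(instantiate_command(project_dir, julia), check=True)


# Example call on the guest (hypothetical project folder name):
# instantiate_project("myproject")
```

Running this once on the guest should leave every package your project needs inside ~/.julia, ready to be bundled.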
Bundle your Julia Project and its dependencies in a zip file
Bundle the ~/.julia folder and the project folder on the guest into a single compressed archive using the following command:
tar czvf julia_bundle.tar.gz \
~/myproject \
~/.julia
Now copy the bundle to the host through the shared folder, appending .csv to its name, and upload it to Kaggle as a dataset for your notebook.
cp julia_bundle.tar.gz /media/share/julia_bundle.tar.gz.csv
Upload Julia as a dataset to Kaggle
Download Julia for Generic Linux on x86 and upload it to Kaggle as a dataset, with .csv appended to the file name.
julia-1.11.3-linux-x86_64.tar.gz -> julia-1.11.3-linux-x86_64.tar.gz.csv
Extract Julia and your Bundle in the Kaggle Notebook
!tar xvf /kaggle/input/julia-exe/julia-1.11.3-linux-x86_64.tar.gz.csv
!tar xvf /kaggle/input/bundle/julia_bundle.tar.gz.csv
Both archives are extracted into the /kaggle/working directory.
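If you prefer to extract from Python rather than with the shell, the following sketch does the same job with the standard tarfile module. The helper name extract_archive is hypothetical, and the sketch assumes the uploaded files are ordinary tar archives whose names merely carry the extra .csv suffix.

```python
import tarfile
from pathlib import Path


def extract_archive(archive: str, dest: str = "/kaggle/working") -> list[str]:
    """Extract a tar archive into dest and return its top-level entry names."""
    Path(dest).mkdir(parents=True, exist_ok=True)
    # Mode "r:*" detects the compression from the file contents,
    # so the .csv suffix on the file name does not matter.
    with tarfile.open(archive, mode="r:*") as tar:
        names = tar.getnames()
        tar.extractall(dest)
    return sorted({name.split("/")[0] for name in names})


# Example call in the notebook (dataset path as used in this post):
# print(extract_archive("/kaggle/input/bundle/julia_bundle.tar.gz.csv"))
```

Printing the returned top-level names is a quick way to see where the project and the .julia folder actually landed before you set any paths.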
Set environment variables for Julia and its packages
import os
os.environ['JULIA_DEPOT_PATH'] = ':/kaggle/working/home/ubuntu/.julia'
os.environ['PATH'] += ':/kaggle/working/julia-1.11.3/bin'
The /home/ubuntu/ path may be different in your case; adjust it accordingly.
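Because these paths depend on your guest's user name and Julia version, it can help to check them before running anything. This is a minimal sketch with a hypothetical helper, julia_setup_problems. It only inspects the file system and does not change any environment variables.

```python
import os
import shutil


def julia_setup_problems(depot: str, julia_bin: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if not os.path.isdir(depot):
        problems.append(f"depot directory not found: {depot}")
    # shutil.which with an explicit path searches only that directory
    # for an executable named julia.
    if shutil.which("julia", path=julia_bin) is None:
        problems.append(f"no julia executable in: {julia_bin}")
    return problems


# Example call in the notebook (paths as used in this post):
# print(julia_setup_problems("/kaggle/working/home/ubuntu/.julia",
#                            "/kaggle/working/julia-1.11.3/bin"))
```

An empty list suggests that both the package depot and the Julia binary are where the environment variables point.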
Execute your Julia file on Kaggle
!julia --project=/kaggle/working/home/ubuntu/myproject /kaggle/working/home/ubuntu/myproject/Test.jl
Typically your Julia file, Test.jl in our case, is the entry point that calls your other code and files as needed. If you need data from a competition, Test.jl must use the Kaggle path of the relevant competition files, for example /kaggle/input/competition_directory/train.csv for a train file. This is relevant when you have trained a model on your host machine and simply want to submit the trained model and run it on the data provided by the competition. Using Kaggle paths in Test.jl for all inputs and outputs also lets your code run unchanged on any new data the competition sponsor tests it on later.
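A shell command in a notebook cell does not necessarily stop the notebook when Julia fails, so a later cell may run without a submission.csv. One option, sketched here with hypothetical helper names (julia_command, run_julia), is to launch Julia through subprocess and raise an error on a non-zero exit code.

```python
import subprocess


def julia_command(project_dir: str, script: str, julia: str = "julia") -> list[str]:
    """Build the command line used to run a script inside a Julia project."""
    return [julia, f"--project={project_dir}", script]


def run_julia(project_dir: str, script: str, julia: str = "julia") -> str:
    """Run the script, print its output, and raise if Julia exits with an error."""
    result = subprocess.run(
        julia_command(project_dir, script, julia),
        capture_output=True,
        text=True,
    )
    print(result.stdout)
    if result.returncode != 0:
        raise RuntimeError(
            f"julia exited with code {result.returncode}:\n{result.stderr}"
        )
    return result.stdout


# Example call in the notebook (paths as used in this post):
# project = "/kaggle/working/home/ubuntu/myproject"
# run_julia(project, project + "/Test.jl")
```

With this variant a failing Test.jl halts the notebook at the cell that caused the problem, and the Julia error message appears in the traceback.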
submission.csv
Competitions typically require the Kaggle notebook to output a submission.csv, so we need to remove everything except submission.csv from the /kaggle/working directory. Your Test.jl should have written submission.csv to /kaggle/working. Run the following commands to read the file with pandas (a quick check that it is readable), remove the Julia installation and your project, and then write submission.csv back out.
import pandas as pd
submission=pd.read_csv('/kaggle/working/submission.csv')
print("Read Submission File")
!rm -rf /kaggle/working/julia-1.11.3
!rm -rf /kaggle/working/home/
submission.to_csv('submission.csv', index=False)
print('Wrote Submission')
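If your project leaves other files behind, a more general cleanup is to delete everything in the working directory except the submission file. The sketch below uses a hypothetical helper, keep_only. Unlike the commands above, it removes every other entry in the directory, so use it only when nothing else there needs to survive.

```python
import shutil
from pathlib import Path


def keep_only(working: str, keep: str = "submission.csv") -> list[str]:
    """Delete every entry in working except keep; return the removed names."""
    root = Path(working)
    if not (root / keep).is_file():
        # Refuse to delete anything if the submission file is missing.
        raise FileNotFoundError(f"{keep} not found in {working}")
    removed = []
    for entry in root.iterdir():
        if entry.name == keep:
            continue
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()
        removed.append(entry.name)
    return sorted(removed)


# Example call in the notebook:
# print(keep_only("/kaggle/working"))
```

The check at the top means a failed Julia run leaves the directory untouched, which makes the failure easier to debug.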
Hope that was helpful. Any comments or issues, please post, I'd be happy to help.