How do I import my own modules from a repo on Databricks?


I have connected a Github repository to my Databricks workspace, and am trying to import a module that's in this repo into a notebook also within the repo. The structure is as such:

Repo_Name

  • Checks.py
  • Test.ipynb

The path to this repo is in sys.path, yet I still get ModuleNotFoundError: No module named 'Checks' when I try to import Checks. This link explains that you should be able to import any modules that are on the path. Does anyone know why it might still not be working?
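For reference, this is how adding a directory to sys.path and importing a module from it normally works. A minimal, self-contained sketch, using a temporary directory to stand in for the repo (the path and the module contents are illustrative, not the actual Databricks layout, where the path would look like /Workspace/Repos/&lt;username&gt;/Repo_Name):

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Stand-in "repo" directory containing a Checks.py module.
repo_dir = Path(tempfile.mkdtemp())
(repo_dir / "Checks.py").write_text("def run():\n    return 'ok'\n")

# Add the repo directory to the import search path, then import the module.
sys.path.append(str(repo_dir))
Checks = importlib.import_module("Checks")
print(Checks.run())  # -> ok
```

If this pattern works locally but fails on Databricks for the same file, the problem is usually with how the file itself is stored, not with sys.path.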


There are 2 best solutions below

  • I tried the same thing and got a similar error, even after following the procedure in the link provided in the question.
  • I have the following Python files in my Git repo (3 files with a .py extension).


  • Now when I add the path /Workspace/Repos/<username>/repro0812 to sys.path and try to import the sample module from this repo, it throws the same error.


  • This is because, for some reason, this file is not being rendered as a Python file. When I open the repo, you can actually see the difference.


  • There was no problem importing the other 2 Python modules, check and sample2.

  • Check and make sure that the file is being considered as a .py file after adding your repo.
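A quick way to run that check from a notebook is to ask Python whether it can locate the module at all from the current sys.path, using the standard importlib.util.find_spec. A small sketch (the module names here are illustrative):

```python
import importlib.util

def can_import(module_name):
    """Return True if the module can be found on the current sys.path."""
    return importlib.util.find_spec(module_name) is not None

print(can_import("json"))    # stdlib module, always findable -> True
print(can_import("Checks"))  # not on sys.path in this sketch -> False
```

If find_spec returns None for a file you can see in the repo, the file is most likely not being treated as a Python module (for example, it is rendered as a notebook).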

Most probably your .py file is a notebook, not a Python file. Notebooks can't be imported as Python modules; only plain Python files can be used in this case.

You can check whether this .py file has the following text on its first line:

# Databricks notebook source

If it does, you can remove that line, and the file will be treated as a Python file instead of a notebook.
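A small sketch of that fix, assuming the standard marker text shown above (the demo file path and its contents are illustrative):

```python
import tempfile
from pathlib import Path

NOTEBOOK_MARKER = "# Databricks notebook source"

def strip_notebook_marker(path):
    """Remove the Databricks notebook marker if it is the file's first line.

    Returns True if the marker was found and removed, False otherwise.
    """
    path = Path(path)
    lines = path.read_text().splitlines(keepends=True)
    if lines and lines[0].strip() == NOTEBOOK_MARKER:
        path.write_text("".join(lines[1:]))
        return True
    return False

# Demo on a stand-in file:
demo = Path(tempfile.mkdtemp()) / "Checks.py"
demo.write_text("# Databricks notebook source\ndef run():\n    return 'ok'\n")
print(strip_notebook_marker(demo))         # -> True (marker removed)
print(demo.read_text().startswith("def"))  # -> True (plain Python remains)
```

After removing the marker (and committing the change), the file should import as a regular Python module.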