How Do Kurtosis Imports Work?
Background
Kurtosis allows a Starlark script to use content from other files. This might be importing code from another Starlark file (via the import_module instruction), or using contents of a static file (via the read_file instruction).
In both cases, the Kurtosis engine needs to know where to find the external file. There are two places where external files might live:
- Locally: the external file lives on the same filesystem as the Starlark script that is trying to use it.
- Remotely: the external file lives somewhere on the internet.
Therefore, Kurtosis needs to handle both of these.
This external files problem is not unique to Kurtosis. Every programming language faces the same challenge, and each programming language solves it differently.
In JavaScript, local files are referenced via relative imports:
import something from '../../someDirectory/someFile'
and remote files are downloaded as modules using npm or yarn and stored in the node_modules directory. The remote files will then be available via:
import something from 'some-package'
In Python, local files are handled via the relative import syntax:
from .moduleY import spam
from ..moduleA import foo
and remote files are downloaded as packages using pip, stored somewhere on your machine, and made available on Python's module search path (sys.path, which the PYTHONPATH variable can extend). The package will then be available via regular import syntax:
import some_package
In Java, the difference between local and remote files is less distinct because all files are packaged in JARs. Classes are imported using Java's import syntax:
import com.docker.clients.Client;
and the Java classpath is searched for each import to see if any JAR contains a matching file. It is the responsibility of the user to build the correct classpath, and various tools and dependency managers help developers download JARs and construct the classpath correctly.
Kurtosis Packages
Remote file imports in any language are always handled through a packaging system. This is because any language that allows remote external files must have a solution for identifying the remote files, downloading them locally, and making them available on the import path (PYTHONPATH, node_modules, classpath, etc.). Furthermore, authors must be able to bundle files together into a package, publish them, and share them. Thus, for Kurtosis to allow Starlark scripts to depend on remote external files, we needed a packaging system of our own.
Of all the languages, we have been most impressed by Go's packaging system (which Go calls "modules"). In Go:
- Modules are easy to create by adding a go.mod manifest file to a directory
- Dependencies are easy to declare in the go.mod file
- Modules are published to the world simply by pushing up to GitHub
Kurtosis code needs to be easy to share, so we modelled our packaging system on Go's.
In Kurtosis, a directory that has a kurtosis.yml file is the package root of a Kurtosis package, and all the contents of that directory will be part of the package. Any Starlark script inside the package will have the ability to use external files (e.g. via read_file or import_module) by specifying the locator of the file.
Each package will be named with the name key inside the kurtosis.yml file. Package names follow the format github.com/package-author/package-repo/path/to/directory-with-kurtosis.yml as specified in the kurtosis.yml documentation. This package name is used to determine whether a file being imported is local (meaning "found inside the package") or remote (meaning "found from the internet"). The logic for resolving a read_file/import_module is as follows:
- If the package name in the kurtosis.yml is a prefix of the locator used in read_file/import_module, then the file is assumed to be local inside the package. The package name in the locator (github.com/package-author/package-repo/path/to/directory-with-kurtosis.yml) references the package root (which is the directory where the kurtosis.yml lives), and each subpath appended to the package name will traverse down in the repo.
- If the package name is not a prefix of the locator used in read_file/import_module, then the file is assumed to be remote. Kurtosis will look at the github.com/package-author/package-repo prefix of the locator, clone the repository from GitHub, and use the file inside the package, i.e. the directory that contains a kurtosis.yml.
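The two rules above can be sketched in a few lines of Python. This is only an illustration of the decision described here, not Kurtosis's actual implementation: the function name classify_locator is hypothetical, and comparing whole path segments (rather than raw string prefixes) is an assumption of the sketch.

```python
def classify_locator(package_name: str, locator: str):
    """Apply the two rules above to a locator.

    Returns ("local", path relative to the package root) or
    ("remote", "github.com/<author>/<repo>").
    """
    pkg = package_name.rstrip("/").split("/")
    loc = locator.rstrip("/").split("/")
    # Rule 1: the package name is a prefix of the locator, so the file is local.
    # Matching on whole path segments is an assumption of this sketch.
    if loc[:len(pkg)] == pkg:
        return ("local", "/".join(loc[len(pkg):]))
    # Rule 2: otherwise the file is remote; the first three segments
    # (host, author, repo) identify the repository to clone.
    return ("remote", "/".join(loc[:3]))
```

For a package named github.com/test/test, the locator github.com/test/test/lib/helpers.star would be classified as local with the relative path lib/helpers.star, while github.com/other/repo/main.star would be classified as remote and point at the github.com/other/repo repository.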
Since kurtosis.yml can live in any directory, users have the ability to create multiple packages per repo (sibling packages). We do not currently support a package importing a sibling package (i.e. if foo and bar packages are subdirectories of repo, then bar cannot import files from foo). Please let us know if you need this functionality.
Kurtosis does not allow referencing local files outside the package (i.e. in a directory above the package root with the kurtosis.yml file). This is to ensure that all files used in the package get pushed to GitHub when the package is published.
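As a rough illustration of this restriction, the Python sketch below rejects any path that would escape the package root. It is a generic containment check under stated assumptions, not Kurtosis's own code: the function name resolve_inside_package and the example paths are hypothetical.

```python
import posixpath

def resolve_inside_package(package_root: str, relative_path: str) -> str:
    """Join relative_path onto package_root, refusing to leave the package."""
    root = posixpath.normpath(package_root)
    # normpath collapses ".." segments so escapes become visible.
    candidate = posixpath.normpath(posixpath.join(root, relative_path))
    # The candidate is inside the package only if the root is its common ancestor.
    if posixpath.commonpath([root, candidate]) != root:
        raise ValueError(f"{relative_path!r} is outside the package root {root!r}")
    return candidate
```

With a package root of /home/me/pkg, the path lib/a.star resolves normally, while ../secret.txt raises an error because it points above the directory that holds the kurtosis.yml.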
Packages in Practice
There are three ways to run Kurtosis Starlark. The first is by running a script directly:
kurtosis run some-script.star
Because only a script was specified, Kurtosis does not have the kurtosis.yml or package name necessary to resolve file imports. Therefore, any imports used in the script will fail.
The second way is to run a runnable package by pointing to the package root:
# OPTION 1: Point to the directory containing the `kurtosis.yml` and `main.star`
kurtosis run /path/to/package/root # Can also be "."
# OPTION 2: Point to a `kurtosis.yml` file directly, with a `main.star` next to it
kurtosis run /path/to/package/root/kurtosis.yml
In both cases, Kurtosis will run the main.star in the package root and resolve any file imports using the package name specified in the kurtosis.yml. All local imports (imports that have the package name as a prefix to the locator) will be resolved within the directory on your filesystem; this is very useful for local development.
Not all packages have a main.star file, meaning not all packages are runnable; some packages are simply libraries intended to be imported in other packages.
The third way is to run a runnable package by its package name (found under the name key in the package's kurtosis.yml):
# if kurtosis.yml is in repository root
kurtosis run github.com/package-author/package-repo
# if kurtosis.yml is in any other directory
kurtosis run github.com/package-author/package-repo/path/to/directory-with-kurtosis.yml
Kurtosis will clone the package from GitHub, run the main.star, and use the kurtosis.yml to resolve any imports. This method always uses the version on GitHub.
To run a non-main branch, a tag, or a commit, use the following syntax:
kurtosis run github.com/package-author/package-repo@tag-branch-commit
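To summarize the three ways of running, the Python sketch below guesses which mode a given argument to kurtosis run selects and splits off an optional @ suffix. The detection rules (a github.com/ prefix, a .star extension) and the function name describe_run_target are assumptions made for illustration; how the Kurtosis CLI actually distinguishes these cases is not described here.

```python
def describe_run_target(argument: str):
    """Guess which of the three run modes an argument selects.

    Returns (mode, target, ref); ref is None unless an "@" suffix is present.
    The detection rules here are illustrative assumptions only.
    """
    if argument.startswith("github.com/"):
        # Third way: a package name, optionally followed by "@<tag|branch|commit>".
        name, sep, ref = argument.partition("@")
        return ("remote-package", name, ref if sep else None)
    if argument.endswith(".star"):
        # First way: a standalone script, so imports cannot be resolved.
        return ("script", argument, None)
    # Second way: a package root directory or a path to its kurtosis.yml.
    return ("local-package", argument, None)
```

Under these assumptions, github.com/package-author/package-repo@tag-branch-commit is treated as a remote package pinned to tag-branch-commit, some-script.star as a standalone script, and "." as a local package root.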
When you're developing locally, before your package has been pushed to GitHub, the package name can be anything you like - e.g. github.com/test/test. The only thing that is important for correctly resolving local file imports is that your read_file/import_module locators also are prefixed with github.com/test/test.
Once you push to GitHub, however, your package name will need to match the author and repo. If it doesn't, your package will be broken when another user depends on it, because Kurtosis will go looking for a github.com/test/test package that likely doesn't exist.
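A pre-publish sanity check for this requirement could look like the Python sketch below. The helper name name_matches_repo is hypothetical and the check is not part of Kurtosis; it simply compares the first three segments of the package name against the GitHub author and repo you are about to push to.

```python
def name_matches_repo(package_name: str, author: str, repo: str) -> bool:
    """Return True if package_name starts with github.com/<author>/<repo>."""
    expected = ["github.com", author, repo]
    # Only the host, author and repo segments must match; any remaining
    # segments are the path to the directory holding the kurtosis.yml.
    return package_name.strip("/").split("/")[:3] == expected
```

A package still named github.com/test/test would fail this check for the package-author/package-repo repository, which is the situation that breaks downstream users.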