Setup

Creating a development environment on your host

Python >=3.10,<3.13 is required.
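
Before creating the virtual environment, it can help to confirm that the interpreter falls inside this range. The sketch below is one possible check; the function name `python_version_supported` is hypothetical and not part of sapientml, and it compares only the major and minor version components.

```shell
# Hypothetical helper: succeed only if the given version string is within
# the supported range (>=3.10,<3.13). Patch levels are ignored.
python_version_supported() {
    version="$1"
    major="${version%%.*}"   # text before the first dot
    rest="${version#*.}"     # text after the first dot
    minor="${rest%%.*}"      # text before the next dot, if any
    [ "$major" -eq 3 ] && [ "$minor" -ge 10 ] && [ "$minor" -lt 13 ]
}

# Example usage (assumes `python` is on PATH):
#   current="$(python -c 'import platform; print(platform.python_version())')"
#   python_version_supported "$current" || echo "Unsupported Python: $current" >&2
```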

Clone sapientml and core. If you need to modify preprocess and loaddata, please clone them as well.

mkdir AutoML
cd AutoML
git clone https://github.com/sapientml/sapientml.git
git clone https://github.com/sapientml/core.git

# optional
git clone https://github.com/sapientml/preprocess.git
git clone https://github.com/sapientml/loaddata.git

Set up an environment in the sapientml repository folder.

cd /path/to/AutoML/sapientml
python -m venv venv
. venv/bin/activate
pip install poetry
poetry install
pre-commit install
pip install -e ../core

# optional
pip install -e ../preprocess
pip install -e ../loaddata

On Ubuntu, poetry install may fail. If so, try the following command:

PYTHON_KEYRING_BACKEND="keyring.backends.null.Keyring" poetry install
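
If you script the setup, the retry can be automated. The following is a minimal sketch, assuming the failure is keyring-related; the function name `poetry_install_with_fallback` is hypothetical, and it only wraps the two commands shown above.

```shell
# Hypothetical wrapper: run `poetry install`, and if it fails, retry once
# with the null keyring backend set for that single invocation.
poetry_install_with_fallback() {
    if poetry install; then
        return 0
    fi
    echo "poetry install failed; retrying with the null keyring backend" >&2
    PYTHON_KEYRING_BACKEND="keyring.backends.null.Keyring" poetry install
}
```

Call `poetry_install_with_fallback` from the sapientml repository folder with the virtual environment activated. If the second attempt also fails, the function returns a non-zero status, so the cause is probably something other than the keyring.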

Because sapientml and core are interdependent, use the command below to install core into the environment.

pip install -e /path/to/AutoML/core
deactivate

Now download the corpus into sapientml_core.

. venv/bin/activate
cd /path/to/AutoML/core/sapientml_core
pip install dvc
wget https://github.com/sapientml/sapientml/files/13432403/sapientml-corpus-0.1.3.zip
unzip sapientml-corpus-0.1.3.zip
mv sapientml-corpus-0.1.3 corpus
cd corpus
bash ./scripts/pull.sh
rm -f ../sapientml-corpus-0.1.3.zip
deactivate

After a successful installation, the directory structure should look like this:

AutoML/
├── core/
│   └── sapientml_core/
│       ├── corpus/
│       │   ├── clean-notebooks/
│       │   ├── annotated-notebooks/
│       │   └── dataset/
│       ├── design/
│       └── training/
│
└── sapientml/
    └── sapientml/
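
To confirm the layout without inspecting it by hand, a small check along the following lines may be useful. This is a sketch only: the function name `check_layout` is hypothetical, and the directory list is taken directly from the tree above.

```shell
# Hypothetical helper: print every expected directory that is missing under
# the given root (default: current directory) and return non-zero if any are.
check_layout() {
    root="${1:-.}"
    status=0
    for dir in \
        core/sapientml_core/corpus/clean-notebooks \
        core/sapientml_core/corpus/annotated-notebooks \
        core/sapientml_core/corpus/dataset \
        core/sapientml_core/design \
        core/sapientml_core/training \
        sapientml/sapientml
    do
        if [ ! -d "$root/$dir" ]; then
            echo "missing: $root/$dir" >&2
            status=1
        fi
    done
    return "$status"
}
```

Run it as `check_layout /path/to/AutoML`. Any missing directory is listed on standard error, which usually points to the step that needs to be repeated.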