Bluemist AI
Bluemist AI is an advanced, open-source, low-code machine learning library developed in Python. Its primary purpose is to facilitate the development, evaluation, and deployment of automated machine learning models and pipelines.
The library acts as a convenient wrapper service, integrating seamlessly with popular Python libraries such as scikit-learn, NumPy, pandas, mlflow, and FastAPI. For visualization purposes, Bluemist AI leverages the capabilities of pandas-profiling, sweetviz, dtale, and autoviz.
Key Features
Bluemist AI offers a range of powerful features, including:
Native Data Integration: Seamlessly extract data from various sources, including MySQL, PostgreSQL, MS SQL, Oracle, MariaDB, Amazon Aurora, and Amazon S3, providing a unified interface for working with diverse datasets.
Exploratory Data Analysis (EDA): Conduct thorough EDA on your data, enabling comprehensive insights into the characteristics, distributions, and relationships within the dataset.
Data Preprocessing: Streamline the preprocessing pipeline by leveraging Bluemist AI’s built-in preprocessing capabilities, allowing you to handle missing data, perform feature scaling, encoding, and other essential preprocessing tasks efficiently.
Algorithm Selection and Comparison: Bluemist AI trains models on your data across multiple algorithms, enabling you to compare and evaluate them using comprehensive metrics and insights.
Hyperparameter Tuning: Optimize model performance with built-in hyperparameter tuning functionalities, automatically searching for the optimal combination of hyperparameters to achieve the best results.
Experiment Tracking: Keep track of your experiments effortlessly, storing essential metadata, parameters, and results, allowing for easy reproducibility and experimentation management.
API Deployment: Bluemist AI simplifies the process of deploying machine learning models as APIs, making it easier to integrate your models into production systems and applications.
These features make Bluemist AI a powerful tool for developing, evaluating, and deploying machine learning models efficiently and effectively.
Getting Started
For the detailed list of supported and upcoming features, visit https://www.bluemist-ai.one
Full documentation is available at https://bluemist-ai.readthedocs.io
User installation
Method 1
To install the minimal version of the package, with only the hard dependencies listed in requirements.txt:
pip install -U bluemist
To install the complete package, including the optional dependencies listed in requirements-optional.txt (refer to Minimal package vs Full package for more details):
pip install -U "bluemist[complete]"
Method 2 (recommended)
It is advised to set up a separate Python environment to avoid conflicts with package dependencies. Follow the steps below:
Install the virtualenv package
pip install virtualenv
Create a separate directory for the bluemist environment
mkdir /path/to/bluemist-ai
cd /path/to/bluemist-ai
Create the bluemist environment
virtualenv bluemist-env
Activate the environment and install bluemist
source bluemist-env/bin/activate
pip install -U bluemist
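To confirm that the install landed in the active environment, a quick check along the following lines may help. This is a minimal sketch using only the Python standard library; the helper name installed_version is hypothetical and not part of bluemist. It assumes the script is run with the same interpreter that pip installed into.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed.

    Hypothetical helper for illustration; it is not part of the bluemist API.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Look up the distribution name used in the pip commands above.
found = installed_version("bluemist")
if found is None:
    print("bluemist is not installed in this environment")
else:
    print(f"bluemist {found} is installed")
```

If the check reports that bluemist is missing, the most likely cause is that the virtual environment was not activated before running pip or the script.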
Method 3
The bluemist package can also be installed using the pipx utility, which automatically creates an isolated environment to run it.
pip install pipx
pipx install bluemist
pipx upgrade bluemist
Support for GPU/CPU Acceleration
GPU Acceleration
If your system is equipped with an NVIDIA GPU and you want to utilize GPU acceleration for model training, you can install the RAPIDS cuML package by executing the following command:
pip install cudf-cu11 cuml-cu11 --extra-index-url=https://pypi.nvidia.com
The cuml package provides GPU-accelerated implementations of various machine learning algorithms. With it installed, model training can run on your GPU, which can substantially speed up your machine learning workflows. For the list of supported algorithms, please refer to https://docs.rapids.ai/api/cuml/stable/api/#regression-and-classification
Acceleration extensions can be enabled by passing the parameter enable_acceleration_extensions=True during the initialize phase.
CPU Acceleration
If your system is equipped with an Intel CPU and you want to utilize CPU acceleration for model training, you can install the Intel® Extension for Scikit-learn package by executing the following command:
pip install -U scikit-learn-intelex
The scikit-learn-intelex package provides CPU-accelerated implementations of Scikit-learn algorithms. With it installed, supported algorithms run on optimized code paths for Intel CPUs, which can substantially speed up your machine learning workflows. For the list of supported algorithms, please refer to the Intel® Extension for Scikit-learn documentation.
Acceleration extensions can be enabled by passing the parameter enable_acceleration_extensions=True during the initialize phase.
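Before enabling acceleration extensions, it can be useful to check which of the optional acceleration packages are actually importable. The sketch below uses only the standard library. The helper name available_accelerators and the labels are hypothetical; cuml and sklearnex are the import names of RAPIDS cuML and Intel® Extension for Scikit-learn respectively. It does not call any bluemist function and only reports whether the modules can be found.

```python
from importlib.util import find_spec

# Import names of the optional acceleration packages described above.
# The labels are illustrative only.
ACCELERATION_MODULES = {
    "GPU (RAPIDS cuML)": "cuml",
    "CPU (Intel Extension for Scikit-learn)": "sklearnex",
}


def available_accelerators(modules=ACCELERATION_MODULES):
    """Map each label to True if its module can be found, without importing it.

    Hypothetical helper for illustration; it is not part of the bluemist API.
    """
    return {label: find_spec(name) is not None for label, name in modules.items()}


for label, present in available_accelerators().items():
    status = "available" if present else "not installed"
    print(f"{label}: {status}")
```

A module being importable does not guarantee that matching hardware or drivers are present, so treat the result as a first check only.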
Minimal package vs Full package
The following functionalities are available only with the complete package installation:
Data extraction from RDBMS or cloud
EDA Visualizations using dtale and sweetviz
CPU/GPU Acceleration
Alternatively, a single optional package can be installed using the pip command. For example, if you would like to extract data from Amazon S3 but do not wish to install the other optional packages:
pip install boto3
License
This project is licensed under the MIT License.
See Third Party Libraries for license details of third party libraries included in the distribution.