Advent of 2023, Day 12 – Creating data science experiments with Microsoft Fabric
This article is originally published at https://tomaztsql.wordpress.com
In this Microsoft Fabric series:
- Dec 01: What is Microsoft Fabric?
- Dec 02: Getting started with Microsoft Fabric
- Dec 03: What is lakehouse in Fabric?
- Dec 04: Delta lake and delta tables in Microsoft Fabric
- Dec 05: Getting data into lakehouse
- Dec 06: SQL Analytics endpoint
- Dec 07: SQL commands in SQL Analytics endpoint
- Dec 08: Using Lakehouse REST API
- Dec 09: Building custom environments
- Dec 10: Creating Job Spark definition
- Dec 11: Starting data science with Microsoft Fabric
We have started working with the data, and now we would like to create and submit an experiment. MLflow will be used for this.
Create a new experiment and give it a name. I have named mine “Advent2023_Experiment_v3”. Don’t ask why v3.
Next, add a simple model and wrap it with MLflow logging and runs:
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Select the experiment by name; MLflow creates it if it does not exist yet
mlflow.set_experiment("advent2023_experiment_v3")

# Load the iris data and hold out 20% of it for testing
iris = load_iris()
X, y = iris.data, iris.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=123
)

# Train one decision tree per depth and log each one as a separate run
for depth in [1, 2, 5, 10, 20]:
    clf = DecisionTreeClassifier(max_depth=depth)
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)

    # Start MLflow run
    with mlflow.start_run() as run:
        mlflow.log_param("depth", depth)
        mlflow.log_param("alpha", "alpha")  # placeholder string value
        mlflow.log_metric("accuracy", accuracy)
        mlflow.sklearn.log_model(clf, "classifier")
The experiment will be created in your workspace:
Opening the experiment, you can drill down into it and into each run:
There you can manually explore the statistics, metrics and logs for each run, compare the runs, and save (export) the models, .yml files, pickled models and notebooks. The experience is similar to the MLflow UI.
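The same comparison can also be done from a notebook instead of the UI. The following is a minimal sketch, not part of the original post: the helper names `best_run`, `model_uri` and `load_best_model` are hypothetical, and it assumes the runs were logged with the `accuracy` metric and the `classifier` artifact path used in the code above. It relies on `mlflow.search_runs`, which returns one row per run with columns such as `run_id` and `metrics.accuracy`, and on the `runs:/<run_id>/<artifact_path>` model URI scheme.

```python
import math


def best_run(records, metric="metrics.accuracy"):
    """Return the run record with the highest `metric`, or None if no run has it."""
    def has_metric(record):
        value = record.get(metric)
        # search_runs reports a missing metric as NaN, so filter those out too.
        return isinstance(value, (int, float)) and not math.isnan(value)

    scored = [r for r in records if has_metric(r)]
    return max(scored, key=lambda r: r[metric]) if scored else None


def model_uri(run_id, artifact_path="classifier"):
    """Build the runs:/ URI for a model logged under `artifact_path`."""
    return f"runs:/{run_id}/{artifact_path}"


def load_best_model(experiment_name="advent2023_experiment_v3"):
    """Fetch all runs of the experiment and load the most accurate model."""
    # Imported lazily so the helpers above work without MLflow installed.
    import mlflow
    import mlflow.sklearn

    # One row per run; converted to plain dicts for the helper above.
    runs = mlflow.search_runs(experiment_names=[experiment_name])
    best = best_run(runs.to_dict("records"))
    if best is None:
        raise ValueError(f"No runs with an accuracy metric in {experiment_name!r}")
    return best, mlflow.sklearn.load_model(model_uri(best["run_id"]))
```

In a notebook attached to the same workspace, calling `best, model = load_best_model()` should return the winning run record together with a fitted scikit-learn estimator, so `model.predict(X_test)` can be used directly. Ties in accuracy are resolved by whichever run comes first in the result, which is an arbitrary choice in this sketch.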
Tomorrow we will continue with data science, looking at ML models.
The complete set of code, documents, notebooks, and all of the other materials will be available at the GitHub repository: https://github.com/tomaztk/Microsoft-Fabric
Happy Advent of 2023!