Advent of 2024, Day 16 – Microsoft Azure AI – Fine-tuning a model
This article is originally published at https://tomaztsql.wordpress.com
In this Microsoft Azure AI series:
- Dec 01: Microsoft Azure AI – What is Foundry?
- Dec 02: Microsoft Azure AI – Working with Azure AI Foundry
- Dec 03: Microsoft Azure AI – Creating project in Azure AI Foundry
- Dec 04: Microsoft Azure AI – Deployment in Azure AI Foundry
- Dec 05: Microsoft Azure AI – Deployment parameters in Azure AI Foundry
- Dec 06: Microsoft Azure AI – AI Services in Azure AI Foundry
- Dec 07: Microsoft Azure AI – Speech service in AI Services
- Dec 08: Microsoft Azure AI – Speech Studio in Azure with AI Services
- Dec 09: Microsoft Azure AI – Speech SDK with Python
- Dec 10: Microsoft Azure AI – Language and Translation in Azure AI Foundry
- Dec 11: Microsoft Azure AI – Language and Translation Python SDK
- Dec 12: Microsoft Azure AI – Vision and Document AI Service
- Dec 13: Microsoft Azure AI – Vision and Document Python SDK
- Dec 14: Microsoft Azure AI – Content safety AI service
- Dec 15: Microsoft Azure AI – Content safety Python SDK
Fine-tuning is the process of optimizing a pretrained model by training it on your specific dataset, which often contains more examples than can fit in a single prompt. Fine-tuning helps you achieve higher-quality results for specific tasks, save on token costs with shorter prompts, and improve request latency.
The following models support fine-tuning:
- babbage-002
- davinci-002
- gpt-35-turbo (0613)
- gpt-35-turbo (1106)
- gpt-35-turbo (0125)
- gpt-4 (0613)
- gpt-4o (2024-08-06)
- gpt-4o-mini (2024-07-18)
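For the chat-style models in this list (gpt-35-turbo, gpt-4, gpt-4o), the training dataset is a JSONL file where each line holds one complete example conversation in the chat messages format; babbage-002 and davinci-002 use the older prompt/completion format instead. A minimal sketch of a single chat training example (the content itself is just an illustration):

```jsonl
{"messages": [{"role": "system", "content": "You are a helpful assistant for T-SQL questions."}, {"role": "user", "content": "How do I list all tables in a database?"}, {"role": "assistant", "content": "SELECT name FROM sys.tables;"}]}
```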
First, we need to find the model we would like to fine-tune. In the model catalog, select the Fine-tune option:
As you get comfortable and begin building your solution, it is important to understand where prompt engineering falls short; this will help you decide whether you should try fine-tuning. Ask yourself:
- Is the base model failing on edge cases or exceptions?
- Is the base model not consistently providing output in the right format?
- Is it difficult to fit enough examples in the context window to steer the model?
- Is there high latency?
Make sure that you are in a region where fine-tuning is available; Sweden Central usually works.
Upload your data as a shared resource:
or directly into the fine-tuning wizard,
and change the task parameters:
The parameters explained:
| Name | Type | Description |
|---|---|---|
| batch_size | integer | The batch size to use for training. The batch size is the number of training examples used in a single forward and backward pass. In general, we've found that larger batch sizes tend to work better for larger datasets. The default value as well as the maximum value for this property are specific to each base model. A larger batch size means that model parameters are updated less frequently, but with lower variance. |
| learning_rate_multiplier | number | The learning rate multiplier to use for training. The fine-tuning learning rate is the original learning rate used for pre-training multiplied by this value. Larger learning rates tend to perform better with larger batch sizes. We recommend experimenting with values in the range 0.02 to 0.2 to see what produces the best results. A smaller learning rate may be useful to avoid overfitting. |
| n_epochs | integer | The number of epochs to train the model for. An epoch refers to one full cycle through the training dataset. |
| seed | integer | The seed controls the reproducibility of the job. Passing in the same seed and job parameters should produce the same results, but results may differ in rare cases. If a seed isn't specified, one will be generated for you. |
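The same job can also be submitted programmatically. Here is a minimal sketch using the openai Python package against an Azure OpenAI resource; the endpoint, key, API version, file name, and hyperparameter values are placeholders you would replace with your own:

```python
from openai import AzureOpenAI

# Assumed endpoint, key and API version -- replace with your own resource values
client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com/",
    api_key="<your-api-key>",
    api_version="2024-08-01-preview",
)

# Upload the JSONL training file (it may take a moment to finish processing)
training_file = client.files.create(
    file=open("training_data.jsonl", "rb"),
    purpose="fine-tune",
)

# Create the fine-tuning job with the hyperparameters described in the table above
job = client.fine_tuning.jobs.create(
    model="gpt-35-turbo-0613",
    training_file=training_file.id,
    hyperparameters={
        "batch_size": 1,
        "learning_rate_multiplier": 0.1,
        "n_epochs": 3,
    },
    seed=42,
)
print(job.id, job.status)
```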
And the fine-tuning job is available:
Once training completes and you have the overview metrics of your fine-tuned model, you can deploy the model and start using it in the Chat playground.
Overview of the running model:
Once completed, you will be able to see the fine-tuning metrics.
Metrics of the fine-tuned model:
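You can also follow the job and its metrics from code. A short sketch, continuing with the client and job objects from the snippet above (the training events typically contain the per-step loss messages):

```python
import time

# Poll the job until it reaches a terminal state
while True:
    job = client.fine_tuning.jobs.retrieve(job.id)
    if job.status in ("succeeded", "failed", "cancelled"):
        break
    time.sleep(60)

# List the training events (per-step loss/accuracy messages show up here)
for event in client.fine_tuning.jobs.list_events(fine_tuning_job_id=job.id):
    print(event.message)

# Name of the resulting fine-tuned model (used when creating a deployment)
print(job.fine_tuned_model)
```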
And you can deploy the model:
and start using it in the Chat playground:
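Outside the playground, the deployed fine-tuned model is called like any other chat deployment. A quick sketch, again reusing the client from above; the deployment name is a hypothetical one, so use whatever name you chose when deploying:

```python
# Call the deployed fine-tuned model via the standard chat completions API;
# "my-finetuned-gpt35" is a placeholder for your actual deployment name
response = client.chat.completions.create(
    model="my-finetuned-gpt35",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
)
print(response.choices[0].message.content)
```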
Tomorrow we will look into the Azure OpenAI service.
All of the code samples will be available on my GitHub.