Advent of 2024, Day 5 – Microsoft Azure AI – Deployment parameters in Azure AI Foundry
This article is originally published at https://tomaztsql.wordpress.com
In this Microsoft Azure AI series:
- Dec 01: Microsoft Azure AI – What is Foundry?
- Dec 02: Microsoft Azure AI – Working with Azure AI Foundry
- Dec 03: Microsoft Azure AI – Creating project in Azure AI Foundry
- Dec 04: Microsoft Azure AI – Deployment in Azure AI Foundry
When you are in Azure AI Foundry and deploying a model, you can configure a couple of additional settings: Instructions and context, Add your data, and Parameters.
![](https://i0.wp.com/tomaztsql.wordpress.com/wp-content/uploads/2024/12/image-11.png?resize=621%2C635&ssl=1)
Instructions and context
Give the model instructions about how it should behave and any context it should reference when generating a response. You can describe the assistant's personality, tell it what it should and shouldn't answer, and tell it how to format responses. There is no token limit for this section, but it is included with every API call, so it counts against the overall token limit.
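Under the hood, this text is sent as the system message of every chat completions request. Here is a minimal sketch using the Azure OpenAI Python SDK; the endpoint, key, and deployment name are placeholders, not values from this article:

```python
# Minimal sketch: the "Instructions and context" box becomes the system
# message of every request. Endpoint, key and deployment name are placeholders.
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",  # placeholder
    api_key="<your-api-key>",                                   # placeholder
    api_version="2024-06-01",
)

response = client.chat.completions.create(
    model="gpt-4",  # the name of your deployment
    messages=[
        # System message = instructions and context; it travels with every
        # call, so it counts against the token limit on each request.
        {"role": "system",
         "content": "You are a concise assistant. Answer only questions about Azure."},
        {"role": "user", "content": "What is Azure AI Foundry?"},
    ],
)
print(response.choices[0].message.content)
```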
You can also click "+ Add section", which gives you the following options:
![](https://i0.wp.com/tomaztsql.wordpress.com/wp-content/uploads/2024/12/image-12.png?resize=588%2C230&ssl=1)
Add Safety system messages to help the model avoid harmful content, ungrounded content, copyright infringement, jailbreaks, and manipulation. You can insert one or more prepared system messages into your prompt and alter or extend them if you like. Token usage is incurred once you begin chatting with the model in the playground.
Add Examples to show the model what responses you want. It will try to mimic any responses you add here, so make sure they match the rules you laid out in the system message.
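In API terms, playground examples are simply extra user/assistant message pairs placed between the system message and the live conversation. A hedged sketch, reusing the client from the block above (the example content is illustrative):

```python
# Few-shot examples from the playground are sent as extra user/assistant
# turns between the system message and the real query.
messages = [
    {"role": "system", "content": "You answer in one short sentence."},
    # Example pair the model should mimic:
    {"role": "user", "content": "What is a vector index?"},
    {"role": "assistant", "content": "A structure of embeddings used for similarity search."},
    # The actual user query follows the examples:
    {"role": "user", "content": "What is a system message?"},
]
response = client.chat.completions.create(model="gpt-4", messages=messages)
```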
Add your data
Adding your data is similar to retrieval-augmented generation (RAG): you can ask questions about your own data. The data remains stored in the data source you designate, and the relevant content is retrieved and added to the prompt at query time.
You will build a new vector index based on your data source, for example Azure AI Search, and you will need to have that resource available.
![](https://i0.wp.com/tomaztsql.wordpress.com/wp-content/uploads/2024/12/image-13.png?resize=720%2C393&ssl=1)
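When calling such a deployment from code, the playground's data wiring corresponds to passing a data source along with the request. A sketch under the assumption of an existing Azure AI Search index; the `data_sources` shape follows the Azure OpenAI "on your data" API, and all endpoint, index, and key values are placeholders:

```python
# Sketch of grounding a request on your own Azure AI Search index
# ("on your data"). All endpoint, index and key values are placeholders.
response = client.chat.completions.create(
    model="gpt-4",  # the name of your deployment
    messages=[{"role": "user", "content": "What do my documents say about invoices?"}],
    extra_body={
        "data_sources": [
            {
                "type": "azure_search",
                "parameters": {
                    "endpoint": "https://<your-search>.search.windows.net",
                    "index_name": "<your-index>",
                    "authentication": {"type": "api_key", "key": "<search-key>"},
                },
            }
        ]
    },
)
print(response.choices[0].message.content)
```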
Parameters
The parameter sliders can differ from model to model, but some are generally the same. For GPT-4, these are the parameters:
![](https://i0.wp.com/tomaztsql.wordpress.com/wp-content/uploads/2024/12/image-14.png?resize=552%2C512&ssl=1)
- Past messages included – Select the number of past messages to include in each new API request. This helps give the model context for new user queries. Setting this number to 10 will include 5 user queries and 5 system responses.
- Max response – Set a limit on the number of tokens per model response. The supported number of tokens is shared between the prompt (including system message, examples, message history, and user query) and the model's response. One token is roughly 4 characters for typical English text.
- Temperature – Controls randomness. Lowering the temperature means that the model will produce more repetitive and deterministic responses. Increasing the temperature will result in more unexpected or creative responses. Try adjusting temperature or Top P, but not both.
- Top P – Similar to temperature, this controls randomness but uses a different method. Lowering Top P will narrow the model's token selection to likelier tokens. Increasing Top P will let the model choose from tokens with both high and low likelihood. Try adjusting temperature or Top P, but not both.
- Stop sequence – Make the model end its response at a desired point. The model response will end before the specified sequence, so it won't contain the stop sequence text. For ChatGPT, using `<|im_end|>` ensures that the model response doesn't generate a follow-up user query. You can include as many as four stop sequences.
- Frequency penalty – Reduce the chance of repeating a token proportionally, based on how often it has appeared in the text so far. This decreases the likelihood of repeating the exact same text in a response.
- Presence penalty – Reduce the chance of repeating any token that has appeared in the text at all so far. This increases the likelihood of introducing new topics in a response.
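In code, these sliders map directly onto the chat completions parameters. A sketch with illustrative values; note that "Past messages included" is not an SDK parameter, so the history trimming below is a hand-rolled assumption:

```python
# How the playground sliders map onto chat completions parameters.
# Values are illustrative; "Past messages included" is trimmed client-side.
PAST_MESSAGES = 10

history = [
    {"role": "user", "content": "Hi!"},
    {"role": "assistant", "content": "Hello! How can I help?"},
]

messages = [{"role": "system", "content": "You are a helpful assistant."}]
messages += history[-PAST_MESSAGES:]   # Past messages included
messages.append({"role": "user", "content": "Summarize our chat so far."})

response = client.chat.completions.create(
    model="gpt-4",            # the name of your deployment
    messages=messages,
    max_tokens=800,           # Max response
    temperature=0.7,          # Temperature (tune this OR top_p, not both)
    top_p=0.95,               # Top P
    stop=None,                # Stop sequence: a string or a list of up to four
    frequency_penalty=0.0,    # Frequency penalty
    presence_penalty=0.0,     # Presence penalty
)
```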
With these parameters you can heavily influence the results of your prompts, and you can export your context and parameter settings to a *.prompty file.
![](https://i0.wp.com/tomaztsql.wordpress.com/wp-content/uploads/2024/12/image-15.png?resize=381%2C512&ssl=1)
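To get a feel for the format: a .prompty file is, roughly, a YAML front matter carrying the model configuration and parameters, followed by the templated messages. This is an illustrative sketch of the general shape, not a verbatim export from Foundry:

```yaml
---
name: example-prompt
description: Illustrative sketch of a .prompty file
model:
  api: chat
  configuration:
    type: azure_openai
    azure_deployment: gpt-4    # placeholder deployment name
  parameters:
    max_tokens: 800
    temperature: 0.7
---
system:
You are a helpful assistant.

user:
{{question}}
```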
Tomorrow we will look into AI Services.
All of the code samples will be available on my GitHub.