While working with OpenAI for the past few months, I found a Terraform module that really helped me get started with the Azure OpenAI component. The module, Azure/openai/azurerm, was built with customisation, scalability, and security in mind, which makes it a great starting point.
In this blog, I will demonstrate how to use the module to deploy a public instance. Future blog posts will cover private network integration, adding a WAF, and further enhancements.
Public Instance Deployment
The instance we are deploying will be publicly accessible over the internet. I would recommend tightening security by restricting which networks can reach the instance, something I will cover in future posts.
Below is a sample module block I use to deploy an instance. This alone will deploy a working OpenAI instance in an existing resource group:
module "openai-uks" {
  source  = "Azure/openai/azurerm"
  version = "0.1.3"

  location            = "UK South"
  resource_group_name = "Resource Group Name"
  account_name        = "open-ai"
  sku_name            = "S0"

  public_network_access_enabled = true

  deployment = {
    "gpt-3.5-turbo" = {
      name          = "gpt-35-turbo"
      model_format  = "OpenAI"
      model_name    = "gpt-35-turbo"
      model_version = "1106"
      scale_type    = "Standard"
      capacity      = 1
    },
    "gpt-4-turbo" = {
      name          = "gpt-4-turbo"
      model_format  = "OpenAI"
      model_name    = "gpt-4"
      model_version = "1106-Preview"
      scale_type    = "Standard"
      capacity      = 1
    }
  }
}
Let's break down what I have done in this module:
source - The location of the module we'll be using
version - The version of the module we are going to use
location - The Azure region where the resource will be deployed. Please note that some models are not available in every region; check the Azure OpenAI model availability documentation
resource_group_name - The name of the existing resource group the resource will be deployed into
account_name - The name of the resource
sku_name - The SKU you want to apply to the account
public_network_access_enabled - Whether the resource will be publicly accessible
deployment - Where you specify the models you want to deploy
You will see I have two models configured for deployment. Here is a breakdown of one of them to explain how you configure it:
name - The name you want to give your model deployment; this can be anything
model_format - The format of the model; for these models it must be "OpenAI"
model_name - The name of the model, e.g. gpt-35-turbo or gpt-4
model_version - The version of the model you want to use. Specifying this pins a particular release, including preview or stable versions
scale_type - The scaling type for the deployment
capacity - Token capacity in thousands, so 1 equals 1,000 tokens per minute
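To make the deployment map clearer, here is a rough sketch of the kind of raw azurerm resources each entry corresponds to. This is illustrative only, not the module's actual internals; the resource names are placeholders, and the values mirror the gpt-3.5-turbo entry from the module block above:

```hcl
# Sketch: approximate raw-resource equivalent of one deployment entry.
# Resource labels ("openai", "gpt_35_turbo") are hypothetical.
resource "azurerm_cognitive_account" "openai" {
  name                = "open-ai"
  location            = "UK South"
  resource_group_name = "Resource Group Name"
  kind                = "OpenAI"
  sku_name            = "S0"

  public_network_access_enabled = true
}

resource "azurerm_cognitive_deployment" "gpt_35_turbo" {
  name                 = "gpt-35-turbo"
  cognitive_account_id = azurerm_cognitive_account.openai.id

  # model_format / model_name / model_version from the deployment map
  model {
    format  = "OpenAI"
    name    = "gpt-35-turbo"
    version = "1106"
  }

  # scale_type / capacity from the deployment map
  scale {
    type     = "Standard"
    capacity = 1
  }
}
```

The module's deployment map is essentially a convenience over one azurerm_cognitive_deployment per entry, keyed off a single cognitive account.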
Outputs
When this instance deploys, there are a few values you will want. You can get these from the portal, or output them from Terraform. I recommend storing the following in an Azure Key Vault:
module.openai-uks.openai_endpoint - This will be the endpoint address
module.openai-uks.openai_primary_key - This will be the primary key to authenticate with the instance
module.openai-uks.openai_secondary_key - This will be the secondary key to authenticate with the instance
These values will allow you to connect to the OpenAI APIs for development.
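As a sketch of that Key Vault approach, the snippet below writes the endpoint and primary key into an existing vault. The azurerm_key_vault.example reference is a placeholder for your own vault; the secret names are arbitrary:

```hcl
# Sketch: store module outputs as Key Vault secrets.
# azurerm_key_vault.example is assumed to exist elsewhere in your config.
resource "azurerm_key_vault_secret" "openai_endpoint" {
  name         = "openai-endpoint"
  value        = module.openai-uks.openai_endpoint
  key_vault_id = azurerm_key_vault.example.id
}

resource "azurerm_key_vault_secret" "openai_primary_key" {
  name         = "openai-primary-key"
  value        = module.openai-uks.openai_primary_key
  key_vault_id = azurerm_key_vault.example.id
}

# Optionally expose the endpoint directly; mark key outputs sensitive.
output "openai_endpoint" {
  value = module.openai-uks.openai_endpoint
}
```

Applications can then read these secrets from Key Vault rather than handling the keys directly in Terraform state consumers.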