A Complete Guide to Fine-Tuning GPT-3.5

OpenAI's GPT-3.5 is one of the most advanced language processing models available today. It can generate human-like text, answer questions, create conversational agents, provide tutoring in a range of subjects, translate languages, simulate characters for video games, and much more. However, while GPT-3.5 is incredibly powerful out of the box, there are scenarios where you might want to customize the model to better suit your specific application. This is where fine-tuning comes in.

OpenAI has recently announced the availability of fine-tuning for GPT-3.5 Turbo, with GPT-4 fine-tuning coming soon. This significant update allows developers to customize models to better suit their use cases and run these custom models at scale. Early tests indicate that a fine-tuned version of GPT-3.5 Turbo can match or even outperform base GPT-4 capabilities on specific narrow tasks. Importantly, data sent in and out of the fine-tuning API is owned by the customer and is not used by OpenAI or any other organization to train other models.

Fine-tuning Use Cases

Since the release of GPT-3.5 Turbo, developers and businesses have expressed a desire to customize the model to create unique and differentiated experiences for their users. With this launch, developers can now run supervised fine-tuning to optimize the model for their use cases. Fine-tuning has already helped improve model performance across common use cases, such as:

  • Improved Steerability: Fine-tuning helps the model follow instructions better, such as making outputs terse or always responding in a given language.
  • Reliable Output Formatting: Fine-tuning enhances the model's ability to consistently format responses, which is crucial for applications that demand a specific response format, such as code completion or composing API calls.
  • Custom Tone: Fine-tuning allows businesses to hone the qualitative feel of the model output, so it better fits the voice of their brand.

Fine-tuning also enables businesses to shorten their prompts while maintaining similar performance. Fine-tuning with GPT-3.5 Turbo can handle 4k tokens, double the capacity of previous fine-tuned models, and early testers have reduced prompt size by up to 90% by fine-tuning instructions into the model itself, thereby speeding up each API call and cutting costs.

What is Fine-Tuning?

Fine-tuning is the process of taking a pre-trained model (like GPT-3.5) and training it further on a smaller, custom dataset to adapt the model to a specific task or domain. Fine-tuning provides several benefits:

  1. Higher Quality Results: Fine-tuning can help the model generate higher quality results that are more aligned with the desired output.
  2. Token Savings: Fine-tuning can help reduce the number of tokens required to generate a response, which can lead to cost savings.
  3. Lower Latency Requests: Fine-tuning can help reduce the response time of the model.
  4. Ability to Train on More Examples: Fine-tuning allows you to train the model on more examples than can fit in a single prompt.

When to Use Fine-Tuning?

While GPT-3.5 is incredibly powerful and can generate high-quality results for a wide range of tasks, there are scenarios where fine-tuning can be beneficial:

  1. Setting the Style, Tone, or Format: If you want the model to generate text in a specific style, tone, or format, fine-tuning can help.
  2. Improving Reliability: If the model is generating incorrect or unreliable answers, fine-tuning can help improve its reliability.
  3. Correcting Failures: If the model is failing to generate any response or is generating nonsensical responses, fine-tuning can help correct these failures.
  4. Handling Edge Cases: If the model is struggling with edge cases, fine-tuning can help improve its performance.
  5. Performing New Skills or Tasks: If you want the model to perform a new skill or task that it was not originally trained for, fine-tuning can help.

How to Fine-Tune GPT-3.5?

Fine-tuning GPT-3.5 involves several steps. As a concrete example, suppose you want a fine-tuned model that generates SQL queries from natural language input. You would follow these steps:

  1. Prepare Your Data: Create a dataset of example conversations that include natural language questions or statements from the user and the corresponding SQL queries as responses from the assistant. For example:

    {
      "messages": [
        { "role": "system", "content": "You are a database assistant that converts natural language questions into SQL queries." },
        { "role": "user", "content": "What are the error messages that occurred more than 10 times?" },
        { "role": "assistant", "content": "SELECT error_message FROM errors_table WHERE occurrence > 10;" }
      ]
    }

    Make sure to include a diverse set of examples that cover all the different types of queries that you want the model to be able to generate.
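
    Each example is a single JSON object, and the training file as a whole is a JSONL file with one example per line. A minimal sketch of building such a file in Python (the filename and the example question are illustrative, not from the announcement):

    import json

    # Chat-format training examples: each example is a dict with a "messages" list
    examples = [
        {
            "messages": [
                {"role": "system", "content": "You are a database assistant that converts natural language questions into SQL queries."},
                {"role": "user", "content": "How many orders were placed in the last 7 days?"},
                {"role": "assistant", "content": "SELECT COUNT(*) FROM orders WHERE order_date >= DATE('now', '-7 days');"},
            ]
        },
    ]

    # Write one JSON object per line (JSONL), the format expected by the fine-tuning API
    with open("sql_training_data.jsonl", "w") as f:
        for example in examples:
            f.write(json.dumps(example) + "\n")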

  2. Upload Files: Upload your training data file to OpenAI.
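
    With the pre-1.0 openai Python package used in the examples below, a minimal sketch of uploading the file via the Files API (the filename is illustrative):

    import openai

    # Assumes openai.api_key is configured; purpose must be "fine-tune" for fine-tuning data
    upload = openai.File.create(
        file=open("sql_training_data.jsonl", "rb"),
        purpose="fine-tune",
    )
    print(upload["id"])  # this file ID is passed as training_file when creating the job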

  3. Create a Fine-tuning Job: Create a fine-tuning job using the OpenAI API.

import os
import openai

# Read the API key from the environment
openai.api_key = os.getenv("OPENAI_API_KEY")

# Start a fine-tuning job from the uploaded training file
openai.FineTuningJob.create(training_file="file-abc123", model="gpt-3.5-turbo")
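
Once the job completes, it produces the name of your fine-tuned model. A minimal sketch for checking on the job with the same SDK (the job ID shown is illustrative):

import openai

# Assumes openai.api_key is configured as above; "fine_tuned_model" is set once the job succeeds
job = openai.FineTuningJob.retrieve("ftjob-abc123")
print(job["status"], job["fine_tuned_model"])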

  4. Use a Fine-tuned Model: Use the fine-tuned model to generate SQL queries based on natural language input.

Here is how you can use the fine-tuned model in Python:

import openai

openai.api_key = 'your-api-key'

# Call the fine-tuned model by its full model name
response = openai.ChatCompletion.create(
    model="ft:gpt-3.5-turbo:your-org:custom_suffix:id",
    messages=[
        {"role": "system", "content": "You are a database assistant that converts natural language questions into SQL queries."},
        {"role": "user", "content": "What are the error messages that occurred more than 10 times?"},
    ],
)

# The generated SQL query is in the first choice's message content
print(response['choices'][0]['message']['content'])

This will print the SQL query generated by the model based on the user's question.

Make sure to replace 'your-api-key' with your actual OpenAI API key and ft:gpt-3.5-turbo:your-org:custom_suffix:id with the name of your fine-tuned model. You will also need to install the OpenAI Python package by running pip install openai.

Safety and Pricing

OpenAI is committed to deploying fine-tuning safely. To preserve the default model's safety features, fine-tuning training data is passed through OpenAI's Moderation API and a GPT-4-powered moderation system to detect unsafe training data that conflicts with OpenAI's safety standards.

The costs for fine-tuning are divided into two categories: the initial training cost and usage cost. The training cost is $0.008 per 1K tokens, the usage input cost is $0.012 per 1K tokens, and the usage output cost is $0.016 per 1K tokens. For example, a GPT-3.5 Turbo fine-tuning job with a training file of 100,000 tokens trained for 3 epochs would have an expected cost of $2.40.
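
A quick sketch of that arithmetic in Python, using the rates quoted above:

# Expected training cost = (training tokens / 1000) * epochs * rate per 1K training tokens
training_tokens = 100_000
epochs = 3
rate_per_1k = 0.008  # USD per 1K training tokens

cost = training_tokens / 1000 * epochs * rate_per_1k
print(f"Expected training cost: ${cost:.2f}")  # prints $2.40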

Updated GPT-3 Models

OpenAI has also announced that the original GPT-3 base models (ada, babbage, curie, and davinci) will be turned off on January 4th, 2024. As replacements, babbage-002 and davinci-002 are now available as either base or fine-tuned models. Customers can access these models by querying the Completions API.
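
A minimal sketch of querying one of these replacement models through the pre-1.0 openai Python package (the prompt is illustrative):

import openai

# Assumes openai.api_key is configured; babbage-002 and davinci-002 are served by the Completions API
response = openai.Completion.create(
    model="babbage-002",
    prompt="Translate 'good morning' into French:",
    max_tokens=16,
)
print(response['choices'][0]['text'])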

The new API endpoint /v1/fine_tuning/jobs offers pagination and more extensibility to support the future evolution of the fine-tuning API. Transitioning from /v1/fine-tunes to the updated endpoint is straightforward, and more details can be found in OpenAI's new fine-tuning guide. This deprecates the old /v1/fine-tunes endpoint, which will be turned off on January 4th, 2024.
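
A minimal sketch of listing jobs through the updated endpoint with the same Python SDK:

import openai

# GET /v1/fine_tuning/jobs supports pagination; limit sets the page size
jobs = openai.FineTuningJob.list(limit=10)
for job in jobs["data"]:
    print(job["id"], job["status"])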

The pricing for base and fine-tuned GPT-3 models is as follows:

  • babbage-002: $0.0004 per 1K tokens for input, output, and training tokens, and $0.0016 per 1K tokens for fine-tuned input and output tokens.
  • davinci-002: $0.002 per 1K tokens for input and output tokens, $0.006 per 1K tokens for training tokens, and $0.012 per 1K tokens for fine-tuned input and output tokens.

Conclusion

The availability of fine-tuning for GPT-3.5 Turbo marks a significant milestone for developers and businesses looking to create unique and differentiated experiences for their users. With the ability to customize models, improve steerability, ensure reliable output formatting, and maintain a custom tone, developers can build more robust and dynamic applications. Additionally, the safety measures implemented by OpenAI and the detailed pricing structure make it easier for developers to integrate fine-tuned models into their applications responsibly and cost-effectively.