es404020

Submitting a Fine-Tuning Job: Organising the Workforce

The Six Triple Eight relied on discipline and coordination to execute their mission. We’ll mirror this by creating and submitting a fine-tuning job, allowing the LLM to learn from our curated dataset.

Fine-Tuning with OpenAI

When you create a fine-tuning job via client.fine_tuning.jobs.create(), you submit your configuration and dataset to OpenAI for training. Below are the key parameters and their purposes.


1. Parameters Overview

model

  • Description: The pre-trained GPT model you wish to fine-tune.
  • Examples: "gpt-3.5-turbo", "gpt-4o-mini", "davinci-002".

training_file

  • Description: The file ID of an uploaded JSONL file containing your training data.
  • Note: Obtain this ID by uploading your dataset with the Files API and storing the file_id.
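As a minimal sketch of that upload step: the snippet below writes a tiny chat-format JSONL file, sanity-checks it, and then uploads it via the Files API. The file name and training example are illustrative assumptions, and the upload itself only runs when an API key is present.

```python
# Sketch: prepare and validate a JSONL training file, then upload it.
import json
import os

# One illustrative chat-format training example (assumed content).
examples = [
    {"messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What was the Six Triple Eight?"},
        {"role": "assistant", "content": "The 6888th Central Postal Directory Battalion."},
    ]}
]

with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

# Sanity-check: every line must be valid JSON with a "messages" list.
with open("train.jsonl") as f:
    for line in f:
        record = json.loads(line)
        assert isinstance(record["messages"], list)

# The upload requires credentials, so it is guarded here.
if os.environ.get("OPENAI_API_KEY"):
    from openai import OpenAI
    client = OpenAI()
    upload = client.files.create(file=open("train.jsonl", "rb"), purpose="fine-tune")
    print(upload.id)  # this is the file_id to pass as training_file
```

Store the printed `upload.id`; it is the value the `training_file` parameter expects.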

hyperparameters

  • Description: A dictionary specifying the fine-tuning hyperparameters.
  • Key Fields:
    • batch_size: Number of examples per batch (auto by default).
    • learning_rate_multiplier: Scale factor for the learning rate (auto by default).
    • n_epochs: Number of epochs (passes through the entire dataset).

suffix

  • Description: A custom string (up to 18 characters) appended to the fine-tuned model name.

seed

  • Description: Integer for reproducibility.
  • Usage: Ensures the same randomization and consistent training results across runs.

validation_file

  • Description: The file ID of a JSONL file containing your validation set.
  • Optional: But recommended for tracking overfitting and ensuring a well-generalized model.

integrations

  • Description: A list of integrations (e.g., Weights & Biases) you want enabled for the job.
  • Fields: Typically includes type and integration-specific configurations.
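As a sketch of what one integrations entry looks like, here is the shape of a Weights & Biases entry: a `type` field plus a configuration block keyed by that type. The project name is an assumption for illustration.

```python
# Sketch: the shape of a single Weights & Biases integration entry,
# as passed in the `integrations` list of a fine-tuning job.
wandb_integration = {
    "type": "wandb",                                   # which integration to enable
    "wandb": {"project": "six-triple-eight-finetune"}  # project name is an assumption
}

# The job parameter takes a list of such entries.
integrations = [wandb_integration]
print(integrations)
```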

job = client.fine_tuning.jobs.create(
    model="gpt-3.5-turbo",
    training_file="train_id",
    hyperparameters={
        "n_epochs": 1
    },
    validation_file="val_id"
)

Managing Fine-Tuning Jobs

List Jobs

Retrieves the most recent fine-tuning jobs (up to 10 here).

client.fine_tuning.jobs.list(limit=10)


Retrieve a Specific Job

client.fine_tuning.jobs.retrieve("job_id")



List Events for a Job

client.fine_tuning.jobs.list_events(
    fine_tuning_job_id="xxxx",
    limit=5
)
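The retrieval calls above can be combined into a simple polling loop that waits for a job to finish. In this sketch the status fetcher is injected as a callable so the demo runs without a live API call; in real use it would wrap client.fine_tuning.jobs.retrieve(job_id).status.

```python
# Sketch: generic polling loop for watching a fine-tuning job.
import time

# Terminal job statuses reported by the fine-tuning API.
TERMINAL = {"succeeded", "failed", "cancelled"}

def poll_until_done(fetch_status, interval=0.0, max_polls=100):
    """Poll until the job reaches a terminal status; return that status."""
    for _ in range(max_polls):
        status = fetch_status()
        if status in TERMINAL:
            return status
        time.sleep(interval)
    raise TimeoutError("job did not finish within max_polls")

# Demo with a fake status sequence instead of a live API call.
statuses = iter(["validating_files", "running", "succeeded"])
result = poll_until_done(lambda: next(statuses))
print(result)  # succeeded
```

With a real client, the fetcher would be `lambda: client.fine_tuning.jobs.retrieve(job_id).status` and a longer `interval` (e.g., 30 seconds) to avoid hammering the API.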

Summary

  • Model Selection: Choose a suitable GPT model to fine-tune.

  • Data Preparation: Upload JSONL files and note their IDs.

  • Hyperparameters: Tune batch size, learning rate, and epochs for optimal performance.

  • Monitoring: Use validation files, job retrieval, and event logging to ensure your model trains effectively.

  • Reproducibility: Set a seed if consistent results are important for your workflow.

By following these steps, you’ll have a clear path to submitting and managing your fine-tuning jobs in OpenAI, ensuring your model is trained precisely on your custom data.
