Importing YOLO Datasets Into Label Studio: A Practical Guide

Nov 14, 2025 by Alex Braham

So, you've got a dataset annotated in YOLO format and you're looking to bring it into Label Studio? Great choice! Label Studio is a fantastic tool for data labeling, and bringing in existing YOLO data is a common need. This guide walks you through the process step by step: organizing the dataset, converting the annotations to JSON, and setting up a project with a matching labeling interface. Whether you're refining existing annotations or adding new ones, getting the import right saves you from redoing work later. So, buckle up, and let's get started!

Understanding the Basics

Before we jump into the nitty-gritty, let's make sure we're all on the same page. YOLO (You Only Look Once) is a popular family of object detection models, and the annotation format that goes with it stores labels in .txt files. Each file corresponds to an image and contains bounding box coordinates along with class labels. Label Studio, on the other hand, is a powerful data labeling platform that accepts tasks in formats such as JSON and CSV. The key is to transform your YOLO annotations into a structure that Label Studio understands. This transformation is crucial because Label Studio needs structured data, including which image each box belongs to and which label it carries, to properly display and manage your annotations.

Preparing Your YOLO Dataset

First things first, you need to organize your YOLO dataset. Typically, a YOLO dataset consists of two main components: image files and annotation files. Each image should have a corresponding annotation file with the same name (e.g., image1.jpg and image1.txt). The annotation files contain the bounding box coordinates and class labels for each object in the image. Ensure that your dataset follows this structure; otherwise, the import process might get messy. Check that your images are in a common format like .jpg or .png, and your annotation files are plain text files with the .txt extension. Each line in the annotation file should represent one object, with the format: class_id center_x center_y width height. Remember, these coordinates are normalized relative to the image width and height, so they should be between 0 and 1.
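
To make the line format concrete, here is a minimal sketch that parses one YOLO line, checks that the values are normalized, and converts the box to pixel coordinates. The function name parse_yolo_line and the 640x480 image size are assumptions made for illustration; they are not part of any library.

```python
def parse_yolo_line(line, image_width, image_height):
    """Parse one 'class_id center_x center_y width height' line.

    Hypothetical helper for illustration. Returns the class ID and the
    box in pixels as (left, top, width, height).
    """
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"Expected 5 fields, got {len(parts)}: {line!r}")

    class_id = int(parts[0])
    center_x, center_y, width, height = (float(p) for p in parts[1:])

    # YOLO coordinates are normalized, so each value must lie in [0, 1]
    for value in (center_x, center_y, width, height):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"Coordinate out of range [0, 1]: {value}")

    # Scale to pixels, then move from the box center to its top-left corner
    box_width = width * image_width
    box_height = height * image_height
    left = center_x * image_width - box_width / 2
    top = center_y * image_height - box_height / 2
    return class_id, (left, top, box_width, box_height)


# A box centered in a 640x480 image, a quarter as wide and half as tall
class_id, box = parse_yolo_line("0 0.5 0.5 0.25 0.5", 640, 480)
print(class_id, box)  # 0 (240.0, 120.0, 160.0, 240.0)
```

Running a check like this over every annotation file before converting is a cheap way to catch lines with missing fields or pixel values that were never normalized.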

Converting YOLO to JSON

Label Studio prefers data in JSON format. Since YOLO annotations are in .txt files, you'll need to convert them. This conversion involves reading the .txt files, parsing the data, and then structuring it into a JSON format that Label Studio can ingest. You can achieve this using a Python script. Here's a basic example:

import json
import os


def convert_yolo_to_json(image_dir, annotation_dir, output_json):
    """Collect YOLO annotations into one JSON file, one entry per image."""
    data = []
    for filename in sorted(os.listdir(annotation_dir)):
        if not filename.endswith('.txt'):
            continue

        # Find the matching image: try JPG first, then PNG
        stem = os.path.splitext(filename)[0]
        image_path = None
        for extension in ('.jpg', '.png'):
            candidate = os.path.join(image_dir, stem + extension)
            if os.path.exists(candidate):
                image_path = candidate
                break
        if image_path is None:
            print(f"Image not found for annotation: {filename}")
            continue

        annotations = []
        with open(os.path.join(annotation_dir, filename), 'r') as f:
            for line_number, line in enumerate(f, start=1):
                parts = line.split()
                if not parts:
                    continue  # Skip blank lines, e.g. a trailing newline
                if len(parts) != 5:
                    print(f"Skipping malformed line {line_number} in {filename}")
                    continue
                class_id, center_x, center_y, width, height = map(float, parts)
                annotations.append({
                    'class_id': int(class_id),
                    'center_x': center_x,
                    'center_y': center_y,
                    'width': width,
                    'height': height
                })

        data.append({
            'image': image_path,
            'annotations': annotations
        })

    with open(output_json, 'w') as f:
        json.dump(data, f, indent=4)


# Example usage
image_dir = 'path/to/your/images'
annotation_dir = 'path/to/your/annotations'
output_json = 'output.json'
convert_yolo_to_json(image_dir, annotation_dir, output_json)

This script reads the YOLO .txt files and gathers them into a single JSON file that holds the image path and box details for each image. Make sure to replace 'path/to/your/images' and 'path/to/your/annotations' with the actual paths to your image and annotation directories. Keep in mind that output.json is an intermediate file: it still stores YOLO-style class IDs and normalized center coordinates, so the boxes need to be reshaped into Label Studio's own task structure before they can show up as pre-annotations.
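
If you want the boxes to appear as pre-drawn regions, each entry in output.json needs one more reshaping step. The sketch below shows one way to do it, assuming Label Studio's JSON task layout with a data object and a predictions list, where a rectangle is described by its top-left corner and size as percentages from 0 to 100. The function name to_label_studio_task, the class_names list, and the from_name/to_name values "label" and "image" are assumptions; they have to match your own classes and labeling configuration.

```python
import json


def to_label_studio_task(entry, class_names, from_name="label", to_name="image"):
    """Turn one entry from output.json into a Label Studio task dict.

    Hypothetical helper: `entry` has the 'image' and 'annotations' keys
    written by convert_yolo_to_json, and class_names[i] is the label
    for YOLO class ID i.
    """
    results = []
    for box in entry["annotations"]:
        # YOLO stores a normalized center and size; Label Studio expects
        # the top-left corner and size in percent of the image.
        x = (box["center_x"] - box["width"] / 2) * 100
        y = (box["center_y"] - box["height"] / 2) * 100
        results.append({
            "from_name": from_name,
            "to_name": to_name,
            "type": "rectanglelabels",
            "value": {
                "x": x,
                "y": y,
                "width": box["width"] * 100,
                "height": box["height"] * 100,
                "rotation": 0,
                "rectanglelabels": [class_names[box["class_id"]]],
            },
        })
    return {
        "data": {"image": entry["image"]},
        "predictions": [{"result": results}],
    }


# Example with one in-memory entry instead of reading output.json
entry = {
    "image": "path/to/your/images/image1.jpg",
    "annotations": [
        {"class_id": 1, "center_x": 0.5, "center_y": 0.5, "width": 0.25, "height": 0.5}
    ],
}
task = to_label_studio_task(entry, class_names=["Class1", "Class2"])
print(json.dumps(task, indent=2))
```

Two caveats apply to this sketch. The image value is still a local file path, and Label Studio generally loads images by URL, so the path usually has to be swapped for something the server can serve. Results exported from Label Studio also carry original_width and original_height fields; they are left out here because the sketch never opens the image files.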

Setting Up Label Studio

Now that you have your data in JSON format, it's time to set up Label Studio. If you haven't already, install Label Studio using pip: pip install label-studio. Once installed, you can start Label Studio by running label-studio in your terminal. This will open Label Studio in your web browser, typically at http://localhost:8080. Create a new project and give it a meaningful name. This project will house your YOLO dataset and annotations.

Configuring the Labeling Interface

Next, you need to configure the labeling interface. Label Studio uses XML-like labeling configurations to define how your data is displayed and how annotations are created. For object detection tasks, you'll need to use the <RectangleLabels> tag. Here’s a simple example:

<View>
  <Image name="image" value="$image"/>
  <RectangleLabels name="label" toName="image">
    <Label value="Class1" background="green"/>
    <Label value="Class2" background="blue"/>
    <!-- Add more classes as needed -->
  </RectangleLabels>
</View>

In this configuration, <Image> displays the image, and <RectangleLabels> allows you to draw bounding boxes and assign labels to them. **Replace `