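Data Augmentation in Python: Applying Random Affine Transforms to an Original Dataset and Its Annotation Files, and Generating the Transformed Dataset and Annotation Files - 摩杜云开发者社区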

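Preface
Prerequisites
Related Background
Environment
Generating the Randomly Affine-Transformed Dataset and Annotation Files

Code Implementation
Output Results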

Preface

My knowledge is limited, so errors and omissions are likely; corrections are welcome.

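Prerequisites

Familiarity with Python

Environment

Python 3.x (an object-oriented, high-level language). The code below also imports cv2 (OpenCV) and numpy, so both packages must be installed.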

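Generating the Randomly Affine-Transformed Dataset and Annotation Files

Background: an already annotated dataset is augmented by warping each image with a random affine transform. The bounding boxes are transformed with the same matrix, so the annotations stay aligned with the augmented images.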
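
To make the idea concrete, here is a minimal sketch of how a single affine matrix moves a point. It uses only the standard library. The helper names affine_matrix and apply_affine are hypothetical and do not appear in the script in this article. The rotation follows the standard mathematical sign convention, which differs from OpenCV's getRotationMatrix2D in the sign of the sine terms.

```python
import math


def affine_matrix(angle_deg, scale, tx, ty):
    """Build a 2x3 affine matrix: rotate about the origin, scale uniformly, then translate."""
    a = math.radians(angle_deg)
    c = scale * math.cos(a)
    s = scale * math.sin(a)
    return [[c, -s, tx],
            [s, c, ty]]


def apply_affine(m, x, y):
    """Map the point (x, y) through the 2x3 matrix m."""
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])


if __name__ == "__main__":
    m = affine_matrix(angle_deg=10, scale=1.1, tx=20, ty=-5)
    # A box corner before the transform, printed after the transform.
    print(apply_affine(m, 161, 152))
```

Because every pixel and every box corner is moved by the same matrix, the labels remain consistent with the warped image.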

Example directory structure

images: the folder containing the original images.

jsons: the folder containing the original Labelme annotation files.

Labelme annotation for cat.png:

{
  "version": "5.2.0.post4",
  "flags": {},
  "shapes": [
    {
      "label": "cat",
      "points": [
        [
          161.0612244897959,
          152.265306122449
        ],
        [
          610.0408163265306,
          399.7142857142857
        ]
      ],
      "group_id": null,
      "description": "",
      "shape_type": "rectangle",
      "flags": {}
    }
  ],
  "imagePath": "cat.png",
  "imageData": null,
  "imageHeight": 478,
  "imageWidth": 766
}

Labelme annotation for flower.png:

{
  "version": "5.2.0.post4",
  "flags": {},
  "shapes": [
    {
      "label": "flower",
      "points": [
        [
          301.9230769230769,
          19.52747252747254
        ],
        [
          452.4725274725275,
          168.42857142857144
        ]
      ],
      "group_id": null,
      "description": "",
      "shape_type": "rectangle",
      "flags": {}
    },
    {
      "label": "flower",
      "points": [
        [
          378.2967032967033,
          183.81318681318683
        ],
        [
          529.3956043956044,
          364.032967032967
        ]
      ],
      "group_id": null,
      "description": null,
      "shape_type": "rectangle",
      "flags": {}
    }
  ],
  "imagePath": "flower.png",
  "imageData": null,
  "imageHeight": 394,
  "imageWidth": 850
}

Labelme annotation for swan.png:

{
  "version": "5.2.0.post4",
  "flags": {},
  "shapes": [
    {
      "label": "swan",
      "points": [
        [
          147.76178010471205,
          212.01570680628274
        ],
        [
          294.88219895287955,
          476.93717277486905
        ]
      ],
      "group_id": null,
      "description": "",
      "shape_type": "rectangle",
      "flags": {}
    },
    {
      "label": "swan",
      "points": [
        [
          271.8455497382199,
          243.42931937172776
        ],
        [
          342.0026178010471,
          322.4869109947644
        ]
      ],
      "group_id": null,
      "description": "",
      "shape_type": "rectangle",
      "flags": {}
    },
    {
      "label": "swan",
      "points": [
        [
          305.35340314136124,
          215.6806282722513
        ],
        [
          394.3586387434555,
          421.4397905759162
        ]
      ],
      "group_id": null,
      "description": "",
      "shape_type": "rectangle",
      "flags": {}
    },
    {
      "label": "swan",
      "points": [
        [
          549.8560209424083,
          202.59162303664922
        ],
        [
          655.0916230366491,
          345.52356020942403
        ]
      ],
      "group_id": null,
      "description": "",
      "shape_type": "rectangle",
      "flags": {}
    }
  ],
  "imagePath": "swan.png",
  "imageData": null,
  "imageHeight": 490,
  "imageWidth": 795
}
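
The annotation files above store each rectangle as two corner points, and the two points are not guaranteed to be ordered. As a rough sketch of how such a file can be turned into [label, xmin, ymin, xmax, ymax] rows, the snippet below parses a trimmed copy of the cat annotation from a string. The function name labelme_rectangles is hypothetical, and only the fields shown above are assumed to exist.

```python
import json

CAT_JSON = """
{
  "shapes": [
    {
      "label": "cat",
      "points": [[161.0612244897959, 152.265306122449],
                 [610.0408163265306, 399.7142857142857]],
      "shape_type": "rectangle"
    }
  ],
  "imagePath": "cat.png",
  "imageHeight": 478,
  "imageWidth": 766
}
"""


def labelme_rectangles(annotation):
    """Return [label, xmin, ymin, xmax, ymax] for every rectangle shape."""
    rows = []
    for shape in annotation["shapes"]:
        if shape.get("shape_type") != "rectangle":
            continue  # this sketch handles rectangles only
        (x1, y1), (x2, y2) = shape["points"]
        rows.append([shape["label"],
                     int(min(x1, x2)), int(min(y1, y2)),
                     int(max(x1, x2)), int(max(y1, y2))])
    return rows


if __name__ == "__main__":
    print(labelme_rectangles(json.loads(CAT_JSON)))
```

Taking the minimum and maximum per axis makes the result independent of the order in which the two corners were drawn.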

Code Implementation

import os
import cv2
import math
import json
import random
import numpy as np

def random_perspective(im, targets=(), segments=(), degrees=10, translate=.1, scale=.1, shear=10, perspective=0.0,
                       border=(0, 0)):
    # torchvision.transforms.RandomAffine(degrees=(-10, 10), translate=(0.1, 0.1), scale=(0.9, 1.1), shear=(-10, 10))
    # targets = [cls, xyxy]

    height = im.shape[0] + border[0] * 2  # shape(h,w,c)
    width = im.shape[1] + border[1] * 2

    # Center
    C = np.eye(3)
    C[0, 2] = -im.shape[1] / 2  # x translation (pixels)
    C[1, 2] = -im.shape[0] / 2  # y translation (pixels)

    # Perspective
    P = np.eye(3)
    P[2, 0] = random.uniform(-perspective, perspective)  # x perspective (about y)
    P[2, 1] = random.uniform(-perspective, perspective)  # y perspective (about x)

    # Rotation and Scale
    R = np.eye(3)
    a = random.uniform(-degrees, degrees)
    # a += random.choice([-180, -90, 0, 90])  # add 90deg rotations to small rotations
    s = random.uniform(1 - scale, 1 + scale)
    # s = 2 ** random.uniform(-scale, scale)
    R[:2] = cv2.getRotationMatrix2D(angle=a, center=(0, 0), scale=s)

    # Shear
    S = np.eye(3)
    S[0, 1] = math.tan(random.uniform(-shear, shear) * math.pi / 180)  # x shear (deg)
    S[1, 0] = math.tan(random.uniform(-shear, shear) * math.pi / 180)  # y shear (deg)

    # Translation
    T = np.eye(3)
    T[0, 2] = random.uniform(0.5 - translate, 0.5 + translate) * width  # x translation (pixels)
    T[1, 2] = random.uniform(0.5 - translate, 0.5 + translate) * height  # y translation (pixels)

    # Combined rotation matrix
    M = T @ S @ R @ P @ C  # order of operations (right to left) is IMPORTANT
    if (border[0] != 0) or (border[1] != 0) or (M != np.eye(3)).any():  # image changed
        if perspective:
            im = cv2.warpPerspective(im, M, dsize=(width, height), borderValue=(114, 114, 114))
        else:  # affine
            im = cv2.warpAffine(im, M[:2], dsize=(width, height), borderValue=(114, 114, 114))

    # Visualize
    # import matplotlib.pyplot as plt
    # ax = plt.subplots(1, 2, figsize=(12, 6))[1].ravel()
    # ax[0].imshow(im[:, :, ::-1])  # base
    # ax[1].imshow(im2[:, :, ::-1])  # warped

    # Transform label coordinates
    n = len(targets)
    if n:
        use_segments = any(x.any() for x in segments)
        new = np.zeros((n, 4))

        # warp boxes
        xy = np.ones((n * 4, 3))
        xy[:, :2] = targets[:, [1, 2, 3, 4, 1, 4, 3, 2]].reshape(n * 4, 2)  # x1y1, x2y2, x1y2, x2y1
        xy = xy @ M.T  # transform
        xy = (xy[:, :2] / xy[:, 2:3] if perspective else xy[:, :2]).reshape(n, 8)  # perspective rescale or affine

        # create new boxes
        x = xy[:, [0, 2, 4, 6]]
        y = xy[:, [1, 3, 5, 7]]
        new = np.concatenate((x.min(1), y.min(1), x.max(1), y.max(1))).reshape(4, n).T

        # clip
        new[:, [0, 2]] = new[:, [0, 2]].clip(0, width)
        new[:, [1, 3]] = new[:, [1, 3]].clip(0, height)

        # filter candidates
        i = box_candidates(box1=targets[:, 1:5].T * s, box2=new.T, area_thr=0.01 if use_segments else 0.10)
        targets = targets[i]
        targets[:, 1:5] = new[i]

    return im, targets


def box_candidates(box1, box2, wh_thr=2, ar_thr=100, area_thr=0.1, eps=1e-16):  # box1(4,n), box2(4,n)
    # Compute candidate boxes: box1 before augment, box2 after augment, wh_thr (pixels), aspect_ratio_thr, area_ratio
    w1, h1 = box1[2] - box1[0], box1[3] - box1[1]
    w2, h2 = box2[2] - box2[0], box2[3] - box2[1]
    ar = np.maximum(w2 / (h2 + eps), h2 / (w2 + eps))  # aspect ratio
    return (w2 > wh_thr) & (h2 > wh_thr) & (w2 * h2 / (w1 * h1 + eps) > area_thr) & (ar < ar_thr)  # candidates


# Display an image in a window
def show(name, img):
    cv2.namedWindow(name, 0)  # create a window with the given name; 0 is cv2.WINDOW_NORMAL (resizable)
    # cv2.resizeWindow(name, img.shape[1], img.shape[0])  # resize the window to the image's width and height
    cv2.imshow(name, img)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

def xyxy2xminyminxmaxymax(rect):
    '''
    (x1,y1,x2,y2)  -> (xmin,ymin,xmax,ymax)
    '''
    xmin = min(rect[0],rect[2])
    ymin = min(rect[1],rect[3])
    xmax = max(rect[0],rect[2])
    ymax = max(rect[1],rect[3])
    return xmin,ymin,xmax,ymax

def read_img_json(in_img_path,in_json_path):

    # Map class names to integer class ids
    label_map = {'cat':0,'flower':1,'swan':2}

    img = cv2.imread(in_img_path)
    if img is None:  # cv2.imread returns None when the file cannot be read
        raise FileNotFoundError('Cannot read image: ' + in_img_path)

    with open(in_json_path, "r", encoding='utf-8') as f:
        # Load the Labelme annotation into json_data
        json_data = json.load(f)
    labels = []
    # print(json_data['shapes'])
    # Each shape in the original JSON stores its rectangle as [[x1,y1],[x2,y2]]
    for i in json_data['shapes']:
        label = label_map[i['label']]
        rect = int(i['points'][0][0]),int(i['points'][0][1]),int(i['points'][1][0]),int(i['points'][1][1]) # x1,y1,x2,y2
        xmin,ymin,xmax,ymax = xyxy2xminyminxmaxymax(rect)
        labels.append([label,xmin,ymin,xmax,ymax])
    return img, np.array(labels)

def write_img_json(img_array,img_targets,out_img_name,out_img_path,out_json_path):

    json_dict = {
                    "version": "4.5.6",
                    "flags": {},
                    "shapes": [],
                }

    # Map integer class ids back to class names
    label_map = {0:'cat',1:'flower',2:'swan'}

    cv2.imwrite(out_img_path,img_array)
    new_img_height,new_img_width = img_array.shape[0],img_array.shape[1]

    for i in img_targets:
        label = label_map[int(i[0])]
        box = i[1:]
        shapes_dict = {'label': '', 
                'points': [], # [[x1,y1],[x2,y2]]
                'group_id': None, 
                'shape_type': 'rectangle', 
                'flags': {}}
        shapes_dict['label'] = label
        # json.dumps() does not serialize NumPy integers such as numpy.int32 by default.
        # Convert them to built-in Python numbers with int() or float() before serializing,
        # rather than unpacking them directly with: x1,y1,x2,y2 = box
        x1,y1,x2,y2 = int(box[0]),int(box[1]),int(box[2]),int(box[3])
        shapes_dict['points'] = [[x1,y1],[x2,y2]]
        json_dict['shapes'].append(shapes_dict)
    
    # Fill in the remaining fields of the new JSON file
    json_dict["imagePath"] = out_img_name
    json_dict["imageData"] = None
    json_dict["imageHeight"] = new_img_height
    json_dict["imageWidth"] = new_img_width
    
    # Open the output file for writing
    with open(out_json_path, "w", encoding='utf-8') as f:
        # Write the new annotation data to the file
        f.write(json.dumps(json_dict))



if __name__=="__main__":
    # Folder for the output images
    out_imgs_dir  = 'out_images/'
    # Folder for the output JSON files
    out_jsons_dir = 'out_jsons/'
    os.makedirs(out_imgs_dir, exist_ok=True)
    os.makedirs(out_jsons_dir, exist_ok=True)

    # Folder containing the input images
    in_imgs_dir  = 'images/'
    # Folder containing the input JSON files
    in_jsons_dir = 'jsons/'
    # Input image names, sorted because os.listdir() returns entries in arbitrary order
    file_name_list = os.listdir(in_imgs_dir)
    img_name_list = sorted(i for i in file_name_list if i.endswith('.png'))
    # Input JSON file names, sorted the same way so that zip() pairs matching files
    # (this assumes each image and its JSON file share a base name, e.g. cat.png and cat.json)
    file_name_list = os.listdir(in_jsons_dir)
    json_name_list = sorted(i for i in file_name_list if i.endswith('.json'))
    # print(img_name_list,json_name_list)

    # Left/right padding for cropping (not used below)
    pad = 0

    for img_name,json_name in zip(img_name_list,json_name_list):
        in_img_path = os.path.join(in_imgs_dir,img_name)
        out_img_path = os.path.join(out_imgs_dir,img_name)
        in_json_path = os.path.join(in_jsons_dir,json_name)
        out_json_path = os.path.join(out_jsons_dir,json_name)
        # Original image and annotations: labels = [[label,xmin,ymin,xmax,ymax]]
        img,labels = read_img_json(in_img_path,in_json_path)
        # print(img,labels)

        # Image and annotations after the random affine transform: targets = [[label,xmin,ymin,xmax,ymax]]
        img_res,targets = random_perspective(img,labels)
        # print(img_res,targets)

        write_img_json(img_res,targets,img_name,out_img_path,out_json_path)
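
The box update inside random_perspective can be easier to follow in isolation. The sketch below assumes NumPy is installed and uses a hypothetical helper name, warp_box. It maps the four corners of one box through a 3x3 matrix and takes the axis-aligned box that encloses them. It covers the affine case only and omits the clipping and filtering steps.

```python
import numpy as np


def warp_box(box, M):
    """Warp an (xmin, ymin, xmax, ymax) box with a 3x3 affine matrix M."""
    x1, y1, x2, y2 = box
    # Four corners in homogeneous coordinates, one row per corner.
    corners = np.array([[x1, y1, 1.0],
                        [x2, y2, 1.0],
                        [x1, y2, 1.0],
                        [x2, y1, 1.0]])
    warped = corners @ M.T  # same product the script uses: xy @ M.T
    xs, ys = warped[:, 0], warped[:, 1]
    # The enclosing axis-aligned box of the warped corners.
    return float(xs.min()), float(ys.min()), float(xs.max()), float(ys.max())


if __name__ == "__main__":
    # Pure translation by (+5, -5) pixels.
    T = np.eye(3)
    T[0, 2], T[1, 2] = 5, -5
    print(warp_box((10, 20, 30, 40), T))
```

After a rotation or shear, the enclosing box is generally looser than the original one, which is an inherent limitation of warping axis-aligned boxes.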

Output Results

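out_images: the folder containing the images after the random affine transform.

out_jsons: the folder containing the Labelme annotation files after the random affine transform.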
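
One way to sanity-check the generated files is to confirm that every box lies inside its image, since the script clips boxes to the image bounds. The following sketch is not part of the original script. The name find_out_of_bounds is hypothetical, and it expects the Labelme fields that write_img_json emits.

```python
import glob
import json
import os


def find_out_of_bounds(annotation):
    """Return the shapes that have a corner point outside the image."""
    width = annotation["imageWidth"]
    height = annotation["imageHeight"]
    bad = []
    for shape in annotation["shapes"]:
        for x, y in shape["points"]:
            if not (0 <= x <= width and 0 <= y <= height):
                bad.append(shape)
                break  # report each shape once
    return bad


if __name__ == "__main__":
    # 'out_jsons' is the output folder used by the script above.
    for path in sorted(glob.glob(os.path.join("out_jsons", "*.json"))):
        with open(path, "r", encoding="utf-8") as f:
            problems = find_out_of_bounds(json.load(f))
        status = "OK" if not problems else str(len(problems)) + " box(es) out of bounds"
        print(path, status)
```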

Labelme annotation for cat.png after the random affine transform:

{
    "version": "4.5.6",
    "flags": {},
    "shapes": [
        {
            "label": "cat",
            "points": [
                [
                    211,
                    94
                ],
                [
                    700,
                    374
                ]
            ],
            "group_id": null,
            "shape_type": "rectangle",
            "flags": {}
        }
    ],
    "imagePath": "cat.png",
    "imageData": null,
    "imageHeight": 478,
    "imageWidth": 766
}

Labelme annotation for flower.png after the random affine transform:

{
    "version": "4.5.6",
    "flags": {},
    "shapes": [
        {
            "label": "flower",
            "points": [
                [
                    266,
                    49
                ],
                [
                    421,
                    193
                ]
            ],
            "group_id": null,
            "shape_type": "rectangle",
            "flags": {}
        },
        {
            "label": "flower",
            "points": [
                [
                    355,
                    205
                ],
                [
                    515,
                    379
                ]
            ],
            "group_id": null,
            "shape_type": "rectangle",
            "flags": {}
        }
    ],
    "imagePath": "flower.png",
    "imageData": null,
    "imageHeight": 394,
    "imageWidth": 850
}

Labelme annotation for swan.png after the random affine transform:

{
    "version": "4.5.6",
    "flags": {},
    "shapes": [
        {
            "label": "swan",
            "points": [
                [
                    202,
                    135
                ],
                [
                    363,
                    417
                ]
            ],
            "group_id": null,
            "shape_type": "rectangle",
            "flags": {}
        },
        {
            "label": "swan",
            "points": [
                [
                    329,
                    182
                ],
                [
                    404,
                    269
                ]
            ],
            "group_id": null,
            "shape_type": "rectangle",
            "flags": {}
        },
        {
            "label": "swan",
            "points": [
                [
                    362,
                    158
                ],
                [
                    461,
                    374
                ]
            ],
            "group_id": null,
            "shape_type": "rectangle",
            "flags": {}
        },
        {
            "label": "swan",
            "points": [
                [
                    607,
                    175
                ],
                [
                    721,
                    331
                ]
            ],
            "group_id": null,
            "shape_type": "rectangle",
            "flags": {}
        }
    ],
    "imagePath": "swan.png",
    "imageData": null,
    "imageHeight": 490,
    "imageWidth": 795
}
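
As a small worked example of the area criterion in box_candidates, the sketch below compares the cat box before and after augmentation, using the integer coordinates shown in this article. It is a simplification: the function name area_ratio is hypothetical, and the scale factor s that the script applies to the original box before the comparison is ignored here.

```python
def area_ratio(before, after):
    """Ratio of the augmented box area to the original box area (xmin, ymin, xmax, ymax)."""
    w1, h1 = before[2] - before[0], before[3] - before[1]
    w2, h2 = after[2] - after[0], after[3] - after[1]
    return (w2 * h2) / (w1 * h1)


if __name__ == "__main__":
    before = (161, 152, 610, 399)  # cat box from the input annotation, truncated to int
    after = (211, 94, 700, 374)    # cat box from the output annotation
    ratio = area_ratio(before, after)
    # box_candidates keeps a box only if this ratio exceeds area_thr (0.1 by default).
    print("area ratio = %.3f, kept = %s" % (ratio, ratio > 0.1))
```

Boxes that shrink below the threshold, for example because most of the object was pushed outside the image and clipped away, are dropped from the output annotation.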