zerohertzLib.vision ¶

Vision

Functions and classes for handling and visualizing images

Important

Bbox types

cwh: bbox in the form [cx, cy, w, h] ([4] or [N, 4])
xyxy: bbox in the form [x0, y0, x1, y1] ([4] or [N, 4])
poly: bbox in the form [[x0, y0], [x1, y1], [x2, y2], [x3, y3]] ([4, 2] or [N, 4, 2])
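
The relationship between the three bbox types can be sketched in plain Python for a single box. The helper names below are hypothetical and are not the library's own `cwh2xyxy` / `xyxy2poly`; the corner order of the polygon (clockwise from the top-left corner) is an assumption.

```python
def cwh_to_xyxy(box):
    """Convert one [cx, cy, w, h] box to [x0, y0, x1, y1]."""
    cx, cy, w, h = box
    return [cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2]


def xyxy_to_poly(box):
    """Convert one [x0, y0, x1, y1] box to four corner points.

    Assumed order: top-left, top-right, bottom-right, bottom-left.
    """
    x0, y0, x1, y1 = box
    return [[x0, y0], [x1, y0], [x1, y1], [x0, y1]]


xyxy = cwh_to_xyxy([10, 20, 4, 6])
poly = xyxy_to_poly(xyxy)
```

A batch of N boxes ([N, 4]) would apply the same arithmetic row by row.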

Modules:

Name	Description
`cli`
`compare`
`convert`
`data`
`eval`
`gif`
`loader`
`transform`
`util`
`visual`

Classes:

Name	Description
`CocoLoader`	Class that reads and visualizes a COCO-format dataset
`ImageLoader`	Class that returns images from a path, given the path and the number of images per call
`JsonImageLoader`	Class that loads images together with the information stored in JSON files
`LabelStudio`	Class for handling Label Studio data
`YoloLoader`	Class that reads and visualizes a YOLO-format dataset

Functions:

Name	Description
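`bbox`	Visualize multiple bboxes
`before_after`	Create an image that compares two images
`cutout`	Make everything outside the specified coordinates of an image transparent
`cwh2poly`	Bbox conversion
`cwh2xyxy`	Bbox conversion
`evaluation`	Evaluate a detection model's inference performance on a single image
`grid`	Merge multiple images into a single square image
`img2gif`	Convert the images in a directory to a GIF
`iou`	Function that computes IoU (Intersection over Union)
`is_pts_in_poly`	Function that checks whether points lie inside the given coordinates
`mask`	Visualize masks
`meanap`	Visualize a detection model's P-R curve and compute mAP
`pad`	Resize and pad the input image to the desired shape
`paste`	Merge the `target` image onto `img`, including transparency
`poly2area`	Function that computes the area of a polygon
`poly2cwh`	Bbox conversion
`poly2mask`	Convert polygon coordinates to a mask
`poly2ratio`	Function that computes the ratio of a polygon's area to the area of its bbox
`poly2xyxy`	Bbox conversion
`text`	Visualize text
`transparent`	Make the pixels of the input image below `threshold` transparent
`vert`	Merge multiple images into a single horizontal image
`vid2gif`	Convert a video to a GIF
`xyxy2cwh`	Bbox conversion
`xyxy2poly`	Bbox conversion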

__all__ `module-attribute` ¶

__all__ = ['img2gif', 'vid2gif', 'before_after', 'grid', 'bbox', 'mask', 'text', 'cwh2poly', 'cwh2xyxy', 'poly2cwh', 'poly2mask', 'poly2xyxy', 'xyxy2cwh', 'xyxy2poly', 'cutout', 'paste', 'is_pts_in_poly', 'JsonImageLoader', 'vert', 'pad', 'poly2area', 'poly2ratio', 'ImageLoader', 'transparent', 'YoloLoader', 'LabelStudio', 'iou', 'meanap', 'evaluation', 'CocoLoader']

CocoLoader ¶

CocoLoader(data_path: str, vis_path: str | None = None, class_color: dict[int | str, tuple[int, int, int]] | None = None)

Class that reads and visualizes a COCO-format dataset

Parameters:

Name	Type	Description	Default
`data_path`	`str`	Path of the directory containing the images (annotations are read from `{data_path}.json`)	required
`vis_path`	`str \| None`	Path where the visualized images are saved	`None`
`class_color`	`dict[int \| str, tuple[int, int, int]] \| None`	Color applied to each class in the visualization	`None`

Examples:

>>> data_path = "train"
>>> class_color = {"label1": (0, 255, 0), "label2": (255, 0, 0)}
>>> coco = zz.vision.CocoLoader(data_path, vis_path="tmp", class_color=class_color)
>>> image, class_list, bboxes, polys = coco(0, False, True)
>>> type(image)
<class 'str'>
>>> image
'{IMAGE_PATH}.jpg'
>>> class_list
[0, 1]
>>> type(bboxes)
<class 'numpy.ndarray'>
>>> bboxes.shape
(2, 4)
>>> image, class_list, bboxes, polys = coco[0]
>>> type(image)
<class 'numpy.ndarray'>
>>> class_list
['label1', 'label2']
>>> type(bboxes)
<class 'numpy.ndarray'>
>>> bboxes.shape
(2, 4)
>>> type(polys)
<class 'list'>

Methods:

Name	Description
`__call__`	Return the image and annotation information for the given index (if `vis_path` and `class_color` were provided, the visualized image is saved to `vis_path`)
`__getitem__`	Return the image and annotation information for the given index (if `vis_path` and `class_color` were provided, the visualized image is saved to `vis_path`)
`__len__`	Return the number of images
`yolo`	Convert COCO format to YOLO format

Attributes:

Name	Type	Description
`annotations`
`class_color`
`classes`
`data_path`
`image2annotation`
`images`
`vis_path`

Source code in zerohertzLib/vision/loader.py

def __init__(
    self,
    data_path: str,
    vis_path: str | None = None,
    class_color: dict[int | str, tuple[int, int, int]] | None = None,
) -> None:
    self.data_path = data_path
    data = Json(f"{data_path}.json")
    self.images = data["images"]
    self.annotations = data["annotations"]
    self.images.sort(key=lambda x: x["id"])
    self.annotations.sort(key=lambda x: x["image_id"])
    self.image2annotation = defaultdict(list)
    for idx, annotation in enumerate(self.annotations):
        self.image2annotation[annotation["image_id"]].append(idx)
    self.classes = {}
    for idx, cls in enumerate(data["categories"]):
        self.classes[cls["id"]] = (idx, cls["name"])
    self.vis_path = vis_path
    if vis_path is not None:
        if class_color is None:
            raise ValueError(
                "Visualization requires the 'class_color' variable to be specified"
            )
        rmtree(vis_path)
        self.class_color = class_color

annotations `instance-attribute` ¶

annotations = data['annotations']

class_color `instance-attribute` ¶

class_color = class_color

classes `instance-attribute` ¶

classes = {}

data_path `instance-attribute` ¶

data_path = data_path

image2annotation `instance-attribute` ¶

image2annotation = defaultdict(list)

images `instance-attribute` ¶

images = data['images']

vis_path `instance-attribute` ¶

vis_path = vis_path

__call__ ¶

__call__(idx: int, read: bool = False, int_class: bool = False) -> tuple[str | NDArray[uint8], list[int | str], NDArray[DTypeLike], list[NDArray[DTypeLike]]]

Return the image and annotation information for the given index (if vis_path and class_color were provided, the visualized image is saved to vis_path)

Parameters:

Name	Type	Description	Default
`idx`	`int`	Input index	required
`read`	`bool`	Whether to read the image	`False`
`int_class`	`bool`	Whether classes are returned as integers (`True`) or as names (`False`)	`False`

Returns:

Type	Description
`tuple[str \| NDArray[uint8], list[int \| str], NDArray[DTypeLike], list[NDArray[DTypeLike]]]`	Image path or the loaded image, with the corresponding `class_list`, `bboxes`, `polys`

Source code in zerohertzLib/vision/loader.py

def __call__(
    self, idx: int, read: bool = False, int_class: bool = False
) -> tuple[
    str | NDArray[np.uint8],
    list[int | str],
    NDArray[DTypeLike],
    list[NDArray[DTypeLike]],
]:
    """
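    Return the image and annotation information for the given index (if `vis_path` and `class_color` were provided, the visualized image is saved to `vis_path`)

    Args:
        idx: Input index
        read: Whether to read the image
        int_class: Whether classes are returned as integers (`True`) or as names (`False`)

    Returns:
        Image path or the loaded image, with the corresponding `class_list`, `bboxes`, `polys`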
    """
    img_path = os.path.join(
        self.data_path, os.path.basename(self.images[idx]["file_name"])
    )
    if read:
        img = cv2.imread(img_path)
    else:
        img = img_path
    class_list = []
    bboxes = []
    polys = []
    for idx_ in self.image2annotation[self.images[idx]["id"]]:
        annotation = self.annotations[idx_]
        if int_class:
            class_list.append(self.classes[annotation["category_id"]][0])
        else:
            class_list.append(self.classes[annotation["category_id"]][1])
        bboxes.append(
            [
                annotation["bbox"][0] + annotation["bbox"][2] / 2,
                annotation["bbox"][1] + annotation["bbox"][3] / 2,
                annotation["bbox"][2],
                annotation["bbox"][3],
            ]
        )
        if "segmentation" in annotation.keys():
            polys.append(np.array(annotation["segmentation"][0]).reshape(-1, 2))
    bboxes = np.array(bboxes)
    return img, class_list, bboxes, polys
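
The listing above turns each COCO `bbox` (`[x0, y0, w, h]`, top-left corner plus size) into the cwh form and reshapes the flat `segmentation` list into points. A minimal pure-Python sketch of both steps, using hypothetical helper names:

```python
def coco_bbox_to_cwh(bbox):
    """Convert a COCO [x0, y0, w, h] box to [cx, cy, w, h]."""
    x0, y0, w, h = bbox
    return [x0 + w / 2, y0 + h / 2, w, h]


def flat_segmentation_to_points(segmentation):
    """Turn [x0, y0, x1, y1, ...] into [[x0, y0], [x1, y1], ...]."""
    return [
        [segmentation[i], segmentation[i + 1]]
        for i in range(0, len(segmentation), 2)
    ]


box = coco_bbox_to_cwh([10, 20, 4, 6])
points = flat_segmentation_to_points([0, 0, 4, 0, 4, 3])
```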

__getitem__ ¶

__getitem__(idx: int) -> tuple[NDArray[uint8], list[str], NDArray[DTypeLike], list[NDArray[DTypeLike]]]

Return the image and annotation information for the given index (if vis_path and class_color were provided, the visualized image is saved to vis_path)

Parameters:

Name	Type	Description	Default
`idx`	`int`	Input index	required

Returns:

Type	Description
`tuple[NDArray[uint8], list[str], NDArray[DTypeLike], list[NDArray[DTypeLike]]]`	The loaded image with the corresponding `class_list`, `bboxes`, `polys`

Source code in zerohertzLib/vision/loader.py

def __getitem__(
    self, idx: int
) -> tuple[
    NDArray[np.uint8], list[str], NDArray[DTypeLike], list[NDArray[DTypeLike]]
]:
    """
    Return the image and annotation information for the given index (if `vis_path` and `class_color` were provided, the visualized image is saved to `vis_path`)

    Args:
        idx: Input index

    Returns:
        The loaded image with the corresponding `class_list`, `bboxes`, `polys`
    """
    img, class_list, bboxes, polys = self(idx, read=True)
    if self.vis_path is not None:
        self._visualization(
            os.path.basename(self.images[idx]["file_name"]),
            img,
            class_list,
            bboxes,
            polys,
        )
    return img, class_list, bboxes, polys

__len__ ¶

__len__() -> int

Return the number of images

Returns:

Type	Description
`int`	Number of image files loaded

Source code in zerohertzLib/vision/loader.py

def __len__(self) -> int:
    """Return the number of images

    Returns:
        Number of image files loaded
    """
    return len(self.images)

_visualization ¶

_visualization(file_name: str, img: NDArray[uint8], class_list: list[str], bboxes: NDArray[DTypeLike], polys: list[NDArray[DTypeLike]]) -> None

Source code in zerohertzLib/vision/loader.py

def _visualization(
    self,
    file_name: str,
    img: NDArray[np.uint8],
    class_list: list[str],
    bboxes: NDArray[DTypeLike],
    polys: list[NDArray[DTypeLike]],
) -> None:
    for cls, box in zip(class_list, bboxes):
        img = bbox(img, box, self.class_color[cls])
    if polys:
        mks = np.zeros((len(polys), *img.shape[:2]), bool)
        for idx, poly in enumerate(polys):
            mks[idx] = poly2mask(poly, img.shape[:2])
        img = mask(img, mks, class_list=class_list, class_color=self.class_color)
    cv2.imwrite(os.path.join(self.vis_path, file_name), img)

yolo ¶

yolo(target_path: str, label: list[str] | None = None, poly: bool = False) -> None

Convert COCO format to YOLO format

Parameters:

Name	Type	Description	Default
`target_path`	`str`	Path where the YOLO-format data is saved	required
`label`	`list[str] \| None`	List that converts the labels used in COCO to integers (by index)	`None`
`poly`	`bool`	Whether to use the segmentation format	`False`

Returns:

Type	Description
`None`	Saves the images and `.txt` files to `{target_path}/images` and `{target_path}/labels`

Examples:

>>> coco = zz.vision.CocoLoader(data_path)
>>> coco.yolo(target_path)
100%|█████████████| 476/476 [00:00<00:00, 78794.25it/s]
>>> label = ["label1", "label2"]
>>> coco.yolo(target_path, label)
100%|█████████████| 476/476 [00:00<00:00, 78794.25it/s]

Source code in zerohertzLib/vision/loader.py

def yolo(
    self,
    target_path: str,
    label: list[str] | None = None,
    poly: bool = False,
) -> None:
    """Convert COCO format to YOLO format

    Args:
        target_path: Path where the YOLO-format data is saved
        label: List that converts the labels used in COCO to integers (by index)
        poly: Whether to use the segmentation format

    Returns:
        Saves the images and `.txt` files to `{target_path}/images` and `{target_path}/labels`

    Examples:
        >>> coco = zz.vision.CocoLoader(data_path)
        >>> coco.yolo(target_path)
        100%|█████████████| 476/476 [00:00<00:00, 78794.25it/s]
        >>> label = ["label1", "label2"]
        >>> coco.yolo(target_path, label)
        100%|█████████████| 476/476 [00:00<00:00, 78794.25it/s]
    """
    rmtree(os.path.join(target_path, "images"))
    rmtree(os.path.join(target_path, "labels"))
    for idx in tqdm(range(len(self))):
        img_path, class_list, bboxes, polys = self(
            idx, read=False, int_class=label is None
        )
        converted_gt = []
        if poly:
            for cls, poly_ in zip(class_list, polys):
                poly_ /= (self.images[idx]["width"], self.images[idx]["height"])
                if label:
                    cls = label.index(cls)
                converted_gt.append(
                    f"{cls} " + " ".join(map(str, poly_.reshape(-1)))
                )
        else:
            for cls, box in zip(class_list, bboxes):
                box /= (self.images[idx]["width"], self.images[idx]["height"]) * 2
                if label:
                    cls = label.index(cls)
                converted_gt.append(f"{cls} " + " ".join(map(str, box)))
        img_file_name = os.path.basename(img_path)
        txt_file_name = ".".join(img_file_name.split(".")[:-1]) + ".txt"
        try:
            shutil.copy(
                img_path, os.path.join(target_path, "images", img_file_name)
            )
            with open(
                os.path.join(target_path, "labels", txt_file_name),
                "w",
                encoding="utf-8",
            ) as file:
                file.writelines("\n".join(converted_gt))
        except FileNotFoundError:
            print(f"'{img_path}' is not found")

ImageLoader ¶

ImageLoader(path: str = './', cnt: int = 1)

Class that returns images from a path, given the path and the number of images per call

Parameters:

Name	Type	Description	Default
`path`	`str`	Path containing the images	`'./'`
`cnt`	`int`	Number of images to return per call	`1`

Attributes:

Name	Type	Description
`image_paths`		Paths of the images found in the specified path

Examples:

>>> il = zz.vision.ImageLoader()
>>> len(il)
510
>>> il[0][0]
'./1.2.410.200001.1.9999.1.20220513101953581.1.1.jpg'
>>> il[0][1].shape
(480, 640, 3)
>>> il = zz.vision.ImageLoader(cnt=4)
>>> len(il)
128
>>> il[0][0]
['./1.2.410.200001.1.9999.1.20220513101953581.1.1.jpg', '...', '...', '...']
>>> il[0][1][0].shape
(480, 640, 3)
>>> len(il[0][0])
4
>>> len(il[0][1])
4

Methods:

Name	Description
`__getitem__`	Return the image information for the given index
`__len__`	Return the number of images

Source code in zerohertzLib/vision/loader.py

def __init__(self, path: str = "./", cnt: int = 1) -> None:
    self.cnt = cnt
    self.image_paths = _get_image_paths(path)
    self.image_paths.sort()

cnt `instance-attribute` ¶

cnt = cnt

image_paths `instance-attribute` ¶

image_paths = _get_image_paths(path)

__getitem__ ¶

__getitem__(idx: int) -> tuple[str, NDArray[uint8]] | tuple[list[str], list[NDArray[uint8]]]

Return the image information for the given index

Parameters:

Name	Type	Description	Default
`idx`	`int`	Input index	required

Returns:

Type	Description
`tuple[str, NDArray[uint8]] \| tuple[list[str], list[NDArray[uint8]]]`	File path(s) and image value(s), depending on `cnt`

Source code in zerohertzLib/vision/loader.py

def __getitem__(
    self, idx: int
) -> tuple[str, NDArray[np.uint8]] | tuple[list[str], list[NDArray[np.uint8]]]:
    """Return the image information for the given index

    Args:
        idx: Input index

    Returns:
        File path(s) and image value(s), depending on `cnt`
    """
    if self.cnt == 1:
        return (
            self.image_paths[idx],
            cv2.imread(self.image_paths[idx], cv2.IMREAD_UNCHANGED),
        )
    return (
        self.image_paths[self.cnt * idx : self.cnt * (idx + 1)],
        [
            cv2.imread(path, cv2.IMREAD_UNCHANGED)
            for path in self.image_paths[self.cnt * idx : self.cnt * (idx + 1)]
        ],
    )

__len__ ¶

__len__() -> int

Return the number of images

Returns:

Type	Description
`int`	Number of image groups, with `cnt` images per group

Source code in zerohertzLib/vision/loader.py

def __len__(self) -> int:
    """Return the number of images

    Returns:
        Number of image groups, with `cnt` images per group
    """
    return math.ceil(len(self.image_paths) / self.cnt)
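
The grouping arithmetic behind `__len__` and `__getitem__` can be checked without reading any images. In this sketch, `group_count` and `group_slice` are hypothetical names; the numbers reproduce the example above (510 images, `cnt=4`).

```python
import math


def group_count(num_images, cnt):
    """Number of indexable groups when images are served cnt at a time."""
    return math.ceil(num_images / cnt)


def group_slice(paths, idx, cnt):
    """Paths that belong to group idx; the last group may be shorter."""
    return paths[cnt * idx : cnt * (idx + 1)]


paths = [f"{i}.jpg" for i in range(10)]
last_group = group_slice(paths, 2, 4)
```

Because the count is rounded up, the final group holds only the leftover images rather than always holding `cnt` items.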

JsonImageLoader ¶

JsonImageLoader(data_path: str, json_path: str, json_key: str)

Class that loads images together with the information stored in JSON files

Parameters:

Name	Type	Description	Default
`data_path`	`str`	Path of the directory containing the target data	required
`json_path`	`str`	Path of the directory containing the target JSON files	required
`json_key`	`str`	Key whose value is the file name of the data in `data_path`	required

Attributes:

Name	Type	Description
`json`		The loaded JSON files, used when building the data

Examples:

>>> jil = zz.vision.JsonImageLoader(data_path, json_path, json_key)
100%|█████████████| 17248/17248 [00:04<00:00, 3581.22it/s]
>>> img, js = jil[10]
>>> img.shape
(600, 800, 3)
>>> js.tree()
└─ info
    └─ name
    └─ date_created
...

Methods:

Name	Description
`__getitem__`	Index the loaded JSON files like a list and return the corresponding image
`__len__`	Return the number of images

Source code in zerohertzLib/vision/loader.py

def __init__(
    self,
    data_path: str,
    json_path: str,
    json_key: str,
) -> None:
    self.data_path = data_path
    self.json_path = json_path
    self.json = JsonDir(json_path)
    self.json_key = self.json._get_key(json_key)

data_path `instance-attribute` ¶

data_path = data_path

json `instance-attribute` ¶

json = JsonDir(json_path)

json_key `instance-attribute` ¶

json_key = _get_key(json_key)

json_path `instance-attribute` ¶

json_path = json_path

__getitem__ ¶

__getitem__(idx: int) -> tuple[NDArray[uint8], Json]

Index the loaded JSON files like a list and return the corresponding image

Parameters:

Name	Type	Description	Default
`idx`	`int`	Input index	required

Returns:

Type	Description
`tuple[NDArray[uint8], Json]`	The image and the information in the JSON

Source code in zerohertzLib/vision/loader.py

def __getitem__(self, idx: int) -> tuple[NDArray[np.uint8], Json]:
    """
    Index the loaded JSON files like a list and return the corresponding image

    Args:
        idx: Input index

    Returns:
        The image and the information in the JSON
    """
    data_name = self.json[idx].get(self.json_key)
    img = cv2.imread(os.path.join(self.data_path, data_name), cv2.IMREAD_UNCHANGED)
    return img, self.json[idx]

__len__ ¶

__len__() -> int

Return the number of images

Returns:

Type	Description
`int`	Number of JSON files loaded

Source code in zerohertzLib/vision/loader.py

def __len__(self) -> int:
    """Return the number of images

    Returns:
        Number of JSON files loaded
    """
    return len(self.json)

LabelStudio ¶

LabelStudio(data_path: str, json_path: str | None = None)

Class for handling Label Studio data

Parameters:

Name	Type	Description	Default
`data_path`	`str`	Path of the directory containing the images	required
`json_path`	`str \| None`	JSON file with the annotation information, used when converting from Label Studio to another format	`None`

Examples:

Without json_path:

>>> ls = zz.vision.LabelStudio(data_path)
>>> ls[0]
('0000007864.png', {'data': {'image': 'data/local-files/?d=/label-studio/data/local/tmp/0000007864.png'}})
>>> ls[1]
('0000008658.png', {'data': {'image': 'data/local-files/?d=/label-studio/data/local/tmp/0000008658.png'}})

With json_path:

Bbox:

>>> ls = zz.vision.LabelStudio(data_path, json_path)
>>> ls[0]
('/PATH/TO/IMAGE', {'labels': ['label1', ...], 'polys': [array([0.39471694, 0.30683403, 0.03749811, 0.0167364 ]), ...], 'whs': [(1660, 2349), ...]})
>>> ls[1]
('/PATH/TO/IMAGE', {'labels': ['label2', ...], 'polys': [array([0.29239837, 0.30149896, 0.04013469, 0.02736506]), ...], 'whs': [(1655, 2324), ...]})
>>> ls.labels
{'label1', 'label2'}
>>> ls.type
'rectanglelabels'

Poly:

>>> ls = zz.vision.LabelStudio(data_path, json_path)
>>> ls[0]
('/PATH/TO/IMAGE', {'labels': ['label1', ...], 'polys': [array([[0.4531892 , 0.32880674], ..., [0.46119428, 0.32580483]]), ...], 'whs': [(3024, 4032), ...]})
>>> ls[1]
('/PATH/TO/IMAGE', {'labels': ['label2', ...], 'polys': [array([[0.31973699, 0.14660367], ..., [0.29032053, 0.1484422 ]]), ...], 'whs': [(3024, 4032), ...]})
>>> ls.labels
{'label1', 'label2'}
>>> ls.type
'polygonlabels'

Methods:

Name	Description
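`__getitem__`	Return the image file name or path for the given index, with the dictionary to be included in the JSON file or the annotation information
`__len__`	Return the number of data items
`classification`	Convert JSON data annotated with Label Studio to classification format
`coco`	Convert JSON data annotated with Label Studio to COCO format
`json`	Create a JSON file for loading data mounted in Label Studio
`labelme`	Convert JSON data annotated with Label Studio to LabelMe format
`yolo`	Convert JSON data annotated with Label Studio to YOLO format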

Attributes:

Name	Type	Description
`annotations`
`data_path`
`data_paths`
`labels`
`path`
`type`

Source code in zerohertzLib/vision/data.py

def __init__(
    self,
    data_path: str,
    json_path: str | None = None,
) -> None:
    self.annotations = None
    if json_path is None:
        self.path = "/label-studio/data/local"
        self.data_paths = _get_image_paths(data_path)
    else:
        self.annotations = Json(json_path)
        self.type = self.annotations[0]["annotations"][0]["result"][0]["type"]
    self.data_path = data_path
    self.labels = set()

annotations `instance-attribute` ¶

annotations = None

data_path `instance-attribute` ¶

data_path = data_path

data_paths `instance-attribute` ¶

data_paths = _get_image_paths(data_path)

labels `instance-attribute` ¶

labels = set()

path `instance-attribute` ¶

path = '/label-studio/data/local'

type `instance-attribute` ¶

type = annotations[0]['annotations'][0]['result'][0]['type']

__getitem__ ¶

__getitem__(idx: int) -> tuple[str, dict[str, dict[str, str]]] | tuple[str, dict[str, list[Any]]]

Parameters:

Name	Type	Description	Default
`idx`	`int`	Input index	required

Returns:

Type	Description
`tuple[str, dict[str, dict[str, str]]] \| tuple[str, dict[str, list[Any]]]`	Image file name or path for the given index, with the dictionary to be included in the JSON file or the annotation information

Source code in zerohertzLib/vision/data.py

def __getitem__(
    self, idx: int
) -> tuple[str, dict[str, dict[str, str]]] | tuple[str, dict[str, list[Any]]]:
    """
    Args:
        idx: Input index

    Returns:
        Image file name or path for the given index, with the dictionary to be included in the JSON file or the annotation information
    """
    if self.annotations is None:
        file_name = os.path.basename(self.data_paths[idx])
        return (
            file_name,
            {
                "data": {
                    "image": f"data/local-files/?d={self.path}/{self.data_paths[idx]}"
                }
            },
        )
    file_name = os.path.basename(self.annotations[idx]["data"]["image"])
    file_name = urllib.parse.unquote(file_name)
    if len(file_name) > 8 and "-" == file_name[8]:
        file_name = "-".join(file_name.split("-")[1:])
    file_path = os.path.join(self.data_path, file_name)
    if len(self.annotations[idx]["annotations"]) > 1:
        raise ValueError("The 'annotations' are plural")
    if self.type == "rectanglelabels":
        return (
            file_path,
            self._dict2cwh(self.annotations[idx]["annotations"][0]["result"]),
        )
    if self.type == "polygonlabels":
        return (
            file_path,
            self._dict2poly(self.annotations[idx]["annotations"][0]["result"]),
        )
    raise ValueError(f"Unknown annotation type: {self.type}")
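
The file-name handling in the annotated branch can be tried in isolation. The sketch below mirrors the checks in the listing with a hypothetical helper name. Reading the `"-"` at position 8 as the separator after an upload prefix is an interpretation; the source only shows the check itself.

```python
import os
import urllib.parse


def label_studio_file_name(image_url):
    """Recover the local file name from the "image" field of a Label Studio task."""
    file_name = urllib.parse.unquote(os.path.basename(image_url))
    # Drop the first "-"-separated segment when a "-" sits at index 8
    # (assumed to be an 8-character prefix added on upload).
    if len(file_name) > 8 and file_name[8] == "-":
        file_name = "-".join(file_name.split("-")[1:])
    return file_name


uploaded = label_studio_file_name("/data/upload/1/ab12cd34-my%20image.png")
mounted = label_studio_file_name(
    "data/local-files/?d=/label-studio/data/local/tmp/0000007864.png"
)
```

A file whose own name happens to have `"-"` as its ninth character would also lose its first segment, so this rule is a heuristic rather than a guarantee.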

__len__ ¶

__len__() -> int

Return the number of data items

Returns:

Type	Description
`int`	Number of image files or annotations loaded

Source code in zerohertzLib/vision/data.py

def __len__(self) -> int:
    """Return the number of data items

    Returns:
        Number of image files or annotations loaded
    """
    if self.annotations is None:
        return len(self.data_paths)
    return len(self.annotations)

_dict2cwh ¶

_dict2cwh(results: list[dict[str, Any]]) -> dict[str, Any]

Source code in zerohertzLib/vision/data.py

def _dict2cwh(self, results: list[dict[str, Any]]) -> dict[str, Any]:
    labels, polys, whs = [], [], []
    for result in results:
        width, height = result["original_width"], result["original_height"]
        box_cwh = (
            np.array(
                [
                    result["value"]["x"],
                    result["value"]["y"],
                    result["value"]["width"],
                    result["value"]["height"],
                ]
            )
            / 100
        )
        if len(result["value"]["rectanglelabels"]) > 1:
            raise ValueError("The 'rectanglelabels' are plural")
        label = result["value"]["rectanglelabels"][0]
        labels.append(label)
        self.labels.add(label)
        polys.append(box_cwh)
        whs.append((width, height))
    return {"labels": labels, "polys": polys, "whs": whs}
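
Label Studio stores rectangle values as percentages of the original image size, which is why the listing divides by 100. Despite the `box_cwh` name, `x` and `y` are later used as the top-left corner (see the comment in `coco`). A minimal sketch of recovering a pixel-space xyxy box, with a hypothetical helper name:

```python
def percent_rect_to_xyxy(value, width, height):
    """Convert a Label Studio rectangle (percent units) to pixel [x0, y0, x1, y1]."""
    x0 = value["x"] / 100 * width
    y0 = value["y"] / 100 * height
    x1 = x0 + value["width"] / 100 * width
    y1 = y0 + value["height"] / 100 * height
    return [x0, y0, x1, y1]


box = percent_rect_to_xyxy(
    {"x": 25, "y": 50, "width": 50, "height": 25}, width=200, height=100
)
```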

_dict2poly ¶

_dict2poly(results: list[dict[str, Any]]) -> dict[str, Any]

Source code in zerohertzLib/vision/data.py

def _dict2poly(self, results: list[dict[str, Any]]) -> dict[str, Any]:
    labels, polys, whs = [], [], []
    for result in results:
        width, height = result["original_width"], result["original_height"]
        box_poly = np.array(result["value"]["points"]) / 100
        if len(result["value"]["polygonlabels"]) > 1:
            raise ValueError("The 'polygonlabels' are plural")
        label = result["value"]["polygonlabels"][0]
        labels.append(label)
        self.labels.add(label)
        polys.append(box_poly)
        whs.append((width, height))
    return {"labels": labels, "polys": polys, "whs": whs}

classification ¶

classification(target_path: str, label: dict[str, Any] | None = None, rand: int = 0, shrink: bool = True, aug: int = 1) -> None

Convert JSON data annotated with Label Studio to classification format

Parameters:

Name	Type	Description	Default
`target_path`	`str`	Path where the classification-format data is saved	required
`label`	`dict[str, Any] \| None`	Dictionary that renames the labels used in Label Studio	`None`
`rand`	`int`	Random range added when cropping the image	`0`
`shrink`	`bool`	Whether a crop affected by `rand` may shrink the image	`True`
`aug`	`int`	Number of images saved per annotation	`1`

Returns:

Type	Description
`None`	Saves the images to `{target_path}/{label}/{img_file_name}_{idx}_{i}.{img_file_ext}` (`idx`: index of the annotation, `i`: index of the `rand` crop)

Examples:

>>> ls = zz.vision.LabelStudio(data_path, json_path)
>>> ls.classification(target_path)
100%|█████████████| 476/476 [00:00<00:00, 78794.25it/s]
>>> label = {"label1": "lab1", "label2": "lab2"}
>>> ls.classification(target_path, label, rand=10, aug=10, shrink=False)
100%|█████████████| 476/476 [00:00<00:00, 78794.25it/s]

Source code in zerohertzLib/vision/data.py

def classification(
    self,
    target_path: str,
    label: dict[str, Any] | None = None,
    rand: int = 0,
    shrink: bool = True,
    aug: int = 1,
) -> None:
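    """Convert JSON data annotated with Label Studio to classification format

    Args:
        target_path: Path where the classification-format data is saved
        label: Dictionary that renames the labels used in Label Studio
        rand: Random range added when cropping the image
        shrink: Whether a crop affected by `rand` may shrink the image
        aug: Number of images saved per annotation

    Returns:
        Saves the images to `{target_path}/{label}/{img_file_name}_{idx}_{i}.{img_file_ext}` (`idx`: index of the annotation, `i`: index of the `rand` crop)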

    Examples:
        >>> ls = zz.vision.LabelStudio(data_path, json_path)
        >>> ls.classification(target_path)
        100%|█████████████| 476/476 [00:00<00:00, 78794.25it/s]
        >>> label = {"label1": "lab1", "label2": "lab2"}
        >>> ls.classification(target_path, label, rand=10, aug=10, shrink=False)
        100%|█████████████| 476/476 [00:00<00:00, 78794.25it/s]
    """
    if label is None:
        label = {}
    for file_path, result in tqdm(self):
        img = cv2.imread(file_path)
        if img is None:
            print(f"'{file_path}' is not found")
            continue
        img_file = file_path.split("/")[-1].split(".")
        img_file_name = ".".join(img_file[:-1])
        img_file_ext = img_file[-1]
        for idx, (lab, poly, wh) in enumerate(
            zip(result["labels"], result["polys"], result["whs"])
        ):
            if self.type == "rectanglelabels":
                box_xyxy = poly * (wh * 2)
                box_xyxy[2:] += box_xyxy[:2]
            elif self.type == "polygonlabels":
                box_poly = poly * wh
                box_xyxy = poly2xyxy(box_poly)
            else:
                raise ValueError(f"Unknown annotation type: {self.type}")
            os.makedirs(
                os.path.join(target_path, label.get(lab, lab)), exist_ok=True
            )
            for i in range(aug):
                bias = (2 * rand * (np.random.rand(4) - 0.5)).astype(np.int32)
                if not shrink:
                    bias[:2] = -abs(bias[:2])
                    bias[2:] = abs(bias[2:])
                x_0, y_0, x_1, y_1 = box_xyxy.astype(np.int32) + bias
                try:
                    cv2.imwrite(
                        os.path.join(
                            target_path,
                            label.get(lab, lab),
                            f"{img_file_name}_{idx}_{i}.{img_file_ext}",
                        ),
                        img[y_0:y_1, x_0:x_1, :],
                    )
                except cv2.error:
                    print(
                        f"Impossible crop ('x_0': {x_0}, 'y_0': {y_0}, 'x_1': {x_1}, 'y_1': {y_1})"
                    )
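
The effect of `rand` and `shrink` on the crop box can be sketched with the standard `random` module instead of NumPy. `jitter_box` is a hypothetical helper that follows the same arithmetic as the listing. With `shrink=False`, the top-left corner can only move up or left and the bottom-right corner only down or right, so the crop never gets smaller.

```python
import random


def jitter_box(box_xyxy, rand, shrink=True):
    """Add an integer offset in (-rand, rand) to each coordinate of an xyxy box."""
    bias = [int(2 * rand * (random.random() - 0.5)) for _ in range(4)]
    if not shrink:
        # Push the top-left corner outward and the bottom-right corner outward.
        bias[0], bias[1] = -abs(bias[0]), -abs(bias[1])
        bias[2], bias[3] = abs(bias[2]), abs(bias[3])
    return [int(v) + b for v, b in zip(box_xyxy, bias)]


random.seed(0)
grown = jitter_box([10, 20, 30, 40], rand=5, shrink=False)
```

With the default `rand=0`, every offset is zero and the crop equals the annotated box.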

coco ¶

coco(target_path: str, label: dict[str, int]) -> None

Convert JSON data annotated with Label Studio to COCO format

Parameters:

Name	Type	Description	Default
`target_path`	`str`	Path where the COCO-format data is saved	required
`label`	`dict[str, int]`	Dictionary that maps the labels used in Label Studio to COCO category ids	required

Returns:

Type	Description
`None`	Saves a JSON file to `{target_path}.json`

Examples:

>>> ls = zz.vision.LabelStudio(data_path, json_path)
>>> label = {"label1": 1, "label2": 2}
>>> ls.coco(target_path, label)
100%|█████████████| 476/476 [00:00<00:00, 78794.25it/s]

Source code in zerohertzLib/vision/data.py

def coco(self, target_path: str, label: dict[str, int]) -> None:
    """Convert JSON data annotated with Label Studio to COCO format

    Args:
        target_path: Path where the COCO-format data is saved
        label: Dictionary that maps the labels used in Label Studio to COCO category ids

    Returns:
        Saves a JSON file to `{target_path}.json`

    Examples:
        >>> ls = zz.vision.LabelStudio(data_path, json_path)
        >>> label = {"label1": 1, "label2": 2}
        >>> ls.coco(target_path, label)
        100%|█████████████| 476/476 [00:00<00:00, 78794.25it/s]
    """
    converted_gt = {
        "images": [],
        "annotations": [],
        "categories": [],
    }
    for lab, id_ in label.items():
        converted_gt["categories"].append({"id": id_, "name": lab})
    ant_id = 0
    for id_, (file_path, result) in enumerate(tqdm(self)):
        _images = {
            "file_name": os.path.basename(file_path),
            "height": result["whs"][0][1],
            "width": result["whs"][0][0],
            "id": id_,
        }
        _annotations = []
        for ant_id_, (lab, poly, wh) in enumerate(
            zip(result["labels"], result["polys"], result["whs"])
        ):
            # box_cwh is [x_0, y_0, width, height] not [cx, cy, width, height]
            if self.type == "rectanglelabels":
                poly = poly * (wh * 2)
                box_cwh = poly.copy()
                poly[2:] += poly[:2]
                poly = xyxy2poly(poly)
            elif self.type == "polygonlabels":
                poly = poly * wh
                box_cwh = poly2cwh(poly)
                box_cwh[:2] -= box_cwh[2:] / 2
            else:
                raise ValueError(f"Unknown annotation type: {self.type}")
            box_cwh = box_cwh.tolist()
            _annotations.append(
                {
                    "segmentation": [poly.reshape(-1).tolist()],
                    "area": box_cwh[2] * box_cwh[3],
                    "iscrowd": 0,
                    "image_id": id_,
                    "bbox": box_cwh,
                    "category_id": label[lab],
                    "id": ant_id + ant_id_,
                }
            )
        converted_gt["images"].append(_images)
        converted_gt["annotations"] += _annotations
        ant_id += len(result["labels"]) + 1
    write_json(converted_gt, target_path)
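
Each entry appended to `"annotations"` pairs a flattened polygon with a `[x0, y0, width, height]` box. The sketch below builds one such entry from pixel-space points. `polygon_to_coco_annotation` is a hypothetical helper, and taking the box as the axis-aligned extent of the points is an assumption about what `poly2cwh` computes.

```python
def polygon_to_coco_annotation(points, category_id, image_id, annotation_id):
    """Build one COCO annotation dict from [[x, y], ...] pixel coordinates."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x0, y0 = min(xs), min(ys)
    width, height = max(xs) - x0, max(ys) - y0
    return {
        "segmentation": [[coord for point in points for coord in point]],
        "area": width * height,  # bbox area, as in the listing above
        "iscrowd": 0,
        "image_id": image_id,
        "bbox": [x0, y0, width, height],
        "category_id": category_id,
        "id": annotation_id,
    }


entry = polygon_to_coco_annotation(
    [[10, 20], [30, 20], [30, 50], [10, 50]],
    category_id=1,
    image_id=0,
    annotation_id=0,
)
```

Note that `"area"` here is the area of the bounding box, not of the polygon itself, which matches the listing.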

json ¶

json(path: str = '/label-studio/data/local', data_function: Callable[[str], dict[str, Any]] | None = None) -> None

Create a JSON file for loading data mounted in Label Studio

Note

Using a Label Studio image with the environment variable set as below, the JSON file generated by the LabelStudio class can be applied.

FROM heartexlabs/label-studio

ENV LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED=true

docker run --name label-studio -p 8080:8080 -v ${PWD}/data:/label-studio/data label-studio

Click Projects → {PROJECT_NAME} → Settings → Cloud Storage → Add Source Storage, fill in the information below, then press Sync Storage.

Storage Type: Local files
Absolute local path: /label-studio/data/local/${PATH} (data_path: ${PWD}/data/local)
File Filter Regex: ^.*\.(jpe?g|JPE?G|png|PNG|tiff?|TIFF?)$
Treat every bucket object as a source file: True

After syncing, import the JSON file generated by the LabelStudio class into Label Studio to complete the setup.

Parameters:

Name	Type	Description	Default
`path`	`str`	Path of the local files	`'/label-studio/data/local'`
`data_function`	`Callable[[str], dict[str, Any]] \| None`	Method that adds `data` entries usable in Label Studio (see the examples)	`None`

Returns:

Type	Description
`None`	Saves the result to `{data_path}.json`

Examples:

Default:

>>> ls = zz.vision.LabelStudio(data_path)
>>> ls.json()
100%|█████████████| 476/476 [00:00<00:00, 259993.32it/s]

[
    {
        "data": {
            "image": "data/local-files/?d=/label-studio/data/local/tmp/0000007864.png"
        }
    },
    {
        "data": {
            "...": "..."
        }
    },
]

With data_function:

def data_function(file_name):
    return data_store[file_name]

>>> ls = zz.vision.LabelStudio(data_path)
>>> ls.json(data_function=data_function)
100%|█████████████| 476/476 [00:00<00:00, 78794.25it/s]

[
    {
        "data": {
            "image": "data/local-files/?d=/label-studio/data/local/tmp/0000007864.png",
            "Label": "...",
            "patient_id": "...",
            "...": "...",
        }
    },
    {
        "data": {
            "...": "..."
        }
    },
]
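A minimal sketch of what a `data_function` could look like, assuming the extra metadata lives in a hypothetical in-memory dictionary `data_store` keyed by file name. The merge step mirrors how the returned dictionary is added to each entry's `data` field. The keys and values are placeholders, and whether the class passes a base name or a full path is not shown here, so check that before keying the store.

```python
from typing import Any

# Hypothetical metadata store keyed by file name (not part of zerohertzLib)
data_store: dict[str, dict[str, Any]] = {
    "0000007864.png": {"Label": "normal", "patient_id": "P-001"},
}


def data_function(file_name: str) -> dict[str, Any]:
    """Return the extra `data` entries for one image, or nothing if unknown."""
    return data_store.get(file_name, {})


# Illustration of the merge: extra entries are added next to "image"
entry = {"data": {"image": "data/local-files/?d=/label-studio/data/local/tmp/0000007864.png"}}
entry["data"].update(data_function("0000007864.png"))
print(entry["data"]["patient_id"])
```

Returning an empty dictionary for unknown files keeps the entry valid instead of raising `KeyError` halfway through the export.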

Source code in zerohertzLib/vision/data.py

def json(
    self,
    path: str = "/label-studio/data/local",
    data_function: Callable[[str], dict[str, Any]] | None = None,
) -> None:
    r"""Generate a JSON file for loading data mounted in Label Studio

    Note:
        With a Label Studio image whose environment variable is set as shown below, the JSON file generated by the `LabelStudio` class can be applied.

        ```dockerfile
        FROM heartexlabs/label-studio

        ENV LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED=true
        ```

        ```bash
        docker run --name label-studio -p 8080:8080 -v ${PWD}/data:/label-studio/data label-studio
        ```

        Go to `Projects` → `{PROJECT_NAME}` → `Settings` → `Cloud Storage`, click `Add Source Storage`, fill in the fields as shown below, and then click `Sync Storage`.

        + Storage Type: `Local files`
        + Absolute local path: `/label-studio/data/local/${PATH}` (`data_path`: `${PWD}/data/local`)
        + File Filter Regex: `^.*\.(jpe?g|JPE?G|png|PNG|tiff?|TIFF?)$`
        + Treat every bucket object as a source file: `True`

        ![Label Studio Setup 1](../../../assets/vision/LabelStudio.json.1.png)

        After syncing, import the JSON file generated by the `LabelStudio` class into Label Studio to complete the setup as shown below.

        ![Label Studio Setup 2](../../../assets/vision/LabelStudio.json.2.png)

    Args:
        path: Path to the local files
        data_function: Function that adds extra `data` entries usable in Label Studio (see the examples)

    Returns:
        The result is saved to `{data_path}.json`

    Examples:
        Default:
            ```python
            >>> ls = zz.vision.LabelStudio(data_path)
            >>> ls.json()
            100%|█████████████| 476/476 [00:00<00:00, 259993.32it/s]
            ```
            ```json
            [
                {
                    "data": {
                        "image": "data/local-files/?d=/label-studio/data/local/tmp/0000007864.png"
                    }
                },
                {
                    "data": {
                        "...": "..."
                    }
                },
            ]
            ```
        With `data_function`:
            ```python
            def data_function(file_name):
                return data_store[file_name]
            ```
            ```python
            >>> ls = zz.vision.LabelStudio(data_path)
            >>> ls.json(data_function)
            100%|█████████████| 476/476 [00:00<00:00, 78794.25it/s]
            ```
            ```json
            [
                {
                    "data": {
                        "image": "data/local-files/?d=/label-studio/data/local/tmp/0000007864.png",
                        "Label": "...",
                        "patient_id": "...",
                        "...": "...",
                    }
                },
                {
                    "data": {
                        "...": "..."
                    }
                },
            ]
            ```
    """
    self.path = path
    json_data = []
    for file_name, data in tqdm(self):
        if "aug" in file_name:
            continue
        if data_function is not None:
            data["data"].update(data_function(file_name))
        json_data.append(data)
    write_json(json_data, self.data_path)

labelme ¶

labelme(target_path: str, label: dict[str, Any] | None = None) -> None

Convert JSON data annotated with Label Studio to the LabelMe format

Parameters:

Name	Type	Description	Default
`target_path`	`str`	Path where the LabelMe format data will be saved	required
`label`	`dict[str, Any] \| None`	Dictionary that renames the labels used in Label Studio	`None`

Returns:

Type	Description
`None`	Images and JSON files are saved to `{target_path}/images` and `{target_path}/labels`

Examples:

>>> ls = zz.vision.LabelStudio(data_path, json_path)
>>> ls.labelme(target_path)
100%|█████████████| 476/476 [00:00<00:00, 78794.25it/s]
>>> label = {"label1": "lab1", "label2": "lab2"}
>>> ls.labelme(target_path, label)
100%|█████████████| 476/476 [00:00<00:00, 78794.25it/s]
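For rectangle annotations, the conversion boils down to turning a top-left `[x, y, w, h]` box into four corner points. The sketch below assumes the box is stored as percentages of the image size, as in Label Studio's export format. `rect_percent_to_poly` is a hypothetical helper, not part of zerohertzLib, and the values the `LabelStudio` class works with internally may be scaled differently.

```python
def rect_percent_to_poly(x, y, w, h, img_w, img_h):
    """Convert a top-left box given in percent to four pixel-space corners."""
    x0 = x / 100 * img_w
    y0 = y / 100 * img_h
    x1 = (x + w) / 100 * img_w
    y1 = (y + h) / 100 * img_h
    # Corner order: top-left, top-right, bottom-right, bottom-left
    return [[x0, y0], [x1, y0], [x1, y1], [x0, y1]]


# A box covering the centre quarter of a 1920x1080 image
shape = {
    "label": "label1",
    "points": rect_percent_to_poly(25, 25, 50, 50, 1920, 1080),
    "shape_type": "polygon",
}
print(shape["points"])
```

The resulting dictionary has the same three keys (`label`, `points`, `shape_type`) that are written per object into the LabelMe JSON file.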

Source code in zerohertzLib/vision/data.py

def labelme(self, target_path: str, label: dict[str, Any] | None = None) -> None:
    """Convert JSON data annotated with Label Studio to the LabelMe format

    Args:
        target_path: Path where the LabelMe format data will be saved
        label: Dictionary that renames the labels used in Label Studio

    Returns:
        Images and JSON files are saved to `{target_path}/images` and `{target_path}/labels`

    Examples:
        >>> ls = zz.vision.LabelStudio(data_path, json_path)
        >>> ls.labelme(target_path)
        100%|█████████████| 476/476 [00:00<00:00, 78794.25it/s]
        >>> label = {"label1": "lab1", "label2": "lab2"}
        >>> ls.labelme(target_path, label)
        100%|█████████████| 476/476 [00:00<00:00, 78794.25it/s]
    """
    if label is None:
        label = {}
    rmtree(os.path.join(target_path, "images"))
    rmtree(os.path.join(target_path, "labels"))
    for file_path, result in tqdm(self):
        img_file_name = file_path.split("/")[-1]
        json_file_name = ".".join(img_file_name.split(".")[:-1])
        converted_gt = []
        for lab, poly, wh in zip(result["labels"], result["polys"], result["whs"]):
            if self.type == "rectanglelabels":
                box_xyxy = poly * (wh * 2)
                box_xyxy[2:] += box_xyxy[:2]
                box_poly = xyxy2poly(box_xyxy)
            elif self.type == "polygonlabels":
                box_poly = poly * wh
            else:
                raise ValueError(f"Unknown annotation type: {self.type}")
            converted_gt.append(
                {
                    "label": label.get(lab, lab),
                    "points": box_poly.tolist(),
                    "shape_type": "polygon",
                }
            )
        try:
            shutil.copy(
                file_path, os.path.join(target_path, "images", img_file_name)
            )
            write_json(
                {"shapes": converted_gt},
                os.path.join(target_path, "labels", json_file_name),
            )
        except FileNotFoundError:
            print(f"'{file_path}' is not found")

yolo ¶

yolo(target_path: str, label: list[str] | None = None) -> None

Convert JSON data annotated with Label Studio to the YOLO format

Parameters:

Name	Type	Description	Default
`target_path`	`str`	Path where the YOLO format data will be saved	required
`label`	`list[str] \| None`	List that maps the labels used in Label Studio to integers (by index)	`None`

Returns:

Type	Description
`None`	Images and `.txt` files are saved to `{target_path}/images` and `{target_path}/labels`

Examples:

>>> ls = zz.vision.LabelStudio(data_path, json_path)
>>> ls.yolo(target_path)
100%|█████████████| 476/476 [00:00<00:00, 78794.25it/s]
>>> label = ["label1", "label2"]
>>> ls.yolo(target_path, label)
100%|█████████████| 476/476 [00:00<00:00, 78794.25it/s]
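Each line of a YOLO detection label has the form `class cx cy w h`. The sketch below shows the rectangle case, assuming the box arrives as a normalized top-left `[x, y, w, h]`. `to_yolo_line` is a hypothetical helper written for illustration.

```python
def to_yolo_line(lab, box, label):
    """Format one normalized top-left [x, y, w, h] box as a YOLO label line."""
    x, y, w, h = box
    cx, cy = x + w / 2, y + h / 2  # shift the top-left corner to the box centre
    if lab not in label:
        label.append(lab)  # unseen labels get the next free index
    return f"{label.index(lab)} {cx} {cy} {w} {h}\n"


label = ["label1"]
lines = [
    to_yolo_line("label2", [0.25, 0.25, 0.5, 0.5], label),
    to_yolo_line("label1", [0.0, 0.5, 0.5, 0.25], label),
]
print("".join(lines), end="")
```

Because unseen labels are appended on first sight, class indices depend on iteration order unless the full `label` list is passed up front.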

Source code in zerohertzLib/vision/data.py

def yolo(self, target_path: str, label: list[str] | None = None) -> None:
    """Convert JSON data annotated with Label Studio to the YOLO format

    Args:
        target_path: Path where the YOLO format data will be saved
        label: List that maps the labels used in Label Studio to integers (by index)

    Returns:
        Images and `.txt` files are saved to `{target_path}/images` and `{target_path}/labels`

    Examples:
        >>> ls = zz.vision.LabelStudio(data_path, json_path)
        >>> ls.yolo(target_path)
        100%|█████████████| 476/476 [00:00<00:00, 78794.25it/s]
        >>> label = ["label1", "label2"]
        >>> ls.yolo(target_path, label)
        100%|█████████████| 476/476 [00:00<00:00, 78794.25it/s]
    """
    if label is None:
        label = []
    rmtree(os.path.join(target_path, "images"))
    rmtree(os.path.join(target_path, "labels"))
    for file_path, result in tqdm(self):
        img_file_name = os.path.basename(file_path)
        txt_file_name = ".".join(img_file_name.split(".")[:-1]) + ".txt"
        converted_gt = []
        for lab, poly in zip(result["labels"], result["polys"]):
            if self.type == "rectanglelabels":
                poly[:2] += poly[2:] / 2
                box_cwh = poly
            elif self.type == "polygonlabels":
                box_cwh = poly2cwh(poly)
            else:
                raise ValueError(f"Unknown annotation type: {self.type}")
            if lab not in label:
                label.append(lab)
            converted_gt.append(
                f"{label.index(lab)} " + " ".join(map(str, box_cwh)) + "\n"
            )
        try:
            shutil.copy(
                file_path, os.path.join(target_path, "images", img_file_name)
            )
            with open(
                os.path.join(target_path, "labels", txt_file_name),
                "w",
                encoding="utf-8",
            ) as file:
                file.writelines(converted_gt)
        except FileNotFoundError:
            print(f"'{file_path}' is not found")

YoloLoader ¶

YoloLoader(data_path: str = 'images', txt_path: str = 'labels', poly: bool = False, absolute: bool = False, vis_path: str | None = None, class_color: dict[int | str, tuple[int, int, int]] | None = None)

Class that reads and visualizes a YOLO format dataset

Parameters:

Name	Type	Description	Default
`data_path`	`str`	Path to the directory containing the images	`'images'`
`txt_path`	`str`	Path to the directory containing the YOLO format `.txt` files	`'labels'`
`poly`	`bool`	Format of the `.txt` files (`False`: detection, `True`: segmentation)	`False`
`absolute`	`bool`	Whether the `.txt` files use absolute coordinates (`False`: relative coordinates, `True`: absolute coordinates)	`False`
`vis_path`	`str \| None`	Path where the visualization images will be saved	`None`
`class_color`	`dict[int \| str, tuple[int, int, int]] \| None`	Color per class applied to the visualization results	`None`

Examples:

>>> data_path = ".../images"
>>> txt_path = ".../labels"
>>> class_color = {0: (0, 255, 0), 1: (255, 0, 0), 2: (0, 0, 255)}
>>> yolo = zz.vision.YoloLoader(data_path, txt_path, poly=True, absolute=False, vis_path="tmp", class_color=class_color)
>>> image, class_list, objects = yolo[0]
>>> type(image)
<class 'numpy.ndarray'>
>>> class_list
[1, 1]
>>> len(objects)
2
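To make the `poly` and `absolute` options concrete, here is a minimal pure-Python sketch of how one label line could be parsed. `parse_yolo_line` is a hypothetical helper that only mirrors the idea: relative coordinates are scaled by the image width and height, and segmentation lines are grouped into points.

```python
def parse_yolo_line(line, img_w, img_h, poly=False, absolute=False):
    """Parse one YOLO label line into (class_id, coordinates)."""
    fields = line.strip().split(" ")
    cls = int(fields[0])
    values = [float(v) for v in fields[1:]]
    if not absolute:
        # x-like values sit at even positions, y-like values at odd positions
        values = [v * (img_w if i % 2 == 0 else img_h) for i, v in enumerate(values)]
    if poly:
        # Segmentation: group the flat list into [x, y] points
        return cls, [values[i:i + 2] for i in range(0, len(values), 2)]
    # Detection: [cx, cy, w, h]
    return cls, values


print(parse_yolo_line("1 0.5 0.5 0.25 0.5", 640, 480))
print(parse_yolo_line("0 0.25 0.25 0.75 0.25 0.5 0.75", 640, 480, poly=True))
```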

Methods:

Name	Description
`__getitem__`	Return the image and `.txt` file information for an index (when `vis_path` and `class_color` are given, the visualization image is saved to `vis_path`)
`__len__`	Return the number of images
`labelstudio`	Convert YOLO format data so that it can be reviewed and edited in Label Studio

Attributes:

Name	Type	Description
`absolute`
`class_color`
`data_path`
`data_paths`
`poly`
`txt_path`
`vis_path`

Source code in zerohertzLib/vision/loader.py

def __init__(
    self,
    data_path: str = "images",
    txt_path: str = "labels",
    poly: bool = False,
    absolute: bool = False,
    vis_path: str | None = None,
    class_color: dict[int | str, tuple[int, int, int]] | None = None,
) -> None:
    self.data_path = data_path
    self.data_paths = _get_image_paths(self.data_path)
    self.txt_path = txt_path
    self.poly = poly
    self.absolute = absolute
    self.vis_path = vis_path
    if vis_path is not None:
        if class_color is None:
            raise ValueError(
                "Visualization requires the 'class_color' variable to be specified"
            )
        rmtree(vis_path)
        self.class_color = class_color

absolute `instance-attribute` ¶

absolute = absolute

class_color `instance-attribute` ¶

class_color = class_color

data_path `instance-attribute` ¶

data_path = data_path

data_paths `instance-attribute` ¶

data_paths = _get_image_paths(data_path)

poly `instance-attribute` ¶

poly = poly

txt_path `instance-attribute` ¶

txt_path = txt_path

vis_path `instance-attribute` ¶

vis_path = vis_path

__getitem__ ¶

__getitem__(idx: int) -> tuple[NDArray[uint8], list[int], list[NDArray[DTypeLike]]]

Return the image and .txt file information for an index (when vis_path and class_color are given, the visualization image is saved to vis_path)

Parameters:

Name	Type	Description	Default
`idx`	`int`	Input index	required

Returns:

Type	Description
`tuple[NDArray[uint8], list[int], list[NDArray[DTypeLike]]]`	The loaded image with its `class_list` and `bbox` or `poly`

Source code in zerohertzLib/vision/loader.py

def __getitem__(
    self, idx: int
) -> tuple[NDArray[np.uint8], list[int], list[NDArray[DTypeLike]]]:
    """
    Return the image and `.txt` file information for an index (when `vis_path` and `class_color` are given, the visualization image is saved to `vis_path`)

    Args:
        idx: Input index

    Returns:
        The loaded image with its `class_list` and `bbox` or `poly`
    """
    data_path = self.data_paths[idx]
    data_file_name = data_path.split("/")[-1]
    txt_path = os.path.join(
        self.txt_path, ".".join(data_file_name.split(".")[:-1]) + ".txt"
    )
    img = cv2.imread(data_path)
    try:
        class_list, objects = self._convert(txt_path, img)
    except FileNotFoundError:
        print(f"'{data_file_name}' is not found")
        return None, None, None
    if self.vis_path is not None:
        self._visualization(data_file_name, img, class_list, objects)
    return img, class_list, objects

__len__ ¶

__len__() -> int

Return the number of images

Returns:

Type	Description
`int`	Number of image files loaded

Source code in zerohertzLib/vision/loader.py

def __len__(self) -> int:
    """Return the number of images

    Returns:
        Number of image files loaded
    """
    return len(self.data_paths)

_annotation ¶

_annotation(args: list[int | str | list[str]]) -> dict[str, Any]

Source code in zerohertzLib/vision/loader.py

def _annotation(self, args: list[int | str | list[str]]) -> dict[str, Any]:
    idx, directory, labels = args
    img, class_list, objects = self[idx]
    data_path = self.data_paths[idx]
    data_file_name = data_path.split("/")[-1]
    annotation = {
        "data": {"image": f"data/local-files/?d={directory}/{data_file_name}"}
    }
    result_data = []
    for cls, obj in zip(class_list, objects):
        result_data.append(self._value(img, obj, labels, cls))
    annotation["annotations"] = [{"result": result_data}]
    return annotation

_convert ¶

_convert(txt_path: str, img: NDArray[uint8]) -> tuple[list[int], list[NDArray[DTypeLike]]]

Source code in zerohertzLib/vision/loader.py

def _convert(
    self, txt_path: str, img: NDArray[np.uint8]
) -> tuple[list[int], list[NDArray[DTypeLike]]]:
    class_list = []
    objects = []
    with open(txt_path, "r", encoding="utf-8") as file:
        data_lines = file.readlines()
    for data_line in data_lines:
        data_str = data_line.strip().split(" ")
        class_list.append(int(data_str[0]))
        if self.poly:
            obj = np.array(list(map(float, data_str[1:]))).reshape(-1, 2)
            if not self.absolute:
                obj *= img.shape[:2][::-1]
        else:
            obj = np.array(list(map(float, data_str[1:])))
            if not self.absolute:
                obj *= img.shape[:2][::-1] * 2
        objects.append(obj)
    return class_list, objects

_value ¶

_value(img: NDArray[uint8], obj: NDArray[DTypeLike], labels: list[str], cls: int)

Source code in zerohertzLib/vision/loader.py

def _value(
    self,
    img: NDArray[np.uint8],
    obj: NDArray[DTypeLike],
    labels: list[str],
    cls: int,
):
    original_height, original_width = img.shape[:2]
    obj *= 100
    if self.poly:
        obj /= (original_width, original_height)
        return {
            "original_width": original_width,
            "original_height": original_height,
            "image_rotation": 0,
            "value": {
                "points": obj.tolist(),
                "closed": True,
                "polygonlabels": [labels[cls]],
            },
            "from_name": "label",
            "to_name": "image",
            "type": "polygonlabels",
            "origin": "manual",
        }
    obj[:2] -= obj[2:] / 2
    obj /= (original_width, original_height) * 2
    obj = obj.tolist()
    return {
        "original_width": original_width,
        "original_height": original_height,
        "image_rotation": 0,
        "value": {
            "x": obj[0],
            "y": obj[1],
            "width": obj[2],
            "height": obj[3],
            "rectanglelabels": [labels[cls]],
        },
        "from_name": "label",
        "to_name": "image",
        "type": "rectanglelabels",
        "origin": "manual",
    }

_visualization ¶

_visualization(file_name: str, img: NDArray[uint8], class_list: list[int], objects: list[NDArray[DTypeLike]]) -> None

Source code in zerohertzLib/vision/loader.py

def _visualization(
    self,
    file_name: str,
    img: NDArray[np.uint8],
    class_list: list[int],
    objects: list[NDArray[DTypeLike]],
) -> None:
    if self.poly:
        mks = np.zeros((len(objects), *img.shape[:2]), bool)
        for idx, poly in enumerate(objects):
            mks[idx] = poly2mask(poly, img.shape[:2])
        img = mask(img, mks, class_list=class_list, class_color=self.class_color)
    else:
        for cls, box in zip(class_list, objects):
            img = bbox(img, box, self.class_color[cls])
    cv2.imwrite(os.path.join(self.vis_path, file_name), img)

labelstudio ¶

labelstudio(directory: str = 'image', labels: list[str | None] = None, mp_num: int = 0) -> None

Convert YOLO format data so that it can be reviewed and edited in Label Studio

Parameters:

Name	Type	Description	Default
`directory`	`str`	Name of `/home/user/{directory}` inside Label Studio	`'image'`
`labels`	`list[str \| None]`	Label names by index in the YOLO format `.txt` files	`None`
`mp_num`	`int`	Number of processes used for parallel processing (`0`: serial processing)	`0`

Returns:

Type	Description
`None`	The result is saved as `{path}.json`

Examples:

>>> yolo.labelstudio("images", mp_num=10, labels=["t1", "t2", "t3", "t4"])
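Label Studio expects rectangle values as percentages of the image size, measured from the top-left corner. The sketch below shows that conversion from an absolute `[cx, cy, w, h]` box. `cwh_to_labelstudio` is a hypothetical helper, and the returned dictionary is reduced to the `value` field for brevity.

```python
def cwh_to_labelstudio(box, img_w, img_h, label_name):
    """Convert an absolute [cx, cy, w, h] box to a Label Studio rectangle value."""
    cx, cy, w, h = box
    return {
        "x": (cx - w / 2) / img_w * 100,  # left edge in percent
        "y": (cy - h / 2) / img_h * 100,  # top edge in percent
        "width": w / img_w * 100,
        "height": h / img_h * 100,
        "rectanglelabels": [label_name],
    }


value = cwh_to_labelstudio([320, 240, 160, 240], 640, 480, "t1")
print(value)
```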

Source code in zerohertzLib/vision/loader.py

def labelstudio(
    self,
    directory: str = "image",
    labels: list[str | None] = None,
    mp_num: int = 0,
) -> None:
    """
    Convert YOLO format data so that it can be reviewed and edited in Label Studio

    Args:
        directory: Name of `/home/user/{directory}` inside Label Studio
        labels: Label names by index in the YOLO format `.txt` files
        mp_num: Number of processes used for parallel processing (`0`: serial processing)

    Returns:
        The result is saved as `{path}.json`

    Examples:
        >>> yolo.labelstudio("images", mp_num=10, labels=["t1", "t2", "t3", "t4"])
    """
    if labels is None:
        labels = [str(i) for i in range(100)]
    json_data = []
    if mp_num == 0:
        for idx in range(len(self)):
            json_data.append(self._annotation([idx, directory, labels]))
    else:
        args = [[idx, directory, labels] for idx in range(len(self))]
        with mp.Pool(processes=mp_num) as pool:
            annotations = pool.map(self._annotation, args)
        for annotation in annotations:
            json_data.append(annotation)
    write_json(json_data, self.data_path)

bbox ¶

bbox(img: NDArray[uint8], box: list[int | float] | NDArray[DTypeLike], color: tuple[int, int, int] = (0, 0, 255), thickness: int = 2) -> NDArray[uint8]

Visualize multiple bboxes

Parameters:

Name	Type	Description	Default
`img`	`NDArray[uint8]`	Input image (`[H, W, C]`)	required
`box`	`list[int \| float] \| NDArray[DTypeLike]`	One or more bboxes (`[4]`, `[N, 4]`, `[4, 2]`, `[N, 4, 2]`)	required
`color`	`tuple[int, int, int]`	Color of the bbox	`(0, 0, 255)`
`thickness`	`int`	Line thickness of the bbox	`2`

Returns:

Type	Description
`NDArray[uint8]`	Visualization result (`[H, W, C]`)

Examples:

Bbox:

>>> box = np.array([[100, 200], [100, 1000], [1200, 1000], [1200, 200]])
>>> box.shape
(4, 2)
>>> res1 = zz.vision.bbox(img, box, thickness=10)

Bboxes:

>>> boxes = np.array([[250, 200, 100, 100], [600, 600, 800, 200], [900, 300, 300, 400]])
>>> boxes.shape
(3, 4)
>>> res2 = zz.vision.bbox(img, boxes, (0, 255, 0), thickness=10)
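The accepted shapes can be told apart from the array shape alone. The following is a minimal sketch of such a check. `classify_bbox_shape` is a hypothetical stand-in, and the library's private `_is_bbox` may differ in its details.

```python
def classify_bbox_shape(shape):
    """Return (multi, poly) for a bbox array shape."""
    if shape == (4,):
        return False, False        # single [cx, cy, w, h] box
    if shape == (4, 2):
        return False, True         # single polygon with four points
    if len(shape) == 2 and shape[1] == 4:
        return True, False         # N [cx, cy, w, h] boxes
    if len(shape) == 3 and shape[1:] == (4, 2):
        return True, True          # N polygons
    raise ValueError(f"Unsupported bbox shape: {shape}")


print(classify_bbox_shape((4, 2)))
print(classify_bbox_shape((3, 4)))
```

In the examples above, `box` has shape `(4, 2)` and is drawn as a single polygon, while `boxes` has shape `(3, 4)` and is treated as three `cwh` boxes that are converted to polygons before drawing.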

Source code in zerohertzLib/vision/visual.py

def bbox(
    img: NDArray[np.uint8],
    box: list[int | float] | NDArray[DTypeLike],
    color: tuple[int, int, int] = (0, 0, 255),
    thickness: int = 2,
) -> NDArray[np.uint8]:
    """Visualize multiple bboxes

    Args:
        img: Input image (`[H, W, C]`)
        box: One or more bboxes (`[4]`, `[N, 4]`, `[4, 2]`, `[N, 4, 2]`)
        color: Color of the bbox
        thickness: Line thickness of the bbox

    Returns:
        Visualization result (`[H, W, C]`)

    Examples:
        Bbox:
            >>> box = np.array([[100, 200], [100, 1000], [1200, 1000], [1200, 200]])
            >>> box.shape
            (4, 2)
            >>> res1 = zz.vision.bbox(img, box, thickness=10)

        Bboxes:
            >>> boxes = np.array([[250, 200, 100, 100], [600, 600, 800, 200], [900, 300, 300, 400]])
            >>> boxes.shape
            (3, 4)
            >>> res2 = zz.vision.bbox(img, boxes, (0, 255, 0), thickness=10)

        ![Bounding box visualization example](../../../assets/vision/bbox.png){ width="600" }
    """
    box = _list2np(box)
    img = img.copy()
    shape = img.shape
    if len(shape) == 2:
        img = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
    elif shape[2] == 4 and len(color) == 3:
        color = (*color, 255)
    shape = box.shape
    multi, poly = _is_bbox(shape)
    if not poly:
        box = cwh2poly(box)
    if multi:
        for box_ in box:
            img = _bbox(img, box_, color, thickness)
    else:
        img = _bbox(img, box, color, thickness)
    return img

before_after ¶

before_after(before: NDArray[uint8], after: NDArray[uint8], area: list[int | float] | None = None, per: bool = True, quality: int = 100, file_name: str = 'tmp') -> None

Create an image that compares two images

Parameters:

Name	Type	Description	Default
`before`	`NDArray[uint8]`	Original image	required
`after`	`NDArray[uint8]`	Image after image processing or model inference	required
`area`	`list[int \| float] \| None`	Coordinates to compare (`[x_0, y_0, x_1, y_1]`)	`None`
`per`	`bool`	Whether `area` is given as percentages	`True`
`quality`	`int`	Quality of the output image (unit: %)	`100`
`file_name`	`str`	Name of the file to be saved	`'tmp'`

Returns:

Type	Description
`None`	The image is saved directly to the current directory

Examples:

BGR, GRAY:

>>> after = cv2.GaussianBlur(before, (0, 0), 25)
>>> after = cv2.cvtColor(after, cv2.COLOR_BGR2GRAY)
>>> zz.vision.before_after(before, after, quality=10)

BGR, Resize:

>>> after = cv2.resize(before, (100, 100))
>>> zz.vision.before_after(before, after, [20, 40, 30, 60])
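When `per` is `True`, `area` is read as percentages of the image size. Below is a minimal sketch of that conversion, assuming a simple linear mapping with truncation to integers. `rel_to_abs` is a hypothetical helper, and the library's private `_rel2abs` may order its arguments or round values differently.

```python
def rel_to_abs(x_0, y_0, x_1, y_1, width, height):
    """Map percentage coordinates (0-100) to integer pixel coordinates."""
    return (
        int(x_0 * width / 100),
        int(y_0 * height / 100),
        int(x_1 * width / 100),
        int(y_1 * height / 100),
    )


# area=[20, 40, 30, 60] on a 1000x500 (width x height) image
print(rel_to_abs(20, 40, 30, 60, 1000, 500))
```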

Source code in zerohertzLib/vision/compare.py

def before_after(
    before: NDArray[np.uint8],
    after: NDArray[np.uint8],
    area: list[int | float] | None = None,
    per: bool = True,
    quality: int = 100,
    file_name: str = "tmp",
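) -> None:
    """Create an image that compares two images

    Args:
        before: Original image
        after: Image after image processing or model inference
        area: Coordinates to compare (`[x_0, y_0, x_1, y_1]`)
        per: Whether `area` is given as percentages
        quality: Quality of the output image (unit: %)
        file_name: Name of the file to be saved

    Returns:
        The image is saved directly to the current directory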

    Examples:
        BGR, GRAY:
            ```python
            >>> after = cv2.GaussianBlur(before, (0, 0), 25)
            >>> after = cv2.cvtColor(after, cv2.COLOR_BGR2GRAY)
            >>> zz.vision.before_after(before, after, quality=10)
            ```
        ![Before after comparison 1](../../../assets/vision/before_after.1.png){ width="300" }
        BGR, Resize:
            ```python
            >>> after = cv2.resize(before, (100, 100))
            >>> zz.vision.before_after(before, after, [20, 40, 30, 60])
            ```
        ![Before after comparison 2](../../../assets/vision/before_after.2.png){ width="300" }
    """
    before_shape = before.shape
    if area is None:
        if per:
            area = [0.0, 0.0, 100.0, 100.0]
        else:
            raise ValueError("'area' not provided while 'per' is False")
    if per:
        x_0, y_0, x_1, y_1 = _rel2abs(*area, *before_shape[:2])
    else:
        x_0, y_0, x_1, y_1 = area
    before = _cvt_bgra(before)
    before_shape = before.shape
    after = _cvt_bgra(after)
    after_shape = after.shape
    if not before_shape == after_shape:
        after = cv2.resize(after, before_shape[:2][::-1])
        after_shape = after.shape
    before, after = before[x_0:x_1, y_0:y_1, :], after[x_0:x_1, y_0:y_1, :]
    before_shape = before.shape
    height, width, channel = before_shape
    palette = np.zeros((height, 2 * width, channel), dtype=np.uint8)
    palette[:, :width, :] = before
    palette[:, width:, :] = after
    palette = cv2.resize(palette, (0, 0), fx=quality / 100, fy=quality / 100)
    cv2.imwrite(f"{file_name}.png", palette)

cutout ¶

cutout(img: NDArray[uint8], poly: list[int | float] | NDArray[DTypeLike], alpha: int = 255, crop: bool = True, background: int = 0) -> NDArray[uint8]

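Make everything outside the specified coordinates in the image transparent

Parameters:

Name	Type	Description	Default
`img`	`NDArray[uint8]`	Input image (`[H, W, C]`)	required
`poly`	`list[int \| float] \| NDArray[DTypeLike]`	Coordinates of the region to keep (`[N, 2]`)	required
`alpha`	`int`	Alpha (transparency) value of the specified region	`255`
`crop`	`bool`	Whether to crop the output image	`True`
`background`	`int`	Alpha (transparency) value of the background outside the specified coordinates	`0`

Returns:

Type	Description
`NDArray[uint8]`	Output image (`[H, W, 4]`)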

Examples:

>>> poly = np.array([[100, 400], [400, 400], [800, 900], [400, 1100], [100, 800]])
>>> res1 = zz.vision.cutout(img, poly)
>>> res2 = zz.vision.cutout(img, poly, 128, False)
>>> res3 = zz.vision.cutout(img, poly, background=128)
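With `crop=True`, the output is cut down to the axis-aligned bounding box of `poly`. The sketch below computes that box for the polygon used in the example. `crop_bounds` is a hypothetical helper written for illustration.

```python
def crop_bounds(poly):
    """Return (x_0, y_0, x_1, y_1), the bounding box of a list of [x, y] points."""
    xs = [int(x) for x, _ in poly]
    ys = [int(y) for _, y in poly]
    return min(xs), min(ys), max(xs), max(ys)


poly = [[100, 400], [400, 400], [800, 900], [400, 1100], [100, 800]]
x_0, y_0, x_1, y_1 = crop_bounds(poly)
print((y_1 - y_0, x_1 - x_0))  # height and width of the cropped output
```

For this polygon the cropped result spans 700 by 700 pixels, provided the polygon lies entirely inside the image, while `crop=False` keeps the original height and width.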

Source code in zerohertzLib/vision/transform.py

def cutout(
    img: NDArray[np.uint8],
    poly: list[int | float] | NDArray[DTypeLike],
    alpha: int = 255,
    crop: bool = True,
    background: int = 0,
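) -> NDArray[np.uint8]:
    """Make everything outside the specified coordinates in the image transparent

    Args:
        img: Input image (`[H, W, C]`)
        poly: Coordinates of the region to keep (`[N, 2]`)
        alpha: Alpha (transparency) value of the specified region
        crop: Whether to crop the output image
        background: Alpha (transparency) value of the background outside the specified coordinates

    Returns:
        Output image (`[H, W, 4]`)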

    Examples:
        >>> poly = np.array([[100, 400], [400, 400], [800, 900], [400, 1100], [100, 800]])
        >>> res1 = zz.vision.cutout(img, poly)
        >>> res2 = zz.vision.cutout(img, poly, 128, False)
        >>> res3 = zz.vision.cutout(img, poly, background=128)

        ![Image cutout example](../../../assets/vision/cutout.png){ width="600" }
    """
    shape = img.shape[:2]
    poly = _list2np(poly)
    poly = poly.astype(np.int32)
    x_0, x_1 = poly[:, 0].min(), poly[:, 0].max()
    y_0, y_1 = poly[:, 1].min(), poly[:, 1].max()
    mask = poly2mask(poly, shape)
    if background == 0:
        mask = (mask * alpha).astype(np.uint8)
    else:
        mask = mask.astype(np.uint8)
        mask[mask == 0] = background
        mask[mask == 1] = alpha
    img = Image.fromarray(img)
    mask = Image.fromarray(mask)
    img.putalpha(mask)
    if crop:
        return np.array(img)[y_0:y_1, x_0:x_1, :]
    return np.array(img)

cwh2poly ¶

cwh2poly(box: list[int | float] | NDArray[DTypeLike]) -> NDArray[DTypeLike]

Bbox conversion

Parameters:

Name	Type	Description	Default
`box`	`list[int \| float] \| NDArray[DTypeLike]`	Bbox in `[cx, cy, w, h]` format (`[4]` or `[N, 4]`)	required

Returns:

Type	Description
`NDArray[DTypeLike]`	Bbox in `[[x0, y0], [x1, y1], [x2, y2], [x3, y3]]` format (`[4, 2]` or `[N, 4, 2]`)

Examples:

>>> zz.vision.cwh2poly([20, 30, 20, 20])
array([[10, 20],
       [30, 20],
       [30, 40],
       [10, 40]])
>>> zz.vision.cwh2poly(np.array([[20, 30, 20, 20], [50, 75, 40, 50]]))
array([[[ 10,  20],
        [ 30,  20],
        [ 30,  40],
        [ 10,  40]],
       [[ 30,  50],
        [ 70,  50],
        [ 70, 100],
        [ 30, 100]]])
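The arithmetic behind this conversion is small enough to write out in plain Python. `cwh_to_poly` below is a hypothetical list-based re-implementation for a single box, not the library function, and it returns floats rather than a NumPy array.

```python
def cwh_to_poly(box):
    """Convert [cx, cy, w, h] to four corners, starting at the top-left."""
    cx, cy, w, h = box
    x0, y0 = cx - w / 2, cy - h / 2
    x1, y1 = cx + w / 2, cy + h / 2
    # Corner order: top-left, top-right, bottom-right, bottom-left
    return [[x0, y0], [x1, y0], [x1, y1], [x0, y1]]


print(cwh_to_poly([20, 30, 20, 20]))
```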

Source code in zerohertzLib/vision/convert.py

def cwh2poly(
    box: list[int | float] | NDArray[DTypeLike],
) -> NDArray[DTypeLike]:
    """Bbox conversion

    Args:
        box: Bbox in `[cx, cy, w, h]` format (`[4]` or `[N, 4]`)

    Returns:
        Bbox in `[[x0, y0], [x1, y1], [x2, y2], [x3, y3]]` format (`[4, 2]` or `[N, 4, 2]`)

    Examples:
        >>> zz.vision.cwh2poly([20, 30, 20, 20])
        array([[10, 20],
               [30, 20],
               [30, 40],
               [10, 40]])
        >>> zz.vision.cwh2poly(np.array([[20, 30, 20, 20], [50, 75, 40, 50]]))
        array([[[ 10,  20],
                [ 30,  20],
                [ 30,  40],
                [ 10,  40]],
               [[ 30,  50],
                [ 70,  50],
                [ 70, 100],
                [ 30, 100]]])
    """
    box = _list2np(box)
    shape = box.shape
    multi, poly = _is_bbox(shape)
    if poly:
        raise ValueError("The 'cwh' must be of shape [4], [N, 4]")
    if multi:
        boxes = np.zeros((shape[0], 4, 2), dtype=box.dtype)
        for i, box_ in enumerate(box):
            boxes[i] = _cwh2poly(box_)
        return boxes
    return _cwh2poly(box)

cwh2xyxy ¶

cwh2xyxy(box: list[int | float] | NDArray[DTypeLike]) -> NDArray[DTypeLike]

Bbox conversion

Parameters:

Name	Type	Description	Default
`box`	`list[int \| float] \| NDArray[DTypeLike]`	Bbox in `[cx, cy, w, h]` format (`[4]` or `[N, 4]`)	required

Returns:

Type	Description
`NDArray[DTypeLike]`	Bbox in `[x0, y0, x1, y1]` format (`[4]` or `[N, 4]`)

Examples:

>>> zz.vision.cwh2xyxy([20, 30, 20, 20])
array([10, 20, 30, 40])
>>> zz.vision.cwh2xyxy(np.array([[20, 30, 20, 20], [50, 75, 40, 50]]))
array([[ 10,  20,  30,  40],
       [ 30,  50,  70, 100]])
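For the batched `[N, 4]` case, the same idea applies row by row. `cwh_to_xyxy` below is a hypothetical list-based sketch rather than the library function.

```python
def cwh_to_xyxy(boxes):
    """Convert a list of [cx, cy, w, h] boxes to [x0, y0, x1, y1]."""
    return [[cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2] for cx, cy, w, h in boxes]


print(cwh_to_xyxy([[20, 30, 20, 20], [50, 75, 40, 50]]))
```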

Source code in zerohertzLib/vision/convert.py

def cwh2xyxy(
    box: list[int | float] | NDArray[DTypeLike],
) -> NDArray[DTypeLike]:
    """Bbox conversion

    Args:
        box: Bbox in `[cx, cy, w, h]` format (`[4]` or `[N, 4]`)

    Returns:
        Bbox in `[x0, y0, x1, y1]` format (`[4]` or `[N, 4]`)

    Examples:
        >>> zz.vision.cwh2xyxy([20, 30, 20, 20])
        array([10, 20, 30, 40])
        >>> zz.vision.cwh2xyxy(np.array([[20, 30, 20, 20], [50, 75, 40, 50]]))
        array([[ 10,  20,  30,  40],
               [ 30,  50,  70, 100]])
    """
    box = _list2np(box)
    shape = box.shape
    multi, poly = _is_bbox(shape)
    if poly:
        raise ValueError("The 'cwh' must be of shape [4], [N, 4]")
    if multi:
        boxes = np.zeros((shape[0], 4), dtype=box.dtype)
        for i, box_ in enumerate(box):
            boxes[i] = _cwh2xyxy(box_)
        return boxes
    return _cwh2xyxy(box)

evaluation ¶

evaluation(ground_truths: NDArray[DTypeLike], inferences: NDArray[DTypeLike], confidences: list[float], gt_classes: list[str] | None = None, inf_classes: list[str] | None = None, file_name: str | None = None, threshold: float = 0.5) -> DataFrame

Evaluate the inference performance of a detection model on a single image

Parameters:

Name	Type	Description	Default
`ground_truths`	`NDArray[DTypeLike]`	Polygons of the ground truth objects (`[N, 4, 2]`, `[[[x_0, y_0], [x_1, y_1], ...], ...]`)	required
`inferences`	`NDArray[DTypeLike]`	Polygons of the objects inferred by the model (`[M, 4, 2]`, `[[[x_0, y_0], [x_1, y_1], ...], ...]`)	required
`confidences`	`list[float]`	Confidence of each object inferred by the model (`[M]`)	required
`gt_classes`	`list[str] \| None`	Classes of the ground truth objects (`[N]`)	`None`
`inf_classes`	`list[str] \| None`	Classes of the objects inferred by the model (`[M]`)	`None`
`file_name`	`str \| None`	Name of the evaluated image	`None`
`threshold`	`float`	IoU threshold	`0.5`

Note

N: Number of objects in the ground truth of one image
M: Number of objects in the inference result of one image

Returns:

Type	Description
`DataFrame`	Model performance evaluation result for a single image

Examples:

>>> poly = np.array([[0, 0], [10, 0], [10, 10], [0, 10]])
>>> ground_truths = np.array([poly, poly + 20, poly + 40])
>>> inferences = np.array([poly, poly + 19, poly + 80])
>>> confidences = np.array([0.6, 0.7, 0.8])
>>> zz.vision.evaluation(ground_truths, inferences, confidences, file_name="test.png")
  file_name  instance  confidence  class       IoU results  gt_x0  gt_y0  gt_x1  gt_y1  gt_x2  gt_y2  gt_x3  gt_y3  inf_x0  inf_y0  inf_x1  inf_y1  inf_x2  inf_y2  inf_x3  inf_y3
0  test.png         0         0.8    0.0  0.000000      FP    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    80.0    80.0    90.0    80.0    90.0    90.0    80.0    90.0
1  test.png         1         0.7    0.0  0.680672      TP   20.0   20.0   30.0   20.0   30.0   30.0   20.0   30.0    19.0    19.0    29.0    19.0    29.0    29.0    19.0    29.0
2  test.png         2         0.6    0.0  1.000000      TP    0.0    0.0   10.0    0.0   10.0   10.0    0.0   10.0     0.0     0.0    10.0     0.0    10.0    10.0     0.0    10.0
3  test.png         3         0.0    0.0  0.000000      FN   40.0   40.0   50.0   40.0   50.0   50.0   40.0   50.0     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN
>>> gt_classes = np.array(["cat", "dog", "cat"])
>>> inf_classes = np.array(["cat", "dog", "cat"])
>>> zz.vision.evaluation(ground_truths, inferences, confidences, gt_classes, inf_classes)
   instance  confidence class       IoU results  gt_x0  gt_y0  gt_x1  gt_y1  gt_x2  gt_y2  gt_x3  gt_y3  inf_x0  inf_y0  inf_x1  inf_y1  inf_x2  inf_y2  inf_x3  inf_y3
0         0         0.8   cat  0.000000      FP    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    80.0    80.0    90.0    80.0    90.0    90.0    80.0    90.0
1         1         0.6   cat  1.000000      TP    0.0    0.0   10.0    0.0   10.0   10.0    0.0   10.0     0.0     0.0    10.0     0.0    10.0    10.0     0.0    10.0
2         2         0.0   cat  0.000000      FN   40.0   40.0   50.0   40.0   50.0   50.0   40.0   50.0     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN
3         3         0.7   dog  0.680672      TP   20.0   20.0   30.0   20.0   30.0   30.0   20.0   30.0    19.0    19.0    29.0    19.0    29.0    29.0    19.0    29.0

Source code in zerohertzLib/vision/eval.py

def evaluation(
    ground_truths: NDArray[DTypeLike],
    inferences: NDArray[DTypeLike],
    confidences: list[float],
    gt_classes: list[str] | None = None,
    inf_classes: list[str] | None = None,
    file_name: str | None = None,
    threshold: float = 0.5,
) -> pd.DataFrame:
    """Evaluate the inference performance of a detection model on a single image

    Args:
        ground_truths: Polygons of the ground truth objects (`[N, 4, 2]`, `[[[x_0, y_0], [x_1, y_1], ...], ...]`)
        inferences: Polygons of the objects predicted by the model (`[M, 4, 2]`, `[[[x_0, y_0], [x_1, y_1], ...], ...]`)
        confidences: Confidence of each object predicted by the model (`[M]`)
        gt_classes: Classes of the ground truth objects (`[N]`)
        inf_classes: Classes of the objects predicted by the model (`[M]`)
        file_name: Name of the evaluated image
        threshold: IoU threshold

    Note:
        - `N`: number of objects in the ground truth of one image
        - `M`: number of objects in the inference results of one image

        ![Model evaluation visualization](../../../assets/vision/evaluation.png){ width="600" }

    Returns:
        Model performance evaluation results for a single image

    Examples:
        >>> poly = np.array([[0, 0], [10, 0], [10, 10], [0, 10]])
        >>> ground_truths = np.array([poly, poly + 20, poly + 40])
        >>> inferences = np.array([poly, poly + 19, poly + 80])
        >>> confidences = np.array([0.6, 0.7, 0.8])
        >>> zz.vision.evaluation(ground_truths, inferences, confidences, file_name="test.png")
          file_name  instance  confidence  class       IoU results  gt_x0  gt_y0  gt_x1  gt_y1  gt_x2  gt_y2  gt_x3  gt_y3  inf_x0  inf_y0  inf_x1  inf_y1  inf_x2  inf_y2  inf_x3  inf_y3
        0  test.png         0         0.8    0.0  0.000000      FP    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    80.0    80.0    90.0    80.0    90.0    90.0    80.0    90.0
        1  test.png         1         0.7    0.0  0.680672      TP   20.0   20.0   30.0   20.0   30.0   30.0   20.0   30.0    19.0    19.0    29.0    19.0    29.0    29.0    19.0    29.0
        2  test.png         2         0.6    0.0  1.000000      TP    0.0    0.0   10.0    0.0   10.0   10.0    0.0   10.0     0.0     0.0    10.0     0.0    10.0    10.0     0.0    10.0
        3  test.png         3         0.0    0.0  0.000000      FN   40.0   40.0   50.0   40.0   50.0   50.0   40.0   50.0     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN
        >>> gt_classes = np.array(["cat", "dog", "cat"])
        >>> inf_classes = np.array(["cat", "dog", "cat"])
        >>> zz.vision.evaluation(ground_truths, inferences, confidences, gt_classes, inf_classes)
           instance  confidence class       IoU results  gt_x0  gt_y0  gt_x1  gt_y1  gt_x2  gt_y2  gt_x3  gt_y3  inf_x0  inf_y0  inf_x1  inf_y1  inf_x2  inf_y2  inf_x3  inf_y3
        0         0         0.8   cat  0.000000      FP    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    80.0    80.0    90.0    80.0    90.0    90.0    80.0    90.0
        1         1         0.6   cat  1.000000      TP    0.0    0.0   10.0    0.0   10.0   10.0    0.0   10.0     0.0     0.0    10.0     0.0    10.0    10.0     0.0    10.0
        2         2         0.0   cat  0.000000      FN   40.0   40.0   50.0   40.0   50.0   50.0   40.0   50.0     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN
        3         3         0.7   dog  0.680672      TP   20.0   20.0   30.0   20.0   30.0   30.0   20.0   30.0    19.0    19.0    29.0    19.0    29.0    29.0    19.0    29.0
    """
    logs = defaultdict(list)
    if gt_classes is None and inf_classes is None:
        gt_classes = np.zeros(len(ground_truths))
        inf_classes = np.zeros(len(inferences))
    instance = 0
    for cls in set(gt_classes).union(set(inf_classes)):
        cls_gt = ground_truths[np.where(gt_classes == cls)]
        cls_inf = inferences[np.where(inf_classes == cls)]
        cls_conf = confidences[np.where(inf_classes == cls)]
        sorted_indices = np.argsort(-cls_conf)
        cls_inf = cls_inf[sorted_indices]
        cls_conf = cls_conf[sorted_indices]
        matched = set()
        for confidence, inf in zip(cls_conf, cls_inf):
            best_iou = 0
            best_gt_idx = -1
            for gt_idx, gt in enumerate(cls_gt):
                if gt_idx in matched:
                    continue
                iou_ = iou(gt, inf)
                if iou_ > best_iou:
                    best_iou = iou_
                    best_gt_idx = gt_idx
            if best_iou >= threshold:
                matched.add(best_gt_idx)
                _append(
                    logs,
                    instance,
                    confidence,
                    cls,
                    best_iou,
                    "TP",
                    cls_gt[best_gt_idx],
                    inf,
                )
                instance += 1
            else:
                _append(logs, instance, confidence, cls, 0.0, "FP", None, inf)
                instance += 1
        for gt_idx, gt in enumerate(cls_gt):
            if gt_idx not in matched:
                _append(logs, instance, 0.0, cls, 0.0, "FN", gt, None)
                instance += 1
    logs = pd.DataFrame(logs)
    if file_name is not None:
        logs["file_name"] = file_name
        logs = logs[["file_name"] + [col for col in logs.columns if col != "file_name"]]
    return logs
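The matching loop in the source above can be sketched without NumPy or Shapely. This is a minimal illustration, not the library's implementation: it assumes axis-aligned boxes in `xyxy` form instead of polygons, and the names `box_iou` and `greedy_match` are hypothetical helpers introduced only for this example.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x0, y0, x1, y1)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0


def greedy_match(gts, infs, confs, threshold=0.5):
    """Label each prediction TP/FP and each unmatched ground truth FN."""
    # Visit predictions from the highest confidence to the lowest
    order = sorted(range(len(infs)), key=lambda i: -confs[i])
    matched, results = set(), []
    for i in order:
        best_iou, best_gt = 0.0, -1
        for g, gt in enumerate(gts):
            if g in matched:  # each ground truth can be matched only once
                continue
            value = box_iou(gt, infs[i])
            if value > best_iou:
                best_iou, best_gt = value, g
        # The explicit best_gt check is an extra guard added in this sketch
        if best_gt >= 0 and best_iou >= threshold:
            matched.add(best_gt)
            results.append((confs[i], "TP", best_iou))
        else:
            results.append((confs[i], "FP", 0.0))
    # Ground truths that no prediction claimed are false negatives
    results += [(0.0, "FN", 0.0) for g in range(len(gts)) if g not in matched]
    return results


gts = [(0, 0, 10, 10), (20, 20, 30, 30), (40, 40, 50, 50)]
infs = [(0, 0, 10, 10), (19, 19, 29, 29), (80, 80, 90, 90)]
results = greedy_match(gts, infs, [0.6, 0.7, 0.8])
print([r[1] for r in results])  # ['FP', 'TP', 'TP', 'FN']
```

The boxes mirror the squares used in the first example, so the labels come out in the same order as the `results` column shown there.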

grid ¶

grid(imgs: list[NDArray[uint8]], size: int = 1000, color: tuple[int, int, int] = (255, 255, 255), file_name: str = 'tmp') -> None

Merge multiple images into a single square image

Parameters:

Name	Type	Description	Default
`imgs`	`list[NDArray[uint8]]`	Input images	required
`size`	`int`	Size of the output image	`1000`
`color`	`tuple[int, int, int]`	Padding color	`(255, 255, 255)`
`file_name`	`str`	Name of the saved file	`'tmp'`

Returns:

Type	Description
`None`	The image is saved directly to the current directory

Examples:

>>> imgs = [cv2.resize(img, (random.randrange(300, 1000), random.randrange(300, 1000))) for _ in range(8)]
>>> imgs[2] = cv2.cvtColor(imgs[2], cv2.COLOR_BGR2GRAY)
>>> imgs[3] = cv2.cvtColor(imgs[3], cv2.COLOR_BGR2BGRA)
>>> zz.vision.grid(imgs)
>>> zz.vision.grid(imgs, color=(0, 255, 0))
>>> zz.vision.grid(imgs, color=(0, 0, 0, 0))

Source code in zerohertzLib/vision/compare.py

def grid(
    imgs: list[NDArray[np.uint8]],
    size: int = 1000,
    color: tuple[int, int, int] = (255, 255, 255),
    file_name: str = "tmp",
) -> None:
    """Merge multiple images into a single square image

    Args:
        imgs: Input images
        size: Size of the output image
        color: Padding color
        file_name: Name of the saved file

    Returns:
        The image is saved directly to the current directory

    Examples:
        >>> imgs = [cv2.resize(img, (random.randrange(300, 1000), random.randrange(300, 1000))) for _ in range(8)]
        >>> imgs[2] = cv2.cvtColor(imgs[2], cv2.COLOR_BGR2GRAY)
        >>> imgs[3] = cv2.cvtColor(imgs[3], cv2.COLOR_BGR2BGRA)
        >>> zz.vision.grid(imgs)
        >>> zz.vision.grid(imgs, color=(0, 255, 0))
        >>> zz.vision.grid(imgs, color=(0, 0, 0, 0))

        ![Image grid example](../../../assets/vision/grid.png){ width="600" }
    """
    cnt = math.ceil(math.sqrt(len(imgs)))
    length = size // cnt
    size = int(length * cnt)
    palette = np.full((size, size, 4), 0, dtype=np.uint8)
    for idx, img in enumerate(imgs):
        d_y, d_x = divmod(idx, cnt)
        x_0, y_0, x_1, y_1 = (
            d_x * length,
            d_y * length,
            (d_x + 1) * length,
            (d_y + 1) * length,
        )
        img = _cvt_bgra(img)
        palette[y_0:y_1, x_0:x_1, :], _ = pad(img, (length, length), color)
    cv2.imwrite(f"{file_name}.png", palette)
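The cell layout computed in the source above can be checked in isolation. The sketch below reproduces only the index arithmetic (`ceil(sqrt(n))` cells per side, integer cell length, `divmod` for row and column); `grid_cells` is a hypothetical helper name and no image data is involved.

```python
import math


def grid_cells(n_images, size=1000):
    """Return cells per side, cell length and the (x0, y0, x1, y1) of each cell."""
    cnt = math.ceil(math.sqrt(n_images))  # cells per row and per column
    length = size // cnt                  # side length of one cell in pixels
    cells = []
    for idx in range(n_images):
        d_y, d_x = divmod(idx, cnt)       # row and column of image idx
        cells.append(
            (d_x * length, d_y * length, (d_x + 1) * length, (d_y + 1) * length)
        )
    return cnt, length, cells


cnt, length, cells = grid_cells(8)
print(cnt, length)  # 3 333
print(cells[0])     # (0, 0, 333, 333)
print(cells[7])     # (333, 666, 666, 999)
```

With 8 images and `size=1000` the canvas is therefore 999 x 999 pixels, since the size is rounded down to a multiple of the cell length, and the ninth cell stays empty.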

img2gif ¶

img2gif(path: str, file_name: str = 'tmp', duration: int = 500) -> None

Convert the images in a directory to a GIF

Parameters:

Name	Type	Description	Default
`path`	`str`	Path containing the images to convert to a GIF	required
`file_name`	`str`	Name of the output GIF file	`'tmp'`
`duration`	`int`	Interval between frames, in ms	`500`

Returns:

Type	Description
`None`	The GIF is saved directly to the current directory

Examples:

>>> zz.vision.img2gif("./")

Source code in zerohertzLib/vision/gif.py

def img2gif(
    path: str,
    file_name: str = "tmp",
    duration: int = 500,
) -> None:
    """Convert the images in a directory to a GIF

    Args:
        path: Path containing the images to convert to a GIF
        file_name: Name of the output GIF file
        duration: Interval between frames, in ms

    Returns:
        The GIF is saved directly to the current directory

    Examples:
        >>> zz.vision.img2gif("./")

        ![Images to GIF conversion example](../../../assets/vision/img2gif.gif){ width="200" }
    """
    ext = (
        "jpg",
        "JPG",
        "jpeg",
        "JPEG",
        "png",
        "PNG",
        "tif",
        "TIF",
        "tiff",
        "TIFF",
    )
    image_files = [f for f in os.listdir(path) if f.endswith(ext)]
    image_files.sort()
    images = [Image.open(os.path.join(path, image_file)) for image_file in image_files]
    _create_gif_from_frames(images, file_name, duration)
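The frame selection in the source above relies on `str.endswith` with a tuple of suffixes followed by a plain lexicographic sort. A small sketch of that step alone, with `select_frames` as a hypothetical helper name and made-up file names:

```python
# Same suffix tuple as in the source: only all-lowercase or all-uppercase forms
EXT = ("jpg", "JPG", "jpeg", "JPEG", "png", "PNG", "tif", "TIF", "tiff", "TIFF")


def select_frames(file_names):
    """Keep names ending in a known image suffix, in lexicographic order."""
    return sorted(name for name in file_names if name.endswith(EXT))


names = ["b.png", "notes.txt", "a.JPG", "c.tiff", "d.Png"]
print(select_frames(names))                          # ['a.JPG', 'b.png', 'c.tiff']
print(select_frames(["10.png", "2.png", "1.png"]))   # ['1.png', '10.png', '2.png']
```

Two consequences follow from this sketch: a mixed-case suffix such as `.Png` is skipped, and frames named with unpadded numbers are ordered as strings, so zero-padded names (`01.png`, `02.png`, ...) give the expected frame order.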

iou ¶

iou(poly1: NDArray[DTypeLike], poly2: NDArray[DTypeLike]) -> float

Compute the IoU (Intersection over Union) of two polygons

Parameters:

Name	Type	Description	Default
`poly1`	`NDArray[DTypeLike]`	Polygon to compute the IoU for (`[S1, 2]`, `[[x_0, y_0], [x_1, y_1], ...]`)	required
`poly2`	`NDArray[DTypeLike]`	Polygon to compute the IoU for (`[S2, 2]`, `[[x_0, y_0], [x_1, y_1], ...]`)	required

Returns:

Type	Description
`float`	IoU value

Examples:

>>> poly1 = np.array([[0, 0], [10, 0], [10, 10], [0, 10]])
>>> poly2 = poly1 + (5, 0)
>>> poly2
array([[ 5,  0],
       [15,  0],
       [15, 10],
       [ 5, 10]])
>>> zz.vision.iou(poly1, poly2)
0.3333333333333333

Source code in zerohertzLib/vision/eval.py

def iou(poly1: NDArray[DTypeLike], poly2: NDArray[DTypeLike]) -> float:
    """Compute the IoU (Intersection over Union) of two polygons

    Args:
        poly1: Polygon to compute the IoU for (`[S1, 2]`, `[[x_0, y_0], [x_1, y_1], ...]`)
        poly2: Polygon to compute the IoU for (`[S2, 2]`, `[[x_0, y_0], [x_1, y_1], ...]`)

    Returns:
        IoU value

    Examples:
        >>> poly1 = np.array([[0, 0], [10, 0], [10, 10], [0, 10]])
        >>> poly2 = poly1 + (5, 0)
        >>> poly2
        array([[ 5,  0],
               [15,  0],
               [15, 10],
               [ 5, 10]])
        >>> zz.vision.iou(poly1, poly2)
        0.3333333333333333
    """
    polygon1 = Polygon(poly1)
    polygon2 = Polygon(poly2)
    return polygon1.intersection(polygon2).area / polygon1.union(polygon2).area
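For axis-aligned rectangles the same ratio can be verified by hand. The sketch below is not how the library computes it (the source uses Shapely polygons); it assumes `xyxy` rectangles and uses the hypothetical helper `rect_iou` to reproduce the arithmetic of the example: intersection 5 x 10 = 50, union 100 + 100 - 50 = 150.

```python
def rect_iou(a, b):
    """IoU of two axis-aligned rectangles given as (x0, y0, x1, y1)."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter  # inclusion-exclusion
    return inter / union


# poly1 and poly2 from the example, written as xyxy rectangles
print(rect_iou((0, 0, 10, 10), (5, 0, 15, 10)))  # 0.3333333333333333
```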

is_pts_in_poly ¶

is_pts_in_poly(poly: NDArray[DTypeLike], pts: list[int | float] | NDArray[DTypeLike]) -> bool | NDArray[bool]

Check whether points lie inside a polygon

Parameters:

Name	Type	Description	Default
`poly`	`NDArray[DTypeLike]`	Polygon (`[N, 2]`)	required
`pts`	`list[int \| float] \| NDArray[DTypeLike]`	Point(s) to test (`[2]` or `[N, 2]`)	required

Returns:

Type	Description
`bool \| NDArray[bool]`	Whether the input `pts` lie inside the polygon `poly`

Examples:

>>> poly = np.array([[10, 10], [20, 10], [30, 40], [20, 60], [10, 20]])
>>> zz.vision.is_pts_in_poly(poly, [20, 20])
True
>>> zz.vision.is_pts_in_poly(poly, [[20, 20], [100, 100]])
array([ True, False])
>>> zz.vision.is_pts_in_poly(poly, np.array([20, 20]))
True
>>> zz.vision.is_pts_in_poly(poly, np.array([[20, 20], [100, 100]]))
array([ True, False])

Source code in zerohertzLib/vision/util.py

def is_pts_in_poly(
    poly: NDArray[DTypeLike], pts: list[int | float] | NDArray[DTypeLike]
) -> bool | NDArray[bool]:
    """Check whether points lie inside a polygon

    Args:
        poly: Polygon (`[N, 2]`)
        pts: Point(s) to test (`[2]` or `[N, 2]`)

    Returns:
        Whether the input `pts` lie inside the polygon `poly`

    Examples:
        >>> poly = np.array([[10, 10], [20, 10], [30, 40], [20, 60], [10, 20]])
        >>> zz.vision.is_pts_in_poly(poly, [20, 20])
        True
        >>> zz.vision.is_pts_in_poly(poly, [[20, 20], [100, 100]])
        array([ True, False])
        >>> zz.vision.is_pts_in_poly(poly, np.array([20, 20]))
        True
        >>> zz.vision.is_pts_in_poly(poly, np.array([[20, 20], [100, 100]]))
        array([ True, False])
    """
    poly = Path(poly)
    if isinstance(pts, list):
        if isinstance(pts[0], list):
            return poly.contains_points(pts)
        return poly.contains_point(pts)
    if isinstance(pts, np.ndarray):
        shape = pts.shape
        if len(shape) == 1:
            return poly.contains_point(pts)
        if len(shape) == 2:
            return poly.contains_points(pts)
        raise ValueError("The 'pts' must be of shape [2], [N, 2]")
    raise TypeError("The 'pts' must be 'list' or 'np.ndarray'")
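The source above delegates the test to `matplotlib.path.Path`. To show what such a test does, here is a pure-Python even-odd (ray casting) sketch. It is a swapped-in technique for illustration, `point_in_polygon` is a hypothetical name, and points exactly on the boundary may be classified differently from matplotlib.

```python
def point_in_polygon(poly, pt):
    """Even-odd rule: count edge crossings of a ray going right from pt."""
    x, y = pt
    inside = False
    j = len(poly) - 1  # index of the previous vertex
    for i in range(len(poly)):
        xi, yi = poly[i]
        xj, yj = poly[j]
        # The edge (j -> i) straddles the horizontal line through pt
        if (yi > y) != (yj > y):
            x_cross = xi + (y - yi) / (yj - yi) * (xj - xi)
            if x < x_cross:  # crossing lies to the right of pt
                inside = not inside
        j = i
    return inside


poly = [(10, 10), (20, 10), (30, 40), (20, 60), (10, 20)]
print(point_in_polygon(poly, (20, 20)))    # True
print(point_in_polygon(poly, (100, 100)))  # False
```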

mask ¶

mask(img: NDArray[uint8], mks: NDArray[bool] | None = None, poly: list[int | float] | NDArray[DTypeLike] | list[NDArray[DTypeLike]] | None = None, color: tuple[int, int, int] = (0, 0, 255), class_list: list[int | str] | None = None, class_color: dict[int | str, tuple[int, int, int]] | None = None, border: bool = True, alpha: float = 0.5) -> NDArray[uint8]

Visualize masks on an image

Parameters:

Name	Type	Description	Default
`img`	`NDArray[uint8]`	Input image (`[H, W, C]`)	required
`mks`	`NDArray[bool] \| None`	Mask to overlay on the input image (`[H, W]` or `[N, H, W]`)	`None`
`poly`	`list[int \| float] \| NDArray[DTypeLike] \| list[NDArray[DTypeLike]] \| None`	Polygon to overlay on the input image as a mask (`[M, 2]` or `[N, M, 2]`)	`None`
`color`	`tuple[int, int, int]`	Mask color	`(0, 0, 255)`
`class_list`	`list[int \| str] \| None`	Class for each index of `mks`	`None`
`class_color`	`dict[int \| str, tuple[int, int, int]] \| None`	Color for each class (overrides `color`)	`None`
`border`	`bool`	Whether to draw the mask border	`True`
`alpha`	`float`	Blend weight (opacity) of the mask overlay	`0.5`

Returns:

Type	Description
`NDArray[uint8]`	Visualization result (`[H, W, C]`)

Examples:

Mask:

>>> H, W, _ = img.shape
>>> cnt = 30
>>> mks = np.zeros((cnt, H, W), np.uint8)
>>> for mks_ in mks:
>>>     center_x = random.randint(0, W)
>>>     center_y = random.randint(0, H)
>>>     radius = random.randint(30, 200)
>>>     cv2.circle(mks_, (center_x, center_y), radius, (True), -1)
>>> mks = mks.astype(bool)
>>> res1 = zz.vision.mask(img, mks)

Mask:

>>> cls = [i for i in range(cnt)]
>>> class_list = [cls[random.randint(0, 5)] for _ in range(cnt)]
>>> class_color = {}
>>> for c in cls:
>>>     class_color[c] = [random.randint(0, 255) for _ in range(3)]
>>> res2 = zz.vision.mask(img, mks, class_list=class_list, class_color=class_color)

Poly:

>>> poly = np.array([[100, 400], [400, 400], [800, 900], [400, 1100], [100, 800]])
>>> res3 = zz.vision.mask(img, poly=poly)

Poly:

>>> poly = zz.vision.xyxy2poly(zz.vision.poly2xyxy((np.random.rand(cnt, 4, 2) * (W, H))))
>>> res4 = zz.vision.mask(img, poly=poly, class_list=class_list, class_color=class_color)

Source code in zerohertzLib/vision/visual.py

def mask(
    img: NDArray[np.uint8],
    mks: NDArray[bool] | None = None,
    poly: (
        list[int | float] | NDArray[DTypeLike] | list[NDArray[DTypeLike]] | None
    ) = None,
    color: tuple[int, int, int] = (0, 0, 255),
    class_list: list[int | str] | None = None,
    class_color: dict[int | str, tuple[int, int, int]] | None = None,
    border: bool = True,
    alpha: float = 0.5,
) -> NDArray[np.uint8]:
    """Visualize masks on an image

    Args:
        img: Input image (`[H, W, C]`)
        mks: Mask to overlay on the input image (`[H, W]` or `[N, H, W]`)
        poly: Polygon to overlay on the input image as a mask (`[M, 2]` or `[N, M, 2]`)
        color: Mask color
        class_list: Class for each index of `mks`
        class_color: Color for each class (overrides `color`)
        border: Whether to draw the mask border
        alpha: Blend weight (opacity) of the mask overlay

    Returns:
        Visualization result (`[H, W, C]`)

    Examples:
        Mask:
            ```python
            >>> H, W, _ = img.shape
            >>> cnt = 30
            >>> mks = np.zeros((cnt, H, W), np.uint8)
            >>> for mks_ in mks:
            >>>     center_x = random.randint(0, W)
            >>>     center_y = random.randint(0, H)
            >>>     radius = random.randint(30, 200)
            >>>     cv2.circle(mks_, (center_x, center_y), radius, (True), -1)
            >>> mks = mks.astype(bool)
            >>> res1 = zz.vision.mask(img, mks)
            ```
        Mask:
            ```python
            >>> cls = [i for i in range(cnt)]
            >>> class_list = [cls[random.randint(0, 5)] for _ in range(cnt)]
            >>> class_color = {}
            >>> for c in cls:
            >>>     class_color[c] = [random.randint(0, 255) for _ in range(3)]
            >>> res2 = zz.vision.mask(img, mks, class_list=class_list, class_color=class_color)
            ```
        Poly:
            ```python
            >>> poly = np.array([[100, 400], [400, 400], [800, 900], [400, 1100], [100, 800]])
            >>> res3 = zz.vision.mask(img, poly=poly)
            ```
        Poly:
            ```python
            >>> poly = zz.vision.xyxy2poly(zz.vision.poly2xyxy((np.random.rand(cnt, 4, 2) * (W, H))))
            >>> res4 = zz.vision.mask(img, poly=poly, class_list=class_list, class_color=class_color)
            ```

        ![Mask visualization example](../../../assets/vision/mask.png){ width="600" }
    """
    assert (mks is None) ^ (poly is None)
    shape = img.shape
    if len(shape) == 2:
        img = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
    elif shape[2] == 4 and len(color) == 3:
        color = (*color, 255)
        if class_list is not None and class_color is not None:
            for key, value in class_color.items():
                if len(value) == 3:
                    class_color[key] = [*value, 255]
    if poly is not None:
        mks = poly2mask(poly, (shape[:2]))
    shape = mks.shape
    overlay = img.copy()
    cumulative_mask = np.zeros(img.shape[:2], dtype=bool)
    if len(shape) == 2:
        overlay[mks] = color
        if border:
            edges = cv2.Canny(mks.astype(np.uint8) * 255, 100, 200)
            overlay[edges > 0] = color
    elif len(shape) == 3:
        for idx, mks_ in enumerate(mks):
            if class_list is not None and class_color is not None:
                color = class_color[class_list[idx]]
            overlapping = cumulative_mask & mks_
            non_overlapping = mks_ & ~cumulative_mask
            cumulative_mask |= mks_
            if overlapping.any():
                overlapping_color = overlay[overlapping].astype(np.float32)
                mixed_color = ((overlapping_color + color) / 2).astype(np.uint8)
                overlay[overlapping] = mixed_color
            if non_overlapping.any():
                overlay[non_overlapping] = color
            if border:
                edges = cv2.Canny(mks_.astype(np.uint8) * 255, 100, 200)
                overlay[edges > 0] = color
    else:
        raise ValueError("The 'mks' must be of shape [H, W] or [N, H, W]")
    return cv2.addWeighted(img, 1 - alpha, overlay, alpha, 0)
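The final `cv2.addWeighted(img, 1 - alpha, overlay, alpha, 0)` call above is a per-pixel weighted sum. The sketch below works it out for a single masked pixel in plain Python. `blend_pixel` is a hypothetical helper, and the use of `round` with clamping to 0..255 is an assumption that approximates OpenCV's saturating conversion.

```python
def blend_pixel(pixel, color, alpha=0.5):
    """Weighted sum pixel * (1 - alpha) + color * alpha, clamped to uint8 range."""
    return tuple(
        min(255, max(0, round(p * (1 - alpha) + c * alpha)))
        for p, c in zip(pixel, color)
    )


# A BGR pixel covered by the default red mask color (0, 0, 255)
print(blend_pixel((100, 150, 201), (0, 0, 255)))       # (50, 75, 228)
print(blend_pixel((100, 150, 201), (0, 0, 255), 1.0))  # (0, 0, 255)
```

Pixels outside the mask are identical in `img` and `overlay`, so the weighted sum leaves them unchanged. A larger `alpha` makes the mask color more dominant.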

meanap ¶

meanap(logs: DataFrame) -> tuple[float, dict[str, float]]

Visualize the P-R curve of a detection model and compute mAP

Parameters:

Name	Type	Description	Default
`logs`	`DataFrame`	Results produced by the `zz.vision.evaluation` function	required

Returns:

Type	Description
`tuple[float, dict[str, float]]`	mAP value and per-class AP values (the plots are saved to the current directory as `prc_curve.png` and `pr_curve.png`)

Examples:

>>> logs1 = zz.vision.evaluation(ground_truths_1, inferences_1, confidences_1, gt_classes, inf_classes, file_name="test_1.png")
>>> logs2 = zz.vision.evaluation(ground_truths_2, inferences_2, confidences_2, gt_classes, inf_classes, file_name="test_2.png")
>>> logs = pd.concat([logs1, logs2], ignore_index=True)
>>> zz.vision.meanap(logs)
(0.7030629916206652, defaultdict(<class 'float'>, {'dog': 0.7177078883735305, 'cat': 0.6884180948677999}))

Source code in zerohertzLib/vision/eval.py

def meanap(logs: pd.DataFrame) -> tuple[float, dict[str, float]]:
    """Visualize the P-R curve of a detection model and compute mAP

    Args:
        logs: Results produced by the `zz.vision.evaluation` function

    Returns:
        mAP value and per-class AP values (the plots are saved to the current directory as `prc_curve.png` and `pr_curve.png`)

    Examples:
        >>> logs1 = zz.vision.evaluation(ground_truths_1, inferences_1, confidences_1, gt_classes, inf_classes, file_name="test_1.png")
        >>> logs2 = zz.vision.evaluation(ground_truths_2, inferences_2, confidences_2, gt_classes, inf_classes, file_name="test_2.png")
        >>> logs = pd.concat([logs1, logs2], ignore_index=True)
        >>> zz.vision.meanap(logs)
        (0.7030629916206652, defaultdict(<class 'float'>, {'dog': 0.7177078883735305, 'cat': 0.6884180948677999}))

        ![Mean Average Precision curves](../../../assets/vision/meanap.png){ width="600" }
    """
    logs = logs.sort_values(by="confidence", ascending=False)
    confidence_per_cls = defaultdict(list)
    recall_per_cls = defaultdict(list)
    precision_per_cls = defaultdict(list)
    pr_curve = defaultdict(list)
    aps = defaultdict(float)
    classes = set(logs["class"])
    for cls in classes:
        gt = len(
            logs[
                (logs["class"] == cls)
                & ((logs["results"] == "TP") | (logs["results"] == "FN"))
            ]
        )
        for confidence in set(logs[logs["class"] == cls]["confidence"]):
            true_positive = len(
                logs[
                    (logs["class"] == cls)
                    & (logs["confidence"] >= confidence)
                    & (logs["results"] == "TP")
                ]
            )
            false_positive = len(
                logs[
                    (logs["class"] == cls)
                    & (logs["confidence"] >= confidence)
                    & (logs["results"] == "FP")
                ]
            )
            if true_positive + false_positive == 0:
                precision = 0
            else:
                precision = true_positive / (true_positive + false_positive)
            if gt == 0:
                recall = 0
            else:
                recall = true_positive / gt  # (true_positive + false_negative)
            pr_curve[cls].append((recall, precision))
            confidence_per_cls[cls].append(confidence)
            recall_per_cls[cls].append(recall)
            precision_per_cls[cls].append(precision)
        pr_curve[cls] = sorted(pr_curve[cls])
        pr_curve[cls].insert(0, (0, pr_curve[cls][0][1]))
        for i in range(1, len(pr_curve[cls])):
            recall_diff = pr_curve[cls][i][0] - pr_curve[cls][i - 1][0]
            precision_max = max(precision[1] for precision in pr_curve[cls][i:])
            aps[cls] += recall_diff * precision_max
    map_ = sum(aps.values()) / len(aps)
    _prc_curve(confidence_per_cls, recall_per_cls, precision_per_cls, classes)
    _pr_curve(pr_curve, classes, map_)
    return map_, aps
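The per-class AP computation in the source above can be followed on a tiny input. This sketch mirrors the same steps (one P-R point per distinct confidence, the curve extended to recall 0, precision taken as the maximum to the right) on plain tuples instead of a DataFrame. `average_precision` is a hypothetical helper, and the rows are taken from the class-aware `evaluation` example (cat: FP, TP, FN; dog: TP).

```python
def average_precision(rows):
    """rows: (confidence, result) pairs of one class, result in {"TP", "FP", "FN"}."""
    gt = sum(1 for _, result in rows if result in ("TP", "FN"))
    curve = []
    for threshold in {conf for conf, _ in rows}:
        tp = sum(1 for conf, result in rows if conf >= threshold and result == "TP")
        fp = sum(1 for conf, result in rows if conf >= threshold and result == "FP")
        precision = tp / (tp + fp) if tp + fp else 0
        recall = tp / gt if gt else 0
        curve.append((recall, precision))
    curve.sort()
    curve.insert(0, (0, curve[0][1]))  # extend the curve to recall 0
    ap = 0.0
    for i in range(1, len(curve)):
        recall_diff = curve[i][0] - curve[i - 1][0]
        # Interpolated precision: best precision at this recall or higher
        ap += recall_diff * max(p for _, p in curve[i:])
    return ap


cat_rows = [(0.8, "FP"), (0.6, "TP"), (0.0, "FN")]
dog_rows = [(0.7, "TP")]
aps = {"cat": average_precision(cat_rows), "dog": average_precision(dog_rows)}
print(aps)                           # {'cat': 0.25, 'dog': 1.0}
print(sum(aps.values()) / len(aps))  # 0.625
```

For the cat class only one of two ground truths is recovered, and at precision 0.5, so the area is 0.5 x 0.5 = 0.25. The mAP is the unweighted mean over classes.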

pad ¶

pad(img: NDArray[uint8], shape: tuple[int, int], color: tuple[int, int, int] = (255, 255, 255), poly: NDArray[DTypeLike] | None = None) -> tuple[NDArray[uint8], tuple[float, int, int] | NDArray[DTypeLike]]

Resize and pad the input image to the desired shape

Parameters:

Name	Type	Description	Default
`img`	`NDArray[uint8]`	Input image (`[H, W, C]`)	required
`shape`	`tuple[int, int]`	Output shape `(H, W)`	required
`color`	`tuple[int, int, int]`	Padding color	`(255, 255, 255)`
`poly`	`NDArray[DTypeLike] \| None`	Coordinates to transform according to the padding (`[N, 2]`)	`None`

Returns:

Type	Description
`tuple[NDArray[uint8], tuple[float, int, int] \| NDArray[DTypeLike]]`	Output image (`[H, W, C]`) and either the padding information or the transformed coordinates

Note

If poly is not given, (ratio, left, top) is returned, so coordinates can be transformed later as poly * ratio + (left, top)

Examples:

GRAY:

>>> img = cv2.cvtColor(img, cv2.COLOR_BGRA2GRAY)
>>> res1 = cv2.resize(img, (500, 1000))
>>> res1, _ = zz.vision.pad(res1, (1000, 1000), color=(0, 255, 0))

BGR:

>>> res2 = cv2.resize(img, (1000, 500))
>>> res2, _ = zz.vision.pad(res2, (1000, 1000))

BGRA:

>>> img = cv2.cvtColor(img, cv2.COLOR_BGR2BGRA)
>>> res3 = cv2.resize(img, (500, 1000))
>>> res3, _ = zz.vision.pad(res3, (1000, 1000), color=(0, 0, 255, 128))

Poly:

>>> poly = np.array([[100, 400], [400, 400], [800, 900], [400, 1100], [100, 800]])
>>> res4 = cv2.resize(img, (2000, 1000))
>>> res4 = zz.vision.bbox(res4, poly, color=(255, 0, 0), thickness=20)
>>> res4, poly = zz.vision.pad(res4, (1000, 1000), poly=poly)
>>> res4 = zz.vision.bbox(res4, poly, color=(0, 0, 255))

Transformation:

>>> poly = np.array([[100, 400], [400, 400], [800, 900], [400, 1100], [100, 800]])
>>> res5 = cv2.resize(img, (2000, 1000))
>>> res5 = zz.vision.bbox(res5, poly, color=(255, 0, 0), thickness=20)
>>> res5, info = zz.vision.pad(res5, (1000, 1000), color=(128, 128, 128))
>>> poly = poly * info[0] + info[1:]
>>> res5 = zz.vision.bbox(res5, poly, color=(0, 0, 255))

Source code in zerohertzLib/vision/transform.py

def pad(
    img: NDArray[np.uint8],
    shape: tuple[int, int],
    color: tuple[int, int, int] = (255, 255, 255),
    poly: NDArray[DTypeLike] | None = None,
) -> tuple[NDArray[np.uint8], tuple[float, int, int] | NDArray[DTypeLike]]:
    """Resize and pad the input image to the desired shape

    Args:
        img: Input image (`[H, W, C]`)
        shape: Output shape `(H, W)`
        color: Padding color
        poly: Coordinates to transform according to the padding (`[N, 2]`)

    Returns:
        Output image (`[H, W, C]`) and either the padding information or the transformed coordinates

    Note:
        If `poly` is not given, `(ratio, left, top)` is returned, so coordinates can be transformed later as `poly * ratio + (left, top)`

    Examples:
        GRAY:
            ```python
            >>> img = cv2.cvtColor(img, cv2.COLOR_BGRA2GRAY)
            >>> res1 = cv2.resize(img, (500, 1000))
            >>> res1, _ = zz.vision.pad(res1, (1000, 1000), color=(0, 255, 0))
            ```
        BGR:
            ```python
            >>> res2 = cv2.resize(img, (1000, 500))
            >>> res2, _ = zz.vision.pad(res2, (1000, 1000))
            ```
        BGRA:
            ```python
            >>> img = cv2.cvtColor(img, cv2.COLOR_BGR2BGRA)
            >>> res3 = cv2.resize(img, (500, 1000))
            >>> res3, _ = zz.vision.pad(res3, (1000, 1000), color=(0, 0, 255, 128))
            ```
        Poly:
            ```python
            >>> poly = np.array([[100, 400], [400, 400], [800, 900], [400, 1100], [100, 800]])
            >>> res4 = cv2.resize(img, (2000, 1000))
            >>> res4 = zz.vision.bbox(res4, poly, color=(255, 0, 0), thickness=20)
            >>> res4, poly = zz.vision.pad(res4, (1000, 1000), poly=poly)
            >>> res4 = zz.vision.bbox(res4, poly, color=(0, 0, 255))
            ```
        Transformation:
            ```python
            >>> poly = np.array([[100, 400], [400, 400], [800, 900], [400, 1100], [100, 800]])
            >>> res5 = cv2.resize(img, (2000, 1000))
            >>> res5 = zz.vision.bbox(res5, poly, color=(255, 0, 0), thickness=20)
            >>> res5, info = zz.vision.pad(res5, (1000, 1000), color=(128, 128, 128))
            >>> poly = poly * info[0] + info[1:]
            >>> res5 = zz.vision.bbox(res5, poly, color=(0, 0, 255))
            ```

        ![Image padding example](../../../assets/vision/pad.png){ width="700" }
    """
    if len(img.shape) == 2:
        img = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
    if img.shape[2] == 4 and len(color) == 3:
        color = [*color, 255]
    img_height, img_width = img.shape[:2]
    tar_height, tar_width = shape
    if img_width / img_height > tar_width / tar_height:
        ratio = tar_width / img_width
        resize_width, resize_height = tar_width, int(img_height * ratio)
    elif img_width / img_height < tar_width / tar_height:
        ratio = tar_height / img_height
        resize_width, resize_height = int(img_width * ratio), tar_height
    else:
        # Same aspect ratio: the image is scaled straight to the target size
        ratio = tar_width / img_width
        (
            resize_width,
            resize_height,
        ) = (
            tar_width,
            tar_height,
        )
    img = cv2.resize(img, (resize_width, resize_height), interpolation=cv2.INTER_LINEAR)
    top, bottom = (
        (tar_height - resize_height) // 2,
        (tar_height - resize_height) // 2 + (tar_height - resize_height) % 2,
    )
    left, right = (
        (tar_width - resize_width) // 2,
        (tar_width - resize_width) // 2 + (tar_width - resize_width) % 2,
    )
    img = cv2.copyMakeBorder(
        img, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color
    )
    if poly is None:
        return img, (ratio, left, top)
    return img, poly * ratio + (left, top)
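The scale and offset arithmetic above can be traced without any image. The sketch below computes only `(ratio, left, top)` for given sizes; `pad_info` is a hypothetical helper. For equal aspect ratios it returns the actual scale factor `tar_w / img_w`, which is an assumption about the intended behaviour.

```python
def pad_info(img_hw, target_hw):
    """Return (ratio, left, top) for fitting an (H, W) image into an (H, W) target."""
    img_h, img_w = img_hw
    tar_h, tar_w = target_hw
    if img_w / img_h > tar_w / tar_h:
        # Image is relatively wider: width is the limiting side
        ratio = tar_w / img_w
        new_w, new_h = tar_w, int(img_h * ratio)
    elif img_w / img_h < tar_w / tar_h:
        # Image is relatively taller: height is the limiting side
        ratio = tar_h / img_h
        new_w, new_h = int(img_w * ratio), tar_h
    else:
        # Same aspect ratio: scale straight to the target (assumed intent)
        ratio = tar_w / img_w
        new_w, new_h = tar_w, tar_h
    return ratio, (tar_w - new_w) // 2, (tar_h - new_h) // 2


# A 2000 x 1000 (W x H) image padded to 1000 x 1000, as in the Poly example
ratio, left, top = pad_info((1000, 2000), (1000, 1000))
print(ratio, left, top)  # 0.5 0 250
point = (100 * ratio + left, 400 * ratio + top)
print(point)             # (50.0, 450.0)
```

The last two lines apply `poly * ratio + (left, top)` to the first vertex `[100, 400]` of the example polygon.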

paste ¶

paste(img: NDArray[uint8], target: NDArray[uint8], box: list[int | float] | NDArray[DTypeLike], resize: bool = False, vis: bool = False, poly: NDArray[DTypeLike] | None = None, alpha: int | None = None, gaussian: int | None = None) -> NDArray[uint8] | tuple[NDArray[uint8], NDArray[DTypeLike]]

Merge the target image onto img, including transparency

Note

Implements PIL.Image.paste on top of numpy and cv2

>>> img = Image.open("test.png").convert("RGBA")
>>> target = Image.open("target.png").convert("RGBA")
>>> img.paste(target, (0, 0), target)

Parameters:

Name	Type	Description	Default
`img`	`NDArray[uint8]`	Input image (`[H, W, C]`)	required
`target`	`NDArray[uint8]`	Target image (`[H, W, 4]`)	required
`box`	`list[int \| float] \| NDArray[DTypeLike]`	Region to paste into (`xyxy` format)	required
`resize`	`bool`	Whether to resize the target image	`False`
`vis`	`bool`	Whether to visualize the specified region (`box`)	`False`
`poly`	`NDArray[DTypeLike] \| None`	Coordinates to transform together with the target (`[N, 2]`)	`None`
`alpha`	`int \| None`	Overrides the transparency of the `target` image	`None`
`gaussian`	`int \| None`	Kernel size of the Gaussian blur applied to the alpha channel of `target` for a more natural blend	`None`

Returns:

Type	Description
`NDArray[uint8] \| tuple[NDArray[uint8], NDArray[DTypeLike]]`	Visualization result (`[H, W, 4]`) and, when `poly` is given, the transformed coordinates

Examples:

Without Poly:

>>> poly = np.array([[100, 400], [400, 400], [800, 900], [400, 1100], [100, 800]])
>>> target = zz.vision.cutout(img, poly, 200)
>>> res1 = zz.vision.paste(img, target, [200, 200, 1000, 800], resize=False, vis=True)
>>> res2 = zz.vision.paste(img, target, [200, 200, 1000, 800], resize=True, vis=True, alpha=255)

With Poly:

>>> poly -= zz.vision.poly2xyxy(poly)[:2]
>>> target = zz.vision.bbox(target, poly, color=(255, 0, 0), thickness=20)
>>> res3, poly3 = zz.vision.paste(img, target, [200, 200, 1000, 800], resize=False, poly=poly)
>>> poly3
array([[300.        , 200.        ],
       [557.14285714, 200.        ],
       [900.        , 628.57142857],
       [557.14285714, 800.        ],
       [300.        , 542.85714286]])
>>> res3 = zz.vision.bbox(res3, poly3)
>>> res4, poly4 = zz.vision.paste(img, target, [200, 200, 1000, 800], resize=True, poly=poly)
>>> poly4
array([[ 200.        ,  200.        ],
       [ 542.85714286,  200.        ],
       [1000.        ,  628.57142857],
       [ 542.85714286,  800.        ],
       [ 200.        ,  542.85714286]])
>>> res4 = zz.vision.bbox(res4, poly4)

Gaussian Blur:

>>> res5, poly5 = zz.vision.paste(img, target, [200, 200, 1000, 800], resize=True, poly=poly, gaussian=501)
>>> res5 = zz.vision.bbox(res5, poly5)
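The `poly3` and `poly4` values printed above are consistent with the following coordinate mapping. It is inferred from the example outputs only, not taken from the library source. It assumes that `target` is 700 x 700 pixels (the bounding box of the example polygon) and that `resize=False` keeps the aspect ratio and centres the target in `box`. `place_poly` is a hypothetical helper.

```python
def place_poly(poly, target_wh, box, resize):
    """Map target-local polygon points into the box (x0, y0, x1, y1)."""
    t_w, t_h = target_wh
    x0, y0, x1, y1 = box
    box_w, box_h = x1 - x0, y1 - y0
    if resize:
        # Stretch the target so that it fills the box exactly
        s_x, s_y = box_w / t_w, box_h / t_h
        off_x, off_y = x0, y0
    else:
        # Keep the aspect ratio and centre the target inside the box
        s_x = s_y = min(box_w / t_w, box_h / t_h)
        off_x = x0 + (box_w - t_w * s_x) / 2
        off_y = y0 + (box_h - t_h * s_y) / 2
    return [(x * s_x + off_x, y * s_y + off_y) for x, y in poly]


# Example polygon after subtracting its top-left corner
poly = [(0, 0), (300, 0), (700, 500), (300, 700), (0, 400)]
box = (200, 200, 1000, 800)
for resize in (False, True):
    placed = place_poly(poly, (700, 700), box, resize)
    print([(round(x, 2), round(y, 2)) for x, y in placed])
```

Under these assumptions the first printed list matches `poly3` and the second matches `poly4` to two decimals.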

Source code in zerohertzLib/vision/visual.py

def paste(
    img: NDArray[np.uint8],
    target: NDArray[np.uint8],
    box: list[int | float] | NDArray[DTypeLike],
    resize: bool = False,
    vis: bool = False,
    poly: NDArray[DTypeLike] | None = None,
    alpha: int | None = None,
    gaussian: int | None = None,
) -> NDArray[np.uint8] | tuple[NDArray[np.uint8], NDArray[DTypeLike]]:
    """Merge the `target` image onto `img`, including transparency

    Note:
        Implements `PIL.Image.paste` on top of `numpy` and `cv2`

        ```python
        >>> img = Image.open("test.png").convert("RGBA")
        >>> target = Image.open("target.png").convert("RGBA")
        >>> img.paste(target, (0, 0), target)
        ```

    Args:
        img: Input image (`[H, W, C]`)
        target: Target image (`[H, W, 4]`)
        box: Region to paste into (`xyxy` format)
        resize: Whether to resize the target image
        vis: Whether to visualize the specified region (`box`)
        poly: Coordinates to transform together with the target (`[N, 2]`)
        alpha: Overrides the transparency of the `target` image
        gaussian: Kernel size of the Gaussian blur applied to the alpha channel of `target` for a more natural blend

    Returns:
        Visualization result (`[H, W, 4]`) and, when `poly` is given, the transformed coordinates

    Examples:
        Without Poly:
            ```python
            >>> poly = np.array([[100, 400], [400, 400], [800, 900], [400, 1100], [100, 800]])
            >>> target = zz.vision.cutout(img, poly, 200)
            >>> res1 = zz.vision.paste(img, target, [200, 200, 1000, 800], resize=False, vis=True)
            >>> res2 = zz.vision.paste(img, target, [200, 200, 1000, 800], resize=True, vis=True, alpha=255)
            ```
        With Poly:
            ```python
            >>> poly -= zz.vision.poly2xyxy(poly)[:2]
            >>> target = zz.vision.bbox(target, poly, color=(255, 0, 0), thickness=20)
            >>> res3, poly3 = zz.vision.paste(img, target, [200, 200, 1000, 800], resize=False, poly=poly)
            >>> poly3
            array([[300.        , 200.        ],
                   [557.14285714, 200.        ],
                   [900.        , 628.57142857],
                   [557.14285714, 800.        ],
                   [300.        , 542.85714286]])
            >>> res3 = zz.vision.bbox(res3, poly3)
            >>> res4, poly4 = zz.vision.paste(img, target, [200, 200, 1000, 800], resize=True, poly=poly)
            >>> poly4
            array([[ 200.        ,  200.        ],
                   [ 542.85714286,  200.        ],
                   [1000.        ,  628.57142857],
                   [ 542.85714286,  800.        ],
                   [ 200.        ,  542.85714286]])
            >>> res4 = zz.vision.bbox(res4, poly4)
            ```
        Gaussian Blur:
            ```python
            >>> res5, poly5 = zz.vision.paste(img, target, [200, 200, 1000, 800], resize=True, poly=poly, gaussian=501)
            >>> res5 = zz.vision.bbox(res5, poly5)
            ```

        ![Image pasting example](../../../assets/vision/paste.png){ width="600" }
    """
    x_0, y_0, x_1, y_1 = map(int, box)
    box_height, box_width = y_1 - y_0, x_1 - x_0
    img = img.copy()
    img = _cvt_bgra(img)
    target = target.copy()
    tar_height, tar_width = target.shape[:2]
    if alpha is not None:
        target[:, :, 3][0 < target[:, :, 3]] = alpha
    if gaussian is not None:
        invisible = target[:, :, 3] == 0
        pad_gaussian = gaussian * 3
        target_alpha = cv2.copyMakeBorder(
            target[:, :, 3],
            pad_gaussian,
            pad_gaussian,
            pad_gaussian,
            pad_gaussian,
            cv2.BORDER_CONSTANT,
        )
        target[:, :, 3] = cv2.GaussianBlur(target_alpha, (gaussian, gaussian), 0)[
            pad_gaussian:-pad_gaussian, pad_gaussian:-pad_gaussian
        ]
        target[:, :, 3][invisible] = 0
    if resize:
        target = cv2.resize(
            target, (box_width, box_height), interpolation=cv2.INTER_LINEAR
        )
        if poly is not None:
            poly = poly * (box_width / tar_width, box_height / tar_height) + (x_0, y_0)
    else:
        if poly is None:
            target, _ = pad(target, (box_height, box_width), (0, 0, 0, 0))
        else:
            target, poly = pad(target, (box_height, box_width), (0, 0, 0, 0), poly)
            poly += (x_0, y_0)
    img[y_0:y_1, x_0:x_1, :] = _paste(img[y_0:y_1, x_0:x_1, :], target)
    if vis:
        box = np.array([[x_0, y_0], [x_0, y_1], [x_1, y_1], [x_1, y_0]])
        img = _bbox(img, box, (0, 0, 255, 255), 2)
    if poly is None:
        return img
    return img, poly
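The coordinate mapping in the `resize=True` branch above can be sketched in plain Python. The helper name `scale_poly_into_box` is hypothetical, and the 700 x 700 target size is an assumption inferred from the documented `poly4` output; the source does not state it.

```python
def scale_poly_into_box(poly, target_size, box):
    """Map points from target-image coordinates into `box` (xyxy format)."""
    tar_width, tar_height = target_size
    x_0, y_0, x_1, y_1 = map(int, box)
    # Per-axis scale from the target size to the box size
    scale_x = (x_1 - x_0) / tar_width
    scale_y = (y_1 - y_0) / tar_height
    # Scale each point, then shift it to the top-left corner of the box
    return [(x * scale_x + x_0, y * scale_y + y_0) for x, y in poly]


# Polygon after shifting its bbox origin to (0, 0), as in the "With Poly" example
poly = [(0, 0), (300, 0), (700, 500), (300, 700), (0, 400)]
# Assumed (width, height) of the target; hypothetical, chosen to match the example
moved = scale_poly_into_box(poly, (700, 700), [200, 200, 1000, 800])
for point in moved:
    print(point)
```

Under that assumption the printed points agree with the documented `poly4` values up to floating-point rounding.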

poly2area ¶

poly2area(poly: list[int | float] | NDArray[DTypeLike]) -> float

Compute the area of a polygon

Parameters:

Name	Type	Description	Default
`poly`	`list[int \| float] \| NDArray[DTypeLike]`	Polygon (`[N, 2]`)	required

Returns:

Type	Description
`float`	Area of the polygon

Examples:

>>> poly = [[10, 10], [20, 10], [30, 40], [20, 60], [10, 20]]
>>> zz.vision.poly2area(poly)
550.0
>>> box = np.array([[100, 200], [1200, 200], [1200, 1000], [100, 1000]])
>>> zz.vision.poly2area(box)
880000.0

Source code in zerohertzLib/vision/convert.py

def poly2area(poly: list[int | float] | NDArray[DTypeLike]) -> float:
    """Compute the area of a polygon

    Args:
        poly: Polygon (`[N, 2]`)

    Returns:
        Area of the polygon

    Examples:
        >>> poly = [[10, 10], [20, 10], [30, 40], [20, 60], [10, 20]]
        >>> zz.vision.poly2area(poly)
        550.0
        >>> box = np.array([[100, 200], [1200, 200], [1200, 1000], [100, 1000]])
        >>> zz.vision.poly2area(box)
        880000.0
    """
    poly = _list2np(poly)
    pts_x = poly[:, 0]
    pts_y = poly[:, 1]
    return 0.5 * np.abs(
        np.dot(pts_x, np.roll(pts_y, 1)) - np.dot(pts_y, np.roll(pts_x, 1))
    )
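The `np.roll` expression above is the shoelace formula. As a minimal sketch, the same computation can be written without NumPy; the function name `shoelace_area` is hypothetical and not part of the library.

```python
def shoelace_area(poly):
    """Area of a simple polygon given as [[x, y], ...] via the shoelace formula."""
    total = 0.0
    n = len(poly)
    for i in range(n):
        x_i, y_i = poly[i]
        x_j, y_j = poly[(i + 1) % n]  # next vertex, wrapping around to the first
        total += x_i * y_j - x_j * y_i
    # The sign depends on the vertex winding order, so take the absolute value
    return abs(total) / 2.0


pentagon_area = shoelace_area([[10, 10], [20, 10], [30, 40], [20, 60], [10, 20]])
rect_area = shoelace_area([[100, 200], [1200, 200], [1200, 1000], [100, 1000]])
print(pentagon_area, rect_area)
```

The loop pairs each vertex with its successor, whereas the `np.roll(..., 1)` version pairs each vertex with its predecessor; the two sums differ only in sign, which the absolute value removes.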

poly2cwh ¶

poly2cwh(box: list[int | float] | NDArray[DTypeLike]) -> NDArray[DTypeLike]

Convert a bbox from `poly` to `cwh` format

Parameters:

Name	Type	Description	Default
`box`	`list[int \| float] \| NDArray[DTypeLike]`	Bbox given as `[[x0, y0], [x1, y1], [x2, y2], [x3, y3]]` (`[4, 2]` or `[N, 4, 2]`)	required

Returns:

Type	Description
`NDArray[DTypeLike]`	Bbox given as `[cx, cy, w, h]` (`[4]` or `[N, 4]`)

Examples:

>>> zz.vision.poly2cwh([[10, 20], [30, 20], [30, 40], [10, 40]])
array([20, 30, 20, 20])
>>> zz.vision.poly2cwh(np.array([[[10, 20], [30, 20], [30, 40], [10, 40]], [[30, 50], [70, 50], [70, 100], [30, 100]]]))
array([[20, 30, 20, 20],
       [50, 75, 40, 50]])

Source code in zerohertzLib/vision/convert.py

def poly2cwh(
    box: list[int | float] | NDArray[DTypeLike],
) -> NDArray[DTypeLike]:
    """Convert a bbox from `poly` to `cwh` format

    Args:
        box: Bbox given as `[[x0, y0], [x1, y1], [x2, y2], [x3, y3]]` (`[4, 2]` or `[N, 4, 2]`)

    Returns:
        Bbox given as `[cx, cy, w, h]` (`[4]` or `[N, 4]`)

    Examples:
        >>> zz.vision.poly2cwh([[10, 20], [30, 20], [30, 40], [10, 40]])
        array([20, 30, 20, 20])
        >>> zz.vision.poly2cwh(np.array([[[10, 20], [30, 20], [30, 40], [10, 40]], [[30, 50], [70, 50], [70, 100], [30, 100]]]))
        array([[20, 30, 20, 20],
               [50, 75, 40, 50]])
    """
    box = _list2np(box)
    shape = box.shape
    multi, poly = _is_bbox(shape)
    if not poly:
        raise ValueError("The 'poly' must be of shape [4, 2], [N, 4, 2]")
    if multi:
        boxes = np.zeros((shape[0], 4), dtype=box.dtype)
        for i, box_ in enumerate(box):
            boxes[i] = _poly2cwh(box_)
        return boxes
    return _poly2cwh(box)
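The private helper `_poly2cwh` is not shown on this page. A minimal sketch of what a single-box conversion could look like, assuming it takes the axis-aligned extent of the four vertices, is given below; `poly_to_cwh` is a hypothetical name.

```python
def poly_to_cwh(poly):
    """Convert [[x, y], ...] vertices to (cx, cy, w, h) of their axis-aligned extent."""
    xs = [x for x, _ in poly]
    ys = [y for _, y in poly]
    x_0, x_1 = min(xs), max(xs)
    y_0, y_1 = min(ys), max(ys)
    # Center is the midpoint of the extent; size is its width and height
    return ((x_0 + x_1) / 2, (y_0 + y_1) / 2, x_1 - x_0, y_1 - y_0)


cwh_a = poly_to_cwh([[10, 20], [30, 20], [30, 40], [10, 40]])
cwh_b = poly_to_cwh([[30, 50], [70, 50], [70, 100], [30, 100]])
print(cwh_a, cwh_b)
```

Unlike the documented NumPy output, this sketch returns the center as floats rather than preserving the input dtype.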

poly2mask ¶

poly2mask(poly: list[int | float] | NDArray[DTypeLike] | list[NDArray[DTypeLike]], shape: tuple[int, int]) -> NDArray[bool]

Convert polygon coordinates to a mask

Parameters:

Name	Type	Description	Default
`poly`	`list[int \| float] \| NDArray[DTypeLike] \| list[NDArray[DTypeLike]]`	Vertex coordinates of the mask (`[M, 2]` or `[N, M, 2]`)	required
`shape`	`tuple[int, int]`	Shape `(H, W)` of the output mask	required

Returns:

Type	Description
`NDArray[bool]`	Converted mask (`[H, W]` or `[N, H, W]`)

Examples:

>>> poly = [[10, 10], [20, 10], [30, 40], [20, 60], [10, 20]]
>>> mask1 = zz.vision.poly2mask(poly, (70, 100))
>>> mask1.shape
(70, 100)
>>> mask1.dtype
dtype('bool')
>>> poly = np.array(poly)
>>> mask2 = zz.vision.poly2mask([poly, poly - 10, poly + 20], (70, 100))
>>> mask2.shape
(3, 70, 100)
>>> mask2.dtype
dtype('bool')

Source code in zerohertzLib/vision/convert.py

def poly2mask(
    poly: list[int | float] | NDArray[DTypeLike] | list[NDArray[DTypeLike]],
    shape: tuple[int, int],
) -> NDArray[bool]:
    """Convert polygon coordinates to a mask

    Args:
        poly: Vertex coordinates of the mask (`[M, 2]` or `[N, M, 2]`)
        shape: Shape `(H, W)` of the output mask

    Returns:
        Converted mask (`[H, W]` or `[N, H, W]`)

    Examples:
        >>> poly = [[10, 10], [20, 10], [30, 40], [20, 60], [10, 20]]
        >>> mask1 = zz.vision.poly2mask(poly, (70, 100))
        >>> mask1.shape
        (70, 100)
        >>> mask1.dtype
        dtype('bool')
        >>> poly = np.array(poly)
        >>> mask2 = zz.vision.poly2mask([poly, poly - 10, poly + 20], (70, 100))
        >>> mask2.shape
        (3, 70, 100)
        >>> mask2.dtype
        dtype('bool')

        ![Polygon to mask conversion example](../../../assets/vision/poly2mask.png){ width="300" }
    """
    if (isinstance(poly, list) and isinstance(poly[0], np.ndarray)) or (
        isinstance(poly, np.ndarray) and len(poly.shape) == 3
    ):
        mks = []
        for _poly in poly:
            mks.append(_poly2mask(_poly, shape))
        mks = np.array(mks)
    else:
        mks = _poly2mask(_list2np(poly), shape)
    return mks
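The rasterization itself happens in `_poly2mask`, which is not shown here. One way to illustrate the idea is an even-odd (ray casting) test per pixel in plain Python. This is a sketch of the general technique, not the library's implementation, and both function names are hypothetical.

```python
def point_in_polygon(x, y, poly):
    """Even-odd rule: count polygon edges crossed by a ray going right from (x, y)."""
    inside = False
    n = len(poly)
    for i in range(n):
        x_a, y_a = poly[i]
        x_b, y_b = poly[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed
        if (y_a > y) != (y_b > y):
            x_cross = x_a + (y - y_a) * (x_b - x_a) / (y_b - y_a)
            if x < x_cross:
                inside = not inside
    return inside


def polygon_to_mask(poly, shape):
    """Build an H x W nested list of booleans, indexed as mask[row][col]."""
    height, width = shape
    return [
        [point_in_polygon(col, row, poly) for col in range(width)]
        for row in range(height)
    ]


poly = [[10, 10], [20, 10], [30, 40], [20, 60], [10, 20]]
mask = polygon_to_mask(poly, (70, 100))
print(len(mask), len(mask[0]), mask[15][15], mask[0][0])
```

Pixels exactly on the polygon boundary may be classified differently by an image-library rasterizer, so only interior and exterior points should be compared.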

poly2ratio ¶

poly2ratio(poly: list[int | float] | NDArray[DTypeLike]) -> float

Compute the ratio of a polygon's area to the area of its bbox

Parameters:

Name	Type	Description	Default
`poly`	`list[int \| float] \| NDArray[DTypeLike]`	Polygon (`[N, 2]`)	required

Returns:

Type	Description
`float`	Ratio of the polygon's area to its bbox area

Examples:

>>> poly = [[10, 10], [20, 10], [30, 40], [20, 60], [10, 20]]
>>> zz.vision.poly2ratio(poly)
0.55
>>> box = np.array([[100, 200], [1200, 200], [1200, 1000], [100, 1000]])
>>> zz.vision.poly2ratio(box)
1.0

Source code in zerohertzLib/vision/convert.py

def poly2ratio(poly: list[int | float] | NDArray[DTypeLike]) -> float:
    """Compute the ratio of a polygon's area to the area of its bbox

    Args:
        poly: Polygon (`[N, 2]`)

    Returns:
        Ratio of the polygon's area to its bbox area

    Examples:
        >>> poly = [[10, 10], [20, 10], [30, 40], [20, 60], [10, 20]]
        >>> zz.vision.poly2ratio(poly)
        0.55
        >>> box = np.array([[100, 200], [1200, 200], [1200, 1000], [100, 1000]])
        >>> zz.vision.poly2ratio(box)
        1.0
    """
    poly_area = poly2area(poly)
    _, _, width, height = poly2cwh(poly)
    bbox_area = width * height
    return poly_area / bbox_area
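As a self-contained illustration of the ratio above, the sketch below divides the shoelace area by the area of the axis-aligned bounding box. The name `polygon_fill_ratio` is hypothetical, and the bounding box is taken directly from the vertex minima and maxima rather than through `poly2cwh`.

```python
def polygon_fill_ratio(poly):
    """Ratio of a polygon's area to the area of its axis-aligned bounding box."""
    n = len(poly)
    # Twice the signed area via the shoelace formula
    twice_area = sum(
        poly[i][0] * poly[(i + 1) % n][1] - poly[(i + 1) % n][0] * poly[i][1]
        for i in range(n)
    )
    xs = [point[0] for point in poly]
    ys = [point[1] for point in poly]
    bbox_area = (max(xs) - min(xs)) * (max(ys) - min(ys))
    return abs(twice_area) / 2 / bbox_area


pentagon_ratio = polygon_fill_ratio([[10, 10], [20, 10], [30, 40], [20, 60], [10, 20]])
rect_ratio = polygon_fill_ratio([[100, 200], [1200, 200], [1200, 1000], [100, 1000]])
print(pentagon_ratio, rect_ratio)
```

An axis-aligned rectangle fills its own bounding box, so its ratio is 1.0; a degenerate polygon with a zero-area bounding box would raise `ZeroDivisionError` in this sketch.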

poly2xyxy ¶

poly2xyxy(box: list[int | float] | NDArray[DTypeLike]) -> NDArray[DTypeLike]

Convert a bbox from `poly` to `xyxy` format

Parameters:

Name	Type	Description	Default
`box`	`list[int \| float] \| NDArray[DTypeLike]`	Bbox given as `[[x0, y0], [x1, y1], [x2, y2], [x3, y3]]` (`[4, 2]` or `[N, 4, 2]`)	required

Returns:

Type	Description
`NDArray[DTypeLike]`	Bbox given as `[x0, y0, x1, y1]` (`[4]` or `[N, 4]`)

Examples:

>>> zz.vision.poly2xyxy([[10, 20], [30, 20], [30, 40], [10, 40]])
array([10, 20, 30, 40])
>>> zz.vision.poly2xyxy(np.array([[[10, 20], [30, 20], [30, 40], [10, 40]], [[30, 50], [70, 50], [70, 100], [30, 100]]]))
array([[ 10,  20,  30,  40],
       [ 30,  50,  70, 100]])

Source code in zerohertzLib/vision/convert.py

def poly2xyxy(
    box: list[int | float] | NDArray[DTypeLike],
) -> NDArray[DTypeLike]:
    """Convert a bbox from `poly` to `xyxy` format

    Args:
        box: Bbox given as `[[x0, y0], [x1, y1], [x2, y2], [x3, y3]]` (`[4, 2]` or `[N, 4, 2]`)

    Returns:
        Bbox given as `[x0, y0, x1, y1]` (`[4]` or `[N, 4]`)

    Examples:
        >>> zz.vision.poly2xyxy([[10, 20], [30, 20], [30, 40], [10, 40]])
        array([10, 20, 30, 40])
        >>> zz.vision.poly2xyxy(np.array([[[10, 20], [30, 20], [30, 40], [10, 40]], [[30, 50], [70, 50], [70, 100], [30, 100]]]))
        array([[ 10,  20,  30,  40],
               [ 30,  50,  70, 100]])
    """
    box = _list2np(box)
    shape = box.shape
    multi, poly = _is_bbox(shape)
    if not poly:
        raise ValueError("The 'poly' must be of shape [4, 2], [N, 4, 2]")
    if multi:
        boxes = np.zeros((shape[0], 4), dtype=box.dtype)
        for i, box_ in enumerate(box):
            boxes[i] = _poly2xyxy(box_)
        return boxes
    return _poly2xyxy(box)
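The single-box versus batch dispatch above can be sketched with nested lists. This assumes that `_poly2xyxy` (not shown) reduces the vertices to their minimum and maximum coordinates; `poly_to_xyxy` is a hypothetical name.

```python
def poly_to_xyxy(box):
    """Accept one polygon ([4, 2]) or a batch ([N, 4, 2]) given as nested lists."""
    # In a batch, the first element of the first item is itself a point
    is_batch = isinstance(box[0][0], (list, tuple))
    if is_batch:
        return [poly_to_xyxy(single) for single in box]
    xs = [x for x, _ in box]
    ys = [y for _, y in box]
    return [min(xs), min(ys), max(xs), max(ys)]


single = poly_to_xyxy([[10, 20], [30, 20], [30, 40], [10, 40]])
batch = poly_to_xyxy(
    [
        [[10, 20], [30, 20], [30, 40], [10, 40]],
        [[30, 50], [70, 50], [70, 100], [30, 100]],
    ]
)
print(single, batch)
```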

text ¶

text(img: NDArray[uint8], box: list[int | float] | NDArray[DTypeLike], txt: str | list[str], color: tuple[int, int, int] = (0, 0, 0), vis: bool = False, fontsize: int = 100) -> NDArray[uint8]

Draw text on an image

Parameters:

Name	Type	Description	Default
`img`	`NDArray[uint8]`	Input image (`[H, W, C]`)	required
`box`	`list[int \| float] \| NDArray[DTypeLike]`	Bbox in which the string is placed (`[4]`, `[N, 4]`, `[4, 2]`, `[N, 4, 2]`)	required
`txt`	`str \| list[str]`	String(s) to add to the image	required
`color`	`tuple[int, int, int]`	Text color	`(0, 0, 0)`
`vis`	`bool`	Whether to visualize the text region	`False`
`fontsize`	`int`	Font size	`100`

Returns:

Type	Description
`NDArray[uint8]`	Visualization result (`[H, W, 4]`)

Examples:

Bbox:

>>> box = np.array([[100, 200], [100, 1000], [1200, 1000], [1200, 200]])
>>> box.shape
(4, 2)
>>> res1 = zz.vision.text(img, box, "먼지야")

Bboxes:

>>> boxes = np.array([[250, 200, 100, 100], [600, 600, 800, 200], [900, 300, 300, 400]])
>>> boxes.shape
(3, 4)
>>> res2 = zz.vision.text(img, boxes, ["먼지야", "먼지야", "먼지야"], vis=True)

Source code in zerohertzLib/vision/visual.py

def text(
    img: NDArray[np.uint8],
    box: list[int | float] | NDArray[DTypeLike],
    txt: str | list[str],
    color: tuple[int, int, int] = (0, 0, 0),
    vis: bool = False,
    fontsize: int = 100,
) -> NDArray[np.uint8]:
    """Draw text on an image

    Args:
        img: Input image (`[H, W, C]`)
        box: Bbox in which the string is placed (`[4]`, `[N, 4]`, `[4, 2]`, `[N, 4, 2]`)
        txt: String(s) to add to the image
        color: Text color
        vis: Whether to visualize the text region
        fontsize: Font size

    Returns:
        Visualization result (`[H, W, 4]`)

    Examples:
        Bbox:
            ```python
            >>> box = np.array([[100, 200], [100, 1000], [1200, 1000], [1200, 200]])
            >>> box.shape
            (4, 2)
            >>> res1 = zz.vision.text(img, box, "먼지야")
            ```
        Bboxes:
            ```python
            >>> boxes = np.array([[250, 200, 100, 100], [600, 600, 800, 200], [900, 300, 300, 400]])
            >>> boxes.shape
            (3, 4)
            >>> res2 = zz.vision.text(img, boxes, ["먼지야", "먼지야", "먼지야"], vis=True)
            ```

        ![Text on image example](../../../assets/vision/text.png){ width="600" }
    """
    box = _list2np(box)
    img = img.copy()
    img = _cvt_bgra(img)
    shape = box.shape
    multi, poly = _is_bbox(shape)
    if poly:
        box_poly = box
        box_cwh = poly2cwh(box)
    else:
        box_poly = cwh2poly(box)
        box_cwh = box
    if multi:
        if not shape[0] == len(txt):
            raise ValueError("'box.shape[0]' and 'len(txt)' must be equal")
        for b_poly, b_cwh, txt_ in zip(box_poly, box_cwh, txt):
            img = _text(img, b_cwh, txt_, color, fontsize)
            if vis:
                img = _bbox(img, b_poly, (0, 0, 255, 255), 2)
    else:
        img = _text(img, box_cwh, txt, color, fontsize)
        if vis:
            img = _bbox(img, box_poly, (0, 0, 255, 255), 2)
    return img

transparent ¶

transparent(img: NDArray[uint8], threshold: int = 128, reverse: bool = False) -> NDArray[uint8]

Make pixels of the input image below `threshold` transparent

Parameters:

Name	Type	Description	Default
`img`	`NDArray[uint8]`	Input image (`[H, W, C]`)	required
`threshold`	`int`	Threshold	`128`
`reverse`	`bool`	Whether to instead make pixels at or above `threshold` transparent	`False`

Returns:

Type	Description
`NDArray[uint8]`	Output image (`[H, W, 4]`)

Examples:

>>> res1 = zz.vision.transparent(img)
>>> res2 = zz.vision.transparent(img, reverse=True)

Source code in zerohertzLib/vision/transform.py

def transparent(
    img: NDArray[np.uint8],
    threshold: int = 128,
    reverse: bool = False,
) -> NDArray[np.uint8]:
    """Make pixels of the input image below `threshold` transparent

    Args:
        img: Input image (`[H, W, C]`)
        threshold: Threshold
        reverse: Whether to instead make pixels at or above `threshold` transparent

    Returns:
        Output image (`[H, W, 4]`)

    Examples:
        >>> res1 = zz.vision.transparent(img)
        >>> res2 = zz.vision.transparent(img, reverse=True)

        ![Transparent background example](../../../assets/vision/transparent.png){ width="600" }
    """
    img = img.copy()
    img = _cvt_bgra(img)
    img_alpha = img[:, :, 3]
    img_bin = threshold > cv2.cvtColor(img, cv2.COLOR_BGRA2GRAY)
    if reverse:
        img_alpha[~img_bin] = 0
    else:
        img_alpha[img_bin] = 0
    return img
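The thresholding logic above can be illustrated on a tiny image held as nested tuples. This is a sketch under assumptions: `make_transparent` is a hypothetical name, and the grayscale value uses the standard 0.299 R + 0.587 G + 0.114 B weights with simple rounding, which may differ by one level from OpenCV's internal arithmetic.

```python
def make_transparent(pixels, threshold=128, reverse=False):
    """pixels: rows of (B, G, R, A) tuples; returns a copy with alpha zeroed by luminance."""
    result = []
    for row in pixels:
        new_row = []
        for b, g, r, a in row:
            gray = round(0.299 * r + 0.587 * g + 0.114 * b)
            is_dark = gray < threshold
            # Default: hide dark pixels; reverse: hide pixels at or above the threshold
            hide = (not is_dark) if reverse else is_dark
            new_row.append((b, g, r, 0 if hide else a))
        result.append(new_row)
    return result


image = [[(0, 0, 0, 255), (255, 255, 255, 255)]]  # one black and one white pixel
res1 = make_transparent(image)
res2 = make_transparent(image, reverse=True)
print(res1, res2)
```

With the default arguments the black pixel becomes transparent and the white one is kept; `reverse=True` swaps the two.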

vert ¶

vert(imgs: list[NDArray[uint8]], height: int = 1000, file_name: str = 'tmp') -> None

Merge multiple images side by side into a single horizontal image

Parameters:

Name	Type	Description	Default
`imgs`	`list[NDArray[uint8]]`	Input images	required
`height`	`int`	Height of the output image	`1000`
`file_name`	`str`	Name of the file to save	`'tmp'`

Returns:

Type	Description
`None`	Saves the image directly to the current directory

Examples:

>>> imgs = [cv2.resize(img, (random.randrange(300, 600), random.randrange(300, 600))) for _ in range(5)]
>>> zz.vision.vert(imgs)

Source code in zerohertzLib/vision/compare.py

def vert(
    imgs: list[NDArray[np.uint8]],
    height: int = 1000,
    file_name: str = "tmp",
) -> None:
    """Merge multiple images side by side into a single horizontal image

    Args:
        imgs: Input images
        height: Height of the output image
        file_name: Name of the file to save

    Returns:
        Saves the image directly to the current directory

    Examples:
        >>> imgs = [cv2.resize(img, (random.randrange(300, 600), random.randrange(300, 600))) for _ in range(5)]
        >>> zz.vision.vert(imgs)

        ![Vertical image alignment example](../../../assets/vision/vert.png){ width="600" }
    """
    resized_imgs = []
    width = 0
    for img in imgs:
        shape = img.shape
        img = _cvt_bgra(img)
        if shape[0] != height:
            tar_width = int(height / shape[0] * shape[1])
            img = cv2.resize(img, (tar_width, height))
        else:
            tar_width = shape[1]
        width += tar_width
        resized_imgs.append(img)
    palette = np.full((height, width, 4), 255, dtype=np.uint8)
    width = 0
    for img in resized_imgs:
        img_height, img_width, _ = img.shape
        palette[:img_height, width : width + img_width, :] = img
        width += img_width
    cv2.imwrite(f"{file_name}.png", palette)
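The layout arithmetic above (scale every image to a common height, then place them left to right) can be sketched without any image data. `horizontal_layout` is a hypothetical helper that works on `(height, width)` shapes only.

```python
def horizontal_layout(shapes, height=1000):
    """shapes: list of (h, w); returns (canvas_width, [(x_offset, width), ...])."""
    placements = []
    x_offset = 0
    for img_height, img_width in shapes:
        if img_height != height:
            # Keep the aspect ratio while rescaling to the common height
            tar_width = int(height / img_height * img_width)
        else:
            tar_width = img_width
        placements.append((x_offset, tar_width))
        x_offset += tar_width
    return x_offset, placements


canvas_width, placements = horizontal_layout([(500, 300), (1000, 400), (250, 250)])
print(canvas_width, placements)
```

The final `x_offset` is the total canvas width, which corresponds to the `width` used to allocate `palette` in the source.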

vid2gif ¶

vid2gif(path: str, file_name: str = 'tmp', quality: int = 100, fps: int = 15, speed: float = 1.0) -> None

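Convert a video to a GIF

Parameters:

Name	Type	Description	Default
`path`	`str`	Path to the video to convert to a GIF	required
`file_name`	`str`	Name of the output GIF file	`'tmp'`
`quality`	`int`	Quality of the output GIF (percentage by which the frame width and height are scaled)	`100`
`fps`	`int`	FPS (frames per second) of the output GIF	`15`
`speed`	`float`	Playback speed multiplier of the output GIF	`1.0`

Returns:

Type	Description
`None`	Saves the GIF directly to the current directory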

Examples:

>>> zz.vision.vid2gif("test.mp4")

Source code in zerohertzLib/vision/gif.py

def vid2gif(
    path: str,
    file_name: str = "tmp",
    quality: int = 100,
    fps: int = 15,
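    speed: float = 1.0,
) -> None:
    """Convert a video to a GIF

    Args:
        path: Path to the video to convert to a GIF
        file_name: Name of the output GIF file
        quality: Quality of the output GIF (percentage by which the frame width and height are scaled)
        fps: FPS (frames per second) of the output GIF
        speed: Playback speed multiplier of the output GIF

    Returns:
        Saves the GIF directly to the current directory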

    Examples:
        >>> zz.vision.vid2gif("test.mp4")

        ![Video to GIF conversion example](../../../assets/vision/vid2gif.gif){ width="300" }
    """
    frames = []
    cap = cv2.VideoCapture(path)
    original_fps = round(cap.get(cv2.CAP_PROP_FPS))
    fps = min(original_fps, fps)
    frame_count_speed = frame_count_fps = 0
    while True:
        ret, frame = cap.read()
        if not ret:
            break
        frame_count_speed += 1
        if round(frame_count_speed % speed) != 0:
            continue
        if frame_count_fps % (int(original_fps / fps)) == 0:
            frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            pil_img = Image.fromarray(frame_rgb)
            width, height = pil_img.size
            new_width = int(width * quality / 100)
            new_height = int(height * quality / 100)
            resized_img = pil_img.resize((new_width, new_height), Image.LANCZOS)
            frames.append(resized_img)
        frame_count_fps += 1
    cap.release()
    duration = int(1000 / fps)
    _create_gif_from_frames(frames, file_name, duration)
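To see which frames the loop above keeps, the two counters can be replayed on frame indices alone. `selected_frames` is a hypothetical helper that mirrors the loop's arithmetic without opening a video.

```python
def selected_frames(n_frames, original_fps, fps=15, speed=1.0):
    """Return (kept frame indices, per-frame duration in ms) for the sampling loop."""
    fps = min(original_fps, fps)
    step = int(original_fps / fps)  # keep every `step`-th frame that passes the speed filter
    kept = []
    count_speed = count_fps = 0
    for index in range(n_frames):
        count_speed += 1
        # Speed filter: frames are skipped unless the counter is a multiple of `speed`
        if round(count_speed % speed) != 0:
            continue
        if count_fps % step == 0:
            kept.append(index)
        count_fps += 1
    return kept, int(1000 / fps)


normal, duration = selected_frames(12, original_fps=30)
double, _ = selected_frames(12, original_fps=30, speed=2.0)
print(normal, duration, double)
```

For a 30 FPS source and the default 15 FPS target, every second frame is kept; with `speed=2.0` the speed filter first drops every other frame and the FPS step then halves the remainder.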

xyxy2cwh ¶

xyxy2cwh(box: list[int | float] | NDArray[DTypeLike]) -> NDArray[DTypeLike]

Convert a bbox from `xyxy` to `cwh` format

Parameters:

Name	Type	Description	Default
`box`	`list[int \| float] \| NDArray[DTypeLike]`	Bbox given as `[x0, y0, x1, y1]` (`[4]` or `[N, 4]`)	required

Returns:

Type	Description
`NDArray[DTypeLike]`	Bbox given as `[cx, cy, w, h]` (`[4]` or `[N, 4]`)

Examples:

>>> zz.vision.xyxy2cwh([10, 20, 30, 40])
array([20, 30, 20, 20])
>>> zz.vision.xyxy2cwh(np.array([[10, 20, 30, 40], [30, 50, 70, 100]]))
array([[20, 30, 20, 20],
       [50, 75, 40, 50]])

Source code in zerohertzLib/vision/convert.py

def xyxy2cwh(
    box: list[int | float] | NDArray[DTypeLike],
) -> NDArray[DTypeLike]:
    """Convert a bbox from `xyxy` to `cwh` format

    Args:
        box: Bbox given as `[x0, y0, x1, y1]` (`[4]` or `[N, 4]`)

    Returns:
        Bbox given as `[cx, cy, w, h]` (`[4]` or `[N, 4]`)

    Examples:
        >>> zz.vision.xyxy2cwh([10, 20, 30, 40])
        array([20, 30, 20, 20])
        >>> zz.vision.xyxy2cwh(np.array([[10, 20, 30, 40], [30, 50, 70, 100]]))
        array([[20, 30, 20, 20],
               [50, 75, 40, 50]])
    """
    box = _list2np(box)
    shape = box.shape
    multi, poly = _is_bbox(shape)
    if poly:
        raise ValueError("The 'xyxy' must be of shape [4], [N, 4]")
    if multi:
        boxes = np.zeros((shape[0], 4), dtype=box.dtype)
        for i, box_ in enumerate(box):
            boxes[i] = _xyxy2cwh(box_)
        return boxes
    return _xyxy2cwh(box)

xyxy2poly ¶

xyxy2poly(box: list[int | float] | NDArray[DTypeLike]) -> NDArray[DTypeLike]

Convert a bbox from `xyxy` to `poly` format

Parameters:

Name	Type	Description	Default
`box`	`list[int \| float] \| NDArray[DTypeLike]`	Bbox given as `[x0, y0, x1, y1]` (`[4]` or `[N, 4]`)	required

Returns:

Type	Description
`NDArray[DTypeLike]`	Bbox given as `[[x0, y0], [x1, y1], [x2, y2], [x3, y3]]` (`[4, 2]` or `[N, 4, 2]`)

Examples:

>>> zz.vision.xyxy2poly([10, 20, 30, 40])
array([[10, 20],
       [30, 20],
       [30, 40],
       [10, 40]])
>>> zz.vision.xyxy2poly(np.array([[10, 20, 30, 40], [30, 50, 70, 100]]))
array([[[ 10,  20],
        [ 30,  20],
        [ 30,  40],
        [ 10,  40]],
       [[ 30,  50],
        [ 70,  50],
        [ 70, 100],
        [ 30, 100]]])

Source code in zerohertzLib/vision/convert.py

def xyxy2poly(
    box: list[int | float] | NDArray[DTypeLike],
) -> NDArray[DTypeLike]:
    """Convert a bbox from `xyxy` to `poly` format

    Args:
        box: Bbox given as `[x0, y0, x1, y1]` (`[4]` or `[N, 4]`)

    Returns:
        Bbox given as `[[x0, y0], [x1, y1], [x2, y2], [x3, y3]]` (`[4, 2]` or `[N, 4, 2]`)

    Examples:
        >>> zz.vision.xyxy2poly([10, 20, 30, 40])
        array([[10, 20],
               [30, 20],
               [30, 40],
               [10, 40]])
        >>> zz.vision.xyxy2poly(np.array([[10, 20, 30, 40], [30, 50, 70, 100]]))
        array([[[ 10,  20],
                [ 30,  20],
                [ 30,  40],
                [ 10,  40]],
               [[ 30,  50],
                [ 70,  50],
                [ 70, 100],
                [ 30, 100]]])
    """
    box = _list2np(box)
    shape = box.shape
    multi, poly = _is_bbox(shape)
    if poly:
        raise ValueError("The 'xyxy' must be of shape [4], [N, 4]")
    if multi:
        boxes = np.zeros((shape[0], 4, 2), dtype=box.dtype)
        for i, box_ in enumerate(box):
            boxes[i] = _xyxy2poly(box_)
        return boxes
    return _xyxy2poly(box)
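The documented output fixes the vertex order of the resulting polygon: top-left, top-right, bottom-right, bottom-left. A minimal sketch of a single-box conversion that reproduces that order follows; `xyxy_to_poly` is a hypothetical stand-in for the private `_xyxy2poly`, which is not shown.

```python
def xyxy_to_poly(box):
    """Expand [x0, y0, x1, y1] into four corner points, starting at the top-left."""
    x_0, y_0, x_1, y_1 = box
    return [[x_0, y_0], [x_1, y_0], [x_1, y_1], [x_0, y_1]]


corners = xyxy_to_poly([10, 20, 30, 40])
corners_wide = xyxy_to_poly([30, 50, 70, 100])
print(corners, corners_wide)
```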
