zerohertzLib.vision

Vision

Functions and classes for handling and visualizing various images

Important

Bbox types

  • cwh: bbox given as [cx, cy, w, h] (shape [4] or [N, 4])
  • xyxy: bbox given as [x0, y0, x1, y1] (shape [4] or [N, 4])
  • poly: bbox given as [[x0, y0], [x1, y1], [x2, y2], [x3, y3]] (shape [4, 2] or [N, 4, 2])
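To make the three layouts concrete, here is a minimal NumPy sketch of converting between them. The helper names `cwh_to_xyxy` and `xyxy_to_poly` are hypothetical and are not the library's `cwh2xyxy` / `xyxy2poly`; the clockwise-from-top-left corner order is also an assumption.

```python
import numpy as np


def cwh_to_xyxy(box):
    """[cx, cy, w, h] -> [x0, y0, x1, y1]; accepts shape [4] or [N, 4]."""
    box = np.asarray(box, dtype=float)
    half = box[..., 2:] / 2
    return np.concatenate([box[..., :2] - half, box[..., :2] + half], axis=-1)


def xyxy_to_poly(box):
    """[x0, y0, x1, y1] -> four corners; returns shape [4, 2] or [N, 4, 2]."""
    box = np.asarray(box, dtype=float)
    x0, y0, x1, y1 = (box[..., i] for i in range(4))
    # Assumed corner order: top-left, top-right, bottom-right, bottom-left
    return np.stack(
        [
            np.stack([x0, y0], axis=-1),
            np.stack([x1, y0], axis=-1),
            np.stack([x1, y1], axis=-1),
            np.stack([x0, y1], axis=-1),
        ],
        axis=-2,
    )


box_cwh = [50, 40, 20, 10]
box_xyxy = cwh_to_xyxy(box_cwh)
box_poly = xyxy_to_poly(box_xyxy)
```

Both helpers operate on the last axis, so a single box and a batch of N boxes go through the same code path.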

Modules:

Name Description
cli
compare
convert
data
eval
gif
loader
transform
util
visual

Classes:

Name Description
CocoLoader

Class that reads and visualizes a COCO-format dataset

ImageLoader

Class that returns the images in a path, given the path and the number of images

JsonImageLoader

Class that loads images and the information in JSON files by way of the JSON files

LabelStudio

Class that handles Label Studio data

YoloLoader

Class that reads and visualizes a YOLO-format dataset

Functions:

Name Description
bbox

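Visualize multiple bboxes

before_after

Create an image that compares two images

cutout

Make everything in the image outside the specified coordinates transparent

cwh2poly

Bbox conversion

cwh2xyxy

Bbox conversion

evaluation

Evaluate the inference performance of a detection model on a single image

grid

Merge multiple input images into a square image

img2gif

Convert the images in a directory to a GIF

iou

Function that computes IoU (Intersection over Union)

is_pts_in_poly

Function that checks whether points lie inside the given coordinates

mask

Visualize masks

meanap

Visualize the P-R curve of a detection model and compute mAP

pad

Resize and pad the input image to the desired shape

paste

Merge the target image onto img, including transparency

poly2area

Function that computes the area of a polygon

poly2cwh

Bbox conversion

poly2mask

Convert polygon coordinates to a mask

poly2ratio

Function that computes the ratio of a polygon's area to the area of its bbox

poly2xyxy

Bbox conversion

text

Visualize text

transparent

Make the pixels of the input image that fall below a threshold transparent

vert

Merge multiple input images into a horizontal image

vid2gif

Convert a video to a GIF

xyxy2cwh

Bbox conversion

xyxy2poly

Bbox conversion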

__all__ module-attribute

__all__ = ['img2gif', 'vid2gif', 'before_after', 'grid', 'bbox', 'mask', 'text', 'cwh2poly', 'cwh2xyxy', 'poly2cwh', 'poly2mask', 'poly2xyxy', 'xyxy2cwh', 'xyxy2poly', 'cutout', 'paste', 'is_pts_in_poly', 'JsonImageLoader', 'vert', 'pad', 'poly2area', 'poly2ratio', 'ImageLoader', 'transparent', 'YoloLoader', 'LabelStudio', 'iou', 'meanap', 'evaluation', 'CocoLoader']

CocoLoader

CocoLoader(data_path: str, vis_path: str | None = None, class_color: dict[int | str, tuple[int, int, int]] | None = None)

Class that reads and visualizes a COCO-format dataset

Parameters:

Name Type Description Default
data_path str

Directory path where the images and annotations are located

required
vis_path str | None

Path where the visualization images will be saved

None
class_color dict[int | str, tuple[int, int, int]] | None

Per-class colors applied to the visualization results

None

Examples:

>>> data_path = "train"
>>> class_color = {"label1": (0, 255, 0), "label2": (255, 0, 0)}
>>> coco = zz.vision.CocoLoader(data_path, vis_path="tmp", class_color=class_color)
>>> image, class_list, bboxes, polys = coco(0, False, True)
>>> type(image)
<class 'str'>
>>> image
'{IMAGE_PATH}.jpg'
>>> class_list
[0, 1]
>>> type(bboxes)
<class 'numpy.ndarray'>
>>> bboxes.shape
(2, 4)
>>> image, class_list, bboxes, polys = coco[0]
>>> type(image)
<class 'numpy.ndarray'>
>>> class_list
['label1', 'label2']
>>> type(bboxes)
<class 'numpy.ndarray'>
>>> bboxes.shape
(2, 4)
>>> type(polys)
<class 'list'>

Methods:

Name Description
__call__

Return the image and annotation information for an index (when vis_path and class_color are given, the visualization image is saved to vis_path)

__getitem__

Return the image and annotation information for an index (when vis_path and class_color are given, the visualization image is saved to vis_path)

__len__

Return the number of images

yolo

Convert COCO format to YOLO format

Attributes:

Name Type Description
annotations
class_color
classes
data_path
image2annotation
images
vis_path
Source code in zerohertzLib/vision/loader.py
def __init__(
    self,
    data_path: str,
    vis_path: str | None = None,
    class_color: dict[int | str, tuple[int, int, int]] | None = None,
) -> None:
    self.data_path = data_path
    data = Json(f"{data_path}.json")
    self.images = data["images"]
    self.annotations = data["annotations"]
    self.images.sort(key=lambda x: x["id"])
    self.annotations.sort(key=lambda x: x["image_id"])
    self.image2annotation = defaultdict(list)
    for idx, annotation in enumerate(self.annotations):
        self.image2annotation[annotation["image_id"]].append(idx)
    self.classes = {}
    for idx, cls in enumerate(data["categories"]):
        self.classes[cls["id"]] = (idx, cls["name"])
    self.vis_path = vis_path
    if vis_path is not None:
        if class_color is None:
            raise ValueError(
                "Visualization requires the 'class_color' variable to be specified"
            )
        rmtree(vis_path)
        self.class_color = class_color
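The constructor builds two lookup tables: `image2annotation`, which maps an image id to the positions of its annotations, and `classes`, which maps a category id to an `(index, name)` pair. The sketch below reproduces that indexing on a few hypothetical COCO records (the sample data is invented for illustration).

```python
from collections import defaultdict

# Hypothetical COCO-style records
annotations = [
    {"id": 0, "image_id": 7, "category_id": 1},
    {"id": 1, "image_id": 3, "category_id": 2},
    {"id": 2, "image_id": 7, "category_id": 2},
]
categories = [{"id": 1, "name": "label1"}, {"id": 2, "name": "label2"}]

# Sort first so the stored positions refer to the sorted list
annotations.sort(key=lambda x: x["image_id"])
image2annotation = defaultdict(list)
for idx, annotation in enumerate(annotations):
    image2annotation[annotation["image_id"]].append(idx)

# category id -> (0-based index, name)
classes = {cls["id"]: (idx, cls["name"]) for idx, cls in enumerate(categories)}
```

Because the positions are recorded after sorting, looking up an image id yields indices that stay valid for the sorted `annotations` list.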

annotations instance-attribute

annotations = data['annotations']

class_color instance-attribute

class_color = class_color

classes instance-attribute

classes = {}

data_path instance-attribute

data_path = data_path

image2annotation instance-attribute

image2annotation = defaultdict(list)

images instance-attribute

images = data['images']

vis_path instance-attribute

vis_path = vis_path

__call__

__call__(idx: int, read: bool = False, int_class: bool = False) -> tuple[str | NDArray[uint8], list[int | str], NDArray[DTypeLike], list[NDArray[DTypeLike]]]

Return the image and annotation information for an index (when vis_path and class_color are given, the visualization image is saved to vis_path)

Parameters:

Name Type Description Default
idx int

Input index

required
read bool

Whether to read the image

False
int_class bool

Type of the returned classes (integer index if True, class name otherwise)

False

Returns:

Type Description
tuple[str | NDArray[uint8], list[int | str], NDArray[DTypeLike], list[NDArray[DTypeLike]]]

Image path or the loaded image, with the corresponding class_list, bboxes, polys

Source code in zerohertzLib/vision/loader.py
def __call__(
    self, idx: int, read: bool = False, int_class: bool = False
) -> tuple[
    str | NDArray[np.uint8],
    list[int | str],
    NDArray[DTypeLike],
    list[NDArray[DTypeLike]],
]:
    """
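    Return the image and annotation information for an index (when `vis_path` and `class_color` are given, the visualization image is saved to `vis_path`)

    Args:
        idx: Input index
        read: Whether to read the image
        int_class: Type of the returned classes (integer index if True, class name otherwise)

    Returns:
        Image path or the loaded image, with the corresponding `class_list`, `bboxes`, `polys`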
    """
    img_path = os.path.join(
        self.data_path, os.path.basename(self.images[idx]["file_name"])
    )
    if read:
        img = cv2.imread(img_path)
    else:
        img = img_path
    class_list = []
    bboxes = []
    polys = []
    for idx_ in self.image2annotation[self.images[idx]["id"]]:
        annotation = self.annotations[idx_]
        if int_class:
            class_list.append(self.classes[annotation["category_id"]][0])
        else:
            class_list.append(self.classes[annotation["category_id"]][1])
        bboxes.append(
            [
                annotation["bbox"][0] + annotation["bbox"][2] / 2,
                annotation["bbox"][1] + annotation["bbox"][3] / 2,
                annotation["bbox"][2],
                annotation["bbox"][3],
            ]
        )
        if "segmentation" in annotation.keys():
            polys.append(np.array(annotation["segmentation"][0]).reshape(-1, 2))
    bboxes = np.array(bboxes)
    return img, class_list, bboxes, polys
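COCO stores a bbox as `[x, y, w, h]` with `(x, y)` at the top-left corner, while the loader returns the `cwh` layout. The conversion inside `__call__` can be isolated as the following sketch; `coco_bbox_to_cwh` is a hypothetical helper name and the sample annotation is invented.

```python
import numpy as np


def coco_bbox_to_cwh(bbox):
    """COCO [x, y, w, h] (top-left origin) -> [cx, cy, w, h]."""
    x, y, w, h = bbox
    return np.array([x + w / 2, y + h / 2, w, h])


annotation = {"bbox": [10, 20, 30, 40]}  # hypothetical annotation
box_cwh = coco_bbox_to_cwh(annotation["bbox"])
```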

__getitem__

Return the image and annotation information for an index (when vis_path and class_color are given, the visualization image is saved to vis_path)

Parameters:

Name Type Description Default
idx int

Input index

required

Returns:

Type Description
tuple[NDArray[uint8], list[str], NDArray[DTypeLike], list[NDArray[DTypeLike]]]

Loaded image with the corresponding class_list, bboxes, polys

Source code in zerohertzLib/vision/loader.py
def __getitem__(
    self, idx: int
) -> tuple[
    NDArray[np.uint8], list[str], NDArray[DTypeLike], list[NDArray[DTypeLike]]
]:
    """
    Return the image and annotation information for an index (when `vis_path` and `class_color` are given, the visualization image is saved to `vis_path`)

    Args:
        idx: Input index

    Returns:
        Loaded image with the corresponding `class_list`, `bboxes`, `polys`
    """
    img, class_list, bboxes, polys = self(idx, read=True)
    if self.vis_path is not None:
        self._visualization(
            os.path.basename(self.images[idx]["file_name"]),
            img,
            class_list,
            bboxes,
            polys,
        )
    return img, class_list, bboxes, polys

__len__

__len__() -> int

Return the number of images

Returns:

Type Description
int

Number of image files read

Source code in zerohertzLib/vision/loader.py
def __len__(self) -> int:
    """Return the number of images

    Returns:
        Number of image files read
    """
    return len(self.images)

_visualization

_visualization(file_name: str, img: NDArray[uint8], class_list: list[str], bboxes: NDArray[DTypeLike], polys: list[NDArray[DTypeLike]]) -> None
Source code in zerohertzLib/vision/loader.py
def _visualization(
    self,
    file_name: str,
    img: NDArray[np.uint8],
    class_list: list[str],
    bboxes: NDArray[DTypeLike],
    polys: list[NDArray[DTypeLike]],
) -> None:
    for cls, box in zip(class_list, bboxes):
        img = bbox(img, box, self.class_color[cls])
    if polys:
        mks = np.zeros((len(polys), *img.shape[:2]), bool)
        for idx, poly in enumerate(polys):
            mks[idx] = poly2mask(poly, img.shape[:2])
        img = mask(img, mks, class_list=class_list, class_color=self.class_color)
    cv2.imwrite(os.path.join(self.vis_path, file_name), img)

yolo

yolo(target_path: str, label: list[str] | None = None, poly: bool = False) -> None

Convert COCO format to YOLO format

Parameters:

Name Type Description Default
target_path str

Path where the YOLO-format data will be saved

required
label list[str] | None

List that converts the labels used in COCO to integers (by index)

None
poly bool

Whether to use the segmentation format

False

Returns:

Type Description
None

Images and .txt files are saved to {target_path}/images and {target_path}/labels

Examples:

>>> coco = zz.vision.CocoLoader(data_path)
>>> coco.yolo(target_path)
100%|█████████████| 476/476 [00:00<00:00, 78794.25it/s]
>>> label = ["label1", "label2"]
>>> coco.yolo(target_path, label)
100%|█████████████| 476/476 [00:00<00:00, 78794.25it/s]
Source code in zerohertzLib/vision/loader.py
def yolo(
    self,
    target_path: str,
    label: list[str] | None = None,
    poly: bool = False,
) -> None:
    """Convert COCO format to YOLO format

    Args:
        target_path: Path where the YOLO-format data will be saved
        label: List that converts the labels used in COCO to integers (by index)
        poly: Whether to use the segmentation format

    Returns:
        Images and `.txt` files are saved to `{target_path}/images` and `{target_path}/labels`

    Examples:
        >>> coco = zz.vision.CocoLoader(data_path)
        >>> coco.yolo(target_path)
        100%|█████████████| 476/476 [00:00<00:00, 78794.25it/s]
        >>> label = ["label1", "label2"]
        >>> coco.yolo(target_path, label)
        100%|█████████████| 476/476 [00:00<00:00, 78794.25it/s]
    """
    rmtree(os.path.join(target_path, "images"))
    rmtree(os.path.join(target_path, "labels"))
    for idx in tqdm(range(len(self))):
        img_path, class_list, bboxes, polys = self(
            idx, read=False, int_class=label is None
        )
        converted_gt = []
        if poly:
            for cls, poly_ in zip(class_list, polys):
                poly_ /= (self.images[idx]["width"], self.images[idx]["height"])
                if label:
                    cls = label.index(cls)
                converted_gt.append(
                    f"{cls} " + " ".join(map(str, poly_.reshape(-1)))
                )
        else:
            for cls, box in zip(class_list, bboxes):
                box /= (self.images[idx]["width"], self.images[idx]["height"]) * 2
                if label:
                    cls = label.index(cls)
                converted_gt.append(f"{cls} " + " ".join(map(str, box)))
        img_file_name = os.path.basename(img_path)
        txt_file_name = ".".join(img_file_name.split(".")[:-1]) + ".txt"
        try:
            shutil.copy(
                img_path, os.path.join(target_path, "images", img_file_name)
            )
            with open(
                os.path.join(target_path, "labels", txt_file_name),
                "w",
                encoding="utf-8",
            ) as file:
                file.writelines("\n".join(converted_gt))
        except FileNotFoundError:
            print(f"'{img_path}' is not found")
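Each line of a YOLO label file holds a class index followed by the box normalized by the image size. A minimal sketch of that step for the bbox branch follows; `yolo_line` is a hypothetical helper, and the fixed six-decimal formatting is a choice made here for readability rather than what the method above emits.

```python
import numpy as np


def yolo_line(cls, box_cwh, width, height):
    """Format one YOLO label line from a pixel-space [cx, cy, w, h] box."""
    # Divide x-like values by the width and y-like values by the height
    box = np.asarray(box_cwh, dtype=float) / (width, height, width, height)
    return f"{cls} " + " ".join(f"{v:.6f}" for v in box)


line = yolo_line(0, [320, 240, 64, 48], width=640, height=480)
```

The divisor `(width, height, width, height)` is the same tuple the source builds with `(width, height) * 2`.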

ImageLoader

ImageLoader(path: str = './', cnt: int = 1)

Class that returns the images in a path, given the path and the number of images

Parameters:

Name Type Description Default
path str

Path where the images are located

'./'
cnt int

Number of images to return per call

1

Attributes:

Name Type Description
image_paths

Paths of the images found in the specified path

Examples:

>>> il = zz.vision.ImageLoader()
>>> len(il)
510
>>> il[0][0]
'./1.2.410.200001.1.9999.1.20220513101953581.1.1.jpg'
>>> il[0][1].shape
(480, 640, 3)
>>> il = zz.vision.ImageLoader(cnt=4)
>>> len(il)
128
>>> il[0][0]
['./1.2.410.200001.1.9999.1.20220513101953581.1.1.jpg', '...', '...', '...']
>>> il[0][1][0].shape
(480, 640, 3)
>>> len(il[0][0])
4
>>> len(il[0][1])
4

Methods:

Name Description
__getitem__

Return the image information for an index

__len__

Return the number of images

Source code in zerohertzLib/vision/loader.py
def __init__(self, path: str = "./", cnt: int = 1) -> None:
    self.cnt = cnt
    self.image_paths = _get_image_paths(path)
    self.image_paths.sort()

cnt instance-attribute

cnt = cnt

image_paths instance-attribute

image_paths = _get_image_paths(path)

__getitem__

__getitem__(idx: int) -> tuple[str, NDArray[uint8]] | tuple[list[str], list[NDArray[uint8]]]

Return the image information for an index

Parameters:

Name Type Description Default
idx int

Input index

required

Returns:

Type Description
tuple[str, NDArray[uint8]] | tuple[list[str], list[NDArray[uint8]]]

File path(s) and image value(s), depending on cnt

Source code in zerohertzLib/vision/loader.py
def __getitem__(
    self, idx: int
) -> tuple[str, NDArray[np.uint8]] | tuple[list[str], list[NDArray[np.uint8]]]:
    """Return the image information for an index

    Args:
        idx: Input index

    Returns:
        File path(s) and image value(s), depending on `cnt`
    """
    if self.cnt == 1:
        return (
            self.image_paths[idx],
            cv2.imread(self.image_paths[idx], cv2.IMREAD_UNCHANGED),
        )
    return (
        self.image_paths[self.cnt * idx : self.cnt * (idx + 1)],
        [
            cv2.imread(path, cv2.IMREAD_UNCHANGED)
            for path in self.image_paths[self.cnt * idx : self.cnt * (idx + 1)]
        ],
    )

__len__

__len__() -> int

Return the number of images

Returns:

Type Description
int

Number of images, counted in groups of cnt

Source code in zerohertzLib/vision/loader.py
def __len__(self) -> int:
    """Return the number of images

    Returns:
        Number of images, counted in groups of `cnt`
    """
    return math.ceil(len(self.image_paths) / self.cnt)
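The grouping arithmetic can be checked without any image files. The sketch below applies the same slicing and `math.ceil` rule to a hypothetical list of paths; the `batch` helper and the file names are invented for illustration.

```python
import math


def batch(paths, cnt, idx):
    """Return the idx-th group of at most cnt paths."""
    return paths[cnt * idx : cnt * (idx + 1)]


paths = [f"{i}.jpg" for i in range(10)]  # hypothetical image paths
cnt = 4
num_batches = math.ceil(len(paths) / cnt)
last = batch(paths, cnt, num_batches - 1)
```

The final group may be shorter than `cnt`. This matches the example above, where 510 images with `cnt=4` give a length of 128.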

JsonImageLoader

JsonImageLoader(data_path: str, json_path: str, json_key: str)

Class that loads images and the information in JSON files by way of the JSON files

Parameters:

Name Type Description Default
data_path str

Directory path where the target data is located

required
json_path str

Directory path where the target JSON files are located

required
json_key str

Key whose value gives the file name of the data in data_path

required

Attributes:

Name Type Description
json

Reads the JSON files; used when building the data

Examples:

>>> jil = zz.vision.JsonImageLoader(data_path, json_path, json_key)
100%|█████████████| 17248/17248 [00:04<00:00, 3581.22it/s]
>>> img, js = jil[10]
>>> img.shape
(600, 800, 3)
>>> js.tree()
└─ info
    └─ name
    └─ date_created
...

Methods:

Name Description
__getitem__

Index the loaded JSON files like a list and return the corresponding image

__len__

Return the number of images

Source code in zerohertzLib/vision/loader.py
def __init__(
    self,
    data_path: str,
    json_path: str,
    json_key: str,
) -> None:
    self.data_path = data_path
    self.json_path = json_path
    self.json = JsonDir(json_path)
    self.json_key = self.json._get_key(json_key)

data_path instance-attribute

data_path = data_path

json instance-attribute

json = JsonDir(json_path)

json_key instance-attribute

json_key = _get_key(json_key)

json_path instance-attribute

json_path = json_path

__getitem__

__getitem__(idx: int) -> tuple[NDArray[uint8], Json]

Index the loaded JSON files like a list and return the corresponding image

Parameters:

Name Type Description Default
idx int

Input index

required

Returns:

Type Description
tuple[NDArray[uint8], Json]

Image and the information in the JSON

Source code in zerohertzLib/vision/loader.py
def __getitem__(self, idx: int) -> tuple[NDArray[np.uint8], Json]:
    """
    Index the loaded JSON files like a list and return the corresponding image

    Args:
        idx: Input index

    Returns:
        Image and the information in the JSON
    """
    data_name = self.json[idx].get(self.json_key)
    img = cv2.imread(os.path.join(self.data_path, data_name), cv2.IMREAD_UNCHANGED)
    return img, self.json[idx]

__len__

__len__() -> int

Return the number of images

Returns:

Type Description
int

Number of JSON files read

Source code in zerohertzLib/vision/loader.py
def __len__(self) -> int:
    """Return the number of images

    Returns:
        Number of JSON files read
    """
    return len(self.json)

LabelStudio

LabelStudio(data_path: str, json_path: str | None = None)

Class that handles Label Studio data

Parameters:

Name Type Description Default
data_path str

Directory path where the images are located

required
json_path str | None

JSON file containing the annotation information, used when converting from Label Studio to another format

None

Examples:

Without json_path:

>>> ls = zz.vision.LabelStudio(data_path)
>>> ls[0]
('0000007864.png', {'data': {'image': 'data/local-files/?d=/label-studio/data/local/tmp/0000007864.png'}})
>>> ls[1]
('0000008658.png', {'data': {'image': 'data/local-files/?d=/label-studio/data/local/tmp/0000008658.png'}})
With json_path:

Bbox:
>>> ls = zz.vision.LabelStudio(data_path, json_path)
>>> ls[0]
('/PATH/TO/IMAGE', {'labels': ['label1', ...], 'polys': [array([0.39471694, 0.30683403, 0.03749811, 0.0167364 ]), ...], 'whs': [(1660, 2349), ...]})
>>> ls[1]
('/PATH/TO/IMAGE', {'labels': ['label2', ...], 'polys': [array([0.29239837, 0.30149896, 0.04013469, 0.02736506]), ...], 'whs': [(1655, 2324), ...]})
>>> ls.labels
{'label1', 'label2'}
>>> ls.type
'rectanglelabels'
Poly:
>>> ls = zz.vision.LabelStudio(data_path, json_path)
>>> ls[0]
('/PATH/TO/IMAGE', {'labels': ['label1', ...], 'polys': [array([[0.4531892 , 0.32880674], ..., [0.46119428, 0.32580483]]), ...], 'whs': [(3024, 4032), ...]})
>>> ls[1]
('/PATH/TO/IMAGE', {'labels': ['label2', ...], 'polys': [array([[0.31973699, 0.14660367], ..., [0.29032053, 0.1484422 ]]), ...], 'whs': [(3024, 4032), ...]})
>>> ls.labels
{'label1', 'label2'}
>>> ls.type
'polygonlabels'

Methods:

Name Description
__getitem__

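Return the image file name or path for an index, with the dictionary for the JSON file or the annotation information

__len__

Return the number of data items

classification

Convert JSON data annotated with Label Studio to classification format

coco

Convert JSON data annotated with Label Studio to COCO format

json

Create a JSON file for loading data mounted in Label Studio

labelme

Convert JSON data annotated with Label Studio to LabelMe format

yolo

Convert JSON data annotated with Label Studio to YOLO format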

Attributes:

Name Type Description
annotations
data_path
data_paths
labels
path
type
Source code in zerohertzLib/vision/data.py
def __init__(
    self,
    data_path: str,
    json_path: str | None = None,
) -> None:
    self.annotations = None
    if json_path is None:
        self.path = "/label-studio/data/local"
        self.data_paths = _get_image_paths(data_path)
    else:
        self.annotations = Json(json_path)
        self.type = self.annotations[0]["annotations"][0]["result"][0]["type"]
    self.data_path = data_path
    self.labels = set()

annotations instance-attribute

annotations = None

data_path instance-attribute

data_path = data_path

data_paths instance-attribute

data_paths = _get_image_paths(data_path)

labels instance-attribute

labels = set()

path instance-attribute

path = '/label-studio/data/local'

type instance-attribute

type = annotations[0]['annotations'][0]['result'][0]['type']

__getitem__

__getitem__(idx: int) -> tuple[str, dict[str, dict[str, str]]] | tuple[str, dict[str, list[Any]]]

Parameters:

Name Type Description Default
idx int

입력 index

required

Returns:

Type Description
tuple[str, dict[str, dict[str, str]]] | tuple[str, dict[str, list[Any]]]

Image file name or path for the index, and the dictionary to be included in the JSON file or the annotation information

Source code in zerohertzLib/vision/data.py
def __getitem__(
    self, idx: int
) -> tuple[str, dict[str, dict[str, str]]] | tuple[str, dict[str, list[Any]]]:
    """
    Args:
        idx: Input index

    Returns:
        Image file name or path for the index, and the dictionary to be included in the JSON file or the annotation information
    """
    if self.annotations is None:
        file_name = os.path.basename(self.data_paths[idx])
        return (
            file_name,
            {
                "data": {
                    "image": f"data/local-files/?d={self.path}/{self.data_paths[idx]}"
                }
            },
        )
    file_name = os.path.basename(self.annotations[idx]["data"]["image"])
    file_name = urllib.parse.unquote(file_name)
    if len(file_name) > 8 and "-" == file_name[8]:
        file_name = "-".join(file_name.split("-")[1:])
    file_path = os.path.join(self.data_path, file_name)
    if len(self.annotations[idx]["annotations"]) > 1:
        raise ValueError("The 'annotations' are plural")
    if self.type == "rectanglelabels":
        return (
            file_path,
            self._dict2cwh(self.annotations[idx]["annotations"][0]["result"]),
        )
    if self.type == "polygonlabels":
        return (
            file_path,
            self._dict2poly(self.annotations[idx]["annotations"][0]["result"]),
        )
    raise ValueError(f"Unknown annotation type: {self.type}")
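The file-name handling in `__getitem__` decodes URL escapes and drops a leading prefix when the ninth character is `-`, presumably the short hash that Label Studio prepends to uploaded files. A standalone sketch of that rule follows; `clean_file_name` is a hypothetical helper and the sample URLs are invented.

```python
import os
import urllib.parse


def clean_file_name(image_url):
    """Recover the original file name from a Label Studio image URL."""
    file_name = urllib.parse.unquote(os.path.basename(image_url))
    # An 8-character prefix followed by "-" is treated as an upload hash
    if len(file_name) > 8 and file_name[8] == "-":
        file_name = "-".join(file_name.split("-")[1:])
    return file_name


uploaded = clean_file_name("/data/upload/1/a1b2c3d4-my%20image-01.png")
local = clean_file_name(
    "data/local-files/?d=/label-studio/data/local/tmp/0000007864.png"
)
```

Only the first `-`-separated segment is removed, so hyphens in the original name survive. A name that happens to have `-` at index 8 would also be trimmed, which is a limitation of this heuristic.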

__len__

__len__() -> int

Return the number of data items

Returns:

Type Description
int

Number of image files or annotations read

Source code in zerohertzLib/vision/data.py
def __len__(self) -> int:
    """Return the number of data items

    Returns:
        Number of image files or annotations read
    """
    if self.annotations is None:
        return len(self.data_paths)
    return len(self.annotations)

_dict2cwh

_dict2cwh(results: list[dict[str, Any]]) -> dict[str, Any]
Source code in zerohertzLib/vision/data.py
def _dict2cwh(self, results: list[dict[str, Any]]) -> dict[str, Any]:
    labels, polys, whs = [], [], []
    for result in results:
        width, height = result["original_width"], result["original_height"]
        box_cwh = (
            np.array(
                [
                    result["value"]["x"],
                    result["value"]["y"],
                    result["value"]["width"],
                    result["value"]["height"],
                ]
            )
            / 100
        )
        if len(result["value"]["rectanglelabels"]) > 1:
            raise ValueError("The 'rectanglelabels' are plural")
        label = result["value"]["rectanglelabels"][0]
        labels.append(label)
        self.labels.add(label)
        polys.append(box_cwh)
        whs.append((width, height))
    return {"labels": labels, "polys": polys, "whs": whs}
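Label Studio rectangle values are percentages of the image size, with `x` and `y` at the top-left corner, so `_dict2cwh` divides by 100 and later steps scale by the original width and height. The sketch below chains both steps to reach pixel-space `xyxy`; `percent_box_to_xyxy` is a hypothetical helper and the sample `value` is invented.

```python
import numpy as np


def percent_box_to_xyxy(value, width, height):
    """Label Studio percent rectangle -> pixel [x0, y0, x1, y1]."""
    box = np.array(
        [value["x"], value["y"], value["width"], value["height"]], dtype=float
    )
    box = box / 100 * (width, height, width, height)
    box[2:] += box[:2]  # [x0, y0, w, h] -> [x0, y0, x1, y1]
    return box


value = {"x": 25.0, "y": 50.0, "width": 10.0, "height": 20.0}
box_xyxy = percent_box_to_xyxy(value, width=200, height=100)
```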

_dict2poly

_dict2poly(results: list[dict[str, Any]]) -> dict[str, Any]
Source code in zerohertzLib/vision/data.py
def _dict2poly(self, results: list[dict[str, Any]]) -> dict[str, Any]:
    labels, polys, whs = [], [], []
    for result in results:
        width, height = result["original_width"], result["original_height"]
        box_poly = np.array(result["value"]["points"]) / 100
        if len(result["value"]["polygonlabels"]) > 1:
            raise ValueError("The 'polygonlabels' are plural")
        label = result["value"]["polygonlabels"][0]
        labels.append(label)
        self.labels.add(label)
        polys.append(box_poly)
        whs.append((width, height))
    return {"labels": labels, "polys": polys, "whs": whs}

classification

classification(target_path: str, label: dict[str, Any] | None = None, rand: int = 0, shrink: bool = True, aug: int = 1) -> None

Convert JSON data annotated with Label Studio to classification format

Parameters:

Name Type Description Default
target_path str

Path where the classification-format data will be saved

required
label dict[str, Any] | None

Dictionary that renames the labels used in Label Studio

None
rand int

Random range added when cropping the image

0
shrink bool

Whether the crop is allowed to shrink when rand is applied

True
aug int

Number of images to save per annotation

1

Returns:

Type Description
None

Saved to {target_path}/{label}/{img_file_name}_{idx}_{i}.{img_file_ext} (idx: index of the annotation, i: index of the rand crop)

Examples:

>>> ls = zz.vision.LabelStudio(data_path, json_path)
>>> ls.classification(target_path)
100%|█████████████| 476/476 [00:00<00:00, 78794.25it/s]
>>> label = {"label1": "lab1", "label2": "lab2"}
>>> ls.classification(target_path, label, rand=10, aug=10, shrink=False)
100%|█████████████| 476/476 [00:00<00:00, 78794.25it/s]
Source code in zerohertzLib/vision/data.py
def classification(
    self,
    target_path: str,
    label: dict[str, Any] | None = None,
    rand: int = 0,
    shrink: bool = True,
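    aug: int = 1,
) -> None:
    """Convert JSON data annotated with Label Studio to classification format

    Args:
        target_path: Path where the classification-format data will be saved
        label: Dictionary that renames the labels used in Label Studio
        rand: Random range added when cropping the image
        shrink: Whether the crop is allowed to shrink when `rand` is applied
        aug: Number of images to save per annotation

    Returns:
        Saved to `{target_path}/{label}/{img_file_name}_{idx}_{i}.{img_file_ext}` (`idx`: index of the annotation, `i`: index of the `rand` crop)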

    Examples:
        >>> ls = zz.vision.LabelStudio(data_path, json_path)
        >>> ls.classification(target_path)
        100%|█████████████| 476/476 [00:00<00:00, 78794.25it/s]
        >>> label = {"label1": "lab1", "label2": "lab2"}
        >>> ls.classification(target_path, label, rand=10, aug=10, shrink=False)
        100%|█████████████| 476/476 [00:00<00:00, 78794.25it/s]
    """
    if label is None:
        label = {}
    for file_path, result in tqdm(self):
        img = cv2.imread(file_path)
        if img is None:
            print(f"'{file_path}' is not found")
            continue
        img_file = file_path.split("/")[-1].split(".")
        img_file_name = ".".join(img_file[:-1])
        img_file_ext = img_file[-1]
        for idx, (lab, poly, wh) in enumerate(
            zip(result["labels"], result["polys"], result["whs"])
        ):
            if self.type == "rectanglelabels":
                box_xyxy = poly * (wh * 2)
                box_xyxy[2:] += box_xyxy[:2]
            elif self.type == "polygonlabels":
                box_poly = poly * wh
                box_xyxy = poly2xyxy(box_poly)
            else:
                raise ValueError(f"Unknown annotation type: {self.type}")
            os.makedirs(
                os.path.join(target_path, label.get(lab, lab)), exist_ok=True
            )
            for i in range(aug):
                bias = (2 * rand * (np.random.rand(4) - 0.5)).astype(np.int32)
                if not shrink:
                    bias[:2] = -abs(bias[:2])
                    bias[2:] = abs(bias[2:])
                x_0, y_0, x_1, y_1 = box_xyxy.astype(np.int32) + bias
                try:
                    cv2.imwrite(
                        os.path.join(
                            target_path,
                            label.get(lab, lab),
                            f"{img_file_name}_{idx}_{i}.{img_file_ext}",
                        ),
                        img[y_0:y_1, x_0:x_1, :],
                    )
                except cv2.error:
                    print(
                        f"Impossible crop ('x_0': {x_0}, 'y_0': {y_0}, 'x_1': {x_1}, 'y_1': {y_1})"
                    )
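The `rand` and `shrink` parameters control a per-corner integer offset added to the crop box. The sketch below isolates that step; `jitter` is a hypothetical helper, and it takes a seeded `numpy.random.Generator` instead of the `np.random.rand` call used in the source so that the result is reproducible.

```python
import numpy as np


def jitter(box_xyxy, rand, shrink, rng):
    """Offset each coordinate of a pixel xyxy box by less than rand pixels."""
    bias = (2 * rand * (rng.random(4) - 0.5)).astype(np.int32)
    if not shrink:
        # Push the top-left corner outward and the bottom-right corner outward
        bias[:2] = -abs(bias[:2])
        bias[2:] = abs(bias[2:])
    return np.asarray(box_xyxy).astype(np.int32) + bias


rng = np.random.default_rng(0)
box = np.array([50, 50, 70, 70])
grown = jitter(box, rand=10, shrink=False, rng=rng)
```

With `shrink=False` the crop can only grow or stay the same; with `shrink=True` each side may move in either direction.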

coco

coco(target_path: str, label: dict[str, int]) -> None

Convert JSON data annotated with Label Studio to COCO format

Parameters:

Name Type Description Default
target_path str

Path where the COCO-format data will be saved

required
label dict[str, int]

Dictionary that converts the labels used in Label Studio

required

Returns:

Type Description
None

JSON file saved to {target_path}.json

Examples:

>>> ls = zz.vision.LabelStudio(data_path, json_path)
>>> label = {"label1": 1, "label2": 2}
>>> ls.coco(target_path, label)
100%|█████████████| 476/476 [00:00<00:00, 78794.25it/s]
Source code in zerohertzLib/vision/data.py
def coco(self, target_path: str, label: dict[str, int]) -> None:
    """Convert JSON data annotated with Label Studio to COCO format

    Args:
        target_path: Path where the COCO-format data will be saved
        label: Dictionary that converts the labels used in Label Studio

    Returns:
        JSON file saved to `{target_path}.json`

    Examples:
        >>> ls = zz.vision.LabelStudio(data_path, json_path)
        >>> label = {"label1": 1, "label2": 2}
        >>> ls.coco(target_path, label)
        100%|█████████████| 476/476 [00:00<00:00, 78794.25it/s]
    """
    converted_gt = {
        "images": [],
        "annotations": [],
        "categories": [],
    }
    for lab, id_ in label.items():
        converted_gt["categories"].append({"id": id_, "name": lab})
    ant_id = 0
    for id_, (file_path, result) in enumerate(tqdm(self)):
        _images = {
            "file_name": os.path.basename(file_path),
            "height": result["whs"][0][1],
            "width": result["whs"][0][0],
            "id": id_,
        }
        _annotations = []
        for ant_id_, (lab, poly, wh) in enumerate(
            zip(result["labels"], result["polys"], result["whs"])
        ):
            # box_cwh is [x_0, y_0, width, height] not [cx, cy, width, height]
            if self.type == "rectanglelabels":
                poly = poly * (wh * 2)
                box_cwh = poly.copy()
                poly[2:] += poly[:2]
                poly = xyxy2poly(poly)
            elif self.type == "polygonlabels":
                poly = poly * wh
                box_cwh = poly2cwh(poly)
                box_cwh[:2] -= box_cwh[2:] / 2
            else:
                raise ValueError(f"Unknown annotation type: {self.type}")
            box_cwh = box_cwh.tolist()
            _annotations.append(
                {
                    "segmentation": [poly.reshape(-1).tolist()],
                    "area": box_cwh[2] * box_cwh[3],
                    "iscrowd": 0,
                    "image_id": id_,
                    "bbox": box_cwh,
                    "category_id": label[lab],
                    "id": ant_id + ant_id_,
                }
            )
        converted_gt["images"].append(_images)
        converted_gt["annotations"] += _annotations
        ant_id += len(result["labels"]) + 1
    write_json(converted_gt, target_path)
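For the rectangle case, each COCO annotation carries a top-left `[x0, y0, w, h]` bbox, its area, and a four-corner segmentation. A minimal sketch of one such record follows; `rect_to_coco_annotation` is a hypothetical helper, and the corner order of the segmentation is an assumption because `xyxy2poly` is not shown here.

```python
def rect_to_coco_annotation(box_xywh, image_id, category_id, ann_id):
    """Build one COCO annotation dict from a pixel [x0, y0, w, h] box."""
    x0, y0, w, h = box_xywh
    x1, y1 = x0 + w, y0 + h
    return {
        # Assumed corner order: top-left, top-right, bottom-right, bottom-left
        "segmentation": [[x0, y0, x1, y0, x1, y1, x0, y1]],
        "area": w * h,
        "iscrowd": 0,
        "image_id": image_id,
        "bbox": [x0, y0, w, h],
        "category_id": category_id,
        "id": ann_id,
    }


ann = rect_to_coco_annotation(
    [50.0, 50.0, 20.0, 20.0], image_id=0, category_id=1, ann_id=0
)
```

As in the method above, `area` is the bbox area, which overestimates the true area for non-rectangular polygons.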

json

json(path: str = '/label-studio/data/local', data_function: Callable[[str], dict[str, Any]] | None = None) -> None

Create a JSON file for loading data mounted in Label Studio

Note

Using a Label Studio image with the environment variable set as below, you can apply the JSON file generated by the LabelStudio class.

FROM heartexlabs/label-studio

ENV LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED=true

docker run --name label-studio -p 8080:8080 -v ${PWD}/data:/label-studio/data label-studio

Click Projects → {PROJECT_NAME} → Settings → Cloud Storage → Add Source Storage, fill in the information as below, and press Sync Storage.

  • Storage Type: Local files
  • Absolute local path: /label-studio/data/local/${PATH} (data_path: ${PWD}/data/local)
  • File Filter Regex: ^.*\.(jpe?g|JPE?G|png|PNG|tiff?|TIFF?)$
  • Treat every bucket object as a source file: True

Label Studio Setup 1

After syncing, importing the JSON file generated by the LabelStudio class into Label Studio lets you set it up as shown below.

Label Studio Setup 2

Parameters:

Name Type Description Default
path str

Path of the local files

'/label-studio/data/local'
data_function Callable[[str], dict[str, Any]] | None

Method for adding data entries that can be used in Label Studio (see the examples)

None

Returns:

Type Description
None

Result saved to {data_path}.json

Examples:

Default:

>>> ls = zz.vision.LabelStudio(data_path)
>>> ls.json()
100%|█████████████| 476/476 [00:00<00:00, 259993.32it/s]
[
    {
        "data": {
            "image": "data/local-files/?d=/label-studio/data/local/tmp/0000007864.png"
        }
    },
    {
        "data": {
            "...": "..."
        }
    },
]
With data_function:

def data_function(file_name):
    return data_store[file_name]

>>> ls = zz.vision.LabelStudio(data_path)
>>> ls.json(data_function=data_function)
100%|█████████████| 476/476 [00:00<00:00, 78794.25it/s]
[
    {
        "data": {
            "image": "data/local-files/?d=/label-studio/data/local/tmp/0000007864.png",
            "Label": "...",
            "patient_id": "...",
            "...": "...",
        }
    },
    {
        "data": {
            "...": "..."
        }
    },
]
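The output above suggests that the dictionary returned by `data_function` is merged into each task's `data` entry next to `image`. The sketch below shows one way that merge could work; `build_task`, `data_store`, and its contents are hypothetical, and the merge itself is an assumption, since the relevant part of the library source is not shown here.

```python
# Hypothetical per-file metadata
data_store = {"0000007864.png": {"Label": "A", "patient_id": "P-001"}}


def data_function(file_name: str) -> dict:
    """Return the extra data fields for one image file."""
    return data_store[file_name]


def build_task(file_name, image_url, data_function=None):
    """Assemble one Label Studio task (assumed merge behavior)."""
    data = {"image": image_url}
    if data_function is not None:
        data.update(data_function(file_name))
    return {"data": data}


task = build_task(
    "0000007864.png",
    "data/local-files/?d=/label-studio/data/local/tmp/0000007864.png",
    data_function,
)
```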

Source code in zerohertzLib/vision/data.py
def json(
    self,
    path: str = "/label-studio/data/local",
    data_function: Callable[[str], dict[str, Any]] | None = None,
) -> None:
    r"""Generate a JSON file for loading data mounted in Label Studio

    Note:
        If you use a Label Studio image with the environment variable set as shown below, the JSON file generated by the `LabelStudio` class can be applied.

        ```dockerfile
        FROM heartexlabs/label-studio

        ENV LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED=true
        ```

        ```bash
        docker run --name label-studio -p 8080:8080 -v ${PWD}/data:/label-studio/data label-studio
        ```

        Click `Projects` → `{PROJECT_NAME}` → `Settings` → `Cloud Storage` → `Add Source Storage`, fill in the information as shown below, and press `Sync Storage`.

        + Storage Type: `Local files`
        + Absolute local path: `/label-studio/data/local/${PATH}` (`data_path`: `${PWD}/data/local`)
        + File Filter Regex: `^.*\.(jpe?g|JPE?G|png|PNG|tiff?|TIFF?)$`
        + Treat every bucket object as a source file: `True`

        ![Label Studio Setup 1](../../../assets/vision/LabelStudio.json.1.png)

        After syncing, importing the JSON file generated by the `LabelStudio` class into Label Studio sets the project up as shown below.

        ![Label Studio Setup 2](../../../assets/vision/LabelStudio.json.2.png)

    Args:
        path: Path to the local files
        data_function: Method that adds extra `data` fields usable in Label Studio (see the examples)

    Returns:
        The result is saved to `{data_path}.json`

    Examples:
        Default:
            ```python
            >>> ls = zz.vision.LabelStudio(data_path)
            >>> ls.json()
            100%|█████████████| 476/476 [00:00<00:00, 259993.32it/s]
            ```
            ```json
            [
                {
                    "data": {
                        "image": "data/local-files/?d=/label-studio/data/local/tmp/0000007864.png"
                    }
                },
                {
                    "data": {
                        "...": "..."
                    }
                },
            ]
            ```
        With `data_function`:
            ```python
            def data_function(file_name):
                return data_store[file_name]
            ```
            ```python
            >>> ls = zz.vision.LabelStudio(data_path)
            >>> ls.json(data_function)
            100%|█████████████| 476/476 [00:00<00:00, 78794.25it/s]
            ```
            ```json
            [
                {
                    "data": {
                        "image": "data/local-files/?d=/label-studio/data/local/tmp/0000007864.png",
                        "Label": "...",
                        "patient_id": "...",
                        "...": "...",
                    }
                },
                {
                    "data": {
                        "...": "..."
                    }
                },
            ]
            ```
    """
    self.path = path
    json_data = []
    for file_name, data in tqdm(self):
        if "aug" in file_name:
            continue
        if data_function is not None:
            data["data"].update(data_function(file_name))
        json_data.append(data)
    write_json(json_data, self.data_path)

labelme

labelme(target_path: str, label: dict[str, Any] | None = None) -> None

Convert JSON data annotated in Label Studio to the LabelMe format

Parameters:

Name Type Description Default
target_path str

Path where the LabelMe-format data will be saved

required
label dict[str, Any] | None

Dictionary that renames the labels used in Label Studio

None

Returns:

Type Description
None

Images are saved to {target_path}/images and JSON files to {target_path}/labels

Examples:

>>> ls = zz.vision.LabelStudio(data_path, json_path)
>>> ls.labelme(target_path)
100%|█████████████| 476/476 [00:00<00:00, 78794.25it/s]
>>> label = {"label1": "lab1", "label2": "lab2"}
>>> ls.labelme(target_path, label)
100%|█████████████| 476/476 [00:00<00:00, 78794.25it/s]
Source code in zerohertzLib/vision/data.py
def labelme(self, target_path: str, label: dict[str, Any] | None = None) -> None:
    """Convert JSON data annotated in Label Studio to the LabelMe format

    Args:
        target_path: Path where the LabelMe-format data will be saved
        label: Dictionary that renames the labels used in Label Studio

    Returns:
        Images are saved to `{target_path}/images` and JSON files to `{target_path}/labels`

    Examples:
        >>> ls = zz.vision.LabelStudio(data_path, json_path)
        >>> ls.labelme(target_path)
        100%|█████████████| 476/476 [00:00<00:00, 78794.25it/s]
        >>> label = {"label1": "lab1", "label2": "lab2"}
        >>> ls.labelme(target_path, label)
        100%|█████████████| 476/476 [00:00<00:00, 78794.25it/s]
    """
    if label is None:
        label = {}
    rmtree(os.path.join(target_path, "images"))
    rmtree(os.path.join(target_path, "labels"))
    for file_path, result in tqdm(self):
        img_file_name = file_path.split("/")[-1]
        json_file_name = ".".join(img_file_name.split(".")[:-1])
        converted_gt = []
        for lab, poly, wh in zip(result["labels"], result["polys"], result["whs"]):
            if self.type == "rectanglelabels":
                box_xyxy = poly * (wh * 2)
                box_xyxy[2:] += box_xyxy[:2]
                box_poly = xyxy2poly(box_xyxy)
            elif self.type == "polygonlabels":
                box_poly = poly * wh
            else:
                raise ValueError(f"Unknown annotation type: {self.type}")
            converted_gt.append(
                {
                    "label": label.get(lab, lab),
                    "points": box_poly.tolist(),
                    "shape_type": "polygon",
                }
            )
        try:
            shutil.copy(
                file_path, os.path.join(target_path, "images", img_file_name)
            )
            write_json(
                {"shapes": converted_gt},
                os.path.join(target_path, "labels", json_file_name),
            )
        except FileNotFoundError:
            print(f"'{file_path}' is not found")
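
The `rectanglelabels` branch above scales a relative `[x, y, w, h]` box by the image size, turns it into two corners, and then expands it to four points. The following is a minimal pure-Python sketch of that arithmetic. It assumes the loader already yields coordinates normalized to 0..1 with a top-left origin, and `rect_to_poly` is a hypothetical name, not a library function.

```python
def rect_to_poly(rel_xywh, image_wh):
    """Relative [x, y, w, h] (top-left origin, 0..1) -> absolute 4-point polygon."""
    width, height = image_wh
    x, y, w, h = rel_xywh
    x0, y0 = x * width, y * height            # top-left corner in pixels
    x1, y1 = x0 + w * width, y0 + h * height  # bottom-right corner in pixels
    # Same corner order as the cwh2poly example: TL, TR, BR, BL.
    return [[x0, y0], [x1, y0], [x1, y1], [x0, y1]]


print(rect_to_poly([0.25, 0.5, 0.5, 0.25], (200, 100)))
# [[50.0, 50.0], [150.0, 50.0], [150.0, 75.0], [50.0, 75.0]]
```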

yolo

yolo(target_path: str, label: list[str] | None = None) -> None

Convert JSON data annotated in Label Studio to the YOLO format

Parameters:

Name Type Description Default
target_path str

Path where the YOLO-format data will be saved

required
label list[str] | None

List that maps the labels used in Label Studio to integers (by index)

None

Returns:

Type Description
None

Images are saved to {target_path}/images and .txt files to {target_path}/labels

Examples:

>>> ls = zz.vision.LabelStudio(data_path, json_path)
>>> ls.yolo(target_path)
100%|█████████████| 476/476 [00:00<00:00, 78794.25it/s]
>>> label = ["label1", "label2"]
>>> ls.yolo(target_path, label)
100%|█████████████| 476/476 [00:00<00:00, 78794.25it/s]
Source code in zerohertzLib/vision/data.py
def yolo(self, target_path: str, label: list[str] | None = None) -> None:
    """Convert JSON data annotated in Label Studio to the YOLO format

    Args:
        target_path: Path where the YOLO-format data will be saved
        label: List that maps the labels used in Label Studio to integers (by index)

    Returns:
        Images are saved to `{target_path}/images` and `.txt` files to `{target_path}/labels`

    Examples:
        >>> ls = zz.vision.LabelStudio(data_path, json_path)
        >>> ls.yolo(target_path)
        100%|█████████████| 476/476 [00:00<00:00, 78794.25it/s]
        >>> label = ["label1", "label2"]
        >>> ls.yolo(target_path, label)
        100%|█████████████| 476/476 [00:00<00:00, 78794.25it/s]
    """
    if label is None:
        label = []
    rmtree(os.path.join(target_path, "images"))
    rmtree(os.path.join(target_path, "labels"))
    for file_path, result in tqdm(self):
        img_file_name = os.path.basename(file_path)
        txt_file_name = ".".join(img_file_name.split(".")[:-1]) + ".txt"
        converted_gt = []
        for lab, poly in zip(result["labels"], result["polys"]):
            if self.type == "rectanglelabels":
                poly[:2] += poly[2:] / 2
                box_cwh = poly
            elif self.type == "polygonlabels":
                box_cwh = poly2cwh(poly)
            else:
                raise ValueError(f"Unknown annotation type: {self.type}")
            if lab not in label:
                label.append(lab)
            converted_gt.append(
                f"{label.index(lab)} " + " ".join(map(str, box_cwh)) + "\n"
            )
        try:
            shutil.copy(
                file_path, os.path.join(target_path, "images", img_file_name)
            )
            with open(
                os.path.join(target_path, "labels", txt_file_name),
                "w",
                encoding="utf-8",
            ) as file:
                file.writelines(converted_gt)
        except FileNotFoundError:
            print(f"'{file_path}' is not found")
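
Each row written to the `.txt` file has the form `<class index> <cx> <cy> <w> <h>`. A rough standalone sketch of how one rectangle row is assembled is shown below. `yolo_line` is a hypothetical helper, and the input is assumed to be a relative top-left `[x, y, w, h]` box as in the `rectanglelabels` branch.

```python
def yolo_line(label, rel_xywh, labels):
    """Build one YOLO row: '<class index> <cx> <cy> <w> <h>'."""
    if label not in labels:
        labels.append(label)  # unseen labels get the next free index
    x, y, w, h = rel_xywh
    cwh = [x + w / 2, y + h / 2, w, h]  # top-left corner -> box centre
    return f"{labels.index(label)} " + " ".join(map(str, cwh)) + "\n"


labels = ["cat"]
print(yolo_line("dog", [0.25, 0.5, 0.5, 0.25], labels), end="")
# 1 0.5 0.625 0.5 0.25
print(labels)
# ['cat', 'dog']
```

Because unseen labels are appended on first encounter, the class indices depend on iteration order when no `label` list is passed. Passing an explicit list keeps the indices stable across runs.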

YoloLoader

YoloLoader(data_path: str = 'images', txt_path: str = 'labels', poly: bool = False, absolute: bool = False, vis_path: str | None = None, class_color: dict[int | str, tuple[int, int, int]] | None = None)

Class that reads and visualizes a YOLO-format dataset

Parameters:

Name Type Description Default
data_path str

Path to the directory containing the images

'images'
txt_path str

Path to the directory containing the YOLO-format .txt files

'labels'
poly bool

Format of the .txt files (False: detection, True: segmentation)

False
absolute bool

Whether the .txt files use absolute coordinates (False: relative coordinates, True: absolute coordinates)

False
vis_path str | None

Path where the visualized images will be saved

None
class_color dict[int | str, tuple[int, int, int]] | None

Per-class colors applied to the visualization

None

Examples:

>>> data_path = ".../images"
>>> txt_path = ".../labels"
>>> class_color = {0: (0, 255, 0), 1: (255, 0, 0), 2: (0, 0, 255)}
>>> yolo = zz.vision.YoloLoader(data_path, txt_path, poly=True, absolute=False, vis_path="tmp", class_color=class_color)
>>> image, class_list, objects = yolo[0]
>>> type(image)
<class 'numpy.ndarray'>
>>> class_list
[1, 1]
>>> len(objects)
2

Methods:

Name Description
__getitem__

Return the image and .txt file information for the given index (when vis_path and class_color are provided, the visualized image is saved to vis_path)

__len__

Return the number of images

labelstudio

Convert YOLO-format data so it can be reviewed and edited in Label Studio

Attributes:

Name Type Description
absolute
class_color
data_path
data_paths
poly
txt_path
vis_path
Source code in zerohertzLib/vision/loader.py
def __init__(
    self,
    data_path: str = "images",
    txt_path: str = "labels",
    poly: bool = False,
    absolute: bool = False,
    vis_path: str | None = None,
    class_color: dict[int | str, tuple[int, int, int]] | None = None,
) -> None:
    self.data_path = data_path
    self.data_paths = _get_image_paths(self.data_path)
    self.txt_path = txt_path
    self.poly = poly
    self.absolute = absolute
    self.vis_path = vis_path
    if vis_path is not None:
        if class_color is None:
            raise ValueError(
                "Visualization requires the 'class_color' variable to be specified"
            )
        rmtree(vis_path)
        self.class_color = class_color

absolute instance-attribute

absolute = absolute

class_color instance-attribute

class_color = class_color

data_path instance-attribute

data_path = data_path

data_paths instance-attribute

data_paths = _get_image_paths(data_path)

poly instance-attribute

poly = poly

txt_path instance-attribute

txt_path = txt_path

vis_path instance-attribute

vis_path = vis_path

__getitem__

__getitem__(idx: int) -> tuple[NDArray[uint8], list[int], list[NDArray[DTypeLike]]]

Return the image and .txt file information for the given index (when vis_path and class_color are provided, the visualized image is saved to vis_path)

Parameters:

Name Type Description Default
idx int

Input index

required

Returns:

Type Description
tuple[NDArray[uint8], list[int], list[NDArray[DTypeLike]]]

The loaded image with its class_list and bbox or poly

Source code in zerohertzLib/vision/loader.py
def __getitem__(
    self, idx: int
) -> tuple[NDArray[np.uint8], list[int], list[NDArray[DTypeLike]]]:
    """
    Return the image and `.txt` file information for the given index (when `vis_path` and `class_color` are provided, the visualized image is saved to `vis_path`)

    Args:
        idx: Input index

    Returns:
        The loaded image with its `class_list` and `bbox` or `poly`
    """
    data_path = self.data_paths[idx]
    data_file_name = data_path.split("/")[-1]
    txt_path = os.path.join(
        self.txt_path, ".".join(data_file_name.split(".")[:-1]) + ".txt"
    )
    img = cv2.imread(data_path)
    try:
        class_list, objects = self._convert(txt_path, img)
    except FileNotFoundError:
        print(f"'{data_file_name}' is not found")
        return None, None, None
    if self.vis_path is not None:
        self._visualization(data_file_name, img, class_list, objects)
    return img, class_list, objects

__len__

__len__() -> int

Return the number of images

Returns:

Type Description
int

Number of image files loaded

Source code in zerohertzLib/vision/loader.py
def __len__(self) -> int:
    """Return the number of images

    Returns:
        Number of image files loaded
    """
    return len(self.data_paths)

_annotation

_annotation(args: list[int | str | list[str]]) -> dict[str, Any]
Source code in zerohertzLib/vision/loader.py
def _annotation(self, args: list[int | str | list[str]]) -> dict[str, Any]:
    idx, directory, labels = args
    img, class_list, objects = self[idx]
    data_path = self.data_paths[idx]
    data_file_name = data_path.split("/")[-1]
    annotation = {
        "data": {"image": f"data/local-files/?d={directory}/{data_file_name}"}
    }
    result_data = []
    for cls, obj in zip(class_list, objects):
        result_data.append(self._value(img, obj, labels, cls))
    annotation["annotations"] = [{"result": result_data}]
    return annotation

_convert

_convert(txt_path: str, img: NDArray[uint8]) -> tuple[list[int], list[NDArray[DTypeLike]]]
Source code in zerohertzLib/vision/loader.py
def _convert(
    self, txt_path: str, img: NDArray[np.uint8]
) -> tuple[list[int], list[NDArray[DTypeLike]]]:
    class_list = []
    objects = []
    with open(txt_path, "r", encoding="utf-8") as file:
        data_lines = file.readlines()
    for data_line in data_lines:
        data_str = data_line.strip().split(" ")
        class_list.append(int(data_str[0]))
        if self.poly:
            obj = np.array(list(map(float, data_str[1:]))).reshape(-1, 2)
            if not self.absolute:
                obj *= img.shape[:2][::-1]
        else:
            obj = np.array(list(map(float, data_str[1:])))
            if not self.absolute:
                obj *= img.shape[:2][::-1] * 2
        objects.append(obj)
    return class_list, objects
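
The parsing in `_convert` can be followed on a single row. The sketch below is a pure-Python approximation with a hypothetical `parse_yolo_line` helper. It assumes fields are separated by single spaces and that x-like values alternate with y-like values, which holds for both `cx cy w h` rows and `x y x y ...` polygon rows.

```python
def parse_yolo_line(line, image_hw, poly=False, absolute=False):
    """Parse one row into (class index, coordinates in pixels)."""
    fields = line.strip().split(" ")
    cls = int(fields[0])
    values = [float(v) for v in fields[1:]]
    if not absolute:
        height, width = image_hw
        # Even positions are x-like (scale by width), odd ones y-like (height).
        values = [
            v * (width if i % 2 == 0 else height) for i, v in enumerate(values)
        ]
    if poly:
        # Group the flat list into [x, y] pairs.
        return cls, [values[i:i + 2] for i in range(0, len(values), 2)]
    return cls, values


print(parse_yolo_line("1 0.5 0.625 0.5 0.25", (100, 200)))
# (1, [100.0, 62.5, 100.0, 25.0])
```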

_value

_value(img: NDArray[uint8], obj: NDArray[DTypeLike], labels: list[str], cls: int)
Source code in zerohertzLib/vision/loader.py
def _value(
    self,
    img: NDArray[np.uint8],
    obj: NDArray[DTypeLike],
    labels: list[str],
    cls: int,
):
    original_height, original_width = img.shape[:2]
    obj *= 100
    if self.poly:
        obj /= (original_width, original_height)
        return {
            "original_width": original_width,
            "original_height": original_height,
            "image_rotation": 0,
            "value": {
                "points": obj.tolist(),
                "closed": True,
                "polygonlabels": [labels[cls]],
            },
            "from_name": "label",
            "to_name": "image",
            "type": "polygonlabels",
            "origin": "manual",
        }
    obj[:2] -= obj[2:] / 2
    obj /= (original_width, original_height) * 2
    obj = obj.tolist()
    return {
        "original_width": original_width,
        "original_height": original_height,
        "image_rotation": 0,
        "value": {
            "x": obj[0],
            "y": obj[1],
            "width": obj[2],
            "height": obj[3],
            "rectanglelabels": [labels[cls]],
        },
        "from_name": "label",
        "to_name": "image",
        "type": "rectanglelabels",
        "origin": "manual",
    }
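
For rectangles, `_value` converts an absolute `[cx, cy, w, h]` box into the `x`, `y`, `width`, `height` percentages stored in a Label Studio result. A small sketch of that conversion under the same convention follows. `cwh_to_labelstudio` is a hypothetical name, and only the `value` coordinates are reproduced, not the full result dictionary.

```python
def cwh_to_labelstudio(cwh, image_hw):
    """Absolute [cx, cy, w, h] -> x, y, width, height in percent of the image."""
    height, width = image_hw
    cx, cy, w, h = cwh
    return {
        "x": (cx - w / 2) / width * 100,   # left edge
        "y": (cy - h / 2) / height * 100,  # top edge
        "width": w / width * 100,
        "height": h / height * 100,
    }


print(cwh_to_labelstudio([100, 62.5, 100, 25], (100, 200)))
# {'x': 25.0, 'y': 50.0, 'width': 50.0, 'height': 25.0}
```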

_visualization

_visualization(file_name: str, img: NDArray[uint8], class_list: list[int], objects: list[NDArray[DTypeLike]]) -> None
Source code in zerohertzLib/vision/loader.py
def _visualization(
    self,
    file_name: str,
    img: NDArray[np.uint8],
    class_list: list[int],
    objects: list[NDArray[DTypeLike]],
) -> None:
    if self.poly:
        mks = np.zeros((len(objects), *img.shape[:2]), bool)
        for idx, poly in enumerate(objects):
            mks[idx] = poly2mask(poly, img.shape[:2])
        img = mask(img, mks, class_list=class_list, class_color=self.class_color)
    else:
        for cls, box in zip(class_list, objects):
            img = bbox(img, box, self.class_color[cls])
    cv2.imwrite(os.path.join(self.vis_path, file_name), img)

labelstudio

labelstudio(directory: str = 'image', labels: list[str | None] = None, mp_num: int = 0) -> None

Convert YOLO-format data so it can be reviewed and edited in Label Studio

Parameters:

Name Type Description Default
directory str

Name of /home/user/{directory} inside Label Studio

'image'
labels list[str | None]

Label names by index in the YOLO-format .txt files

None
mp_num int

Number of processes used for parallel processing (0: serial processing)

0

Returns:

Type Description
None

The result is saved as {path}.json

Examples:

>>> yolo.labelstudio("images", mp_num=10, labels=["t1", "t2", "t3", "t4"])

Source code in zerohertzLib/vision/loader.py
def labelstudio(
    self,
    directory: str = "image",
    labels: list[str | None] = None,
    mp_num: int = 0,
) -> None:
    """
    Convert YOLO-format data so it can be reviewed and edited in Label Studio

    Args:
        directory: Name of `/home/user/{directory}` inside Label Studio
        labels: Label names by index in the YOLO-format `.txt` files
        mp_num: Number of processes used for parallel processing (`0`: serial processing)

    Returns:
        The result is saved as `{path}.json`

    Examples:
        >>> yolo.labelstudio("images", mp_num=10, labels=["t1", "t2", "t3", "t4"])
    """
    if labels is None:
        labels = [str(i) for i in range(100)]
    json_data = []
    if mp_num == 0:
        for idx in range(len(self)):
            json_data.append(self._annotation([idx, directory, labels]))
    else:
        args = [[idx, directory, labels] for idx in range(len(self))]
        with mp.Pool(processes=mp_num) as pool:
            annotations = pool.map(self._annotation, args)
        for annotation in annotations:
            json_data.append(annotation)
    write_json(json_data, self.data_path)

bbox

bbox(img: NDArray[uint8], box: list[int | float] | NDArray[DTypeLike], color: tuple[int, int, int] = (0, 0, 255), thickness: int = 2) -> NDArray[uint8]

Visualize multiple bboxes

Parameters:

Name Type Description Default
img NDArray[uint8]

Input image ([H, W, C])

required
box list[int | float] | NDArray[DTypeLike]

One or more bboxes ([4], [N, 4], [4, 2], [N, 4, 2])

required
color tuple[int, int, int]

Color of the bbox

(0, 0, 255)
thickness int

Line thickness of the bbox

2

Returns:

Type Description
NDArray[uint8]

Visualization result ([H, W, C])

Examples:

Bbox:

>>> box = np.array([[100, 200], [100, 1000], [1200, 1000], [1200, 200]])
>>> box.shape
(4, 2)
>>> res1 = zz.vision.bbox(img, box, thickness=10)

Bboxes:

>>> boxes = np.array([[250, 200, 100, 100], [600, 600, 800, 200], [900, 300, 300, 400]])
>>> boxes.shape
(3, 4)
>>> res2 = zz.vision.bbox(img, boxes, (0, 255, 0), thickness=10)

Bounding box visualization example

Source code in zerohertzLib/vision/visual.py
def bbox(
    img: NDArray[np.uint8],
    box: list[int | float] | NDArray[DTypeLike],
    color: tuple[int, int, int] = (0, 0, 255),
    thickness: int = 2,
) -> NDArray[np.uint8]:
    """Visualize multiple bboxes

    Args:
        img: Input image (`[H, W, C]`)
        box: One or more bboxes (`[4]`, `[N, 4]`, `[4, 2]`, `[N, 4, 2]`)
        color: Color of the bbox
        thickness: Line thickness of the bbox

    Returns:
        Visualization result (`[H, W, C]`)

    Examples:
        Bbox:
            >>> box = np.array([[100, 200], [100, 1000], [1200, 1000], [1200, 200]])
            >>> box.shape
            (4, 2)
            >>> res1 = zz.vision.bbox(img, box, thickness=10)

        Bboxes:
            >>> boxes = np.array([[250, 200, 100, 100], [600, 600, 800, 200], [900, 300, 300, 400]])
            >>> boxes.shape
            (3, 4)
            >>> res2 = zz.vision.bbox(img, boxes, (0, 255, 0), thickness=10)

        ![Bounding box visualization example](../../../assets/vision/bbox.png){ width="600" }
    """
    box = _list2np(box)
    img = img.copy()
    shape = img.shape
    if len(shape) == 2:
        img = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
    elif shape[2] == 4 and len(color) == 3:
        color = (*color, 255)
    shape = box.shape
    multi, poly = _is_bbox(shape)
    if not poly:
        box = cwh2poly(box)
    if multi:
        for box_ in box:
            img = _bbox(img, box_, color, thickness)
    else:
        img = _bbox(img, box, color, thickness)
    return img
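
`bbox` accepts four input shapes and branches on two flags, `multi` and `poly`. The internal `_is_bbox` helper is not shown on this page, so the following is only a guess at the dispatch implied by the documented shapes. `classify_bbox_shape` is a hypothetical stand-in.

```python
def classify_bbox_shape(shape):
    """Return (multi, poly) for the four documented bbox shapes."""
    if shape == (4,):
        return False, False  # single cwh / xyxy box
    if shape == (4, 2):
        return False, True   # single 4-point polygon
    if len(shape) == 2 and shape[1] == 4:
        return True, False   # N cwh / xyxy boxes
    if len(shape) == 3 and shape[1:] == (4, 2):
        return True, True    # N 4-point polygons
    raise ValueError(f"Unsupported bbox shape: {shape}")


for shape in [(4,), (3, 4), (4, 2), (5, 4, 2)]:
    print(shape, classify_bbox_shape(shape))
# (4,) (False, False)
# (3, 4) (True, False)
# (4, 2) (False, True)
# (5, 4, 2) (True, True)
```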

before_after

before_after(before: NDArray[uint8], after: NDArray[uint8], area: list[int | float] | None = None, per: bool = True, quality: int = 100, file_name: str = 'tmp') -> None

Create an image comparing two images

Parameters:

Name Type Description Default
before NDArray[uint8]

Original image

required
after NDArray[uint8]

Image after image processing or model inference

required
area list[int | float] | None

Coordinates to compare ([x_0, y_0, x_1, y_1])

None
per bool

Whether area is given in percent

True
quality int

Quality of the output image (unit: %)

100
file_name str

Name of the file to save

'tmp'

Returns:

Type Description
None

The image is saved directly in the current directory

Examples:

BGR, GRAY:

>>> after = cv2.GaussianBlur(before, (0, 0), 25)
>>> after = cv2.cvtColor(after, cv2.COLOR_BGR2GRAY)
>>> zz.vision.before_after(before, after, quality=10)

Before after comparison 1

BGR, Resize:

>>> after = cv2.resize(before, (100, 100))
>>> zz.vision.before_after(before, after, [20, 40, 30, 60])
Before after comparison 2

Source code in zerohertzLib/vision/compare.py
def before_after(
    before: NDArray[np.uint8],
    after: NDArray[np.uint8],
    area: list[int | float] | None = None,
    per: bool = True,
    quality: int = 100,
    file_name: str = "tmp",
) -> None:
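    """Create an image comparing two images

    Args:
        before: Original image
        after: Image after image processing or model inference
        area: Coordinates to compare (`[x_0, y_0, x_1, y_1]`)
        per: Whether `area` is given in percent
        quality: Quality of the output image (unit: %)
        file_name: Name of the file to save

    Returns:
        The image is saved directly in the current directory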

    Examples:
        BGR, GRAY:
            ```python
            >>> after = cv2.GaussianBlur(before, (0, 0), 25)
            >>> after = cv2.cvtColor(after, cv2.COLOR_BGR2GRAY)
            >>> zz.vision.before_after(before, after, quality=10)
            ```
        ![Before after comparison 1](../../../assets/vision/before_after.1.png){ width="300" }
        BGR, Resize:
            ```python
            >>> after = cv2.resize(before, (100, 100))
            >>> zz.vision.before_after(before, after, [20, 40, 30, 60])
            ```
        ![Before after comparison 2](../../../assets/vision/before_after.2.png){ width="300" }
    """
    before_shape = before.shape
    if area is None:
        if per:
            area = [0.0, 0.0, 100.0, 100.0]
        else:
            raise ValueError("'area' not provided while 'per' is False")
    if per:
        x_0, y_0, x_1, y_1 = _rel2abs(*area, *before_shape[:2])
    else:
        x_0, y_0, x_1, y_1 = area
    before = _cvt_bgra(before)
    before_shape = before.shape
    after = _cvt_bgra(after)
    after_shape = after.shape
    if not before_shape == after_shape:
        after = cv2.resize(after, before_shape[:2][::-1])
        after_shape = after.shape
    before, after = before[x_0:x_1, y_0:y_1, :], after[x_0:x_1, y_0:y_1, :]
    before_shape = before.shape
    height, width, channel = before_shape
    palette = np.zeros((height, 2 * width, channel), dtype=np.uint8)
    palette[:, :width, :] = before
    palette[:, width:, :] = after
    palette = cv2.resize(palette, (0, 0), fx=quality / 100, fy=quality / 100)
    cv2.imwrite(f"{file_name}.png", palette)

cutout

cutout(img: NDArray[uint8], poly: list[int | float] | NDArray[DTypeLike], alpha: int = 255, crop: bool = True, background: int = 0) -> NDArray[uint8]

Make everything in the image transparent except the specified coordinates

Parameters:

Name Type Description Default
img NDArray[uint8]

Input image ([H, W, C])

required
poly list[int | float] | NDArray[DTypeLike]

Coordinates of the region to keep ([N, 2])

required
alpha int

Alpha value of the specified region

255
crop bool

Whether to crop the output image

True
background int

Alpha value of the background outside the specified coordinates

0

Returns:

Type Description
NDArray[uint8]

Output image ([H, W, 4])

Examples:

>>> poly = np.array([[100, 400], [400, 400], [800, 900], [400, 1100], [100, 800]])
>>> res1 = zz.vision.cutout(img, poly)
>>> res2 = zz.vision.cutout(img, poly, 128, False)
>>> res3 = zz.vision.cutout(img, poly, background=128)

Image cutout example

Source code in zerohertzLib/vision/transform.py
def cutout(
    img: NDArray[np.uint8],
    poly: list[int | float] | NDArray[DTypeLike],
    alpha: int = 255,
    crop: bool = True,
    background: int = 0,
) -> NDArray[np.uint8]:
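    """Make everything in the image transparent except the specified coordinates

    Args:
        img: Input image (`[H, W, C]`)
        poly: Coordinates of the region to keep (`[N, 2]`)
        alpha: Alpha value of the specified region
        crop: Whether to crop the output image
        background: Alpha value of the background outside the specified coordinates

    Returns:
        Output image (`[H, W, 4]`)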

    Examples:
        >>> poly = np.array([[100, 400], [400, 400], [800, 900], [400, 1100], [100, 800]])
        >>> res1 = zz.vision.cutout(img, poly)
        >>> res2 = zz.vision.cutout(img, poly, 128, False)
        >>> res3 = zz.vision.cutout(img, poly, background=128)

        ![Image cutout example](../../../assets/vision/cutout.png){ width="600" }
    """
    shape = img.shape[:2]
    poly = _list2np(poly)
    poly = poly.astype(np.int32)
    x_0, x_1 = poly[:, 0].min(), poly[:, 0].max()
    y_0, y_1 = poly[:, 1].min(), poly[:, 1].max()
    mask = poly2mask(poly, shape)
    if background == 0:
        mask = (mask * alpha).astype(np.uint8)
    else:
        mask = mask.astype(np.uint8)
        mask[mask == 0] = background
        mask[mask == 1] = alpha
    img = Image.fromarray(img)
    mask = Image.fromarray(mask)
    img.putalpha(mask)
    if crop:
        return np.array(img)[y_0:y_1, x_0:x_1, :]
    return np.array(img)
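
The alpha channel built by `cutout` comes down to a per-pixel choice: pixels inside the polygon receive `alpha`, all others receive `background`. A toy sketch on a hand-written boolean mask illustrates this. `alpha_channel` is a hypothetical helper, and the mask stands in for the output of `poly2mask`.

```python
def alpha_channel(mask, alpha=255, background=0):
    """Map a boolean polygon mask to per-pixel alpha values."""
    return [[alpha if inside else background for inside in row] for row in mask]


mask = [
    [False, True, True],
    [False, True, False],
]
print(alpha_channel(mask))
# [[0, 255, 255], [0, 255, 0]]
print(alpha_channel(mask, alpha=128, background=32))
# [[32, 128, 128], [32, 128, 32]]
```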

cwh2poly

cwh2poly(box: list[int | float] | NDArray[DTypeLike]) -> NDArray[DTypeLike]

Bbox conversion

Parameters:

Name Type Description Default
box list[int | float] | NDArray[DTypeLike]

Bbox in the form [cx, cy, w, h] ([4] or [N, 4])

required

Returns:

Type Description
NDArray[DTypeLike]

Bbox in the form [[x0, y0], [x1, y1], [x2, y2], [x3, y3]] ([4, 2] or [N, 4, 2])

Examples:

>>> zz.vision.cwh2poly([20, 30, 20, 20])
array([[10, 20],
       [30, 20],
       [30, 40],
       [10, 40]])
>>> zz.vision.cwh2poly(np.array([[20, 30, 20, 20], [50, 75, 40, 50]]))
array([[[ 10,  20],
        [ 30,  20],
        [ 30,  40],
        [ 10,  40]],
       [[ 30,  50],
        [ 70,  50],
        [ 70, 100],
        [ 30, 100]]])
Source code in zerohertzLib/vision/convert.py
def cwh2poly(
    box: list[int | float] | NDArray[DTypeLike],
) -> NDArray[DTypeLike]:
    """Bbox conversion

    Args:
        box: Bbox in the form `[cx, cy, w, h]` (`[4]` or `[N, 4]`)

    Returns:
        Bbox in the form `[[x0, y0], [x1, y1], [x2, y2], [x3, y3]]` (`[4, 2]` or `[N, 4, 2]`)

    Examples:
        >>> zz.vision.cwh2poly([20, 30, 20, 20])
        array([[10, 20],
               [30, 20],
               [30, 40],
               [10, 40]])
        >>> zz.vision.cwh2poly(np.array([[20, 30, 20, 20], [50, 75, 40, 50]]))
        array([[[ 10,  20],
                [ 30,  20],
                [ 30,  40],
                [ 10,  40]],
               [[ 30,  50],
                [ 70,  50],
                [ 70, 100],
                [ 30, 100]]])
    """
    box = _list2np(box)
    shape = box.shape
    multi, poly = _is_bbox(shape)
    if poly:
        raise ValueError("The 'cwh' must be of shape [4], [N, 4]")
    if multi:
        boxes = np.zeros((shape[0], 4, 2), dtype=box.dtype)
        for i, box_ in enumerate(box):
            boxes[i] = _cwh2poly(box_)
        return boxes
    return _cwh2poly(box)
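
The single-box helper `_cwh2poly` is not listed here, but the documented outputs pin down the arithmetic: the corners are `(cx ± w/2, cy ± h/2)`, ordered top-left, top-right, bottom-right, bottom-left. The pure-Python sketch below reproduces the values from the examples. `cwh_to_poly` is a hypothetical name, and it returns floats rather than preserving the input dtype.

```python
def cwh_to_poly(box):
    """[cx, cy, w, h] -> [[x0, y0], [x1, y0], [x1, y1], [x0, y1]]."""
    cx, cy, w, h = box
    x0, y0 = cx - w / 2, cy - h / 2  # top-left corner
    x1, y1 = cx + w / 2, cy + h / 2  # bottom-right corner
    return [[x0, y0], [x1, y0], [x1, y1], [x0, y1]]


print(cwh_to_poly([20, 30, 20, 20]))
# [[10.0, 20.0], [30.0, 20.0], [30.0, 40.0], [10.0, 40.0]]
print(cwh_to_poly([50, 75, 40, 50]))
# [[30.0, 50.0], [70.0, 50.0], [70.0, 100.0], [30.0, 100.0]]
```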

cwh2xyxy

cwh2xyxy(box: list[int | float] | NDArray[DTypeLike]) -> NDArray[DTypeLike]

Bbox conversion

Parameters:

Name Type Description Default
box list[int | float] | NDArray[DTypeLike]

Bbox in the form [cx, cy, w, h] ([4] or [N, 4])

required

Returns:

Type Description
NDArray[DTypeLike]

Bbox in the form [x0, y0, x1, y1] ([4] or [N, 4])

Examples:

>>> zz.vision.cwh2xyxy([20, 30, 20, 20])
array([10, 20, 30, 40])
>>> zz.vision.cwh2xyxy(np.array([[20, 30, 20, 20], [50, 75, 40, 50]]))
array([[ 10,  20,  30,  40],
       [ 30,  50,  70, 100]])
Source code in zerohertzLib/vision/convert.py
def cwh2xyxy(
    box: list[int | float] | NDArray[DTypeLike],
) -> NDArray[DTypeLike]:
    """Bbox conversion

    Args:
        box: Bbox in the form `[cx, cy, w, h]` (`[4]` or `[N, 4]`)

    Returns:
        Bbox in the form `[x0, y0, x1, y1]` (`[4]` or `[N, 4]`)

    Examples:
        >>> zz.vision.cwh2xyxy([20, 30, 20, 20])
        array([10, 20, 30, 40])
        >>> zz.vision.cwh2xyxy(np.array([[20, 30, 20, 20], [50, 75, 40, 50]]))
        array([[ 10,  20,  30,  40],
               [ 30,  50,  70, 100]])
    """
    box = _list2np(box)
    shape = box.shape
    multi, poly = _is_bbox(shape)
    if poly:
        raise ValueError("The 'cwh' must be of shape [4], [N, 4]")
    if multi:
        boxes = np.zeros((shape[0], 4), dtype=box.dtype)
        for i, box_ in enumerate(box):
            boxes[i] = _cwh2xyxy(box_)
        return boxes
    return _cwh2xyxy(box)
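
The `cwh` and `xyxy` formats carry the same information, so the conversion is invertible. The round trip below is a sketch with hypothetical pure-Python helpers, not the library's `_cwh2xyxy` or `xyxy2cwh`. It uses the second box from the example above.

```python
def cwh_to_xyxy(box):
    """[cx, cy, w, h] -> [x0, y0, x1, y1]."""
    cx, cy, w, h = box
    return [cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2]


def xyxy_to_cwh(box):
    """[x0, y0, x1, y1] -> [cx, cy, w, h]."""
    x0, y0, x1, y1 = box
    return [(x0 + x1) / 2, (y0 + y1) / 2, x1 - x0, y1 - y0]


xyxy = cwh_to_xyxy([50, 75, 40, 50])
print(xyxy)               # [30.0, 50.0, 70.0, 100.0]
print(xyxy_to_cwh(xyxy))  # [50.0, 75.0, 40.0, 50.0]
```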

evaluation

evaluation(ground_truths: NDArray[DTypeLike], inferences: NDArray[DTypeLike], confidences: list[float], gt_classes: list[str] | None = None, inf_classes: list[str] | None = None, file_name: str | None = None, threshold: float = 0.5) -> DataFrame

Evaluate the inference performance of a detection model on a single image

Parameters:

Name Type Description Default
ground_truths NDArray[DTypeLike]

Polygons of the ground truth objects ([N, 4, 2], [[[x_0, y_0], [x_1, y_1], ...], ...])

required
inferences NDArray[DTypeLike]

Polygons of the objects inferred by the model ([M, 4, 2], [[[x_0, y_0], [x_1, y_1], ...], ...])

required
confidences list[float]

Confidence of each object inferred by the model ([M])

required
gt_classes list[str] | None

Classes of the ground truth objects ([N])

None
inf_classes list[str] | None

Class of each object inferred by the model ([M])

None
file_name str | None

Name of the evaluated image

None
threshold float

IoU threshold

0.5

Note

  • N: Number of objects in the ground truth of one image
  • M: Number of objects in the inference result of one image

Model evaluation visualization

Returns:

Type Description
DataFrame

Evaluation result of the model's performance on a single image

Examples:

>>> poly = np.array([[0, 0], [10, 0], [10, 10], [0, 10]])
>>> ground_truths = np.array([poly, poly + 20, poly + 40])
>>> inferences = np.array([poly, poly + 19, poly + 80])
>>> confidences = np.array([0.6, 0.7, 0.8])
>>> zz.vision.evaluation(ground_truths, inferences, confidences, file_name="test.png")
  file_name  instance  confidence  class       IoU results  gt_x0  gt_y0  gt_x1  gt_y1  gt_x2  gt_y2  gt_x3  gt_y3  inf_x0  inf_y0  inf_x1  inf_y1  inf_x2  inf_y2  inf_x3  inf_y3
0  test.png         0         0.8    0.0  0.000000      FP    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    80.0    80.0    90.0    80.0    90.0    90.0    80.0    90.0
1  test.png         1         0.7    0.0  0.680672      TP   20.0   20.0   30.0   20.0   30.0   30.0   20.0   30.0    19.0    19.0    29.0    19.0    29.0    29.0    19.0    29.0
2  test.png         2         0.6    0.0  1.000000      TP    0.0    0.0   10.0    0.0   10.0   10.0    0.0   10.0     0.0     0.0    10.0     0.0    10.0    10.0     0.0    10.0
3  test.png         3         0.0    0.0  0.000000      FN   40.0   40.0   50.0   40.0   50.0   50.0   40.0   50.0     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN
>>> gt_classes = np.array(["cat", "dog", "cat"])
>>> inf_classes = np.array(["cat", "dog", "cat"])
>>> zz.vision.evaluation(ground_truths, inferences, confidences, gt_classes, inf_classes)
   instance  confidence class       IoU results  gt_x0  gt_y0  gt_x1  gt_y1  gt_x2  gt_y2  gt_x3  gt_y3  inf_x0  inf_y0  inf_x1  inf_y1  inf_x2  inf_y2  inf_x3  inf_y3
0         0         0.8   cat  0.000000      FP    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    80.0    80.0    90.0    80.0    90.0    90.0    80.0    90.0
1         1         0.6   cat  1.000000      TP    0.0    0.0   10.0    0.0   10.0   10.0    0.0   10.0     0.0     0.0    10.0     0.0    10.0    10.0     0.0    10.0
2         2         0.0   cat  0.000000      FN   40.0   40.0   50.0   40.0   50.0   50.0   40.0   50.0     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN
3         3         0.7   dog  0.680672      TP   20.0   20.0   30.0   20.0   30.0   30.0   20.0   30.0    19.0    19.0    29.0    19.0    29.0    29.0    19.0    29.0
Source code in zerohertzLib/vision/eval.py
def evaluation(
    ground_truths: NDArray[DTypeLike],
    inferences: NDArray[DTypeLike],
    confidences: list[float],
    gt_classes: list[str] | None = None,
    inf_classes: list[str] | None = None,
    file_name: str | None = None,
    threshold: float = 0.5,
) -> pd.DataFrame:
    """Evaluate the inference performance of a detection model on a single image

    Args:
        ground_truths: Polygons of the ground truth objects (`[N, 4, 2]`, `[[[x_0, y_0], [x_1, y_1], ...], ...]`)
        inferences: Polygons of the objects predicted by the model (`[M, 4, 2]`, `[[[x_0, y_0], [x_1, y_1], ...], ...]`)
        confidences: Confidence of each object predicted by the model (`[M]`)
        gt_classes: Classes of the ground truth objects (`[N]`)
        inf_classes: Classes of the objects predicted by the model (`[M]`)
        file_name: Name of the evaluated image
        threshold: IoU threshold at or above which an inference counts as a TP

    Note:
        - `N`: Number of objects in the ground truth of one image
        - `M`: Number of objects in the inference result of one image

        ![Model evaluation visualization](../../../assets/vision/evaluation.png){ width="600" }

    Returns:
        Evaluation result of the model on a single image

    Examples:
        >>> poly = np.array([[0, 0], [10, 0], [10, 10], [0, 10]])
        >>> ground_truths = np.array([poly, poly + 20, poly + 40])
        >>> inferences = np.array([poly, poly + 19, poly + 80])
        >>> confidences = np.array([0.6, 0.7, 0.8])
        >>> zz.vision.evaluation(ground_truths, inferences, confidences, file_name="test.png")
          file_name  instance  confidence  class       IoU results  gt_x0  gt_y0  gt_x1  gt_y1  gt_x2  gt_y2  gt_x3  gt_y3  inf_x0  inf_y0  inf_x1  inf_y1  inf_x2  inf_y2  inf_x3  inf_y3
        0  test.png         0         0.8    0.0  0.000000      FP    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    80.0    80.0    90.0    80.0    90.0    90.0    80.0    90.0
        1  test.png         1         0.7    0.0  0.680672      TP   20.0   20.0   30.0   20.0   30.0   30.0   20.0   30.0    19.0    19.0    29.0    19.0    29.0    29.0    19.0    29.0
        2  test.png         2         0.6    0.0  1.000000      TP    0.0    0.0   10.0    0.0   10.0   10.0    0.0   10.0     0.0     0.0    10.0     0.0    10.0    10.0     0.0    10.0
        3  test.png         3         0.0    0.0  0.000000      FN   40.0   40.0   50.0   40.0   50.0   50.0   40.0   50.0     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN
        >>> gt_classes = np.array(["cat", "dog", "cat"])
        >>> inf_classes = np.array(["cat", "dog", "cat"])
        >>> zz.vision.evaluation(ground_truths, inferences, confidences, gt_classes, inf_classes)
           instance  confidence class       IoU results  gt_x0  gt_y0  gt_x1  gt_y1  gt_x2  gt_y2  gt_x3  gt_y3  inf_x0  inf_y0  inf_x1  inf_y1  inf_x2  inf_y2  inf_x3  inf_y3
        0         0         0.8   cat  0.000000      FP    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    80.0    80.0    90.0    80.0    90.0    90.0    80.0    90.0
        1         1         0.6   cat  1.000000      TP    0.0    0.0   10.0    0.0   10.0   10.0    0.0   10.0     0.0     0.0    10.0     0.0    10.0    10.0     0.0    10.0
        2         2         0.0   cat  0.000000      FN   40.0   40.0   50.0   40.0   50.0   50.0   40.0   50.0     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN
        3         3         0.7   dog  0.680672      TP   20.0   20.0   30.0   20.0   30.0   30.0   20.0   30.0    19.0    19.0    29.0    19.0    29.0    29.0    19.0    29.0
    """
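    logs = defaultdict(list)
    # The class masks and fancy indexing below need ndarrays, so accept plain lists too
    confidences = np.asarray(confidences)
    if gt_classes is None and inf_classes is None:
        gt_classes = np.zeros(len(ground_truths))
        inf_classes = np.zeros(len(inferences))
    gt_classes = np.asarray(gt_classes)
    inf_classes = np.asarray(inf_classes)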
    instance = 0
    for cls in set(gt_classes).union(set(inf_classes)):
        cls_gt = ground_truths[np.where(gt_classes == cls)]
        cls_inf = inferences[np.where(inf_classes == cls)]
        cls_conf = confidences[np.where(inf_classes == cls)]
        sorted_indices = np.argsort(-cls_conf)
        cls_inf = cls_inf[sorted_indices]
        cls_conf = cls_conf[sorted_indices]
        matched = set()
        for confidence, inf in zip(cls_conf, cls_inf):
            best_iou = 0
            best_gt_idx = -1
            for gt_idx, gt in enumerate(cls_gt):
                if gt_idx in matched:
                    continue
                iou_ = iou(gt, inf)
                if iou_ > best_iou:
                    best_iou = iou_
                    best_gt_idx = gt_idx
            if best_iou >= threshold:
                matched.add(best_gt_idx)
                _append(
                    logs,
                    instance,
                    confidence,
                    cls,
                    best_iou,
                    "TP",
                    cls_gt[best_gt_idx],
                    inf,
                )
                instance += 1
            else:
                _append(logs, instance, confidence, cls, 0.0, "FP", None, inf)
                instance += 1
        for gt_idx, gt in enumerate(cls_gt):
            if gt_idx not in matched:
                _append(logs, instance, 0.0, cls, 0.0, "FN", gt, None)
                instance += 1
    logs = pd.DataFrame(logs)
    if file_name is not None:
        logs["file_name"] = file_name
        logs = logs[["file_name"] + [col for col in logs.columns if col != "file_name"]]
    return logs
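
The matching loop above is greedy: inferences are visited in descending confidence, each one claims the unmatched ground truth with the highest IoU, and the pair counts as a TP only if that IoU reaches `threshold`. A minimal pure-Python sketch of that logic for a single class follows. It assumes the pairwise IoUs are already available as a nested list `ious[inf_idx][gt_idx]`; the helper `greedy_match`, its `num_gt` argument, and that input layout are illustrative and not part of zerohertzLib.

```python
def greedy_match(ious, confidences, num_gt, threshold=0.5):
    """Label each inference TP/FP and each unmatched ground truth FN.

    ious[i][j] is the IoU between inference i and ground truth j
    (hypothetical precomputed input; the library computes it with iou()).
    """
    # Visit inferences from most to least confident
    order = sorted(range(len(confidences)), key=lambda i: -confidences[i])
    matched = set()
    results = []
    for i in order:
        best_iou, best_gt = 0.0, -1
        for j in range(num_gt):
            if j in matched:
                continue  # each ground truth can be claimed only once
            if ious[i][j] > best_iou:
                best_iou, best_gt = ious[i][j], j
        # The best_gt guard (an addition of this sketch) avoids matching "no candidate"
        if best_gt >= 0 and best_iou >= threshold:
            matched.add(best_gt)
            results.append(("TP", confidences[i], best_iou))
        else:
            results.append(("FP", confidences[i], 0.0))
    # Ground truths that no inference claimed are misses
    for j in range(num_gt):
        if j not in matched:
            results.append(("FN", 0.0, 0.0))
    return results


# IoUs of the first doctest: inference 1 overlaps ground truth 1 by 81 / 119
ious = [[1.0, 0.0, 0.0], [0.0, 81 / 119, 0.0], [0.0, 0.0, 0.0]]
labels = greedy_match(ious, [0.6, 0.7, 0.8], num_gt=3)
print([label for label, _, _ in labels])  # ['FP', 'TP', 'TP', 'FN']
```

The printed order matches the `results` column of the first doctest: the most confident inference has no overlapping ground truth, the next two are matched, and the third ground truth is left over.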

grid

grid(imgs: list[NDArray[uint8]], size: int = 1000, color: tuple[int, int, int] = (255, 255, 255), file_name: str = 'tmp') -> None

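Merge multiple images into a single square image

Parameters:

Name Type Description Default
imgs list[NDArray[uint8]]

Input images

required
size int

Side length of the square output image in pixels (rounded down to a multiple of the number of cells per row)

1000
color tuple[int, int, int]

Padding color

(255, 255, 255)
file_name str

Name of the saved file (the .png extension is appended)

'tmp'

Returns:

Type Description
None

The image is saved directly to the current directory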

Examples:

>>> imgs = [cv2.resize(img, (random.randrange(300, 1000), random.randrange(300, 1000))) for _ in range(8)]
>>> imgs[2] = cv2.cvtColor(imgs[2], cv2.COLOR_BGR2GRAY)
>>> imgs[3] = cv2.cvtColor(imgs[3], cv2.COLOR_BGR2BGRA)
>>> zz.vision.grid(imgs)
>>> zz.vision.grid(imgs, color=(0, 255, 0))
>>> zz.vision.grid(imgs, color=(0, 0, 0, 0))

Image grid example

Source code in zerohertzLib/vision/compare.py
def grid(
    imgs: list[NDArray[np.uint8]],
    size: int = 1000,
    color: tuple[int, int, int] = (255, 255, 255),
    file_name: str = "tmp",
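) -> None:
    """Merge multiple images into a single square image

    Args:
        imgs: Input images
        size: Side length of the square output image in pixels (rounded down to a multiple of the number of cells per row)
        color: Padding color
        file_name: Name of the saved file (the `.png` extension is appended)

    Returns:
        The image is saved directly to the current directory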

    Examples:
        >>> imgs = [cv2.resize(img, (random.randrange(300, 1000), random.randrange(300, 1000))) for _ in range(8)]
        >>> imgs[2] = cv2.cvtColor(imgs[2], cv2.COLOR_BGR2GRAY)
        >>> imgs[3] = cv2.cvtColor(imgs[3], cv2.COLOR_BGR2BGRA)
        >>> zz.vision.grid(imgs)
        >>> zz.vision.grid(imgs, color=(0, 255, 0))
        >>> zz.vision.grid(imgs, color=(0, 0, 0, 0))

        ![Image grid example](../../../assets/vision/grid.png){ width="600" }
    """
    cnt = math.ceil(math.sqrt(len(imgs)))
    length = size // cnt
    size = int(length * cnt)
    palette = np.full((size, size, 4), 0, dtype=np.uint8)
    for idx, img in enumerate(imgs):
        d_y, d_x = divmod(idx, cnt)
        x_0, y_0, x_1, y_1 = (
            d_x * length,
            d_y * length,
            (d_x + 1) * length,
            (d_y + 1) * length,
        )
        img = _cvt_bgra(img)
        palette[y_0:y_1, x_0:x_1, :], _ = pad(img, (length, length), color)
    cv2.imwrite(f"{file_name}.png", palette)
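
The layout arithmetic in `grid` can be followed without OpenCV: the canvas is split into `ceil(sqrt(n))` cells per side and images are placed in row-major order. The sketch below reproduces only that coordinate computation; the helper name `grid_cells` is illustrative and not part of zerohertzLib, and it assumes at least one image.

```python
import math


def grid_cells(num_images, size=1000):
    """Return the canvas side length and one (x0, y0, x1, y1) box per image."""
    cnt = math.ceil(math.sqrt(num_images))  # cells per row and per column
    length = size // cnt  # side length of one square cell
    cells = []
    for idx in range(num_images):
        d_y, d_x = divmod(idx, cnt)  # row-major placement
        cells.append((d_x * length, d_y * length, (d_x + 1) * length, (d_y + 1) * length))
    return length * cnt, cells


canvas, cells = grid_cells(8)
print(canvas)    # 999
print(cells[4])  # (333, 333, 666, 666)
```

With eight images the grid is 3 x 3, so one cell stays empty and the canvas shrinks from 1000 to 999 pixels because each cell is 333 pixels wide.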

img2gif

img2gif(path: str, file_name: str = 'tmp', duration: int = 500) -> None

Convert the images in a directory to a GIF

Parameters:

Name Type Description Default
path str

Directory containing the images to convert to a GIF

required
file_name str

Name of the output GIF file

'tmp'
duration int

Interval between frames in ms

500

Returns:

Type Description
None

The GIF is saved directly to the current directory

Examples:

>>> zz.vision.img2gif("./")

Images to GIF conversion example

Source code in zerohertzLib/vision/gif.py
def img2gif(
    path: str,
    file_name: str = "tmp",
    duration: int = 500,
) -> None:
    """Convert the images in a directory to a GIF

    Args:
        path: Directory containing the images to convert to a GIF
        file_name: Name of the output GIF file
        duration: Interval between frames in ms

    Returns:
        The GIF is saved directly to the current directory

    Examples:
        >>> zz.vision.img2gif("./")

        ![Images to GIF conversion example](../../../assets/vision/img2gif.gif){ width="200" }
    """
    ext = (
        "jpg",
        "JPG",
        "jpeg",
        "JPEG",
        "png",
        "PNG",
        "tif",
        "TIF",
        "tiff",
        "TIFF",
    )
    image_files = [f for f in os.listdir(path) if f.endswith(ext)]
    image_files.sort()
    images = [Image.open(os.path.join(path, image_file)) for image_file in image_files]
    _create_gif_from_frames(images, file_name, duration)
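
Frame order in the GIF follows `image_files.sort()`, which compares file names as strings, so `frame10.png` sorts before `frame2.png`. The sketch below shows that behavior with the same extension filter, next to a hypothetical `natural_key` helper (not part of zerohertzLib) that orders numbered frames numerically. Zero-padding the frame numbers in the file names is a simpler way to get the intended order from `img2gif` itself.

```python
import re

EXT = ("jpg", "JPG", "jpeg", "JPEG", "png", "PNG", "tif", "TIF", "tiff", "TIFF")


def natural_key(name):
    """Split a name into text and integer chunks so that 2 sorts before 10."""
    return [int(chunk) if chunk.isdecimal() else chunk for chunk in re.split(r"(\d+)", name)]


names = ["frame10.png", "frame2.png", "notes.txt", "frame1.PNG"]
images = [name for name in names if name.endswith(EXT)]  # same filter as img2gif

lexicographic = sorted(images)
natural = sorted(images, key=natural_key)
print(lexicographic)  # ['frame1.PNG', 'frame10.png', 'frame2.png']
print(natural)        # ['frame1.PNG', 'frame2.png', 'frame10.png']
```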

iou

iou(poly1: NDArray[DTypeLike], poly2: NDArray[DTypeLike]) -> float

Function that computes the IoU (Intersection over Union) of two polygons

Parameters:

Name Type Description Default
poly1 NDArray[DTypeLike]

First polygon ([S1, 2], [[x_0, y_0], [x_1, y_1], ...])

required
poly2 NDArray[DTypeLike]

Second polygon ([S2, 2], [[x_0, y_0], [x_1, y_1], ...])

required

Returns:

Type Description
float

IoU value

Examples:

>>> poly1 = np.array([[0, 0], [10, 0], [10, 10], [0, 10]])
>>> poly2 = poly1 + (5, 0)
>>> poly2
array([[ 5,  0],
       [15,  0],
       [15, 10],
       [ 5, 10]])
>>> zz.vision.iou(poly1, poly2)
0.3333333333333333
Source code in zerohertzLib/vision/eval.py
def iou(poly1: NDArray[DTypeLike], poly2: NDArray[DTypeLike]) -> float:
    """Function that computes the IoU (Intersection over Union) of two polygons

    Args:
        poly1: First polygon (`[S1, 2]`, `[[x_0, y_0], [x_1, y_1], ...]`)
        poly2: Second polygon (`[S2, 2]`, `[[x_0, y_0], [x_1, y_1], ...]`)

    Returns:
        IoU value

    Examples:
        >>> poly1 = np.array([[0, 0], [10, 0], [10, 10], [0, 10]])
        >>> poly2 = poly1 + (5, 0)
        >>> poly2
        array([[ 5,  0],
               [15,  0],
               [15, 10],
               [ 5, 10]])
        >>> zz.vision.iou(poly1, poly2)
        0.3333333333333333
    """
    polygon1 = Polygon(poly1)
    polygon2 = Polygon(poly2)
    return polygon1.intersection(polygon2).area / polygon1.union(polygon2).area
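
For the doctest above, both squares have area 100 and overlap in a 5 x 10 strip, so IoU = 50 / (100 + 100 - 50) = 1/3. The sketch below checks that arithmetic with a hypothetical `rect_iou` helper restricted to axis-aligned `xyxy` boxes. It is a simplification for illustration: the library delegates to `shapely` polygons and therefore handles arbitrary shapes. The zero-union guard is an addition of this sketch.

```python
def rect_iou(box1, box2):
    """IoU of two axis-aligned boxes given as (x0, y0, x1, y1)."""
    inter_w = max(0, min(box1[2], box2[2]) - max(box1[0], box2[0]))
    inter_h = max(0, min(box1[3], box2[3]) - max(box1[1], box2[1]))
    intersection = inter_w * inter_h
    area1 = (box1[2] - box1[0]) * (box1[3] - box1[1])
    area2 = (box2[2] - box2[0]) * (box2[3] - box2[1])
    union = area1 + area2 - intersection
    # Degenerate boxes have no area; report 0 instead of dividing by zero
    return intersection / union if union > 0 else 0.0


print(rect_iou((0, 0, 10, 10), (5, 0, 15, 10)))  # 0.3333333333333333
```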

is_pts_in_poly

is_pts_in_poly(poly: NDArray[DTypeLike], pts: list[int | float] | NDArray[DTypeLike]) -> bool | NDArray[bool]

Function that checks whether points lie inside a polygon

Parameters:

Name Type Description Default
poly NDArray[DTypeLike]

Polygon ([N, 2])

required
pts list[int | float] | NDArray[DTypeLike]

Point(s) to test ([2] or [N, 2])

required

Returns:

Type Description
bool | NDArray[bool]

Whether the input pts lie inside the polygon poly

Examples:

>>> poly = np.array([[10, 10], [20, 10], [30, 40], [20, 60], [10, 20]])
>>> zz.vision.is_pts_in_poly(poly, [20, 20])
True
>>> zz.vision.is_pts_in_poly(poly, [[20, 20], [100, 100]])
array([ True, False])
>>> zz.vision.is_pts_in_poly(poly, np.array([20, 20]))
True
>>> zz.vision.is_pts_in_poly(poly, np.array([[20, 20], [100, 100]]))
array([ True, False])
Source code in zerohertzLib/vision/util.py
def is_pts_in_poly(
    poly: NDArray[DTypeLike], pts: list[int | float] | NDArray[DTypeLike]
) -> bool | NDArray[bool]:
    """Function that checks whether points lie inside a polygon

    Args:
        poly: Polygon (`[N, 2]`)
        pts: Point(s) to test (`[2]` or `[N, 2]`)

    Returns:
        Whether the input `pts` lie inside the polygon `poly`

    Examples:
        >>> poly = np.array([[10, 10], [20, 10], [30, 40], [20, 60], [10, 20]])
        >>> zz.vision.is_pts_in_poly(poly, [20, 20])
        True
        >>> zz.vision.is_pts_in_poly(poly, [[20, 20], [100, 100]])
        array([ True, False])
        >>> zz.vision.is_pts_in_poly(poly, np.array([20, 20]))
        True
        >>> zz.vision.is_pts_in_poly(poly, np.array([[20, 20], [100, 100]]))
        array([ True, False])
    """
    poly = Path(poly)
    if isinstance(pts, list):
        if isinstance(pts[0], list):
            return poly.contains_points(pts)
        return poly.contains_point(pts)
    if isinstance(pts, np.ndarray):
        shape = pts.shape
        if len(shape) == 1:
            return poly.contains_point(pts)
        if len(shape) == 2:
            return poly.contains_points(pts)
        raise ValueError("The 'pts' must be of shape [2], [N, 2]")
    raise TypeError("The 'pts' must be 'list' or 'np.ndarray'")
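
The function relies on the `contains_point` and `contains_points` methods of the `Path` object built from `poly`. To make the underlying idea concrete, here is a dependency-free sketch of the classic even-odd ray-casting test. This is a substitute technique shown for illustration, not the library's implementation; the helper name `point_in_polygon` is hypothetical, and points exactly on the boundary may be classified differently than by `Path`.

```python
def point_in_polygon(poly, point):
    """Even-odd ray casting: count edge crossings of a ray going right from point."""
    p_x, p_y = point
    inside = False
    for idx in range(len(poly)):
        x_0, y_0 = poly[idx]
        x_1, y_1 = poly[(idx + 1) % len(poly)]  # wrap around to close the polygon
        if (y_0 > p_y) != (y_1 > p_y):  # edge straddles the horizontal line through point
            x_cross = x_0 + (p_y - y_0) * (x_1 - x_0) / (y_1 - y_0)
            if p_x < x_cross:
                inside = not inside  # each crossing to the right flips the state
    return inside


poly = [[10, 10], [20, 10], [30, 40], [20, 60], [10, 20]]
flags = [point_in_polygon(poly, pt) for pt in ([20, 20], [100, 100])]
print(flags)  # [True, False]
```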

mask

mask(img: NDArray[uint8], mks: NDArray[bool] | None = None, poly: list[int | float] | NDArray[DTypeLike] | list[NDArray[DTypeLike]] | None = None, color: tuple[int, int, int] = (0, 0, 255), class_list: list[int | str] | None = None, class_color: dict[int | str, tuple[int, int, int]] | None = None, border: bool = True, alpha: float = 0.5) -> NDArray[uint8]

Visualize masks

Parameters:

Name Type Description Default
img NDArray[uint8]

Input image ([H, W, C])

required
mks NDArray[bool] | None

Mask(s) to overlay on the input image ([H, W] or [N, H, W])

None
poly list[int | float] | NDArray[DTypeLike] | list[NDArray[DTypeLike]] | None

Polygon(s) to overlay on the input image as masks ([M, 2] or [N, M, 2])

None
color tuple[int, int, int]

Mask color

(0, 0, 255)
class_list list[int | str] | None

Class of each mask in mks, by index

None
class_color dict[int | str, tuple[int, int, int]] | None

Color per class (color is ignored)

None
border bool

Whether to draw the mask outline

True
alpha float

Blending weight of the mask overlay (0 leaves the image unchanged, 1 shows the overlay fully)

0.5

Returns:

Type Description
NDArray[uint8]

Visualization result ([H, W, C])

Examples:

Mask:

>>> H, W, _ = img.shape
>>> cnt = 30
>>> mks = np.zeros((cnt, H, W), np.uint8)
>>> for mks_ in mks:
...     center_x = random.randint(0, W)
...     center_y = random.randint(0, H)
...     radius = random.randint(30, 200)
...     cv2.circle(mks_, (center_x, center_y), radius, (True), -1)
>>> mks = mks.astype(bool)
>>> res1 = zz.vision.mask(img, mks)
Mask:
>>> cls = [i for i in range(cnt)]
>>> class_list = [cls[random.randint(0, 5)] for _ in range(cnt)]
>>> class_color = {}
>>> for c in cls:
...     class_color[c] = [random.randint(0, 255) for _ in range(3)]
>>> res2 = zz.vision.mask(img, mks, class_list=class_list, class_color=class_color)
Poly:
>>> poly = np.array([[100, 400], [400, 400], [800, 900], [400, 1100], [100, 800]])
>>> res3 = zz.vision.mask(img, poly=poly)
Poly:
>>> poly = zz.vision.xyxy2poly(zz.vision.poly2xyxy((np.random.rand(cnt, 4, 2) * (W, H))))
>>> res4 = zz.vision.mask(img, poly=poly, class_list=class_list, class_color=class_color)

Mask visualization example

Source code in zerohertzLib/vision/visual.py
def mask(
    img: NDArray[np.uint8],
    mks: NDArray[bool] | None = None,
    poly: (
        list[int | float] | NDArray[DTypeLike] | list[NDArray[DTypeLike]] | None
    ) = None,
    color: tuple[int, int, int] = (0, 0, 255),
    class_list: list[int | str] | None = None,
    class_color: dict[int | str, tuple[int, int, int]] | None = None,
    border: bool = True,
    alpha: float = 0.5,
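) -> NDArray[np.uint8]:
    """Visualize masks

    Args:
        img: Input image (`[H, W, C]`)
        mks: Mask(s) to overlay on the input image (`[H, W]` or `[N, H, W]`)
        poly: Polygon(s) to overlay on the input image as masks (`[M, 2]` or `[N, M, 2]`)
        color: Mask color
        class_list: Class of each mask in `mks`, by index
        class_color: Color per class (`color` is ignored)
        border: Whether to draw the mask outline
        alpha: Blending weight of the mask overlay (0 leaves the image unchanged, 1 shows the overlay fully)

    Returns:
        Visualization result (`[H, W, C]`)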

    Examples:
        Mask:
            ```python
            >>> H, W, _ = img.shape
            >>> cnt = 30
            >>> mks = np.zeros((cnt, H, W), np.uint8)
            >>> for mks_ in mks:
            ...     center_x = random.randint(0, W)
            ...     center_y = random.randint(0, H)
            ...     radius = random.randint(30, 200)
            ...     cv2.circle(mks_, (center_x, center_y), radius, (True), -1)
            >>> mks = mks.astype(bool)
            >>> res1 = zz.vision.mask(img, mks)
            ```
        Mask:
            ```python
            >>> cls = [i for i in range(cnt)]
            >>> class_list = [cls[random.randint(0, 5)] for _ in range(cnt)]
            >>> class_color = {}
            >>> for c in cls:
            ...     class_color[c] = [random.randint(0, 255) for _ in range(3)]
            >>> res2 = zz.vision.mask(img, mks, class_list=class_list, class_color=class_color)
            ```
        Poly:
            ```python
            >>> poly = np.array([[100, 400], [400, 400], [800, 900], [400, 1100], [100, 800]])
            >>> res3 = zz.vision.mask(img, poly=poly)
            ```
        Poly:
            ```python
            >>> poly = zz.vision.xyxy2poly(zz.vision.poly2xyxy((np.random.rand(cnt, 4, 2) * (W, H))))
            >>> res4 = zz.vision.mask(img, poly=poly, class_list=class_list, class_color=class_color)
            ```

        ![Mask visualization example](../../../assets/vision/mask.png){ width="600" }
    """
    assert (mks is None) ^ (poly is None)
    shape = img.shape
    if len(shape) == 2:
        img = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
    elif shape[2] == 4 and len(color) == 3:
        color = (*color, 255)
        if class_list is not None and class_color is not None:
            for key, value in class_color.items():
                if len(value) == 3:
                    class_color[key] = [*value, 255]
    if poly is not None:
        mks = poly2mask(poly, (shape[:2]))
    shape = mks.shape
    overlay = img.copy()
    cumulative_mask = np.zeros(img.shape[:2], dtype=bool)
    if len(shape) == 2:
        overlay[mks] = color
        if border:
            edges = cv2.Canny(mks.astype(np.uint8) * 255, 100, 200)
            overlay[edges > 0] = color
    elif len(shape) == 3:
        for idx, mks_ in enumerate(mks):
            if class_list is not None and class_color is not None:
                color = class_color[class_list[idx]]
            overlapping = cumulative_mask & mks_
            non_overlapping = mks_ & ~cumulative_mask
            cumulative_mask |= mks_
            if overlapping.any():
                overlapping_color = overlay[overlapping].astype(np.float32)
                mixed_color = ((overlapping_color + color) / 2).astype(np.uint8)
                overlay[overlapping] = mixed_color
            if non_overlapping.any():
                overlay[non_overlapping] = color
            if border:
                edges = cv2.Canny(mks_.astype(np.uint8) * 255, 100, 200)
                overlay[edges > 0] = color
    else:
        raise ValueError("The 'mks' must be of shape [H, W] or [N, H, W]")
    return cv2.addWeighted(img, 1 - alpha, overlay, alpha, 0)
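
The final `cv2.addWeighted(img, 1 - alpha, overlay, alpha, 0)` call blends every pixel of the original image with the painted overlay. A per-pixel sketch of that weighted sum in plain Python is shown below. The helper `blend_pixel` is illustrative and not part of zerohertzLib; OpenCV applies the same weighted sum to whole arrays and saturates the result to the `uint8` range, while this sketch uses Python's `round` and an explicit clip.

```python
def blend_pixel(pixel, color, alpha=0.5):
    """Weighted sum of an image pixel and the overlay color, clipped to 0..255."""
    return tuple(
        min(255, max(0, round(p * (1 - alpha) + c * alpha)))
        for p, c in zip(pixel, color)
    )


# A BGR pixel covered by the default red mask color (0, 0, 255)
masked = blend_pixel((100, 150, 201), (0, 0, 255))
# Outside every mask the overlay equals the image, so the pixel is unchanged
untouched = blend_pixel((100, 150, 201), (100, 150, 201))
print(masked)     # (50, 75, 228)
print(untouched)  # (100, 150, 201)
```

Because the overlay is a copy of the image with only the masked pixels recolored, `alpha` affects the masked regions alone.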

meanap

meanap(logs: DataFrame) -> tuple[float, dict[str, float]]

Plot the P-R curve of a detection model and compute the mAP

Parameters:

Name Type Description Default
logs DataFrame

Results produced by the zz.vision.evaluation function

required

Returns:

Type Description
tuple[float, dict[str, float]]

mAP value and per-class AP values (the plots are saved to the current directory as prc_curve.png and pr_curve.png)

Examples:

>>> logs1 = zz.vision.evaluation(ground_truths_1, inferences_1, confidences_1, gt_classes, inf_classes, file_name="test_1.png")
>>> logs2 = zz.vision.evaluation(ground_truths_2, inferences_2, confidences_2, gt_classes, inf_classes, file_name="test_2.png")
>>> logs = pd.concat([logs1, logs2], ignore_index=True)
>>> zz.vision.meanap(logs)
(0.7030629916206652, defaultdict(<class 'float'>, {'dog': 0.7177078883735305, 'cat': 0.6884180948677999}))

Mean Average Precision curves

Source code in zerohertzLib/vision/eval.py
def meanap(logs: pd.DataFrame) -> tuple[float, dict[str, float]]:
    """Plot the P-R curve of a detection model and compute the mAP

    Args:
        logs: Results produced by the `zz.vision.evaluation` function

    Returns:
        mAP value and per-class AP values (the plots are saved to the current directory as `prc_curve.png` and `pr_curve.png`)

    Examples:
        >>> logs1 = zz.vision.evaluation(ground_truths_1, inferences_1, confidences_1, gt_classes, inf_classes, file_name="test_1.png")
        >>> logs2 = zz.vision.evaluation(ground_truths_2, inferences_2, confidences_2, gt_classes, inf_classes, file_name="test_2.png")
        >>> logs = pd.concat([logs1, logs2], ignore_index=True)
        >>> zz.vision.meanap(logs)
        (0.7030629916206652, defaultdict(<class 'float'>, {'dog': 0.7177078883735305, 'cat': 0.6884180948677999}))

        ![Mean Average Precision curves](../../../assets/vision/meanap.png){ width="600" }
    """
    logs = logs.sort_values(by="confidence", ascending=False)
    confidence_per_cls = defaultdict(list)
    recall_per_cls = defaultdict(list)
    precision_per_cls = defaultdict(list)
    pr_curve = defaultdict(list)
    aps = defaultdict(float)
    classes = set(logs["class"])
    for cls in classes:
        gt = len(
            logs[
                (logs["class"] == cls)
                & ((logs["results"] == "TP") | (logs["results"] == "FN"))
            ]
        )
        for confidence in set(logs[logs["class"] == cls]["confidence"]):
            true_positive = len(
                logs[
                    (logs["class"] == cls)
                    & (logs["confidence"] >= confidence)
                    & (logs["results"] == "TP")
                ]
            )
            false_positive = len(
                logs[
                    (logs["class"] == cls)
                    & (logs["confidence"] >= confidence)
                    & (logs["results"] == "FP")
                ]
            )
            if true_positive + false_positive == 0:
                precision = 0
            else:
                precision = true_positive / (true_positive + false_positive)
            if gt == 0:
                recall = 0
            else:
                recall = true_positive / gt  # (true_positive + false_negative)
            pr_curve[cls].append((recall, precision))
            confidence_per_cls[cls].append(confidence)
            recall_per_cls[cls].append(recall)
            precision_per_cls[cls].append(precision)
        pr_curve[cls] = sorted(pr_curve[cls])
        pr_curve[cls].insert(0, (0, pr_curve[cls][0][1]))
        for i in range(1, len(pr_curve[cls])):
            recall_diff = pr_curve[cls][i][0] - pr_curve[cls][i - 1][0]
            precision_max = max(precision[1] for precision in pr_curve[cls][i:])
            aps[cls] += recall_diff * precision_max
    map_ = sum(aps.values()) / len(aps)
    _prc_curve(confidence_per_cls, recall_per_cls, precision_per_cls, classes)
    _pr_curve(pr_curve, classes, map_)
    return map_, aps
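
The per-class AP in the loop above is the area under an interpolated P-R curve: the points are sorted by recall, the first precision is extended back to recall 0, and each recall step is weighted by the highest precision found at that recall or beyond. The sketch below isolates that step for a list of `(recall, precision)` pairs. The helper name `average_precision` is illustrative and not part of zerohertzLib, and it assumes at least one point.

```python
def average_precision(pr_points):
    """Area under the interpolated P-R curve for (recall, precision) points."""
    curve = sorted(pr_points)
    curve.insert(0, (0, curve[0][1]))  # extend the first precision back to recall 0
    ap = 0.0
    for i in range(1, len(curve)):
        recall_diff = curve[i][0] - curve[i - 1][0]
        # Interpolation: best precision reachable at this recall or any higher one
        precision_max = max(precision for _, precision in curve[i:])
        ap += recall_diff * precision_max
    return ap


# Two operating points: 1 of 2 objects found with no FP, then both found with 1 FP
ap = average_precision([(0.5, 1.0), (1.0, 2 / 3)])
print(round(ap, 4))  # 0.8333
```

Here the first half of the recall range contributes 0.5 x 1.0 and the second half 0.5 x 2/3. mAP is then the mean of these values over all classes.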

pad

pad(img: NDArray[uint8], shape: tuple[int, int], color: tuple[int, int, int] = (255, 255, 255), poly: NDArray[DTypeLike] | None = None) -> tuple[NDArray[uint8], tuple[float, int, int] | NDArray[DTypeLike]]

Resize and pad the input image to the desired shape

Parameters:

Name Type Description Default
img NDArray[uint8]

Input image ([H, W, C])

required
shape tuple[int, int]

Output shape (H, W)

required
color tuple[int, int, int]

Padding color

(255, 255, 255)
poly NDArray[DTypeLike] | None

Coordinates to transform along with the padding ([N, 2])

None

Returns:

Type Description
tuple[NDArray[uint8], tuple[float, int, int] | NDArray[DTypeLike]]

Output image ([H, W, C]) and either the padding information or the transformed coordinates

Note

If poly is not given, (ratio, left, top) is returned, so coordinates can be converted later as poly * ratio + (left, top)

Examples:

GRAY:

>>> img = cv2.cvtColor(img, cv2.COLOR_BGRA2GRAY)
>>> res1 = cv2.resize(img, (500, 1000))
>>> res1, _ = zz.vision.pad(res1, (1000, 1000), color=(0, 255, 0))
BGR:
>>> res2 = cv2.resize(img, (1000, 500))
>>> res2, _ = zz.vision.pad(res2, (1000, 1000))
BGRA:
>>> img = cv2.cvtColor(img, cv2.COLOR_BGR2BGRA)
>>> res3 = cv2.resize(img, (500, 1000))
>>> res3, _ = zz.vision.pad(res3, (1000, 1000), color=(0, 0, 255, 128))
Poly:
>>> poly = np.array([[100, 400], [400, 400], [800, 900], [400, 1100], [100, 800]])
>>> res4 = cv2.resize(img, (2000, 1000))
>>> res4 = zz.vision.bbox(res4, poly, color=(255, 0, 0), thickness=20)
>>> res4, poly = zz.vision.pad(res4, (1000, 1000), poly=poly)
>>> res4 = zz.vision.bbox(res4, poly, color=(0, 0, 255))
Transformation:
>>> poly = np.array([[100, 400], [400, 400], [800, 900], [400, 1100], [100, 800]])
>>> res5 = cv2.resize(img, (2000, 1000))
>>> res5 = zz.vision.bbox(res5, poly, color=(255, 0, 0), thickness=20)
>>> res5, info = zz.vision.pad(res5, (1000, 1000), color=(128, 128, 128))
>>> poly = poly * info[0] + info[1:]
>>> res5 = zz.vision.bbox(res5, poly, color=(0, 0, 255))

Image padding example

Source code in zerohertzLib/vision/transform.py
def pad(
    img: NDArray[np.uint8],
    shape: tuple[int, int],
    color: tuple[int, int, int] = (255, 255, 255),
    poly: NDArray[DTypeLike] | None = None,
) -> tuple[NDArray[np.uint8], tuple[float, int, int] | NDArray[DTypeLike]]:
    """Resize and pad the input image to the desired shape

    Args:
        img: Input image (`[H, W, C]`)
        shape: Output shape `(H, W)`
        color: Padding color
        poly: Coordinates to transform along with the padding (`[N, 2]`)

    Returns:
        Output image (`[H, W, C]`) and either the padding information or the transformed coordinates

    Note:
        If `poly` is not given, `(ratio, left, top)` is returned, so coordinates can be converted later as `poly * ratio + (left, top)`

    Examples:
        GRAY:
            ```python
            >>> img = cv2.cvtColor(img, cv2.COLOR_BGRA2GRAY)
            >>> res1 = cv2.resize(img, (500, 1000))
            >>> res1, _ = zz.vision.pad(res1, (1000, 1000), color=(0, 255, 0))
            ```
        BGR:
            ```python
            >>> res2 = cv2.resize(img, (1000, 500))
            >>> res2, _ = zz.vision.pad(res2, (1000, 1000))
            ```
        BGRA:
            ```python
            >>> img = cv2.cvtColor(img, cv2.COLOR_BGR2BGRA)
            >>> res3 = cv2.resize(img, (500, 1000))
            >>> res3, _ = zz.vision.pad(res3, (1000, 1000), color=(0, 0, 255, 128))
            ```
        Poly:
            ```python
            >>> poly = np.array([[100, 400], [400, 400], [800, 900], [400, 1100], [100, 800]])
            >>> res4 = cv2.resize(img, (2000, 1000))
            >>> res4 = zz.vision.bbox(res4, poly, color=(255, 0, 0), thickness=20)
            >>> res4, poly = zz.vision.pad(res4, (1000, 1000), poly=poly)
            >>> res4 = zz.vision.bbox(res4, poly, color=(0, 0, 255))
            ```
        Transformation:
            ```python
            >>> poly = np.array([[100, 400], [400, 400], [800, 900], [400, 1100], [100, 800]])
            >>> res5 = cv2.resize(img, (2000, 1000))
            >>> res5 = zz.vision.bbox(res5, poly, color=(255, 0, 0), thickness=20)
            >>> res5, info = zz.vision.pad(res5, (1000, 1000), color=(128, 128, 128))
            >>> poly = poly * info[0] + info[1:]
            >>> res5 = zz.vision.bbox(res5, poly, color=(0, 0, 255))
            ```

        ![Image padding example](../../../assets/vision/pad.png){ width="700" }
    """
    if len(img.shape) == 2:
        img = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
    if img.shape[2] == 4 and len(color) == 3:
        color = [*color, 255]
    img_height, img_width = img.shape[:2]
    tar_height, tar_width = shape
    if img_width / img_height > tar_width / tar_height:
        ratio = tar_width / img_width
        resize_width, resize_height = tar_width, int(img_height * ratio)
    elif img_width / img_height < tar_width / tar_height:
        ratio = tar_height / img_height
        resize_width, resize_height = int(img_width * ratio), tar_height
    else:
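        # Same aspect ratio: the image is still scaled to the target size, so the ratio is not 1 in general
        ratio = tar_width / img_width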
        (
            resize_width,
            resize_height,
        ) = (
            tar_width,
            tar_height,
        )
    img = cv2.resize(img, (resize_width, resize_height), interpolation=cv2.INTER_LINEAR)
    top, bottom = (
        (tar_height - resize_height) // 2,
        (tar_height - resize_height) // 2 + (tar_height - resize_height) % 2,
    )
    left, right = (
        (tar_width - resize_width) // 2,
        (tar_width - resize_width) // 2 + (tar_width - resize_width) % 2,
    )
    img = cv2.copyMakeBorder(
        img, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color
    )
    if poly is None:
        return img, (ratio, left, top)
    return img, poly * ratio + (left, top)
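
The `(ratio, left, top)` triple returned by `pad` is enough to map any coordinate from the original image into the padded one. The sketch below recomputes that triple from the two shapes alone, without touching pixel data. The helper name `letterbox_params` is illustrative and not part of zerohertzLib, and in this sketch images with the same aspect ratio as the target go through the first branch, so their ratio is target size over image size.

```python
def letterbox_params(img_shape, target_shape):
    """Return (ratio, left, top) for fitting an (H, W) image into an (H, W) target."""
    img_height, img_width = img_shape
    tar_height, tar_width = target_shape
    if img_width * tar_height >= tar_width * img_height:
        # Image is relatively wider (or equal): width is the tight axis
        ratio = tar_width / img_width
        resize_width, resize_height = tar_width, int(img_height * ratio)
    else:
        # Image is relatively taller: height is the tight axis
        ratio = tar_height / img_height
        resize_width, resize_height = int(img_width * ratio), tar_height
    left = (tar_width - resize_width) // 2
    top = (tar_height - resize_height) // 2
    return ratio, left, top


ratio, left, top = letterbox_params((1000, 2000), (1000, 1000))
print(ratio, left, top)  # 0.5 0 250
# Map the point (100, 400) from the original image into the padded one
mapped = (100 * ratio + left, 400 * ratio + top)
print(mapped)  # (50.0, 450.0)
```

A 2000 x 1000 image is halved to 1000 x 500 and centered vertically, so every point is scaled by 0.5 and shifted down by 250 pixels.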

paste

paste(img: NDArray[uint8], target: NDArray[uint8], box: list[int | float] | NDArray[DTypeLike], resize: bool = False, vis: bool = False, poly: NDArray[DTypeLike] | None = None, alpha: int | None = None, gaussian: int | None = None) -> NDArray[uint8] | tuple[NDArray[uint8], NDArray[DTypeLike]]

Merge the target image onto img, including its transparency

Note

Reimplements PIL.Image.paste on top of numpy and cv2

>>> img = Image.open("test.png").convert("RGBA")
>>> target = Image.open("target.png").convert("RGBA")
>>> img.paste(target, (0, 0), target)

Parameters:

Name Type Description Default
img NDArray[uint8]

Input image ([H, W, C])

required
target NDArray[uint8]

Target image ([H, W, 4])

required
box list[int | float] | NDArray[DTypeLike]

Region to paste into (xyxy format)

required
resize bool

Whether to resize the target image

False
vis bool

Whether to visualize the specified region (box)

False
poly NDArray[DTypeLike] | None

Coordinates to be transformed along with the target ([N, 2])

None
alpha int | None

Alpha value applied to the non-transparent pixels of the target image

None
gaussian int | None

Kernel size of the Gaussian blur applied to the alpha channel of target for a more natural merge

None

Returns:

Type Description
NDArray[uint8] | tuple[NDArray[uint8], NDArray[DTypeLike]]

Visualization result ([H, W, 4]) and, when poly is given, the transformed coordinates

Examples:

Without Poly:

>>> poly = np.array([[100, 400], [400, 400], [800, 900], [400, 1100], [100, 800]])
>>> target = zz.vision.cutout(img, poly, 200)
>>> res1 = zz.vision.paste(img, target, [200, 200, 1000, 800], resize=False, vis=True)
>>> res2 = zz.vision.paste(img, target, [200, 200, 1000, 800], resize=True, vis=True, alpha=255)
With Poly:
>>> poly -= zz.vision.poly2xyxy(poly)[:2]
>>> target = zz.vision.bbox(target, poly, color=(255, 0, 0), thickness=20)
>>> res3, poly3 = zz.vision.paste(img, target, [200, 200, 1000, 800], resize=False, poly=poly)
>>> poly3
array([[300.        , 200.        ],
       [557.14285714, 200.        ],
       [900.        , 628.57142857],
       [557.14285714, 800.        ],
       [300.        , 542.85714286]])
>>> res3 = zz.vision.bbox(res3, poly3)
>>> res4, poly4 = zz.vision.paste(img, target, [200, 200, 1000, 800], resize=True, poly=poly)
>>> poly4
array([[ 200.        ,  200.        ],
       [ 542.85714286,  200.        ],
       [1000.        ,  628.57142857],
       [ 542.85714286,  800.        ],
       [ 200.        ,  542.85714286]])
>>> res4 = zz.vision.bbox(res4, poly4)
Gaussian Blur:
>>> res5, poly5 = zz.vision.paste(img, target, [200, 200, 1000, 800], resize=True, poly=poly, gaussian=501)
>>> res5 = zz.vision.bbox(res5, poly5)

Image pasting example

Source code in zerohertzLib/vision/visual.py
def paste(
    img: NDArray[np.uint8],
    target: NDArray[np.uint8],
    box: list[int | float] | NDArray[DTypeLike],
    resize: bool = False,
    vis: bool = False,
    poly: NDArray[DTypeLike] | None = None,
    alpha: int | None = None,
    gaussian: int | None = None,
) -> NDArray[np.uint8] | tuple[NDArray[np.uint8], NDArray[DTypeLike]]:
    """Merge the `target` image onto `img`, including its transparency

    Note:
        Reimplements `PIL.Image.paste` on top of `numpy` and `cv2`

        ```python
        >>> img = Image.open("test.png").convert("RGBA")
        >>> target = Image.open("target.png").convert("RGBA")
        >>> img.paste(target, (0, 0), target)
        ```

    Args:
        img: Input image (`[H, W, C]`)
        target: Target image (`[H, W, 4]`)
        box: Region to paste into (`xyxy` format)
        resize: Whether to resize the target image
        vis: Whether to visualize the specified region (`box`)
        poly: Coordinates to be transformed along with the target (`[N, 2]`)
        alpha: Alpha value applied to the non-transparent pixels of the `target` image
        gaussian: Kernel size of the Gaussian blur applied to the alpha channel of `target` for a more natural merge

    Returns:
        Visualization result (`[H, W, 4]`) and, when `poly` is given, the transformed coordinates

    Examples:
        Without Poly:
            ```python
            >>> poly = np.array([[100, 400], [400, 400], [800, 900], [400, 1100], [100, 800]])
            >>> target = zz.vision.cutout(img, poly, 200)
            >>> res1 = zz.vision.paste(img, target, [200, 200, 1000, 800], resize=False, vis=True)
            >>> res2 = zz.vision.paste(img, target, [200, 200, 1000, 800], resize=True, vis=True, alpha=255)
            ```
        With Poly:
            ```python
            >>> poly -= zz.vision.poly2xyxy(poly)[:2]
            >>> target = zz.vision.bbox(target, poly, color=(255, 0, 0), thickness=20)
            >>> res3, poly3 = zz.vision.paste(img, target, [200, 200, 1000, 800], resize=False, poly=poly)
            >>> poly3
            array([[300.        , 200.        ],
                   [557.14285714, 200.        ],
                   [900.        , 628.57142857],
                   [557.14285714, 800.        ],
                   [300.        , 542.85714286]])
            >>> res3 = zz.vision.bbox(res3, poly3)
            >>> res4, poly4 = zz.vision.paste(img, target, [200, 200, 1000, 800], resize=True, poly=poly)
            >>> poly4
            array([[ 200.        ,  200.        ],
                   [ 542.85714286,  200.        ],
                   [1000.        ,  628.57142857],
                   [ 542.85714286,  800.        ],
                   [ 200.        ,  542.85714286]])
            >>> res4 = zz.vision.bbox(res4, poly4)
            ```
        Gaussian Blur:
            ```python
            >>> res5, poly5 = zz.vision.paste(img, target, [200, 200, 1000, 800], resize=True, poly=poly, gaussian=501)
            >>> res5 = zz.vision.bbox(res5, poly5)
            ```

        ![Image pasting example](../../../assets/vision/paste.png){ width="600" }
    """
    x_0, y_0, x_1, y_1 = map(int, box)
    box_height, box_width = y_1 - y_0, x_1 - x_0
    img = img.copy()
    img = _cvt_bgra(img)
    target = target.copy()
    tar_height, tar_width = target.shape[:2]
    if alpha is not None:
        target[:, :, 3][0 < target[:, :, 3]] = alpha
    if gaussian is not None:
        invisible = target[:, :, 3] == 0
        pad_gaussian = gaussian * 3
        target_alpha = cv2.copyMakeBorder(
            target[:, :, 3],
            pad_gaussian,
            pad_gaussian,
            pad_gaussian,
            pad_gaussian,
            cv2.BORDER_CONSTANT,
        )
        target[:, :, 3] = cv2.GaussianBlur(target_alpha, (gaussian, gaussian), 0)[
            pad_gaussian:-pad_gaussian, pad_gaussian:-pad_gaussian
        ]
        target[:, :, 3][invisible] = 0
    if resize:
        target = cv2.resize(
            target, (box_width, box_height), interpolation=cv2.INTER_LINEAR
        )
        if poly is not None:
            poly = poly * (box_width / tar_width, box_height / tar_height) + (x_0, y_0)
    else:
        if poly is None:
            target, _ = pad(target, (box_height, box_width), (0, 0, 0, 0))
        else:
            target, poly = pad(target, (box_height, box_width), (0, 0, 0, 0), poly)
            poly += (x_0, y_0)
    img[y_0:y_1, x_0:x_1, :] = _paste(img[y_0:y_1, x_0:x_1, :], target)
    if vis:
        box = np.array([[x_0, y_0], [x_0, y_1], [x_1, y_1], [x_1, y_0]])
        img = _bbox(img, box, (0, 0, 255, 255), 2)
    if poly is None:
        return img
    return img, poly
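
The `resize=True` branch above rescales `poly` by the ratio between the box size and the target size, then shifts it by the box's top-left corner. The following is a minimal NumPy-only sketch of that step; the helper name `scale_poly_into_box` is hypothetical, and the 700x700 target size is an assumption inferred from the extent of the example polygon.

```python
import numpy as np


def scale_poly_into_box(poly, target_size, box):
    """Map polygon points from target-image coordinates into `box` (xyxy)."""
    tar_width, tar_height = target_size
    x_0, y_0, x_1, y_1 = map(int, box)
    scale = ((x_1 - x_0) / tar_width, (y_1 - y_0) / tar_height)
    # Scale each (x, y) point, then shift by the box's top-left corner
    return np.asarray(poly, dtype=float) * scale + (x_0, y_0)


# Example polygon after subtracting its own top-left corner (100, 400);
# the 700x700 target size is an assumption, not taken from the library
poly = np.array([[0, 0], [300, 0], [700, 500], [300, 700], [0, 400]])
poly4 = scale_poly_into_box(poly, (700, 700), [200, 200, 1000, 800])
```

Under these assumptions the result agrees with the `poly4` values listed in the example.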

poly2area

poly2area(poly: list[int | float] | NDArray[DTypeLike]) -> float

Compute the area of a polygon

Parameters:

Name Type Description Default
poly list[int | float] | NDArray[DTypeLike]

Polygon ([N, 2])

required

Returns:

Type Description
float

Area of the polygon

Examples:

>>> poly = [[10, 10], [20, 10], [30, 40], [20, 60], [10, 20]]
>>> zz.vision.poly2area(poly)
550.0
>>> box = np.array([[100, 200], [1200, 200], [1200, 1000], [100, 1000]])
>>> zz.vision.poly2area(box)
880000.0
Source code in zerohertzLib/vision/convert.py
def poly2area(poly: list[int | float] | NDArray[DTypeLike]) -> float:
    """Compute the area of a polygon

    Args:
        poly: Polygon (`[N, 2]`)

    Returns:
        Area of the polygon

    Examples:
        >>> poly = [[10, 10], [20, 10], [30, 40], [20, 60], [10, 20]]
        >>> zz.vision.poly2area(poly)
        550.0
        >>> box = np.array([[100, 200], [1200, 200], [1200, 1000], [100, 1000]])
        >>> zz.vision.poly2area(box)
        880000.0
    """
    poly = _list2np(poly)
    pts_x = poly[:, 0]
    pts_y = poly[:, 1]
    return 0.5 * np.abs(
        np.dot(pts_x, np.roll(pts_y, 1)) - np.dot(pts_y, np.roll(pts_x, 1))
    )
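
The source above is an instance of the shoelace formula, A = 1/2 |Σ (x_i · y_{i+1} − x_{i+1} · y_i)|, with indices wrapping around. The sketch below spells the pairing out explicitly; `shoelace_area` is a hypothetical name used only for illustration.

```python
import numpy as np


def shoelace_area(poly):
    """Area of a simple polygon given as [N, 2] vertices (shoelace formula)."""
    pts = np.asarray(poly, dtype=float)
    x, y = pts[:, 0], pts[:, 1]
    # Pair each vertex with its successor, wrapping around to the first
    x_next, y_next = np.roll(x, -1), np.roll(y, -1)
    return 0.5 * abs(np.sum(x * y_next - x_next * y))


area_poly = shoelace_area([[10, 10], [20, 10], [30, 40], [20, 60], [10, 20]])
area_box = shoelace_area([[100, 200], [1200, 200], [1200, 1000], [100, 1000]])
```

The vertex order (clockwise or counter-clockwise) only flips the sign of the sum, which the absolute value removes.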

poly2cwh

poly2cwh(box: list[int | float] | NDArray[DTypeLike]) -> NDArray[DTypeLike]

Bbox conversion

Parameters:

Name Type Description Default
box list[int | float] | NDArray[DTypeLike]

Bbox in the form [[x0, y0], [x1, y1], [x2, y2], [x3, y3]] ([4, 2] or [N, 4, 2])

required

Returns:

Type Description
NDArray[DTypeLike]

Bbox in the form [cx, cy, w, h] ([4] or [N, 4])

Examples:

>>> zz.vision.poly2cwh([[10, 20], [30, 20], [30, 40], [10, 40]])
array([20, 30, 20, 20])
>>> zz.vision.poly2cwh(np.array([[[10, 20], [30, 20], [30, 40], [10, 40]], [[30, 50], [70, 50], [70, 100], [30, 100]]]))
array([[20, 30, 20, 20],
       [50, 75, 40, 50]])
Source code in zerohertzLib/vision/convert.py
def poly2cwh(
    box: list[int | float] | NDArray[DTypeLike],
) -> NDArray[DTypeLike]:
    """Bbox conversion

    Args:
        box: Bbox in the form `[[x0, y0], [x1, y1], [x2, y2], [x3, y3]]` (`[4, 2]` or `[N, 4, 2]`)

    Returns:
        Bbox in the form `[cx, cy, w, h]` (`[4]` or `[N, 4]`)

    Examples:
        >>> zz.vision.poly2cwh([[10, 20], [30, 20], [30, 40], [10, 40]])
        array([20, 30, 20, 20])
        >>> zz.vision.poly2cwh(np.array([[[10, 20], [30, 20], [30, 40], [10, 40]], [[30, 50], [70, 50], [70, 100], [30, 100]]]))
        array([[20, 30, 20, 20],
               [50, 75, 40, 50]])
    """
    box = _list2np(box)
    shape = box.shape
    multi, poly = _is_bbox(shape)
    if not poly:
        raise ValueError("The 'poly' must be of shape [4, 2], [N, 4, 2]")
    if multi:
        boxes = np.zeros((shape[0], 4), dtype=box.dtype)
        for i, box_ in enumerate(box):
            boxes[i] = _poly2cwh(box_)
        return boxes
    return _poly2cwh(box)
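
The private helper `_poly2cwh` is not shown on this page. As a rough sketch of one way such a conversion can work, the hypothetical function below takes the min/max over the vertex axis, so it handles `[4, 2]` and `[N, 4, 2]` without a loop. It returns floats, so the dtype may differ from the library's output.

```python
import numpy as np


def poly_to_cwh(poly):
    """Convert [4, 2] or [N, 4, 2] polygons to [cx, cy, w, h] (illustrative only)."""
    pts = np.asarray(poly, dtype=float)
    # Reduce over the vertex axis (second to last) to get the extremes
    lo, hi = pts.min(axis=-2), pts.max(axis=-2)
    return np.concatenate([(lo + hi) / 2, hi - lo], axis=-1)


single = poly_to_cwh([[10, 20], [30, 20], [30, 40], [10, 40]])
batch = poly_to_cwh(
    [[[10, 20], [30, 20], [30, 40], [10, 40]], [[30, 50], [70, 50], [70, 100], [30, 100]]]
)
```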

poly2mask

poly2mask(poly: list[int | float] | NDArray[DTypeLike] | list[NDArray[DTypeLike]], shape: tuple[int, int]) -> NDArray[bool]

Convert polygon coordinates into a mask

Parameters:

Name Type Description Default
poly list[int | float] | NDArray[DTypeLike] | list[NDArray[DTypeLike]]

Vertex coordinates of the mask ([M, 2] or [N, M, 2])

required
shape tuple[int, int]

Shape of the output mask (H, W)

required

Returns:

Type Description
NDArray[bool]

Converted mask ([H, W] or [N, H, W])

Examples:

>>> poly = [[10, 10], [20, 10], [30, 40], [20, 60], [10, 20]]
>>> mask1 = zz.vision.poly2mask(poly, (70, 100))
>>> mask1.shape
(70, 100)
>>> mask1.dtype
dtype('bool')
>>> poly = np.array(poly)
>>> mask2 = zz.vision.poly2mask([poly, poly - 10, poly + 20], (70, 100))
>>> mask2.shape
(3, 70, 100)
>>> mask2.dtype
dtype('bool')

Polygon to mask conversion example

Source code in zerohertzLib/vision/convert.py
def poly2mask(
    poly: list[int | float] | NDArray[DTypeLike] | list[NDArray[DTypeLike]],
    shape: tuple[int, int],
) -> NDArray[bool]:
    """Convert polygon coordinates into a mask

    Args:
        poly: Vertex coordinates of the mask (`[M, 2]` or `[N, M, 2]`)
        shape: Shape of the output mask `(H, W)`

    Returns:
        Converted mask (`[H, W]` or `[N, H, W]`)

    Examples:
        >>> poly = [[10, 10], [20, 10], [30, 40], [20, 60], [10, 20]]
        >>> mask1 = zz.vision.poly2mask(poly, (70, 100))
        >>> mask1.shape
        (70, 100)
        >>> mask1.dtype
        dtype('bool')
        >>> poly = np.array(poly)
        >>> mask2 = zz.vision.poly2mask([poly, poly - 10, poly + 20], (70, 100))
        >>> mask2.shape
        (3, 70, 100)
        >>> mask2.dtype
        dtype('bool')

        ![Polygon to mask conversion example](../../../assets/vision/poly2mask.png){ width="300" }
    """
    if (isinstance(poly, list) and isinstance(poly[0], np.ndarray)) or (
        isinstance(poly, np.ndarray) and len(poly.shape) == 3
    ):
        mks = []
        for _poly in poly:
            mks.append(_poly2mask(_poly, shape))
        mks = np.array(mks)
    else:
        mks = _poly2mask(_list2np(poly), shape)
    return mks
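
The per-polygon helper `_poly2mask` is not shown here. The following NumPy-only sketch illustrates the general idea of rasterizing a polygon with the even-odd (ray casting) rule; `poly_to_mask` is a hypothetical name, and its treatment of boundary pixels is an assumption that may differ from the library's.

```python
import numpy as np


def poly_to_mask(poly, shape):
    """Rasterize an [M, 2] polygon into a boolean [H, W] mask (even-odd rule)."""
    pts = np.asarray(poly, dtype=float)
    height, width = shape
    ys, xs = np.mgrid[0:height, 0:width]
    inside = np.zeros(shape, dtype=bool)
    x_a, y_a = pts[:, 0], pts[:, 1]
    x_b, y_b = np.roll(x_a, -1), np.roll(y_a, -1)
    for xa, ya, xb, yb in zip(x_a, y_a, x_b, y_b):
        if ya == yb:
            continue  # A horizontal edge never crosses a horizontal ray
        # Rows whose rightward ray can cross this edge
        crosses = (ya > ys) != (yb > ys)
        # x position of the edge at each row
        x_at_y = xa + (ys - ya) * (xb - xa) / (yb - ya)
        # Toggle pixels lying to the left of the crossing point
        inside ^= crosses & (xs < x_at_y)
    return inside


square = poly_to_mask([[2, 2], [6, 2], [6, 6], [2, 6]], (10, 10))
poly = np.array([[10, 10], [20, 10], [30, 40], [20, 60], [10, 20]])
stacked = np.array([poly_to_mask(p, (70, 100)) for p in (poly, poly - 10, poly + 20)])
```

Stacking the per-polygon masks mirrors the `[N, H, W]` output described above.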

poly2ratio

poly2ratio(poly: list[int | float] | NDArray[DTypeLike]) -> float

Compute the ratio of a polygon's area to the area of its bbox

Parameters:

Name Type Description Default
poly list[int | float] | NDArray[DTypeLike]

Polygon ([N, 2])

required

Returns:

Type Description
float

Ratio of the polygon's area to its bbox area

Examples:

>>> poly = [[10, 10], [20, 10], [30, 40], [20, 60], [10, 20]]
>>> zz.vision.poly2ratio(poly)
0.55
>>> box = np.array([[100, 200], [1200, 200], [1200, 1000], [100, 1000]])
>>> zz.vision.poly2ratio(box)
1.0
Source code in zerohertzLib/vision/convert.py
def poly2ratio(poly: list[int | float] | NDArray[DTypeLike]) -> float:
    """Compute the ratio of a polygon's area to the area of its bbox

    Args:
        poly: Polygon (`[N, 2]`)

    Returns:
        Ratio of the polygon's area to its bbox area

    Examples:
        >>> poly = [[10, 10], [20, 10], [30, 40], [20, 60], [10, 20]]
        >>> zz.vision.poly2ratio(poly)
        0.55
        >>> box = np.array([[100, 200], [1200, 200], [1200, 1000], [100, 1000]])
        >>> zz.vision.poly2ratio(box)
        1.0
    """
    poly_area = poly2area(poly)
    _, _, width, height = poly2cwh(poly)
    bbox_area = width * height
    return poly_area / bbox_area
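
For the first example polygon, the area is 550 and its axis-aligned bbox spans 20 x 50 = 1000, which gives 0.55. A self-contained sketch of that arithmetic follows; `poly_bbox_ratio` is a hypothetical name, and the bbox is taken from the min/max of the vertices.

```python
import numpy as np


def poly_bbox_ratio(poly):
    """Polygon area divided by the area of its axis-aligned bounding box."""
    pts = np.asarray(poly, dtype=float)
    x, y = pts[:, 0], pts[:, 1]
    # Shoelace formula for the polygon area
    poly_area = 0.5 * abs(np.dot(x, np.roll(y, 1)) - np.dot(y, np.roll(x, 1)))
    # Bounding-box extent from the vertex extremes
    width, height = pts.max(axis=0) - pts.min(axis=0)
    return poly_area / (width * height)


ratio_poly = poly_bbox_ratio([[10, 10], [20, 10], [30, 40], [20, 60], [10, 20]])
ratio_box = poly_bbox_ratio([[100, 200], [1200, 200], [1200, 1000], [100, 1000]])
```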

poly2xyxy

poly2xyxy(box: list[int | float] | NDArray[DTypeLike]) -> NDArray[DTypeLike]

Bbox conversion

Parameters:

Name Type Description Default
box list[int | float] | NDArray[DTypeLike]

Bbox in the form [[x0, y0], [x1, y1], [x2, y2], [x3, y3]] ([4, 2] or [N, 4, 2])

required

Returns:

Type Description
NDArray[DTypeLike]

Bbox in the form [x0, y0, x1, y1] ([4] or [N, 4])

Examples:

>>> zz.vision.poly2xyxy([[10, 20], [30, 20], [30, 40], [10, 40]])
array([10, 20, 30, 40])
>>> zz.vision.poly2xyxy(np.array([[[10, 20], [30, 20], [30, 40], [10, 40]], [[30, 50], [70, 50], [70, 100], [30, 100]]]))
array([[ 10,  20,  30,  40],
       [ 30,  50,  70, 100]])
Source code in zerohertzLib/vision/convert.py
def poly2xyxy(
    box: list[int | float] | NDArray[DTypeLike],
) -> NDArray[DTypeLike]:
    """Bbox conversion

    Args:
        box: Bbox in the form `[[x0, y0], [x1, y1], [x2, y2], [x3, y3]]` (`[4, 2]` or `[N, 4, 2]`)

    Returns:
        Bbox in the form `[x0, y0, x1, y1]` (`[4]` or `[N, 4]`)

    Examples:
        >>> zz.vision.poly2xyxy([[10, 20], [30, 20], [30, 40], [10, 40]])
        array([10, 20, 30, 40])
        >>> zz.vision.poly2xyxy(np.array([[[10, 20], [30, 20], [30, 40], [10, 40]], [[30, 50], [70, 50], [70, 100], [30, 100]]]))
        array([[ 10,  20,  30,  40],
               [ 30,  50,  70, 100]])
    """
    box = _list2np(box)
    shape = box.shape
    multi, poly = _is_bbox(shape)
    if not poly:
        raise ValueError("The 'poly' must be of shape [4, 2], [N, 4, 2]")
    if multi:
        boxes = np.zeros((shape[0], 4), dtype=box.dtype)
        for i, box_ in enumerate(box):
            boxes[i] = _poly2xyxy(box_)
        return boxes
    return _poly2xyxy(box)
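
As with the other converters, the private `_poly2xyxy` is not listed. A loop-free sketch under the assumption that the result is simply the min and max corner of the vertices could look like this (`poly_to_xyxy` is a hypothetical name):

```python
import numpy as np


def poly_to_xyxy(poly):
    """Convert [4, 2] or [N, 4, 2] polygons to [x0, y0, x1, y1] (illustrative only)."""
    pts = np.asarray(poly)
    # Top-left corner is the per-axis minimum, bottom-right the maximum
    return np.concatenate([pts.min(axis=-2), pts.max(axis=-2)], axis=-1)


single = poly_to_xyxy([[10, 20], [30, 20], [30, 40], [10, 40]])
batch = poly_to_xyxy(
    [[[10, 20], [30, 20], [30, 40], [10, 40]], [[30, 50], [70, 50], [70, 100], [30, 100]]]
)
```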

text

text(img: NDArray[uint8], box: list[int | float] | NDArray[DTypeLike], txt: str | list[str], color: tuple[int, int, int] = (0, 0, 0), vis: bool = False, fontsize: int = 100) -> NDArray[uint8]

Text visualization

Parameters:

Name Type Description Default
img NDArray[uint8]

Input image ([H, W, C])

required
box list[int | float] | NDArray[DTypeLike]

Bbox in which the text is placed ([4], [N, 4], [4, 2], [N, 4, 2])

required
txt str | list[str]

String(s) to add to the image

required
color tuple[int, int, int]

Text color

(0, 0, 0)
vis bool

Whether to visualize the text region

False
fontsize int

Font size

100

Returns:

Type Description
NDArray[uint8]

Visualization result ([H, W, 4])

Examples:

Bbox:

>>> box = np.array([[100, 200], [100, 1000], [1200, 1000], [1200, 200]])
>>> box.shape
(4, 2)
>>> res1 = zz.vision.text(img, box, "먼지야")
Bboxes:
>>> boxes = np.array([[250, 200, 100, 100], [600, 600, 800, 200], [900, 300, 300, 400]])
>>> boxes.shape
(3, 4)
>>> res2 = zz.vision.text(img, boxes, ["먼지야", "먼지야", "먼지야"], vis=True)

Text on image example

Source code in zerohertzLib/vision/visual.py
def text(
    img: NDArray[np.uint8],
    box: list[int | float] | NDArray[DTypeLike],
    txt: str | list[str],
    color: tuple[int, int, int] = (0, 0, 0),
    vis: bool = False,
    fontsize: int = 100,
) -> NDArray[np.uint8]:
    """Text visualization

    Args:
        img: Input image (`[H, W, C]`)
        box: Bbox in which the text is placed (`[4]`, `[N, 4]`, `[4, 2]`, `[N, 4, 2]`)
        txt: String(s) to add to the image
        color: Text color
        vis: Whether to visualize the text region
        fontsize: Font size

    Returns:
        Visualization result (`[H, W, 4]`)

    Examples:
        Bbox:
            ```python
            >>> box = np.array([[100, 200], [100, 1000], [1200, 1000], [1200, 200]])
            >>> box.shape
            (4, 2)
            >>> res1 = zz.vision.text(img, box, "먼지야")
            ```
        Bboxes:
            ```python
            >>> boxes = np.array([[250, 200, 100, 100], [600, 600, 800, 200], [900, 300, 300, 400]])
            >>> boxes.shape
            (3, 4)
            >>> res2 = zz.vision.text(img, boxes, ["먼지야", "먼지야", "먼지야"], vis=True)
            ```

        ![Text on image example](../../../assets/vision/text.png){ width="600" }
    """
    box = _list2np(box)
    img = img.copy()
    img = _cvt_bgra(img)
    shape = box.shape
    multi, poly = _is_bbox(shape)
    if poly:
        box_poly = box
        box_cwh = poly2cwh(box)
    else:
        box_poly = cwh2poly(box)
        box_cwh = box
    if multi:
        if shape[0] != len(txt):
            raise ValueError("'box.shape[0]' and 'len(txt)' must be equal")
        for b_poly, b_cwh, txt_ in zip(box_poly, box_cwh, txt):
            img = _text(img, b_cwh, txt_, color, fontsize)
            if vis:
                img = _bbox(img, b_poly, (0, 0, 255, 255), 2)
    else:
        img = _text(img, box_cwh, txt, color, fontsize)
        if vis:
            img = _bbox(img, box_poly, (0, 0, 255, 255), 2)
    return img
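
`text` accepts four bbox shapes and branches on the `(multi, poly)` pair returned by the private `_is_bbox`, which is not shown on this page. The sketch below is only a guess at how such a shape check could behave for the four documented shapes; `classify_bbox_shape` is a hypothetical name and the real helper may accept more shapes.

```python
def classify_bbox_shape(shape):
    """Return (multi, poly) for a bbox array shape (illustrative only)."""
    if shape == (4,):
        return False, False  # single cwh/xyxy box
    if shape == (4, 2):
        return False, True  # single polygon box
    if len(shape) == 2 and shape[1] == 4:
        return True, False  # N cwh/xyxy boxes
    if len(shape) == 3 and shape[1:] == (4, 2):
        return True, True  # N polygon boxes
    raise ValueError(f"Unsupported bbox shape: {shape}")


single_poly = classify_bbox_shape((4, 2))  # shape of `box` in the first example
multi_cwh = classify_bbox_shape((3, 4))  # shape of `boxes` in the second example
```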

transparent

transparent(img: NDArray[uint8], threshold: int = 128, reverse: bool = False) -> NDArray[uint8]

Make pixels of the input image below threshold transparent

Parameters:

Name Type Description Default
img NDArray[uint8]

Input image ([H, W, C])

required
threshold int

Grayscale intensity threshold

128
reverse bool

Whether to make pixels at or above threshold transparent instead

False

Returns:

Type Description
NDArray[uint8]

Output image ([H, W, 4])

Examples:

>>> res1 = zz.vision.transparent(img)
>>> res2 = zz.vision.transparent(img, reverse=True)

Transparent background example

Source code in zerohertzLib/vision/transform.py
def transparent(
    img: NDArray[np.uint8],
    threshold: int = 128,
    reverse: bool = False,
) -> NDArray[np.uint8]:
    """Make pixels of the input image below `threshold` transparent

    Args:
        img: Input image (`[H, W, C]`)
        threshold: Grayscale intensity threshold
        reverse: Whether to make pixels at or above `threshold` transparent instead

    Returns:
        Output image (`[H, W, 4]`)

    Examples:
        >>> res1 = zz.vision.transparent(img)
        >>> res2 = zz.vision.transparent(img, reverse=True)

        ![Transparent background example](../../../assets/vision/transparent.png){ width="600" }
    """
    img = img.copy()
    img = _cvt_bgra(img)
    img_alpha = img[:, :, 3]
    img_bin = threshold > cv2.cvtColor(img, cv2.COLOR_BGRA2GRAY)
    if reverse:
        img_alpha[~img_bin] = 0
    else:
        img_alpha[img_bin] = 0
    return img
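
The same thresholding idea can be sketched without `cv2`. The version below assumes the input is already BGRA and approximates the grayscale conversion with the common 0.299 R + 0.587 G + 0.114 B weights, so results may differ slightly from the library near the threshold. `transparent_sketch` is a hypothetical name.

```python
import numpy as np


def transparent_sketch(img_bgra, threshold=128, reverse=False):
    """Zero the alpha of pixels darker (or, with reverse, brighter) than threshold."""
    out = img_bgra.copy()
    blue, green, red = (out[:, :, i].astype(float) for i in range(3))
    gray = 0.114 * blue + 0.587 * green + 0.299 * red
    below = gray < threshold
    # out[:, :, 3] is a view, so the masked assignment edits `out` in place
    out[:, :, 3][~below if reverse else below] = 0
    return out


# One black and one white pixel, both fully opaque
img = np.zeros((1, 2, 4), dtype=np.uint8)
img[0, 1, :3] = 255
img[:, :, 3] = 255
res1 = transparent_sketch(img)
res2 = transparent_sketch(img, reverse=True)
```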

vert

vert(imgs: list[NDArray[uint8]], height: int = 1000, file_name: str = 'tmp') -> None

Merge multiple images side by side into a single horizontal image

Parameters:

Name Type Description Default
imgs list[NDArray[uint8]]

Input images

required
height int

Height of the output image

1000
file_name str

Name of the file to save

'tmp'

Returns:

Type Description
None

The image is saved directly to the current directory

Examples:

>>> imgs = [cv2.resize(img, (random.randrange(300, 600), random.randrange(300, 600))) for _ in range(5)]
>>> zz.vision.vert(imgs)

Vertical image alignment example

Source code in zerohertzLib/vision/compare.py
def vert(
    imgs: list[NDArray[np.uint8]],
    height: int = 1000,
    file_name: str = "tmp",
) -> None:
    """Merge multiple images side by side into a single horizontal image

    Args:
        imgs: Input images
        height: Height of the output image
        file_name: Name of the file to save

    Returns:
        The image is saved directly to the current directory

    Examples:
        >>> imgs = [cv2.resize(img, (random.randrange(300, 600), random.randrange(300, 600))) for _ in range(5)]
        >>> zz.vision.vert(imgs)

        ![Vertical image alignment example](../../../assets/vision/vert.png){ width="600" }
    """
    resized_imgs = []
    width = 0
    for img in imgs:
        shape = img.shape
        img = _cvt_bgra(img)
        if shape[0] != height:
            tar_width = int(height / shape[0] * shape[1])
            img = cv2.resize(img, (tar_width, height))
        else:
            tar_width = shape[1]
        width += tar_width
        resized_imgs.append(img)
    palette = np.full((height, width, 4), 255, dtype=np.uint8)
    width = 0
    for img in resized_imgs:
        img_height, img_width, _ = img.shape
        palette[:img_height, width : width + img_width, :] = img
        width += img_width
    cv2.imwrite(f"{file_name}.png", palette)
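
The canvas width in the source above is the sum of each image's width after it is rescaled to the common height. A small sketch of just that bookkeeping, with no image I/O, is shown below; `scaled_widths` is a hypothetical helper and the input shapes are made up for illustration.

```python
def scaled_widths(shapes, height=1000):
    """Widths of images of the given (H, W) shapes after scaling to `height`."""
    widths = []
    for img_height, img_width in shapes:
        if img_height != height:
            # Preserve the aspect ratio, truncating to an integer like the source does
            widths.append(int(height / img_height * img_width))
        else:
            widths.append(img_width)
    return widths


widths = scaled_widths([(500, 400), (250, 100), (1000, 300)])
canvas_width = sum(widths)
```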

vid2gif

vid2gif(path: str, file_name: str = 'tmp', quality: int = 100, fps: int = 15, speed: float = 1.0) -> None

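Convert a video to a GIF

Parameters:

Name Type Description Default
path str

Path to the video to convert to a GIF

required
file_name str

Name of the output GIF file

'tmp'
quality int

Quality of the output GIF (percentage by which each frame's resolution is scaled)

100
fps int

FPS (frames per second) of the output GIF

15
speed float

Playback speed multiplier of the output GIF

1.0

Returns:

Type Description
None

The GIF is saved directly to the current directory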

Examples:

>>> zz.vision.vid2gif("test.mp4")

Video to GIF conversion example

Source code in zerohertzLib/vision/gif.py
def vid2gif(
    path: str,
    file_name: str = "tmp",
    quality: int = 100,
    fps: int = 15,
    speed: float = 1.0,
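) -> None:
    """Convert a video to a GIF

    Args:
        path: Path to the video to convert to a GIF
        file_name: Name of the output GIF file
        quality: Quality of the output GIF (percentage by which each frame's resolution is scaled)
        fps: FPS (frames per second) of the output GIF
        speed: Playback speed multiplier of the output GIF

    Returns:
        The GIF is saved directly to the current directory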

    Examples:
        >>> zz.vision.vid2gif("test.mp4")

        ![Video to GIF conversion example](../../../assets/vision/vid2gif.gif){ width="300" }
    """
    frames = []
    cap = cv2.VideoCapture(path)
    original_fps = round(cap.get(cv2.CAP_PROP_FPS))
    fps = min(original_fps, fps)
    frame_count_speed = frame_count_fps = 0
    while True:
        ret, frame = cap.read()
        if not ret:
            break
        frame_count_speed += 1
        if round(frame_count_speed % speed) != 0:
            continue
        if frame_count_fps % (int(original_fps / fps)) == 0:
            frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            pil_img = Image.fromarray(frame_rgb)
            width, height = pil_img.size
            new_width = int(width * quality / 100)
            new_height = int(height * quality / 100)
            resized_img = pil_img.resize((new_width, new_height), Image.LANCZOS)
            frames.append(resized_img)
        frame_count_fps += 1
    cap.release()
    duration = int(1000 / fps)
    _create_gif_from_frames(frames, file_name, duration)
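
To see which frames survive the `speed` and `fps` filters, the loop above can be replayed on frame indices alone. The sketch below mirrors the counters from the source but reads no video; `sampled_frame_indices` is a hypothetical name.

```python
def sampled_frame_indices(n_frames, original_fps, fps=15, speed=1.0):
    """Return (kept frame indices, per-frame duration in ms) for the sampling loop."""
    fps = min(original_fps, fps)
    step = int(original_fps / fps)
    kept = []
    frame_count_speed = frame_count_fps = 0
    for index in range(n_frames):
        frame_count_speed += 1
        # Speed filter: skip frames that do not land on a multiple of `speed`
        if round(frame_count_speed % speed) != 0:
            continue
        # FPS filter: keep every `step`-th frame among the remaining ones
        if frame_count_fps % step == 0:
            kept.append(index)
        frame_count_fps += 1
    return kept, int(1000 / fps)


normal = sampled_frame_indices(6, original_fps=30)
doubled = sampled_frame_indices(8, original_fps=30, speed=2.0)
```

With a 30 FPS input and the default 15 FPS output, every second frame is kept; doubling `speed` halves the candidate frames before the FPS filter is applied.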

xyxy2cwh

xyxy2cwh(box: list[int | float] | NDArray[DTypeLike]) -> NDArray[DTypeLike]

Bbox conversion

Parameters:

Name Type Description Default
box list[int | float] | NDArray[DTypeLike]

Bbox in the form [x0, y0, x1, y1] ([4] or [N, 4])

required

Returns:

Type Description
NDArray[DTypeLike]

Bbox in the form [cx, cy, w, h] ([4] or [N, 4])

Examples:

>>> zz.vision.xyxy2cwh([10, 20, 30, 40])
array([20, 30, 20, 20])
>>> zz.vision.xyxy2cwh(np.array([[10, 20, 30, 40], [30, 50, 70, 100]]))
array([[20, 30, 20, 20],
       [50, 75, 40, 50]])
Source code in zerohertzLib/vision/convert.py
def xyxy2cwh(
    box: list[int | float] | NDArray[DTypeLike],
) -> NDArray[DTypeLike]:
    """Bbox conversion

    Args:
        box: Bbox in the form `[x0, y0, x1, y1]` (`[4]` or `[N, 4]`)

    Returns:
        Bbox in the form `[cx, cy, w, h]` (`[4]` or `[N, 4]`)

    Examples:
        >>> zz.vision.xyxy2cwh([10, 20, 30, 40])
        array([20, 30, 20, 20])
        >>> zz.vision.xyxy2cwh(np.array([[10, 20, 30, 40], [30, 50, 70, 100]]))
        array([[20, 30, 20, 20],
               [50, 75, 40, 50]])
    """
    box = _list2np(box)
    shape = box.shape
    multi, poly = _is_bbox(shape)
    if poly:
        raise ValueError("The 'xyxy' must be of shape [4], [N, 4]")
    if multi:
        boxes = np.zeros((shape[0], 4), dtype=box.dtype)
        for i, box_ in enumerate(box):
            boxes[i] = _xyxy2cwh(box_)
        return boxes
    return _xyxy2cwh(box)
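
The arithmetic behind this conversion is the midpoint and the difference of the two corners. A vectorized sketch that covers both `[4]` and `[N, 4]` inputs is given below; `xyxy_to_cwh` is a hypothetical name, and it returns floats rather than preserving the input dtype.

```python
import numpy as np


def xyxy_to_cwh(box):
    """Convert [x0, y0, x1, y1] boxes to [cx, cy, w, h] (illustrative only)."""
    box = np.asarray(box, dtype=float)
    lo, hi = box[..., :2], box[..., 2:]
    # Center is the midpoint of the corners; size is their difference
    return np.concatenate([(lo + hi) / 2, hi - lo], axis=-1)


single = xyxy_to_cwh([10, 20, 30, 40])
batch = xyxy_to_cwh([[10, 20, 30, 40], [30, 50, 70, 100]])
```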

xyxy2poly

xyxy2poly(box: list[int | float] | NDArray[DTypeLike]) -> NDArray[DTypeLike]

Bbox conversion

Parameters:

Name Type Description Default
box list[int | float] | NDArray[DTypeLike]

Bbox in the form [x0, y0, x1, y1] ([4] or [N, 4])

required

Returns:

Type Description
NDArray[DTypeLike]

Bbox in the form [[x0, y0], [x1, y1], [x2, y2], [x3, y3]] ([4, 2] or [N, 4, 2])

Examples:

>>> zz.vision.xyxy2poly([10, 20, 30, 40])
array([[10, 20],
       [30, 20],
       [30, 40],
       [10, 40]])
>>> zz.vision.xyxy2poly(np.array([[10, 20, 30, 40], [30, 50, 70, 100]]))
array([[[ 10,  20],
        [ 30,  20],
        [ 30,  40],
        [ 10,  40]],
       [[ 30,  50],
        [ 70,  50],
        [ 70, 100],
        [ 30, 100]]])
Source code in zerohertzLib/vision/convert.py
def xyxy2poly(
    box: list[int | float] | NDArray[DTypeLike],
) -> NDArray[DTypeLike]:
    """Bbox conversion

    Args:
        box: Bbox in the form `[x0, y0, x1, y1]` (`[4]` or `[N, 4]`)

    Returns:
        Bbox in the form `[[x0, y0], [x1, y1], [x2, y2], [x3, y3]]` (`[4, 2]` or `[N, 4, 2]`)

    Examples:
        >>> zz.vision.xyxy2poly([10, 20, 30, 40])
        array([[10, 20],
               [30, 20],
               [30, 40],
               [10, 40]])
        >>> zz.vision.xyxy2poly(np.array([[10, 20, 30, 40], [30, 50, 70, 100]]))
        array([[[ 10,  20],
                [ 30,  20],
                [ 30,  40],
                [ 10,  40]],
               [[ 30,  50],
                [ 70,  50],
                [ 70, 100],
                [ 30, 100]]])
    """
    box = _list2np(box)
    shape = box.shape
    multi, poly = _is_bbox(shape)
    if poly:
        raise ValueError("The 'xyxy' must be of shape [4], [N, 4]")
    if multi:
        boxes = np.zeros((shape[0], 4, 2), dtype=box.dtype)
        for i, box_ in enumerate(box):
            boxes[i] = _xyxy2poly(box_)
        return boxes
    return _xyxy2poly(box)
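
Judging by the example output, the corners come out in the order top-left, top-right, bottom-right, bottom-left. Assuming that order, the conversion can be sketched as a single fancy-indexing step; `xyxy_to_poly` is a hypothetical name and not the library's `_xyxy2poly`.

```python
import numpy as np


def xyxy_to_poly(box):
    """Convert [x0, y0, x1, y1] boxes to four-corner polygons (illustrative only)."""
    box = np.asarray(box)
    # Pick (x0, y0), (x1, y0), (x1, y1), (x0, y1) from the last axis
    corners = box[..., [0, 1, 2, 1, 2, 3, 0, 3]]
    return corners.reshape(*box.shape[:-1], 4, 2)


single = xyxy_to_poly([10, 20, 30, 40])
batch = xyxy_to_poly([[10, 20, 30, 40], [30, 50, 70, 100]])
```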