zerohertzLib.vision.transform

Functions:

Name Description
cutout

Make everything in the image outside the specified coordinates transparent

pad

Resize and pad the input image to the desired shape

transparent

Make pixels of the input image that fall below the threshold transparent

cutout

cutout(img: NDArray[uint8], poly: list[int | float] | NDArray[DTypeLike], alpha: int = 255, crop: bool = True, background: int = 0) -> NDArray[uint8]

Make everything in the image outside the specified coordinates transparent

Parameters:

Name Type Description Default
img NDArray[uint8]

Input image ([H, W, C])

required
poly list[int | float] | NDArray[DTypeLike]

Coordinates of the region to keep ([N, 2])

required
alpha int

Alpha value of the specified region (0 = transparent, 255 = opaque)

255
crop bool

Whether to crop the output image

True
background int

Alpha value of the background outside the specified region

0

Returns:

Type Description
NDArray[uint8]

Output image ([H, W, 4])

Examples:

>>> poly = np.array([[100, 400], [400, 400], [800, 900], [400, 1100], [100, 800]])
>>> res1 = zz.vision.cutout(img, poly)
>>> res2 = zz.vision.cutout(img, poly, 128, False)
>>> res3 = zz.vision.cutout(img, poly, background=128)

Image cutout example

Source code in zerohertzLib/vision/transform.py
def cutout(
    img: NDArray[np.uint8],
    poly: list[int | float] | NDArray[DTypeLike],
    alpha: int = 255,
    crop: bool = True,
    background: int = 0,
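) -> NDArray[np.uint8]:
    """Make everything in the image outside the specified coordinates transparent

    Args:
        img: Input image (`[H, W, C]`)
        poly: Coordinates of the region to keep (`[N, 2]`)
        alpha: Alpha value of the specified region (0 = transparent, 255 = opaque)
        crop: Whether to crop the output image
        background: Alpha value of the background outside the specified region

    Returns:
        Output image (`[H, W, 4]`)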

    Examples:
        >>> poly = np.array([[100, 400], [400, 400], [800, 900], [400, 1100], [100, 800]])
        >>> res1 = zz.vision.cutout(img, poly)
        >>> res2 = zz.vision.cutout(img, poly, 128, False)
        >>> res3 = zz.vision.cutout(img, poly, background=128)

        ![Image cutout example](../../../assets/vision/cutout.png){ width="600" }
    """
    shape = img.shape[:2]
    poly = _list2np(poly)
    poly = poly.astype(np.int32)
    x_0, x_1 = poly[:, 0].min(), poly[:, 0].max()
    y_0, y_1 = poly[:, 1].min(), poly[:, 1].max()
    mask = poly2mask(poly, shape)
    # Build the alpha channel in one step: `alpha` inside the polygon,
    # `background` outside. This also stays correct when `background == 1`.
    mask = np.where(mask, alpha, background).astype(np.uint8)
    img = Image.fromarray(img)
    mask = Image.fromarray(mask)
    img.putalpha(mask)
    if crop:
        return np.array(img)[y_0:y_1, x_0:x_1, :]
    return np.array(img)
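
The alpha-channel step of `cutout` can be illustrated without the library. The following is a minimal sketch, not the library's implementation: `build_alpha` and `attach_alpha` are hypothetical helper names, and a hand-made boolean mask stands in for the output of `poly2mask`.

```python
import numpy as np


def build_alpha(mask, alpha=255, background=0):
    """Return a uint8 alpha channel: `alpha` where mask is True, else `background`."""
    return np.where(mask, alpha, background).astype(np.uint8)


def attach_alpha(img, alpha_channel):
    """Stack a [H, W] alpha channel onto the first three channels of a [H, W, C] image."""
    return np.dstack([img[:, :, :3], alpha_channel])


# A 4x4 gray image with a 2x2 region of interest in the middle.
img = np.full((4, 4, 3), 200, dtype=np.uint8)
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True

rgba = attach_alpha(img, build_alpha(mask, alpha=255, background=128))
```

Here the region of interest stays fully opaque while the rest of the image becomes half transparent, which is the effect of passing `background=128` in the examples above.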

pad

pad(img: NDArray[uint8], shape: tuple[int, int], color: tuple[int, int, int] = (255, 255, 255), poly: NDArray[DTypeLike] | None = None) -> tuple[NDArray[uint8], tuple[float, int, int] | NDArray[DTypeLike]]

Resize and pad the input image to the desired shape

Parameters:

Name Type Description Default
img NDArray[uint8]

Input image ([H, W, C])

required
shape tuple[int, int]

Output shape (H, W)

required
color tuple[int, int, int]

Padding color

(255, 255, 255)
poly NDArray[DTypeLike] | None

Coordinates to transform along with the padding ([N, 2])

None

Returns:

Type Description
tuple[NDArray[uint8], tuple[float, int, int] | NDArray[DTypeLike]]

Output image ([H, W, C]) and either the padding information or the transformed coordinates

Note

If poly is not given, (ratio, left, top) is returned instead, so coordinates can be transformed later as poly * ratio + (left, top)
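
The transformation in the note can be applied, and undone, with plain NumPy. This is a minimal sketch: `apply_pad_info` and `undo_pad_info` are hypothetical helper names, and the `info` tuple is a hand-written value matching a 2000 x 1000 (W x H) image padded to 1000 x 1000, not output captured from the library.

```python
import numpy as np


def apply_pad_info(poly, info):
    """Map coordinates from the original image into the padded image."""
    ratio, left, top = info
    return np.asarray(poly, dtype=float) * ratio + np.array([left, top])


def undo_pad_info(poly, info):
    """Map coordinates from the padded image back into the original image."""
    ratio, left, top = info
    return (np.asarray(poly, dtype=float) - np.array([left, top])) / ratio


info = (0.5, 0, 250)  # assumed (ratio, left, top)
poly = np.array([[100, 400], [400, 400], [800, 900]])

padded = apply_pad_info(poly, info)
restored = undo_pad_info(padded, info)
```

The inverse is useful when a model predicts coordinates on the padded image and they need to be mapped back to the original resolution.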

Examples:

GRAY:

>>> img = cv2.cvtColor(img, cv2.COLOR_BGRA2GRAY)
>>> res1 = cv2.resize(img, (500, 1000))
>>> res1, _ = zz.vision.pad(res1, (1000, 1000), color=(0, 255, 0))

BGR:

>>> res2 = cv2.resize(img, (1000, 500))
>>> res2, _ = zz.vision.pad(res2, (1000, 1000))

BGRA:

>>> img = cv2.cvtColor(img, cv2.COLOR_BGR2BGRA)
>>> res3 = cv2.resize(img, (500, 1000))
>>> res3, _ = zz.vision.pad(res3, (1000, 1000), color=(0, 0, 255, 128))

Poly:

>>> poly = np.array([[100, 400], [400, 400], [800, 900], [400, 1100], [100, 800]])
>>> res4 = cv2.resize(img, (2000, 1000))
>>> res4 = zz.vision.bbox(res4, poly, color=(255, 0, 0), thickness=20)
>>> res4, poly = zz.vision.pad(res4, (1000, 1000), poly=poly)
>>> res4 = zz.vision.bbox(res4, poly, color=(0, 0, 255))

Transformation:

>>> poly = np.array([[100, 400], [400, 400], [800, 900], [400, 1100], [100, 800]])
>>> res5 = cv2.resize(img, (2000, 1000))
>>> res5 = zz.vision.bbox(res5, poly, color=(255, 0, 0), thickness=20)
>>> res5, info = zz.vision.pad(res5, (1000, 1000), color=(128, 128, 128))
>>> poly = poly * info[0] + info[1:]
>>> res5 = zz.vision.bbox(res5, poly, color=(0, 0, 255))

Image padding example

Source code in zerohertzLib/vision/transform.py
def pad(
    img: NDArray[np.uint8],
    shape: tuple[int, int],
    color: tuple[int, int, int] = (255, 255, 255),
    poly: NDArray[DTypeLike] | None = None,
) -> tuple[NDArray[np.uint8], tuple[float, int, int] | NDArray[DTypeLike]]:
    """Resize and pad the input image to the desired shape

    Args:
        img: Input image (`[H, W, C]`)
        shape: Output shape `(H, W)`
        color: Padding color
        poly: Coordinates to transform along with the padding (`[N, 2]`)

    Returns:
        Output image (`[H, W, C]`) and either the padding information or the transformed coordinates

    Note:
        If `poly` is not given, `(ratio, left, top)` is returned instead, so coordinates can be transformed later as `poly * ratio + (left, top)`

    Examples:
        GRAY:
            ```python
            >>> img = cv2.cvtColor(img, cv2.COLOR_BGRA2GRAY)
            >>> res1 = cv2.resize(img, (500, 1000))
            >>> res1, _ = zz.vision.pad(res1, (1000, 1000), color=(0, 255, 0))
            ```
        BGR:
            ```python
            >>> res2 = cv2.resize(img, (1000, 500))
            >>> res2, _ = zz.vision.pad(res2, (1000, 1000))
            ```
        BGRA:
            ```python
            >>> img = cv2.cvtColor(img, cv2.COLOR_BGR2BGRA)
            >>> res3 = cv2.resize(img, (500, 1000))
            >>> res3, _ = zz.vision.pad(res3, (1000, 1000), color=(0, 0, 255, 128))
            ```
        Poly:
            ```python
            >>> poly = np.array([[100, 400], [400, 400], [800, 900], [400, 1100], [100, 800]])
            >>> res4 = cv2.resize(img, (2000, 1000))
            >>> res4 = zz.vision.bbox(res4, poly, color=(255, 0, 0), thickness=20)
            >>> res4, poly = zz.vision.pad(res4, (1000, 1000), poly=poly)
            >>> res4 = zz.vision.bbox(res4, poly, color=(0, 0, 255))
            ```
        Transformation:
            ```python
            >>> poly = np.array([[100, 400], [400, 400], [800, 900], [400, 1100], [100, 800]])
            >>> res5 = cv2.resize(img, (2000, 1000))
            >>> res5 = zz.vision.bbox(res5, poly, color=(255, 0, 0), thickness=20)
            >>> res5, info = zz.vision.pad(res5, (1000, 1000), color=(128, 128, 128))
            >>> poly = poly * info[0] + info[1:]
            >>> res5 = zz.vision.bbox(res5, poly, color=(0, 0, 255))
            ```

        ![Image padding example](../../../assets/vision/pad.png){ width="700" }
    """
    if len(img.shape) == 2:
        img = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
    if img.shape[2] == 4 and len(color) == 3:
        color = [*color, 255]
    img_height, img_width = img.shape[:2]
    tar_height, tar_width = shape
    if img_width / img_height > tar_width / tar_height:
        ratio = tar_width / img_width
        resize_width, resize_height = tar_width, int(img_height * ratio)
    elif img_width / img_height < tar_width / tar_height:
        ratio = tar_height / img_height
        resize_width, resize_height = int(img_width * ratio), tar_height
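    else:
        # Equal aspect ratios: the scale is still target / source, not 1,
        # otherwise the returned ratio is wrong whenever the sizes differ.
        ratio = tar_width / img_width
        resize_width, resize_height = tar_width, tar_height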
    img = cv2.resize(img, (resize_width, resize_height), interpolation=cv2.INTER_LINEAR)
    top, bottom = (
        (tar_height - resize_height) // 2,
        (tar_height - resize_height) // 2 + (tar_height - resize_height) % 2,
    )
    left, right = (
        (tar_width - resize_width) // 2,
        (tar_width - resize_width) // 2 + (tar_width - resize_width) % 2,
    )
    img = cv2.copyMakeBorder(
        img, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color
    )
    if poly is None:
        return img, (ratio, left, top)
    return img, poly * ratio + (left, top)
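
The geometry that `pad` computes (scale factor, resized size, and border widths) can be checked on its own, without OpenCV. The sketch below is an illustration under assumptions, not the library's code: `letterbox_geometry` is a hypothetical name, and it always derives the ratio from the limiting dimension, including when the aspect ratios match.

```python
def letterbox_geometry(img_hw, target_hw):
    """Return (ratio, (new_w, new_h), (top, bottom, left, right)) for aspect-preserving padding."""
    img_h, img_w = img_hw
    tar_h, tar_w = target_hw
    # The smaller scale factor is the one that makes the image fit entirely.
    ratio = min(tar_w / img_w, tar_h / img_h)
    if tar_w / img_w <= tar_h / img_h:
        # Width is the limiting dimension.
        new_w, new_h = tar_w, int(img_h * ratio)
    else:
        # Height is the limiting dimension.
        new_w, new_h = int(img_w * ratio), tar_h
    pad_h, pad_w = tar_h - new_h, tar_w - new_w
    # Any odd leftover pixel goes to the bottom / right border.
    top, left = pad_h // 2, pad_w // 2
    bottom, right = pad_h - top, pad_w - left
    return ratio, (new_w, new_h), (top, bottom, left, right)


wide = letterbox_geometry((1000, 2000), (1000, 1000))
tall = letterbox_geometry((1000, 500), (1000, 1000))
odd = letterbox_geometry((3, 10), (10, 10))
```

For the wide input, the image is halved to 1000 x 500 and centered with 250 rows of padding above and below, which corresponds to `(ratio, left, top) = (0.5, 0, 250)`.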

transparent

transparent(img: NDArray[uint8], threshold: int = 128, reverse: bool = False) -> NDArray[uint8]

Make pixels of the input image that fall below the threshold transparent

Parameters:

Name Type Description Default
img NDArray[uint8]

Input image ([H, W, C])

required
threshold int

Threshold

128
reverse bool

Whether to make pixels at or above the threshold transparent instead

False

Returns:

Type Description
NDArray[uint8]

Output image ([H, W, 4])

Examples:

>>> res1 = zz.vision.transparent(img)
>>> res2 = zz.vision.transparent(img, reverse=True)

Transparent background example

Source code in zerohertzLib/vision/transform.py
def transparent(
    img: NDArray[np.uint8],
    threshold: int = 128,
    reverse: bool = False,
) -> NDArray[np.uint8]:
    """Make pixels of the input image that fall below `threshold` transparent

    Args:
        img: Input image (`[H, W, C]`)
        threshold: Threshold
        reverse: Whether to make pixels at or above `threshold` transparent instead

    Returns:
        Output image (`[H, W, 4]`)

    Examples:
        >>> res1 = zz.vision.transparent(img)
        >>> res2 = zz.vision.transparent(img, reverse=True)

        ![Transparent background example](../../../assets/vision/transparent.png){ width="600" }
    """
    img = img.copy()
    img = _cvt_bgra(img)
    img_alpha = img[:, :, 3]
    img_bin = threshold > cv2.cvtColor(img, cv2.COLOR_BGRA2GRAY)
    if reverse:
        img_alpha[~img_bin] = 0
    else:
        img_alpha[img_bin] = 0
    return img
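
The thresholding logic can be reproduced with NumPy alone. This is a rough sketch under assumptions: `transparent_sketch` is a hypothetical name, the input is taken to be a 3-channel BGR image, and the grayscale value is approximated with the standard luma weights (0.299 R + 0.587 G + 0.114 B) rather than by calling `cv2.cvtColor`, so results on borderline pixels may differ slightly from the library.

```python
import numpy as np


def transparent_sketch(img_bgr, threshold=128, reverse=False):
    """Return a BGRA copy whose alpha is 0 for pixels darker than `threshold`.

    With reverse=True, pixels at or above `threshold` are cleared instead.
    """
    b, g, r = (img_bgr[:, :, i].astype(np.float64) for i in range(3))
    gray = 0.114 * b + 0.587 * g + 0.299 * r
    below = gray < threshold
    alpha = np.full(gray.shape, 255, dtype=np.uint8)
    alpha[~below if reverse else below] = 0
    return np.dstack([img_bgr[:, :, :3], alpha])


# One black pixel and one white pixel.
img = np.zeros((1, 2, 3), dtype=np.uint8)
img[0, 1] = 255

out = transparent_sketch(img)
out_reversed = transparent_sketch(img, reverse=True)
```

With the default settings the black pixel becomes transparent and the white one stays opaque; `reverse=True` swaps the two.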