Dataset

class text_renderer.dataset.Dataset(data_dir: str, jpg_quality: int = 95)
read(name) → Dict
Parameters

name (str) – Sample name, e.g. 000000001

Returns

{
    "image": ndarray,
    "label": "label",
    "size": [int_width, int_height]
}

Return type

dict
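
As a rough illustration of the dictionary shape described above, the following sketch unpacks one sample. `describe_sample` is a hypothetical helper, not part of text_renderer, and the sample is built by hand (a nested list stands in for the ndarray) rather than read from a real dataset.

```python
from typing import Any, Dict


def describe_sample(sample: Dict[str, Any]) -> str:
    """Summarize a dict shaped like the documented return value of read()."""
    # The documented order of "size" is [width, height].
    width, height = sample["size"]
    return f"{sample['label']!r} ({width}x{height})"


# Hand-built stand-in for a real sample (assumption: values are illustrative).
sample = {
    "image": [[0] * 4 for _ in range(2)],  # 2 rows x 4 columns, in place of an ndarray
    "label": "test",
    "size": [4, 2],
}

summary = describe_sample(sample)
print(summary)
```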

class text_renderer.dataset.LmdbDataset(data_dir: str)

Save generated image into lmdb. Compatible with https://github.com/PaddlePaddle/PaddleOCR

Keys in lmdb:

  • image-000000001: image raw bytes

  • label-000000001: string

  • size-000000001: “width,height”
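
The key layout above can be sketched without an actual LMDB environment. In this minimal example a plain dict stands in for the database, and `make_keys` and `parse_size` are hypothetical helpers, not part of text_renderer. It assumes keys and string values are UTF-8 encoded bytes, which the listing above does not state.

```python
from typing import Dict, Tuple


def make_keys(name: str) -> Tuple[bytes, bytes, bytes]:
    """Build the image, label, and size keys for one sample name."""
    return (
        f"image-{name}".encode("utf-8"),
        f"label-{name}".encode("utf-8"),
        f"size-{name}".encode("utf-8"),
    )


def parse_size(raw: bytes) -> Tuple[int, int]:
    """Parse a "width,height" value into a pair of ints."""
    width, height = raw.decode("utf-8").split(",")
    return int(width), int(height)


# A dict standing in for the LMDB store (assumption: illustration only).
store: Dict[bytes, bytes] = {}
image_key, label_key, size_key = make_keys("000000001")
store[image_key] = b"placeholder-image-bytes"  # real entries hold raw image bytes
store[label_key] = "test".encode("utf-8")
store[size_key] = b"120,32"

width, height = parse_size(store[size_key])
print(store[label_key].decode("utf-8"), width, height)
```

With a real LMDB file the same keys would be looked up inside a read transaction instead of a dict.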

class text_renderer.dataset.ImgDataset(data_dir: str)

Save generated images as jpg files, and save labels and metadata in a JSON file.

JSON file format:

{
     "labels": {
        "000000000": "test",
        "000000001": "text2"
     },
     "sizes": {
        "000000000": [width, height],
        "000000001": [width, height]
     },
     "num-samples": 2
}
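
A file in this format can be consumed with the standard `json` module. The sketch below parses an inline string instead of a file on disk, since the file name that ImgDataset uses is not given here. The concrete sizes are made-up values for illustration.

```python
import json

# Inline stand-in for the JSON file contents (sizes are illustrative values).
raw = """
{
    "labels": {"000000000": "test", "000000001": "text2"},
    "sizes": {"000000000": [120, 32], "000000001": [96, 32]},
    "num-samples": 2
}
"""

meta = json.loads(raw)

# Pair each sample name with its label and (width, height).
samples = []
for name, label in meta["labels"].items():
    width, height = meta["sizes"][name]
    samples.append((name, label, (width, height)))

# Sanity check: the declared count should match the number of labels.
if meta["num-samples"] != len(samples):
    raise ValueError("num-samples does not match the number of labels")

for name, label, size in samples:
    print(name, label, size)
```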