improutils.recognition package

Submodules

improutils.recognition.image_features module

class improutils.recognition.image_features.ShapeDescriptors[source]

Bases: object

An internal class for computing shape descriptors. Not intended for direct use by library users.

aspect_ratio(max_diameter)[source]: Compute the aspect ratio of a shape as the ratio of minimum to maximum diameter.

compactness(max_diameter)[source]: Compute the compactness of a shape based on area and maximum diameter.

convexity(convex_perimeter)[source]: Compute the convexity of a shape as the ratio of convex perimeter to actual perimeter.

extent(bounding_rectangle_area)[source]: Compute the extent of a shape as the ratio of area to bounding rectangle area.

form_factor(perimeter)[source]: Compute the form factor of a shape based on area and perimeter.

roundness(max_diameter)[source]: Compute the roundness of a shape based on area and maximum diameter.

solidity(convex_area)[source]: Compute the solidity of a shape as the ratio of area to its convex hull area.
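
The descriptions above name the inputs of each descriptor but not the exact expressions. As a rough sketch, the following uses common textbook definitions. The ratio-style descriptors (aspect ratio, convexity, extent, solidity) follow the wording above directly, while the formulas for form factor, roundness and compactness are assumptions and may differ from what ShapeDescriptors actually computes. The two-argument signatures are illustrative as well.

```python
import math


def form_factor(area, perimeter):
    # 4*pi*A / P^2: equals 1 for a circle, smaller for ragged outlines (assumed formula)
    return 4 * math.pi * area / perimeter ** 2


def roundness(area, max_diameter):
    # 4*A / (pi * d_max^2): area relative to the circle spanned by d_max (assumed formula)
    return 4 * area / (math.pi * max_diameter ** 2)


def compactness(area, max_diameter):
    # Diameter of the equal-area circle divided by the maximum diameter (assumed formula)
    return math.sqrt(4 * area / math.pi) / max_diameter


def aspect_ratio(min_diameter, max_diameter):
    return min_diameter / max_diameter


def convexity(perimeter, convex_perimeter):
    return convex_perimeter / perimeter


def solidity(area, convex_area):
    return area / convex_area


def extent(area, bounding_rectangle_area):
    return area / bounding_rectangle_area


# A circle of radius 5 scores 1.0 (up to rounding) on the first three descriptors.
r = 5.0
circle_area = math.pi * r ** 2
circle_perimeter = 2 * math.pi * r
print(form_factor(circle_area, circle_perimeter))
print(roundness(circle_area, 2 * r))
print(compactness(circle_area, 2 * r))
```

Under these definitions every descriptor is dimensionless, so the values do not change when the shape is scaled.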

improutils.recognition.image_features.aspect_ratio(contour)[source]

Determine the contour’s aspect ratio.

Czech term: “poměr stran”.

Parameters: contour (ndarray) – The contour to evaluate.
Returns: The aspect ratio of the contour, as a number.

improutils.recognition.image_features.compactness(contour)[source]

Determine the contour’s compactness.

Czech terms: “kompaktnost”, “hutnost”.

Parameters: contour (ndarray) – The contour to evaluate.
Returns: The compactness of the contour, as a number.

improutils.recognition.image_features.convexity(contour)[source]

Determine the contour’s convexity.

Czech terms: “konvexita”, “vypouklost”.

Parameters: contour (ndarray) – The contour to evaluate.
Returns: The convexity of the contour, as a number.

improutils.recognition.image_features.extent(contour)[source]

Determine the contour’s extent.

Czech terms: “dosah”, “rozměrnost”.

Parameters: contour (ndarray) – The contour to evaluate.
Returns: The extent of the contour, as a number.

improutils.recognition.image_features.form_factor(contour)[source]

Determine the contour’s form factor.

Czech term: “špičatost”.

Parameters: contour (ndarray) – The contour to evaluate.
Returns: The form factor of the contour, as a number.

improutils.recognition.image_features.roundness(contour)[source]

Determine the contour’s roundness.

Czech term: “kulatost”.

Parameters: contour (ndarray) – The contour to evaluate.
Returns: The roundness of the contour, as a number.

improutils.recognition.image_features.solidity(contour)[source]

Determine the contour’s solidity.

Czech terms: “plnost”, “celistvost”.

Parameters: contour (ndarray) – The contour to evaluate.
Returns: The solidity of the contour, as a number.
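
Each function above takes only a contour, so the geometric inputs (area, perimeter, maximum diameter, bounding rectangle) have to be measured from its points first. The source does not say how improutils does this; the following is a library-free sketch of one way to obtain them from a list of (x, y) points. All helper names are hypothetical. If the contour follows the OpenCV convention of shape (N, 1, 2), which is an assumption here, `contour.reshape(-1, 2).tolist()` would yield such a point list.

```python
import math


def polygon_area(points):
    # Shoelace formula over a closed polygon given as (x, y) pairs
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


def polygon_perimeter(points):
    # Sum of edge lengths, including the closing edge back to the first point
    return sum(math.dist(p, q) for p, q in zip(points, points[1:] + points[:1]))


def max_diameter(points):
    # Largest distance between any two contour points (O(n^2), fine for a sketch)
    return max(math.dist(p, q) for p in points for q in points)


def bounding_rectangle_area(points):
    # Axis-aligned bounding rectangle of the points
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))


square = [(0, 0), (10, 0), (10, 10), (0, 10)]
area = polygon_area(square)
print(area / bounding_rectangle_area(square))                # extent of the square
print(4 * math.pi * area / polygon_perimeter(square) ** 2)   # form factor (assumed formula)
```

An axis-aligned square fills its bounding rectangle completely, while its form factor under the assumed formula is below 1 because a square has more perimeter per unit area than a circle.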

improutils.recognition.ocr module

improutils.recognition.ocr.ocr(img_bin, config='', lang=None)[source]

Detect text in the image.

Parameters:

img_bin (ndarray) – Input binary image with white objects on a black background.
config (str) – Tesseract configuration string; see https://tesseract-ocr.github.io/tessdoc/ImproveQuality.html and https://muthu.co/all-tesseract-ocr-options/ for the available options. Defaults to ‘’ (an empty string).
lang (str | None) – Language code, e.g. eng for English or ces for Czech. For the list of language codes, see: https://tesseract-ocr.github.io/tessdoc/Data-Files-in-different-versions.html. The selected language must be installed with sudo apt-get install tesseract-ocr-langcode, where langcode is the language code. English is installed by default. Defaults to None.

Returns:

The text recognized in the image.
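
The config parameter is a plain string of Tesseract command-line options. As a small illustration, the hypothetical helper below (not part of improutils) assembles such a string from three commonly used options: the page segmentation mode, the OCR engine mode and a character whitelist. Which options actually help depends on the image, so treat the values as placeholders.

```python
def build_tesseract_config(psm=None, oem=None, whitelist=None):
    """Assemble a Tesseract option string (hypothetical helper, not part of improutils)."""
    parts = []
    if psm is not None:
        parts.append(f"--psm {psm}")   # page segmentation mode
    if oem is not None:
        parts.append(f"--oem {oem}")   # OCR engine mode
    if whitelist:
        parts.append(f"-c tessedit_char_whitelist={whitelist}")
    return " ".join(parts)


config = build_tesseract_config(psm=7, oem=3, whitelist="0123456789")
print(config)  # --psm 7 --oem 3 -c tessedit_char_whitelist=0123456789

# With improutils installed, the string would be passed on like this:
# from improutils.recognition.ocr import ocr
# text = ocr(img_bin, config=config, lang="eng")
```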

improutils.recognition.qr module

improutils.recognition.qr.qr_decode(img, detection_result, is_bgr=True, reader=<qreader.QReader object>)[source]

Decode a single QR code in the given image, described by a detection_result.

For further info refer to: https://github.com/Eric-Canas/qreader

Internally, this function runs the pyzbar decoder and uses the information in detection_result to apply image preprocessing techniques that substantially increase the decoding rate.

Parameters:

img (np.ndarray) – The image to be read. It is expected to be RGB or BGR (uint8). Format (HxWx3). Can also be grayscale (HxW), in which case it will be converted to BGR.
detection_result (dict[str, np.ndarray|float|tuple[float|int, float|int]]) – One of the detection dicts returned by qr_detect(). Note that qr_detect() returns a tuple of these dicts, whereas this function expects just one of them.
is_bgr (bool) – If True, the received image is expected to be BGR instead of RGB. Defaults to True.
reader (QReader) – An initialized QReader instance. Call qr_init_reader(model_size, min_confidence, reencode_to) to create one with different parameters. Defaults to qr_init_reader().

Returns:

decoded_strings – The decoded content of the QR code, or None if it cannot be read.

Return type:

str | None
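
Because qr_decode() expects a single detection dict while qr_detect() returns a tuple of them, the caller has to pick one. A minimal sketch of one possible policy, choosing the detection with the highest ‘confidence’, is shown below. The helper name and the stand-in data are hypothetical; real detection dicts carry more keys.

```python
def best_detection(detections):
    """Return the detection dict with the highest 'confidence', or None if empty."""
    if not detections:
        return None
    return max(detections, key=lambda d: d["confidence"])


# Stand-in detections; real ones come from qr_detect() and carry more keys.
detections = (
    {"confidence": 0.62, "cxcy": (40.0, 55.0)},
    {"confidence": 0.91, "cxcy": (210.0, 80.0)},
)
chosen = best_detection(detections)
print(chosen["cxcy"])

# With improutils installed, the chosen dict would then be decoded:
# text = qr_decode(img, chosen)
```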

improutils.recognition.qr.qr_detect(img, is_bgr=True, reader=<qreader.QReader object>)[source]

Detect QR codes in the image and return a tuple of dictionaries with all the detection information.

For further info refer to: https://github.com/Eric-Canas/qreader

Parameters:

img (np.ndarray) – The image to be read. It is expected to be RGB or BGR (uint8). Format (HxWx3).
is_bgr (bool) – If True, the received image is expected to be BGR instead of RGB. Defaults to True.
reader (QReader) – An initialized QReader instance. Call qr_init_reader(model_size, min_confidence, reencode_to) to create one with different parameters. Defaults to qr_init_reader().

Returns:

detections –

A tuple of dictionaries, each containing the following keys:

‘confidence’: float. The confidence of the detection.
‘bbox_xyxy’: np.ndarray. The bounding box of the detection in the format [x1, y1, x2, y2].
‘cxcy’: tuple[float, float]. The center of the bounding box in the format (x, y).
‘wh’: tuple[float, float]. The width and height of the bounding box in the format (w, h).
‘polygon_xy’: np.ndarray. The accurate polygon that surrounds the QR code, with shape (N, 2).
‘quad_xy’: np.ndarray. The quadrilateral that surrounds the QR code, with shape (4, 2). Fitted from the polygon.
‘padded_quad_xy’: np.ndarray. An expanded version of quad_xy, with shape (4, 2), that always includes all the points within polygon_xy.

All these keys (except ‘confidence’) also have a normalized version with an ‘n’ suffix. For example, ‘bbox_xyxy’ is the bounding box in absolute coordinates, while ‘bbox_xyxyn’ is the bounding box in normalized coordinates (from 0.0 to 1.0).

Return type:

tuple[dict[str, np.ndarray|float|tuple[float|int, float|int]]]
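
To make the relationship between the keys concrete, the sketch below derives the center, the size and a normalized box from a ‘bbox_xyxy’ value. It assumes that normalization means dividing x coordinates by the image width and y coordinates by the image height, which the text above does not state explicitly. Both helper names are hypothetical.

```python
def normalize_bbox(bbox_xyxy, image_width, image_height):
    """Scale [x1, y1, x2, y2] pixel coordinates into the 0..1 range (assumed convention)."""
    x1, y1, x2, y2 = bbox_xyxy
    return [x1 / image_width, y1 / image_height,
            x2 / image_width, y2 / image_height]


def bbox_to_cxcy_wh(bbox_xyxy):
    """Derive the box center and size from its corner coordinates."""
    x1, y1, x2, y2 = bbox_xyxy
    return ((x1 + x2) / 2, (y1 + y2) / 2), (x2 - x1, y2 - y1)


bbox = [160, 120, 480, 360]              # stand-in for detection['bbox_xyxy']
print(normalize_bbox(bbox, 640, 480))    # [0.25, 0.25, 0.75, 0.75]
print(bbox_to_cxcy_wh(bbox))             # ((320.0, 240.0), (320, 240))
```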

improutils.recognition.qr.qr_detect_and_decode(img, return_detections=False, is_bgr=True, reader=<qreader.QReader object>)[source]

Detect and decode QR codes in the given image and return the decoded strings (or None for any code that was detected but could not be decoded).

For further info refer to: https://github.com/Eric-Canas/qreader

Parameters:

img (np.ndarray) – The image to be read. It is expected to be RGB or BGR (uint8). Format (HxWx3). Can also be grayscale (HxW), in which case it will be converted to BGR.
return_detections (bool) – If True, it will return the full detection results together with the decoded QRs. If False, it will return only the decoded content of the QR codes. Defaults to False.
is_bgr (bool) – If True, the received image is expected to be BGR instead of RGB. Defaults to True.
reader (QReader) – An initialized QReader instance. Call qr_init_reader(model_size, min_confidence, reencode_to) to create one with different parameters. Defaults to qr_init_reader().

Returns:

detections – If return_detections is True, a tuple of tuples. Each inner tuple contains the detection result (a dictionary with the keys ‘confidence’, ‘bbox_xyxy’, ‘polygon_xy’…) and the decoded QR code (or None if it cannot be decoded). If return_detections is False, a tuple of strings with the decoded QR codes (with None in place of any code that cannot be decoded).

Return type:

tuple[tuple[dict, str | None], ...] | tuple[str | None, ...]
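
Since the result can contain None entries in either form, callers usually filter them out before use. The sketch below shows this for both return shapes, using stand-in data in place of real calls so it runs without improutils. The helper name is hypothetical.

```python
def decoded_only(results):
    """Keep (detection, text) pairs whose text was actually decoded."""
    return [(det, text) for det, text in results if text is not None]


# Stand-in for qr_detect_and_decode(img, return_detections=True)
results = (
    ({"confidence": 0.93}, "https://example.com"),
    ({"confidence": 0.58}, None),   # detected but not decoded
)
for det, text in decoded_only(results):
    print(f"{det['confidence']:.2f} {text}")

# Stand-in for the default return_detections=False form
texts = ("https://example.com", None)
readable = [t for t in texts if t is not None]
```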

improutils.recognition.qr.qr_init_reader(model_size='s', min_confidence=0.5, reencode_to='shift_jis')[source]

Initialize a QR code reader.

If you want to detect or decode QR codes on multiple occasions, it is recommended to initialize the reader once and then pass it to the functions. For further info refer to: https://github.com/Eric-Canas/qreader

Parameters:

model_size (str) – The size of the model to use. It can be ‘n’ (nano), ‘s’ (small), ‘m’ (medium) or ‘l’ (large). Larger models are more accurate but slower. Defaults to ‘s’.
min_confidence (float) – The minimum confidence for a QR detection to be considered valid. Values closer to 0.0 can produce more false positives, while values closer to 1.0 can miss hard-to-read QR codes. Default (and recommended): 0.5.
reencode_to (str | None) – The encoding used to re-encode the utf-8 decoded QR string. If None, the string is not re-encoded. If you find some characters being decoded incorrectly, try setting a code page (https://learn.microsoft.com/en-us/windows/win32/intl/code-page-identifiers) that matches your specific charset. Recommendations that have been found useful: ‘shift-jis’ for Germanic languages, ‘cp65001’ for Asian languages. Defaults to ‘shift_jis’.

Returns:

reader – Initialized QR code reader.

Return type:

QReader
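
One way to follow the advice of initializing the reader once is to cache it per parameter combination. The sketch below wraps an arbitrary reader factory with functools.lru_cache; the wrapper and the stand-in factory are hypothetical, and in real use qr_init_reader would be passed in instead of the stand-in.

```python
from functools import lru_cache


def make_reader_cache(init_reader):
    """Wrap a reader factory so each parameter combination is built only once."""
    @lru_cache(maxsize=None)
    def get_reader(model_size="s", min_confidence=0.5, reencode_to="shift_jis"):
        return init_reader(model_size=model_size,
                           min_confidence=min_confidence,
                           reencode_to=reencode_to)
    return get_reader


# Stand-in factory so the sketch runs without improutils; in real use pass
# qr_init_reader from improutils.recognition.qr instead.
calls = []


def fake_init_reader(**kwargs):
    calls.append(kwargs)
    return object()


get_reader = make_reader_cache(fake_init_reader)
reader_a = get_reader(model_size="m")
reader_b = get_reader(model_size="m")   # served from the cache
print(reader_a is reader_b, len(calls))
```

The cached reader can then be passed as the reader argument of qr_detect(), qr_decode() or qr_detect_and_decode().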

Module contents

class improutils.recognition.ShapeDescriptors[source]

Bases: object

An internal class for computing shape descriptors. Not intended for direct use by library users.

aspect_ratio(max_diameter)[source]: Compute the aspect ratio of a shape as the ratio of minimum to maximum diameter.

compactness(max_diameter)[source]: Compute the compactness of a shape based on area and maximum diameter.

convexity(convex_perimeter)[source]: Compute the convexity of a shape as the ratio of convex perimeter to actual perimeter.

extent(bounding_rectangle_area)[source]: Compute the extent of a shape as the ratio of area to bounding rectangle area.

form_factor(perimeter)[source]: Compute the form factor of a shape based on area and perimeter.

roundness(max_diameter)[source]: Compute the roundness of a shape based on area and maximum diameter.

solidity(convex_area)[source]: Compute the solidity of a shape as the ratio of area to its convex hull area.

improutils.recognition.aspect_ratio(contour)[source]

Determine the contour’s aspect ratio.

Czech term: “poměr stran”.

Parameters: contour (ndarray) – The contour to evaluate.
Returns: The aspect ratio of the contour, as a number.

improutils.recognition.compactness(contour)[source]

Determine the contour’s compactness.

Czech terms: “kompaktnost”, “hutnost”.

Parameters: contour (ndarray) – The contour to evaluate.
Returns: The compactness of the contour, as a number.

improutils.recognition.convexity(contour)[source]

Determine the contour’s convexity.

Czech terms: “konvexita”, “vypouklost”.

Parameters: contour (ndarray) – The contour to evaluate.
Returns: The convexity of the contour, as a number.

improutils.recognition.extent(contour)[source]

Determine the contour’s extent.

Czech terms: “dosah”, “rozměrnost”.

Parameters: contour (ndarray) – The contour to evaluate.
Returns: The extent of the contour, as a number.

improutils.recognition.form_factor(contour)[source]

Determine the contour’s form factor.

Czech term: “špičatost”.

Parameters: contour (ndarray) – The contour to evaluate.
Returns: The form factor of the contour, as a number.

improutils.recognition.ocr(img_bin, config='', lang=None)[source]

Detect text in the image.

Parameters:

img_bin (ndarray) – Input binary image with white objects on a black background.
config (str) – Tesseract configuration string; see https://tesseract-ocr.github.io/tessdoc/ImproveQuality.html and https://muthu.co/all-tesseract-ocr-options/ for the available options. Defaults to ‘’ (an empty string).
lang (str | None) – Language code, e.g. eng for English or ces for Czech. For the list of language codes, see: https://tesseract-ocr.github.io/tessdoc/Data-Files-in-different-versions.html. The selected language must be installed with sudo apt-get install tesseract-ocr-langcode, where langcode is the language code. English is installed by default. Defaults to None.

Returns:

The text recognized in the image.

improutils.recognition.qr_decode(img, detection_result, is_bgr=True, reader=<qreader.QReader object>)[source]

Decode a single QR code in the given image, described by a detection_result.

For further info refer to: https://github.com/Eric-Canas/qreader

Internally, this function runs the pyzbar decoder and uses the information in detection_result to apply image preprocessing techniques that substantially increase the decoding rate.

Parameters:

img (np.ndarray) – The image to be read. It is expected to be RGB or BGR (uint8). Format (HxWx3). Can also be grayscale (HxW), in which case it will be converted to BGR.
detection_result (dict[str, np.ndarray|float|tuple[float|int, float|int]]) – One of the detection dicts returned by qr_detect(). Note that qr_detect() returns a tuple of these dicts, whereas this function expects just one of them.
is_bgr (bool) – If True, the received image is expected to be BGR instead of RGB. Defaults to True.
reader (QReader) – An initialized QReader instance. Call qr_init_reader(model_size, min_confidence, reencode_to) to create one with different parameters. Defaults to qr_init_reader().

Returns:

decoded_strings – The decoded content of the QR code, or None if it cannot be read.

Return type:

str | None

improutils.recognition.qr_detect(img, is_bgr=True, reader=<qreader.QReader object>)[source]

Detect QR codes in the image and return a tuple of dictionaries with all the detection information.

For further info refer to: https://github.com/Eric-Canas/qreader

Parameters:

img (np.ndarray) – The image to be read. It is expected to be RGB or BGR (uint8). Format (HxWx3).
is_bgr (bool) – If True, the received image is expected to be BGR instead of RGB. Defaults to True.
reader (QReader) – An initialized QReader instance. Call qr_init_reader(model_size, min_confidence, reencode_to) to create one with different parameters. Defaults to qr_init_reader().

Returns:

detections –

A tuple of dictionaries, each containing the following keys:

‘confidence’: float. The confidence of the detection.
‘bbox_xyxy’: np.ndarray. The bounding box of the detection in the format [x1, y1, x2, y2].
‘cxcy’: tuple[float, float]. The center of the bounding box in the format (x, y).
‘wh’: tuple[float, float]. The width and height of the bounding box in the format (w, h).
‘polygon_xy’: np.ndarray. The accurate polygon that surrounds the QR code, with shape (N, 2).
‘quad_xy’: np.ndarray. The quadrilateral that surrounds the QR code, with shape (4, 2). Fitted from the polygon.
‘padded_quad_xy’: np.ndarray. An expanded version of quad_xy, with shape (4, 2), that always includes all the points within polygon_xy.

All these keys (except ‘confidence’) also have a normalized version with an ‘n’ suffix. For example, ‘bbox_xyxy’ is the bounding box in absolute coordinates, while ‘bbox_xyxyn’ is the bounding box in normalized coordinates (from 0.0 to 1.0).

Return type:

tuple[dict[str, np.ndarray|float|tuple[float|int, float|int]]]

improutils.recognition.qr_detect_and_decode(img, return_detections=False, is_bgr=True, reader=<qreader.QReader object>)[source]

Detect and decode QR codes in the given image and return the decoded strings (or None for any code that was detected but could not be decoded).

For further info refer to: https://github.com/Eric-Canas/qreader

Parameters:

img (np.ndarray) – The image to be read. It is expected to be RGB or BGR (uint8). Format (HxWx3). Can also be grayscale (HxW), in which case it will be converted to BGR.
return_detections (bool) – If True, it will return the full detection results together with the decoded QRs. If False, it will return only the decoded content of the QR codes. Defaults to False.
is_bgr (bool) – If True, the received image is expected to be BGR instead of RGB. Defaults to True.
reader (QReader) – An initialized QReader instance. Call qr_init_reader(model_size, min_confidence, reencode_to) to create one with different parameters. Defaults to qr_init_reader().

Returns:

detections – If return_detections is True, a tuple of tuples. Each inner tuple contains the detection result (a dictionary with the keys ‘confidence’, ‘bbox_xyxy’, ‘polygon_xy’…) and the decoded QR code (or None if it cannot be decoded). If return_detections is False, a tuple of strings with the decoded QR codes (with None in place of any code that cannot be decoded).

Return type:

tuple[tuple[dict, str | None], ...] | tuple[str | None, ...]

improutils.recognition.qr_init_reader(model_size='s', min_confidence=0.5, reencode_to='shift_jis')[source]

Initialize a QR code reader.

If you want to detect or decode QR codes on multiple occasions, it is recommended to initialize the reader once and then pass it to the functions. For further info refer to: https://github.com/Eric-Canas/qreader

Parameters:

model_size (str) – The size of the model to use. It can be ‘n’ (nano), ‘s’ (small), ‘m’ (medium) or ‘l’ (large). Larger models are more accurate but slower. Defaults to ‘s’.
min_confidence (float) – The minimum confidence for a QR detection to be considered valid. Values closer to 0.0 can produce more false positives, while values closer to 1.0 can miss hard-to-read QR codes. Default (and recommended): 0.5.
reencode_to (str | None) – The encoding used to re-encode the utf-8 decoded QR string. If None, the string is not re-encoded. If you find some characters being decoded incorrectly, try setting a code page (https://learn.microsoft.com/en-us/windows/win32/intl/code-page-identifiers) that matches your specific charset. Recommendations that have been found useful: ‘shift-jis’ for Germanic languages, ‘cp65001’ for Asian languages. Defaults to ‘shift_jis’.

Returns:

reader – Initialized QR code reader.

Return type:

QReader

improutils.recognition.roundness(contour)[source]

Determine the contour’s roundness.

Czech term: “kulatost”.

Parameters: contour (ndarray) – The contour to evaluate.
Returns: The roundness of the contour, as a number.

improutils.recognition.solidity(contour)[source]

Determine the contour’s solidity.

Czech terms: “plnost”, “celistvost”.

Parameters: contour (ndarray) – The contour to evaluate.
Returns: The solidity of the contour, as a number.