Text Recognition Pane (Options Dialog Box > GUI Testing Tab)

Relevant for: GUI tests and components

This pane enables you to configure how UFT identifies text in your application. You can use this pane to modify the default text capture mechanism, OCR (optical character recognition) mechanism mode, and the language dictionaries the OCR mechanism uses to identify text.

To access

Select Tools > Options > GUI Testing tab > Text Recognition node.

Important information

The Restore Factory Defaults button resets all Options dialog box options to their defaults.

Options are described below. The options differ depending on the text recognition engine you select.

Note: The Tesseract OCR engine is slower than the Abby OCR engine. If your test makes heavy use of text recognition steps (such as GetVisibleText), expect the total time required to run the test to increase.

UI Element Description
Abby OCR Text Recognition engine (default)
Text recognition mode

The manner in which to recognize text in the application:

  • Single text block mode: Instructs the OCR mechanism to focus on the area and treat it as a single text block. This is especially useful when capturing text on small objects or in a small text area. Select this radio button if the text on the object is uniform in font, size, color, and background.

  • Multiple text block mode: Instructs the OCR mechanism to handle separately each text area in the object that has a different background, font, or size. The OCR mechanism decides where to divide the text blocks according to an internal algorithm. Select this radio button only if the text on the object comprises different fonts, font sizes, colors, and/or backgrounds.

Available languages

Lists all of the language dictionaries that the OCR mechanism can potentially use when retrieving text from the object.

To specify the language dictionaries used by the OCR mechanism: Move a language to the Supported languages list box by selecting a language and clicking the right arrow button (>).

Supported languages

Lists the language dictionaries that the OCR mechanism uses when capturing text. The Supported languages list box can contain either:

  • One CJK (Chinese, Japanese, Korean) language.

    By default, English is also supported when capturing text in CJK languages.

  • One or more non-CJK languages.
Preprocess the image before using text recognition

Enables the text recognition to identify image elements before identifying the text in the specified object or area.
Tesseract OCR Text Recognition engine
Text recognition mode

The manner in which to recognize text in the application:

  • Single text block mode: Instructs the OCR mechanism to focus on the area and treat it as a single text block. This is especially useful when capturing text on small objects or in a small text area. Select this radio button if the text on the object is uniform in font, size, color, and background.

  • Multiple text block mode: Instructs the OCR mechanism to handle separately each text area in the object that has a different background, font, or size. The OCR mechanism decides where to divide the text blocks according to an internal algorithm. Select this radio button only if the text on the object comprises different fonts, font sizes, colors, and/or backgrounds.

Symbols for text recognition

Enables you to restrict text recognition to specific characters.

This option is supported for English only.

Current language pack

The current language to use in text recognition. Only one language pack can be used at a time.

To download a new language pack, visit the Tesseract OCR language pack download site (https://sourceforge.net/projects/tesseract-ocr-alt/files/?source=navbar) or click the link in the pane.

After you download the language pack, add its files to the OCR engine folder, found at <UFT installation directory>/dat/tessdata.
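The copy step described above can be scripted. The following is a minimal Python sketch, assuming the downloaded pack has already been extracted into a single folder and that you have write permission to the UFT installation directory. The function name install_language_pack and the paths in the usage note are hypothetical and not part of UFT.

```python
import shutil
from pathlib import Path


def install_language_pack(extracted_dir, tessdata_dir):
    """Copy every file in extracted_dir into tessdata_dir.

    Subfolders are skipped. Returns the sorted names of the copied files.
    """
    source = Path(extracted_dir)
    target = Path(tessdata_dir)
    if not target.is_dir():
        # Fail early instead of silently creating a folder UFT will not read.
        raise FileNotFoundError(f"tessdata folder not found: {target}")

    copied = []
    for item in sorted(source.iterdir()):
        if item.is_file():
            shutil.copy2(item, target / item.name)
            copied.append(item.name)
    return copied
```

For example, install_language_pack("<download folder>", "<UFT installation directory>/dat/tessdata"), with both placeholders replaced by real paths on your machine.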

Fast mode

Instructs UFT to maximize performance (at the expense of text recognition accuracy) to improve test run speed.

Use default Tesseract configuration

Instructs UFT to use the standard Tesseract configuration, as noted in the language data file.

Use configuration from file

Enables you to load configuration settings from an externally defined file.

Not all configuration options are supported for use in UFT. The Output pane displays a list of parameters that are ignored when running a test (for example, the interactive_display_mode parameter):

  • %parameter_name% parameter is not supported

  • The Tesseract OCR engine has stopped due to an error. Check your Tesseract configuration and try again.

  • The value type for the %parameter_name% parameter is incorrect. The parameter was ignored during the test run

  • The %parameter_name% parameter is not supported by the Tesseract OCR engine

For details on how to create your own configuration file, see http://www.sk-spell.sk.cx/tesseract-ocr-parameters-in-302-version.
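As an illustration, a Tesseract configuration file is plain text with one parameter per line, written as the parameter name followed by its value. The following Python sketch writes such a file. The helper name write_tesseract_config and the file name are hypothetical. tessedit_char_whitelist is a standard Tesseract parameter, but this page does not confirm whether UFT honors it, so check the Output pane after the run.

```python
from pathlib import Path


def write_tesseract_config(path, params):
    """Write params as 'name value' lines, one parameter per line.

    params maps parameter names to values. Returns the text written.
    """
    lines = []
    for name, value in params.items():
        # A parameter name is a single token, so reject empty names and whitespace.
        if not name or any(ch.isspace() for ch in name):
            raise ValueError(f"invalid parameter name: {name!r}")
        lines.append(f"{name} {value}")
    text = "\n".join(lines) + "\n"
    Path(path).write_text(text, encoding="utf-8")
    return text


if __name__ == "__main__":
    # Hypothetical example: restrict recognition to digits.
    write_tesseract_config("uft_digits.cfg", {"tessedit_char_whitelist": "0123456789"})
```

You would then point the Use configuration from file option at the generated file.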

Preprocess the image before using text recognition

Enables the text recognition mechanism to identify image elements before identifying the text in the specified object or area.

When you use this option, UFT converts the image to a black and white image and resizes it. However, this slows the performance of UFT when performing text recognition. Therefore, expect additional time in test runs when performing text recognition with this option.

Use this option when your application uses very small font sizes (10 pt. and lower).
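To make the idea of preprocessing concrete, the following Python sketch converts a grayscale image to black and white and then enlarges it. It is only an illustration under stated assumptions: this page does not document UFT's actual algorithm, so the threshold of 128, the choice to enlarge rather than shrink, the nearest-neighbor method, and all function names are hypothetical. The image is represented as a list of rows of pixel values from 0 to 255.

```python
def to_black_and_white(gray, threshold=128):
    """Map each grayscale pixel (0-255) to 0 (black) or 255 (white)."""
    return [[255 if px >= threshold else 0 for px in row] for row in gray]


def upscale(image, factor=2):
    """Enlarge an image by an integer factor using nearest-neighbor sampling."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    out = []
    for row in image:
        # Repeat each pixel horizontally, then repeat the widened row vertically.
        wide = [px for px in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def preprocess(gray, threshold=128, factor=2):
    """Binarize first, then enlarge (assumed order, not confirmed for UFT)."""
    return upscale(to_black_and_white(gray, threshold), factor)
```

Every pixel is visited once to binarize and then copied factor × factor times, which gives a rough sense of why preprocessing adds time to each text recognition step.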