21.OCR Command

Using the Image class commands, you can find the exact location of an operation even when the interface elements cannot be retrieved. However, you cannot read the contents of those elements the way you can with a targeted command. Take WeChat for Windows as an example: we can rely on image commands to log in to WeChat and switch between contacts, but the content of the chat itself is still unavailable (although it is easy to see with the naked eye), as the following figure shows. This is where UiBot's "OCR" class commands come in.


![Chat content is hard to get directly from WeChat](imgsTarget/8.png){width="50%"}


OCR stands for "Optical Character Recognition", a long-established technique: as early as the last century it was used to scan and retrieve text from paper books. OCR has since evolved and now incorporates deep learning. Here we use it to recognize text on the screen. The recognition rate is very high because screen images are clean digital images, unlike photographs of a poorly lit paper book or document.


OCR pairs well with RPA, but integrating it is somewhat technical. RPA vendors usually do not build OCR themselves; they call third-party OCR services directly. UiBot uses Baidu Cloud OCR by default because it is relatively powerful among domestic vendors. It not only recognizes the words and numbers on an interface but is also optimized for images of bills and documents such as invoices, ID cards, and train tickets, from which it can accurately identify key contents such as the invoice number, invoice amount, and name.


In order to access Baidu Cloud OCR, the following three requirements must be satisfied first:


• You need access to the Internet. Baidu Cloud OCR is a cloud service on the Internet, not locally running software, so for personal use the computer must be online. For enterprise use where Internet access is not allowed, you may need to negotiate with Baidu Cloud to purchase its offline service.

• You may need to pay Baidu. Baidu Cloud OCR is a paid service, but it offers a free quota of 5,000 calls a day (500 calls a day for general text recognition, certificate recognition, etc.). For personal use, the free quota is usually enough. However, Baidu may change its free quota and pricing policy at any time, so the cost is subject to change.


• You need your own Baidu Cloud account. Because Baidu Cloud charges for the service, UiBot users cannot share a single account. Each user must apply for their own Baidu Cloud account and obtain the credentials for the Baidu Cloud OCR service (commonly known as the Access Key and Secret Key). The application process is simple; see our [online tutorial](https://forum.uibot.com.cn/thread-192.htm).
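To give a rough idea of why each user needs their own key pair, the Python sketch below shows how an Access Key and Secret Key might be turned into a token request. It is not UiBot code and UiBot does this for you behind the scenes. The endpoint URL and parameter names are assumptions based on Baidu's public OAuth-style interface and should be checked against Baidu's current documentation; `build_token_url` is a hypothetical helper, and the key values are placeholders.

```python
from urllib.parse import urlencode

# Assumed endpoint; verify against Baidu's current documentation.
TOKEN_URL = "https://aip.baidubce.com/oauth/2.0/token"


def build_token_url(access_key: str, secret_key: str) -> str:
    """Build the URL that would exchange a key pair for a temporary token.

    No network request is made here; the function only assembles the URL.
    """
    query = urlencode({
        "grant_type": "client_credentials",
        "client_id": access_key,      # the Access Key identifies the account
        "client_secret": secret_key,  # the Secret Key proves ownership
    })
    return f"{TOKEN_URL}?{query}"


# Placeholder values, not real credentials.
url = build_token_url("my-access-key", "my-secret-key")
print(url)
```

Because every call is tied to the account behind the key pair, usage counts against that account's quota, which is why the keys are entered per user.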


UiBot contains the following OCR commands:


![UiBot's OCR command](imgsTarget/9.png){width="30%"}


The commands in the red box resemble Click Image, Move Mouse to Image, and Find Image, except that you never need to pass in an image. You only need to specify the text to find in the command's properties.


The commands in the blue box are similar to those in the green box. The difference is that the former require an image file, whereas the latter require a window and an area: when the process runs, UiBot automatically takes a screenshot of the specified area of the window, saves it as a file, and then processes it in the same way as the image-file commands.


Let's try the "Screen OCR" command first. Double-click or drag to insert a Screen OCR command, then click the Find Target button on the command (the UiBot Creator window is temporarily hidden at this point). Next, move the mouse to the WeChat window, which will be covered by a blue mask with a red frame. Then drag the mouse to mark out the area for character recognition, which is indicated by a purple box, as illustrated in the figure below.


![Select OCR NoTarget](imgsTarget/10.png){width="60%"}


At run time, this command automatically finds the WeChat window and takes a screenshot at the location marked by the purple box (relative to the position of the WeChat window). It then recognizes the text in the screenshot and saves the recognized text in the variable sText.
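The idea of a region stored "relative to the window" can be made concrete with a little arithmetic. The Python sketch below is not UiBot code; `Rect` and `to_screen_rect` are hypothetical names and all coordinates are made up for illustration. It shows how the same relative region maps to different screen positions when the window moves, and why it stops being valid if the window shrinks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    width: int
    height: int


def to_screen_rect(window: Rect, region: Rect) -> Rect:
    """Translate a region measured from the window's top-left corner
    into absolute screen coordinates."""
    if (region.x + region.width > window.width
            or region.y + region.height > window.height):
        # The stored region no longer fits, e.g. the window was resized.
        raise ValueError("region falls outside the window")
    return Rect(window.x + region.x, window.y + region.y,
                region.width, region.height)


# Hypothetical purple box, measured relative to the window.
region = Rect(x=300, y=10, width=200, height=40)

# The same region follows the window when it moves on the screen.
first = to_screen_rect(Rect(100, 100, 900, 600), region)
moved = to_screen_rect(Rect(400, 250, 900, 600), region)
print(first)
print(moved)
```

Moving the window only shifts the screenshot area, whereas resizing it can push the stored region onto different content or outside the window altogether.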


We are almost finished, but first select the command and, in its "Properties", fill in the Access Key and Secret Key that we applied for on Baidu Cloud. Note that both the Access Key and Secret Key are strings (sequences of characters), so you need to keep the double quotation marks on the left and right sides of the text. After the OCR command, add an Output to the Debug Window command and specify sText as the content to output so that you can see the result. Note that sText is a variable name, not a string, so do not put double quotation marks around it.


![Complete an OCR command](imgsTarget/11.png){width="80%"}
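The difference between a quoted string and a bare variable name is easy to see in a general-purpose language. The following lines are Python rather than UiBot's own syntax, and the values are placeholders, but the quoting rule they illustrate is the same one described above.

```python
access_key = "my-access-key"  # a string: the quotation marks are required
sText = "recognized text"     # stands in for the result stored by the OCR command

print(sText)    # no quotes: outputs the value held by the variable
print("sText")  # quotes: outputs the literal characters sText
```

Putting quotation marks around sText in the output command would therefore print the variable's name instead of the recognized text.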


Run this process block to see the result. As long as the WeChat window exists and its size has not changed, the name of the current WeChat contact (or WeChat group) can be recognized.


Next, let's try the specialized Image OCR Recognition command, which handles ID cards, train tickets, and the like. Save the image shown below to the following path:


`D:\1.png`


![Image to do a special OCR](imgsTarget/12.png){width="40%"}


Then insert an Image OCR Recognition command and modify its properties as described before. In addition to the Access Key and Secret Key mentioned above, you need to specify the file name of the image to be recognized and choose "Train Ticket Identification" as the OCR engine. Leave the other properties at their defaults. After running, you can see the recognition result in the output bar. This result is actually a JSON document, which requires further processing with the JSON class commands provided by UiBot; that processing is outside the scope of this chapter.


![Property settings for special OCR](imgsTarget/13.png){width="80%"}
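Although JSON handling belongs to another chapter, a rough picture of what "further processing" means may help. The Python sketch below is not UiBot code. The sample result string and its field names (`words_result`, `ticket_num`, and so on) are hypothetical stand-ins, since the actual structure depends on the OCR engine chosen, and `extract_fields` is an illustrative helper.

```python
import json

# Hypothetical result; real field names depend on the OCR engine.
raw_result = (
    '{"words_result": {"ticket_num": "A000001", '
    '"starting_station": "Beijing", "destination_station": "Shanghai"}}'
)


def extract_fields(raw: str, wanted: list) -> dict:
    """Parse the JSON text and pick out the requested fields.

    Fields missing from the result are returned as empty strings.
    """
    data = json.loads(raw)
    fields = data.get("words_result", {})
    return {name: fields.get(name, "") for name in wanted}


ticket = extract_fields(raw_result, ["ticket_num", "starting_station", "seat"])
print(ticket)
```

Returning an empty string for a missing field keeps the rest of a flow running when the OCR engine fails to recognize one item on the ticket.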

