I've been trying to make a puzzle-solving program. The game is 'Fruit Box', and you can play it through the link below.
https://en.gamesaien/game/fruit_box/
To do that, I have to extract the numbers from the game screen.
[image: Fruit Box game screenshot]
I found 'pytesseract', which can identify characters in an image, and I have almost finished the extraction with it, but the result isn't accurate enough for me.
- threshold
At first, I used the threshold function. I had to erase most of the image because the background is the same white as the digits I was aiming for. The code and the resulting image are shown below.
import pytesseract
import os
import cv2
image = os.getcwd() + '\\appletest.png'
img=cv2.imread(image)
grayImage = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
ret,img_binary = cv2.threshold(grayImage, 246, 255, cv2.THRESH_BINARY)
text = pytesseract.image_to_string(img_binary, config='--psm 6')
# text = pytesseract.image_to_string(img_binary, config='--psm 6 --oem 3 -c tessedit_char_whitelist=0123456789 ')
print(text)
cv2.imshow('Image', img_binary)
cv2.waitKey(0)
cv2.destroyAllWindows()
[image: threshold result]
The 'image_to_string' function returns numbers like this:
41233366429816415
412567594457956471
3572263437133946
68241491629765459
73278354155567666
7796565142328726
15349752855757571
31221174825264255
83517514412317216
1957899195693134
It's almost the same, but some numbers are wrong (for example, on the second line, 412567594457956471 should be just 41256759445796471).
So I had to find another way.
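Lines like the one above can be caught automatically, because an extra digit changes the row length. The following is a minimal sketch, assuming every row of the board holds 17 digits (the length of the corrected row quoted above). `parse_digit_grid` and `expected_cols` are hypothetical names used only for illustration, and the sample text is the first three rows of the threshold output.

```python
def parse_digit_grid(text, expected_cols=17):
    """Turn OCR text into a list of digit rows and flag rows of unexpected length.

    expected_cols=17 is an assumption based on the corrected row in the question.
    Returns (grid, bad_rows), where bad_rows holds indices into grid.
    """
    grid = []
    bad_rows = []
    for line in text.splitlines():
        # Keep only ASCII digits; OCR output may contain spaces or stray symbols.
        digits = [int(ch) for ch in line if ch in "0123456789"]
        if not digits:
            continue  # skip blank lines
        if len(digits) != expected_cols:
            bad_rows.append(len(grid))
        grid.append(digits)
    return grid, bad_rows


# First three rows of the threshold result shown above.
ocr_text = "41233366429816415\n412567594457956471\n3572263437133946\n"
grid, bad_rows = parse_digit_grid(ocr_text)
print(bad_rows)  # [1, 2]
```

A length check only detects inserted or dropped digits. A digit that is misread as another digit leaves the row length unchanged and would go unnoticed.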
- inRange, floodFill
This attempt is simple: recognize the apples first, then flood-fill the background. The code and result are below.
import pytesseract
import os
import cv2
import numpy as np
image = os.getcwd() + '\\appletest.png'
img=cv2.imread(image)
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
#find apple color
dst1 = cv2.inRange(hsv, (0, 100, 20), (10, 255, 255))
rows, cols = dst1.shape[:2]
mask = np.zeros((rows+2, cols+2), np.uint8)
loDiff, upDiff = (10,10,10), (10,10,10)
retval = cv2.floodFill(dst1, mask, (1,1), (255,255,255), loDiff, upDiff)
text = pytesseract.image_to_string(dst1, config='--psm 6 --oem 3 -c tessedit_char_whitelist=0123456789 ')
# text = pytesseract.image_to_string(img_gradient, config='--psm 6')
print(text)
cv2.imshow('Image', dst1)
cv2.waitKey(0)
cv2.destroyAllWindows()
[image: floodFill result]
The result is this:
412333664298164215
412567594457964721
3957722634237619946
68241491629765458
732783542195567666
779685651412328726
15349752855757571
3912214174825264255
835175144121313217281216
15179191956988322134
But wrong digits were still being added.
I guess this comes from the quality of the digits (or of the image), so I tried many preprocessing steps (sharpening, erosion, dilation, blur), but I never got a fully correct list of numbers.
I don't know what else to try from here. Can you advise me on how to solve this?
asked Feb 12 at 5:48, edited Feb 12 at 5:50 by eunsang
1 Answer
My guess is that page segmentation mode 6 expects a true "block" of text and gets a bit nervous when it sees so much whitespace, so it decides to hallucinate a bit.
Let's give it a hand by removing the whitespace, leaving no more room for hallucinations:
# [your code up to flood fill]
# let the letters bleed out a bit to extract
# the whole character with some padding
blurred = cv2.blur(dst1,(5,5))
# crop out the white space
text_space = blurred.mean(axis=0) != 255
dst1 = dst1[:,text_space]
cfg = '--psm 6 --oem 3 -c tessedit_char_whitelist=0123456789 '
text = pytesseract.image_to_string(dst1, config=cfg)
print(text)
# 41233366429816415
# 41256759445796471
# 35772263437619946
# 68241491629765459
# 73278354195567666
# 77968565141328726
# 15349752855757571
# 31221174825264255
# 83517514411317116
# 15179191956983134
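The cropping step can be tried on its own, without OpenCV or Tesseract. The following is a minimal sketch, assuming dark digits on a background of value 255, as in the flood-filled image. `crop_blank_columns` and `pad` are hypothetical names, and widening the column mask with `np.convolve` is a stand-in for the `cv2.blur` call above: both keep a small margin around each glyph so that the characters do not end up touching.

```python
import numpy as np


def crop_blank_columns(binary, background=255, pad=2):
    """Drop columns that contain only background, keeping `pad` columns
    of margin on each side of every column that contains ink.

    Assumes the image is at least 2 * pad + 1 pixels wide.
    """
    # True for every column with at least one non-background pixel.
    has_ink = (binary != background).any(axis=0)
    # Widen the mask by `pad` columns in both directions.
    kernel = np.ones(2 * pad + 1, dtype=int)
    keep = np.convolve(has_ink.astype(int), kernel, mode="same") > 0
    return binary[:, keep]


# Synthetic 4x12 image: white background with one small dark "glyph".
img = np.full((4, 12), 255, dtype=np.uint8)
img[1:3, 5:7] = 0
cropped = crop_blank_columns(img)
print(cropped.shape)  # (4, 6)
```

With `pad=2` the margin matches what a 5x5 box blur leaves around each glyph, apart from border handling at the image edges.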