python - Image Preprocessing to extract 2D number list - Stack Overflow

I've been trying to write a puzzle-solving program. The game is 'Fruit Box', and you can play it through the link below.

https://en.gamesaien/game/fruit_box/

To do that, I have to extract the numbers from the game screen.

[Image: Fruit Box game screenshot]

I found 'pytesseract', which can recognize characters in an image, and I have almost finished the extraction with it, but the results were not satisfactory.

threshold

At first, I used the threshold function. I had to set the threshold high enough to erase most of the image, because the background is the same white color as the numbers I was aiming for. The code and the resulting image are shown below.

import os

import cv2
import pytesseract

# Path to the screenshot, expected in the current working directory
image = os.path.join(os.getcwd(), 'appletest.png')
img = cv2.imread(image)
grayImage = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Keep only near-white pixels (> 246); everything else becomes black
ret, img_binary = cv2.threshold(grayImage, 246, 255, cv2.THRESH_BINARY)
# --psm 6: treat the image as a single uniform block of text
text = pytesseract.image_to_string(img_binary, config='--psm 6')
# text = pytesseract.image_to_string(img_binary, config='--psm 6 --oem 3 -c tessedit_char_whitelist=0123456789 ')
print(text)
cv2.imshow('Image', img_binary)
cv2.waitKey(0)
cv2.destroyAllWindows()

[Image: threshold result]
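As a rough illustration of why thresholding alone cannot separate the digits from the page background, here is a minimal NumPy sketch of what `cv2.THRESH_BINARY` does with a threshold of 246. The function name `binary_threshold` and the sample pixel values are hypothetical, chosen only for this example; the real screenshot will contain other values.

```python
import numpy as np

def binary_threshold(gray, thresh=246, maxval=255):
    """Mimic cv2.THRESH_BINARY: pixels above thresh become maxval, the rest 0."""
    return np.where(gray > thresh, maxval, 0).astype(np.uint8)

# Hypothetical grayscale row: white background, darker apple, white digit, apple, background
row = np.array([255, 255, 90, 255, 90, 255], dtype=np.uint8)
result = binary_threshold(row)
print(result.tolist())
```

Both the white background and the white digit pixels end up as 255, so only the apple body turns black. The digits stay readable, but they remain surrounded by large white areas.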

The 'image_to_string' function returns numbers like this:

41233366429816415
412567594457956471
3572263437133946
68241491629765459
73278354155567666
7796565142328726
15349752855757571
31221174825264255
83517514412317216
1957899195693134

It is almost the same, but some of the numbers are wrong. (For example, the second line, 412567594457956471, should be just 41256759445796471.)

So I had to find another way.
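Since extra or missing digits change the length of a row, a quick length check can flag suspicious rows automatically. The sketch below is only an illustration: the helper name `parse_digit_grid` is hypothetical, and the expected width of 17 digits per row is an assumption inferred from the corrected second line quoted above.

```python
def parse_digit_grid(text, expected_cols=17):
    """Convert OCR output into a 2D list of ints and report rows of the wrong length.

    expected_cols=17 is an assumption based on the corrected row in the post.
    """
    grid = []
    bad_rows = []
    for line in text.splitlines():
        # Keep ASCII digits only; spaces and stray characters are ignored
        digits = [int(ch) for ch in line if ch in "0123456789"]
        if not digits:
            continue  # skip blank lines
        if len(digits) != expected_cols:
            bad_rows.append(len(grid))  # zero-based index of the suspicious row
        grid.append(digits)
    return grid, bad_rows

# First two rows of the threshold output shown above
ocr_text = "41233366429816415\n412567594457956471\n"
grid, bad_rows = parse_digit_grid(ocr_text)
print(bad_rows)
```

A length check cannot catch a digit that was misread as another digit, so it is only a first filter, not a proof that the grid is correct.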

inRange, floodFill

This attempt is simple: recognize the apples first, then flood-fill the background. The code and result are below.

import os

import cv2
import numpy as np
import pytesseract

image = os.path.join(os.getcwd(), 'appletest.png')

img = cv2.imread(image)
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# Find the apple color: pixels in this red range become 255, everything else 0
dst1 = cv2.inRange(hsv, (0, 100, 20), (10, 255, 255))
rows, cols = dst1.shape[:2]
# floodFill needs a mask 2 pixels wider and taller than the image
mask = np.zeros((rows + 2, cols + 2), np.uint8)
loDiff, upDiff = (10, 10, 10), (10, 10, 10)
# Fill the background with white, seeding at (1, 1), which is assumed to be background
retval = cv2.floodFill(dst1, mask, (1, 1), (255, 255, 255), loDiff, upDiff)
text = pytesseract.image_to_string(dst1, config='--psm 6 --oem 3 -c tessedit_char_whitelist=0123456789 ')
# text = pytesseract.image_to_string(img_gradient, config='--psm 6')
print(text)
cv2.imshow('Image', dst1)
cv2.waitKey(0)
cv2.destroyAllWindows()

[Image: floodFill result]

The result is this:

412333664298164215
412567594457964721
3957722634237619946
68241491629765458
732783542195567666
779685651412328726
15349752855757571
3912214174825264255
835175144121313217281216
15179191956988322134

But wrong numbers were still being added.

I guess this comes from the quality of the digits (or of the image), so I tried many preprocessing functions (sharpening, erosion, dilation, blur), but I still couldn't get a perfectly correct number list.

I don't know what more I should do from here. Can you advise me on how to solve this?

asked Feb 12 at 5:48 by eunsang, edited Feb 12 at 5:50

1 Answer

My guess is that page segmentation mode 6 expects a true "block" of text and gets a bit nervous when seeing so much whitespace, so it decides to hallucinate a bit.

Let's give it a hand by removing the whitespace, leaving no more room for hallucinations:

# [your code up to flood fill]

# let the letters bleed out a bit to extract
# the whole character with some padding
blurred = cv2.blur(dst1,(5,5))
# crop out the white space
text_space = blurred.mean(axis=0) != 255
dst1 = dst1[:,text_space]

cfg = '--psm 6 --oem 3 -c tessedit_char_whitelist=0123456789 '
text = pytesseract.image_to_string(dst1, config=cfg)
print(text)

# 41233366429816415
# 41256759445796471
# 35772263437619946
# 68241491629765459
# 73278354195567666
# 77968565141328726
# 15349752855757571
# 31221174825264255
# 83517514411317116
# 15179191956983134
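To see the cropping step in isolation, here is a small NumPy-only sketch on a synthetic image. It is not the answer's exact code: instead of `cv2.blur`, it widens the keep-mask by a fixed number of columns, which should behave roughly like the 5x5 blur when `pad=2`. The function name `crop_blank_columns` and the test image are hypothetical.

```python
import numpy as np

def crop_blank_columns(binary, pad=2):
    """Drop all-white (255) columns, keeping `pad` columns of margin around any ink."""
    has_ink = (binary != 255).any(axis=0)
    keep = has_ink.copy()
    # Widen the mask by `pad` columns on each side (stands in for the blur)
    for shift in range(1, pad + 1):
        keep[:-shift] |= has_ink[shift:]
        keep[shift:] |= has_ink[:-shift]
    return binary[:, keep]

# Synthetic 4x12 white image with a single dark stroke in column 5
img = np.full((4, 12), 255, dtype=np.uint8)
img[1:3, 5] = 0
cropped = crop_blank_columns(img, pad=1)
print(cropped.shape)
```

Only the columns next to the stroke survive, so the horizontal gaps between digits shrink to the chosen margin while the rows are left untouched.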
