javascript - Detect a frame with squares in corners with OpenCV.js - Stack Overflow


I have been playing with creating a filled form scanner with Javascript and OpenCV.js. What I basically want to do is to take a photo of a piece of paper with a filled form on it and be able to scan the photo and analyze the answers in the form. The first step is to actually find the form in the picture and apply perspective transform to get the "top-down view" of the paper. What I have done is I managed to get the script to detect the piece of paper and apply the transform to get it nicely scanned. I did it by applying grayscale, then Canny edge detection, iterated over the found edges and found the largest one with 4 corners and assumed this must be my paper.

This works relatively well, but every now and then the script gets confused as to what the paper actually is - sometimes there are other rectangles that are detected and assumed to be paper, sometimes the background on which the paper is photographed is very light and edges aren't clear (not enough contrast). That really destroys my flow later when the script thinks it has found the paper, but really it's something else. I would like to improve on this paper detection part so I can be always sure that the right thing has been detected. I thought - let's add a custom frame around the form, which will be easier to detect and add some squares in the corners (to double check if the found frame is 100% the one I'm looking for).

So I have created something like this:

Now I would like to detect the corners of the frame and verify that the "filled" squares are present in those corners, so I can be sure that this is the frame I am looking for. Can you please advise on how to achieve this with OpenCV? Is this the right way to go? Thanks!


Asked Jan 17, 2020 at 15:45 by furry12

1 Answer

I've worked on a similar problem before. I work with the C++ implementation of OpenCV, but I have some tips for you.

Segmenting the paper

To achieve a better segmentation, consider trying Image Quantization. This technique segments the image into N clusters, that is, it groups pixels of similar colors together. Each group is then represented by one color.

The advantage of this technique over others, such as pure binary thresholding, is that it can identify multiple color distributions – those that will be grouped into N clusters. Check it out (sorry for the links, I'm not allowed -yet- to post direct images):

This will help you get a better segmentation of your paper. The implementation uses the clustering algorithm known as "K-means" (more on this later). In my example, I tried 3 clusters and 5 algorithm "runs" (or attempts, as K-means is often run more than once).

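#include <opencv2/opencv.hpp>

//note: assumes an 8-bit, 3-channel (e.g. BGR), continuous input image
cv::Mat imageQuantization( cv::Mat inputImage, int numberOfClusters = 3, int iterations = 5 ){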

        //step 1 : map the src to the samples
        cv::Mat samples(inputImage.total(), 3, CV_32F);
        auto samples_ptr = samples.ptr<float>(0);
        for( int row = 0; row != inputImage.rows; ++row){
            auto src_begin = inputImage.ptr<uchar>(row);
            auto src_end = src_begin + inputImage.cols * inputImage.channels();
            //auto samples_ptr = samples.ptr<float>(row * src.cols);
            while(src_begin != src_end){
                samples_ptr[0] = src_begin[0];
                samples_ptr[1] = src_begin[1];
                samples_ptr[2] = src_begin[2];
                samples_ptr += 3; src_begin +=3;
            }
        }

        //step 2 : apply kmeans to find labels and centers
        int clusterCount = numberOfClusters; //Number of clusters to split the set by
        cv::Mat labels;
        int attempts = iterations; //Number of times the algorithm is executed using different initial labels
        cv::Mat centers;
        int flags = cv::KMEANS_PP_CENTERS;
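        //stop after 10 iterations or when the centers move less than 0.01
        //(the old CV_TERMCRIT_* macros are not available by default in OpenCV 4)
        cv::TermCriteria criteria = cv::TermCriteria( cv::TermCriteria::COUNT | cv::TermCriteria::EPS,
                                                      10, 0.01 );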

        //the call to kmeans:
        cv::kmeans( samples, clusterCount, labels, criteria, attempts, flags, centers );

        //step 3 : map the centers to the output
        cv::Mat clusteredImage( inputImage.size(), inputImage.type() );
        for( int row = 0; row != inputImage.rows; ++row ){
            auto clusteredImageBegin = clusteredImage.ptr<uchar>(row);
            auto clusteredImageEnd = clusteredImageBegin + clusteredImage.cols * 3;
            auto labels_ptr = labels.ptr<int>(row * inputImage.cols);

            while( clusteredImageBegin != clusteredImageEnd ){
                int const cluster_idx = *labels_ptr;
                auto centers_ptr = centers.ptr<float>(cluster_idx);
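                //round and clamp the float center values to the 8-bit range
                clusteredImageBegin[0] = cv::saturate_cast<uchar>(centers_ptr[0]);
                clusteredImageBegin[1] = cv::saturate_cast<uchar>(centers_ptr[1]);
                clusteredImageBegin[2] = cv::saturate_cast<uchar>(centers_ptr[2]);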
                clusteredImageBegin += 3; ++labels_ptr;
            }
        }   

        //return the output:
        return clusteredImage;
}

Note that the algorithm also produces two additional matrices. "Labels" holds, for each pixel, an integer that identifies its cluster. "Centers" holds the mean value of each cluster.

Detecting the edges

Now it is trivial to run an edge detector on this segmented image. Let's try Canny. The parameters, of course, can be adjusted by you. Here, I tried a lower threshold of 30 and an upper threshold of 90. Pretty standard; just make sure the upper threshold follows the condition UpperThreshold = 3 * LowerThreshold, as per Canny's suggestion. This is the result:

    cv::Mat testEdges;
    float lowerThreshold = 30;
    float upperThreshold = 3 * lowerThreshold;
    cv::Canny( testSegmented, testEdges, lowerThreshold, upperThreshold );
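To make the role of the two thresholds a bit more concrete: the last stage of Canny is hysteresis thresholding. Pixels whose gradient magnitude reaches the upper threshold are kept as "strong" edges, and pixels between the two thresholds are kept only if they connect to a strong edge. Below is a minimal sketch of just that stage, using the standard library only. It is not OpenCV's implementation: the name `hysteresisThreshold` and the grid types are made up for illustration, the input is assumed to be an already computed gradient-magnitude grid, and smoothing and non-maximum suppression are skipped.

```cpp
#include <queue>
#include <utility>
#include <vector>

using FloatGrid = std::vector<std::vector<float>>;
using EdgeGrid = std::vector<std::vector<int>>;

// Hypothetical helper (not an OpenCV function): keeps a pixel as an edge if its
// magnitude is >= upper ("strong"), or if it is >= lower ("weak") and 8-connected
// to a strong pixel through other kept pixels.
EdgeGrid hysteresisThreshold(const FloatGrid& magnitude, float lower, float upper) {
    const int rows = static_cast<int>(magnitude.size());
    const int cols = rows > 0 ? static_cast<int>(magnitude[0].size()) : 0;
    EdgeGrid edges(rows, std::vector<int>(cols, 0));
    std::queue<std::pair<int, int>> frontier;

    // Seed the search with every strong pixel.
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            if (magnitude[r][c] >= upper) {
                edges[r][c] = 1;
                frontier.push(std::make_pair(r, c));
            }
        }
    }

    // Grow outward: accept weak neighbours of pixels that were already accepted.
    while (!frontier.empty()) {
        const std::pair<int, int> current = frontier.front();
        frontier.pop();
        for (int dr = -1; dr <= 1; ++dr) {
            for (int dc = -1; dc <= 1; ++dc) {
                const int nr = current.first + dr;
                const int nc = current.second + dc;
                if (nr < 0 || nr >= rows || nc < 0 || nc >= cols) continue;
                if (edges[nr][nc] == 0 && magnitude[nr][nc] >= lower) {
                    edges[nr][nc] = 1;
                    frontier.push(std::make_pair(nr, nc));
                }
            }
        }
    }
    return edges;
}
```

With thresholds 30 and 90, a pixel of magnitude 40 next to a pixel of magnitude 100 survives, while an isolated pixel of magnitude 40 does not. That is why a low-contrast paper border can still be recovered as long as part of it is strong.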

Detecting the lines

Nice. Want to detect the lines produced by the edge detector? Here, there are at least 2 options. The first and most straightforward: Use Hough’s Line Detector. However, as you surely have seen, tuning Hough to find the lines you are actually looking for could be difficult.

One possible solution to filter the lines returned by Hough is to run an “angle filter”, as we are looking for only (close to) vertical and horizontal lines. You can also filter the lines by length.

This code snippet gives the idea; you need to actually implement the filter:

    // Run Hough's Line Detector:
    std::vector<cv::Vec4i> linesP; //detected segments, each stored as (x1, y1, x2, y2)
    cv::HoughLinesP( grad, linesP, 1, CV_PI/180, minVotes, minLineLength, maxLineGap );

    // Process the points (lines)
    for( size_t i = 0; i < linesP.size(); i++ ) //points are stored in linesP
    {
        //get the line
        cv::Vec4i l = linesP[i]; //get the line

        //get the points:
        cv::Point startPoint = cv::Point( l[0], l[1] );
        cv::Point endPoint = cv::Point( l[2], l[3] );

        //filter horizontal & vertical:
        float dx = abs(startPoint.x - endPoint.x);
        float dy = abs(startPoint.y - endPoint.y);

        //angle filtering, delta y and delta x
        if ( (dy < maxDy) || (dx < maxDx) ){
          //got my target lines!
        }
    }

In the code above, I'm actually working with line components instead of angles. So, my "angle" restrictions are defined by 2 maximum component lengths: maxDy, the maximum "delta" length allowed on the y axis for a line to count as horizontal, and maxDx, the same on the x axis for a line to count as vertical.
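As a rough illustration of the component-based filter combined with the length filter mentioned earlier, here is a self-contained sketch that does not depend on OpenCV. The `Segment` struct, the function name `filterAxisAligned` and the `minLength` parameter are placeholders I made up; `Segment` just mirrors the (x1, y1, x2, y2) layout of the `cv::Vec4i` returned by `HoughLinesP`.

```cpp
#include <cmath>
#include <vector>

// Hypothetical stand-in for cv::Vec4i: one line segment as (x1, y1, x2, y2).
struct Segment {
    int x1, y1, x2, y2;
};

// Keeps segments that are long enough and close to horizontal or vertical.
// maxDy: largest vertical extent a segment may have to count as horizontal.
// maxDx: largest horizontal extent a segment may have to count as vertical.
std::vector<Segment> filterAxisAligned(const std::vector<Segment>& segments,
                                       float maxDx, float maxDy, float minLength) {
    std::vector<Segment> kept;
    for (const Segment& s : segments) {
        const float dx = std::abs(static_cast<float>(s.x2 - s.x1));
        const float dy = std::abs(static_cast<float>(s.y2 - s.y1));
        const float length = std::sqrt(dx * dx + dy * dy);

        const bool nearHorizontal = dy < maxDy;
        const bool nearVertical = dx < maxDx;
        if (length >= minLength && (nearHorizontal || nearVertical)) {
            kept.push_back(s);
        }
    }
    return kept;
}
```

The length check matters here: without it, very short segments pass the component test trivially, because both of their deltas are small.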

The other solution for line detection is to exploit the fact that you are only looking for lines that meet at CORNERS, with about 90 degrees between them. You can run a morphological filter to detect these "patterns" via a hit-or-miss operation :)
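For reference, a hit-or-miss operation marks every pixel whose neighbourhood matches a small pattern exactly. The sketch below is a plain C++ illustration on a 0/1 grid, not the OpenCV call. The kernel convention (1 = must be foreground, -1 = must be background, 0 = don't care) is the one OpenCV uses for `cv::MORPH_HITMISS`, but the function name, the `Grid` type and the choice to treat pixels outside the image as background are my own assumptions.

```cpp
#include <vector>

using Grid = std::vector<std::vector<int>>;

// Hypothetical helper: returns a grid with 1 wherever the kernel pattern matches.
// Kernel values: 1 = pixel must be foreground, -1 = must be background, 0 = ignored.
// Assumption: pixels outside the image are treated as background.
Grid hitOrMiss(const Grid& image, const Grid& kernel) {
    const int rows = static_cast<int>(image.size());
    const int cols = rows > 0 ? static_cast<int>(image[0].size()) : 0;
    const int kRows = static_cast<int>(kernel.size());
    const int kCols = kRows > 0 ? static_cast<int>(kernel[0].size()) : 0;
    const int anchorRow = kRows / 2;
    const int anchorCol = kCols / 2;

    Grid out(rows, std::vector<int>(cols, 0));
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            bool match = true;
            for (int i = 0; i < kRows && match; ++i) {
                for (int j = 0; j < kCols && match; ++j) {
                    const int k = kernel[i][j];
                    if (k == 0) continue;  // don't care
                    const int rr = r + i - anchorRow;
                    const int cc = c + j - anchorCol;
                    const bool inside = rr >= 0 && rr < rows && cc >= 0 && cc < cols;
                    const int pixel = inside ? image[rr][cc] : 0;
                    if ((k == 1 && pixel == 0) || (k == -1 && pixel != 0)) match = false;
                }
            }
            out[r][c] = match ? 1 : 0;
        }
    }
    return out;
}
```

A kernel such as `{{-1, -1, 0}, {-1, 1, 1}, {0, 1, 0}}` responds to the top-left corner of a one-pixel-wide outline. The other three corners would need rotated versions of it, and real edge maps are rarely this clean, so expect some tuning.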

Anyway, back to Hough, this is the detection I get without too much parameter tuning and after applying the angle/length line filter:

Cool. The green dots represent the start and endpoints of the lines. As you see, there's a bunch of them. How can we "combine" them? What if we compute the mean of those points? Ok, but we should get the mean of the lines PER "quadrant". As you see in the following figure, I've divided the input image into 4 quadrants (the yellow lines):

Each of the quadrants -hopefully- will contain the points that describe the corner of the paper. For each of these quadrants, check which points fall in a given quadrant and compute the mean of them. That's the general idea.
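The per-quadrant averaging could be sketched like this. It is a standalone illustration, not the code I actually ran: `Point2f`, `QuadrantMeans` and `meanPerQuadrant` are made-up names, and the quadrants are assumed to be split at the image center.

```cpp
#include <array>
#include <vector>

// Hypothetical point type (similar in spirit to cv::Point2f).
struct Point2f {
    float x, y;
};

// Quadrant order: 0 = top-left, 1 = top-right, 2 = bottom-left, 3 = bottom-right.
struct QuadrantMeans {
    std::array<Point2f, 4> mean;  // mean point per quadrant (0,0 if the quadrant is empty)
    std::array<int, 4> count;     // number of points that fell in each quadrant
};

// Splits the image at its center and averages the points inside each quadrant.
QuadrantMeans meanPerQuadrant(const std::vector<Point2f>& points, float width, float height) {
    QuadrantMeans result{};  // zero-initialises all sums and counts
    for (const Point2f& p : points) {
        const int q = (p.x >= width / 2.0f ? 1 : 0) + (p.y >= height / 2.0f ? 2 : 0);
        result.mean[q].x += p.x;
        result.mean[q].y += p.y;
        ++result.count[q];
    }
    for (int q = 0; q < 4; ++q) {
        if (result.count[q] > 0) {
            result.mean[q].x /= static_cast<float>(result.count[q]);
            result.mean[q].y /= static_cast<float>(result.count[q]);
        }
    }
    return result;
}
```

Checking `count` before trusting a mean is worthwhile: an empty quadrant means one corner of the paper produced no line endpoints at all, which is a useful signal that the detection failed.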

That’s quite some code to write. Fortunately, if we study the problem for a bit, we can see that all the green dots tend to CLUSTER in some very defined regions (or, as we said earlier, the “quadrants”.) Enter K-means again.

K-means will group data of similar value no matter what. It can be pixels, it can be spatial points, it can be whatever, just give it the data set and the number of clusters you want, and it will spit out the clusters found and THE MEANS OF SAID CLUSTERS – NICE!

If I run K-means with the line points returned by Hough, I get the result shown in the last image. I've also discarded points that are too far from the mean. The means of the points are returned via the "centers" matrix, and here they are rendered in orange. That's quite close!

Hope that some of this helps you! :)
