I am trying to extract data from images (using some of the methods presented in [1524]), and would like to have a way to selectively cover/remove text from axes and legends.
TextRecognize has been reasonably good at finding the text, but I am curious: is there a way to determine where this text is located?
For example, how could we determine the approximate image location of the "Test plot" text in the following case?
testImage = Image[Plot[x, {x, 0, 6 π},
PlotLabel -> "Test plot", BaseStyle -> {FontSize -> 24}]]
TextRecognize[testImage]
"Test plot
5 10 15"
Answer
With TextRecognize enhanced in version 11.1, finding the positions of recognized text becomes straightforward:
testImage =
Image[Plot[x, {x, 0, 6 π}, PlotLabel -> "Test plot", BaseStyle -> {FontSize -> 24}]];
res = TextRecognize[testImage, "Block", "BoundingBox", RecognitionPrior -> "SparseText"];
HighlightImage[testImage, {"Boundary", res}]
This answers the original question, but as the highlighted image shows, not every glyph is recognized.
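Since the goal in the question is to cover the text, the rectangles in res can also be drawn over the image. The following is only a sketch: coverRegions is a hypothetical helper name, not a built-in, and it assumes the text sits on a white background so that opaque white rectangles are a sensible cover.

```wolfram
(* coverRegions is a hypothetical helper, not a built-in function. *)
(* It overlays opaque white rectangles on the image and returns a Graphics object. *)
coverRegions[img_Image, rects_List] :=
  Show[img, Graphics[{White, EdgeForm[None], rects}]]

(* Usage with the results above: *)
(* coverRegions[testImage, res] *)
```

If an Image is needed rather than a Graphics object, the result can be wrapped in Rasterize[..., "Image"]; the pixel dimensions may then differ from those of the original.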
Another route goes through ComponentMeasurements:
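(* Bounding box {{xmin, ymin}, {xmax, ymax}} of every connected component *)
comp = ComponentMeasurements[testImage, "BoundingBox"][[;; , 2]];
(* A workaround for a bug *)
comp = Developer`FromPackedArray@comp;
comp = Rectangle @@@ comp;
(* Keep only small, glyph-sized boxes; the threshold is an area in square pixels *)
comp = Select[comp, Area[#] < 10000 &];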
HighlightImage[testImage, {"Boundary", comp}]
Now every glyph is found. Voilà !
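To get from per-glyph boxes to the approximate location of a whole label such as "Test plot", the boxes belonging to it can be merged into one enclosing rectangle. A minimal sketch, assuming the glyphs of interest have already been selected; unionBox is a hypothetical helper and the sample coordinates are made up for illustration:

```wolfram
(* unionBox is a hypothetical helper: the smallest axis-aligned rectangle *)
(* enclosing all given ones. Each input is Rectangle[{xmin, ymin}, {xmax, ymax}]. *)
unionBox[rects : {__Rectangle}] := Rectangle[
  Min /@ Transpose[rects[[All, 1]]],
  Max /@ Transpose[rects[[All, 2]]]]

(* Made-up glyph boxes, for illustration only *)
glyphs = {Rectangle[{100, 200}, {110, 220}], Rectangle[{115, 205}, {125, 218}]};
unionBox[glyphs]
(* Rectangle[{100, 200}, {125, 220}] *)
```

With the boxes in comp, one might first pick the glyphs near the top of the image, for example Select[comp, #[[1, 2]] > 0.85 Last[ImageDimensions[testImage]] &], where the 0.85 cutoff is a guess that would need tuning per image.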


