I have a set of slides and a video in which those slides are discussed. I would like to extract the timestamp at which each slide appears in the video. I suggest proceeding as follows:
- Import the video with Import[]. (How can I import an .mp4 file?)
- Import the slides as images.
- For every frame of the video, compare a certain number of its pixels with each of the slide images. When a certain percentage of the pixels match, return the slide image and the corresponding timestamp.
My questions:
- How can I extract the timestamps of all video frames?
- How can I extract the pixel values within a certain geometric region of the frame?
In my video, a person might stand in front of the slides. Therefore I cannot require that all pixels of the frame and the slide are the same in order to detect a slide in the video.
Thanks.
Here is some material to try:
Instead of a video, one can use a gif:
One can try to extract the timestamp of this frame:
And if someone is standing in front (please excuse my drawing skills):
Please try with the following material: you can find the frames and the slides to match. Aim: find the position of each slide within the frame list.
https://www.dropbox.com/sh/l1deic1ris2il6w/AAAOU_ICZM_f0T-9M9kfNNeRa?dl=0
Answer
First I have to say that I'm a bit skeptical whether what you want to do can work in general. What if the person blocks everything that is unique to that slide? What if some slides look the same?
But let's ignore those possible problems for now and try a very simple approach. This is only meant as a starting point! My answer is based on this very good answer here and this one, which contains more explanation and ways to improve.
First let us import your example data:
gif = Import["https://i.stack.imgur.com/k4ChI.gif"];
framenohead = Import["https://i.stack.imgur.com/JMthj.png"];
framewithhead = Import["https://i.stack.imgur.com/pwjwb.png"];
We scale the images down to 32x32 and obtain the pixel data using ImageData. Scaling down increases the robustness against small differences between the slide we are searching for and the video, and it decreases the computation time. Note that you could probably scale down the whole video beforehand. We search for the frame with the head in it; change the next line to search for the other one if you want to try it.
seeked = Flatten[ImageData@ImageResize[framewithhead, {32, 32}], 1];
small = Flatten[ImageData[ImageResize[#, {32, 32}]], 1] & /@ gif;
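Regarding the question about reading pixels from only part of a frame: here is a minimal sketch, assuming the region is an axis-aligned rectangle given in pixel rows and columns (for example, a part of the slide where the speaker rarely stands). ImageTake crops the region, and the result can go through the same ImageResize and ImageData steps as above. The test image and the row and column ranges are made up for illustration.

```mathematica
(* Hypothetical test image: 60 rows, 80 columns, 3 color channels *)
img = Image[RandomReal[1, {60, 80, 3}]];

(* Keep rows 1..30 (counted from the top) and columns 1..40 *)
region = ImageTake[img, {1, 30}, {1, 40}];
ImageDimensions[region]   (* {width, height} = {40, 30} *)

(* Pixel values of the region as a flat list of {r, g, b} triples *)
regionPixels = Flatten[ImageData[region], 1];
Length[regionPixels]      (* 30*40 = 1200 *)
```

If you restrict the comparison to such a region, crop the slide images with the same ranges (after bringing slides and frames to the same size) so that corresponding pixels are compared.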
In order to decide whether two colors are similar we can define the following function. Play around with the threshold value!
SimilarColor[a_, b_] := If[Total[(a - b)^2] < 0.0005, 1, 0];
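To get a feeling for the threshold, here is a small sketch with the threshold turned into a third argument. The name similarColorT and the sample pixel values are made up for this illustration; they are not part of the code above.

```mathematica
(* Same test as SimilarColor, but with the threshold t as a parameter *)
similarColorT[a_, b_, t_] := If[Total[(a - b)^2] < t, 1, 0];

(* Three made-up pixel pairs with squared distances of about 0.0003, 0.0025 and 0.09 *)
pixelsA = {{1., 1., 1.}, {0.5, 0.5, 0.5}, {0., 0., 0.}};
pixelsB = {{0.99, 0.99, 0.99}, {0.45, 0.5, 0.5}, {0.3, 0., 0.}};

(* Number of pixel pairs counted as similar for each threshold *)
Table[
 t -> Total@MapThread[similarColorT[#1, #2, t] &, {pixelsA, pixelsB}],
 {t, {0.0005, 0.005, 0.1}}]
(* {0.0005 -> 1, 0.005 -> 2, 0.1 -> 3} *)
```

A small threshold only accepts nearly identical colors, while a large one also accepts clearly different ones, which makes different slides harder to tell apart.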
Now we just pick the frame with the highest score, i.e. the highest number of pixels in the downscaled image that are similar to the corresponding pixels of the frame we are looking for.
score = Total@MapThread[SimilarColor, {#, seeked}] & /@ small;
Position[score, Max[score]]
This returns 15 in both cases (with and without head)!
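To turn such a frame index into a timestamp for the GIF, one possibility is to add up the display durations of the preceding frames. This is only a sketch: it assumes that the "DisplayDurations" import element returns a list with one duration in seconds per frame, which is worth checking in your version.

```mathematica
(* Assumption: one display duration (in seconds) per GIF frame *)
durations = Import["https://i.stack.imgur.com/k4ChI.gif", "DisplayDurations"];

(* Start time of frame k = sum of the durations of frames 1 .. k-1 *)
startTimes = FoldList[Plus, 0, Most[durations]];

(* Time at which frame 15 starts to be shown *)
startTimes[[15]]
```

For a video with a constant frame rate the same idea reduces to (k - 1)/frameRate for frame k.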
Edit: for the provided slides
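Let's rename your files so that {{1}} becomes {{001}}; with zero-padded numbers, FileNames lists them in numerical order. You can do it with something like the following, which pads the two-digit names: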
Do[RenameFile[
NotebookDirectory[] <> "\\frames\\{{" <> ToString[i] <> "}}.jpg",
NotebookDirectory[] <> "\\frames\\{{0" <> ToString[i] <> "}}.jpg"], {i, 10, 99}]
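As an alternative to renaming, one could leave the files as they are and sort the file names by the number they contain. A minimal sketch, assuming every name contains exactly one run of digits; frameNumber is a hypothetical helper and the file names are made-up examples:

```mathematica
(* Made-up file names in plain string order, where "10" sorts before "2" *)
files = {"{{1}}.jpg", "{{10}}.jpg", "{{2}}.jpg"};

(* Extract the first run of digits from the base name and convert it to an integer *)
frameNumber[f_] := FromDigits@First@StringCases[FileBaseName[f], DigitCharacter ..];

SortBy[files, frameNumber]
(* {"{{1}}.jpg", "{{2}}.jpg", "{{10}}.jpg"} *)
```

The same SortBy call can be applied to the output of FileNames before importing.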
Now we import all those images, for example:
frames = Import[#] & /@
FileNames["*.jpg", NotebookDirectory[] <> "\\frames"];
slides = Import[#] & /@
FileNames["*.jpg", NotebookDirectory[] <> "\\slides"];
We can scale down the images. I used a slightly higher resolution because of some details in your slides.
res = 48;
smallframes =
Flatten[ImageData[ImageResize[#, {res, res}]], 1] & /@ frames;
smallslides =
Flatten[ImageData[ImageResize[#, {res, res}]], 1] & /@ slides;
I also slightly changed the color comparison: each channel is now tested separately. I'm not actually sure that you need to change it, but why not try a different one :)
SimilarColor[a_, b_] := If[And @@ ((# < 0.05) & /@ ((a - b)^2)), 1, 0];
Now comes the heavy calculation (takes a couple of minutes on my laptop): We label each frame by the slide that's in the background.
labels = Monitor[Table[
With[{score =
Total@MapThread[SimilarColor, {#, smallframes[[i]]}] & /@ smallslides},
Position[score, Max[score]]][[1, 1]], {i, Length[frames]}], i]
{2,2,2,2,2,2,2,2,2,2,2,2,2,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,6,6,6,6,6,6,6,6,6,6,6,7,7,7,7,7,7,7,7,7,8,8,8,8,8,8,8,8,9,9,9,9,9,9,10,10,10,10,10,10,10,10,10,10,10,11,11,11,11,12,12,12,13,13,13,13,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,20,20,20,20,20,20,20,20,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,3,3,3,3,3,3,3,3,3,3}
This list should contain all the information you need, in particular the first occurrence of slide 9 (with the photo mask) is
Min@Position[labels, 9]
95
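The same lookup can be done for every slide at once and converted into timestamps. This is a sketch under the assumption that the frames were sampled from the video at a constant rate; the value of fps and the shortened label list are made up, so use the full labels list and your actual sampling rate instead.

```mathematica
(* Shortened, made-up label list; use the full labels from above in practice *)
sampleLabels = {2, 2, 2, 4, 4, 5, 5, 5, 3};

(* Assumed sampling rate of the frames, in frames per second *)
fps = 2;

(* Index of the first frame that shows each slide *)
firstFrame = AssociationMap[Min@Position[sampleLabels, #] &, Union[sampleLabels]]
(* <|2 -> 1, 3 -> 9, 4 -> 4, 5 -> 6|> *)

(* Frame k starts at (k - 1)/fps seconds *)
firstTime = (# - 1)/fps & /@ firstFrame
(* <|2 -> 0, 3 -> 4, 4 -> 3/2, 5 -> 5/2|> *)
```

Slides that are never the best match for any frame simply do not appear in the result.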