Panning and zoom detection algorithm

MasamuneXGP

Member
May 18, 2007
Greetings. There's a programming problem that I've been puzzling over for some time now, and I'm hoping to get some advice. Here's what I'm trying to do.

In video files, especially anime, the camera often zooms or pans across a still image. What I need is a program that determines exactly how far the camera has panned or zoomed between one frame and another. In other words, given a later frame, I need to find out how many pixels I would have to shift it and by what percentage I would have to scale it to get back the original frame.

I realize that despite how simple the problem sounds and how easy it is for humans to do, coding this will be no small task. From what I've researched so far, I gather that what I want to do is similar to the motion estimation ("motion detection") algorithms used in video compression. But I could really use a pointer in the right direction. Can anyone give me advice on this or tell me where I might find some help?
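
For illustration only, here is a minimal sketch of the compression-style idea applied to a whole frame: try every small shift and keep the one with the lowest mean absolute difference over the overlapping area. It assumes grayscale frames stored as lists of rows, a pure pan with no zoom, and a shift of at most `max_shift` pixels. The names `estimate_shift`, `scene`, and `max_shift` are made up for this example and do not come from any particular codec or library.

```python
def estimate_shift(prev, curr, max_shift=4):
    """Return (dx, dy) such that curr[y + dy][x + dx] best matches prev[y][x].

    Exhaustive search over integer shifts, scored by the mean absolute
    difference on the region where the two frames overlap.
    """
    h, w = len(prev), len(prev[0])
    best = None  # (score, dx, dy)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            total = 0
            count = 0
            # Only visit pixels whose shifted position stays inside curr.
            for y in range(max(0, -dy), min(h, h - dy)):
                for x in range(max(0, -dx), min(w, w - dx)):
                    total += abs(prev[y][x] - curr[y + dy][x + dx])
                    count += 1
            if count == 0:
                continue
            score = total / count
            if best is None or score < best[0]:
                best = (score, dx, dy)
    return best[1], best[2]


def scene(x, y):
    """Hypothetical textured background, used only to build test frames."""
    return (x * 37 + y * 101 + x * y * 13) % 251


# Two 16x16 views of the same background: in the second one the picture
# content sits 3 pixels further right and 2 pixels further up.
prev = [[scene(x, y) for x in range(16)] for y in range(16)]
curr = [[scene(x - 3, y + 2) for x in range(16)] for y in range(16)]

print(estimate_shift(prev, curr))  # (3, -2)
```

The brute-force search is slow on real frames and gets confused by large flat areas, so treat it as a starting point rather than a finished detector.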
 

Markbnj

Elite Member, Moderator Emeritus
Moderator
Sep 16, 2005
www.markbetz.net
It doesn't sound even a little bit simple :). You're talking about detecting the physical parameters of the virtual world the scene was drawn in, and then estimating where in that virtual world the camera is positioned at any given time by detecting changes in the displayed images. Even assuming that you can ignore cuts to a new scene, and make some assumptions about the scale of some easily recognizable figure (like a character), what you propose is in the category of "damn hard to do."

Assuming a very constrained problem, i.e. a smooth pan from scene a to scene b with recognizable features persistent in both, and no zooming, you essentially need to detect an object, assign it some dimensions based on scale assumptions or comparison to another object, and then track it from point a to point b and measure its movement. Dealing with zooms would be an extension of this in which you measure the object's scale factor relative to its dimensions when it first appeared in the scene.
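
To make the "track it and measure its scale factor" step concrete, here is a rough sketch that assumes the hard part is already done: you have a few matching points on the object in both frames. It fits a single zoom factor plus a pan (no rotation) by least squares. The function name `fit_zoom_and_pan` and the sample coordinates are hypothetical, chosen only for this example.

```python
def fit_zoom_and_pan(src_pts, dst_pts):
    """Least-squares fit of dst ~= s * src + (tx, ty), with no rotation.

    src_pts and dst_pts are equal-length lists of (x, y) pairs marking the
    same features in the earlier and the later frame.
    """
    n = len(src_pts)
    if n < 2 or n != len(dst_pts):
        raise ValueError("need at least two matched point pairs")

    # Centroids of both point sets.
    sx = sum(p[0] for p in src_pts) / n
    sy = sum(p[1] for p in src_pts) / n
    dx = sum(q[0] for q in dst_pts) / n
    dy = sum(q[1] for q in dst_pts) / n

    # Scale is the ratio of how the points spread around their centroids.
    num = sum((p[0] - sx) * (q[0] - dx) + (p[1] - sy) * (q[1] - dy)
              for p, q in zip(src_pts, dst_pts))
    den = sum((p[0] - sx) ** 2 + (p[1] - sy) ** 2 for p in src_pts)
    if den == 0:
        raise ValueError("source points must not all coincide")

    s = num / den
    # The pan is whatever offset is left once the scale is applied.
    return s, dx - s * sx, dy - s * sy


# Corners of a 10x10 object, then the same corners after a 1.5x zoom
# and a pan of (4, -2) pixels.
before = [(0, 0), (10, 0), (0, 10), (10, 10)]
after = [(4, -2), (19, -2), (4, 13), (19, 13)]
print(fit_zoom_and_pan(before, after))  # (1.5, 4.0, -2.0)
```

Using more than two points averages out small tracking errors, but a single badly mismatched point will still skew the result.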

If there are any existing implementations, and I don't doubt there are, they probably live in expensive image/video analysis suites.

The application has a big impact too. If you just need to make this measurement once, interactively, then it is a lot easier than trying to do it at runtime without human intervention.
 

MasamuneXGP

I thought about having the user define a reference point at the start and end of the clip, but that would only work for smooth, linear pans. One of the main applications I'd like the program to handle is a scene where the camera shakes or vibrates in random directions (as often happens in anime). Obviously, asking the user to define a reference point for every frame is something I'd like to avoid if possible.
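
One way to handle a shaky camera without per-frame user input, sketched here under assumptions: if each consecutive pair of frames yields an estimated zoom and pan, those small steps can be chained to get every frame's position relative to the first one, and then inverted to find what it takes to get back to the original. Each step is a tuple `(s, tx, ty)` meaning `new = s * old + (tx, ty)`. The names `accumulate` and `invert` and the sample numbers are hypothetical.

```python
def accumulate(steps):
    """Chain frame-to-frame transforms into frame-0-to-frame-i transforms.

    steps[i] maps coordinates in frame i to coordinates in frame i + 1.
    The result has one more entry than steps; entry 0 is the identity.
    """
    total = (1.0, 0.0, 0.0)
    path = [total]
    for s, tx, ty in steps:
        ts, ttx, tty = total
        # Apply the accumulated transform first, then this step.
        total = (s * ts, s * ttx + tx, s * tty + ty)
        path.append(total)
    return path


def invert(transform):
    """Return the transform that undoes new = s * old + (tx, ty)."""
    s, tx, ty = transform
    return (1.0 / s, -tx / s, -ty / s)


# A jittery three-step sequence: a pan, a 2x zoom, then another pan.
steps = [(1.0, 3.0, -2.0), (2.0, 0.0, 0.0), (1.0, -1.0, 4.0)]
path = accumulate(steps)
print(path[-1])          # frame 0 -> last frame
print(invert(path[-1]))  # zoom and shift needed to get back to frame 0
```

Errors in the individual steps add up along the chain, so for long shots it may be safer to re-estimate against the first frame (or a keyframe) every so often.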

I get that it's not going to be a walk in the park, but I'd really like to give it a shot. Are there any open source (and preferably somewhat simple) projects that deal with things remotely similar to this type of problem?