FFmpeg
This code implements a filter to remove annoying TV logos and other annoying images placed onto a video stream.
Data Fields | |
const AVClass * | class |
char * | filename |
int *** | mask |
int | max_mask_size |
int | mask_w |
int | mask_h |
uint8_t * | full_mask_data |
FFBoundingBox | full_mask_bbox |
uint8_t * | half_mask_data |
FFBoundingBox | half_mask_bbox |
This code implements a filter to remove annoying TV logos and other annoying images placed onto a video stream.
It works by filling in the pixels that make up the logo with neighboring pixels. The transform is very loosely based on a Gaussian blur, but it is different enough to merit its own paragraph later on. It is a major improvement on the old delogo filter: it uses a better blurring algorithm, and it uses a bitmap to describe an arbitrary shape that generally fits the logo much more tightly than a rectangle.
The logo removal algorithm has two key points. The first is that it distinguishes between pixels in the logo and those not in the logo by using the passed-in bitmap. Pixels not in the logo are copied over directly without being modified and they also serve as source pixels for the logo fill-in. Pixels inside the logo have the mask applied.
At init time the bitmap is reprocessed internally, and the distance to the nearest edge of the logo (Manhattan distance), along with a little extra to remove rough edges, is stored in each pixel. This is done using an in-place erosion algorithm that increments each pixel that survives a given erosion pass. Once every pixel is eroded, the maximum value is recorded, and a set of masks from size 0 up to this size is generated. The masks are circular binary masks, where each pixel within a radius N (where N is the size of the mask) is a 1, and all other pixels are a 0. Although a Gaussian mask would be more mathematically accurate, a binary mask works better in practice because we generally do not use the central pixels in the mask (they are in the logo region), so a Gaussian mask would cause too little blur and thus a very unstable image.
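The erosion pass and the circular masks described above can be sketched as follows. This is a minimal illustration, not the code in vf_removelogo.c: the names `erode_to_distance` and `make_circular_mask` are hypothetical, image-border pixels are assumed to lie outside the logo, and the "little extra" added to smooth rough edges is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, not the FFmpeg function: convert a binary bitmap
 * (non-zero = logo) into a Manhattan distance map, in place.
 * Returns the largest distance stored. */
static int erode_to_distance(uint8_t *data, int w, int h)
{
    int x, y, pass = 0, changed = 1, max = 0;

    assert(data && w > 2 && h > 2);

    /* Assumption: image-border pixels count as "outside the logo". */
    for (y = 0; y < h; y++)
        for (x = 0; x < w; x++) {
            int border = x == 0 || y == 0 || x == w - 1 || y == h - 1;
            data[y * w + x] = (!border && data[y * w + x]) ? 1 : 0;
        }

    /* Each pass erodes one layer; a pixel that survives is incremented.
     * The pass cap keeps the uint8_t counters from wrapping. */
    while (changed && pass < 254) {
        changed = 0;
        pass++;
        for (y = 1; y < h - 1; y++)
            for (x = 1; x < w - 1; x++) {
                uint8_t *p = &data[y * w + x];
                if (p[0] >= pass && p[-1] >= pass && p[1] >= pass &&
                    p[-w] >= pass && p[w] >= pass) {
                    p[0]++;
                    changed = 1;
                }
            }
    }

    /* Record the maximum value, i.e. the largest mask size needed. */
    for (x = 0; x < w * h; x++)
        if (data[x] > max)
            max = data[x];
    return max;
}

/* Hypothetical helper: allocate a (2n+1) x (2n+1) binary mask whose entries
 * are 1 within radius n of the centre and 0 elsewhere. Caller frees it. */
static int *make_circular_mask(int n)
{
    int side = 2 * n + 1, dx, dy;
    int *mask;

    assert(n >= 0);
    mask = malloc(sizeof(*mask) * side * side);
    if (!mask)
        return NULL;
    for (dy = -n; dy <= n; dy++)
        for (dx = -n; dx <= n; dx++)
            mask[(dy + n) * side + (dx + n)] = dx * dx + dy * dy <= n * n;
    return mask;
}
```

Because a surviving pixel is only ever raised from `pass` to `pass + 1`, a neighbor that was already incremented earlier in the same pass still compares as `>= pass`, which is why the erosion can run in place without a second buffer.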
The mask is applied in a special way. Namely, only pixels in the mask that line up with pixels outside the logo are used. The dynamic mask size means that the mask is just big enough for its edges to touch pixels outside the logo, so the blurring is kept to a minimum and at least the first boundary condition is met (that the image function itself is continuous), even if the second boundary condition (that the derivative of the image function is continuous) is not met. A masking algorithm that does preserve the second boundary condition (perhaps something based on a highly modified bicubic algorithm) should offer even better results on paper, but the noise in a typical TV signal should make anything based on derivatives hopelessly noisy.
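A rough sketch of this selective averaging for a single plane is shown below. It assumes a distance map in which 0 marks pixels outside the logo and any other value is the mask radius for that pixel. The function name `fill_logo_pixel` is hypothetical, and the circular mask is tested inline rather than looked up in a precomputed table, so this illustrates the idea rather than the filter's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: compute the output value for pixel (x, y).
 * src  - source plane, w * h bytes
 * dist - distance map, 0 = outside the logo, N > 0 = mask radius */
static uint8_t fill_logo_pixel(const uint8_t *src, const uint8_t *dist,
                               int w, int h, int x, int y)
{
    int n, dx, dy, sum = 0, count = 0;

    assert(src && dist && x >= 0 && x < w && y >= 0 && y < h);

    n = dist[y * w + x];
    if (n == 0)
        return src[y * w + x];          /* outside the logo: copy as-is */

    for (dy = -n; dy <= n; dy++)
        for (dx = -n; dx <= n; dx++) {
            int sx = x + dx, sy = y + dy;

            if (dx * dx + dy * dy > n * n)
                continue;               /* not covered by the circular mask */
            if (sx < 0 || sx >= w || sy < 0 || sy >= h)
                continue;               /* off the image */
            if (dist[sy * w + sx] != 0)
                continue;               /* inside the logo: not a source */
            sum += src[sy * w + sx];
            count++;
        }

    /* Rounded mean of the usable pixels; fall back to the source value
     * if the mask reached no pixel outside the logo. */
    return count ? (uint8_t)((sum + count / 2) / count) : src[y * w + x];
}
```

With a pure Manhattan distance as the radius, the nearest non-logo pixel always lies within the circle, so pixels near the logo edge are averaged from only a few close neighbors while pixels deep inside draw on a wider ring.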
Definition at line 81 of file vf_removelogo.c.
const AVClass* RemovelogoContext::class |
Definition at line 82 of file vf_removelogo.c.
char* RemovelogoContext::filename |
Definition at line 83 of file vf_removelogo.c.
int*** RemovelogoContext::mask |
Definition at line 86 of file vf_removelogo.c.
int RemovelogoContext::max_mask_size |
Definition at line 87 of file vf_removelogo.c.
int RemovelogoContext::mask_w |
Definition at line 88 of file vf_removelogo.c.
int RemovelogoContext::mask_h |
Definition at line 88 of file vf_removelogo.c.
uint8_t* RemovelogoContext::full_mask_data |
Definition at line 90 of file vf_removelogo.c.
FFBoundingBox RemovelogoContext::full_mask_bbox |
Definition at line 91 of file vf_removelogo.c.
uint8_t* RemovelogoContext::half_mask_data |
Definition at line 92 of file vf_removelogo.c.
FFBoundingBox RemovelogoContext::half_mask_bbox |
Definition at line 93 of file vf_removelogo.c.