关键词:Image retargeting, image resizing, image cropping
相关参考文献
- paper on seam carving: http://graphics.cs.cmu.edu/courses/15-463/2007_fall/hw/proj2/imret.pdf
- tutorial: http://cs.brown.edu/courses/cs129/results/proj3/taox
- tutorial: http://www.cs.cmu.edu/afs/andrew/scs/cs/15-463/f07/proj2/www/wwedler/
Introduction
在图像处理中,传统图像缩放方法难以在改变尺寸时保留重要特征,本文提出的接缝(seam)裁剪技术可有效解决此问题,通过定义并利用接缝(seam),实现以内容感知的方式调整图像大小,同时保持图像的视觉连贯性和重要特征,还能支持多种能量函数。
现有方法如基于面部检测、视觉显著性等的 top-down 和 bottom-up 方法,多依赖传统缩放与裁剪操作。
此外,还有如自适应网格布局、非线性缩放、分解图像等其他尝试,但各有局限。
Overview
Seam carving
是一种内容感知图像缩放技术,它通过删除或插入图像中的无关像素(通常是能量较低的像素)来调整图像大小。
- 图像尺寸缩小方法:采用如
的能量函数衡量像素重要性,旨在去除与周围融合、不易察觉的像素。
- 确定要移除的像素:运用动态规划,先从第二行到最后一行遍历图像,计算每个像素点的累积最小能量,找到最小能量路径的终点,再回溯确定最优接缝路径。
- 具体操作示例:以图像每行移除一个像素的方式减少宽度,,路径上的每个像素从图像中移除以整体减少宽度;若要进一步减少宽度,可移除次低能量路径上的像素,该过程可持续到达到期望宽度。
% 主程序
img = imread('imgs/broadway_tower.jpg'); % 读取图像
img = im2double(img); % 转换为双精度类型numSeamsToRemove = 50; % 要移除的seam数量for i = 1:numSeamsToRemoveenergyMap = computeEnergy(img); % 计算能量图seam = findVerticalSeam(energyMap); % 找到最小的垂直seamimg = removeVerticalSeam(img, seam); % 移除seam
end
1.Calculate energy map
- Calculate energy for each color channel by adding absolute value of gradient in x direction and absolute value of gradient in y direction.
作者测试了多种能量函数,如梯度和范数、显著性度量、Harris 角点度量、眼睛注视测量、面部检测器输出等,对比不同函数在引导接缝裁剪时的效果。
- Sum energy of all color channels into one 2D image to create energy map.
% 计算能量图的函数
function energyMap = computeEnergy(img)grayImg = rgb2gray(img); % 将图像转换为灰度图像[dx, dy] = gradient(double(grayImg)); % 计算x和y方向的梯度energyMap = abs(dx) + abs(dy); % 能量图是x和y梯度绝对值之和
end
2.Find minimum seam from top to bottom edge
①an accumulated cost matrix must be constructed by starting at the top edge and iterating through the rows of the energy map.
The value of a pixel in the accumuluated cost matrix is equal to its corresponding pixel value in the energy map added to the minimum of its three top neighbors (top-left, top-center, and top-right) from the accumulated cost matrix.
For special case:
- The value of the very top row (which obviously has no rows above it) of the accumulated cost matrix is equal to the energy map.
- Boundary conditions in this step are also taken into consideration. If a neighboring pixel is not available due to the left or right edge, it is simply not used in the minimum of top neighbors calculation.
②The minimum seam is then calculated by backtracing from the bottom to the top edge.
- First, the minimum value pixel in the bottom row of the accumulated cost matrix is located. This is the bottom pixel of the minimum seam.
- The seam is then traced back up to the top row of the accumulated cost matrix by following. The minimum seam coordinates are recorded.
An additional enhancement to this step that gives a more accurate energy value is described in the forward energy section.
动态规划表构建
-
初始化:
- 创建一个与输入能量图
energyMap
大小相同的累加能量图dpTable
。 - 将
dpTable
的第一行设置为energyMap
的第一行值,因为对于最顶层的像素来说,它们没有上方的能量可以累加。
- 创建一个与输入能量图
-
填充DP表:对于
dpTable
中的每个非首行元素dpTable(i, j)
,其值应等于energyMap(i, j)
加上它上方相邻三个位置中最小的累加能量值。这三个位置分别是正上方、左上方和右上方(如果存在的话)。
最优接缝回溯
-
确定终点:查找
dpTable
最后一行中的最小值,这代表了到达底部时累积能量最低的位置。 -
回溯接缝:从最后一行的最小值开始向上回溯,每次选择使累计能量最小的那个前驱位置,直到回到顶部。这样就可以获得一条完整的接缝路径。
% 找到垂直seam的函数
function seam = findVerticalSeam(energyMap)[rows, cols] = size(energyMap);M = energyMap; % M为累积能量图backtrack = zeros(rows, cols); % 用于记录路径% 动态规划计算累积能量图for i = 2:rowsfor j = 1:colsif j == 1[val, idx] = min([M(i-1, j), M(i-1, j+1)]);backtrack(i, j) = idx + j - 1;elseif j == cols[val, idx] = min([M(i-1, j-1), M(i-1, j)]);backtrack(i, j) = idx + j - 2;else[val, idx] = min([M(i-1, j-1), M(i-1, j), M(i-1, j+1)]);backtrack(i, j) = idx + j - 2;endM(i, j) = M(i, j) + val;endend% 回溯寻找最小seamseam = zeros(rows, 1);[~, seam(rows)] = min(M(rows, :));for i = rows-1:-1:1seam(i) = backtrack(i+1, seam(i+1));end
end
3.Remove minimum seam from top to bottom edge
The minimum seam coordinates from Step 2 are then used to remove the minimum seam. All the pixels in each row after the pixel to be removed are shifted over one column. Finally, the width of the image has been reduced by exactly one pixel.
4.Repeat Steps 1 - 3 until desired number of seams are removed
The energy map needs to be recomputed everytime a seam is removed.
% 删除垂直seam的函数
function img = removeVerticalSeam(img, seam)[rows, cols, ~] = size(img);for i = 1:rowsimg(i, seam(i):cols-1, :) = img(i, seam(i)+1:cols, :); % 删除seamendimg = img(:, 1:cols-1, :); % 缩小一列
end
Forward Energy
The forward energy enhancement is implemented by taking into account the energy created from the new neighbors after the seam is removed.
The original algorithm is modified by adding this extra forward energy during the creation of the accumulated cost matrix. When deciding which top neighbor (top-left, top-center, top-right) in the accumulated cost matrix has the minimum energy value, the forward energy from the corresponding new neighbors is added to each top neighbor's accumulated cost value.
Seam Insertion
- 与 seam removal(缝隙移除)相反
- 插入 n 条缝隙需先进行 n 次缝隙移除
- 记录最小缝隙的坐标和顺序
- 目的是找到最小缝隙坐标和顺序,不实际修改原图
- 利用坐标按移除顺序向原图插入新缝隙
- 新缝隙像素值来自上下或左右邻居的平均值
在 seam insertion(接缝插入)中不应使用 forward energy(前向能量),原因是新邻居的前向能量与扩展图像无关
Object Removal
- 创建待移除物体的二值掩码(binary mask)进行预处理。
- 测量掩码区域面积以决定采用从上到下或从左到右的接缝(seams)移除物体更高效。
- 在生成图像能量图(energy map)时,给掩码区域下的区域赋予很高的负值权重,确保最小接缝穿过掩码区域。
- 进行最小接缝移除,同时从掩码中也移除相同的最小接缝以保证下一步能量计算准确。
- 移除掩码区域后,使用接缝插入(seam insertion)使图像恢复到原始尺寸。
- 接缝移除时使用前向能量(Forward energy),但接缝插入时不使用。
Video
- 能量图组成:由空间能量图和时间能量图线性组合而成。
- 空间能量图:对所有视频帧进行算法概述中的能量图计算,然后在每个像素坐标处取最大值得到。
- 时间能量图:对每个像素沿 z 轴取帧之间的梯度得到。
- 线性组合公式:energy_map = (alpha)*spatial_energy + (1 - alpha)*temporal_energy,如 alpha 值为 0.5 则是两种能量平均,使用 alpha 值为 0.3(参考 Avidan, Rubinstein, and Shamir),理由是人类眼睛对运动伪影更敏感。
- 找到能量图后计算最小接缝。
- 从视频的所有帧中移除相同的最小接缝,示例中视频宽度从原始视频中移除了 100 个像素。
Failure Case
- Failure case analysis
- Distorted players (Larry Bird and Magic Johnson) instead of background
- Reason: lower energy in some areas of players' skin and uniforms compared to crowd areas; smoother gradient in these areas than in crowd
- Poor leg shape preservation, especially where legs overlap
- Possible algorithm enhancement: tell algorithm certain areas are important so crowd is removed instead
The Operation
energy function
①Seam carving can support several types of energy functions such as gradient magnitude, entropy, visual saliency, eye-gaze movement, and more.
to allow interactive control, we also provide a scribble based user interface for adding weights to the energy of an image and guide the desired results.
②Computing the seam can be done in a variety of ways, including Dijkstra’s shortest path algorithm [1998], dynamic programming [2001] or graph cuts [2001]
HoG的原理与步骤
Histogram of Oriented Gradients (HoG) 是一种常用于计算机视觉和图像处理的特征描述方法,最初用于行人检测。它通过计算图像局部梯度的方向和大小来捕捉物体的边缘和形状信息,从而对图像进行特征编码。
1. 梯度计算
对图像中的每个像素计算其梯度方向和大小。梯度是通过对灰度图像进行水平和垂直方向的差分计算得到的,通常使用Sobel算子。
2. 将图像分为多个小的单元格(Cells)
图像被划分为较小的区域(称为单元格,通常是 8x8 或 16x16 的像素块)。每个单元格中,统计梯度方向的直方图。
3. 梯度方向直方图
在每个单元格内,将像素的梯度方向划分为一定的角度区间(通常是 9 个方向,每个方向占用 20°),并将每个方向上的梯度幅值累计到对应的方向直方图中。
例如,假设一个单元格内的像素梯度方向在 0° 到 180° 之间,HoG会将方向分成 9 个区间,如 0-20°, 20-40°,直到 160-180°,然后根据每个像素的梯度方向和幅值对直方图进行加权累计。
4. 归一化块(Normalization Blocks)
多个相邻的单元格组合成一个块(Block),通常是 2x2 或 3x3 的单元格区域。为了消除局部亮度变化的影响,必须对这些块内的直方图进行归一化。归一化增强了HoG特征的抗光照变化能力。
5. 构建HoG特征向量
将所有归一化的直方图连接在一起,形成整张图像的HoG特征向量。这个特征向量可以用于图像分类、检测等任务。
HoG的优势
局部性:HoG通过分析局部区域的梯度方向分布来捕捉形状信息,因此对图像中的局部几何变化较为敏感,能很好地描述局部特征。
抗光照变化:梯度方向主要反映图像边缘和形状信息,而与图像的亮度变化无关。归一化步骤也进一步增强了对光照变化的鲁棒性。
计算简单:HoG相较于其他复杂特征提取方法(如SIFT、SURF等)计算量适中,且能够很好地应用于实时检测任务中。
良好的检测性能:HoG与支持向量机(SVM)结合在物体检测中,特别是行人检测中表现出了较高的精度。