Pyramid (image processing)

Article Id: WHEBN0015966023
Title: Pyramid (image processing)  
Author: World Heritage Encyclopedia
Language: English
Subject: OpenWebGlobe, Hierarchical modulation, GLOH, Harris affine region detector, Hessian affine region detector
Collection: Computer Vision, Image Processing
Publisher: World Heritage Encyclopedia

Visual representation of an image pyramid with 5 levels

Pyramid, or pyramid representation, is a type of multi-scale signal representation developed by the computer vision, image processing and signal processing communities, in which a signal or an image is subjected to repeated smoothing and subsampling. Pyramid representation is a predecessor to scale-space representation and multiresolution analysis.

Contents

  • Pyramid generation
  • Pyramid generation kernels
    • Gaussian pyramid
    • Laplacian pyramid
    • Steerable pyramid
  • Applications of pyramids
    • Alternative Representation
    • Detail Manipulation
  • See also
  • References
  • External links

Pyramid generation

There are two main types of pyramids: lowpass and bandpass.

A lowpass pyramid is made by smoothing the image with an appropriate smoothing filter and then subsampling the smoothed image, usually by a factor of 2 along each coordinate direction. The resulting image is then subjected to the same procedure, and the cycle is repeated multiple times. Each cycle of this process results in a smaller image with increased smoothing, but with decreased spatial sampling density (that is, decreased image resolution). If illustrated graphically, the entire multi-scale representation will look like a pyramid, with the original image on the bottom and each cycle's resulting smaller image stacked one atop the other.
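
As a rough illustration of how quickly the levels shrink, the following sketch lists the size of each level and the total storage relative to the original image. The function name `pyramid_shapes` and the 512 x 512 input are assumptions made for this example, as is the convention that subsampling keeps every second pixel starting at index 0.

```python
def pyramid_shapes(height, width, levels):
    """Return the (height, width) of each level, finest level first.

    Assumes subsampling keeps every second pixel starting at index 0,
    so a side of length n shrinks to ceil(n / 2).
    """
    shapes = [(height, width)]
    for _ in range(levels - 1):
        height = (height + 1) // 2
        width = (width + 1) // 2
        shapes.append((height, width))
    return shapes


shapes = pyramid_shapes(512, 512, 5)
total_pixels = sum(h * w for h, w in shapes)
overhead = total_pixels / (512 * 512)

print(shapes)        # [(512, 512), (256, 256), (128, 128), (64, 64), (32, 32)]
print(total_pixels)  # 349184
print(overhead)      # 1.33203125
```

Because each level holds about a quarter of the pixels of the level below it, the whole pyramid needs less than 4/3 of the storage of the original image, however many levels are stacked.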

A bandpass pyramid is made by forming the difference between images at adjacent levels in the pyramid. Because adjacent levels have different resolutions, the coarser image is first interpolated to the resolution of the finer one, which makes it possible to compute pixelwise differences.

Pyramid generation kernels

A variety of different smoothing kernels have been proposed for generating pyramids.[1][2][3][4][5][6] Among the suggestions that have been given, the binomial kernels arising from the binomial coefficients stand out as a particularly useful and theoretically well-founded class.[2][7][8][9] Thus, given a two-dimensional image, we may apply the (normalized) binomial filter (1/4, 1/2, 1/4) typically twice or more along each spatial dimension and then subsample the image by a factor of two. This operation may then proceed as many times as desired, leading to a compact and efficient multi-scale representation. If motivated by specific requirements, intermediate scale levels may also be generated where the subsampling stage is sometimes left out, leading to an oversampled or hybrid pyramid.[10] With the increasing computational efficiency of CPUs available today, it is in some situations also feasible to use wider support Gaussian filters as smoothing kernels in the pyramid generation steps.
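
A minimal one-dimensional sketch of this smoothing and subsampling step is given below. It checks that applying the binomial filter (1/4, 1/2, 1/4) twice is equivalent to a single pass with the five-tap kernel (1, 4, 6, 4, 1)/16. The helper names `convolve_full` and `smooth_1d`, the test signal, and the choice to replicate border samples are assumptions of this example and are not prescribed by the text.

```python
def convolve_full(a, b):
    """Full discrete convolution of two 1-D sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def smooth_1d(signal, kernel):
    """Filter with a symmetric odd-length kernel, replicating border samples."""
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)  # clamp index at the borders
            acc += weight * signal[j]
        out.append(acc)
    return out


binomial3 = [0.25, 0.5, 0.25]
binomial5 = convolve_full(binomial3, binomial3)  # (1, 4, 6, 4, 1) / 16

signal = [0, 0, 0, 16, 0, 0, 0, 0]  # an impulse away from the borders
twice = smooth_1d(smooth_1d(signal, binomial3), binomial3)
once_wide = smooth_1d(signal, binomial5)
reduced = twice[::2]  # subsample by a factor of two

print(binomial5)  # [0.0625, 0.25, 0.375, 0.25, 0.0625]
print(twice)      # [0.0, 1.0, 4.0, 6.0, 4.0, 1.0, 0.0, 0.0]
print(reduced)    # [0.0, 4.0, 4.0, 0.0]
```

For a two-dimensional image the same one-dimensional filter would be applied along the rows and then along the columns before subsampling in both directions.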

Gaussian pyramid

In a Gaussian pyramid, each successive image is smoothed with a Gaussian average (Gaussian blur) and scaled down. Each pixel then contains a local average that corresponds to a pixel neighborhood on a lower level of the pyramid. This technique is used especially in texture synthesis.
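
The construction can be sketched for a two-dimensional image stored as a list of rows. This is only an illustration under stated assumptions: the Gaussian is approximated by the separable five-tap binomial kernel (1, 4, 6, 4, 1)/16, border pixels are replicated, and the names `blur_rows`, `reduce_level` and `gaussian_pyramid` are hypothetical.

```python
# Assumption: a 5-tap binomial kernel stands in for a sampled Gaussian.
KERNEL = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]


def blur_rows(image):
    """Smooth every row with KERNEL, replicating pixels at the borders."""
    radius = len(KERNEL) // 2
    width = len(image[0])
    return [
        [
            sum(w * row[min(max(x + k - radius, 0), width - 1)]
                for k, w in enumerate(KERNEL))
            for x in range(width)
        ]
        for row in image
    ]


def transpose(image):
    return [list(column) for column in zip(*image)]


def reduce_level(image):
    """Blur along rows, then along columns, then keep every second pixel."""
    blurred = transpose(blur_rows(transpose(blur_rows(image))))
    return [row[::2] for row in blurred[::2]]


def gaussian_pyramid(image, levels):
    """Return a list of images, finest level first."""
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(reduce_level(pyramid[-1]))
    return pyramid


# An 8 x 8 checkerboard as a small test image.
image = [[100.0 * ((x + y) % 2) for x in range(8)] for y in range(8)]
pyramid = gaussian_pyramid(image, 3)
for level in pyramid:
    print(len(level), "x", len(level[0]))
```

Each pixel of a coarser level is a weighted average of a 5 x 5 neighborhood on the level below it, which is the local average described above.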

Laplacian pyramid

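A Laplacian pyramid is closely related to a Gaussian pyramid, but each level stores the difference between a Gaussian pyramid level and an interpolated (expanded) version of the next coarser level, rather than the blurred image itself. Only the coarsest level is kept as a lowpass image, so the original image can be recovered by repeatedly expanding the coarser level and adding the stored differences. This technique can be used in image compression.[11]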
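
A one-dimensional sketch in the spirit of the Burt and Adelson construction [11] is shown below: each level stores the difference between a lowpass level and an interpolated copy of the next coarser level, and the signal is rebuilt by reversing the process. The three-tap binomial filter, the linear interpolation in `expand`, and all function names are assumptions chosen to keep the example short, not the filters of the original paper.

```python
def smooth(signal):
    """3-tap binomial (1/4, 1/2, 1/4) smoothing with replicated borders."""
    n = len(signal)
    return [
        0.25 * signal[max(i - 1, 0)] + 0.5 * signal[i] + 0.25 * signal[min(i + 1, n - 1)]
        for i in range(n)
    ]


def reduce_level(signal):
    """Smooth, then keep every second sample."""
    return smooth(signal)[::2]


def expand(signal, size):
    """Upsample to `size` samples by linear interpolation between neighbours."""
    last = len(signal) - 1
    return [
        0.5 * (signal[min(i // 2, last)] + signal[min((i + 1) // 2, last)])
        for i in range(size)
    ]


def build_laplacian(signal, levels):
    """Return the difference levels (finest first) followed by the coarsest lowpass level."""
    gaussian = [signal]
    for _ in range(levels - 1):
        gaussian.append(reduce_level(gaussian[-1]))
    laplacian = []
    for fine, coarse in zip(gaussian, gaussian[1:]):
        predicted = expand(coarse, len(fine))
        laplacian.append([f - p for f, p in zip(fine, predicted)])
    laplacian.append(gaussian[-1])  # the coarsest level is kept as-is
    return laplacian


def reconstruct(laplacian):
    """Invert build_laplacian by expanding and adding the differences back."""
    signal = laplacian[-1]
    for detail in reversed(laplacian[:-1]):
        predicted = expand(signal, len(detail))
        signal = [d + p for d, p in zip(detail, predicted)]
    return signal


signal = [3.0, 7.0, 1.0, 8.0, 2.0, 9.0, 4.0, 6.0]
levels = build_laplacian(signal, 3)
restored = reconstruct(levels)
print([len(level) for level in levels])  # [8, 4, 2]
```

Reconstruction is exact up to floating-point rounding because the same `expand` step is used when forming and when undoing the differences. Compression schemes exploit the fact that the difference levels tend to have small values that can be quantized coarsely.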

Steerable pyramid

A steerable pyramid is an implementation of a multi-scale, multi-orientation band-pass filter bank used for applications including image compression, texture synthesis, and object recognition. It can be thought of as an orientation-selective version of a Laplacian pyramid, in which a bank of steerable filters is used at each level of the pyramid instead of a single Laplacian of Gaussian filter.[12][13]
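
The notion of a steerable filter can be illustrated with the simplest case, the first derivative of a Gaussian: a filter oriented at any angle is a linear combination of the x- and y-derivative kernels. The sketch below covers only this steering property, not a full steerable pyramid. The kernel radius, the value of sigma, the unnormalized Gaussian, and the function names are assumptions of the example.

```python
import math


def gaussian_derivative_kernels(radius=3, sigma=1.0):
    """Sample the x- and y-derivatives of an (unnormalized) 2-D Gaussian."""
    gx, gy = [], []
    for y in range(-radius, radius + 1):
        row_x, row_y = [], []
        for x in range(-radius, radius + 1):
            g = math.exp(-(x * x + y * y) / (2 * sigma * sigma))
            row_x.append(-x / sigma ** 2 * g)
            row_y.append(-y / sigma ** 2 * g)
        gx.append(row_x)
        gy.append(row_y)
    return gx, gy


def steer(gx, gy, theta):
    """Derivative kernel along direction theta: cos(theta) * gx + sin(theta) * gy."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * a + s * b for a, b in zip(row_x, row_y)]
            for row_x, row_y in zip(gx, gy)]


gx, gy = gaussian_derivative_kernels()
diagonal = steer(gx, gy, math.pi / 4)  # kernel oriented at 45 degrees
```

Because the oriented kernel is a linear combination of two basis kernels, the response of an image to any orientation can be obtained from the responses to the two basis filters alone, without filtering again.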

Applications of pyramids

Alternative Representation

In the early days of computer vision, pyramids were used as the main type of multi-scale representation for computing multi-scale image features from real-world image data. More recent techniques include scale-space representation, which has been popular among some researchers for several reasons: its theoretical foundation, the ability to decouple the subsampling stage from the multi-scale representation, more powerful tools for theoretical analysis, and the ability to compute a representation at any desired scale, which avoids the algorithmic problems of relating image representations at different resolutions. Nevertheless, pyramids are still frequently used for expressing computationally efficient approximations to scale-space representation.[10][14][15]

Detail Manipulation

Laplacian image pyramids based on the bilateral filter provide a good framework for image detail enhancement and manipulation.[16] The difference images between adjacent layers are modified to exaggerate or reduce details at different scales in an image.
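
The idea of rescaling difference layers can be sketched in one dimension. This is a deliberate simplification and not the method of [16]: a plain binomial filter replaces the bilateral filter, the layers are not subsampled, and the names `smooth` and `adjust_detail` are hypothetical.

```python
def smooth(signal):
    """3-tap binomial (1/4, 1/2, 1/4) smoothing with replicated borders."""
    n = len(signal)
    return [
        0.25 * signal[max(i - 1, 0)] + 0.5 * signal[i] + 0.25 * signal[min(i + 1, n - 1)]
        for i in range(n)
    ]


def adjust_detail(signal, gains):
    """Split into len(gains) detail layers plus a base, rescale, and recombine.

    gains[0] scales the finest detail layer; a gain of 1 leaves a layer unchanged.
    """
    layers = [signal]
    for _ in gains:
        layers.append(smooth(layers[-1]))
    output = list(layers[-1])  # the most smoothed layer acts as the base
    for gain, fine, coarse in zip(gains, layers, layers[1:]):
        output = [o + gain * (f - c) for o, f, c in zip(output, fine, coarse)]
    return output


step = [0.0, 0.0, 0.0, 0.0, 10.0, 10.0, 10.0, 10.0]
unchanged = adjust_detail(step, [1.0, 1.0])  # reproduces the input
sharpened = adjust_detail(step, [2.0, 1.0])  # exaggerates the finest details
softened = adjust_detail(step, [0.0, 1.0])   # removes the finest details
```

With all gains equal to 1 the layers sum back to the input. Raising the gain of a layer exaggerates detail at that scale, and on a step edge it produces the overshoot on either side of the edge that is typical of sharpening.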

Some image file formats use the Adam7 algorithm or some other interlacing technique. These can be seen as a kind of image pyramid. Because those file formats store the "large-scale" features first and the fine-grained details later in the file, a viewer displaying a small "thumbnail", or running on a small screen, can quickly download just enough of the image to display it in the available pixels. One file can thus support many viewer resolutions, rather than a different file having to be stored or generated for each resolution.
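
The coarse-to-fine ordering of Adam7, the interlacing scheme of the PNG format, can be sketched by listing which pixels belong to each of its seven passes. The starting offsets and step sizes in `ADAM7_PASSES` are written from the pass layout of the PNG specification and should be checked against it before being relied on. The function names are assumptions of this example.

```python
# (x_start, y_start, x_step, y_step) for each of the seven Adam7 passes.
ADAM7_PASSES = [
    (0, 0, 8, 8),
    (4, 0, 8, 8),
    (0, 4, 4, 8),
    (2, 0, 4, 4),
    (0, 2, 2, 4),
    (1, 0, 2, 2),
    (0, 1, 1, 2),
]


def pass_pixels(width, height, pass_index):
    """Coordinates (x, y) transmitted in the given pass (0-based index)."""
    x0, y0, dx, dy = ADAM7_PASSES[pass_index]
    return [(x, y) for y in range(y0, height, dy) for x in range(x0, width, dx)]


def pass_map(width, height):
    """Grid holding, for every pixel, the 1-based number of the pass that carries it."""
    grid = [[0] * width for _ in range(height)]
    for index in range(len(ADAM7_PASSES)):
        for x, y in pass_pixels(width, height, index):
            grid[y][x] = index + 1
    return grid


for row in pass_map(8, 8):
    print(" ".join(str(v) for v in row))

counts = [len(pass_pixels(8, 8, i)) for i in range(7)]
print(counts)  # [1, 1, 2, 4, 8, 16, 32]
```

For an 8 x 8 block the first pass carries a single pixel and each later pass roughly doubles the number of pixels received, so a viewer that stops after an early pass already holds a coarse, evenly spaced sampling of the whole image.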

See also

References

  1.
  2.
  3. Burt, Peter and Adelson, Ted, "The Laplacian Pyramid as a Compact Image Code", IEEE Transactions on Communications, 31(4), pp. 532-540, 1983.
  4.
  5. Crowley, J. L. and Sanderson, A. C., "Multiple resolution representation and probabilistic matching of 2-D gray-scale shape", IEEE Transactions on Pattern Analysis and Machine Intelligence, 9(1), pp. 113-121, 1987.
  6. Meer, P., Baugher, E. S. and Rosenfeld, A., "Frequency domain analysis and synthesis of image generating kernels", IEEE Transactions on Pattern Analysis and Machine Intelligence, 9, pp. 512-522, 1987.
  7. Lindeberg, Tony, "Scale-space for discrete signals", IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(3), pp. 234-254, March 1990.
  8. Lindeberg, Tony, Scale-Space Theory in Computer Vision, Kluwer Academic Publishers, 1994, ISBN 0-7923-9418-6 (see specifically Chapter 2 for an overview of Gaussian and Laplacian image pyramids and Chapter 3 for theory about generalized binomial kernels and discrete Gaussian kernels).
  9. See the article on multi-scale approaches for a very brief theoretical statement.
  10. Lindeberg, T. and Bretzner, L., "Real-time scale selection in hybrid multi-scale representations", Proc. Scale-Space'03, Isle of Skye, Scotland, Springer Lecture Notes in Computer Science, volume 2695, pp. 148-163, 2003.
  11. Burt, Peter J. and Adelson, Edward H., "The Laplacian Pyramid as a Compact Image Code", IEEE Transactions on Communications, 1983. doi:10.1109/TCOM.1983.1095851.
  12.
  13.
  14. Crowley, J. and Riff, O., "Fast computation of scale normalised Gaussian receptive fields", Proc. Scale-Space'03, Isle of Skye, Scotland, Springer Lecture Notes in Computer Science, volume 2695, 2003.
  15.
  16. Photo Detail Manipulation via Image Pyramids

External links

  • Gaussian-Laplacian Pyramid Image Coding - illustrates methods of Downsampling, Upsampling, and Gaussian convolution
  • The Gaussian Pyramid - provides a brief introduction for the procedure and cites several sources
  • Laplacian Irregular Graph Pyramid - Figure 1 on this page illustrates an example of the Gaussian Pyramid
  • The Laplacian Pyramid as a Compact Image Code on eBook Submission

This article was sourced from Creative Commons Attribution-ShareAlike License; additional terms may apply.