fa03989b60
* doc: avoid use of GCC's old syntax (`-Wgnu-designator`) * doc: highlight shell commands * doc: point the C++ docs to v8.12 * doc: avoid highlight on output matching bash keywords
328 lines
14 KiB
Markdown
328 lines
14 KiB
Markdown
<refmeta>
|
|
<refentrytitle>How libvips works</refentrytitle>
|
|
<manvolnum>3</manvolnum>
|
|
<refmiscinfo>libvips</refmiscinfo>
|
|
</refmeta>
|
|
|
|
<refnamediv>
|
|
<refname>Internals</refname>
|
|
<refpurpose>A high-level technical overview of libvips's evaluation system</refpurpose>
|
|
</refnamediv>
|
|
|
|
Compared to most image processing libraries, VIPS needs little RAM and runs
|
|
quickly, especially on machines with more than one CPU. VIPS achieves this
|
|
improvement by only keeping the pixels currently being processed in RAM
|
|
and by having an efficient, threaded image IO system. This page explains
|
|
how these features are implemented.
|
|
|
|
**Images**
|
|
|
|
VIPS images have three dimensions: width, height and bands. Bands usually
|
|
(though not always) represent colour. These three dimensions can be any
|
|
size up to 2 ** 31 elements. Every band element in an image has to have the
|
|
same format. A format is an 8-, 16- or 32-bit int, signed or unsigned, 32-
|
|
or 64-bit float, and 64- or 128-bit complex.
|
|
|
|
**Regions**
|
|
|
|
An image can be very large, much larger than the available memory, so you
|
|
can't just access pixels with a pointer \*.
|
|
|
|
Instead, you read pixels from an image with a region. This is a rectangular
|
|
sub-area of an image. In C, the API looks like:
|
|
|
|
```c
|
|
VipsImage *image = vips_image_new_from_file( filename, NULL );
|
|
VipsRegion *region = vips_region_new( image );
|
|
|
|
// ask for a 100x100 pixel region at 0x0 (top left)
|
|
VipsRect r = { .left = 0, .top = 0, .width = 100, .height = 100 };
|
|
if( vips_region_prepare( region, &r ) )
|
|
vips_error( ... );
|
|
|
|
// get a pointer to the pixel at x, y, where x, y must
|
|
// be within the region
|
|
|
|
// as long as you stay within the valid area for the region,
|
|
// you can address pixels with regular pointer arithmetic
|
|
|
|
// compile with -DDEBUG and the macro will check bounds for you
|
|
|
|
// add VIPS_REGION_LSKIP() to move down a line
|
|
VipsPel *pixel = VIPS_REGION_ADDR( region, x, y );
|
|
|
|
// you can call vips_region_prepare() many times
|
|
|
|
// everything in libvips is a GObject ... when you're done,
|
|
// just free with
|
|
g_object_unref( region );
|
|
```
|
|
|
|
The action that `vips_region_prepare()` takes varies with the type of
|
|
image. If the image is a file on disc, for example, then VIPS will arrange
|
|
for a section of the file to be read in.
|
|
|
|
(\* there is an image access mode where you can just use a pointer, but
|
|
it's rarely used)
|
|
|
|
**Partial images**
|
|
|
|
A partial image is one where, instead of storing a value for each pixel, VIPS
|
|
stores a function which can make any rectangular area of pixels on demand.
|
|
|
|
If you use `vips_region_prepare()` on a region created on a partial image,
|
|
VIPS will allocate enough memory to hold the pixels you asked for and use
|
|
the stored function to calculate values for just those pixels \*.
|
|
|
|
The stored function comes in three parts: a start function, a generate
|
|
function and a stop function. The start function creates a state, the
|
|
generate function uses the state plus a requested area to calculate pixel
|
|
values and the stop function frees the state again. Breaking the stored
|
|
function into three parts is good for SMP scaling: resource allocation and
|
|
synchronisation mostly happens in start functions, so generate functions
|
|
can run without having to talk to each other.
|
|
|
|
VIPS makes a set of guarantees about parallelism that make this simple to
|
|
program. Start and stop functions are mutually exclusive and a state is
|
|
never used by more than one generate. In other words, a start / generate /
|
|
generate / stop sequence works like a thread.
|
|
|
|
![](Sequence.png)
|
|
|
|
(\* in fact VIPS keeps a cache of calculated pixel buffers and will return
|
|
a pointer to a previously-calculated buffer if it can)
|
|
|
|
**Operations**
|
|
|
|
VIPS operations read input images and write output images, performing some
|
|
transformation on the pixels. When an operation writes to an image the
|
|
action it takes depends upon the image type. For example, if the image is a
|
|
file on disc then VIPS will start a data sink to stream pixels to the file,
|
|
or if the image is a partial one then it will just attach start / generate /
|
|
stop functions.
|
|
|
|
Like most threaded image processing systems, all VIPS operations have to
|
|
be free of side-effects. In other words, operations cannot modify images,
|
|
they can only create new images. This could result in a lot of copying if
|
|
an operation is only making a small change to a large image so VIPS has a
|
|
set of mechanisms to copy image areas by just adjusting pointers. Most of
|
|
the time no actual copying is necessary and you can perform operations on
|
|
large images at low cost.
|
|
|
|
**Run-time code generation**
|
|
|
|
VIPS uses
|
|
<ulink url="https://gstreamer.freedesktop.org/modules/orc.html">Orc</ulink>, a
|
|
run-time compiler, to generate code for some operations. For example, to
|
|
compute a convolution on an 8-bit image, VIPS will examine the convolution
|
|
matrix and the source image and generate a tiny program to calculate the
|
|
convolution. This program is then "compiled" to the vector instruction set
|
|
for your CPU, for example SSE3 on most x86 processors.
|
|
|
|
Run-time vector code generation typically speeds operations up by a factor
|
|
of three or four.
|
|
|
|
**Joining operations together**
|
|
|
|
The region create / prepare / prepare / free calls you use to get pixels
|
|
from an image are an exact parallel to the start / generate / generate /
|
|
stop calls that images use to create pixels. In fact, they are the same:
|
|
a region on a partial image holds the state created by that image for the
|
|
generate function that will fill the region with pixels.
|
|
|
|
![](Combine.png)
|
|
|
|
VIPS joins image processing operations together by linking the output of one
|
|
operation (the start / generate / stop sequence) to the input of the next
|
|
(the region it uses to get pixels for processing). This link is a single
|
|
function call, and very fast. Additionally, because of the the split between
|
|
allocation and processing, once a pipeline of operations has been set up,
|
|
VIPS is able to run without allocating and freeing memory.
|
|
|
|
This graph (generated by `vipsprofile`, the vips profiler) shows memory use
|
|
over time for a vips pipeline running on a large image. The bottom trace
|
|
shows total memory, the upper traces show threads calculating useful results
|
|
(green), threads blocked on synchronisation (red) and memory allocations
|
|
(white ticks).
|
|
|
|
![](Memtrace.png)
|
|
|
|
Because the intermediate image is just a small region in memory, a pipeline
|
|
of operations running together needs very little RAM. In fact, intermediates
|
|
are small enough that they can fit in L2 cache on most machines, so an
|
|
entire pipeline can run without touching main memory. And finally, because
|
|
each thread runs a very cheap copy of just the writeable state of the
|
|
entire pipeline, threads can run with few locks. VIPS needs just four lock
|
|
operations per output tile, regardless of the pipeline length or complexity.
|
|
|
|
**Data sources**
|
|
|
|
VIPS has data sources which can supply pixels for processing from a variety
|
|
of sources. VIPS can stream images from files in VIPS native format, from
|
|
tiled TIFF files, from binary PPM/PGM/PBM/PFM, from Radiance (HDR) files,
|
|
from FITS images and from tiled OpenEXR images. VIPS will automatically
|
|
unpack other formats to temporary disc files for you but this can
|
|
obviously generate a lot of disc traffic. It also has a special
|
|
sequential mode for streaming operations on non-random-access formats. Another
|
|
section in these docs explains <ulink url="How-it-opens-files.md.html">how
|
|
libvips opens a file</ulink>. One of the sources uses the <ulink
|
|
url="http://www.imagemagick.org">ImageMagick</ulink> (or optionally <ulink
|
|
url="http://www.graphicsmagick.org">GraphicsMagick</ulink> library, so VIPS
|
|
can read any image format that these libraries can read.
|
|
|
|
VIPS images are held on disc as a 64-byte header containing basic image
|
|
information like width, height, bands and format, then the image data as
|
|
a single large block of pixels, left-to-right and top-to-bottom, then an
|
|
XML extension block holding all the image metadata, such as ICC profiles
|
|
and EXIF blocks.
|
|
|
|
When reading from a large VIPS image (or any other format with the same
|
|
structure on disc, such as binary PPM), VIPS keeps a set of small rolling
|
|
windows into the file, some small number of scanlines in size. As pixels
|
|
are demanded by different threads VIPS will move these windows up and down
|
|
the file. As a result, VIPS can process images much larger than RAM, even
|
|
on 32-bit machines.
|
|
|
|
**Data sinks**
|
|
|
|
In a demand-driven system, something has to do the demanding. VIPS has a
|
|
variety of data sinks that you can use to pull image data though a pipeline
|
|
in various situations. There are sinks that will build a complete image
|
|
in memory, sinks to draw to a display, sinks to loop over an image (useful
|
|
for statistical operations, for example) and sinks to stream an image to disc.
|
|
|
|
The disc sink looks something like this:
|
|
|
|
![](Sink.png)
|
|
|
|
The sink keeps two buffers\*, each as wide as the image. It starts threads
|
|
as rapidly as it can up to the concurrency limit, filling each buffer with
|
|
tiles\*\* of calculated pixels, each thread calculating one tile at once. A
|
|
separate background thread watches each buffer and, as soon as the last tile
|
|
in a buffer finishes, writes that complete set of scanlines to disc using
|
|
whatever image write library is appropriate. VIPS can write with libjpeg,
|
|
libtiff, libpng and others. It then wipes the buffer and repositions it
|
|
further down the image, ready for the next set of tiles to stream in.
|
|
|
|
These features in combination mean that, once a pipeline of image processing
|
|
operations has been built, VIPS can run almost lock-free. This is very
|
|
important for SMP scaling: you don't want the synchronization overhead to
|
|
scale with either the number of threads or the complexity of the pipeline
|
|
of operations being performed. As a result, VIPS scales almost linearly
|
|
with increasing numbers of threads:
|
|
|
|
![](Vips-smp.png)
|
|
|
|
Number of CPUs is on the horizontal axis, speedup is on the vertical
|
|
axis. Taken from the [[Benchmarks]] page.
|
|
|
|
(\* there can actually be more than one, it allocate enough buffers to
|
|
ensure that there are at least two tiles for every thread)
|
|
|
|
(\*\* tiles can be any shape and size, VIPS has a tile hint system that
|
|
operations use to tell sinks what tile geometry they prefer)
|
|
|
|
**Operation cache**
|
|
|
|
Because VIPS operations are free of side-effects\*, you can cache them. Every
|
|
time you call an operation, VIPS searches the cache for a previous call to
|
|
the same operation with the same arguments. If it finds a match, you get
|
|
the previous result again. This can give a huge speedup.
|
|
|
|
By default, VIPS caches the last 1,000 operation calls. You can also control
|
|
the cache size by memory use or by files opened.
|
|
|
|
(\* Some vips operations DO have side effects, for example,
|
|
`vips_draw_circle()` will draw a circle on an image. These operations emit an
|
|
"invalidate" signal on the image they are called on and this signal makes
|
|
all downstream operations and caches drop their contents.)
|
|
|
|
**Operation database and APIs**
|
|
|
|
VIPS has around 300 image processing operations written in this style. Each
|
|
operation is a GObject class. You can use the standard GObject calls to walk
|
|
the class hierarchy and discover operations, and libvips adds a small amount
|
|
of extra introspection metadata to handle things like optional arguments.
|
|
|
|
The <ulink url="using-from-c.html">C API</ulink> is a set of simple wrappers
|
|
which create class instances for you. The <ulink url="using-from-cpp.html">C++
|
|
API</ulink> is a little fancier and adds things like automatic object lifetime
|
|
management. The <ulink url="using-cli.html"> command-line interface</ulink>
|
|
uses introspection to run any vips operation in the class hierarchy.
|
|
|
|
There are bindings for <ulink url="https://libvips.github.io/libvips">many
|
|
other languages</ulink> on many platforms. Most of these bindings use the
|
|
introspection system to generate the binding at run-time.
|
|
|
|
**Snip**
|
|
|
|
The VIPS GUI, nip2, has its own scripting language called Snip. Snip is a
|
|
lazy, higher-order, purely functional, object oriented language. Almost all
|
|
of nip2's menus are implemented in it, and nip2 workspaces are Snip programs.
|
|
|
|
VIPS operations listed in the operation database appear as Snip functions. For
|
|
example, `abs` can be used from Snip as:
|
|
|
|
```
|
|
// absolute value of image b
|
|
a = vips_call "abs" [b] [];
|
|
```
|
|
|
|
However, `abs` won't work on anything except the primitive vips image type. It
|
|
can't be used on any class, or list or number. Definitions in `_stdenv.dev`
|
|
wrap each VIPS operation as a higher level Snip operation. For example:
|
|
|
|
```
|
|
abs x
|
|
= oo_unary_function abs_op x, is_class x
|
|
= vips_call "abs" [x] [], is_image x
|
|
= abs_cmplx x, is_complex x
|
|
= abs_num x, is_real x
|
|
= abs_list x, is_real_list x
|
|
= abs_list (map abs_list x), is_matrix x
|
|
= error (_ "bad arguments to " ++ "abs")
|
|
{
|
|
abs_op = Operator "abs" abs Operator_type.COMPOUND false;
|
|
|
|
abs_list l = (sum (map square l)) ** 0.5;
|
|
|
|
abs_num n
|
|
= n, n >= 0
|
|
= -n;
|
|
|
|
abs_cmplx c = ((re c)**2 + (im c)**2) ** 0.5;
|
|
}
|
|
```
|
|
|
|
This defines the behaviour of `abs` for the base Snip types (number, list,
|
|
matrix, image and so on), then classes will use that to define operator
|
|
behaviour on higher-level objects.
|
|
|
|
Now you can use:
|
|
|
|
```
|
|
// absolute value of anything
|
|
a = abs b;
|
|
```
|
|
|
|
and you ought to get sane behaviour for any object, including things like
|
|
the `Matrix` class.
|
|
|
|
You can write Snip classes which present functions to the user as menu
|
|
items. For example, `Math.def` has this:
|
|
|
|
```
|
|
Math_arithmetic_item = class
|
|
Menupullright "_Arithmetic" "basic arithmetic for objects" {
|
|
|
|
Absolute_value_item = class
|
|
Menuaction "A_bsolute Value" "absolute value of x" {
|
|
action x = map_unary abs x;
|
|
}
|
|
}
|
|
```
|
|
|
|
Now the user can select an object and click `Math / Abs` to find the absolute
|
|
value of that object.
|
|
|