libvips/doc/using-python.xml

445 lines
15 KiB
XML

<?xml version="1.0"?>
<!DOCTYPE refentry PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
]>
<!-- vim: set ts=2 sw=2 expandtab: -->
<refentry id="using-from-python">
<refmeta>
<refentrytitle>VIPS from Python</refentrytitle>
<manvolnum>3</manvolnum>
<refmiscinfo>VIPS Library</refmiscinfo>
</refmeta>
<refnamediv>
<refname>Using VIPS</refname>
<refpurpose>How to use the VIPS library from Python</refpurpose>
</refnamediv>
<refsect3 id="python-intro">
<title>Introduction</title>
<para>
VIPS comes with a convenient, high-level Python API based
on <code>gobject-introspection</code>. As long as you can get GOI
for your platform, you should be able to use vips. The
<code>Vips.py</code> file
needs to be copied to the overrides directory of your GOI install,
and you need to have the vips typelib on your
<code>GI_TYPELIB_PATH</code>. This may already have happened, depending
on your platform.
</para>
<para>
<programlisting language="Python">
#!/usr/bin/python
import sys
from gi.repository import Vips
im = Vips.Image.new_from_file(sys.argv[1])
im = im.extract_area(100, 100, im.width - 200, im.height - 200)
im = im.similarity(scale = 0.9)
mask = Vips.Image.new_from_array([[-1, -1, -1],
[-1, 16, -1],
[-1, -1, -1]], scale = 8)
im = im.conv(mask)
im.write_to_file(sys.argv[2])
</programlisting>
Reading this example, the first line loads the input file. You can append
load options to the argument list as keyword arguments, for example:
<programlisting language="Python">
im = Vips.Image.new_from_file(sys.argv[1], access = Vips.Access.SEQUENTIAL)
</programlisting>
See the various loaders for a list of the available options
for each file format. The C equivalent to this function,
vips_image_new_from_file(), has more extensive documentation. Try
<code>help(Vips.Image)</code> to see a list of all the image
constructors --- you can load from memory, or create from an array,
for example.
</para>
<para>
The next line crops 100 pixels off every edge. Try
<code>help(im.extract_area)</code> and the C API docs for
vips_extract_area() for details. You can use <code>.crop()</code> as a
synonym, if you like. <code>im.width</code> gets the image width in
pixels, see <code>help(Vips.Image)</code> and vips_image_get_width()
and friends for a list of the other getters.
</para>
<para>
The <code>similarity</code> line shrinks by 10%. By default it uses
bilinear interpolation, use <code>interpolate</code> to pick another
interpolator, for example:
<programlisting language="Python">
im = im.similarity(scale = 0.9, interpolate = Vips.Interpolate.new("bicubic"))
</programlisting>
</para>
<para>
<code>.new_from_array()</code> makes an image from a 2D array. The
<code>scale</code> keyword argument lets you set a divisor for
convolution, handy for integer convolutions. You can set
<code>offset</code> as well. See vips_conv() for details on the vips
convolution operator.
</para>
<para>
Finally, <code>.write_to_file()</code> sends the image back to the
filesystem. There's also <code>.write_to_buffer()</code> to make a
string containing the formatted image, and <code>.write()</code> to
write to another image.
</para>
</refsect3>
<refsect3 id="python-basics">
<title><code>pyvips8</code> basics</title>
<para>
The Python interface comes in two main parts. First, the C source code
to libvips has been marked up with special comments describing the
interface in a standard way. These comments are read by
gobject-introspection when libvips is compiled and used to generate a
typelib, a description of how to call the library. When your Python
program starts, the import line:
<programlisting language="Python">
from gi.repository import Vips
</programlisting>
loads the typelib and creates Python classes for all the objects and
all the functions in the library. You can then call these functions
from your code, and they will call into libvips for you. C functions
become Python functions in an obvious way: vips_operation_new(),
for example, the constructor for the class #VipsOperation, becomes
<code>Vips.Operation.new()</code>. See the C API docs for details.
</para>
<para>
Using libvips like this is possible, but a bit painful. To make the API
seem more pythonesque, vips includes a set of overrides which form a
layer over the bare functions created by gobject-introspection.
</para>
</refsect3>
<refsect3 id="python-wrapping">
<title>Automatic wrapping</title>
<para>
The overrides intercept member lookup
on the <code>Vips.Image</code> class and look for vips operations
with that name. So the vips operation "add", which appears in the
C API as vips_add(), appears in Python as
<code>image.add()</code>.
</para>
<para>
The first input image argument becomes the <code>self</code>
argument. If there are no input image arguments, the operation
appears as a class member. Optional input arguments become
keyword arguments. The result is a list of all the output
arguments, or a single output if there is only one.
</para>
<para>
Optional output arguments are enabled with a boolean keyword
argument of that name. For example, "min" (the operation which
appears in the C API as vips_min()), can be called like this:
<programlisting language="Python">
min_value = im.min()
</programlisting>
and <code>min_value</code> will be a floating point value giving
the minimum value in the image. "min" can also find the position
of the minimum value with the <code>x</code> and <code>y</code>
optional output arguments. Call it like this:
<programlisting language="Python">
min_value, opts = im.min(x = True, y = True)
x = opts['x']
y = opts['y']
</programlisting>
In other words, if optional output args are requested, an extra
dictionary is returned containing those objects.
Of course in this case, the <code>.minpos()</code> convenience
function would be simpler, see below.
</para>
<para>
Because operations are member functions and return the result image,
you can chain them. For example, you can write:
<programlisting language="Python">
result_image = image.sin().pow(2)
</programlisting>
to calculate the square of the sine for each pixel. There is also a
full set of arithmetic operator overloads, see below.
</para>
<para>
VIPS types are also automatically wrapped. The override looks
at the type of argument required by the operation and converts
the value you supply, when it can. For example, "linear" takes a
#VipsArrayDouble as an argument for the set of constants to use for
multiplication. You can supply this value as an integer, a float,
or some kind of compound object and it will be converted for you.
You can write:
<programlisting language="Python">
result_image = image.linear(1, 3)
result_image = image.linear(12.4, 13.9)
result_image = image.linear([1, 2, 3], [4, 5, 6])
result_image = image.linear(1, [4, 5, 6])
</programlisting>
And so on. A set of overloads are defined for <code>.linear()</code>,
see below.
</para>
<para>
It does a couple of more ambitious conversions. It will
automatically convert to and from the various vips types,
like #VipsBlob and #VipsArrayImage. For example, you can read the
ICC profile out of an image like this:
<programlisting language="Python">
profile = im.get_value("icc-profile-data")
</programlisting>
and <code>profile</code> will be a string.
</para>
<para>
You can use array constants instead of images. A 2D array is simply
changed into a one-band double image. This is handy for things like
<code>.erode()</code>, for example:
<programlisting language="Python">
im = im.erode([[128, 255, 128],
[255, 255, 255],
[128, 255, 128]])
</programlisting>
will erode an image with a 4-connected structuring element.
</para>
<para>
If an operation takes several input images, you can use a 1D array
constant or a number constant
for all but one of them and the wrapper will expand it
to an image for you. For example, <code>.ifthenelse()</code> uses
a condition image to pick pixels between a then and an else image:
<programlisting language="Python">
result_image = condition_image.ifthenelse(then_image, else_image)
</programlisting>
You can use a constant instead of either the then or the else
parts, and it will be expanded to an image for you. If you use a
constant for both then and else, it will be expanded to match the
condition image. For example:
<programlisting language="Python">
result_image = condition_image.ifthenelse([0, 255, 0], [255, 0, 0])
</programlisting>
Will make an image where true pixels are green and false pixels
are red.
</para>
<para>
This is also useful for <code>.bandjoin()</code>, the thing to join
two or more images up bandwise. You can write:
<programlisting language="Python">
rgba = rgb.bandjoin(255)
</programlisting>
to add a constant 255 band to an image, perhaps to add an alpha
channel. Of course you can also write:
<programlisting language="Python">
result_image = image1.bandjoin(image2)
result_image = image1.bandjoin([image2, image3])
result_image = Vips.Image.bandjoin([image1, image2, image3])
result_image = image1.bandjoin([image2, 255])
</programlisting>
and so on.
</para>
</refsect3>
<refsect3 id="python-doc">
<title>Automatic docstrings</title>
<para>
Try <code>help(Vips)</code> for everything,
<code>help(Vips.Image)</code> for something slightly more digestible, or
something like <code>help(Vips.Image.black)</code> for help on a
specific class member.
</para>
<para>
You can't get help on dynamically bound member functions like
<code>.add()</code> this way. Instead, make an image and get help
from that, for example:
<programlisting language="Python">
image = Vips.Image.new_from_file("x.jpg")
help(image.add)
</programlisting>
And you'll get a summary of the operator's behaviour and how the
arguments are represented in Python. Use the C API docs for more detail.
</para>
</refsect3>
<refsect3 id="python-exceptions">
<title>Exceptions</title>
<para>
The wrapper spots errors from vips operations and raises the
<code>Vips.Error</code> exception. You can catch it in the
usual way. The <code>.detail</code> member gives the detailed
error message.
</para>
</refsect3>
<refsect3 id="python-modify">
<title>Draw operations</title>
<para>
Paint operations like <code>draw_circle</code> and <code>draw_line</code>
modify their input image. This makes them hard to use with the rest of
libvips: you need to be very careful about the order in which operations
execute or you can get nasty crashes.
</para>
<para>
The wrapper spots operations of this type and makes a private copy of
the image in memory before calling the operation. This stops crashes,
but it does make it inefficient. If you draw 100 lines on an image,
for example, you'll copy the image 100 times. The wrapper does make sure
that memory is recycled where possible, so you won't have 100 copies in
memory. At least you can execute these operations.
</para>
<para>
If you want to avoid the copies, you'll need to call drawing
operations yourself.
</para>
</refsect3>
<refsect3 id="python-overloads">
<title>Overloads</title>
<para>
The wrapper defines the usual set of arithmetic, boolean and
relational overloads on
<code>image</code>. You can mix images, constants and lists of
constants (almost) freely. For example, you can write:
<programlisting language="Python">
result_image = ((image * [1, 2, 3]).abs() &lt; 128) | 4
</programlisting>
</para>
<para>
The wrapper overloads <code>[]</code> to be vips_extract_band(). You
can write:
<programlisting language="Python">
result_image = image[2]
</programlisting>
to extract the third band of the image. It implements the usual
slicing and negative indexes, so you can write:
<programlisting language="Python">
result_image = image[1:]
result_image = image[:3]
result_image = image[-2:]
result_image = [x.avg() for x in image]
</programlisting>
and so on.
</para>
<para>
The wrapper overloads <code>()</code> to be vips_getpoint(). You can
write:
<programlisting language="Python">
r, g, b = image(10, 10)
</programlisting>
to read out the value of the pixel at coordinates (10, 10) from an RGB
image.
</para>
</refsect3>
<refsect3 id="python-expansions">
<title>Expansions</title>
<para>
Some vips operators take an enum to select an action, for example
<code>.math()</code> can be used to calculate sine of every pixel
like this:
<programlisting language="Python">
result_image = image.math(Vips.OperationMath.SIN)
</programlisting>
This is annoying, so the wrapper expands all these enums into
separate members named after the enum. So you can write:
<programlisting language="Python">
result_image = image.sin()
</programlisting>
See <code>help(Vips.Image)</code> for a list.
</para>
</refsect3>
<refsect3 id="python-utility">
<title>Convenience functions</title>
<para>
The wrapper defines a few extra useful utility functions:
<code>.get_value()</code>,
<code>.set_value()</code>,
<code>.bandsplit()</code>,
<code>.maxpos()</code>,
<code>.minpos()</code>,
<code>.median()</code>.
Again, see <code>help(Vips.Image)</code> for a list.
</para>
</refsect3>
<refsect3 id="python-args">
<title>Command-line option parsing</title>
<para>
GLib includes a command-line option parser, and Vips defines a set of
standard flags you can use with it. For example:
<programlisting language="Python">
import sys
from gi.repository import GLib, Vips
context = GLib.OptionContext(" - test stuff")
main_group = GLib.OptionGroup("main",
"Main options", "Main options for this program",
None)
context.set_main_group(main_group)
Vips.add_option_entries(main_group)
context.parse(sys.argv)
</programlisting>
</para>
</refsect3>
</refentry>