libvips/doc/using-python.xml

<?xml version="1.0"?>
<!DOCTYPE refentry PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
               "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
]>
<!-- vim: set ts=2 sw=2 expandtab: -->
<refentry id="using-from-python">
  <refmeta>
    <refentrytitle>VIPS from Python</refentrytitle>
    <manvolnum>3</manvolnum>
    <refmiscinfo>VIPS Library</refmiscinfo>
  </refmeta>

  <refnamediv>
    <refname>Using VIPS</refname>
    <refpurpose>How to use the VIPS library from Python</refpurpose>
  </refnamediv>

  <refsect3 id="python-intro">
    <title>Introduction</title>
    <para>
      VIPS comes with a convenient, high-level Python API built on
      on <code>gobject-introspection</code>. As long as you can get GOI
      for your platform, you should be able to use libvips.
    </para>

    <para>
      To test the binding, start up Python and at the console enter:

<programlisting language="Python">
>>> from gi.repository import Vips
>>> x = Vips.Image.new_from_file("/path/to/some/image/file.jpg")
>>> x.width
1450
>>>
</programlisting>

      <orderedlist>
        <listitem>
          <para>
            If import fails, check you have the Python
            gobject-introspection packages installed, that you have the
            libvips typelib installed, and that the typelib is either
            in the system area or on your <code>GI_TYPELIB_PATH</code>.
          </para>
        </listitem>

        <listitem>
          <para>
             If <code>.new_from_file()</code> fails, the vips overrides
             have not been found. Make sure <code>Vips.py</code> is in
             your system overrides area.
          </para>
        </listitem>
      </orderedlist>
    </para>
  </refsect3>

  <refsect3 id="python-example">
    <title>Example program</title>

    <para>
      Here's a complete example program:

<programlisting language="Python">
#!/usr/bin/python

import sys

from gi.repository import Vips

im = Vips.Image.new_from_file(sys.argv[1])

im = im.extract_area(100, 100, im.width - 200, im.height - 200)
im = im.similarity(scale = 0.9)
mask = Vips.Image.new_from_array([[-1, -1,  -1],
                                  [-1, 16,  -1],
                                  [-1, -1,  -1]], scale = 8)
im = im.conv(mask)

im.write_to_file(sys.argv[2])
</programlisting>
    </para>

    <para>
      Reading this code, the first interesting line is:

<programlisting language="Python">
from gi.repository import Vips
</programlisting>

      When Python executes the import line it performs the following steps:
    </para>

    <orderedlist>
      <listitem>
        <para>
          It searches for a file called <code>Vips-x.y.typelib</code>. This
          is a binary file generated automatically during libvips build
          by introspection of the libvips shared library plus scanning
          of the C headers. It lists all the API entry points, all the
          types the library uses, and has an extra set of hints for object
          ownership and reference counting. The typelib is searched for
          in <code>/usr/lib/gi-repository-1.0</code> and along the path
          in the environment variable <code>GI_TYPELIB_PATH</code>.
        </para>
      </listitem>

      <listitem>
        <para>
          It uses the typelib to make a basic binding for libvips. It's
          just the C API with a little very light mangling, so for
          example the enum member <code>VIPS_FORMAT_UCHAR</code>
          of the enum <code>VipsBandFormat</code> becomes
          <code>Vips.BandFormat.UCHAR</code>.
        </para>
      </listitem>

      <listitem>
        <para>
          The binding you get can be rather unfriendly, so it also
          loads a set of overrides from <code>Vips.py</code> in
          <code>/usr/lib/python2.7/dist-packages/gi/overrides</code>
          (on my system at least). If you're using python3, it's
          <code>/usr/lib/python3/dist-packages/gi/overrides</code>.
          Unfortunately, as far as I know, there is no way to extend
          this search using environment variables. You MUST have
          <code>Vips.py</code> in exactly this directory. If you install
          vips via a package manager this will happen automatically,
          since vips and pygobject will have been built to the same
          prefix, but if you are installing vips from source and the
          prefix does not match, it will not be installed for you,
          you will need to copy it.
        </para>
      </listitem>

      <listitem>
        <para>
          Finally, <code>Vips.py</code> makes the rest of the binding. In
          fact, <code>Vips.py</code> makes almost all the binding: it
          defines <code>__getattr__</code> on <code>Vips.Image</code>
          and binds at runtime by searching libvips for an operation of
          that name.
        </para>
      </listitem>
    </orderedlist>

    <para>
      The next line is:

<programlisting language="Python">
im = Vips.Image.new_from_file(sys.argv[1])
</programlisting>

      This loads the input image. You can append
      load options to the argument list as keyword arguments, for example:

<programlisting language="Python">
im = Vips.Image.new_from_file(sys.argv[1], access = Vips.Access.SEQUENTIAL)
</programlisting>

      See the various loaders for a list of the available options
      for each file format. The C equivalent to this function,
      vips_image_new_from_file(), has more extensive documentation. Try
      <code>help(Vips.Image)</code> to see a list of all the image
      constructors --- you can load from memory, or create from an array,
      for example.
    </para>

    <para>
      The next line is:

<programlisting language="Python">
im = im.extract_area(100, 100, im.width - 200, im.height - 200)
</programlisting>

      The arguments are left, top, width, height, so this crops 100 pixels off
      every edge. Try <code>help(im.extract_area)</code> and the C API docs
      for vips_extract_area() for details. You can use <code>.crop()</code>
      as a synonym, if you like.
    </para>

    <para>
      <code>im.width</code> gets the image width
      in pixels, see <code>help(Vips.Image)</code> and vips_image_get_width()
      and friends for a list of the other getters.
    </para>

    <para>
      The next line:

<programlisting language="Python">
im = im.similarity(scale = 0.9)
</programlisting>

      shrinks by 10%.  By default it uses
      bilinear interpolation, use <code>interpolate</code> to pick another
      interpolator, for example:

<programlisting language="Python">
im = im.similarity(scale = 0.9, interpolate = Vips.Interpolate.new("bicubic"))
</programlisting>

      see vips_similarity() for full documentation. The similarity operator
      will not give good results for large resizes (more than a factor of
      two). See vips_resize() if you need to make a large change.
    </para>

    <para>
      Next:

<programlisting language="Python">
mask = Vips.Image.new_from_array([[-1, -1,  -1],
                                  [-1, 16,  -1],
                                  [-1, -1,  -1]], scale = 8)
im = im.conv(mask)
</programlisting>

      makes an image from a 2D array, then convolves with that. The
      <code>scale</code> keyword argument lets you set a divisor for
      convolution, handy for integer convolutions. You can set
      <code>offset</code> as well. See vips_conv() for details on the vips
      convolution operator.
    </para>

    <para>
      Finally,

<programlisting language="Python">
im.write_to_file(sys.argv[2])
</programlisting>

      sends the image back to the
      filesystem. There's also <code>.write_to_buffer()</code> to make a
      string containing the formatted image, and <code>.write()</code> to
      write to another image.
    </para>

    <para>
      As with <code>.new_from_file()</code> you can append save options as
      keyword arguments. For example:

<programlisting language="Python">
im.write_to_file("test.jpg", Q = 90)
</programlisting>

      will write a JPEG image with quality set to 90. See the various save
      operations for a list of all the save options, for example
      vips_jpegsave().
    </para>

  </refsect3>

  <refsect3 id="python-doc">
    <title>Getting help</title>
    <para>
      Try <code>help(Vips)</code> for everything,
      <code>help(Vips.Image)</code> for something slightly more digestible, or
      something like <code>help(Vips.Image.black)</code> for help on a
      specific class member.
    </para>

    <para>
      You can't get help on dynamically bound member functions like
      <code>.add()</code> this way. Instead, make an image and get help
      from that, for example:

<programlisting language="Python">
image = Vips.Image.black(1, 1)
help(image.add)
</programlisting>

      And you'll get a summary of the operator's behaviour and how the
      arguments are represented in Python.
    </para>

    <para>
      The API docs have a <link linkend="function-list">handy table of all vips
      operations</link>, if you want to find out how to do something, try
      searching that.
    </para>

    <para>
      The <command>vips</command> command can be useful too. For example, in a
      terminal you can type <command>vips jpegsave</command> to get a
      summary of an operation:

<programlisting language="Python">
$ vips jpegsave
save image to jpeg file
usage:
   jpegsave in filename
where:
   in           - Image to save, input VipsImage
   filename     - Filename to save to, input gchararray
optional arguments:
   Q            - Q factor, input gint
                      default: 75
                      min: 1, max: 100
   profile      - ICC profile to embed, input gchararray
   optimize-coding - Compute optimal Huffman coding tables, input gboolean
                      default: false
   interlace    - Generate an interlaced (progressive) jpeg, input gboolean
                      default: false
   no-subsample - Disable chroma subsample, input gboolean
                      default: false
   trellis-quant - Apply trellis quantisation to each 8x8 block, input gboolean
                      default: false
   overshoot-deringing - Apply overshooting to samples with extreme values, input gboolean
                      default: false
   optimize-scans - Split the spectrum of DCT coefficients into separate scans, input gboolean
                      default: false
   strip        - Strip all metadata from image, input gboolean
                      default: false
   background   - Background value, input VipsArrayDouble
operation flags: sequential-unbuffered nocache
</programlisting>

    </para>
  </refsect3>

  <refsect3 id="python-basics">
    <title><code>pyvips8</code> basics</title>
    <para>
      As noted above, the Python interface comes in two main parts,
      an automatically generated binding based on the vips typelib,
      plus a set of extra features provided by overrides.
      The rest of this chapter runs through the features provided by the
      overrides.
    </para>
  </refsect3>

  <refsect3 id="python-wrapping">
    <title>Automatic wrapping</title>
    <para>
      The overrides intercept member lookup
      on the <code>Vips.Image</code> class and look for vips operations
      with that name. So the vips operation "add", which appears in the
      C API as vips_add(), appears in Python as
      <code>image.add()</code>.
    </para>

    <para>
      The first input image argument becomes the <code>self</code>
      argument. If there are no input image arguments, the operation
      appears as a class member.  Optional input arguments become
      keyword arguments. The result is a list of all the output
      arguments, or a single output if there is only one.
    </para>

    <para>
      Optional output arguments are enabled with a boolean keyword
      argument of that name. For example, "min" (the operation which
      appears in the C API as vips_min()), can be called like this:

<programlisting language="Python">
min_value = im.min()
</programlisting>

      and <code>min_value</code> will be a floating point value giving
      the minimum value in the image. "min" can also find the position
      of the minimum value with the <code>x</code> and <code>y</code>
      optional output arguments. Call it like this:

<programlisting language="Python">
min_value, opts = im.min(x = True, y = True)
x = opts['x']
y = opts['y']
</programlisting>

      In other words, if optional output args are requested, an extra
      dictionary is returned containing those objects.
      Of course in this case, the <code>.minpos()</code> convenience
      function would be simpler, see below.
    </para>

    <para>
      Because operations are member functions and return the result image,
      you can chain them. For example, you can write:

<programlisting language="Python">
result_image = image.sin().pow(2)
</programlisting>

      to calculate the square of the sine for each pixel. There is also a
      full set of arithmetic operator overloads, see below.
    </para>

    <para>
      VIPS types are also automatically wrapped.  The override looks
      at the type of argument required by the operation and converts
      the value you supply, when it can. For example, "linear" takes a
      #VipsArrayDouble as an argument for the set of constants to use for
      multiplication. You can supply this value as an integer, a float,
      or some kind of compound object and it will be converted for you.
      You can write:

<programlisting language="Python">
result_image = image.linear(1, 3)
result_image = image.linear(12.4, 13.9)
result_image = image.linear([1, 2, 3], [4, 5, 6])
result_image = image.linear(1, [4, 5, 6])
</programlisting>

      And so on. A set of overloads are defined for <code>.linear()</code>,
      see below.
    </para>

    <para>
      It does a couple of more ambitious conversions. It will
      automatically convert to and from the various vips types,
      like #VipsBlob and #VipsArrayImage. For example, you can read the
      ICC profile out of an image like this:

<programlisting language="Python">
profile = im.get_value("icc-profile-data")
</programlisting>

      and <code>profile</code> will be a string.
    </para>

    <para>
      You can use array constants instead of images. A 2D array is simply
      changed into a one-band double image. This is handy for things like
      <code>.erode()</code>, for example:

<programlisting language="Python">
im = im.erode([[128, 255, 128],
               [255, 255, 255],
               [128, 255, 128]])
</programlisting>

      will erode an image with a 4-connected structuring element.
    </para>

    <para>
      If an operation takes several input images, you can use a 1D array
      constant or a number constant
      for all but one of them and the wrapper will expand it
      to an image for you. For example, <code>.ifthenelse()</code> uses
      a condition image to pick pixels between a then and an else image:

<programlisting language="Python">
result_image = condition_image.ifthenelse(then_image, else_image)
</programlisting>

      You can use a constant instead of either the then or the else
      parts, and it will be expanded to an image for you. If you use a
      constant for both then and else, it will be expanded to match the
      condition image. For example:

<programlisting language="Python">
result_image = condition_image.ifthenelse([0, 255, 0], [255, 0, 0])
</programlisting>

      Will make an image where true pixels are green and false pixels
      are red.
    </para>

    <para>
      This is also useful for <code>.bandjoin()</code>, the thing to join
      two or more images up bandwise. You can write:

<programlisting language="Python">
rgba = rgb.ibandjoin(255)
</programlisting>

      to add a constant 255 band to an image, perhaps to add an alpha
      channel. Of course you can also write:

<programlisting language="Python">
result_image = image1.ibandjoin(image2)
result_image = image1.ibandjoin([image2, image3])
result_image = Vips.Image.bandjoin([image1, image2, image3])
result_image = image1.ibandjoin([image2, 255])
</programlisting>

      and so on.
    </para>

    <para>
      There's one annoying wrinkle in <code>.bandjoin()</code>. The vips
      operation <code>vips_bandjoin()</code> takes an array of images to join,
      so you can't use it as an instance member, it appears in the API as

<programlisting language="Python">
result_image = Vips.Image.bandjoin([image1, image2, image3])
</programlisting>

      For convenience, the wrapper has an extra instance function called
      <code>.ibandjoin()</code> which puts the <code>self</code> at the head
      of the image array. This means the previous line is the same as:

<programlisting language="Python">
result_image = image1.ibandjoin([image2, image3])
</programlisting>

    </para>

  </refsect3>

  <refsect3 id="python-exceptions">
    <title>Exceptions</title>
    <para>
      The wrapper spots errors from vips operations and raises the
      <code>Vips.Error</code> exception. You can catch it in the
      usual way. The <code>.detail</code> member gives the detailed
      error message.
    </para>
  </refsect3>

  <refsect3 id="python-memory">
    <title>Reading and writing areas of memory</title>
    <para>
      You can use the C API functions vips_image_new_from_memory() and
      vips_image_write_to_memory() directly from Python to read and write
      areas of memory. This can be useful if you need to get images to and
      from other other image processing libraries, like PIL or numpy.
    </para>

    <para>
      Use them from Python like this:

<programlisting language="Python">
image = Vips.Image.new_from_file("/path/to/some/image/file.jpg")
memory_area = image.write_to_memory()
</programlisting>

      <code>memory_area</code> is now a string containing uncompressed binary
      image data. For an RGB image, it will have bytes
      <code>RGBRGBRGB...</code>, being
      the first three pixels of the first scanline of the image. You can pass
      this string to the numpy or PIL constructors and make an image there.
    </para>

    <para>
      Note that <code>.write_to_memory()</code> will make a copy of the image.
      It would
      be better to use a Python buffer to pass the data, but sadly this isn't
      possible with gobject-introspection, as far as I know.
    </para>

    <para>
      Going the other way, you can construct a vips image from a string of
      binary data. For example:

<programlisting language="Python">
image = Vips.Image.new_from_file("/path/to/some/image/file.jpg")
memory_area = image.write_to_memory()
image2 = Vips.Image.new_from_memory(memory_area,
                                    image.width, image.height, image.bands,
                                    Vips.BandFormat.UCHAR)
</programlisting>

      Now <code>image2</code> should be an identical copy of <code>image</code>.
    </para>

    <para>
      Be careful: in this direction, vips does not make a copy of the memory
      area, so if <code>memory_area</code> is freed by the Python garbage
      collector and
      you later try to use <code>image2</code>, you'll get a crash.
      Make sure you keep a reference to <code>memory_area</code> around
      for as long as you need it.
    </para>
  </refsect3>

  <refsect3 id="python-modify">
    <title>Draw operations</title>
    <para>
      Paint operations like <code>draw_circle</code> and <code>draw_line</code>
      modify their input image. This makes them hard to use with the rest of
      libvips: you need to be very careful about the order in which operations
      execute or you can get nasty crashes.
    </para>

    <para>
      The wrapper spots operations of this type and makes a private copy of
      the image in memory before calling the operation. This stops crashes,
      but it does make it inefficient. If you draw 100 lines on an image,
      for example, you'll copy the image 100 times. The wrapper does make sure
      that memory is recycled where possible, so you won't have 100 copies in
      memory. At least you can execute these operations.
    </para>

    <para>
      If you want to avoid the copies, you'll need to call drawing
      operations yourself.
    </para>
  </refsect3>

  <refsect3 id="python-overloads">
    <title>Overloads</title>
    <para>
      The wrapper defines the usual set of arithmetic, boolean and
      relational overloads on
      <code>image</code>. You can mix images, constants and lists of
      constants (almost) freely. For example, you can write:

<programlisting language="Python">
result_image = ((image * [1, 2, 3]).abs() &lt; 128) | 4
</programlisting>
    </para>

    <para>
      The wrapper overloads <code>[]</code> to be vips_extract_band(). You
      can write:

<programlisting language="Python">
result_image = image[2]
</programlisting>

      to extract the third band of the image. It implements the usual
      slicing and negative indexes, so you can write:

<programlisting language="Python">
result_image = image[1:]
result_image = image[:3]
result_image = image[-2:]
result_image = [x.avg() for x in image]
</programlisting>

      and so on.
    </para>

    <para>
      The wrapper overloads <code>()</code> to be vips_getpoint(). You can
      write:

<programlisting language="Python">
r, g, b = image(10, 10)
</programlisting>

      to read out the value of the pixel at coordinates (10, 10) from an RGB
      image.
    </para>

  </refsect3>

  <refsect3 id="python-expansions">
    <title>Expansions</title>
    <para>
      Some vips operators take an enum to select an action, for example
      <code>.math()</code> can be used to calculate sine of every pixel
      like this:

<programlisting language="Python">
result_image = image.math(Vips.OperationMath.SIN)
</programlisting>

      This is annoying, so the wrapper expands all these enums into
      separate members named after the enum. So you can write:

<programlisting language="Python">
result_image = image.sin()
</programlisting>

      See <code>help(Vips.Image)</code> for a list.
    </para>
  </refsect3>

  <refsect3 id="python-utility">
    <title>Convenience functions</title>
    <para>
      The wrapper defines a few extra useful utility functions:
      <code>.get_value()</code>,
      <code>.set_value()</code>,
      <code>.bandsplit()</code>,
      <code>.maxpos()</code>,
      <code>.minpos()</code>,
      <code>.median()</code>.
      Again, see <code>help(Vips.Image)</code> for a list.
    </para>
  </refsect3>

  <refsect3 id="python-args">
    <title>Command-line option parsing</title>
    <para>
      GLib includes a command-line option parser, and Vips defines a set of
      standard flags you can use with it. For example:

<programlisting language="Python">
import sys
from gi.repository import GLib, Vips

context = GLib.OptionContext(" - test stuff")
main_group = GLib.OptionGroup("main",
                              "Main options", "Main options for this program",
                              None)
context.set_main_group(main_group)
Vips.add_option_entries(main_group)
context.parse(sys.argv)
</programlisting>
    </para>

</refsect3>

</refentry>