From eefb2dad980c12922ad15d987be7c4b42a44d5cf Mon Sep 17 00:00:00 2001
From: John Cupitt <jcupitt@gmail.com>
Date: Sun, 4 Mar 2018 18:30:25 +0000
Subject: [PATCH] improve rounding in convi intize

We were computing the shared exponent with ceil(log2(max)) when
intize-ing convolution masks. However, the vector path has a true range
of [-1.0, 1.0), so a mask with 1.0 as the max (for example) was actually
triggering the overflow detector and falling back to the C path.

Use ceil(log2(max + 1)) instead, so 1.0 (for example) will be mapped
to 0.5 and won't overflow.
---
 ChangeLog                   | 2 ++
 libvips/convolution/convi.c | 6 +++++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index 37ba8378..68f64654 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -2,6 +2,8 @@
 - use pkg-config to find libjpeg, if we can
 - better clean of output image in vips_image_write() fixes a crash 
   writing twice to memory
+- better rounding behaviour in convolution means we hit the vector path more
+  often
 
 5/1/18 started 8.6.2
 - vips_sink_screen() keeps a ref to the input image ... stops a rare race
diff --git a/libvips/convolution/convi.c b/libvips/convolution/convi.c
index 2045ede7..18b812cf 100644
--- a/libvips/convolution/convi.c
+++ b/libvips/convolution/convi.c
@@ -904,8 +904,12 @@ vips_convi_intize( VipsConvi *convi, VipsImage *M )
 
 	/* The mask max rounded up to the next power of two gives the exponent
 	 * all elements share. Values are eg. -3 for 1/8, 3 for 8.
+	 *
+	 * Add one so that a max which is an exact power of two also rounds
+	 * up. We multiply by 128 later, so 1.0 (for example) would become
+	 * 128, which is outside signed 8 bit.
 	 */
-	shift = ceil( log2( mx ) );
+	shift = ceil( log2( mx + 1 ) );
 
 	/* We need to sum n_points, so we have to shift right before adding a
 	 * new value to make sure we have enough range.