Potrace: a polygon-based tracing algorithm

如果无法正常显示，请先停止浏览器的去广告插件。

1. Potrace: a polygon-based tracing algorithm Peter Selinger September 20, 2003 1 Introduction Black-on-white images can be represented either as a bitmap or as a vector outline. A bitmap represents an image as a grid of black or white pixels. A vector outline describes an image via an algebraic description of its contours, typically in the form of Bezier curves. The advantage of representing an image as a vector outline is that it can be scaled to any size without loss of quality. Outline images are independent of the resolution of any particular output device. They are particularly popular in the description of fonts, which must be reproducible at many different sizes. Examples of outline font formats include PostScript Type 1 fonts, TrueType, and Metafont. On the other hand, most actual input and output devices, such as scanners, displays, and printers, ultimately produce or consume bitmaps. The process of converting a vector outline to a bitmap is called rendering. The converse process of turning bitmaps into outlines is called tracing. It is clear that no tracing algorithm can be perfect in an absolute sense, as there are in general many possible outlines that can give rise to the same bitmap. The process of tracing cannot be used to generate information that is not already present. On the other hand, out of the many possible outlines that could give rise to a given bitmap, clearly some are more plausible or aesthetically pleasing than others. For example, a common way of rendering bitmaps at a high resolution is to draw each black pixel as a precise square, which gives rise to “jaggies” or staircase patterns. Clearly, jaggies are neither pleasant to look at, nor are they particularly plausible interpretations of the original image. There is probably no absolute measure of what constitutes a good tracing algorithm, but it seems clear that some algorithms give better results than others. In this paper, we describe a tracing algorithm that is simple, efficient, and tends to produce excellent results. The algorithm is called Potrace, which stands for polygon tracer. However, the output of the algorithm is not a polygon, but a smooth contour made from Bezier curves. The name of the algorithm derives from the fact that it uses polygons as an intermediate representation of images. The Potrace algorithm is designed to work well on high resolution images. Thus, a typical application is to produce a vector outline from a company or university logo that has been scanned at a high resolution. Another possible application is the conversion of bitmapped fonts to outline fonts, if the original bitmapped fonts are available at a high enough resolution. No tracing algorithm will work well on very small scales, such as 1

2. (a) (b) (c) (d) Figure 1: Corner detection. (a) the original bitmap; (b) too many corners; (c) too few corners; (d) good corner detection. bitmaps for a typical 10pt screen font rendered at 75dpi. However, it will do a decent job of tracing non-exact shapes, such as scanned handwriting or cartoon drawings, even at relatively moderate resolutions. Any good tracing algorithm has to perform several functions. Two of these func- tions are to find the most plausible curve that approximates a given outline, and to detect corners. There is a tradeoff between these two goals. If too many corners are detected, the output will look like a polygon and will no longer be smooth. If too few corners are detected, the output will look smooth but too rounded. An example is shown in Figure 1. Another important function performed by a tracing algorithm is to decide which features of the bitmapped image are relevant, and which features are artifacts of the scanning or rendering process. Those features that can be explained as artifacts should be filtered out completely, because if even a slight hint of these features remains, this can lead to visible imperfections in the output. Consider a straight line of positive, but very small, slope. When rendered as a bitmap, such a line will lead to a staircase pattern, where the individual steps of the stair could be far apart. No matter how far the steps are apart, the output should be a straight line, or else it will be visually annoying. This example also shows that tracing is not in general a local operation, i.e., it cannot be based on merely looking at fixed-size neighborhoods of a point. Although the Potrace algorithm is very efficient, it produces nicer output than other comparable algorithms. For instance, Figure 2 compares the output of Potrace 1.0, with its standard settings, to that of AutoTrace 0.31.1, another freely available tracing program (see http://autotrace.sourceforge.net/). In addition to its superior graphical output, Potrace also compares favorably to AutoTrace in terms of speed and file size: The bitmap in Figure 2 took Potrace 0.27 seconds to process, compared to 1.69 seconds for AutoTrace. Potrace produces an EPS file of 15790 bytes, compared to 39788 bytes for AutoTrace. 2 Description of the Potrace algorithm The Potrace algorithm transforms a bitmap into a vector outline in several steps. In the first step, the bitmap is decomposed into a number of paths, which form the boundaries 2

3. Figure 2: A detail from the seal of Stanford University; the original scanned image, left; the output of AutoTrace, center; the output of Potrace, right. between black and white areas. In the second step, each path is approximated by an op- timal polygon. In the third step, each polygon is transformed into a smooth outline. In an optional fourth step, the resulting curve is optimized by joining consecutive Bezier curve segments together where this is possible. Finally, the output is generated in the required format. The following subsections describe each of these steps in more detail. 2.1 Paths 2.1.1 Path decomposition We imagine our bitmapped image to be placed on a coordinate system such that the corners (and not the centers) of each pixel have integer coordinates. Let us further assume that the background color of the image is white, and the foreground color is black. By convention, the parts of the coordinate plane that lie outside the bitmap boundaries are assumed to be filled with white pixels. We now construct a directed graph from our bitmap as follows. Let p be a point of integer coordinates; such a point is adjacent to four pixels. The point is called a vertex if the four pixels are not all of the same color. If v and w are vertices, we say that there is an edge from v to w if the Euclidean distance between v and w is 1, and if the straight line segment from v to w separates a black pixel from a white pixel, so that the black pixel is to its left and the white pixel is to its right when traveling in the direction from v to w. Let us call the resulting directed graph G with the vertices and edges just described. A path is a sequence of vertices {v 0 , . . . , v n } such that there is an edge from v i to v i+1 , for all i = 0, . . . , n − 1, and such that all these edges are distinct. A path is called closed if further, v n = v 0 . The length of a path is the number of edges in it, i.e., n. The goal of path decomposition is to decompose the graph G into closed paths, i.e., to find a set of closed paths in which each edge of G occurs exactly once. Potrace uses the following straightforward method to decompose a bitmap into paths. Start by picking a pair of adjacent pixels of different color. This can be accom- plished, for instance, by picking the leftmost black pixel in some row. The two chosen 3

4. ? Figure 3: The path extension algorithm pixels meet at an edge; we orient this edge so that the black pixel is to its left and the white pixel is to its right. This edge defines a path of length one. We then continue to extend this path in such a way that each new edge has a black pixel on its left and a white pixel on its right, relative to the direction of the path. In other words, we move along the edges between pixels, and each time we hit a corner, we either go straight or turn left or right, depending on the colors of the surrounding pixels as shown in Fig- ure 3. We continue until we return to the vertex where we started, at which point we have defined a closed path. Every time we have found a closed path, we remove it from the graph by inverting all the pixel colors in its interior. This defines a new bitmap, to which we apply the algorithm recursively until there are no more black pixels left. The result is a set of closed paths to be passed to the next phase of the Potrace algorithm. The later phases of the Potrace algorithm look at each path independently. 2.1.2 Turn policies In the situation in Figure 3(d), we have a choice of whether to take a left turn or a right turn. This choice has no effect on the success or failure of the path decomposition algorithm, as we will end up with a set of closed paths either way. However, the choice does have an effect on the shape of the closed paths chosen. In the Potrace algorithm, the choice of whether to turn left or right is governed by a turn policy, which can be defined via the --turnpolicy command line option. Possible turn policies are: left, which always takes a left turn, right, which always takes a right turn, black, which prefers to connect black components, white, which prefers to connect white components, minority, which prefers to connect the color (black or white) that occurs least frequently within a given neighborhood of the current position, majority, which prefers to connect the color that occurs most frequently, and random, which makes a (more or less) random choice. The default turn policy is minority. The reason that black and white are distinct turn policies from right and left is that some pixel colors may get inverted during the course of the path decomposition algorithm. The black and white policies look at the original pixel colors to determine the direction of the turn. 4

5. (a) (b) (c) (d) (e) Figure 4: Examples of straight and non-straight paths. The vertices of the path are shown as dots, and their 1/2-neighborhoods are shown as squares. (a), (b), and (d) are straight, whereas (c) and (e) are not. 2.1.3 Despeckling Despeckling can be performed by dropping all paths whose interior consists of fewer than t pixels, for a given parameter t. The parameter t can be set with the --turdsize command line option. The area of the interior of a path can be efficiently computed by the formula Z Z Area = y dx = yx ′ dt. 2.2 Polygons The second phase of the Potrace algorithm has as its input a closed path as defined in Section 2.1. The output is an optimal polygon that approximates this path. We start by making precise what is meant by “optimal” and by “approximates”. 2.2.1 Straight paths Given two points z 0 = (x 0 , y 0 ) and z 1 = (x 1 , y 1 ) in the coordinate plane, not necessar- ily of integer coordinates, we define their max-distance to be d(z 0 , z 1 ) = max{|x 1 − x 0 |, |y 1 − y 0 |}. Thus, the set of points of max-distance at most 1/2 from the point (1/2, 1/2) is just the pixel centered at (1/2, 1/2). For any two points a, b in the coordinate plane, let ab denote the straight line seg- ment connecting a and b. Here a and b are not required to have integer coordinates. Given a non-closed path p = {v 0 , . . . , v n } as in Section 2.1, we say that a line segment ab approximates the path if d(v 0 , a) 6 1/2, d(v n , b) 6 1/2, and for each i = 1, . . . , n − 1, there exists some point c i on ab such that d(v i , c i ) 6 1/2. For a path p = {v 0 , . . . , v n }, we say the direction at index i is v i+1 − v i , where i = 0, . . . , n − 1. There are four possible directions: (0, 1), (1, 0), (0, −1), and (−1, 0). A path is called straight if it is approximated by some line segment, and not all four directions occur in p. Figure 4 shows some examples of straight and non-straight paths. Note that in this figure, the dots represent vertices in the path, which lie at the corners, not at the 5

6. centers, of the pixels of the original bitmap. The squares shown are not pixels, but rather neighborhoods of path points. Figure 4(e) shows an example of a path that is not straight, although it is approxi- mated by some line segment. This is because all four directions occur in this path. It is clear from the definition that if a path is straight, then so are all its subpaths. In order to compute whether a given path is straight or not, we use the stronger fact that straightness is a triplewise property, in the following sense. Suppose that a given path p = {v 0 , . . . , v n } does not use all four directions. Then p is straight if and only if for all triples (i, j, k) of indices such that 0 6 i < j < k 6 n, there exists a point w on the straight line through v i and v k such that d(v j , w) 6 1. This observation gives rise to a naive straightness testing algorithm that is of cubic complexity in the worst case; it proceeds simply by testing the above property for all triples (i, j, k). In the Potrace implementation, we use an optimization that allows us to find all straight subpaths of a given closed path of length n in time O(n 2 ) in the worst case. Briefly, the trick is to compute, for every pair (i, j), a constraint on the position of all future v k ’s. If i is fixed and j is increasing, it suffices to check the constraint once for each j. Moreover, a constraint consist of at most two inequalities and can be updated and checked in constant time. 2.2.2 Polygons Now consider a closed path p = {v 0 , . . . , v n }. Recall that v n = v 0 , so that this path is of length n. Any pair of indices i, j ∈ {0, . . . , n − 1} defines a subpath p i, j , which is v i , . . . , v j if i 6 j, or v i , . . . , v n−1 , v 0 , . . . , v j if j < i. Let us write j − ◦ i for the cyclic difference between i and j, which is defined as j −i ◦ = j − i if i 6 j, and j −i ◦ = j − i+ n if j < i. Thus, the length of the subpath p i, j is precisely j −i. ◦ In the following discussion, we often assume tacitly that additions and subtractions are taken modulo n. We now want to construct a polygon from the closed path p. We say that there is a possible segment from i to j if j − ◦ i 6 n − 3 and the subpath p i−1, j+1 is straight in the sense of the previous definition. In other words, a subpath corresponds to a possible segment if it can be extended by one point in both directions and still be straight. This peculiar “clipping” of a vertex from both ends of a straight path is important to the overall quality of the output of the Potrace algorithm; without it, there would be strange behavior around the corners. Note that any path of length 3 is straight in the sense of Section 2.2.1, thus it is guaranteed that there is always a possible segment from i to i + 1. A polygon, for the purpose of this phase of the algorithm, is a sequence of in- dices i 0 6 i 1 6 . . . 6 i m−1 such that there is a possible segment from i k to i k+1 for k = 0, . . . , m − 2, and from i m−1 to i 0 . Figure 5 shows a path and two possible polygons for it. Note that the polygon segments shown in Figure 5 do not actually have to approx- imate their corresponding subpaths in the sense of the red line segments of Figure 4. They simply represent the fact that an approximating line segment exists. 6

7. Figure 5: An optimal and a non-optimal polygon for a path 2.2.3 Penalties Out of all possible polygons, we now want to find an optimal one. Our primary cri- terion for optimality is the number of segments: a polygon with fewer segments is considered more optimal than one with more segments. In Figure 5, the left polygon has 14 segments, whereas the right one has 17 segments. Thus, the left polygon is more optimal than the right one. Among the polygons of the same number of segments, some are still more prefer- able than others. We associate to every possible segment a penalty. Given a possible segment from i to j, associate to it the straight line segment v i v j (shown in blue in Figure 5). The penalty associated with the segment is equal to the Euclidean length of v i v j , times the standard deviation of the Euclidean distances of the path points from v i v j . In symbols, the penalty is equal to v u j u 1 dist(v k , v i v j ) 2 , P i, j = |v j − v i | · t ∑ j − ◦ i + 1 k=i where dist(a, bc) denotes the Euclidean distance of a point from a straight line, and it is understood that the sum counts from i to j in a cyclic manner. In words, the further the path points stray from the segment, the greater the penalty. The formula for P i, j was chosen because it can be computed efficiently; namely, let (x, y) = v j − v i and (x, y) = (v i + v j )/2. Then we have p P i, j = cx 2 + 2bxy + ay 2 , where a = b = c = j E(x 2 k ) − 2xE(x k ) + x 2 , E(x k y k ) − xE(x k ) − yE(y k ) + xy, E(y 2 k ) − 2yE(y k ) + y 2 . 1 2 2 Here E(x 2 k ) = j − ◦ i+1 ∑ k=i x k is the expected value of x k for k = i, . . . , j, and similarly for the other “E” notations. Note that the sums can be computed ahead of time, by making a table of sums of j the form ∑ k=0 q k , for each quantity q k to be summed. After making such tables, which 7

8. takes time and space linear in the length of the given path, the above formula for P i, j can be computed in constant time for each given i and j. 2.2.4 Optimal polygon We can regard the given closed path p = {v 0 , . . . , v n } as a directed graph with vertices 0, . . . , n − 1, where there is an arrow from i to j if there is a possible segment from i to j. To each sequence i 0 → i 1 → . . . → i k of arrows, we can associate a penalty, which is an ordered pair (k, P), where k is the number of arrows in the sequence, and P is the sum of their numerical penalties as discussed in Section 2.2.3. Penalties are compared lexicographically, i.e., (k, P) is preferable to (k ′ , P ′ ) if either k < k ′ , or k = k ′ and P < P ′ . In this way, the problem of finding an optimal polygon reduces to the problem of finding an optimal cycle in a directed graph. We use a variant of a standard graph- theoretic algorithm to solve this problem efficiently. Once the graph has been com- puted, an optimal cycle can be found in time O(nm), where n is the size of the input path, and m is the length of the longest possible segment. We remark that it is this optimization step that makes our algorithm non-local, because we have to consider an entire path at once; each part of the optimal polygon depends potentially on all the other parts. The previous phase, which computes a path from a bitmapped image, and the following phase, which transforms a polygon into a vector outline, are strictly local, in that they only look at a few adjacent points at a time. 2.3 From polygons to vector outlines The final phase of the algorithm transforms the polygon obtained in the previous phase into a smooth vector outline. In a preliminary step, we adjust the position of the vertices of the polygon to fit the original bitmap as best as possible. In the main step, we calculate corners and curves based on the lengths of adjacent line segments and the angles between them. 2.3.1 Vertex adjustments The output of the previous phase of the algorithm is a polygon {i 0 , . . . , i m−1 } associated to a closed path {v 0 , . . . , v n }. We refer to the indices i 0 , . . . , i m−1 , as well as to their associated points v i 0 , . . . , v i m−1 , as the vertices of the polygon. As our polygon is cyclic, we follow the usual convention of taking indices modulo m. For the purpose of calculating penalties, we have placed the vertex i of the polygon precisely at the corresponding path point v i , which is a point with integer coordinates in the coordinate plane (i.e., located at a meeting point of four pixels in the original bitmap). While this placement of vertices allowed us to calculate penalties efficiently, it is not necessarily the optimal arrangement. We now associate to each vertex i k a point a k in the coordinate plane, not necessarily of integer coordinates, such that a k is near v i k , and such that, for any two consecutive vertices i k and i k+1 of the polygon, the resulting line segment a k a k+1 is reasonably close to the original subpath v i k , . . . , v i k+1 . 8

9. o z 2 z 1 z 0 z 3 Figure 6: A typical Bezier curve We use the following algorithm for placing the points a k : for each consecutive pair of vertices i k and i k+1 , calculate the straight line L k,k+1 that optimally approximates the points v i k , . . . , v i k+1 , in the sense that it minimizes the squares of their Euclidean distances to the line. Now if i k−1 , i k , and i k+1 are consecutive vertices, then we ideally want to place a k at the intersection of L k−1,k and L k,k+1 . However, we do not want a k to be too far from the original vertex v i k . Thus, we let a k be a point in the unit square with max-distance d(a k , v i k ) 6 1/2 such that the sum of the square of the Euclidean distances from a k to L k−1,k and L k,k+1 is minimal. In particular, if the intersection of L k−1,k and L k,k+1 lies in this unit square, then we place a k at the intersection; else, we place it at a point near v i k that is “close” to the intersection. Calculating a k is easy, as it simply amounts to minimizing a quadratic function on a square. Also, the straight line L k,k+1 is easily computed from the data points v i k , . . . , v i k+1 by using a standard method of “best fit”: this line passes through the center of gravity (E(x k ), E(y k )), where k = i k , . . . , i k+1 , and its slope is given by the a b eigenvector of the larger eigenvalue of the matrix , where b c a b c = E(x 2 j ) − E(x j ) 2 , = E(x j y j ) − E(x j )E(y j ), = E(y 2 j ) − E(y j ) 2 . 2.3.2 A class of Bezier curves The purpose of this section is to make a simple, yet useful observation about Bezier curves. Recall that a Bezier curve is given by four control points z 0 , z 1 , z 2 , z 3 , and by the parametric equation z = (1 −t) 3 z 0 + 3t(1 −t) 2 z 1 + 3t 2 (1 −t)z 2 +t 3 z 3 . For the purposes of our analysis, we restrict ourselves to the case where the straight lines through z 0 z 1 and through z 3 z 2 intersect at a point o (i.e., they are not parallel). Further, we restrict ourselves to curves that are convex and change direction by less than 180 degrees; this means that z 1 lies between z 0 and o, and that z 2 lies between z 3 and o. The situation is as shown in Figure 6. By a linear transformation of the coordinate system, we can assume that z 0 = (−1, 0), z 3 = (1, 0), and o = (0, 1). Any Bezier curve of this particular form is uniquely determined by two parameters α, β ∈ [0, 1] such that z 1 = (−1 + α, α) and z 2 = (1 − β, β). Figure 7 gives an overview of the Bezier curves in this standardized 9

10. Figure 7: A 2-parameter family of Bezier curves form for all values of α and β that are multiples of 0.2. In the illustration, the control points z 1 and z 2 are shown as red dots. We can see immediately from the illustration that the Bezier curves in any particular horizontal row are visually almost indistinguishable, except perhaps in the case when α or β are very close to 0. We will see that our algorithm never produces Bezier curves with α and β very small, so that we can ignore the latter possibility. It follows that we do not lose any interesting curves if we restrict ourselves to the case α = β. This eliminates one degree of freedom from the set of possible Bezier curves that we need to consider, and thus it simplifies our task of finding optimal curves. We should emphasize that we do not claim that all Bezier curves resemble the ones shown in Figure 7. Rather, this is the case up to a linear transformation. Thus, if z 0 and z 3 are given, there are two degrees of freedom in the placement of o, and one additional degree of freedom in the choice of α. By setting α = β, a fourth degree of freedom has 10

11. b i b i a i a i a i a i b i b i–1 b i–1 b i–1 (a) α < 0.55 (b) α = 0.65 (c) α = 0.9 b i b i–1 (d) α > 1 Figure 8: Corner detection and smoothing been eliminated. An interesting fact is that the area enclosed between a Bezier curve of the above 3 3 (2α + 2β − αβ), or 10 (4 − (2 − α)(2 − β)). From form and the x-axis is equal to 10 Figure 7, we find that two curves look very similar if they enclose areas of equal size. Thus, we may approximate any p curve with parameters α and β by a new curve with equal parameters α ′ = β ′ = 2 − (2 − α)(2 − β). Another interesting measure of a Bezier curve is the height of its highest point. In case α = β, the highest point is reached when t = 1/2, and its y-coordinate is 3α/4. 2.3.3 Smoothing and corner analysis The input to the last phase of the algorithm is the adjusted polygon from Section 2.3.1. Suppose the vertices of this polygon are a 0 , . . . , a k−1 . Let b 0 , . . . , b k−1 be the midpoints of the edges of the polygon, i.e., b i = (a i + a i+1 )/2. For each i, we now consider the corner b i−1 ..a i ..b i , and we decide whether to approximate it by a smooth curve, as shown by the blue line in Figure 8(a)–(c), or by a sharp angle, as shown in Figure 8(d). We proceed as follows. First, we draw a unit square centered at the point a i . Next, we find the line L i that is parallel to b i−1 b i , that touches the square around a i , and that is as close as possible to the line b i−1 b i . Let c be the point where L i intersects b i−1 a i , and let γ be the quotient of the lengths of b i−1 c and b i−1 a i . Let α = 4γ/3 and consider the Bezier curve (of the kind discussed in Section 2.3.2) connecting b i−1 and b i with parameter α. This curve is tangent to the three lines b i−1 a i , L i , and a i b i . We use the parameter α just calculated to perform corner detection and to determine the final curve from b i−1 to b i . There are two cases. If α 6 1, then we draw a smooth Bezier curve at this vertex, as shown in Figure 8(a)–(c). If α > 1, there is no convex Bezier curve connecting b i−1 and b i and tangent to L i . In this case, we have detected a corner and we connect b i−1 and b i via two straight line segments that meet at a i , as shown in Figure 8(d). Corner detection can be customized via the so-called corner threshold parameter α max , which is configurable via the --alphamax command line option. If this param- 11

12. eter is set, then a vertex will be rounded if α 6 α max , and a corner if α > α max . Thus, smaller values of α max lead to more corners, as in Figure 1(b), and larger values lead to more rounded shapes, as in Figure 1(c). The default value is α max = 1. If α max < 0, then no smoothing is performed and the output of Potrace is a polygon. If α max > 4/3, then there will be no corners at all and the output is an everywhere smooth curve. After corners have been detected, the value of α is further adjusted to be between 0.55 and 1. The lower bound α > 0.55 was chosen to prevent the curve from becoming too “flat”. Allowing α < 0.55 often leads to strange looking images. The upper bound of 1 ensures that the resulting Bezier curve segment is convex. The value α = 0.55 was chosen because it tends to give a good approximation to a circle in case the input is a regular polygon. It was chosen to be close to the theoretical value 4 √ α 0 = ( 2 − 1) ≈ 0.552285, 3 which gives the best possible approximation by a Bezier curve segment to a quarter cir- cle. More precisely, the Bezier curve with control points z 0 = (1, 0), z 1 = (1, α 0 ), z 2 = (α 0 , 1), and z 3 = (0, 1) lies between the unit circle and the circle of radius 1.00027253. Thus, this curve deviates from a true circle (of median radius) by less than 0.01363%. Although this approximation of a quarter circle by a Bezier curve segment is well- known, the exact bound is difficult to find in the literature; for instance, Faux and Pratt [1, p.134] falsely give this value as 0.13%, due to an apparent typographical error, whereas Knuth [2, p.14] gives it only as “less than 0.06%”. Note that our corner detection algorithm has the following property: Corners are favored both by sharp angles and by long segments. Thus, we detect a corner if two short segments meet at a very sharp angle, and also if two very long segments meet at a slight angle. 2.4 Curve optimization The output of the previous phase of the Potrace algorithm, after corner analysis and smoothing, is a curve consisting of Bezier curve segments and straight line segments. The resulting curve is very close to the final output of Potrace. However, there is an optional last phase of the algorithm, the curve optimization phase, which attempts to further optimize the curve by joining adjacent Bezier curve segments together if this is possible. Curve optimization only makes very small changes to the shape of the final curve; small enough that the difference is not normally visible. However, the resulting curve consists of fewer segments, and thus can be represented more compactly in the final output of the program. If curve optimization is not desired, it can be disabled by giving the --longcurve command line option to Potrace. The curve optimization algorithm is based on a few simple ideas. First, we only attempt to join adjacent curved segments, never straight line segments or corners. Sec- ond, we only join adjacent curve segments that agree in convexity, i.e., they all curve to the right or all to the left. Third, we only join adjacent curve segments if the to- tal change in direction is less than 179 degrees. (We do not quite allow 180 degrees, in order to avoid unbounded quantities in the computations below). This leaves us to consider a sequence of segments like the one shown in blue in Figure 9. 12

13. b 0 a 1 b 1 a 2 O b 2 b n–1 a n b n Figure 9: Curve optimization The question is whether we can find a single Bezier curve from b 0 to b n that ap- proximates the given sequence of shorter Bezier curves. Suppose there was such a curve C. Clearly, C would have to be tangent to b 0 a 1 and a n b n . We can thus find the point O where b 0 a 1 and a n b n intersect. Following our discussions in Section 2.3.2, this leaves only one degree of freedom in the curve to be considered, namely the parameter α. If we impose the further requirement that the area enclosed by the curve C should be equal to the total area enclosed by the original curve segments and the line b 0 b n , then this uniquely determines the parameter α. Recall from Section 2.3.2 that the areas in question are easily calculated. This leaves us with a unique Bezier curve C that is a candidate for approximating the given segments. It is shown in red in Figure 9. It remains to check whether C actually is an acceptable approximation to the given curve segments, and if yes, to assign it a numerical penalty. We do this by a simple tangency check. For each i = 1, . . . , n − 1, we find the point z i on C where the tangent to C is parallel to a i a i+1 . We let d i be the Euclidean distance of z i to the line segment a i a i+1 . Further, for each i = 1, . . . , n, we find the point z ′ i on C where the tangent to C is parallel to b i−1 b i . We let d i ′ be the Euclidean distance of z ′ i to the line segment L i define in Section 2.3.3, counted positive if z ′ i is on the same side of L i as a i , and otherwise negative. We say that the approximation is acceptable if all d i 6 ε, d i ′ > −ε, and the orthog- onal projection of z i onto the line a i a i+1 lies between a i and a i+1 . Here, the value ε is a constant called the tolerance of the curve optimization algorithm; it is pre-set to 0.2, and it can be altered via the --opttolerance option. For an acceptable curve, we define its penalty to be the sum of the squares of all the distances d i and d i ′ . Finally, we use a standard graph-theoretic algorithm for 13

14. shortest path search to decompose a given sequence of curve segments into acceptable approximations, optimizing first the number of segments, then the total penalty. 2.5 Output generation 2.5.1 Scaling and rotation The Potrace algorithm has produced a family of curves, each of which consists of Bezier segments and straight line segments. The endpoints and control points of these segments are arbitrary points in the coordinate plane. Depending on the chosen back- end and parameters, Potrace now performs a linear transformation (to scale the image to the desired size, and possibly to rotate it). 2.5.2 Redundancy coding When using one of the PostScript backends (PostScript or EPS), Potrace uses a very compact numerical format to represent Bezier curves in the output. To do so, it takes advantage of redundancies in the curve parameters. In principle, 6 parameters are needed to describe each Bezier curve segment (1 endpoint and 2 control points). How- ever, by eliminating redundancies in these parameters, Potrace can encode each seg- ment by using only 3 to 4 real numbers. One degree of freedom can be eliminated because we only use curves with α = β, see Section 2.3.2. Another degree of freedom can be eliminated because the point b i always lies on the line segment a i a i+1 , see Sec- tion 2.3.3. A third degree of freedom can often be eliminated because b i is actually half way between a i and a i+1 for those curve segments that were not affected by the curve optimization step of Section 2.4. This redundancy coding of Bezier curves is only performed in the PostScript back- end, because it takes advantage of the macro capabilities of the PostScript language. Redundancy coding can be turned off with the --longcoding option, resulting in longer, but more readable output. 2.5.3 Quantization For most backends, the final coordinates, which are real numbers, are quantized, which means they are rounded to the closest 1/10 pixel. Thus, the number of decimal digits needed to represent each coordinate is reduced by effectively placing all control points on a very fine grid. The coordinates of the points can then be output as integers. The default quantization constant of 1/10 usually gives good results; however, it is config- urable via the --unit command line option. 3 A complete example A complete example of a run of the Potrace algorithm is shown in Figure 10. Part (a) shows the original bitmap. In part (b), note how the default “minority” turn policy keeps the black outlines along the outside of the figure connected, while at the same time keeping the white outlines inside the figure’s hair connected as well. Also note 14

15. (a) (b) 0.52 0.58 0.64 0.44 0.52 0.59 0.58 0.64 0.27 0.44 0.00 0.59 0.27 0.00 0.00 0.77 0.00 0.77 0.52 0.52 0.00 0.67 1.03 0.00 0.00 0.00 0.49 0.67 1.03 0.07 0.00 0.00 0.49 0.07 0.00 0.00 0.57 0.43 0.04 0.00 0.50 0.82 0.86 0.54 0.57 0.04 0.00 0.43 0.04 0.46 0.00 0.50 0.82 0.86 0.54 0.00 0.00 0.46 0.86 0.16 0.04 0.00 0.86 0.16 0.00 0.00 0.00 0.00 0.66 0.00 0.20 0.66 0.46 0.00 0.00 0.00 0.00 0.00 0.20 0.46 0.00 0.00 0.00 0.09 0.00 0.09 0.00 0.19 0.00 0.00 0.00 0.30 0.53 0.19 0.00 0.00 0.00 0.63 0.17 0.30 0.53 0.00 0.63 0.17 0.00 0.00 0.01 0.01 0.00 0.00 0.00 0.46 0.27 0.00 0.00 0.85 0.18 0.00 0.34 0.00 0.34 0.18 0.18 0.23 0.00 0.18 0.48 0.23 0.47 0.01 0.47 0.00 0.14 0.68 0.00 0.47 0.01 0.00 0.11 0.46 0.47 0.18 0.48 0.71 0.00 0.14 0.11 0.57 0.22 0.61 0.00 0.20 0.61 0.23 0.61 0.20 0.73 0.39 0.57 0.71 0.61 0.20 0.00 0.20 0.00 0.73 0.00 0.56 0.23 0.00 0.00 0.00 0.43 0.39 0.00 0.71 0.00 0.73 0.23 0.61 0.00 0.56 0.61 0.46 0.57 0.22 0.61 0.68 0.00 0.71 0.73 0.57 0.00 0.46 0.27 0.00 0.00 0.85 0.18 0.61 0.23 0.00 0.00 0.43 0.78 0.78 0.42 0.42 0.15 0.15 1.02 1.02 0.00 0.00 0.00 0.00 0.73 0.00 0.00 0.73 0.00 0.63 0.00 (c) 0.63 (d) (e) Figure 10: A complete example. (a) the original bitmap, (b) path decomposition and optimal polygon, (c) vertex adjustment, corner analysis, and smoothing, (d) curve op- timization, (e) the final output. 15

16. that a speckle of size 1, inside the figure’s hair, has been removed. Part (b) also shows the optimal polygon calculated for each path component. Part (c) shows the adjusted polygon vertices, relative to the underlying bitmap, which is shown in grey. Each vertex is surrounded by its unit square. Also, the line segments L i from Section 2.3.3 are shown, and the parameter α is written inside the unit square of each vertex; this is best seen by looking at the page at a very high magnification in Acrobat Reader. Corner analysis is performed at this step; note that for this particular bitmap, only very few corners are detected. Generally, corner analysis works better at higher resolutions. Smoothing is then performed; the resulting Bezier curve segments and line segments are shown in blue. Part (d) shows the result of curve optimization; the original curve is shown in blue, and the optimized curve is shown in red. Red dots indicate the new segment boundaries. Note that the number of segments has been reduced from 112 to 68, or by 40%. The final result of the algorithm is shown in Part (e). Debugging output in the style of Figure 10(b)–(d) can be produced by giving the command line options -d1 through -d3 to Potrace. References [1] I. D. Faux and M. J. Pratt. Computational Geometry for Design and Manufacture. Ellis Horwood Series in Mathematics and its Applications, Editor: G. M. Bell. Ellis Horwood, New York, NY, USA, 1979. [2] D. E. Knuth. The METAFONTbook, volume C of Computers and Typesetting. Addison-Wesley, Reading, MA, USA, 1986. 16