A good rain knows its season and comes when spring arrives. It slips in with the night wind and silently moistens all things. --Du Fu

Introduction

Curves appear in many graph visualization scenarios, but a good-looking curve is not easy to obtain. The author recently ran into this problem while using AntV G6: the curve is too flat, the arrow direction is inconsistent with the trend of the connection, the start and end of the connection are hidden, and so on. Good-looking curves are all alike, but every ugly curve is ugly in its own way. How do we get a nice curve? Let's find out.

Problem

AntV G6 ships with a built-in cubic-vertical edge, a cubic (third-order) Bezier curve in the vertical direction, but this edge does not support passing in control points. The connection itself is not attractive and cannot handle some extreme situations, as shown in the figures below. The curve in Figure 2 is too flat, and the trend of the line deviates too much from the direction of the end arrow; in Figure 3, the arrow and the starting point of the line are completely covered.

Figure 1

Figure 2

Figure 3

Connections in other graph editing products

Stones from other hills can be used to polish jade. To improve the user's connection experience, we compared the connection solutions of other graph editing products and found four well-known products that are relatively similar. Among them, Draw.io and ProcessOn focus on graph editing. Both support Bezier curves, and both let the user change the shape of the curve by dragging control points. However, this kind of interaction effectively hands the task of optimizing the curve over to the user, which is different from our goal. DataV's blueprint editor and the AntV X6 example are both lightweight graph editing scenarios, and both tune the cubic Bezier curve to obtain a better connection. Each scheme is described in detail below.

Draw.io

Draw.io focuses heavily on graph editing and supports both cubic and higher-order Bezier curves. Users can create curves of arbitrary complexity and shape by dragging control points. As shown in the figure below, Draw.io lets users add or remove control points while dragging the curve, adjust where the connection attaches to the anchor points, and adjust the direction of the end arrow.

Figure 4

The advantages of Draw.io are:

  1. Control points can be added and removed dynamically, so users can draw a curve of almost any shape.
  2. While a curve or control point is being dragged, the attachment position at the anchor point and the direction of the end arrow adapt automatically for a better result, as shown in the figure above.

The disadvantages are:

  1. The interaction is complicated and hard to pick up.
  2. The implementation is complicated, especially the logic for dynamically adding and removing control points.

ProcessOn

ProcessOn also focuses on graph editing, but only supports cubic Bezier curves. Users adjust the shape of the curve by dragging the control points, and the arrow direction is adjusted automatically while they drag. The control points of ProcessOn lie off the curve, which matches the original definition of a Bezier curve. In addition, auxiliary lines connect the control points to the first and last points of the curve, which helps users see how the control points act on the shape of the curve.

Figure 5

The advantages of ProcessOn are:

  1. While a control point is being dragged, the direction of the end arrow is adjusted automatically.
  2. The interaction with the control points is relatively simple, and the auxiliary lines help users understand it.

The disadvantages are:

  1. If the user drags the curve and then drags a node, the shape of the curve changes greatly.
  2. As with Draw.io, the task of optimizing the shape of the curve is handed over to the user.

DataV's blueprint editor

DataV's blueprint editor only supports cubic Bezier curves and does not support dragging control points, but its connections look very good. The arrow at the end of the connection also adapts its shape while a node is being dragged.

In addition, the blueprint editor restricts each anchor point to either input or output. For example, the anchor point on the left side of a node only accepts input, and the anchor point on the right side only produces output. This restriction reduces the number of special cases the connection has to handle.

Figure 6

The advantages of DataV's blueprint editor are:

  1. The connection is a plain cubic Bezier curve; only the control point parameters are adjusted to adapt to the layout produced by dragging nodes.
  2. While a node is being dragged, the connection changes smoothly.

The disadvantages are:

  1. The end arrow does not fit the curve perfectly, as shown by the orange arrow in the figure above.

AntV X6

The AntV X6 example is similar to the blueprint editor, but its implementation is different. It also restricts each anchor point to input or output according to the position of the anchor point relative to the node. The X6 example adds a straight line at each end of the Bezier curve to obtain a better connection, as shown in the figure below.

Figure 7

The advantages of X6's implementation are:

  1. The implementation is simple: it only adds straight lines at both ends of the Bezier curve, and the parameters of the Bezier curve are fixed values.
  2. The interaction is relatively smooth.

The disadvantages are:

  1. Because the parameters of the Bezier curve are fixed, the curve bends too much when the nodes are close together, as shown in the figure below.

Figure 8

Connection optimization

This curve optimization mainly draws on the example of AntV X6 and the idea of DataV blueprint editor.

Add a straight line

Design ideas

The first step is to add straight lines to both ends of the connection, following the idea of X6. The key to this solution is calculating the start and end points of the two straight lines. As shown in the figure below, the direction in which a straight line extends depends on where the anchor point sits relative to its node. In the left picture, the anchor point of the starting node is below the node, so the straight line at the start should extend downward; the anchor point of the end node is above the node, so the straight line at the end should extend upward. In the right picture, the anchor point of the starting node is on the right side of the node, so the starting line should extend to the right.

Figure 9

Implementation plan

In order to verify the correctness of our ideas, we first analyzed the curve of AntV X6.

Data collection

Connection shape | Path
[image] | M -335 -185 L -335 -181 C -335 -101 -165 -194 -165 -114 L -165 -110
[image] | M -375 -75 L -375 -71 C -375 9 -165 -194 -165 -114 L -165 -110
[image] | M -255 -95 L -255 -91 C -255 -11 -175 -324 -175 -244 L -175 -240

Reverse-engineering the rule

From the Path column of the table above, we can see that AntV X6 adds straight lines at both ends of the Bezier curve: MoveTo the start point, LineTo the start of the Bezier curve, then a Curve command for the cubic Bezier curve, and finally LineTo the end point. Take the third row as an example. M -255 -95 moves to the point (-255, -95), and L -255 -91 draws a straight line from (-255, -95) to (-255, -91). Then C -255 -11 -175 -324 -175 -244 draws a Bezier curve with (-255, -11) as the first control point, (-175, -324) as the second control point, and (-175, -244) as the end point of the curve. Finally, L -175 -240 draws a straight line from (-175, -244) to (-175, -240). To sum up, the coordinates in the first and last commands are the start and end points of the connection. Each straight segment is 4 units long in the y direction. The y-coordinate of the first control point is the y-coordinate of the curve's start plus 80 (startPoint.y + 4 + 80), and the y-coordinate of the second control point is the y-coordinate of the curve's end minus 80 (endPoint.y - 4 - 80). The x-coordinates of the two control points are the same as the x-coordinates of the start point and the end point, respectively.

The formula, given startPoint and endPoint, is:

M startPoint.x startPoint.y
L startPoint.x startPoint.y+4
C startPoint.x startPoint.y+4+80 endPoint.x endPoint.y-4-80 endPoint.x endPoint.y-4
L endPoint.x endPoint.y
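The formula can be sketched as a small function. This is only a sketch for the vertical case covered by the three sampled paths (start anchor below its node, end anchor above its node). The function name `x6_style_path` and the parameter names `stub` and `offset` are our own and are not part of X6.

```python
def x6_style_path(start, end, stub=4, offset=80):
    """Build an SVG path: straight stub, cubic Bezier curve, straight stub.

    start, end: (x, y) tuples. The defaults 4 and 80 are the values
    observed in the sampled X6 paths, not documented X6 parameters.
    """
    sx, sy = start
    ex, ey = end
    return (
        f"M {sx} {sy} "
        f"L {sx} {sy + stub} "
        f"C {sx} {sy + stub + offset} {ex} {ey - stub - offset} {ex} {ey - stub} "
        f"L {ex} {ey}"
    )


# Reproduces the third row of the table.
print(x6_style_path((-255, -95), (-175, -240)))
```

Feeding in the start and end points of the other two rows gives back their paths as well, which suggests the reverse-engineered formula is at least consistent with the samples.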

Actual effect

Adapting the direction of the straight lines at both ends to the position of each anchor point relative to its node keeps the arrow at the end of the connection pointing the right way. The straight lines at the start and end also ensure that the user can clearly perceive the direction of the connection. In the figure below, the left picture uses the default cubic-vertical edge. In the parts circled in red, either the direction in which the connection leaves or enters the anchor point cannot be seen, or the end arrow is hidden. Both situations seriously affect the user's perception of the connection direction. The connection in the right picture avoids both problems; its direction is clearer and smoother and better matches what the user expects.

Figure 10

Room for improvement

However, this scheme is not perfect. Simply adding straight lines at both ends of the Bezier curve leaves two problems unsolved.

  1. The scheme uses a magic number, the constant 80, when calculating the control points of the Bezier curve in the middle. This makes the curve look a little strange when the two nodes are relatively close, as shown in the figure below.

    Figure 11

    The core of this problem is that we do not have an accurate understanding of the relationship between the control points of a Bezier curve and its shape, so we cannot understand the principle behind the magic number. The corresponding solution is to understand the control points of the Bezier curve and replace the magic number with a function Func(startPoint, endPoint).

  2. The scheme does not perceive the direction in which the user is connecting while the connection is being drawn, so the shape of the curve and the arrow differ greatly between the connection process and the final result. In the figure below, the left side shows the curve shape and arrow direction during the connection process, and the right side shows them after the connection is completed. The core of this problem is to accurately perceive the direction in which the user wants to connect during the connection process. The scheme above simply equates the connection direction with the direction in which the starting anchor point faces away from its node.

    Figure 12

    The solution to this problem is to determine the direction of the connection during the connection process from the coordinates of the end of the connection relative to startPoint. The result of this change is shown in the figure below.

    Figure 13
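One way to read "the coordinates of the end of the connection relative to startPoint" is to pick the axis with the larger displacement. The sketch below does exactly that. The function name `connection_direction`, the tie-breaking rule, and the use of screen coordinates (y grows downward) are our assumptions for illustration, not the actual implementation.

```python
def connection_direction(start, cursor):
    """Guess the direction in which the user is dragging the connection.

    start, cursor: (x, y) tuples in screen coordinates (y grows downward).
    Assumption: the axis with the larger displacement wins, and ties go
    to the horizontal axis.
    """
    dx = cursor[0] - start[0]
    dy = cursor[1] - start[1]
    if abs(dx) >= abs(dy):
        return "right" if dx >= 0 else "left"
    return "down" if dy > 0 else "up"


print(connection_direction((0, 0), (10, 3)))   # mostly horizontal drag
print(connection_direction((0, 0), (2, -30)))  # mostly vertical drag
```

The returned direction could then decide which way the trailing straight line and the arrow point while the connection is still being dragged.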

Optimize the parameters of the Bezier curve

In addition to drawing on the connection scheme of AntV X6, we also hope to draw on the Bezier curve in the DataV blueprint editor. With the above "reverse engineering" of the connection in the AntV X6 example, we plan to do the same for the Bezier curve in the DataV blueprint editor.

Implementation plan

Data collection

Curve shape | Path | Path with the starting point translated to the coordinate origin
[image] | M 586.5 336 C 532.7496710549095 336 545.2503289450905 335.5 491.5 335.5 | M 0 0 C -54 0 -41 0 -95 0
[image] | M 662.5 229 C 603.0729185103246 229 626.9270814896754 159.5 567.5 159.5 | M 0 0 C -60 0 -36 -70 -96 -70
[image] | M 640.5 304 C 571.5182334289478 304 648.4817665710522 160.5 579.5 160.5 | M 0 0 C -69 0 8 -144 -61 -144
[image] | M 457.5 345 C 372.2029329439616 345 664.7970670560384 160.5 579.5 160.5 | M 0 0 C -85 0 207 -185 122 -185
[image] | M 310.5 302 C 204.51346747613758 302 685.4865325238624 160.5 579.5 160.5 | M 0 0 C -106 0 375 -142 269 -142
[image] | M 269.5 158 C 164.2498961794736 158 675.7501038205264 157.5 570.5 157.5 | M 0 0 C -105 0 406 0 301 0
- | M sx sy, C x1 y1 x2 y2 x3 y3: sy === y1, y2 === y3 | M 0 0, C x1 y1 x2 y2 x3 y3: x1 + x2 === x3, y1 + y2 === y3

As shown in the table above, we collected the paths of Bezier curves with different shapes and tried to find the rule directly. The first column is the shape of the curve, the second column is the path of the curve, and the third column is the path after translating the curve's starting point to the coordinate origin.

Find the pattern?

As shown in the last row of the table, the second column only reveals a rule for the y1 and y2 coordinates, that is, how the y-coordinates of control point 1 and control point 2 are calculated. The third column yields only one linear equation in x1 and x2, which is not enough to determine them. We need at least one more equation in x1 and x2 to find the control point coordinates.
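The rules in the last row of the table can be checked against any raw path. Here is a minimal sketch, assuming the path has exactly the form `M sx sy C x1 y1 x2 y2 x3 y3`; the helper names `parse_cubic` and `check_pattern` are ours.

```python
import math


def parse_cubic(path):
    """Parse 'M sx sy C x1 y1 x2 y2 x3 y3' into four (x, y) points."""
    tokens = path.replace(",", " ").split()
    nums = [float(t) for t in tokens if t not in ("M", "C")]
    return [(nums[i], nums[i + 1]) for i in range(0, 8, 2)]


def check_pattern(path, tol=1e-6):
    """Return True if the path follows the rules in the last table row."""
    (sx, sy), (x1, y1), (x2, y2), (x3, y3) = parse_cubic(path)
    # Second column: sy === y1 and y2 === y3.
    same_y = math.isclose(sy, y1, abs_tol=tol) and math.isclose(y2, y3, abs_tol=tol)
    # Third column, after translating the start to the origin: x1 + x2 === x3.
    sum_x = math.isclose((x1 - sx) + (x2 - sx), x3 - sx, abs_tol=tol)
    return same_y and sum_x


row = "M 662.5 229 C 603.0729185103246 229 626.9270814896754 159.5 567.5 159.5"
print(check_pattern(row))
```

The rule y1 + y2 === y3 is not checked separately because it follows from the two y rules once the start point is at the origin.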

(Shrug)

But how do we find another equation in x1 and x2? Let's first try combining numbers with shapes; after all, numbers without a picture are hard to grasp intuitively.

Figure 14

As shown in the figure above, the black curves correspond to four of the rows in the table above. We drew the figure from the data in the third column so that all starting points coincide at the origin, which makes it easier to compare the curves and spot a pattern. Each bend of the red polyline is a control point.

We can get the following knowledge from this figure:

  1. The two control points are symmetrical about the midpoint of the curve, so once one control point is known, the other follows.
  2. The distance between the two control points in the x direction is greater than the distance between the start and end points in the x direction.
  3. Curves of different shapes have control points at different distances from the starting point in the x direction; that is, the distance from the control point to the starting point in the x direction is not constant.

After getting these conclusions, combined with the laws in the table, we found that in fact, as long as the coordinates of a single control point are calculated, the coordinates of another control point can be obtained. The only requirement in the coordinates of a single control point is its x-coordinate, that is, only the distance between the control point and the starting point in the x-direction is required.
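The symmetry can be sketched as follows: given only the signed x offset of the first control point, both control points follow. The function name `control_points` and the parameter `ctrl_dx` are ours.

```python
def control_points(start, end, ctrl_dx):
    """Return both control points of the cubic Bezier curve.

    ctrl_dx is the signed x offset of the first control point from the
    start point. The first control point shares the start point's y,
    and the second one mirrors the first about the midpoint of the
    start and end points.
    """
    sx, sy = start
    ex, ey = end
    cp1 = (sx + ctrl_dx, sy)
    cp2 = (sx + ex - cp1[0], sy + ey - cp1[1])  # mirror about the midpoint
    return cp1, cp2


# Second row of the table, translated to the origin: C -60 0 -36 -70 -96 -70
print(control_points((0, 0), (-96, -70), -60))  # ((-60, 0), (-36, -70))
```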

From another perspective, before drawing a curve we know only its starting point and end point. That is, the x-coordinate of the control point should be calculated from them: controlPoint.x = F(startPoint, endPoint). However, it is also possible that controlPoint.x = F(startPoint.x, endPoint.x), that is, the x-coordinate of the control point depends only on the x-coordinates of the start and end points.

Take some more data to verify!

Curve shape | Path
[image] | M 490.5 -146 C 439.9372150475671 -146 494.0627849524329 -213.5 443.5 -213.5
[image] | M 490.5 -89 C 427.23097348884403 -89 506.76902651115597 -213.5 443.5 -213.5

As shown in the table above, the two curves have the same start and end x-coordinates, yet their shapes differ and so do their control points. This shows that the relation must be controlPoint.x = F(startPoint, endPoint).
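The difference is easy to confirm numerically. A small sketch, with a helper name of our own choosing, computes the x distance between the start point and the first control point for both paths:

```python
def ctrl_distance_x(path):
    """Distance in x between the start point and the first control point."""
    nums = [float(t) for t in path.split() if t not in ("M", "C")]
    sx, x1 = nums[0], nums[2]
    return abs(x1 - sx)


first = "M 490.5 -146 C 439.9372150475671 -146 494.0627849524329 -213.5 443.5 -213.5"
second = "M 490.5 -89 C 427.23097348884403 -89 506.76902651115597 -213.5 443.5 -213.5"
print(round(ctrl_distance_x(first), 2), round(ctrl_distance_x(second), 2))  # 50.56 63.27
```

The start and end x-coordinates are identical, but the control distance changes with the y distance, so the x-coordinates alone cannot determine it.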

But how to find F?

(Shrug again)

Dig deep

Looking at controlPoint.x = F(startPoint, endPoint) together with Figure 14, we can see that translating the coordinates does not change the shape of the curve, so the data that matters for calculating the control points should be the distances between the start and end points in the x and y directions. In other words, controlPoint.x = F(startPoint, endPoint) becomes ctrlDistanceX = F(distanceX, distanceY).

Looking for F is really looking for a pattern. Could methods from probability and statistics help? Maybe the rule is simple enough that plotting the data first will reveal it. Either way, the first step is to collect more data.

  1. More data

To collect more data conveniently, we ran the following code directly in the browser console on the blueprint editor page.

const data = [];
const func = () => {
  const targetPath = document.querySelector('.butterflies-link');
  data.push(targetPath.getAttribute('d'));
}
const intervalId = setInterval(func, 1000);

Then drag the node to change the shape of the curve. When finished, call clearInterval(intervalId), then enter data in the console and copy the output.

The data after deduplication is as follows:

        "M 524.5 72 C 402.67229108270203 72 527.327708917298 -275.5 405.5 -275.5",
        "M 587.5 -53 C 485.63630523692785 -53 507.36369476307215 -275.5 405.5 -275.5",
        "M 523.5 -85 C 437.4786592002655 -85 491.5213407997345 -275.5 405.5 -275.5",
        "M 407.5 173 C 265.3738851783404 173 547.6261148216596 -275.5 405.5 -275.5",
        "M 600.5 17 C 482.61468766056527 17 523.3853123394347 -275.5 405.5 -275.5",
        "M 600.5 17 C 481.18236106456914 17 939.8176389354309 -264.5 820.5 -264.5",
        "M 514.5 -1 C 383.54572507813253 -1 951.4542749218674 -264.5 820.5 -264.5",
        "M 355.5 -25 C 194.73655661844936 -25 981.2634433815506 -264.5 820.5 -264.5",
        "M 355.5 -25 C 203.2729443821868 -25 952.7270556178132 -227.5 800.5 -227.5",
        "M 355.5 -25 C 251.76642131972704 -25 751.2335786802729 -66.5 647.5 -66.5",
        "M 609.5 92 C 538.7521089502782 92 718.2478910497218 -66.5 647.5 -66.5",
        "M 609.5 92 C 538.5661738289713 92 717.4338261710287 -67.5 646.5 -67.5",
        "M 609.5 92 C 473.7472665837899 92 461.2527334162101 -221.5 325.5 -221.5",
        "M 309.5 124 C 193.03243021224662 124 441.9675697877534 -221.5 325.5 -221.5",
        "M 441.5 -199 C 381.95950872107915 -199 385.04049127892085 -221.5 325.5 -221.5",
        "M 441.5 -255 C 360.8441452051197 -255 428.1558547948803 -75.5 347.5 -75.5",
        "M 441.5 -255 C 301.5519298714161 -255 496.4480701285839 176.5 356.5 176.5",
        "M 242.5 -257 C 100.44023636915881 -257 498.5597636308412 176.5 356.5 176.5",
        "M 242.5 -257 C 93.61264522666846 -257 762.3873547733315 40.5 613.5 40.5",
        "M 649.5 -258 C 544.3342456633342 -258 718.6657543366658 40.5 613.5 40.5",
        "M 649.5 -258 C 560.515759520021 -258 712.484240479979 -23.5 623.5 -23.5",
        "M 649.5 -258 C 597.4897041137563 -258 614.5102958862437 -244.5 562.5 -244.5",
        "M 649.5 -258 C 597.4202440004424 -258 613.5797559995576 -250.5 561.5 -250.5",
        "M 641.5 -242 C 591.3874261965307 -242 611.6125738034693 -250.5 561.5 -250.5",
        "M 317.5 29 C 194.74486200215108 29 684.2551379978489 -250.5 561.5 -250.5",
        "M 600.5 -79 C 526.5303726988732 -79 635.4696273011268 -250.5 561.5 -250.5",
        "M 600.5 -79 C 525.5554160660041 -79 685.4445839339959 -258.5 610.5 -258.5",
        "M 600.5 -79 C 491.4951544207572 -79 1025.5048455792428 -75.5 916.5 -75.5",
        "M 570.5 140 C 438.59432977012517 140 1048.4056702298749 -75.5 916.5 -75.5",
        "M 568.5 176 C 465.1543925991474 176 800.8456074008526 -87.5 697.5 -87.5",
        "M 568.5 176 C 453.3999669506527 176 537.6000330493473 -131.5 422.5 -131.5",

  2. Data processing

Convert the data to obtain three values per curve: the distance from the first control point to the start point in the x direction, the distance between the start and end points in the x direction, and the distance between the start and end points in the y direction.

     // Distance from the first control point to the start point in the x direction
     [
        121.82770891729797, 101.86369476307215, 86.0213407997345,
        142.1261148216596, 117.88531233943473, 119.31763893543086,
        130.95427492186747, 160.76344338155064, 152.2270556178132,
        103.73357868027296, 70.74789104972183, 70.93382617102873,
        135.7527334162101, 116.46756978775338, 59.540491278920854,
        80.65585479488033, 139.94807012858388, 142.0597636308412,
        148.88735477333154, 105.16575433666583, 88.98424047997901,
        52.01029588624374, 52.07975599955762, 50.11257380346933,
        122.75513799784892, 73.96962730112682, 74.94458393399589,
        109.00484557924278, 131.90567022987483, 103.3456074008526,
        115.1000330493473,
      ]

      // Distance between the start and end points in the x direction
      [
        119, 182, 118, 2, 195, 220, 306, 465, 445, 292, 38, 37, 284, 16, 116,
        94, 85, 114, 371, 36, 26, 87, 88, 80, 244, 39, 10, 316, 346, 129, 146,
      ]
       
      // Distance between the start and end points in the y direction
      [
        347.5, 222.5, 190.5, 448.5, 292.5, 281.5, 263.5, 239.5, 202.5, 41.5,
        158.5, 159.5, 313.5, 345.5, 22.5, 179.5, 431.5, 433.5, 297.5, 298.5,
        234.5, 13.5, 7.5, 8.5, 279.5, 171.5, 179.5, 3.5, 215.5, 263.5, 307.5,
      ]
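The conversion itself is straightforward. The sketch below shows one way to derive the three values from a collected path string; the function name `distances` is ours, and the path is assumed to have the form `M sx sy C x1 y1 x2 y2 x3 y3`.

```python
def distances(path):
    """Return (ctrl_dx, dist_x, dist_y) for 'M sx sy C x1 y1 x2 y2 x3 y3'.

    ctrl_dx: x distance from the first control point to the start point
    dist_x:  x distance between the start and end points
    dist_y:  y distance between the start and end points
    """
    nums = [float(t) for t in path.split() if t not in ("M", "C")]
    sx, sy, x1, _y1, _x2, _y2, x3, y3 = nums
    return abs(x1 - sx), abs(x3 - sx), abs(y3 - sy)


# First path in the deduplicated data.
print(distances("M 524.5 72 C 402.67229108270203 72 527.327708917298 -275.5 405.5 -275.5"))
```

For the first collected path this yields roughly 121.83, 119 and 347.5, which matches the first entry of each array above.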
  3. Try to analyze

    Nothing stands out from the raw numbers, so let's plot them and see.

    Figure 15

    In the figure, the abscissa is the index of each Bezier curve and the ordinate is the distance. The red polyline is the distance from the first control point to the starting point in the x direction, the blue polyline is the distance between the starting point and the end point in the x direction, and the green polyline is the distance between the starting point and the end point in the y direction.

    No trend is visible yet, so let's sort the data and plot again.

    Figure 16

    The red polyline always lies between the blue polyline and the green polyline, much like an average, so the relation should have the form z = ax + by. I tried a = b = 0.5, 0.4, ... and observed how well the resulting polyline fits the red one. As shown in the figure below, the black polyline is the result for a = b = 0.5. Trial and error like this will certainly not find the optimal result...

    Figure 17
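The trial-and-error step can be made a little more systematic by scoring each candidate. The sketch below uses the mean absolute error as the score, which is our choice rather than anything the original analysis specifies, and only the first five samples of the processed data (rounded to two decimals) to keep it short.

```python
def mean_abs_error(a, b, xs, ys, zs):
    """Mean absolute difference between z and the estimate a*x + b*y."""
    errors = [abs(z - (a * x + b * y)) for x, y, z in zip(xs, ys, zs)]
    return sum(errors) / len(errors)


# First five samples: x distance, y distance, control-point distance.
xs = [119, 182, 118, 2, 195]
ys = [347.5, 222.5, 190.5, 448.5, 292.5]
zs = [121.83, 101.86, 86.02, 142.13, 117.89]

for k in (0.5, 0.4, 0.3, 0.2):
    print(k, round(mean_abs_error(k, k, xs, ys, zs), 2))
```

A smaller score means a closer fit. Guessing values this way quickly becomes tedious, which is why a proper fitting method is the next step.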

  4. Least squares method

The least squares method is one way to fit a curve; its principle is not covered here. In short, assuming the red polyline can be computed from the blue polyline (x) and the green polyline (y) in the form z = ax + by, the least squares method finds the a and b that fit best.

import numpy as np
from scipy import optimize  # least-squares fitting


def func(x, y, p):
    """Model used for the fit: z = a*x + b*y.

    :param x: independent variable x
    :param y: independent variable y
    :param p: fitting parameters (a, b)
    """
    a, b = p
    return a * x + b * y


def residuals(p, z, x, y):
    """Difference between the data z and the fitted function."""
    return z - func(x, y, p)


xSource = [80, 87, 88, 116, 38, 37, 39, 10, 94, 118, 26, 182, 129, 292, 36, 316, 146, 16, 195, 220, 119, 244, 306, 346, 284, 85, 114, 2, 371, 445, 465]
ySource = [8.5, 13.5, 7.5, 22.5, 158.5, 159.5, 171.5, 179.5, 179.5, 190.5, 234.5, 222.5, 263.5, 41.5, 298.5, 3.5, 307.5, 345.5, 292.5, 281.5, 347.5, 279.5, 263.5, 215.5, 313.5, 431.5, 433.5, 448.5, 297.5, 202.5, 239.5]
cSource = [50.11, 52.01, 52.07, 59.54, 70.74, 70.93, 73.96, 74.94, 80.65, 86.02, 88.98, 101.86, 103.34, 103.73, 105.16, 109.0, 115.1, 116.46, 117.88, 119.31, 121.82, 122.75, 130.95, 131.9, 135.75, 139.94, 142.05, 142.12, 148.88, 152.22, 160.76]


def main():
    x = np.array(xSource)
    y = np.array(ySource)
    z = np.array(cSource)  # control-point distances collected above

    # Least-squares fit; [0.0, 0.0] are the initial values of a and b.
    plsq = optimize.leastsq(residuals, np.array([0.0, 0.0]), args=(z, x, y))

    a, b = plsq[0]  # fitted parameters
    print("Fit result:\na = {}".format(a))
    print("b = {}".format(b))


main()

The result is a = 0.22320872185884902 and b = 0.28534578186377385. Plotted, the fit looks as follows: the red polyline is the target and the black polyline is the fitting result.

Figure 18

With a and b, we now have a way to calculate the control points of the Bezier curve. Let's go back to the canvas and take a look.
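Putting the pieces together, the fitted coefficients can be turned into a path builder. This is a sketch under two assumptions of ours: the control points keep the symmetric layout seen in the collected paths, and, as in those paths, the first control point sits at a smaller x than the start point and the second at a larger x than the end point. The function name `fitted_bezier_path` is hypothetical.

```python
A = 0.22320872185884902  # fitted weight of the x distance
B = 0.28534578186377385  # fitted weight of the y distance


def fitted_bezier_path(start, end):
    """Build 'M ... C ...' using ctrlDistanceX = A * distanceX + B * distanceY.

    Assumption: the first control point lies to the left of the start
    point and the second to the right of the end point, as in the
    sampled blueprint-editor paths.
    """
    sx, sy = start
    ex, ey = end
    ctrl = A * abs(ex - sx) + B * abs(ey - sy)
    return f"M {sx} {sy} C {sx - ctrl} {sy} {ex + ctrl} {ey} {ex} {ey}"


print(fitted_bezier_path((524.5, 72), (405.5, -275.5)))
```

For this start and end point, the first control point comes out within a few units of the one recorded in the collected data, as expected from a fit rather than an exact formula.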

Actual effect

Compare the result with the issues left over from the previous section, "Room for improvement", as shown in the figure below. By changing how the control points of the Bezier curve are calculated, we have solved those remaining problems.

Figure 19

Looking ahead

What remains is that we know what works, but still not why. We can provide a good-looking Bezier curve shape for the graph editing scenario, but we cannot yet describe how the control points influence the shape of the curve, so we cannot produce a Bezier curve of any given shape (style). In the future, we could build a small tool: let the user drag control points to generate several curves of a certain style, then infer the control point calculation with a program and output it to the user.

Summary

For the connection optimization problem in graph editing, we drew on the connection schemes of the AntV X6 example and the DataV blueprint editor: we added straight lines at both ends of the Bezier curve and optimized how its control points are calculated. Compared with the original scheme, the cubic-vertical curve in AntV G6, the user experience of connecting is greatly improved.

Author: ES2049 / Jinx

The article can be reprinted at will, but please keep this link to the original text.
You are very welcome to join ES2049 Studio if you are passionate. Please send your resume to caijun.hcj@alibaba-inc.com