
Problem to be solved

In graph visualization for risk control, graphs typically contain a large number of nodes and complex relationships, which makes it difficult for users to see how the nodes relate to one another. We usually apply a graph layout algorithm to the entire graph so that its relationships become clearer and easier to analyze.

Common graph layout algorithms include circular layout, hierarchical layout, orthogonal layout, and force-directed layout. Usually we apply a single layout algorithm to a large graph, or provide an interactive way to switch the layout of the entire graph, to help users analyze problems. However, a single layout often cannot meet business needs in a large graph, because every layout algorithm has its own advantages and disadvantages. For example, circular and concentric circle layouts make it easy to find the nodes with the highest degree, but they are not suitable for graphs with many nodes. Hierarchical layout shows the level of each node clearly, but wastes space. Force-directed layout can avoid node overlap, but it often produces tangled edges that make the relationships between nodes hard to analyze, and its computation performs poorly on large graphs. Switching the layout of the whole graph, in turn, requires recalculating the position of every node. Because all node positions change, this both hinders analysis and hurts performance.

What users urgently need is the ability to select arbitrary nodes in a large graph and lay them out in a custom way. This method therefore aims to solve the problem that a traditional single layout cannot clearly display the relationships between nodes in graph visualization scenarios.

Technical background

Subgraph

The concept of a subgraph comes from graph theory: it is a graph whose node set and edge set are subsets of the node set and edge set of another graph. For example, let \(G=<V,E>\) and \(G'=<V',E'>\) be two graphs (both undirected or both directed). If \(V'\subseteq V\) and \(E'\subseteq E\), then \(G'\) is called a subgraph of \(G\).
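The definition can be checked mechanically. The following is a minimal sketch, assuming nodes are hashable ids and edges are `(source, target)` tuples; the function name `is_subgraph` is hypothetical and only used for illustration.

```python
def is_subgraph(sub_nodes, sub_edges, nodes, edges):
    """Return True if (sub_nodes, sub_edges) is a subgraph of (nodes, edges)."""
    sub_nodes, nodes = set(sub_nodes), set(nodes)
    sub_edges, edges = set(sub_edges), set(edges)
    # Every edge of the candidate must connect nodes that the candidate contains.
    endpoints_ok = all(u in sub_nodes and v in sub_nodes for u, v in sub_edges)
    # V' must be a subset of V and E' a subset of E.
    return sub_nodes <= nodes and sub_edges <= edges and endpoints_ok
```

For an undirected graph, one option under the same assumptions is to store each edge with its endpoints in a fixed order so that set comparison still works.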

Force-directed layout algorithm (Fruchterman-Reingold)

The force-directed layout was first proposed by Peter Eades in his 1984 paper "A Heuristic for Graph Drawing". Its purpose is to reduce edge crossings in the layout and keep edge lengths as uniform as possible. The method simulates the layout process with a spring model, in which a spring represents the relationship between two nodes. Under the elastic force, nodes that are too close are pushed apart and nodes that are too far apart are pulled together. Through repeated iteration, the whole layout eventually reaches a dynamic balance and becomes stable.

Later, in 1991, Thomas Fruchterman and Edward Reingold proposed a new force-directed layout algorithm, the FR algorithm (Fruchterman-Reingold). The algorithm improves on the earlier spring model by enriching the physical model between nodes: it adds an electrostatic force between nodes and produces the layout by calculating the total energy of the system and minimizing it. Whether it is this improved energy model or the earlier spring model, the essence of the algorithm is an energy optimization problem. The difference lies in the composition of the objective function: the quantities being optimized are the attractive and repulsive forces, and different algorithms express these two forces differently.
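To make the iteration concrete, here is a small sketch of an FR-style loop in the commonly described force formulation (repulsion `ideal**2 / d` between every pair, attraction `d**2 / ideal` along edges, and a cooling temperature that caps each move). The function name, default frame size, and the deterministic circular starting positions are assumptions made for illustration; FR implementations usually start from random positions.

```python
import math

def fruchterman_reingold(nodes, edges, width=500.0, height=500.0, iterations=100):
    """Return {node: (x, y)} after a fixed number of FR-style iterations."""
    n = len(nodes)
    ideal = math.sqrt(width * height / n)  # ideal distance between nodes
    # Assumed deterministic start: nodes evenly spaced on a circle.
    pos = {}
    for i, v in enumerate(nodes):
        angle = 2 * math.pi * i / n
        pos[v] = [width / 2 + width / 4 * math.cos(angle),
                  height / 2 + height / 4 * math.sin(angle)]
    temperature = width / 10
    cooling = temperature / (iterations + 1)
    for _ in range(iterations):
        disp = {v: [0.0, 0.0] for v in nodes}
        # Repulsive force between every pair of nodes.
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx, dy = pos[u][0] - pos[v][0], pos[u][1] - pos[v][1]
                d = max(math.hypot(dx, dy), 1e-9)
                f = ideal * ideal / d
                disp[u][0] += dx / d * f
                disp[u][1] += dy / d * f
                disp[v][0] -= dx / d * f
                disp[v][1] -= dy / d * f
        # Attractive force along every edge.
        for u, v in edges:
            dx, dy = pos[u][0] - pos[v][0], pos[u][1] - pos[v][1]
            d = max(math.hypot(dx, dy), 1e-9)
            f = d * d / ideal
            disp[u][0] -= dx / d * f
            disp[u][1] -= dy / d * f
            disp[v][0] += dx / d * f
            disp[v][1] += dy / d * f
        # Move each node, limited by the temperature and clamped to the frame.
        for v in nodes:
            dx, dy = disp[v]
            length = max(math.hypot(dx, dy), 1e-9)
            step = min(length, temperature)
            pos[v][0] = min(width, max(0.0, pos[v][0] + dx / length * step))
            pos[v][1] = min(height, max(0.0, pos[v][1] + dy / length * step))
        temperature -= cooling
    return {v: (x, y) for v, (x, y) in pos.items()}
```

The pairwise repulsion loop is quadratic in the number of nodes, which is one reason this kind of layout becomes slow on large graphs.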

For example, for a graph \(G\) with nodes \(i\) and \(j\), let \(d(i,j)\) denote the Euclidean distance (i.e. the actual distance) between the two points, \(s(i,j)\) the natural length of the spring, \(k\) the elastic coefficient, \(r\) the electrostatic force constant between the two nodes, and \(w\) the weight between the two nodes. The formulas of the two models are then as follows:

Spring model: image.png

Energy Model: image.png
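The two formula images are not available here. One commonly cited way of writing the two models with the symbols defined above is given below. This is an assumed form, not necessarily the exact content of the original images; in particular, writing the weight term as the product \(w_i w_j\) of per-node weights is an assumption.

\[
E_s = \sum_{i=1}^{n}\sum_{j=1}^{n} \frac{1}{2}\,k\,\bigl(d(i,j) - s(i,j)\bigr)^2
\]

\[
E = E_s + \sum_{i=1}^{n}\sum_{j=1}^{n} \frac{r\,w_i\,w_j}{d(i,j)^2}
\]

In this form, the energy model is the spring energy \(E_s\) plus an electrostatic term that grows as two nodes get closer, so minimizing \(E\) balances attraction against repulsion.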

Research on similar solutions

A Visual Layout Method Based on Graph Multi-stage Task System Module Decomposition

The layout method first calls the multi-stage task system module decomposition algorithm to decompose the graph and generate a module decomposition tree, which represents the internal substructure of the graph as a tree. The nodes in the tree are divided into three types according to how they are linked: Parallel, Serial, and Neighbor. Second, from top to bottom, subgraphs are laid out locally according to node type. Different types of subgraphs use different layout algorithms, with the goals of being visually pleasing, avoiding overlap, and reflecting the clustering characteristics of the nodes. Finally, the positions of the tree nodes are set according to the canvas size and actual needs, and then, from top to bottom, the position of each parent node is combined with the displacement of each child node relative to its parent to compute the overall layout of all nodes in the tree. The resulting leaf node positions are the layout result.

This method lays out subgraphs by dividing all nodes into three types, which means a graph can support at most three layouts. In addition, the nodes are partitioned by the module decomposition algorithm, and there is no interactive way for users to select nodes themselves for subgraph layout, so users cannot analyze an arbitrary subgraph in the graph.

For example, in a complex relationship graph, the user usually selects several nodes to analyze and then chooses a suitable layout algorithm for that subgraph. By exploiting the strengths of different layout algorithms, the characteristics of each node in the subgraph can be identified quickly.

Therefore, to address the two shortcomings above, this solution proposes a visual layout method based on custom subgraphs. It allows users to freely select multiple subgraphs in a large graph and apply a different layout to each, so that they can quickly and effectively analyze the information contained in the graph.

Implementation process of this solution

img

Detailed description of the implementation process of this scheme:

1. Subgraph segmentation

Users can pick the subgraphs they want to lay out from a large graph according to their own needs. As shown in the figure below, nodes with different colors represent different subgraphs.
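As a rough illustration of this step, the sketch below splits the edge list according to the user's selections. It assumes each selection is a named set of node ids and that selections do not overlap; `split_subgraphs` and the selection names are hypothetical.

```python
def split_subgraphs(edges, selections):
    """Return (internal edges per selection, edges that cross selections)."""
    owner = {}
    for name, members in selections.items():
        for node in members:
            owner[node] = name  # assumes a node belongs to at most one selection
    internal = {name: [] for name in selections}
    crossing = []
    for u, v in edges:
        if u in owner and owner.get(u) == owner.get(v):
            internal[owner[u]].append((u, v))  # both ends in the same subgraph
        else:
            crossing.append((u, v))            # links subgraphs or unselected nodes
    return internal, crossing
```

Keeping the crossing edges separate is convenient later, because only the edges inside a subgraph matter for that subgraph's own layout.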

img

2. Subgraph layout parameter input

This method supports laying out subgraphs with any existing layout algorithm, so the user needs to set the layout parameters for each subgraph.

3. Calculate the position of all nodes in the subgraph

The positions of all nodes in each subgraph are calculated according to the layout parameters set by the user. For example, a concentric circle layout and a grid layout can be computed for the subgraphs in the figure above.
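Steps 2 and 3 can be sketched as a small dispatch from user-supplied parameters to a layout function. The parameter names (`type`, `center`, `radius`, `cols`, `spacing`) and the two simplified layouts below are assumptions for illustration, not the parameters of any particular library.

```python
import math

def circular_layout(nodes, center=(0.0, 0.0), radius=100.0):
    """Place nodes evenly on a circle around center."""
    n = len(nodes)
    return {v: (center[0] + radius * math.cos(2 * math.pi * i / n),
                center[1] + radius * math.sin(2 * math.pi * i / n))
            for i, v in enumerate(nodes)}

def grid_layout(nodes, center=(0.0, 0.0), cols=3, spacing=50.0):
    """Place nodes row by row on a grid centered on center."""
    rows = math.ceil(len(nodes) / cols)
    x0 = center[0] - (cols - 1) * spacing / 2
    y0 = center[1] - (rows - 1) * spacing / 2
    return {v: (x0 + (i % cols) * spacing, y0 + (i // cols) * spacing)
            for i, v in enumerate(nodes)}

# Hypothetical registry: maps the user's "type" parameter to a layout function.
LAYOUTS = {"circular": circular_layout, "grid": grid_layout}

def layout_subgraph(nodes, params):
    """Compute positions for one subgraph from its layout parameters."""
    params = dict(params)                    # do not mutate the caller's dict
    algorithm = LAYOUTS[params.pop("type")]
    return algorithm(list(nodes), **params)
```

For example, `layout_subgraph(["a", "b", "c", "d"], {"type": "grid", "cols": 2, "spacing": 10.0})` places the four nodes on a 2 x 2 grid around the origin. Because each subgraph is laid out independently, the results may overlap, which is what the next step addresses.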

4. Check whether the subgraph center positions are specified

After the nodes of each subgraph have been laid out, the subgraphs may overlap one another. This method therefore allows the user to specify the center position of each subgraph. With a small amount of data, the graph structure is relatively simple and easy for users to arrange by hand, so this approach is usually effective. With a large amount of data, however, the graph structure becomes very complex and it is difficult for users to set the center position of every subgraph. For that case, this solution provides a force-directed layout algorithm that prevents the subgraphs from overlapping after layout. The specific process is as follows:

img

1) First, this method ignores the real edges of the nodes in all subgraphs, because in the subsequent force-directed layout calculation these edges would affect the positions of the overall layout.

2) Each subgraph is then abstracted into one oversized circular virtual node.

3) A virtual edge is created between every pair of subgraphs.

4) The subgraphs (now virtual nodes) and the remaining nodes together form a new large graph, and the positions of all its nodes are calculated with the force-directed layout algorithm.

5) Because the positions calculated by the force-directed layout cannot preserve the topology of the original subgraphs, we record the relative position of each node. The Laplacian (differential) coordinate of each node with respect to its adjacent nodes can be calculated with the following formula:

img

6) Finally, the positions of all subgraphs and other nodes are updated in batches, which achieves the custom subgraph layout.
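The six steps above can be sketched in simplified form as follows. Several things here are assumptions rather than the method's actual implementation: each virtual node is the bounding circle around a subgraph's centroid, each virtual edge acts as a spring whose natural length is the sum of the two radii plus a `padding`, only the virtual nodes are moved (other nodes stay put), and the Laplacian coordinate uses uniform weights, \(\delta_i = p_i - \frac{1}{|N(i)|}\sum_{j\in N(i)} p_j\). All function names are hypothetical.

```python
import math
from itertools import combinations

def bounding_circle(points):
    """Centroid and covering radius: the virtual circular node of a subgraph."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return (cx, cy), max(math.hypot(x - cx, y - cy) for x, y in points)

def laplacian_coords(positions, adjacency):
    """delta_i = p_i minus the mean neighbor position (uniform weights assumed)."""
    delta = {}
    for node, (x, y) in positions.items():
        nbrs = adjacency.get(node, [])
        if not nbrs:
            delta[node] = (0.0, 0.0)
            continue
        mx = sum(positions[j][0] for j in nbrs) / len(nbrs)
        my = sum(positions[j][1] for j in nbrs) / len(nbrs)
        delta[node] = (x - mx, y - my)
    return delta

def separate_subgraphs(positions, groups, padding=20.0, pull=0.1, iterations=200):
    """Relax virtual edges between subgraph circles, then shift member nodes."""
    old = {g: bounding_circle([positions[n] for n in members])
           for g, members in groups.items()}
    centers = {g: list(center) for g, (center, _) in old.items()}
    for _ in range(iterations):
        for a, b in combinations(centers, 2):   # one virtual edge per pair
            dx = centers[b][0] - centers[a][0]
            dy = centers[b][1] - centers[a][1]
            dist = math.hypot(dx, dy)
            ux, uy = (dx / dist, dy / dist) if dist > 1e-9 else (1.0, 0.0)
            rest = old[a][1] + old[b][1] + padding  # natural length of the edge
            gap = dist - rest
            # Overlapping circles are pushed fully apart; distant ones pulled gently.
            move = gap / 2 if gap < 0 else pull * gap / 2
            centers[a][0] += ux * move
            centers[a][1] += uy * move
            centers[b][0] -= ux * move
            centers[b][1] -= uy * move
    result = dict(positions)                    # nodes outside any subgraph stay put
    for g, members in groups.items():
        ox = centers[g][0] - old[g][0][0]
        oy = centers[g][1] - old[g][0][1]
        for n in members:                       # batch update by a shared offset
            result[n] = (positions[n][0] + ox, positions[n][1] + oy)
    return result
```

Because every node of a subgraph is moved by the same offset, the Laplacian coordinates computed from the edges inside that subgraph are the same before and after the update, which is one simple way to check that the internal layout has been preserved.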

Final effect

The final layout effect is shown in the following figure:

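  1. Concentric circle layout + grid layout

img (concentric circle layout + grid layout)

  2. Circular layout + grid layout

img (circular layout + grid layout)

  3. Concentric circle layout + DAG layout + grid layout

img (concentric circle layout + DAG layout + grid layout)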

Summary

Subgraph layout has always been a focus of research in the field of graph visualization. This method abstracts subgraphs into virtual nodes, creates virtual edges between them, and uses a force-directed algorithm to calculate the positions of all subgraphs. It then uses Laplacian coordinates, that is, the position of each node relative to its neighbors, so that the entire graph keeps its original topological properties as far as possible. In this way, a graph can be split into multiple custom subgraphs that each use a different layout. At present, the method is widely used in graph analysis business scenarios.

Author: ES2049 / Xie Kangkui

The article can be reproduced at will, but please keep the original link.
You are very welcome to join ES2049 Studio; please send your resume to caijun.hcj@alibaba-inc.com

