
In the new version of Graphic, a visualization library for Flutter, the declarative specification syntax has been optimized to better reflect the essence of the grammar of graphics.

This article transforms a Graphic specification step by step, evolving a bar chart into a pie chart, to show the flexibility and richness of the grammar of graphics. It also helps beginners understand the basic concepts of the grammar.

If you have never encountered the grammar of graphics, you can still follow along. This article can be read as an introductory tutorial for Graphic.

Bar charts and pie charts are both common chart types in data visualization. They look very different at first glance, yet in the grammar of graphics they share the same essence. Why? Let us go step by step from a bar chart to a pie chart to find out.

Let's start with the most common bar chart. The data is the same as in the getting-started example:

const data = [
  {'category': 'Shirts', 'sales': 5},
  {'category': 'Cardigans', 'sales': 20},
  {'category': 'Chiffons', 'sales': 36},
  {'category': 'Pants', 'sales': 10},
  {'category': 'Heels', 'sales': 10},
  {'category': 'Socks', 'sales': 20},
];

Declarative definition

Graphic adopts a declarative specification: the entire visualization grammar is written in the constructor of the Chart class:

Chart(
  data: data,
  variables: {
    'category': Variable(
      accessor: (Map map) => map['category'] as String,
    ),
    'sales': Variable(
      accessor: (Map map) => map['sales'] as num,
    ),
  },
  elements: [IntervalElement()],
  axes: [
    Defaults.horizontalAxis,
    Defaults.verticalAxis,
  ],
)

Data and variables

The chart's data is passed in through the data field and can be a list of items of any type. Inside the chart, each data item is converted to the standard Tuple type. How a data item is converted to Tuple field values is defined by variables (Variable).

As the code shows, the specification is very brief, yet variables takes up half of it. Dart is a strongly typed language, so to accept input data of any type, a detailed Variable definition is indispensable.

Geometric elements

The most important feature of the grammar of graphics is the distinction between the abstract graph of the data and the concrete graphic that is drawn.

For example, whether the data describes an interval or a single point is a matter of the graph; whether it is drawn as a long bar or a triangle, and how tall and wide it is, is a matter of the graphic. The steps that generate the graph and the graphic are called geometry and aesthetic respectively.

The concepts of graph and graphic capture the essential relationship between data and graphics, and they are the key to how the grammar of graphics breaks out of the traditional chart classification.

Both are defined in a geometric element (GeomElement). The type of the element determines the graph.

The height of each bar represents the interval from 0 to the data value, so we use IntervalElement. This gives us the most common bar chart:

Back to the question at the beginning: the opening angle of a pie chart sector also expresses an interval, so it should also be an IntervalElement. Why, then, is it a bar in the bar chart and a sector in the pie chart?

Coordinate system

The coordinate system assigns different variables to different dimensions on the plane. For the rectangular coordinate system (RectCoord), the dimensions are horizontal and vertical; for the polar coordinate system (PolarCoord), they are angle and radius.

The current example does not specify the coord field, so the default rectangular coordinate system is used. Since the pie chart expresses intervals through opening angles, it needs the polar coordinate system. We add one line to specify it:

coord: PolarCoord()

The graphic becomes a rose chart:

It seems to be getting close to a pie chart. However, the graphic obtained by this "one-click switch" is still far from perfect and needs some work.

Scale

The first problem is that the ratios between the sector radii do not seem to match the ratios between the sales values in the data.

Solving this problem involves an important concept in the grammar of graphics: the scale (Scale).

The values in the original data may be numbers, strings, or times. Even when they are all numbers, their magnitudes may differ by several orders. Therefore, they need to be standardized before the graphics use them. This process is called scaling.

Continuous data, such as numbers and times, is normalized to [0, 1]; discrete data, such as strings, is mapped to natural-number indexes such as 0, 1, 2, 3...

Each variable has a corresponding scale, which is set in the scale field of Variable. A Tuple field value may be one of three types: number (num), time (DateTime), or string (String), so scales are divided into three types according to the original data type they handle:

  • LinearScale : linearly normalizes a range of numbers to [0, 1]; continuous
  • TimeScale : linearly normalizes a range of times to [0, 1]; continuous
  • OrdinalScale : maps strings, in order, to natural-number indexes; discrete

For numbers, the default LinearScale determines the range from the data of the chart, so the minimum is not necessarily 0. For a bar chart, this lets the graphic focus on the differences in height, but it does not suit a rose chart, because people tend to read the radius as a proportion.

Therefore, we need to manually set the minimum of the LinearScale to 0:

'sales': Variable(
  accessor: (Map map) => map['sales'] as num,
  scale: LinearScale(min: 0),
),
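To see why the minimum matters, here is a minimal sketch in plain Dart. The normalize helper is hypothetical and is not Graphic's LinearScale; it also assumes that the automatically chosen range is exactly the data range [5, 36], which the real scale may round differently.

```dart
// Toy helper for illustration only; this is not Graphic's LinearScale.
// Linearly maps value from [min, max] to [0, 1].
double normalize(num value, num min, num max) => (value - min) / (max - min);

void main() {
  const sales = [5, 20, 36, 10, 10, 20];
  // Assumption: the automatic range equals the data range.
  const dataMin = 5, dataMax = 36;

  for (final v in sales) {
    final fromDataMin = normalize(v, dataMin, dataMax);
    final fromZero = normalize(v, 0, dataMax);
    print('$v: range from data min -> ${fromDataMin.toStringAsFixed(2)}, '
        'range from 0 -> ${fromZero.toStringAsFixed(2)}');
  }
}
```

Under these assumptions, the smallest value (5) is scaled to 0 when the range starts at the data minimum, so its sector would have no radius at all. With the range starting at 0, the scaled values keep the same ratios as the original sales values.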

Aesthetic attributes

The second problem is that the sectors sit right next to each other and need to be distinguished by color. Also, in a rose chart people are more used to labels than to coordinate axes.

Things like color and label, which people use to perceive a graphic, are called aesthetic attributes. Graphic has the following aesthetic attribute types:

  • position : position
  • shape : concrete shape
  • color : color
  • gradient : gradient color, can replace color
  • elevation : shadow elevation
  • label : label
  • size : size

Except for position, each aesthetic attribute is defined in GeomElement by a corresponding Attr class. Depending on which fields are set, an attribute can be specified in the following ways:

  • Specify the attribute value directly through value.
  • Specify the associated variable and the target attribute values through variable, values, and stops. The variable value is interpolated or indexed into an attribute value, depending on its type. This kind of attribute is called a channel attribute (ChannelAttr).
  • Define directly, through encoder, how a data item maps to an attribute value.

In the example, we use color and label to configure different colors and labels for each sector:

elements: [IntervalElement(
  color: ColorAttr(
    variable: 'category',
    values: Defaults.colors10,
  ),
  label: LabelAttr(
    encoder: (tuple) => Label(
      tuple['category'].toString(),
    ),
  ),
)]

In this way, a more complete rose chart is obtained:

How do we get from a rose chart to a pie chart?

Coordinate system transpose

There is often a functional relationship between the variables of the data: y = f(x). The dimension of the function's domain is called the domain dimension, usually denoted x; the dimension of the function's range is called the measure dimension, usually denoted y. By convention, in the rectangular coordinate system the domain dimension corresponds to the horizontal direction and the measure dimension to the vertical direction; in the polar coordinate system the domain dimension corresponds to the angle and the measure dimension to the radius.

The rose chart uses the radius to represent the value, while the pie chart uses the angle. So the first step in converting one into the other is to swap which dimension maps to which direction of the coordinate system. This is called transposing the coordinate system:

coord: PolarCoord(transposed: true)
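The effect of transposing can be sketched in plain Dart. The toPolar function and the PolarPosition class below are hypothetical and are not Graphic's PolarCoord; the sketch assumes that scaled values in [0, 1] map to a full circle of angle and to a unit radius.

```dart
import 'dart:math' as math;

// Toy model for illustration only; this is not Graphic's PolarCoord.
class PolarPosition {
  const PolarPosition(this.angle, this.radius);
  final double angle; // radians, assumed to span 0 to 2 * pi
  final double radius; // normalized, assumed to span 0 to 1
}

// Maps scaled domain and measure values (both in [0, 1]) to a polar position.
// Without transposing, domain -> angle and measure -> radius.
// With transposing, the two are swapped.
PolarPosition toPolar(double domain, double measure,
    {bool transposed = false}) {
  final angleValue = transposed ? measure : domain;
  final radiusValue = transposed ? domain : measure;
  return PolarPosition(angleValue * 2 * math.pi, radiusValue);
}

void main() {
  final rose = toPolar(0.25, 0.5);
  final swapped = toPolar(0.25, 0.5, transposed: true);
  print('default:    angle=${rose.angle}, radius=${rose.radius}');
  print('transposed: angle=${swapped.angle}, radius=${swapped.radius}');
}
```

In the default case the measure value controls the radius, as in the rose chart. After transposing, the same measure value controls the angle, which is what a pie chart needs.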

The graphic becomes a racetrack chart (a radial bar chart):

It seems to be closer to a pie chart.

Variable transform

In a pie chart, all the sectors add up to exactly one circle, and the arc length of each sector is the proportion of its data item in the total. In the chart above, the arcs spliced together would obviously exceed a full circle.

One way to fix this is to set the range of the sales scale to run from 0 to the sum of all sales values, so that each sales value is scaled to exactly its proportion of the total. But for dynamic data, we often do not know the actual data when we define the chart.

Another way: if the measure variable is the proportion of each sales value in the total, then we only need to set the original range of that variable's scale to [0, 1].

This is where a variable transform (VariableTransform) comes in. It can apply a statistical transformation to existing variable data, modifying variables or generating new ones. Here we use Proportion, which calculates the proportion of each sales value in the total, generates a new percent variable, and sets a scale with the original range [0, 1] for it:

transforms: [
  Proportion(
    variable: 'sales',
    as: 'percent',
  ),
]
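The arithmetic behind such a proportion transform can be sketched in plain Dart. The proportions function is a hypothetical helper, not Graphic's Proportion class; it only shows the calculation described above on the sample data.

```dart
// Toy helper for illustration only; this is not Graphic's Proportion.
// Returns each value divided by the sum of all values.
List<double> proportions(List<num> values) {
  final total = values.fold<num>(0, (sum, v) => sum + v);
  return [for (final v in values) v / total];
}

void main() {
  const sales = [5, 20, 36, 10, 10, 20]; // total is 101
  final percent = proportions(sales);
  for (var i = 0; i < sales.length; i++) {
    print('${sales[i]} -> ${percent[i].toStringAsFixed(3)}');
  }
  // The proportions add up to 1, which is why the range [0, 1] fits.
  final sum = percent.fold<double>(0, (a, b) => a + b);
  print('sum = ${sum.toStringAsFixed(3)}');
}
```

Because the result depends only on the data passed in, the same definition keeps working when the data changes, which is the advantage over hard-coding the scale range.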

Graphic algebra

After setting up the variable transform, we run into a new problem. The original Tuple had only category and sales, which map naturally to the domain and measure dimensions. But now there is an additional percent variable. How to hand out three chestnuts to two monkeys has to be spelled out explicitly.

Defining the relationship between variables and dimensions is the job of graphic algebra.

Graphic algebra uses an expression that connects variable sets (Varset) with operators to define how variables relate to each other and how they are assigned to dimensions. There are three operators:

  • * : called cross, assigns the variables on its two sides to different dimensions, in order.
  • + : called blend, assigns the variables on its two sides to the same dimension, in order.
  • / : called nest, groups all the data by the variable on its right.

We need to assign category and the transformed percent variable to the domain and measure dimensions respectively. Thanks to Dart's operator overloading on classes, Graphic implements all the algebra operations in the Varset class, so the expression is defined through the position field as follows:

position: Varset('category') * Varset('percent')
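To illustrate how operator overloading can express such an algebra, here is a minimal sketch in plain Dart. ToyVarset is a hypothetical class written for this example; it is not Graphic's Varset, and its internal representation (a list of dimensions plus a list of grouping variables) is an assumption.

```dart
// Toy model for illustration only; this is not Graphic's Varset class.
class ToyVarset {
  ToyVarset(String name)
      : dims = [[name]],
        nesters = [];
  ToyVarset._(this.dims, this.nesters);

  /// dims[i] lists the variables assigned to dimension i.
  final List<List<String>> dims;

  /// Variables that the data is grouped by.
  final List<String> nesters;

  /// cross: the right side's variables go to additional dimensions.
  ToyVarset operator *(ToyVarset other) =>
      ToyVarset._([...dims, ...other.dims], [...nesters, ...other.nesters]);

  /// blend: both sides share the same dimensions, position by position.
  ToyVarset operator +(ToyVarset other) {
    final count =
        dims.length > other.dims.length ? dims.length : other.dims.length;
    return ToyVarset._([
      for (var i = 0; i < count; i++)
        [
          if (i < dims.length) ...dims[i],
          if (i < other.dims.length) ...other.dims[i],
        ],
    ], [...nesters, ...other.nesters]);
  }

  /// nest: the right side's variables become grouping variables.
  ToyVarset operator /(ToyVarset other) =>
      ToyVarset._(dims, [...nesters, for (final d in other.dims) ...d]);
}

void main() {
  final position =
      ToyVarset('category') * ToyVarset('percent') / ToyVarset('category');
  print(position.dims); // [[category], [percent]]
  print(position.nesters); // [category]
}
```

In Dart, * and / have the same precedence and associate to the left, so the expression in main is read as a cross followed by a nest.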

With the variable transform and graphic algebra set up this way, the graphic becomes:

Grouping and adjustment

The length of each arc is now right, so it is time to "splice" them. The first step is to adjust their positions so that they connect end to end along the angle.

This kind of position adjustment is defined by a modifier (Modifier). A modifier does not act on individual data items, so we must first group all the data by category. For the sample data, each data item ends up as its own group. Grouping is defined by the nest operator in graphic algebra. Then we set the stack modifier (StackModifier):

elements: [IntervalElement(
  ...
  position: Varset('category') * Varset('percent') / Varset('category'),
  modifiers: [StackModifier()],
)]
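What stacking does to the intervals can be sketched in plain Dart. The stack function and the Interval class are hypothetical and are not Graphic's StackModifier; the sketch assumes one value per group, as in the sample data.

```dart
// Toy model for illustration only; this is not Graphic's StackModifier.
class Interval {
  const Interval(this.start, this.end);
  final double start;
  final double end;
}

// Places each length after the previous one, so that the intervals
// connect end to end instead of all starting at 0.
List<Interval> stack(List<double> lengths) {
  final result = <Interval>[];
  var offset = 0.0;
  for (final length in lengths) {
    result.add(Interval(offset, offset + length));
    offset += length;
  }
  return result;
}

void main() {
  const sales = [5, 20, 36, 10, 10, 20];
  const total = 101; // sum of the sample sales values
  final percent = [for (final v in sales) v / total];

  for (final interval in stack(percent)) {
    print('${interval.start.toStringAsFixed(3)} to '
        '${interval.end.toStringAsFixed(3)}');
  }
}
```

Each interval starts where the previous one ends, and because the lengths are proportions of the total, the last interval ends at about 1, which is one full turn once mapped to the angle.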

Since the arc lengths were already made to sum to one circle, stacking connects them end to end along the angle. The result can be regarded as a sunburst chart:

Coordinate dimension

Only one step remains: the angle of each arc is already in place, so once each arc fills the entire radius, together they form a pie.

Look at the radius dimension: we have just assigned category to it through graphic algebra, so each arc falls into a different "track" in order. But what we actually want is for the radius not to distinguish positions at all, so that only the angle dimension matters. In other words, we want this polar coordinate system to be a one-dimensional coordinate system with only the angle.

We only need to set the number of dimensions of the coordinate system to 1 and remove the category dimension from the algebra expression:

coord: PolarCoord(
  transposed: true,
  dimCount: 1,
)
...
position: Varset('percent') / Varset('category')

In this way, each arc segment will cover the entire radius indiscriminately, and the pie chart is drawn:


The complete definition of the pie chart is as follows:

Chart(
  data: data,
  variables: {
    'category': Variable(
      accessor: (Map map) => map['category'] as String,
    ),
    'sales': Variable(
      accessor: (Map map) => map['sales'] as num,
      scale: LinearScale(min: 0),
    ),
  },
  transforms: [
    Proportion(
      variable: 'sales',
      as: 'percent',
    ),
  ],
  elements: [IntervalElement(
    position: Varset('percent') / Varset('category'),
    modifiers: [StackModifier()],
    color: ColorAttr(
      variable: 'category',
      values: Defaults.colors10,
    ),
    label: LabelAttr(
      encoder: (tuple) => Label(
        tuple['category'].toString(),
        LabelStyle(Defaults.runeStyle),
      ),
    ),
  )],
  coord: PolarCoord(
    transposed: true,
    dimCount: 1,
  ),
)

In this process, we changed grammar definitions such as the coordinate system, scale, aesthetic attributes, variable transform, graphic algebra, and modifiers, so that the graphic kept transforming, and we obtained in turn the bar chart, rose chart, racetrack chart, sunburst chart, and pie chart of the traditional chart classification.

As we can see, a grammar-of-graphics specification breaks free of the shackles of traditional chart types. Its parts can be permuted and combined into many more kinds of graphics, which gives better flexibility and extensibility. More importantly, it reveals the essential connections and differences between different visualization graphics and provides a theoretical basis for the development of data visualization science.


Entronad