Netty series: XML codecs commonly used in Netty

Introduction

Before JSON, XML was the most commonly used data transmission format. Although XML carries a lot of redundant data, its structure is simple and clear, so it is still used in many places, and Netty provides support for it.

Netty's support for XML has two parts. The first splits an encoded stream containing multiple XML documents into frames, so that each frame holds one complete XML document. The second parses the XML content of each frame.

XmlFrameDecoder handles the frame splitting, and XmlDecoder parses the XML content. The sections below explain how the two decoders are implemented and used.

XmlFrameDecoder

Because we receive a stream of data, we cannot know in advance how the received bytes are chunked. A single well-formed XML document may arrive split across several chunks.

As follows:

   +-------+-----+--------------+
   | <this | IsA | XMLElement/> |
   +-------+-----+--------------+

This is one well-formed XML element, but it has arrived in three chunks, so they need to be merged into a single frame:

   +----------------------+
   | <thisIsAXMLElement/> |
   +----------------------+

Several XML documents may also arrive spread across multiple chunks, as follows:

   +-----+-----+-----------+-----+----------------------------------+
   | <an | Xml | Element/> | <ro | ot><child>content</child></root> |
   +-----+-----+-----------+-----+----------------------------------+

The above data needs to be split into two frames:

   +-----------------+-------------------------------------+
   | <anXmlElement/> | <root><child>content</child></root> |
   +-----------------+-------------------------------------+

The splitting logic is simple: the decoder looks at the positions of the XML delimiters to decide where an element starts and ends. Three delimiter characters matter: '<', '>' and '/'.

The decode method mainly needs to examine these three delimiters.
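The counting idea can be sketched in plain Java without Netty. The class below is a simplified illustration, not Netty's implementation: the name XmlFrameSplitSketch and its methods are hypothetical, it works on characters rather than a ByteBuf, and it assumes the input contains no comments, CDATA sections or processing instructions, and no '>' or "/>" inside attribute values or text.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified, hypothetical sketch of delimiter-based XML frame splitting. */
final class XmlFrameSplitSketch {
    // Characters received so far that do not yet form a complete frame.
    private final StringBuilder buffer = new StringBuilder();

    /** Returns the length of the first complete element in the input, or -1 if it is incomplete. */
    static int completeElementLength(CharSequence in) {
        int openCount = 0;              // elements opened but not yet closed
        boolean elementFound = false;   // true once an opening tag has been seen
        for (int i = 0; i < in.length(); i++) {
            char c = in.charAt(i);
            boolean hasNext = i + 1 < in.length();
            if (c == '<' && hasNext) {
                if (in.charAt(i + 1) == '/') {
                    openCount--;        // closing tag: </name>
                } else {
                    openCount++;        // opening tag: <name ...>
                    elementFound = true;
                }
            } else if (c == '/' && hasNext && in.charAt(i + 1) == '>') {
                openCount--;            // self-closing tag: <name/>
            } else if (c == '>' && elementFound && openCount == 0) {
                return i + 1;           // the outermost element has just ended
            }
        }
        return -1;
    }

    /** Appends a chunk and returns every frame that became complete. */
    List<String> feed(String chunk) {
        buffer.append(chunk);
        List<String> frames = new ArrayList<>();
        int length;
        while ((length = completeElementLength(buffer)) > 0) {
            // Drop surrounding whitespace, then remove the consumed characters.
            frames.add(buffer.substring(0, length).trim());
            buffer.delete(0, length);
        }
        return frames;
    }
}
```

Feeding the chunks "<this", "IsA" and "XMLElement/>" one after another returns an empty list twice and then a single frame holding the whole element, which matches the first diagram above.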

There is also some additional checking logic, such as whether a byte is a valid start character for an XML element name:

    private static boolean isValidStartCharForXmlElement(final byte b) {
        return b >= 'a' && b <= 'z' || b >= 'A' && b <= 'Z' || b == ':' || b == '_';
    }

Whether a comment block starts at a given index:

    private static boolean isCommentBlockStart(final ByteBuf in, final int i) {
        return i < in.writerIndex() - 3
                && in.getByte(i + 2) == '-'
                && in.getByte(i + 3) == '-';
    }

Whether a CDATA block starts at a given index:

    private static boolean isCDATABlockStart(final ByteBuf in, final int i) {
        return i < in.writerIndex() - 8
                && in.getByte(i + 2) == '['
                && in.getByte(i + 3) == 'C'
                && in.getByte(i + 4) == 'D'
                && in.getByte(i + 5) == 'A'
                && in.getByte(i + 6) == 'T'
                && in.getByte(i + 7) == 'A'
                && in.getByte(i + 8) == '[';
    }
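To see how such look-ahead checks fit together, here is a small Netty-free sketch that classifies what begins at a '<' byte. The class XmlMarkerSketch, its Marker enum and its methods are hypothetical names made up for this illustration. Unlike the methods above, which only inspect the bytes from i + 2 onward, this sketch compares the whole marker including the leading "<!", and it treats a truncated marker as unknown so that a caller could wait for more bytes.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch: classify what starts at a '<' byte using bounded look-ahead. */
final class XmlMarkerSketch {
    enum Marker { ELEMENT_START, CLOSING_TAG, COMMENT, CDATA, PROCESSING_INSTRUCTION, UNKNOWN }

    // Same character test as isValidStartCharForXmlElement in the article.
    static boolean isValidStartChar(byte b) {
        return b >= 'a' && b <= 'z' || b >= 'A' && b <= 'Z' || b == ':' || b == '_';
    }

    // True only if every byte of the marker is present at index i and matches.
    static boolean startsWith(byte[] in, int i, String marker) {
        byte[] expected = marker.getBytes(StandardCharsets.US_ASCII);
        if (i < 0 || i + expected.length > in.length) {
            return false; // not enough bytes received yet
        }
        for (int k = 0; k < expected.length; k++) {
            if (in[i + k] != expected[k]) {
                return false;
            }
        }
        return true;
    }

    static Marker classify(byte[] in, int i) {
        if (i < 0 || i + 1 >= in.length || in[i] != '<') {
            return Marker.UNKNOWN;
        }
        byte next = in[i + 1];
        if (next == '/') {
            return Marker.CLOSING_TAG;
        }
        if (isValidStartChar(next)) {
            return Marker.ELEMENT_START;
        }
        if (next == '?') {
            return Marker.PROCESSING_INSTRUCTION;
        }
        if (startsWith(in, i, "<!--")) {
            return Marker.COMMENT;
        }
        if (startsWith(in, i, "<![CDATA[")) {
            return Marker.CDATA;
        }
        return Marker.UNKNOWN;
    }

    // Convenience overload for ASCII text.
    static Marker classify(String text, int i) {
        return classify(text.getBytes(StandardCharsets.US_ASCII), i);
    }
}
```

The bounds check in startsWith plays the same role as the comparisons against writerIndex() in the methods above: no byte beyond the data received so far is ever read.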

After using these methods to determine where an XML document starts and ends, decode calls the extractFrame method to copy the required bytes out of the original ByteBuf, and finally adds the resulting frame to out:

    final ByteBuf frame =
            extractFrame(in, readerIndex + leadingWhiteSpaceCount, xmlElementLength - leadingWhiteSpaceCount);
    in.skipBytes(xmlElementLength);
    out.add(frame);
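The index arithmetic in this step can be tried out with the JDK's java.nio.ByteBuffer. This is only an analogy under stated assumptions: ExtractFrameSketch and its extractFrame method are hypothetical and take different parameters from Netty's extractFrame, and ByteBuffer's position stands in for ByteBuf's readerIndex.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch of "copy the element, then skip it" using java.nio.ByteBuffer. */
final class ExtractFrameSketch {
    /**
     * Copies one element out of the buffer and marks it as consumed.
     *
     * @param in                buffer positioned at the start of the unread data
     * @param leadingWhitespace number of whitespace bytes before the element
     * @param elementLength     length of whitespace plus element, in bytes
     */
    static byte[] extractFrame(ByteBuffer in, int leadingWhitespace, int elementLength) {
        byte[] frame = new byte[elementLength - leadingWhitespace];
        ByteBuffer view = in.duplicate();               // own position, shared content
        view.position(in.position() + leadingWhitespace);
        view.get(frame);                                // copy only the element's bytes
        in.position(in.position() + elementLength);     // skip whitespace and element
        return frame;
    }
}
```

For a buffer holding "  <a/><b/>", calling extractFrame(in, 2, 6) copies the four bytes of the first element and leaves the four bytes of the second element unread for the next pass.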

XmlDecoder

Once the XML data has been split into frames, the next step is to parse the content of each XML document.

Netty provides XmlDecoder for this. It parses the actual content of a frame that already holds a single XML document. It is declared as follows:

    public class XmlDecoder extends ByteToMessageDecoder

As it reads the XML content, XmlDecoder breaks it down into XmlElementStart, XmlAttribute, XmlNamespace, XmlElementEnd, XmlProcessingInstruction, XmlCharacters, XmlComment, XmlSpace, XmlDocumentStart, XmlEntityReference, XmlDTD and XmlCdata objects.

These types cover nearly every construct that can appear in XML.

All these elements are defined in the io.netty.handler.codec.xml package.

However, XmlDecoder does not read and parse XML by itself. It borrows a third-party XML toolkit from FasterXML: the Aalto XML parser.

XmlDecoder uses Aalto's AsyncXMLStreamReader and AsyncByteArrayFeeder to parse XML data.

The corresponding fields are defined and initialized as follows:

    private static final AsyncXMLInputFactory XML_INPUT_FACTORY = new InputFactoryImpl();
    private final AsyncXMLStreamReader<AsyncByteArrayFeeder> streamReader;
    private final AsyncByteArrayFeeder streamFeeder;

    // Initialization performed when the decoder is created:
    this.streamReader = XML_INPUT_FACTORY.createAsyncForByteArray();
    this.streamFeeder = (AsyncByteArrayFeeder)this.streamReader.getInputFeeder();

The decode logic checks the type of each XML event, reads the corresponding data, wraps it in one of the XML objects listed above, and finally adds that object to the out list.
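The same "switch on the event type" pattern can be tried with the JDK's built-in StAX API (javax.xml.stream), whose event model Aalto's reader also follows. The sketch below is an assumption-laden stand-in, not XmlDecoder itself: it uses the blocking JDK parser instead of Aalto's asynchronous one, the class XmlEventLoopSketch and its describe method are hypothetical, and it collects plain strings where Netty creates typed objects such as XmlElementStart.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical sketch: map StAX events to simple descriptions, one per event. */
final class XmlEventLoopSketch {
    static List<String> describe(String xml) {
        XMLInputFactory factory = XMLInputFactory.newFactory();
        // Merge adjacent text events so each text run is reported once.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        List<String> out = new ArrayList<>();
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                switch (reader.next()) {
                    case XMLStreamConstants.START_ELEMENT:
                        out.add("start:" + reader.getLocalName());
                        for (int i = 0; i < reader.getAttributeCount(); i++) {
                            out.add("attribute:" + reader.getAttributeLocalName(i)
                                    + "=" + reader.getAttributeValue(i));
                        }
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        out.add("end:" + reader.getLocalName());
                        break;
                    case XMLStreamConstants.CHARACTERS:
                        out.add("characters:" + reader.getText());
                        break;
                    case XMLStreamConstants.COMMENT:
                        out.add("comment:" + reader.getText());
                        break;
                    default:
                        break; // other event types are ignored in this sketch
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
        return out;
    }
}
```

Each case corresponds to one kind of object that a decoder could add to its output list. A real asynchronous decoder additionally has to handle the situation where the parser has run out of input and must wait for more bytes, which this blocking sketch does not model.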

Summary

With XmlFrameDecoder and XmlDecoder, parsing XML data becomes very convenient. Netty has already built the wheel for us, so we don't need to reinvent it.

This article has been included in http://www.flydean.com/14-7-netty-codec-xml/

flydean
