Recently, a requirement came up at work: verify whether a mathematical formula string is valid and then evaluate it, much like a simple calculator.

A formula contains parentheses, operators, and variables. The variables are read from a database and can be added or deleted at will.

Suppose the built-in variables are height, length, width, and num. For a formula string such as (length*(1+width)/height)*num , we first need to verify that the format is valid, and then assign values to the variables and calculate the result.

1. Formula verification

Besides the routine checks, verification also normalizes the format of the formula so that the later calculation step receives predictable input.

The normalization includes:

  • Remove the spaces in the formula.
  • Prefix a signed term with zero so that every operator has two operands, for example, -a becomes 0-a .
  • Add the multiplication operator between two adjacent parenthesized groups, for example, (a+b)(b+c) becomes (a+b)*(b+c) .
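The three normalization steps can be sketched in isolation as below. This is only an illustrative sketch: the class name FormulaNormalizer and the method normalize are hypothetical, it performs no validity checks, and it prefixes a leading + with zero where the full code in this article drops it (the two forms are equivalent).

```java
class FormulaNormalizer {

    /** Applies the three normalization steps; performs no validity checks. */
    static String normalize(String expression) {
        // 1. Remove all whitespace.
        String result = expression.replaceAll("\\s+", "");
        // 2. Insert the implicit multiplication between adjacent groups: ")(" -> ")*(".
        result = result.replace(")(", ")*(");
        // 3. Prefix a sign that directly follows "(" with zero: "(-a" -> "(0-a".
        result = result.replace("(-", "(0-").replace("(+", "(0+");
        // A sign at the very start of the formula gets the same treatment.
        if (result.startsWith("-") || result.startsWith("+")) {
            result = "0" + result;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(normalize("(a+b)(b+c)")); // prints (a+b)*(b+c)
        System.out.println(normalize("-a"));         // prints 0-a
    }
}
```

Plain String.replace is used here instead of replaceAll so that the parentheses need no regular-expression escaping.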

The complete verification logic is as follows:

import org.apache.commons.lang3.StringUtils;
import org.junit.Test;

import java.util.Arrays;
import java.util.List;
import java.util.Stack;
import java.util.regex.Pattern;

/**
 * @author Sumkor
 * @since 2021/7/14
 */
public class ValidateTest {

    @Test
    public void test() {
        List<String> variables = Arrays.asList("height", "length", "width", "num");

        String result01 = validate("(height+length)(width+num)", variables);
        System.out.println("result01 = " + result01);

        String result02 = validate("-num+100", variables);
        System.out.println("result02 = " + result02);

        String result03 = validate("(length*(1+width)/height)*num", variables);
        System.out.println("result03 = " + result03);
    }

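    /**
     * Validates a mathematical formula using regular expressions.
     *
     * @param expression the formula, which may contain variables
     * @param variables  the set of built-in variable names
     * @return the normalized formula
     */
    private String validate(String expression, List<String> variables) {
        if (variables == null || variables.isEmpty()) {
            throw new RuntimeException("No built-in variables are defined");
        }
        // Remove spaces
        expression = expression.replaceAll(" ", "");
        // Reject consecutive operators (find() also catches a trailing run such as "a++", which split() would miss)
        if (Pattern.compile("[+\\-*/]{2,}").matcher(expression).find()) {
            throw new RuntimeException("Invalid formula: consecutive operators");
        }
        if (StringUtils.contains(expression, "()")) {
            throw new RuntimeException("Invalid formula: empty parentheses");
        }
        // Normalize: ")(" -> ")*(", "(-" -> "(0-", "(+" -> "(0+"
        expression = expression.replaceAll("\\)\\(", "\\)*\\(");
        expression = expression.replaceAll("\\(\\-", "\\(0-");
        expression = expression.replaceAll("\\(\\+", "\\(0+");
        // Validate variables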
        String[] splits = expression.split("\\+|\\-|\\*|\\/|\\(|\\)");
        for (String split : splits) {
            if (StringUtils.isBlank(split) || Pattern.matches("-?(0|([1-9]\\d*))(\\.\\d+)?", split)) {
                continue;
            }
            if (!variables.contains(split)) {
                throw new RuntimeException("Invalid formula: unknown variable or illegal character");
            }
        }
        // Validate parentheses and operator placement
        Character preChar = null;
        Stack<Character> stack = new Stack<>();
        String resultExpression = expression;
        for (int i = 0; i < expression.length(); i++) {
            char currChar = expression.charAt(i);
            if (i == 0) {
                if (Pattern.matches("\\*|\\/", String.valueOf(currChar))) {
                    throw new RuntimeException("Invalid formula: starts with * or /");
                }
                if (currChar == '+') {
                    resultExpression = expression.substring(1);
                }
                if (currChar == '-') {
                    resultExpression = "0" + expression;
                }
            }
            if ('(' == currChar) {
                stack.push('(');
            } else if (')' == currChar) {
                if (stack.size() > 0) {
                    stack.pop();
                } else {
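                    throw new RuntimeException("Invalid formula: unmatched parentheses");
                }
            }
            if (preChar != null && preChar == '(' && Pattern.matches("[\\+\\-\\*\\/]+", String.valueOf(currChar))) {
                throw new RuntimeException("Invalid formula: operator after left parenthesis");
            }
            // A right parenthesis may be followed by an operator or by another right parenthesis, as in (a*(b+c))
            if (preChar != null && preChar == ')' && currChar != ')'
                    && !Pattern.matches("[\\+\\-\\*\\/]+", String.valueOf(currChar))) {
                throw new RuntimeException("Invalid formula: right parenthesis not followed by an operator");
            }
            if (i == expression.length() - 1) {
                if (Pattern.matches("\\+|\\-|\\*|\\/", String.valueOf(currChar))) {
                    throw new RuntimeException("Invalid formula: ends with an operator");
                }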
            }
            preChar = currChar;
        }
        if (stack.size() > 0) {
            throw new RuntimeException("Invalid formula: unmatched parentheses");
        }
        return resultExpression;
    }
}

2. Formula calculation

Once the formula string has been validated and its variables have been assigned values, the result can be calculated.
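The article does not show how the variables are assigned, so the following is only one possible sketch. The class VariableBinder, the method bind, and the sample values are all hypothetical. Variable names are matched as whole identifiers so that a name such as num is not replaced inside a longer name, and a negative value is written as (0-x) to match the normalized form that the validation step produces.

```java
import java.math.BigDecimal;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class VariableBinder {

    // Assumption: variable names look like Java identifiers.
    private static final Pattern IDENTIFIER = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    /** Replaces every variable name in the formula with its numeric value. */
    static String bind(String expression, Map<String, BigDecimal> values) {
        Matcher matcher = IDENTIFIER.matcher(expression);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            BigDecimal value = values.get(matcher.group());
            if (value == null) {
                throw new IllegalArgumentException("No value for variable: " + matcher.group());
            }
            // Write a negative value as (0-x) so that no unary minus appears in the formula.
            String literal = value.signum() < 0
                    ? "(0-" + value.negate().toPlainString() + ")"
                    : value.toPlainString();
            matcher.appendReplacement(out, Matcher.quoteReplacement(literal));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, BigDecimal> values = new HashMap<>();
        values.put("length", new BigDecimal("2"));
        values.put("width", new BigDecimal("3"));
        values.put("height", new BigDecimal("4"));
        values.put("num", new BigDecimal("5"));
        // prints (2*(1+3)/4)*5
        System.out.println(bind("(length*(1+width)/height)*num", values));
    }
}
```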

Here we calculate the result in two ways, with the JDK's built-in JS engine and with a postfix expression algorithm that I implemented myself, and then compare the efficiency of the two.

2.1 Using the JS engine

The JDK's built-in JS engine makes the calculation convenient and accurate, but it is relatively slow. Note that the built-in engine (Nashorn) was removed in JDK 15, so on newer JDKs getEngineByName("js") returns null unless a JavaScript engine is added as a dependency.

import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

@Test
public void scriptEngine() throws ScriptException {
    String str = "43*(2+1.4)+2*32/(3-2.1)";
    ScriptEngineManager manager = new ScriptEngineManager();
    ScriptEngine engine = manager.getEngineByName("js");
    Object result = engine.eval(str);
    System.out.println("Result type: " + result.getClass().getName() + ", result: " + result);
}

Output:

Result type: java.lang.Double, result: 217.3111111111111

2.2 Using a custom algorithm

Writing the algorithm yourself produces the result in much less time, but you are then responsible for making sure the algorithm is correct.

The online article "java realizes the calculation of complex mathematical expressions" provides a good idea, but its code only supports single-digit numbers and gives wrong results in some cases.

The idea is to work in two steps:

  1. Convert the input expression from infix notation to postfix notation. For example, a+b*(c-d) becomes abcd-*+ in postfix notation.
  2. Calculate the result of the postfix expression, such as abcd-*+ , using a stack to hold the intermediate results. Traverse the expression from the beginning and push each number onto the stack. When an operator is encountered, pop the top two numbers from the stack, apply the operator, and push the result back. When the traversal ends, the single element left on the stack is the result.

This article implements the same idea and adds support for multi-digit numbers and decimals.

2.2.1 How to use postfix expressions

The core idea of this algorithm is to convert the formula from an infix expression to a postfix expression. For example, 3-10*2+5 becomes 3, 10, 2, *, -, 5, + .
Before discussing how to generate postfix expressions, let's look at how to use one to calculate a result.

Parameter definitions

  • A LinkedList stores the elements of the postfix expression. An element may be a number or an operator, so the element type is String.
  • Because multi-digit numbers and decimals are supported, BigDecimal is used as the result type.
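
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.LinkedList;
import java.util.Stack;

/**
 * Calculates the result of a postfix expression.
 * The list is consumed during evaluation, and each division is rounded to 2 decimal places.
 */
private static BigDecimal doCalculate(LinkedList<String> postfixList) {
    // Operand stack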
    Stack<BigDecimal> numStack = new Stack<>();
    while (!postfixList.isEmpty()) {
        String item = postfixList.removeFirst();
        BigDecimal a, b;
        switch (item) {
            case "+":
                a = numStack.pop();
                b = numStack.pop();
                numStack.push(b.add(a));
                break;
            case "-":
                a = numStack.pop();
                b = numStack.pop();
                numStack.push(b.subtract(a));
                break;
            case "*":
                a = numStack.pop();
                b = numStack.pop();
                numStack.push(b.multiply(a));
                break;
            case "/":
                a = numStack.pop();
                b = numStack.pop();
                numStack.push(b.divide(a, 2, RoundingMode.HALF_UP));
                break;
            default:
                numStack.push(new BigDecimal(item));
                break;
        }
    }
    return numStack.pop();
}
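
Because doCalculate rounds every division to two decimal places, an expression such as 1/3*3 loses precision in the intermediate result. The variant below is a sketch of one alternative, not the article's method: the class PostfixEvaluator and its methods are hypothetical, it takes a MathContext so that the caller chooses the precision, it uses ArrayDeque as the stack, and it leaves the input list untouched.

```java
import java.math.BigDecimal;
import java.math.MathContext;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

class PostfixEvaluator {

    /** Evaluates postfix tokens without modifying the input list. */
    static BigDecimal evaluate(List<String> postfix, MathContext mc) {
        Deque<BigDecimal> operands = new ArrayDeque<>();
        for (String token : postfix) {
            if (isOperator(token)) {
                if (operands.size() < 2) {
                    throw new IllegalArgumentException("Malformed postfix expression");
                }
                // The right operand is on top because it was pushed last.
                BigDecimal right = operands.pop();
                BigDecimal left = operands.pop();
                operands.push(apply(token, left, right, mc));
            } else {
                operands.push(new BigDecimal(token));
            }
        }
        if (operands.size() != 1) {
            throw new IllegalArgumentException("Malformed postfix expression");
        }
        return operands.pop();
    }

    private static boolean isOperator(String token) {
        return token.length() == 1 && "+-*/".contains(token);
    }

    private static BigDecimal apply(String op, BigDecimal left, BigDecimal right, MathContext mc) {
        switch (op) {
            case "+": return left.add(right, mc);
            case "-": return left.subtract(right, mc);
            case "*": return left.multiply(right, mc);
            default:  return left.divide(right, mc); // "/"
        }
    }

    public static void main(String[] args) {
        // 3-10*2+5 in postfix form
        List<String> postfix = Arrays.asList("3", "10", "2", "*", "-", "5", "+");
        System.out.println(evaluate(postfix, MathContext.DECIMAL64)); // prints -12
    }
}
```

MathContext.DECIMAL64 keeps 16 significant digits; whether that or a fixed scale is the better fit depends on how the results are used.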

2.2.2 How to generate postfix expressions

Calculating a result from a postfix expression is relatively simple to implement. The key is how to convert the formula into a correct postfix expression.

Parameter definitions

  • Define an operator stack Stack<Character> optStack to reorder the operators according to priority.
  • Define a multi-digit list LinkedList<Character> multiDigitList to temporarily store the digits and decimal point of a number in the formula.
  • Define a postfix expression list LinkedList<String> postfixList to store the postfix expression.

Traverse the formula string character by character to generate the postfix expression. The steps are as follows:

  1. If the current character is a digit or a decimal point, store it temporarily in the multi-digit list.
  2. If the current character is anything else, first move the entire content of the multi-digit list into the postfix expression list as one number, and then continue with the checks below.
  3. If the current character is a left parenthesis, push it directly onto the operator stack.
  4. If the current character is an addition, subtraction, multiplication, or division operator, compare it with the operator at the top of the operator stack:
    If the current operator has a higher priority, push it directly onto the stack;
    Otherwise, repeatedly pop the top of the stack into the postfix expression list until the current operator has a higher priority than the top of the stack or the stack is empty, and then push the current operator onto the stack.
  5. If the current character is a right parenthesis, repeatedly pop the top of the operator stack into the postfix expression list until the top of the stack is a left parenthesis, then pop the left parenthesis and discard it.
  6. At the end of the traversal, append any data remaining in the multi-digit list, and then in the operator stack, to the postfix expression list.

The implementation is as follows:

/**
 * Converts an infix expression to a postfix expression; supports multi-digit numbers and decimals.
 *
 * @author Sumkor
 * @since 2021/7/14
 */
private static LinkedList<String> getPostfix(String mathStr) {
    // Postfix expression list
    LinkedList<String> postfixList = new LinkedList<>();
    // Operator stack
    Stack<Character> optStack = new Stack<>();
    // Multi-digit list
    LinkedList<Character> multiDigitList = new LinkedList<>();
    char[] arr = mathStr.toCharArray();
    for (char c : arr) {
        if (Character.isDigit(c) || '.' == c) {
            multiDigitList.addLast(c);
        } else {
            // Before handling the current operator, flush the number buffered in the multi-digit list
            if (!multiDigitList.isEmpty()) {
                StringBuilder temp = new StringBuilder();
                while (!multiDigitList.isEmpty()) {
                    temp.append(multiDigitList.removeFirst());
                }
                postfixList.addLast(temp.toString());
            }
        }
        // If the current character is a left parenthesis, push it onto the operator stack
        if ('(' == c) {
            optStack.push(c);
        }
        // If the current character is an operator
        else if ('+' == c || '-' == c || '*' == c || '/' == c) {
            while (!optStack.isEmpty()) {
                char stackTop = optStack.pop();
                // If the current operator has a higher priority than the stack top, put the stack top back and stop
                if (compare(c, stackTop)) {
                    optStack.push(stackTop);
                    break;
                }
                // Otherwise, move the stack top to the postfix expression and continue the loop
                else {
                    postfixList.addLast(String.valueOf(stackTop));
                }
            }
            optStack.push(c);
        }
        // If the current character is a right parenthesis, pop operators to the postfix expression
        // until a left parenthesis is reached, then discard the left parenthesis
        else if (c == ')') {
            while (!optStack.isEmpty()) {
                char stackTop = optStack.pop();
                if (stackTop != '(') {
                    postfixList.addLast(String.valueOf(stackTop));
                } else {
                    break;
                }
            }
        }
    }
    // At the end of the traversal, data left in the multi-digit list means the formula ends with a number
    if (!multiDigitList.isEmpty()) {
        StringBuilder temp = new StringBuilder();
        while (!multiDigitList.isEmpty()) {
            temp.append(multiDigitList.removeFirst());
        }
        postfixList.addLast(temp.toString());
    }
    // At the end of the traversal, operators left on the stack are appended to the postfix expression
    while (!optStack.isEmpty()) {
        postfixList.addLast(String.valueOf(optStack.pop()));
    }
    return postfixList;
}

On the operator stack, a left parenthesis is pushed directly and a right parenthesis is never pushed.
The priority of the current operator is therefore compared with the top of the stack as follows:

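/**
 * Compares priorities.
 * Returns true when curr has a higher priority than stackTop.
 */
private static boolean compare(char curr, char stackTop) {
    // A left parenthesis is pushed directly, so here another operator is compared with a left parenthesis on top
    if (stackTop == '(') {
        return true;
    }
    // Multiplication and division have a higher priority than addition and subtraction
    if (curr == '*' || curr == '/') {
        return stackTop == '+' || stackTop == '-';
    }
    // For operators of equal priority, the one pushed first wins
    return false;
}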
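
The same comparison can also be written with numeric priorities, which may be easier to extend if more operators are added later. This is only a sketch: the class OperatorPriority and its methods are hypothetical names, and for the four operators and the left parenthesis it is meant to return the same answers as compare above.

```java
class OperatorPriority {

    /** Numeric priority; '(' is lowest so that any operator is stacked on top of it. */
    static int priority(char op) {
        switch (op) {
            case '(':
                return 0;
            case '+':
            case '-':
                return 1;
            case '*':
            case '/':
                return 2;
            default:
                throw new IllegalArgumentException("Unknown operator: " + op);
        }
    }

    /** Returns true when curr has a strictly higher priority than stackTop. */
    static boolean hasHigherPriority(char curr, char stackTop) {
        return priority(curr) > priority(stackTop);
    }
}
```

The strict comparison means that two operators of equal priority are evaluated from left to right, which is what subtraction and division require.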

2.2.3 Verification

Write multiple test cases as follows.

@Test
public void calculateTest() {
    String str = "5-1*(5+6)+2";
    System.out.println(str + " = " + calculate(str));

    str = "50-1*(5+6)+2";
    System.out.println(str + " = " + calculate(str));

    str = "(50.5-1)*(5+6)+2";
    System.out.println(str + " = " + calculate(str));

    str = "1+2*(3-4*(5+6))";
    System.out.println(str + " = " + calculate(str));

    str = "1/2*(3-4*(5+6))*10";
    System.out.println(str + " = " + calculate(str));

    str = "43*(2+1)+2*32+98";
    System.out.println(str + " = " + calculate(str));

    str = "3-10*2+5";
    System.out.println(str + " = " + calculate(str));
}

/**
 * 1. Convert the infix expression to a postfix expression
 * 2. Calculate the result from the postfix expression
 */
public static BigDecimal calculate(String mathStr) {
    if (mathStr == null || mathStr.length() == 0) {
        return null;
    }
    LinkedList<String> postfixList = getPostfix(mathStr);
    // System.out.println("Postfix expression: " + postfixList.toString());
    return doCalculate(postfixList);
}

The output is as follows, and every result is correct.

5-1*(5+6)+2 = -4
50-1*(5+6)+2 = 41
(50.5-1)*(5+6)+2 = 546.5
1+2*(3-4*(5+6)) = -81
1/2*(3-4*(5+6))*10 = -205.00
43*(2+1)+2*32+98 = 291
3-10*2+5 = -12

2.3 Performance comparison

Run 10,000 formula calculations with each approach and compare the elapsed time.

/**
 * Compare elapsed time
 */
@Test
public void vs() throws ScriptException {
    long start = System.currentTimeMillis();
    ScriptEngineManager manager = new ScriptEngineManager();
    ScriptEngine engine = manager.getEngineByName("js");
    for (int i = 0; i < 10000; i++) {
        String str = "43*(2+1.4)+2*32/(3-2.1)" + "+" + i;
        Object result = engine.eval(str);
    }
    System.out.println("Elapsed (ms): " + (System.currentTimeMillis() - start));

    start = System.currentTimeMillis();
    for (int i = 0; i < 10000; i++) {
        String str = "43*(2+1.4)+2*32/(3-2.1)" + "+" + i;
        BigDecimal result = TransferTest.calculate(str);
    }
    System.out.println("Elapsed (ms): " + (System.currentTimeMillis() - start));
}

In this run, the JDK built-in JS engine took 5989 ms (including engine start-up) and the postfix expression algorithm took 71 ms, so the JS engine was roughly 84 times slower.

Elapsed (ms): 5989
Elapsed (ms): 71
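
The test above times each loop once, and the first measurement also includes creating the script engine. A slightly more careful measurement could run some untimed warm-up iterations first, as sketched below. The class TimingSketch and the method measureMillis are hypothetical, and the workload in main is only a placeholder for the real calculation.

```java
import java.util.function.IntConsumer;

class TimingSketch {

    /** Runs the task warmup times untimed, then returns the elapsed milliseconds of the timed runs. */
    static long measureMillis(IntConsumer task, int warmup, int iterations) {
        for (int i = 0; i < warmup; i++) {
            task.accept(i);
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            task.accept(i);
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // Placeholder workload: only builds the formula strings used in the test above.
        long elapsed = measureMillis(i -> ("43*(2+1.4)+2*32/(3-2.1)" + "+" + i).length(), 1_000, 10_000);
        System.out.println("Elapsed (ms): " + elapsed);
    }
}
```

To compare the two approaches, the placeholder lambda would be swapped for a call to the JS engine or to calculate.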

Author: Sumkor
Link: https://segmentfault.com/a/1190000040348550

