Final goal

Use Node.js to implement a console human-vs-computer game of Gobang (five in a row). The human moves first, and the interaction stops as soon as one side wins (see the GIF below).
GIFa.gif

Main components of a human-vs-computer game program

  • A way of representing the board position in the machine, so that the program knows the state of the game.
  • Rules for generating legal moves, so that the game is played fairly and the program can tell when the human opponent makes an illegal move.
  • A technique for selecting the best move from all the legal moves.
  • A method of evaluating how good or bad a position is, used together with the technique above to make intelligent choices.
  • An interface through which the program can be used.

Main components of this program

This article describes the implementation of the following main parts, ordered from easy to difficult:

  • drawBoard: draws the board position
  • isEnd: checks whether the game has ended
  • getMark: evaluates the current position
  • abnegamax: a negamax search with alpha-beta pruning over the decision tree of a two-player game

drawBoard: drawing the board position

The position is drawn to the console using the mapping 1: '●', '-1': '○', 0: '+'.

 function drawBoard(board, cursor) {
    // Accept the flat string form: split it into rows of 16 cells and map '2' to '-1'.
    if (typeof board === 'string')
        board = board.replace(/(\d{16})/g, "$1,").slice(0, -1).split(',').map(v => v.split('').map(v => v == 2 ? '-1' : v))
    // Header row: "人" (human) when a cursor is passed, "机" (computer) otherwise.
    console.log((cursor ? "人" : "机") + "  0 1 2 3 4 5 6 7 8 9 A B C D E F")
    for (let i = 0; i < board.length; i++)
        console.log((i > 9 ? i : ' ' + i) + ' ' + board[i].map((v, j) => {
            // Mark the cell currently selected by the human.
            if (cursor && cursor.x === j && cursor.y === i)
                return '◎'
            return ({ 1: '●', '-1': '○', 0: '+' })[v]
        }).join(''))
}
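
The string handling inside drawBoard can be easier to follow in isolation. The sketch below is not part of the original program: `parseBoard`, `renderRow`, and `SYMBOLS` are hypothetical names, and it assumes the same encoding as above ('0' empty, '1' one player, '2' the other, mapped to -1).

```javascript
// Hypothetical helper: split a flat board string into rows of `size` cells.
// Assumes the encoding used by drawBoard: '0' empty, '1' one side, '2' the other side.
function parseBoard(str, size = 16) {
    const rows = [];
    for (let i = 0; i < str.length; i += size) {
        const row = str.slice(i, i + size).split('');
        rows.push(row.map(c => (c === '2' ? -1 : Number(c))));
    }
    return rows;
}

// Same symbol mapping as in drawBoard.
const SYMBOLS = { 1: '●', '-1': '○', 0: '+' };

// Hypothetical helper: turn one row of cell values into the printed symbols.
function renderRow(row) {
    return row.map(v => SYMBOLS[v]).join('');
}

// A 4-wide toy board instead of the 16-wide board used by the game.
const demo = parseBoard('0120' + '0000', 4);
console.log(demo.map(renderRow).join('\n'));
```

The real drawBoard does the same split with a regular expression and keeps the cells as strings; the numeric form here is only for readability.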

When the human makes a move, the arrow keys move the cursor, '◎' marks the position where the stone will be placed, and pressing Enter confirms the move.

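 // Setup assumed here (not shown in the article): stdin must emit 'keypress' events.
const readline = require('readline')
readline.emitKeypressEvents(process.stdin)
if (process.stdin.isTTY) process.stdin.setRawMode(true) // deliver keys immediately

// cursor and chessBoard are defined elsewhere in the program.
process.stdin.on('keypress', (str, key) => {
    // In raw mode Ctrl+C no longer ends the process by itself.
    if (key.ctrl && key.name === 'c') process.exit()
    if (key.name === 'return') {
        // Confirm the move at the cursor (this branch is left empty in the article).
    } else {
        switch (key.name) {
            case 'up': cursor.y = cursor.y - 1; break;
            case 'down': cursor.y = cursor.y + 1; break;
            case 'left': cursor.x = cursor.x - 1; break;
            case 'right': cursor.x = cursor.x + 1; break;
        }
        // Keep the cursor inside the 16x16 board.
        cursor.x = Math.min(15, Math.max(0, cursor.x))
        cursor.y = Math.min(15, Math.max(0, cursor.y))
        drawBoard(chessBoard.join(''), cursor)
    }
})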
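
The cursor logic of the handler can also be written as a pure function, which makes the clamping at the board edges easy to check without a terminal. This is only a sketch: `moveCursor` and `DELTAS` are hypothetical names that do not appear in the original program.

```javascript
// Hypothetical pure helper mirroring the switch in the keypress handler.
// Each entry is [dx, dy] for one arrow key.
const DELTAS = { up: [0, -1], down: [0, 1], left: [-1, 0], right: [1, 0] };

function moveCursor(cursor, keyName, size = 16) {
    const [dx, dy] = DELTAS[keyName] || [0, 0]; // unknown keys leave the cursor alone
    const clamp = v => Math.min(size - 1, Math.max(0, v));
    return { x: clamp(cursor.x + dx), y: clamp(cursor.y + dy) };
}

console.log(moveCursor({ x: 5, y: 5 }, 'up'));     // { x: 5, y: 4 }
console.log(moveCursor({ x: 0, y: 0 }, 'left'));   // { x: 0, y: 0 }
console.log(moveCursor({ x: 15, y: 3 }, 'right')); // { x: 15, y: 3 }
```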

isEnd: detecting the end of the game

After each move, check the 4 lines through the position just played (horizontal, vertical, and the two diagonals) to see whether any of them contains five consecutive stones of the same color.

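 function isEnd(x, y, chessBoard) {
    // Four directions as [dy, dx]; each one is walked both ways from (x, y).
    let vect = [[-1, 0], [-1, 1], [0, 1], [1, 1]]
    let qi = chessBoard[y][x] // the stone that was just placed
    for (let i = 0; i < 4; i++) {
        let a = 1; let b = 1;
        // Walk forward while the stones match; leaving the board stops the loop.
        while (chessBoard[y + vect[i][0] * a] && chessBoard[y + vect[i][0] * a][x + vect[i][1] * a] === qi) 
            a++
        // Walk backward in the opposite direction.
        while (chessBoard[y - vect[i][0] * b] && chessBoard[y - vect[i][0] * b][x - vect[i][1] * b] === qi) 
            b++
        // The run length is a + b - 1, so a + b > 5 means five or more in a row.
        if (a + b > 5) return true
    }
    return false;
}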
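
A small usage example may help here. It copies isEnd so that it runs on its own and uses a hypothetical 6x6 board (the game itself uses 16x16) with stones stored as 1, -1, and 0. Note that isEnd is meant to be called for a cell that holds a stone; on an empty cell it would count runs of empty cells.

```javascript
// Copy of the article's isEnd so that the example is self-contained.
function isEnd(x, y, chessBoard) {
    let vect = [[-1, 0], [-1, 1], [0, 1], [1, 1]]
    let qi = chessBoard[y][x]
    for (let i = 0; i < 4; i++) {
        let a = 1; let b = 1;
        while (chessBoard[y + vect[i][0] * a] && chessBoard[y + vect[i][0] * a][x + vect[i][1] * a] === qi)
            a++
        while (chessBoard[y - vect[i][0] * b] && chessBoard[y - vect[i][0] * b][x - vect[i][1] * b] === qi)
            b++
        if (a + b > 5) return true
    }
    return false;
}

// Hypothetical 6x6 test board: five stones of player 1 on the main diagonal.
const board = Array.from({ length: 6 }, () => Array(6).fill(0));
for (let i = 0; i < 5; i++) board[i][i] = 1;
board[5][0] = -1; // a single stone of the other player

console.log(isEnd(2, 2, board)); // true: the diagonal through (2, 2) has five stones
console.log(isEnd(0, 5, board)); // false: the stone at x = 0, y = 5 stands alone
```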

getMark: evaluating the current position

Traverse the 4 lines through each cell (horizontal, vertical, and the two diagonals) and add up a score for each run of consecutive stones. The sum serves as the evaluation of the position:

Run       Score     Run        Score
live 5    1<<16     sleep 5    1<<15
live 4    1<<12     sleep 4    1<<11
live 3    1<<8      sleep 3    1<<7
live 2    1<<6      sleep 2    1<<5
other     1

 function getScore(cnt, flag) {
    // A truthy flag selects the lower ("sleep") score, a falsy flag the higher ("live") score.
    flag = flag ? 0 : 1;
    switch (cnt) {
        case 5: return 1 << (15 + flag)
        case 4: return 1 << (11 + flag)
        case 3: return 1 << (7 + flag)
        case 2: return 1 << (5 + flag)
        default: return 1
    }
}
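
The article does not show getMark itself, so the following is only one possible sketch of how the traversal described above could use getScore. Everything except getScore is an assumption: `scoreFor` and `DIRS` are hypothetical names, the board is a 2D array of 1, -1, and 0, a run counts as "sleep" when at least one of its ends is not an empty cell, and a mark could then be taken as the difference between the two players' sums.

```javascript
// Copy of the article's getScore.
function getScore(cnt, flag) {
    flag = flag ? 0 : 1;
    switch (cnt) {
        case 5: return 1 << (15 + flag)
        case 4: return 1 << (11 + flag)
        case 3: return 1 << (7 + flag)
        case 2: return 1 << (5 + flag)
        default: return 1
    }
}

// Hypothetical: the four line directions as [dy, dx].
const DIRS = [[0, 1], [1, 0], [1, 1], [1, -1]];

// Hypothetical: sum the scores of all runs of `player` on a 2D board.
function scoreFor(board, player) {
    const size = board.length;
    const at = (y, x) =>
        (y >= 0 && y < size && x >= 0 && x < size ? board[y][x] : undefined);
    let total = 0;
    for (let y = 0; y < size; y++) {
        for (let x = 0; x < size; x++) {
            if (board[y][x] !== player) continue;
            for (const [dy, dx] of DIRS) {
                // Count each run once, starting from its first stone.
                if (at(y - dy, x - dx) === player) continue;
                let cnt = 1;
                while (at(y + dy * cnt, x + dx * cnt) === player) cnt++;
                // Assumption: blocked when either end is an edge or a non-empty cell.
                const blocked = at(y - dy, x - dx) !== 0 ||
                    at(y + dy * cnt, x + dx * cnt) !== 0;
                total += getScore(Math.min(cnt, 5), blocked);
            }
        }
    }
    return total;
}

// Two stones side by side in the middle of an empty 5x5 board.
const b = Array.from({ length: 5 }, () => Array(5).fill(0));
b[2][1] = 1; b[2][2] = 1;
console.log(scoreFor(b, 1)); // 70: one live 2 (64) plus six single-stone lines (1 each)
console.log(scoreFor(b, 1) - scoreFor(b, -1)); // a possible mark from player 1's side
```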

The minimax decision algorithm

To learn the algorithm itself, you can watch the video at https://www.bilibili.com/video/BV1bT4y1C7P5 directly. What follows describes the Node.js implementation.

Data structure of a node (one node is generated per move)

Property      Description
board         board position
cur           the player who moved last
nxt           the player who moves next
deep          depth of the current node
pos           positions of the moves played so far
mark          evaluation of the current position
children()    child nodes

 class Nod {
    constructor({ board, cur, nxt, deep, pos }) {
        this.board = board;
        this.cur = cur;
        this.nxt = nxt;
        this.deep = deep;
        this.pos = pos;
        this.mark = getMark(this.board);
    }
    children() {
        const arr = [];
        const reg = /0/g;
        while (reg.exec(this.board) != null) {
            arr.push(new Nod({
                board: this.board.slice(0, reg.lastIndex - 1) + this.nxt + this.board.slice(reg.lastIndex),
                cur: this.nxt,
                nxt: this.cur,
                deep: this.deep + 1,
                pos: [...this.pos, reg.lastIndex - 1]
            }));
        }
        return arr;
    }
}
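
The core of children() is the global regular expression: after every successful `exec`, `reg.lastIndex - 1` is the index of the '0' that was just matched. The sketch below isolates that step; `nextBoards` is a hypothetical helper, not part of the original class, and it works on a toy 4-character board.

```javascript
// Hypothetical helper isolating the string manipulation in Nod.children().
function nextBoards(board, stone) {
    const out = [];
    const reg = /0/g; // global flag: exec() resumes from reg.lastIndex
    while (reg.exec(board) !== null) {
        const idx = reg.lastIndex - 1; // index of the empty cell just matched
        out.push({
            idx,
            board: board.slice(0, idx) + stone + board.slice(idx + 1),
        });
    }
    return out;
}

console.log(nextBoards('1020', '2'));
// [ { idx: 1, board: '1220' }, { idx: 3, board: '1022' } ]
```

One child is produced per empty cell, so an empty 16x16 board yields 256 children at the first level.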

minimax

Minimax is an algorithm that chooses the move which minimizes the maximum possible loss.

It is a recursive algorithm over a tree. The parent and the children of every node belong to the opposing player, so the nodes alternate between maximizing nodes (our side) and minimizing nodes (the opponent).

In other words, it selects the line of play that leaves our side with the greatest advantage even when the opponent makes the best decision at every turn.

 function minimax(node, MAXDEEP) {
    if (node.deep >= MAXDEEP) return node;
    let arr = node.children().map(v => minimax(v, MAXDEEP));
    if (node.deep % 2)//min
        return arr.sort((a, b) => a.mark - b.mark)[0]
    else
        return arr.sort((a, b) => b.mark - a.mark)[0]
}
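
To see the alternation between max and min levels, the function can be run on a tiny hand-built tree. The tree below is hypothetical: `leaf` and `inner` are illustration helpers that only provide the `deep`, `mark`, and `children()` members that minimax reads, and the leaf marks are taken from the root player's point of view.

```javascript
// Copy of the article's minimax.
function minimax(node, MAXDEEP) {
    if (node.deep >= MAXDEEP) return node;
    let arr = node.children().map(v => minimax(v, MAXDEEP));
    if (node.deep % 2)//min
        return arr.sort((a, b) => a.mark - b.mark)[0]
    else
        return arr.sort((a, b) => b.mark - a.mark)[0]
}

// Hypothetical stand-ins for Nod objects.
const leaf = (mark) => ({ deep: 2, mark, children: () => [] });
const inner = (deep, kids) => ({ deep, mark: 0, children: () => kids });

const root = inner(0, [
    inner(1, [leaf(3), leaf(5)]), // the opponent would answer with 3
    inner(1, [leaf(2), leaf(9)]), // the opponent would answer with 2
]);

const best = minimax(root, 2);
console.log(best.mark); // 3: the better of the two worst cases
```

The second branch contains the highest leaf (9), but the opponent would never allow it, so the first branch is chosen.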

negamax

The negamax algorithm is a small refinement proposed nearly 20 years after minimax. It does not change what the program computes or how efficiently it runs; the only difference is that it looks more concise. Minimax takes the maximum at one level and the minimum at the next, whereas negamax always takes the maximum and negates the value that comes back from the child, relying on max(a, b) = -min(-a, -b). The same negation is applied to the bounds (for example -alpha) when they are passed down as parameters. In effect it still alternates between a "positive maximum" and a "negative maximum": the principle is the same, only the implementation differs.

 function negamax(node, MAXDEEP) {
    if (node.deep >= MAXDEEP) return node;
    let arr = node.children();
    let bestV = arr.map(v => {
        const n = negamax(v, MAXDEEP);
        n.mark = -n.mark
        return n;
    }).sort((a, b) => b.mark - a.mark)[0];
    return bestV;
}
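
The equivalence can be checked on plain numbers. The sketch below uses hypothetical functions (`negamaxValue`, `minimaxValue`) over nested arrays, where a number is a leaf and an array is an inner node. It assumes that leaf values are written from the point of view of the player to move at the leaf; at an even depth, as in this example, that is the same player as at the root.

```javascript
// Hypothetical value-only negamax: always maximize, negate the child's result.
function negamaxValue(tree) {
    if (typeof tree === 'number') return tree;
    return Math.max(...tree.map(child => -negamaxValue(child)));
}

// Hypothetical value-only minimax for comparison.
function minimaxValue(tree, maximizing = true) {
    if (typeof tree === 'number') return tree;
    const vals = tree.map(child => minimaxValue(child, !maximizing));
    return maximizing ? Math.max(...vals) : Math.min(...vals);
}

const tree = [[3, 5], [2, 9]]; // depth 2, same shape as a MAXDEEP = 2 search
console.log(negamaxValue(tree), minimaxValue(tree)); // 3 3
```

With leaves at an odd depth the two functions only agree if the leaf values are negated first, which is why the perspective of the evaluation matters in a negamax search.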

minimax with alpha-beta pruning

In the minimax algorithm above, the MIN and MAX steps expand every possibility in the search tree and only then back up values from the leaf evaluations, which is very inefficient. Alpha-beta pruning improves efficiency by skipping evaluations that cannot affect the result.

The strategy is a depth-first search. When a generated node reaches the specified depth, it is statically evaluated immediately. As soon as the backed-up value of a non-leaf node can be determined, it is assigned right away, which saves the cost of expanding that node's remaining branches.

 const MAXN = 1 << 28;
const MINN = -MAXN;
function abminimax(node, MAXDEEP, a, b) {
    if (node.deep >= MAXDEEP) return node
    // childrenit() is not shown in this article; presumably an iterator counterpart of children().
    let arr = node.childrenit();
    let bestV;
    if (node.deep % 2) {//min
        bestV = { mark: MAXN };
        for (let child of arr) {
            const val = abminimax(child, MAXDEEP, a, b)
            if (val.mark < bestV.mark) {
                bestV = val;
                b = Math.min(b, bestV.mark) // a min node can only lower beta
                if (a >= b) break;
            }
        }
    } else {//max
        bestV = { mark: MINN };
        for (let child of arr) {
            const val = abminimax(child, MAXDEEP, a, b)
            if (val.mark > bestV.mark) {
                bestV = val;
                a = Math.max(a, bestV.mark) // a max node raises alpha, not beta
                if (a >= b) break;
            }
        }
    }
    return bestV;
}
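
The effect of the cut-off can be made visible by counting how many leaves are actually evaluated. The sketch below is a value-only illustration over nested arrays, not the article's node-based function: `alphabeta` and `stats` are hypothetical names, and a number stands for a leaf evaluation.

```javascript
// Hypothetical value-only alpha-beta that counts evaluated leaves.
function alphabeta(tree, a, b, maximizing, stats) {
    if (typeof tree === 'number') {
        stats.leaves++;
        return tree;
    }
    let best = maximizing ? -Infinity : Infinity;
    for (const child of tree) {
        const v = alphabeta(child, a, b, !maximizing, stats);
        if (maximizing) {
            best = Math.max(best, v);
            a = Math.max(a, best); // max nodes raise the lower bound
        } else {
            best = Math.min(best, v);
            b = Math.min(b, best); // min nodes lower the upper bound
        }
        if (a >= b) break; // the remaining siblings cannot change the result
    }
    return best;
}

const stats = { leaves: 0 };
const value = alphabeta([[3, 5], [2, 9]], -Infinity, Infinity, true, stats);
console.log(value, stats.leaves); // 3 3
```

After the first branch guarantees 3, the second branch is abandoned as soon as its first leaf (2) shows that the opponent can hold it below 3, so the leaf 9 is never evaluated.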

negamax with alpha-beta pruning

This combines the advantages of both: the code is simplified by negamax and the search is optimized by pruning.

 const MINN = -(1 << 28);
const MAXDEEP = 2;
function abnegamax(node, a, b) {
    if (node.deep >= MAXDEEP) return node
    // childrenit() is not shown in this article; presumably an iterator counterpart of children().
    let arr = node.childrenit();
    let bestV = { mark: MINN };
    for (let child of arr) {
        const val = abnegamax(child, -b, -Math.max(a, bestV.mark));
        val.mark = -val.mark
        if (val.mark > bestV.mark) {
            bestV = val;
            if (bestV.mark >= b) break;
        }
    }

    return bestV;
}
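
The way the window is negated and swapped can be traced with a value-only version of the same loop. This is a sketch with hypothetical names (`abnegamaxValue`, `counter`), working on nested arrays where a number is a leaf whose value is assumed to be from the point of view of the player to move at that leaf.

```javascript
// Hypothetical value-only negamax with alpha-beta pruning.
function abnegamaxValue(tree, a, b, counter) {
    if (typeof tree === 'number') {
        counter.leaves++;
        return tree;
    }
    let best = -Infinity;
    for (const child of tree) {
        // The child sees the window from the other side: (-b, -max(a, best)).
        const v = -abnegamaxValue(child, -b, -Math.max(a, best), counter);
        if (v > best) {
            best = v;
            if (best >= b) break; // cut-off: the opponent will avoid this node
        }
    }
    return best;
}

const counter = { leaves: 0 };
const result = abnegamaxValue([[3, 5], [2, 9]], -Infinity, Infinity, counter);
console.log(result, counter.leaves); // 3 3
```

The result and the number of evaluated leaves match the separate max/min formulation on the same tree; only one branch of code is needed because every level maximizes the negated value of its children.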

source code

https://github.com/Seasonley/game/tree/main/five-in-a-row


seasonley