Author: scwang18, mainly responsible for technical architecture, with extensive research in the container cloud space.
Background
wiki.js is an excellent open-source wiki system. Its feature set is currently less complete than XWiki's, but it is improving steadily. It covers wiki writing, sharing, and permission management well, and its UI design is very polished, which is enough for the basic knowledge-management needs of a small team.
The following work assumes KubeSphere 3.2.1 and Helm 3 are already deployed.
The official KubeSphere website has very detailed documentation on deploying it, so that is not repeated here:
https://kubesphere.com.cn/docs/installing-on-linux/introduction/multioverview/
Prepare the StorageClass
We use OpenEBS for storage. With the Local StorageClass installed by default by OpenEBS, the PV content is deleted automatically when the Pod is destroyed, which is not suitable for our database storage. So we make a slight modification to the Local StorageClass and create a new StorageClass that retains PV content after the Pod is destroyed, leaving cleanup as a manual decision.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: localretain
  annotations:
    cas.openebs.io/config: |
      - name: StorageType
        value: "hostpath"
      - name: BasePath
        value: "/var/openebs/localretain/"
    openebs.io/cas-type: local
    storageclass.beta.kubernetes.io/is-default-class: "false"
    storageclass.kubesphere.io/supported-access-modes: '["ReadWriteOnce"]'
provisioner: openebs.io/local
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
Deploy the PostgreSQL database
PostgreSQL is also needed by other projects in our team. To improve utilization and unify management, we deploy PostgreSQL independently and configure wiki.js to use it as an external database at installation time.
Prepare username and password configuration
We use a Secret to store sensitive information such as the PostgreSQL user password.
kind: Secret
apiVersion: v1
metadata:
  name: postgres-prod
data:
  POSTGRES_PASSWORD: xxxx
type: Opaque
The POSTGRES_PASSWORD above must be replaced with your own password, base64-encoded.
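To produce the base64 value, encode the password locally. A quick sketch using a placeholder password (replace mypassword with your own):

```shell
# -n keeps the trailing newline out of the encoded value
echo -n 'mypassword' | base64
# → bXlwYXNzd29yZA==

# decode to double-check the round trip
echo -n 'bXlwYXNzd29yZA==' | base64 -d
# → mypassword
```

Paste the encoded string as the value of POSTGRES_PASSWORD in the Secret.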
Prepare the database initialization script
We use a ConfigMap to hold the database initialization script. When the database Deployment is created, the script from the ConfigMap is mounted at /docker-entrypoint-initdb.d, and the postgres image runs it automatically the first time the container initializes an empty data directory.
apiVersion: v1
kind: ConfigMap
metadata:
  name: wikijs-postgres-init
data:
  init.sql: |-
    CREATE DATABASE wikijs;
    CREATE USER wikijs WITH PASSWORD 'xxxx';
    GRANT CONNECT ON DATABASE wikijs TO wikijs;
    GRANT USAGE ON SCHEMA public TO wikijs;
    GRANT SELECT, UPDATE, INSERT, DELETE ON ALL TABLES IN SCHEMA public TO wikijs;
    ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO wikijs;
The password for the wikijs user above must be replaced with your own; note that it is stored here in plain text.
Prepare the storage
We use OpenEBS, which KubeSphere installs by default, to provide storage services. Persistent storage is provisioned by creating a PVC.
Here we declare a 10Gi PVC:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: postgres-prod-data
  finalizers:
    - kubernetes.io/pvc-protection
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: localretain
  volumeMode: Filesystem
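Because localretain uses volumeBindingMode: WaitForFirstConsumer, the PVC will show Pending until a Pod actually mounts it; that is expected, not an error. A quick status check (a sketch; the filename is hypothetical, the resource names match the manifests above):

```shell
kubectl apply -f postgres-pvc.yaml   # hypothetical filename for the PVC manifest above
kubectl get pvc postgres-prod-data   # STATUS stays Pending until a consuming Pod is scheduled
kubectl get sc localretain           # confirm RECLAIMPOLICY is Retain
```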
Deploy the PostgreSQL database
With the configuration and storage prepared in the previous steps, we can now deploy the PostgreSQL service.
Our Kubernetes cluster has no storage array and uses OpenEBS for storage, so we deploy PostgreSQL as a single-replica Deployment.
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: postgres-prod
  name: postgres-prod
spec:
  replicas: 1
  selector:
    matchLabels:
      app: postgres-prod
  template:
    metadata:
      labels:
        app: postgres-prod
    spec:
      containers:
        - name: db
          imagePullPolicy: IfNotPresent
          image: 'abcfy2/zhparser:12-alpine'
          ports:
            - name: tcp-5432
              protocol: TCP
              containerPort: 5432
          envFrom:
            - secretRef:
                name: postgres-prod
          volumeMounts:
            - name: postgres-prod-data
              readOnly: false
              mountPath: /var/lib/postgresql/data
            - name: wikijs-postgres-init
              readOnly: true
              mountPath: /docker-entrypoint-initdb.d
      volumes:
        - name: postgres-prod-data
          persistentVolumeClaim:
            claimName: postgres-prod-data
        - name: wikijs-postgres-init
          configMap:
            name: wikijs-postgres-init
Create a Service so that other Pods can access the database:
apiVersion: v1
kind: Service
metadata:
  name: postgres-prod
spec:
  selector:
    app: postgres-prod
  ports:
    - protocol: TCP
      port: 5432
      targetPort: tcp-5432
This completes the PostgreSQL deployment. Before moving on, it is worth a quick test to confirm the database is reachable.
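A minimal sketch of such a test: start a throwaway psql client Pod and query the server through the Service (pg-test is just a temporary Pod name; the password is the one from the Secret, in plain text here):

```shell
kubectl run pg-test --rm -it --restart=Never \
  --image=postgres:12-alpine \
  --env='PGPASSWORD=xxxx' \
  -- psql -h postgres-prod -U postgres -c 'SELECT version();'
```

If the deployment is healthy, this prints the PostgreSQL version string and the Pod is deleted afterwards.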
Deploy wiki.js
Prepare username and password configuration
We use Secret to store sensitive information such as username and password that wiki.js uses to connect to the database.
apiVersion: v1
kind: Secret
metadata:
  name: wikijs
data:
  DB_USER: d2lraWpz
  DB_PASS: xxxx
type: Opaque
The DB_PASS above must be replaced with your own base64-encoded password; DB_USER is the base64 encoding of wikijs.
Prepare the database connection configuration
We use a ConfigMap to store the database connection information for wiki.js. DB_HOST uses the <service>.<namespace> DNS form; here it points to the postgres-prod Service in the infra namespace.
apiVersion: v1
kind: ConfigMap
metadata:
  name: wikijs
data:
  DB_TYPE: postgres
  DB_HOST: postgres-prod.infra
  DB_PORT: "5432"
  DB_NAME: wikijs
  HA_ACTIVE: "true"
Create the database user and database
If the wikijs user and database have not already been created in PostgreSQL (for example, because the init script did not run), do it manually:
Connect to PostgreSQL with a database client and execute the following SQL statements to create and authorize the database and user.
CREATE DATABASE wikijs;
CREATE USER wikijs WITH PASSWORD 'xxxx';
GRANT CONNECT ON DATABASE wikijs TO wikijs;
GRANT USAGE ON SCHEMA public TO wikijs;
GRANT SELECT, UPDATE, INSERT, DELETE ON ALL TABLES IN SCHEMA public TO wikijs;
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO wikijs;
The wikijs password above should be replaced with your own.
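Instead of a GUI database tool, you can also reach the in-cluster database from your workstation via a port-forward (a sketch; 15432 is an arbitrary local port and init.sql a hypothetical local file containing the statements above):

```shell
# forward the in-cluster Service to a local port in the background
kubectl port-forward svc/postgres-prod 15432:5432 &
# run the SQL against the forwarded port as the postgres superuser
psql -h 127.0.0.1 -p 15432 -U postgres -f init.sql
```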
Prepare the yaml deployment file for wiki.js
The yaml file for deploying wiki.js in Deployment mode is as follows:
# wikijs-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: wikijs
  name: wikijs
spec:
  replicas: 1
  selector:
    matchLabels:
      app: wikijs
  template:
    metadata:
      labels:
        app: wikijs
    spec:
      containers:
        - name: wikijs
          image: 'requarks/wiki:2'
          ports:
            - name: http-3000
              protocol: TCP
              containerPort: 3000
          envFrom:
            - secretRef:
                name: wikijs
            - configMapRef:
                name: wikijs
Create a Service to access wiki.js within the cluster
# wikijs-svc.yaml
apiVersion: v1
kind: Service
metadata:
  name: wikijs
spec:
  selector:
    app: wikijs
  ports:
    - protocol: TCP
      port: 3000
      targetPort: http-3000
Create an Ingress accessible outside the cluster
# wikijs-ing.yaml
kind: Ingress
apiVersion: networking.k8s.io/v1
metadata:
  name: wikijs
spec:
  ingressClassName: nginx
  rules:
    - host: wiki.xxxx.cn
      http:
        paths:
          - path: /
            pathType: ImplementationSpecific
            backend:
              service:
                name: wikijs
                port:
                  number: 3000
The host domain name above needs to be replaced with your own.
Execute the deployment
$ kubectl apply -f wikijs-deploy.yaml
$ kubectl apply -f wikijs-svc.yaml
$ kubectl apply -f wikijs-ing.yaml
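To verify the rollout, check the created objects and probe the Ingress host (replace wiki.xxxx.cn with your own domain; it must resolve to the ingress controller):

```shell
# name filter applies across all three resource types
kubectl get deploy,svc,ingress wikijs
# a 200/30x response means the ingress is routing to wiki.js
curl -I http://wiki.xxxx.cn/
```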
Configure wiki.js to support Chinese full-text search
wiki.js supports full-text search backed by either PostgreSQL or Elasticsearch. PostgreSQL is the more lightweight option, so this project uses PostgreSQL's full-text search.
However, PostgreSQL does not support Chinese word segmentation out of the box, so an additional plugin must be installed and configured. The following steps enable Chinese-segmented, PostgreSQL-based full-text search for wiki.js.
Grant the wikijs user temporary superuser privileges
Using a database management tool, log in as a PostgreSQL user with superuser privileges and temporarily grant superuser to the wikijs user, so that the Chinese word-segmentation extensions can be installed.
ALTER USER wikijs WITH SUPERUSER;
Enable Chinese word segmentation in the database
Using a database management tool, log in to PostgreSQL as the wikijs user and execute the following commands to enable Chinese word segmentation:
CREATE EXTENSION pg_trgm;
CREATE EXTENSION zhparser;
CREATE TEXT SEARCH CONFIGURATION pg_catalog.chinese_zh (PARSER = zhparser);
ALTER TEXT SEARCH CONFIGURATION chinese_zh ADD MAPPING FOR n,v,a,i,e,l WITH simple;
-- Ignore punctuation
ALTER ROLE wikijs SET zhparser.punctuation_ignore = ON;
-- Compound short words
ALTER ROLE wikijs SET zhparser.multi_short = ON;
-- Quick test
SELECT ts_debug('chinese_zh', '青春是最美好的年岁,青春是最灿烂的日子。每一个人的青春都无比宝贵,宝贵的青春只有与奋斗为伴才最闪光、最出彩。');
Revoke the wikijs user's temporary superuser privileges
Log in to PostgreSQL as the wikijs user (still a superuser at this point) and revoke the superuser privilege:
ALTER USER wikijs WITH NOSUPERUSER;
Create a configuration ConfigMap that supports Chinese word segmentation
# zh-parse.yaml
kind: ConfigMap
apiVersion: v1
metadata:
  name: wikijs-zhparser
data:
  definition.yml: |-
    key: postgres
    title: Database - PostgreSQL
    description: Advanced PostgreSQL-based search engine.
    author: requarks.io
    logo: https://static.requarks.io/logo/postgresql.svg
    website: https://www.requarks.io/
    isAvailable: true
    props:
      dictLanguage:
        type: String
        title: Dictionary Language
        hint: Language to use when creating and querying text search vectors.
        default: english
        enum:
          - simple
          - danish
          - dutch
          - english
          - finnish
          - french
          - german
          - hungarian
          - italian
          - norwegian
          - portuguese
          - romanian
          - russian
          - spanish
          - swedish
          - turkish
          - chinese_zh
        order: 1
  engine.js: |-
    const tsquery = require('pg-tsquery')()
    const stream = require('stream')
    const Promise = require('bluebird')
    const pipeline = Promise.promisify(stream.pipeline)

    /* global WIKI */

    module.exports = {
      async activate() {
        if (WIKI.config.db.type !== 'postgres') {
          throw new WIKI.Error.SearchActivationFailed('Must use PostgreSQL database to activate this engine!')
        }
      },
      async deactivate() {
        WIKI.logger.info(`(SEARCH/POSTGRES) Dropping index tables...`)
        await WIKI.models.knex.schema.dropTable('pagesWords')
        await WIKI.models.knex.schema.dropTable('pagesVector')
        WIKI.logger.info(`(SEARCH/POSTGRES) Index tables have been dropped.`)
      },
      /**
       * INIT
       */
      async init() {
        WIKI.logger.info(`(SEARCH/POSTGRES) Initializing...`)
        // -> Create Search Index
        const indexExists = await WIKI.models.knex.schema.hasTable('pagesVector')
        if (!indexExists) {
          WIKI.logger.info(`(SEARCH/POSTGRES) Creating Pages Vector table...`)
          await WIKI.models.knex.schema.createTable('pagesVector', table => {
            table.increments()
            table.string('path')
            table.string('locale')
            table.string('title')
            table.string('description')
            table.specificType('tokens', 'TSVECTOR')
            table.text('content')
          })
        }
        // -> Create Words Index
        const wordsExists = await WIKI.models.knex.schema.hasTable('pagesWords')
        if (!wordsExists) {
          WIKI.logger.info(`(SEARCH/POSTGRES) Creating Words Suggestion Index...`)
          await WIKI.models.knex.raw(`
            CREATE TABLE "pagesWords" AS SELECT word FROM ts_stat(
              'SELECT to_tsvector(''simple'', "title") || to_tsvector(''simple'', "description") || to_tsvector(''simple'', "content") FROM "pagesVector"'
            )`)
          await WIKI.models.knex.raw('CREATE EXTENSION IF NOT EXISTS pg_trgm')
          await WIKI.models.knex.raw(`CREATE INDEX "pageWords_idx" ON "pagesWords" USING GIN (word gin_trgm_ops)`)
        }
        WIKI.logger.info(`(SEARCH/POSTGRES) Initialization completed.`)
      },
      /**
       * QUERY
       *
       * @param {String} q Query
       * @param {Object} opts Additional options
       */
      async query(q, opts) {
        try {
          let suggestions = []
          let qry = `
            SELECT id, path, locale, title, description
            FROM "pagesVector", to_tsquery(?,?) query
            WHERE (query @@ "tokens" OR path ILIKE ?)
          `
          let qryEnd = `ORDER BY ts_rank(tokens, query) DESC`
          let qryParams = [this.config.dictLanguage, tsquery(q), `%${q.toLowerCase()}%`]
          if (opts.locale) {
            qry = `${qry} AND locale = ?`
            qryParams.push(opts.locale)
          }
          if (opts.path) {
            qry = `${qry} AND path ILIKE ?`
            qryParams.push(`%${opts.path}`)
          }
          const results = await WIKI.models.knex.raw(`
            ${qry}
            ${qryEnd}
          `, qryParams)
          if (results.rows.length < 5) {
            const suggestResults = await WIKI.models.knex.raw(`SELECT word, word <-> ? AS rank FROM "pagesWords" WHERE similarity(word, ?) > 0.2 ORDER BY rank LIMIT 5;`, [q, q])
            suggestions = suggestResults.rows.map(r => r.word)
          }
          return {
            results: results.rows,
            suggestions,
            totalHits: results.rows.length
          }
        } catch (err) {
          WIKI.logger.warn('Search Engine Error:')
          WIKI.logger.warn(err)
        }
      },
      /**
       * CREATE
       *
       * @param {Object} page Page to create
       */
      async created(page) {
        await WIKI.models.knex.raw(`
          INSERT INTO "pagesVector" (path, locale, title, description, "tokens") VALUES (
            ?, ?, ?, ?, (setweight(to_tsvector('${this.config.dictLanguage}', ?), 'A') || setweight(to_tsvector('${this.config.dictLanguage}', ?), 'B') || setweight(to_tsvector('${this.config.dictLanguage}', ?), 'C'))
          )
        `, [page.path, page.localeCode, page.title, page.description, page.title, page.description, page.safeContent])
      },
      /**
       * UPDATE
       *
       * @param {Object} page Page to update
       */
      async updated(page) {
        await WIKI.models.knex.raw(`
          UPDATE "pagesVector" SET
            title = ?,
            description = ?,
            tokens = (setweight(to_tsvector('${this.config.dictLanguage}', ?), 'A') ||
              setweight(to_tsvector('${this.config.dictLanguage}', ?), 'B') ||
              setweight(to_tsvector('${this.config.dictLanguage}', ?), 'C'))
          WHERE path = ? AND locale = ?
        `, [page.title, page.description, page.title, page.description, page.safeContent, page.path, page.localeCode])
      },
      /**
       * DELETE
       *
       * @param {Object} page Page to delete
       */
      async deleted(page) {
        await WIKI.models.knex('pagesVector').where({
          locale: page.localeCode,
          path: page.path
        }).del().limit(1)
      },
      /**
       * RENAME
       *
       * @param {Object} page Page to rename
       */
      async renamed(page) {
        await WIKI.models.knex('pagesVector').where({
          locale: page.localeCode,
          path: page.path
        }).update({
          locale: page.destinationLocaleCode,
          path: page.destinationPath
        })
      },
      /**
       * REBUILD INDEX
       */
      async rebuild() {
        WIKI.logger.info(`(SEARCH/POSTGRES) Rebuilding Index...`)
        await WIKI.models.knex('pagesVector').truncate()
        await WIKI.models.knex('pagesWords').truncate()
        await pipeline(
          WIKI.models.knex.column('path', 'localeCode', 'title', 'description', 'render').select().from('pages').where({
            isPublished: true,
            isPrivate: false
          }).stream(),
          new stream.Transform({
            objectMode: true,
            transform: async (page, enc, cb) => {
              const content = WIKI.models.pages.cleanHTML(page.render)
              await WIKI.models.knex.raw(`
                INSERT INTO "pagesVector" (path, locale, title, description, "tokens", content) VALUES (
                  ?, ?, ?, ?, (setweight(to_tsvector('${this.config.dictLanguage}', ?), 'A') || setweight(to_tsvector('${this.config.dictLanguage}', ?), 'B') || setweight(to_tsvector('${this.config.dictLanguage}', ?), 'C')), ?
                )
              `, [page.path, page.localeCode, page.title, page.description, page.title, page.description, content, content])
              cb()
            }
          })
        )
        await WIKI.models.knex.raw(`
          INSERT INTO "pagesWords" (word)
          SELECT word FROM ts_stat(
            'SELECT to_tsvector(''simple'', "title") || to_tsvector(''simple'', "description") || to_tsvector(''simple'', "content") FROM "pagesVector"'
          )
        `)
        WIKI.logger.info(`(SEARCH/POSTGRES) Index rebuilt successfully.`)
      }
    }
Update the wikijs Deployment
The PostgreSQL-based full-text search engine of wiki.js lives in /wiki/server/modules/search/postgres inside the container, so we mount the ConfigMap configured above over this directory.
# wikijs-zh.yaml
kind: Deployment
apiVersion: apps/v1
metadata:
  name: wikijs
  labels:
    app: wikijs
spec:
  replicas: 1
  selector:
    matchLabels:
      app: wikijs
  template:
    metadata:
      labels:
        app: wikijs
    spec:
      volumes:
        - name: volume-dysh4f
          configMap:
            name: wikijs-zhparser
            defaultMode: 420
      containers:
        - name: wikijs
          image: 'requarks/wiki:2'
          ports:
            - name: http-3000
              containerPort: 3000
              protocol: TCP
          envFrom:
            - secretRef:
                name: wikijs
            - configMapRef:
                name: wikijs
          volumeMounts:
            - name: volume-dysh4f
              readOnly: true
              mountPath: /wiki/server/modules/search/postgres
Configure wiki.js to enable PostgreSQL-based full-text search
Apply the new ConfigMap and Deployment files:
$ kubectl apply -f zh-parse.yaml
$ kubectl apply -f wikijs-zh.yaml
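Before touching the UI, you can confirm that the ConfigMap actually replaced the search module files (a sketch; the deploy/ exec target assumes a reasonably recent kubectl):

```shell
kubectl exec deploy/wikijs -- ls /wiki/server/modules/search/postgres
# should list definition.yml and engine.js from the wikijs-zhparser ConfigMap
```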
- Open the wiki.js Administration page
- Click Search Engine
- Select Database - PostgreSQL
- Select chinese_zh from the Dictionary Language drop-down menu
- Click Apply, then rebuild the index
- The configuration is complete
Summary
The wiki.js deployment described in this article supports Chinese full-text search by integrating PostgreSQL with the zhparser Chinese word-segmentation plugin.
Compared with the standard wiki.js installation process, the main changes are:
- The PostgreSQL image is abcfy2/zhparser:12-alpine, which ships with the zhparser Chinese word-segmentation plugin.
- A ConfigMap is mounted into the wiki.js container to overwrite the PostgreSQL search engine configuration in the original Docker image, adding support for the chinese_zh option.
This article is published via OpenWrite, a multi-channel blog publishing platform.