Abstract: This article takes you, starting from scratch, through the application of quantum neural networks in natural language processing.

This article is shared from the Huawei Cloud community post "Experience the Application of Quantum Neural Networks in Natural Language Processing", by the original author JeffDing.

This article will take you, from scratch, through the application of quantum neural networks in natural language processing.

1. Operating environment

CPU:Intel(R) Core(TM) i7-4712MQ CPU @ 2.30GHz

Memory: 4GB

Operating system: Ubuntu 20.10

MindSpore version: 1.2

2. Install MindSpore

Refer to the installation document on the official website: https://www.mindspore.cn/install/

To install MindQuantum, refer to: https://gitee.com/mindspore/mindquantum/blob/r0.1/README_CN.md

Check the installed version via mindspore.__version__.
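For example, from a Python session:

import mindspore
print(mindspore.__version__)   # should report the 1.2 release used in this article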

3. Experience the application of quantum neural networks in natural language processing

1. Environment preparation

# Import packages
import numpy as np
import time
from projectq.ops import QubitOperator
import mindspore.ops as ops
import mindspore.dataset as ds
from mindspore import nn
from mindspore.train.callback import LossMonitor
from mindspore import Model
from mindquantum.nn import MindQuantumLayer
from mindquantum import Hamiltonian, Circuit, RX, RY, X, H, UN

# Data preprocessing
def GenerateWordDictAndSample(corpus, window=2):
    all_words = corpus.split()
    word_set = list(set(all_words))
    word_set.sort()
    word_dict = {w: i for i, w in enumerate(word_set)}
    sampling = []
    for index, word in enumerate(all_words[window:-window]):
        around = []
        for i in range(index, index + 2*window + 1):
            if i != index + window:
                around.append(all_words[i])
        sampling.append([around, all_words[index + window]])
    return word_dict, sampling

word_dict, sample = GenerateWordDictAndSample("I love natural language processing")
print(word_dict)
print('word dict size: ', len(word_dict))
print('samples: ', sample)
print('number of samples: ', len(sample))

operation result:

[Note] The current simulator thread is 1. If your simulation speed is slow, please set OMP_NUM_THREADS to an appropriate number according to your model.
{'I': 0, 'language': 1, 'love': 2, 'natural': 3, 'processing': 4}
word dict size: 5
samples: [[['I', 'love', 'language', 'processing'], 'natural']]
number of samples: 1

From the above output we can see that the dictionary of this sentence has 5 words, from which one sample point can be generated.

2. Encoder circuit

# Define the encoder circuit
def GenerateEncoderCircuit(n_qubits, prefix=''):
    if len(prefix) != 0 and prefix[-1] != '_':
        prefix += '_'
    circ = Circuit()
    for i in range(n_qubits):
        circ += RX(prefix + str(i)).on(i)
    return circ

GenerateEncoderCircuit(3, prefix='e')

operation result:

RX(e_0|0)
RX(e_1|1)
RX(e_2|2)

We usually use |0⟩ and |1⟩ to denote the two states of a two-level qubit. By the principle of state superposition, a qubit can also be in a superposition of these two states:

|ψ⟩ = α|0⟩ + β|1⟩

An n-qubit quantum state lives in a 2^n-dimensional Hilbert space. For the five-word dictionary above, we only need ⌈log₂5⌉ = 3 qubits to complete the encoding, which also reflects the superiority of quantum computing.

For example, for the word "love" in the dictionary above, the corresponding label is 2, whose binary representation is 010. We only need to set the encoder parameters e_0, e_1 and e_2 to 0, π and 0 respectively.

# Verify the encoding through the Evolution operator
from mindquantum.nn import generate_evolution_operator
from mindspore import context
from mindspore import Tensor

n_qubits = 3                                                    # number of qubits of this quantum circuit
label = 2                                                       # label to encode
label_bin = bin(label)[-1:1:-1].ljust(n_qubits, '0')            # binary form of label
label_array = np.array([int(i)*np.pi for i in label_bin]).astype(np.float32)   # parameter values of encoder
encoder = GenerateEncoderCircuit(n_qubits, prefix='e')          # encoder circuit
encoder_para_names = encoder.parameter_resolver().para_name     # parameter names of encoder

print("Label is: ", label)
print("Binary label is: ", label_bin)
print("Parameters of encoder is: \n", np.round(label_array, 5))
print("Encoder circuit is: \n", encoder)
print("Encoder parameter names are: \n", encoder_para_names)

context.set_context(mode=context.GRAPH_MODE, device_target="CPU")
# quantum state evolution operator
evol = generate_evolution_operator(param_names=encoder_para_names, circuit=encoder)
state = evol(Tensor(label_array))
state = state.asnumpy()
quantum_state = state[:, 0] + 1j * state[:, 1]
amp = np.round(np.abs(quantum_state)**2, 3)

print("Amplitude of quantum state is: \n", amp)
print("Label in quantum state is: ", np.argmax(amp))

operation result:

Label is: 2
Binary label is: 010
Parameters of encoder is:
[0. 3.14159 0. ]
Encoder circuit is:
RX(e_0|0)
RX(e_1|1)
RX(e_2|2)
Encoder parameter names are:
['e_0', 'e_1', 'e_2']
Amplitude of quantum state is:
[0. 0. 1. 0. 0. 0. 0. 0.]
Label in quantum state is: 2
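As an extra sanity check, the same amplitude pattern can be reproduced with plain NumPy. Below is a minimal sketch assuming the little-endian qubit ordering seen in the output above (qubit 0 is the least significant bit of the state index):

import numpy as np

def rx_on_zero(theta):
    # RX(theta) applied to |0>: [cos(theta/2), -1j*sin(theta/2)]
    return np.array([np.cos(theta / 2), -1j * np.sin(theta / 2)])

angles = [0.0, np.pi, 0.0]               # e_0, e_1, e_2 for label 2 ('love')
q0, q1, q2 = (rx_on_zero(t) for t in angles)
state = np.kron(np.kron(q2, q1), q0)     # qubit 0 is the least significant bit
print(np.round(np.abs(state) ** 2, 3))   # [0. 0. 1. 0. 0. 0. 0. 0.]
print(np.argmax(np.abs(state) ** 2))     # 2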

Through the above verification, we find that for data with label 2, the amplitude of the resulting quantum state is concentrated at position 2, so the prepared quantum state is exactly the encoding of the input label. We summarize the process of generating encoder parameter values from the data into the following function.

def GenerateTrainData(sample, word_dict):
    n_qubits = int(np.ceil(np.log2(1 + max(word_dict.values()))))
    data_x = []
    data_y = []
    for around, center in sample:
        data_x.append([])
        for word in around:
            label = word_dict[word]
            label_bin = bin(label)[-1:1:-1].ljust(n_qubits,'0')
            label_array = [int(i)*np.pi for i in label_bin]
            data_x[-1].extend(label_array)
        data_y.append(word_dict[center])
    return np.array(data_x).astype(np.float32), np.array(data_y).astype(np.int32)
GenerateTrainData(sample, word_dict)

operation result:

(array([[0.       , 0.       , 0.       , 0.       , 3.1415927, 0.       ,
         3.1415927, 0.       , 0.       , 0.       , 0.       , 3.1415927]],
       dtype=float32),
 array([3], dtype=int32))

As the above result shows, the encodings of the 4 context words are concatenated into one longer vector, which is convenient for the subsequent neural network to consume.
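A quick check on the shapes, reusing the objects defined above: each of the 2 × window = 4 context words contributes n_qubits = 3 rotation angles, so every training sample is a vector of length 12.

data_x, data_y = GenerateTrainData(sample, word_dict)
print(data_x.shape)   # (1, 12): 4 context words x 3 encoder angles each
print(data_y.shape)   # (1,): one center-word label per sample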

3. Ansatz circuit

# Define the following function to generate the Ansatz circuit
def GenerateAnsatzCircuit(n_qubits, layers, prefix=''):
    if len(prefix) != 0 and prefix[-1] != '_':
        prefix += '_'
    circ = Circuit()
    for l in range(layers):
        for i in range(n_qubits):
            circ += RY(prefix + str(l) + '_' + str(i)).on(i)
        for i in range(l % 2, n_qubits, 2):
            if i < n_qubits and i + 1 < n_qubits:
                circ += X.on(i + 1, i)
    return circ

GenerateAnsatzCircuit(5, 2, 'a')

operation result:

RY(a_0_0|0)
RY(a_0_1|1)
RY(a_0_2|2)
RY(a_0_3|3)
RY(a_0_4|4)
X(1 <-: 0)
X(3 <-: 2)
RY(a_1_0|0)
RY(a_1_1|1)
RY(a_1_2|2)
RY(a_1_3|3)
RY(a_1_4|4)
X(2 <-: 1)
X(4 <-: 3)
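Each layer places one trainable RY rotation on every qubit, followed by a ladder of CNOT gates (the X gates controlled on a neighbouring qubit shown above), so this ansatz carries layers × n_qubits = 2 × 5 = 10 trainable parameters. A small check, assuming the same parameter_resolver() interface used for the encoder earlier:

ansatz = GenerateAnsatzCircuit(5, 2, 'a')
print(ansatz.parameter_resolver().para_name)       # ['a_0_0', ..., 'a_1_4']
print(len(ansatz.parameter_resolver().para_name))  # 10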

4. Measurement

def GenerateEmbeddingHamiltonian(dims, n_qubits):
    hams = []
    for i in range(dims):
        s = ''
        for j, k in enumerate(bin(i + 1)[-1:1:-1]):
            if k == '1':
                s = s + 'Z' + str(j) + ' '
        hams.append(Hamiltonian(QubitOperator(s)))
    return hams

GenerateEmbeddingHamiltonian(5, 5)

operation result:

[1.0 Z0, 1.0 Z1, 1.0 Z0 Z1, 1.0 Z2, 1.0 Z0 Z2]
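The i-th measurement Hamiltonian is a product of Pauli-Z operators on the qubits whose bit is set in the binary representation of i + 1, so dims readout values can be extracted from only n_qubits qubits. The mapping can be listed explicitly with a small sketch mirroring the loop above:

for i in range(5):
    bits = bin(i + 1)[-1:1:-1]
    print(i, '->', ' '.join('Z' + str(j) for j, k in enumerate(bits) if k == '1'))
# 0 -> Z0
# 1 -> Z1
# 2 -> Z0 Z1
# 3 -> Z2
# 4 -> Z0 Z2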

5. Quantum version of the word vector embedding layer

Before running, please run export OMP_NUM_THREADS=4 in the terminal
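If you are working in a notebook rather than a terminal, you can also try setting the variable from Python before the simulation starts; whether this takes effect can depend on when the OpenMP runtime is initialized, so the terminal export remains the safer option:

import os
os.environ['OMP_NUM_THREADS'] = '4'   # number of simulator threads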

def QEmbedding(num_embedding, embedding_dim, window, layers, n_threads):
    n_qubits = int(np.ceil(np.log2(num_embedding)))
    hams = GenerateEmbeddingHamiltonian(embedding_dim, n_qubits)
    circ = Circuit()
    circ = UN(H, n_qubits)                 # start from a uniform superposition on all qubits
    encoder_param_name = []
    ansatz_param_name = []
    for w in range(2 * window):            # one encoder + ansatz block per context word
        encoder = GenerateEncoderCircuit(n_qubits, 'Encoder_' + str(w))
        ansatz = GenerateAnsatzCircuit(n_qubits, layers, 'Ansatz_' + str(w))
        encoder.no_grad()                  # encoder parameters carry the data and are not trained
        circ += encoder
        circ += ansatz
        encoder_param_name.extend(list(encoder.parameter_resolver()))
        ansatz_param_name.extend(list(ansatz.parameter_resolver()))
    net = MindQuantumLayer(encoder_param_name,
                           ansatz_param_name,
                           circ,
                           hams,
                           n_threads=n_threads)
    return net



class CBOW(nn.Cell):
    def __init__(self, num_embedding, embedding_dim, window, layers, n_threads,
                 hidden_dim):
        super(CBOW, self).__init__()
        self.embedding = QEmbedding(num_embedding, embedding_dim, window,
                                    layers, n_threads)
        self.dense1 = nn.Dense(embedding_dim, hidden_dim)
        self.dense2 = nn.Dense(hidden_dim, num_embedding)
        self.relu = ops.ReLU()

    def construct(self, x):
        embed = self.embedding(x)
        out = self.dense1(embed)
        out = self.relu(out)
        out = self.dense2(out)
        return out

class LossMonitorWithCollection(LossMonitor):
    def __init__(self, per_print_times=1):
        super(LossMonitorWithCollection, self).__init__(per_print_times)
        self.loss = []

    def begin(self, run_context):
        self.begin_time = time.time()

    def end(self, run_context):
        self.end_time = time.time()
        print('Total time used: {}'.format(self.end_time - self.begin_time))

    def epoch_begin(self, run_context):
        self.epoch_begin_time = time.time()

    def epoch_end(self, run_context):
        cb_params = run_context.original_args()
        self.epoch_end_time = time.time()
        if self._per_print_times != 0 and cb_params.cur_step_num % self._per_print_times == 0:
            print('')

    def step_end(self, run_context):
        cb_params = run_context.original_args()
        loss = cb_params.net_outputs

        if isinstance(loss, (tuple, list)):
            if isinstance(loss[0], Tensor) and isinstance(loss[0].asnumpy(), np.ndarray):
                loss = loss[0]

        if isinstance(loss, Tensor) and isinstance(loss.asnumpy(), np.ndarray):
            loss = np.mean(loss.asnumpy())

        cur_step_in_epoch = (cb_params.cur_step_num - 1) % cb_params.batch_num + 1

        if isinstance(loss, float) and (np.isnan(loss) or np.isinf(loss)):
            raise ValueError("epoch: {} step: {}. Invalid loss, terminating training.".format(
                cb_params.cur_epoch_num, cur_step_in_epoch))
        self.loss.append(loss)
        if self._per_print_times != 0 and cb_params.cur_step_num % self._per_print_times == 0:
            print("\repoch: %+3s step: %+3s time: %5.5s, loss is %5.5s" % (cb_params.cur_epoch_num, cur_step_in_epoch, time.time() - self.epoch_begin_time, loss), flush=True, end='')



import mindspore as ms
from mindspore import context
from mindspore import Tensor

context.set_context(mode=context.GRAPH_MODE, device_target="CPU")
corpus = """We are about to study the idea of a computational process.
Computational processes are abstract beings that inhabit computers.
As they evolve, processes manipulate other abstract things called data.
The evolution of a process is directed by a pattern of rules
called a program. People create programs to direct processes. In effect,
we conjure the spirits of the computer with our spells."""

ms.set_seed(42)
window_size = 2
embedding_dim = 10
hidden_dim = 128
word_dict, sample = GenerateWordDictAndSample(corpus, window=window_size)
train_x, train_y = GenerateTrainData(sample, word_dict)

train_loader = ds.NumpySlicesDataset({
    "around": train_x,
    "center": train_y
}, shuffle=False).batch(3)
net = CBOW(len(word_dict), embedding_dim, window_size, 3, 4, hidden_dim)
net_loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')
net_opt = nn.Momentum(net.trainable_params(), 0.01, 0.9)
loss_monitor = LossMonitorWithCollection(500)
model = Model(net, net_loss, net_opt)
model.train(350, train_loader, callbacks=[loss_monitor], dataset_sink_mode=False)

operation result:

epoch: 25 step: 20 time: 36.14, loss is 3.154
epoch: 50 step: 20 time: 36.51, loss is 2.945
epoch: 75 step: 20 time: 36.71, loss is 0.226
epoch: 100 step: 20 time: 36.56, loss is 0.016
Total time used: 3668.7517251968384

Print the value of the loss function during convergence:

import matplotlib.pyplot as plt

plt.plot(loss_monitor.loss, '.')
plt.xlabel('Steps')
plt.ylabel('Loss')
plt.show()


Print the trainable parameters in the quantum circuit of the quantum embedding layer:

net.embedding.weight.asnumpy()
array([-6.4384632e-02, -1.2658586e-01,  1.0083634e-01, -1.3011757e-01,
        1.4005195e-03, -1.9296107e-04, -7.9315618e-02, -2.9339856e-01,
        7.6259784e-02,  2.9878360e-01, -1.3091319e-04,  6.8271365e-03,
       -8.5563213e-02, -2.4168481e-01, -8.2548901e-02,  3.0743122e-01,
       -7.8157615e-04, -3.2907310e-03, -1.4412615e-01, -1.9241245e-01,
       -7.5561814e-02, -3.1189525e-03,  3.8330450e-03, -1.4486053e-04,
       -4.8195502e-01,  5.3657538e-01,  3.8986996e-02,  1.7286544e-01,
       -3.4090234e-03, -9.5573599e-03, -4.8208281e-01,  5.9604627e-01,
       -9.7009525e-02,  1.8312852e-01,  9.5267012e-04, -1.2261710e-03,
        3.4219343e-02,  8.0031365e-02, -4.5349425e-01,  3.7360430e-01,
        8.9665735e-03,  2.1575980e-03, -2.3871836e-01, -2.4819574e-01,
       -6.2781256e-01,  4.3640310e-01, -9.7688911e-03, -3.9542126e-03,
       -2.4010721e-01,  4.8120108e-02, -5.6876510e-01,  4.3773583e-01,
        4.7241263e-03,  1.4138421e-02, -1.2472854e-03,  1.1096644e-01,
        7.1980711e-03,  7.3047012e-02,  2.0803964e-02,  1.1490706e-02,
        8.6638138e-02,  2.0503466e-01,  4.7177267e-03, -1.8399477e-02,
        1.1631225e-02,  2.0587114e-03,  7.6739892e-02, -6.3548386e-02,
        1.7298019e-01, -1.9143591e-02,  4.1606693e-04, -9.2881303e-03],
      dtype=float32)
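As a rough sanity check of the trained quantum model, we can also run one training sample through the network and compare the predicted center word with the true one. This is only a sketch, reusing the net, train_x, train_y and word_dict objects from the cells above:

idx_to_word = {i: w for w, i in word_dict.items()}
logits = net(Tensor(train_x[:1])).asnumpy()   # forward pass on the first sample
print('predicted center word:', idx_to_word[int(np.argmax(logits[0]))])
print('true center word:     ', idx_to_word[int(train_y[0])])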

6. Classical version of the word vector embedding layer

class CBOWClassical(nn.Cell):
    def __init__(self, num_embedding, embedding_dim, window, hidden_dim):
        super(CBOWClassical, self).__init__()
        self.dim = 2 * window * embedding_dim
        self.embedding = nn.Embedding(num_embedding, embedding_dim, True)
        self.dense1 = nn.Dense(self.dim, hidden_dim)
        self.dense2 = nn.Dense(hidden_dim, num_embedding)
        self.relu = ops.ReLU()
        self.reshape = ops.Reshape()

    def construct(self, x):
        embed = self.embedding(x)
        embed = self.reshape(embed, (-1, self.dim))
        out = self.dense1(embed)
        out = self.relu(out)
        out = self.dense2(out)
        return out

train_x = []
train_y = []
for i in sample:
    around, center = i
    train_y.append(word_dict[center])
    train_x.append([])
    for j in around:
        train_x[-1].append(word_dict[j])
train_x = np.array(train_x).astype(np.int32)
train_y = np.array(train_y).astype(np.int32)
print("train_x shape: ", train_x.shape)
print("train_y shape: ", train_y.shape)

train_loader = ds.NumpySlicesDataset({
    "around": train_x,
    "center": train_y
}, shuffle=False).batch(3)
net = CBOWClassical(len(word_dict), embedding_dim, window_size, hidden_dim)
net_loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')
net_opt = nn.Momentum(net.trainable_params(), 0.01, 0.9)
loss_monitor = LossMonitorWithCollection(500)
model = Model(net, net_loss, net_opt)
model.train(350, train_loader, callbacks=[loss_monitor], dataset_sink_mode=False)

operation result:

train_x shape: (58, 4)
train_y shape: (58,)
epoch: 25 step: 20 time: 0.077, loss is 3.156
epoch: 50 step: 20 time: 0.095, loss is 3.025
epoch: 75 step: 20 time: 0.115, loss is 2.996
epoch: 100 step: 20 time: 0.088, loss is 1.773
epoch: 125 step: 20 time: 0.083, loss is 0.172
epoch: 150 step: 20 time: 0.110, loss is 0.008
epoch: 175 step: 20 time: 0.086, loss is 0.003
epoch: 200 step: 20 time: 0.081, loss is 0.001
epoch: 225 step: 20 time: 0.081, loss is 0.000
epoch: 250 step: 20 time: 0.078, loss is 0.000
epoch: 275 step: 20 time: 0.079, loss is 0.000
epoch: 300 step: 20 time: 0.080, loss is 0.000
epoch: 325 step: 20 time: 0.078, loss is 0.000
epoch: 350 step: 20 time: 0.081, loss is 0.000
Total time used: 30.569124698638916

Convergence graph:

The above results show that the quantum word embedding model, obtained here through quantum simulation, can also complete the embedding task well. When a dataset becomes too large for the computing power of classical computers, quantum computers will be able to handle such problems with ease.
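One way to make the earlier observation about needing only ⌈log₂V⌉ qubits concrete is to compare the trainable parameter counts of the two embedding layers. The following is only a back-of-the-envelope sketch, and the helper names are illustrative rather than part of the tutorial code:

import numpy as np

def quantum_embedding_trainable(num_embedding, window, layers):
    # trainable RY angles in the ansatz blocks of QEmbedding above
    n_qubits = int(np.ceil(np.log2(num_embedding)))
    return 2 * window * layers * n_qubits

def classical_embedding_trainable(num_embedding, embedding_dim):
    # weights stored in a classical nn.Embedding table
    return num_embedding * embedding_dim

vocab = len(word_dict)
print('quantum embedding parameters:  ', quantum_embedding_trainable(vocab, window_size, 3))
print('classical embedding parameters:', classical_embedding_trainable(vocab, embedding_dim))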
