Classifying classical data

Author(s): Luis Mantilla

QML can be used to process classical data. Previous studies have applied quantum models to such tasks; see [1], [2], [3] for a few examples. In this tutorial, we focus on using MB-QML for the task of classical data classification. The core idea is to implement an embedding \(x_i \mapsto |\phi(x_i)\rangle\) of a 2D dataset and use this map to define the kernel

\[K(x_i, x_j) = \left| \langle \phi(x_i) | \phi(x_j) \rangle \right|^2.\]

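To make the definition concrete, here is a minimal NumPy sketch of what this kernel computes for a toy single-qubit angle embedding (the embedding below is purely illustrative, not the MB-QML one used later):

import numpy as np

def phi_toy(x):
    # Toy embedding: |phi(x)> = cos(x/2)|0> + sin(x/2)|1>
    return np.array([np.cos(x / 2), np.sin(x / 2)])

def toy_kernel(x_i, x_j):
    # K(x_i, x_j) = |<phi(x_i)|phi(x_j)>|^2
    return np.abs(np.vdot(phi_toy(x_i), phi_toy(x_j))) ** 2

print(toy_kernel(0.30, 0.35))  # nearby points give a kernel value close to 1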
The data embedding \(\phi\) can be implemented in several ways; in MB-QML, we can use the measurement angles to encode the data. Let's begin by creating a simple dataset and splitting it into training and test sets.

In [1]: import numpy as np

In [2]: import pandas as pd

In [3]: import matplotlib.pyplot as plt

In [4]: from sklearn import datasets

In [5]: from sklearn.model_selection import train_test_split

In [6]: blobs = datasets.make_blobs(n_samples=200, centers=2, random_state=42, cluster_std=2)

In [7]: blobs_df = pd.DataFrame(data=blobs[0], columns=['feature1', 'feature2'])

In [8]: blobs_df['target'] = blobs[1]

In [9]: X = blobs_df.drop('target', axis=1)

In [10]: y = blobs_df['target']

In [11]: X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [12]: fig, ax = plt.subplots()

In [13]: ax.set_facecolor('white')

In [14]: plt.scatter(X_train['feature1'], X_train['feature2'], c=y_train, cmap='coolwarm')
Out[14]: <matplotlib.collections.PathCollection at 0x7f6466df5960>

[figure: scatter_classical_data.png — scatter plot of the training data colored by class]

It is good practice to normalize the data to avoid issues when embedding the features as measurement angles.

In [15]: from sklearn.preprocessing import MinMaxScaler

In [16]: scaler = MinMaxScaler(feature_range=(0, 1))

In [17]: X_scaler = scaler.fit(X_train)

In [18]: X_train = X_scaler.transform(X_train)

In [19]: X_test = X_scaler.transform(X_test)

In [20]: X_train = np.nan_to_num(X_train)

In [21]: X_test = np.nan_to_num(X_test)
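
As a quick optional check, the scaled training features now lie in \([0, 1]\) (the test set may fall slightly outside this range, since the scaler was fit on the training data only):

# Optional sanity check: training features are scaled into [0, 1].
print(X_train.min(axis=0), X_train.max(axis=0))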

Now, we can define the kernel function with an MBQC circuit. We take inspiration from the embeddings discussed in [4] to define the measurement angles for each qubit.

import mentpy as mp  # MBQC library providing the templates and simulator used below

gs = mp.templates.muta(2, 1, one_column=True)
mp.draw(gs)

ps = mp.PatternSimulator(gs, backend='numpy-dm', window_size=4)

def quantum_kernel(X, Y=None):
    if Y is None:
        Y = X

    K = np.zeros((X.shape[0], Y.shape[0]))

    for i, x in enumerate(X):
        # Encode the first data point in the measurement angles.
        angles1 = [x[0], 0, 0, 0, x[1], np.cos(x[0]) * np.cos(x[1]), 0, 0]
        ps.reset()
        state1 = ps(angles1)

        for j, y in enumerate(Y):
            # Encode the second data point and compare the output states.
            angles2 = [y[0], 0, 0, 0, y[1], np.cos(y[0]) * np.cos(y[1]), 0, 0]
            ps.reset()
            state2 = ps(angles2)

            K[i, j] = mp.calculator.fidelity(state1, state2)

    return K
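
Since every call to the simulator is relatively expensive, a possible optimization (a sketch, assuming the states returned by ps can be stored and compared later, as the code above already does with state1) is to simulate each data point once and cache the resulting states:

def quantum_kernel_cached(X, Y=None):
    # Simulate each point once, reducing simulator calls
    # from O(|X| * |Y|) to O(|X| + |Y|).
    def embed(v):
        ps.reset()
        return ps([v[0], 0, 0, 0, v[1], np.cos(v[0]) * np.cos(v[1]), 0, 0])

    states_x = [embed(x) for x in X]
    states_y = states_x if Y is None else [embed(y) for y in Y]

    K = np.zeros((len(states_x), len(states_y)))
    for i, sx in enumerate(states_x):
        for j, sy in enumerate(states_y):
            K[i, j] = mp.calculator.fidelity(sx, sy)
    return K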

Finally, we can create an SVM classifier and use this kernel to train the model.

from sklearn import svm
from sklearn.metrics import accuracy_score

clf = svm.SVC(kernel=quantum_kernel)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

print("Accuracy:", accuracy_score(y_test, y_pred))

The decision boundary of the trained model can be visualized as follows:

# Evaluate the classifier on a grid to visualize the decision boundary.
# Note: this calls the quantum kernel for every grid point, so it can be slow.
X_train_np = np.array(X_train)
y_train_np = np.array(y_train)

x_min, x_max = X_train_np[:, 0].min() - 0.2, X_train_np[:, 0].max() + 0.2
y_min, y_max = X_train_np[:, 1].min() - 0.2, X_train_np[:, 1].max() + 0.2
xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.05),
                     np.arange(y_min, y_max, 0.05))

Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)

plt.figure(figsize=(8, 6))
contour = plt.contourf(xx, yy, Z, alpha=0.4, cmap='coolwarm')
plt.scatter(X_train_np[:, 0], X_train_np[:, 1], c=y_train_np, cmap='coolwarm', edgecolors='k')
plt.colorbar(contour)

plt.show()

References