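For a project I wrote my own function to compute a covariance matrix. The data is an n×3 matrix, and the covariance is computed with each column treated as one variable (each row is one observation). However, the result of my function differs slightly from the result of numpy's own covariance function.
Here is the test code: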
import numpy as np
# 1000 random samples, 3 variables (one per column)
rdata = np.random.rand(1000, 3)
cloudx = rdata[:, 0]
cloudy = rdata[:, 1]
cloudz = rdata[:, 2]
# second moments E[ab] for every pair of columns
coef_xx = np.mean(cloudx * cloudx)
coef_xy = np.mean(cloudx * cloudy)
coef_xz = np.mean(cloudx * cloudz)
coef_yy = np.mean(cloudy * cloudy)
coef_yz = np.mean(cloudy * cloudz)
coef_zz = np.mean(cloudz * cloudz)
# column means E[a]
coef_x, coef_y, coef_z = np.mean(rdata, axis=0)
# hand-written covariance: cov(a, b) = E[ab] - E[a]E[b]
cov = np.zeros((3, 3))
cov[0, 0] = coef_xx - coef_x**2
cov[1, 1] = coef_yy - coef_y**2
cov[2, 2] = coef_zz - coef_z**2
cov[0, 1] = cov[1, 0] = coef_xy - coef_x*coef_y
cov[0, 2] = cov[2, 0] = coef_xz - coef_x*coef_z
cov[1, 2] = cov[2, 1] = coef_yz - coef_y*coef_z
# numpy's covariance, with columns as variables
npcov = np.cov(rdata, rowvar=False)
# element-wise relative difference between the two results
print((npcov - cov) / npcov)
No matter how many times I run it, the output is always
[[ 0.001 0.001 0.001]
[ 0.001 0.001 0.001]
[ 0.001 0.001 0.001]]
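What is the cause of this? Is it a precision problem in numpy? I tried several of the distribution functions in numpy.random, and the relative error for the generated data was 0.001 every time. But if I read the data from a file, the result is different (values such as 0.99e-5 show up).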
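One possible explanation, offered as a sketch and not checked against the file data mentioned above: `np.cov` normalizes by N - 1 by default (`ddof=1`, i.e. `bias=False`), while the formula E[ab] - E[a]E[b] built from `np.mean` normalizes by N. If that is the only difference, the relative difference is exactly 1/N, which is 0.001 for N = 1000 and would be about 1e-5 for a file with roughly 100,000 rows (the row count of the file is an assumption here). The helper name `population_cov` and the seed below are made up for illustration.

```python
import numpy as np

def population_cov(data):
    """Covariance of the columns of `data`, normalized by N: E[ab] - E[a]E[b].

    Hypothetical helper; a vectorized equivalent of the hand-written code.
    """
    mean = data.mean(axis=0)
    return data.T @ data / data.shape[0] - np.outer(mean, mean)

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
n = 1000
rdata = rng.random((n, 3))

cov = population_cov(rdata)
npcov_default = np.cov(rdata, rowvar=False)          # divides by N - 1
npcov_biased = np.cov(rdata, rowvar=False, ddof=0)   # divides by N

rel_default = (npcov_default - cov) / npcov_default
rel_biased = (npcov_biased - cov) / npcov_biased

print(rel_default)               # entries expected to be close to 1/n
print(np.abs(rel_biased).max())  # expected to be at rounding-error level
```

If the second number comes out at rounding-error level, the 0.001 is a difference in normalization and not a precision problem; passing `ddof=0` (or `bias=True`) to `np.cov` should then make the two results agree.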