random 模块 ( http://docs.python.org/2/library/random.html ) 有几个固定的函数可以随机抽样。例如 random.gauss 将从具有给定均值和西格玛值的正态分布中抽取随机点。我正在寻找一种方法来尽可能快地使用我自己的分布在 --- 中提取给定间隔之间随机样本的数字 N python 。这就是我的意思： def my_dist(x): # Some distribution, assume c1,c2,c3 and c4 are known. f = c1*exp(-((x-c2)**c3)/c4) return f # Draw N random samples from my distribution between given limits a,b. N = 1000 N_rand_samples = ran_func_sample(my_dist, a, b, N) 其中 ran_func_sample 是我所追求的， a, b 是抽取样本的限制。 python 中是否有此类内容？原文由 Gabriel 发布，翻译遵循 CC BY-SA 4.0 许可协议

您需要使用逆变换采样方法来获得根据您想要的规律分布的随机值。使用此方法，您可以将倒函数应用于区间 [0,1] 中具有标准均匀分布的随机数。找到倒函数后，您将以这种明显的方式根据所需的分布得到 1000 个数字： [inverted_function(random.random()) for x in range(1000)] 有关逆变换采样的更多信息： http://en.wikipedia.org/wiki/Inverse_transform_sampling 此外，StackOverflow 上有一个与该主题相关的好问题：以不同概率选择列表元素的Pythonic方法原文由 Igor Chubin 发布，翻译遵循 CC BY-SA 3.0 许可协议

快速任意分布随机抽样（逆变换抽样）

2 个回答

得票最新

社区维基

1

发布于
2023-01-10

✓ 已被采纳

您需要使用 逆变换采样 方法来获得根据您想要的规律分布的随机值。使用此方法，您可以将 倒函数 应用于区间 [0,1] 中具有标准均匀分布的随机数。

找到倒函数后，您将以这种明显的方式根据所需的分布得到 1000 个数字：

 [inverted_function(random.random()) for x in range(1000)]

有关 逆变换采样 的更多信息：

http://en.wikipedia.org/wiki/Inverse_transform_sampling

此外，StackOverflow 上有一个与该主题相关的好问题：

以不同概率选择列表元素的Pythonic方法

原文由 Igor Chubin 发布，翻译遵循 CC BY-SA 3.0 许可协议

社区维基

1

发布于
2023-01-10

这段代码实现了nd个离散概率分布的采样。通过在对象上设置一个标志，它也可以用作分段常数概率分布，然后可以用来近似任意 pdf。嗯，具有紧凑支持的任意 pdf；如果您想要高效地采样极长的尾巴，则需要对 pdf 进行非统一描述。但这即使对于 airy-point-spread 函数（我最初为此创建它）之类的东西仍然有效。值的内部排序对于获得准确性绝对是至关重要的；尾部的许多小值应该有很大贡献，但如果不进行排序，它们将被淹没在 fp 精度中。

 class Distribution(object):
    """
    draws samples from a one dimensional probability distribution,
    by means of inversion of a discrete inverstion of a cumulative density function

    the pdf can be sorted first to prevent numerical error in the cumulative sum
    this is set as default; for big density functions with high contrast,
    it is absolutely necessary, and for small density functions,
    the overhead is minimal

    a call to this distibution object returns indices into density array
    """
    def __init__(self, pdf, sort = True, interpolation = True, transform = lambda x: x):
        self.shape          = pdf.shape
        self.pdf            = pdf.ravel()
        self.sort           = sort
        self.interpolation  = interpolation
        self.transform      = transform

        #a pdf can not be negative
        assert(np.all(pdf>=0))

        #sort the pdf by magnitude
        if self.sort:
            self.sortindex = np.argsort(self.pdf, axis=None)
            self.pdf = self.pdf[self.sortindex]
        #construct the cumulative distribution function
        self.cdf = np.cumsum(self.pdf)
    @property
    def ndim(self):
        return len(self.shape)
    @property
    def sum(self):
        """cached sum of all pdf values; the pdf need not sum to one, and is imlpicitly normalized"""
        return self.cdf[-1]
    def __call__(self, N):
        """draw """
        #pick numbers which are uniformly random over the cumulative distribution function
        choice = np.random.uniform(high = self.sum, size = N)
        #find the indices corresponding to this point on the CDF
        index = np.searchsorted(self.cdf, choice)
        #if necessary, map the indices back to their original ordering
        if self.sort:
            index = self.sortindex[index]
        #map back to multi-dimensional indexing
        index = np.unravel_index(index, self.shape)
        index = np.vstack(index)
        #is this a discrete or piecewise continuous distribution?
        if self.interpolation:
            index = index + np.random.uniform(size=index.shape)
        return self.transform(index)

if __name__=='__main__':
    shape = 3,3
    pdf = np.ones(shape)
    pdf[1]=0
    dist = Distribution(pdf, transform=lambda i:i-1.5)
    print dist(10)
    import matplotlib.pyplot as pp
    pp.scatter(*dist(1000))
    pp.show()

作为一个更真实的相关示例：

 x = np.linspace(-100, 100, 512)
p = np.exp(-x**2)
pdf = p[:,None]*p[None,:]     #2d gaussian
dist = Distribution(pdf, transform=lambda i:i-256)
print dist(1000000).mean(axis=1)    #should be in the 1/sqrt(1e6) range
import matplotlib.pyplot as pp
pp.scatter(*dist(1000))
pp.show()

原文由 Eelco Hoogendoorn 发布，翻译遵循 CC BY-SA 3.0 许可协议

撰写回答

你尚未登录，登录后可以

和开发者交流问题的细节
关注并接收问题和回答的更新提醒
参与内容的编辑和改进，让解决方法与时俱进

推荐问题

快速任意分布随机抽样（逆变换抽样）

你尚未登录，登录后可以

字节的 trae AI IDE 不支持类似 vscode 的 ssh remote 远程开发怎么办？

DataCap 中验证码无法显示，后台出现 NullPointerException 错误?

发现深拷贝和浅拷贝效果一致：请问一下有什么区别呢？

如何实现一个深拷贝函数？

Python 成员变量在多个子类实例间共享，如何避免？

分解质因素的算法很难，理解不了。请问有哪位大佬可以进行解释一下呢？

为什么 Qwen2.5-Omni-7B 官方教程都报错 Cannot import available module of Qwen2_5OmniModel in modelscope ？

Stack Overflow 翻译

快速任意分布随机抽样（逆变换抽样）

你尚未登录，登录后可以

字节的 trae AI IDE 不支持类似 vscode 的 ssh remote 远程开发怎么办？

DataCap 中验证码无法显示，后台出现 NullPointerException 错误?

发现深拷贝和浅拷贝效果一致：请问一下有什么区别呢？

如何实现一个深拷贝函数？

Python 成员变量在多个子类实例间共享，如何避免？

分解质因素的算法很难，理解不了。 请问有哪位大佬可以进行解释一下呢？

为什么 Qwen2.5-Omni-7B 官方教程都报错 Cannot import available module of Qwen2_5OmniModel in modelscope ？

Stack Overflow 翻译

分解质因素的算法很难，理解不了。请问有哪位大佬可以进行解释一下呢？