Note
This module has been deprecated. See the stats module.
The statistics module in SymPy implements standard probability distributions and related tools. Its contents can be imported with the following statements:
>>> from sympy import *
>>> from sympy.statistics import *
>>> init_printing(use_unicode=False, wrap_line=False, no_global=True)
Normal(mu, sigma) creates a normal distribution with mean value mu and standard deviation sigma. The Normal class defines several useful methods and properties. Various properties can be accessed directly as follows:
>>> N = Normal(0, 1)
>>> N.mean
0
>>> N.median
0
>>> N.variance
1
>>> N.stddev
1
You can generate random numbers from the desired distribution with the random method:
>>> N = Normal(10, 5)
>>> N.random()
4.914375200829805834246144514
>>> N.random()
11.84331557474637897087177407
>>> N.random()
17.22474580071733640806996846
>>> N.random()
9.864643097429464546621602494
The probability density function (pdf) and cumulative distribution function (cdf) of a distribution can be computed, either in symbolic form or for particular values:
>>> N = Normal(1, 1)
>>> x = Symbol('x')
>>> N.pdf(1)
   ___
 \/ 2
--------
    ____
2*\/ pi
>>> N.pdf(3).evalf()
0.0539909665131880
>>> N.cdf(x)
   /  ___        \
   |\/ 2 *(x - 1)|
erf|-------------|
   \      2      /   1
------------------ + -
        2            2
>>> N.cdf(-oo), N.cdf(1), N.cdf(oo)
(0, 1/2, 1)
>>> N.cdf(5).evalf()
0.999968328758167
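The numeric values above can be cross-checked outside SymPy. The following is a minimal sketch that uses Python's standard-library statistics.NormalDist, which is an assumption of this example and not part of the deprecated module, to evaluate the same density and cumulative values for mean 1 and standard deviation 1:

```python
from statistics import NormalDist

# Same parameters as N = Normal(1, 1) in the session above.
dist = NormalDist(mu=1, sigma=1)

pdf_at_3 = dist.pdf(3)      # compare with N.pdf(3).evalf()
cdf_at_5 = dist.cdf(5)      # compare with N.cdf(5).evalf()
cdf_at_mean = dist.cdf(1)   # half of the probability mass lies below the mean

print(pdf_at_3, cdf_at_5, cdf_at_mean)
```

Unlike the SymPy session, this works only with floating-point numbers, so it cannot return symbolic results such as the erf expression for N.cdf(x).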
The method probability gives the total probability on a given interval (a convenient alternative syntax for cdf(b)-cdf(a)):
>>> N = Normal(0, 1)
>>> N.probability(-oo, 0)
1/2
>>> N.probability(-1, 1)
   /  ___\
   |\/ 2 |
erf|-----|
   \  2  /
>>> N.probability(-1, 1).evalf()
0.682689492137086
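As a sketch of the cdf(b)-cdf(a) identity, the following standalone example reproduces the numeric value for the standard normal distribution in plain Python with math.erf. The helper names normal_cdf and interval_probability are hypothetical and not part of SymPy:

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def interval_probability(a, b, mu=0.0, sigma=1.0):
    """Probability mass on [a, b], computed as cdf(b) - cdf(a)."""
    return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)

p_one_sigma = interval_probability(-1.0, 1.0)       # compare with erf(sqrt(2)/2)
p_left_half = interval_probability(-math.inf, 0.0)  # everything below the mean

print(p_one_sigma, p_left_half)
```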
You can also generate a symmetric confidence interval from a given desired confidence level (given as a fraction between 0 and 1). For the normal distribution, 68%, 95% and 99.7% confidence levels respectively correspond to approximately 1, 2 and 3 standard deviations:
>>> N = Normal(0, 1)
>>> N.confidence(0.68)
(-0.994457883209753, 0.994457883209753)
>>> N.confidence(0.95)
(-1.95996398454005, 1.95996398454005)
>>> N.confidence(0.997)
(-2.96773792534178, 2.96773792534178)
Plug the interval back in to see that the value is correct:
>>> N.probability(*N.confidence(0.95)).evalf()
0.950000000000000
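One way to obtain such an interval, assuming the common approach of inverting the cdf at (1 + p)/2, is sketched below with the standard-library statistics.NormalDist. The function name symmetric_interval is hypothetical, and the deprecated module is not necessarily implemented this way:

```python
from statistics import NormalDist

def symmetric_interval(p, mu=0.0, sigma=1.0):
    """Return (lo, hi) centred on mu that contains probability p, 0 < p < 1."""
    dist = NormalDist(mu, sigma)
    # The upper endpoint leaves (1 - p)/2 of the mass in the right tail.
    hi = dist.inv_cdf((1.0 + p) / 2.0)
    lo = 2.0 * mu - hi  # mirror the upper endpoint around the mean
    return lo, hi

for level in (0.68, 0.95, 0.997):
    print(level, symmetric_interval(level))
```

The endpoints agree with the N.confidence values above to within floating-point precision.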
Besides the normal distribution, uniform continuous distributions are also supported. Uniform(a, b) represents the distribution with uniform probability on the interval [a, b] and zero probability everywhere else. The Uniform class supports the same methods as the Normal class.
Additional distributions, including support for arbitrary user-defined distributions, are planned for the future.
Sample([x1, x2, x3, ...]) represents a collection of samples. Sample parameters such as mean, variance and stddev can be accessed as properties. The values are stored in sorted order.
Examples
>>> from sympy.statistics.distributions import Sample
>>> Sample([0, 1, 2, 3])
Sample([0, 1, 2, 3])
>>> Sample([8, 3, 2, 4, 1, 6, 9, 2])
Sample([1, 2, 2, 3, 4, 6, 8, 9])
>>> s = Sample([1, 2, 3, 4, 5])
>>> s.mean
3
>>> s.stddev
sqrt(2)
>>> s.median
3
>>> s.variance
2
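The values in this example are consistent with population statistics, which divide by n rather than n - 1: the variance of [1, 2, 3, 4, 5] comes out as 2. A minimal cross-check with Python's standard-library statistics module, which is an assumption of this sketch and unrelated to the Sample class itself:

```python
import statistics

data = [8, 3, 2, 4, 1, 6, 9, 2]
print(sorted(data))  # same ordering that Sample shows for this input

values = [1, 2, 3, 4, 5]
mean = statistics.mean(values)
median = statistics.median(values)
variance = statistics.pvariance(values)  # population variance (divide by n)
stddev = statistics.pstdev(values)       # square root of the population variance

print(mean, median, variance, stddev)
```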
Base class for continuous probability distributions.
Calculate the probability that a random number x generated from the distribution satisfies a <= x <= b.
Examples
>>> from sympy.statistics import Normal
>>> from sympy.core import oo
>>> Normal(0, 1).probability(-1, 1)
erf(sqrt(2)/2)
>>> Normal(0, 1).probability(1, oo)
-erf(sqrt(2)/2)/2 + 1/2
random() – generate a random number from the distribution. random(n) – generate a Sample of n random numbers.
Examples
>>> from sympy.statistics import Uniform
>>> x = Uniform(1, 5).random()
>>> x < 5 and x > 1
True
>>> x = Uniform(-4, 2).random()
>>> x < 2 and x > -4
True
Normal(mu, sigma) represents the normal or Gaussian distribution with mean value mu and standard deviation sigma.
Examples
>>> from sympy.statistics import Normal
>>> from sympy import oo
>>> N = Normal(1, 2)
>>> N.mean
1
>>> N.variance
4
>>> N.probability(-oo, 1) # probability on an interval
1/2
>>> N.probability(1, oo)
1/2
>>> N.probability(-oo, oo)
1
>>> N.probability(-1, 3)
erf(sqrt(2)/2)
>>> _.evalf()
0.682689492137086
Return the cumulative distribution function as an expression in x.
Examples
>>> from sympy.statistics import Normal
>>> Normal(1, 2).cdf(0)
-erf(sqrt(2)/4)/2 + 1/2
>>> from sympy.abc import x
>>> Normal(1, 2).cdf(x)
erf(sqrt(2)*(x - 1)/4)/2 + 1/2
Return a symmetric (p*100)% confidence interval. For example, p=0.95 gives a 95% confidence interval. Currently this function handles only numerical values, except in the trivial case p=1.
For example, one standard deviation:
>>> from sympy.statistics import Normal
>>> N = Normal(0, 1)
>>> N.confidence(0.68)
(-0.994457883209753, 0.994457883209753)
>>> N.probability(*_).evalf()
0.680000000000000
Two standard deviations:
>>> N = Normal(0, 1)
>>> N.confidence(0.95)
(-1.95996398454005, 1.95996398454005)
>>> N.probability(*_).evalf()
0.950000000000000
Create a normal distribution fit to the mean and standard deviation of the given distribution or sample.
Examples
>>> from sympy.statistics import Normal
>>> Normal.fit([1,2,3,4,5])
Normal(3, sqrt(2))
>>> from sympy.abc import x, y
>>> Normal.fit([x, y])
Normal(x/2 + y/2, sqrt((-x/2 + y/2)**2/2 + (x/2 - y/2)**2/2))
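Fitting by moments amounts to taking the mean and standard deviation of the data. Judging from the outputs above, the population standard deviation (dividing by n) is used. A rough numeric equivalent built on the standard library, where fit_normal is a hypothetical helper and not the module's own method:

```python
import statistics
from statistics import NormalDist

def fit_normal(data):
    """Build a NormalDist from the mean and population stddev of data."""
    return NormalDist(statistics.fmean(data), statistics.pstdev(data))

fitted = fit_normal([1, 2, 3, 4, 5])
print(fitted.mean, fitted.stdev)  # compare with Normal(3, sqrt(2))
```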
Uniform(a, b) represents a probability distribution with uniform probability density on the interval [a, b] and zero density everywhere else.
Return the cumulative distribution function as an expression in x.
Examples
>>> from sympy.statistics import Uniform
>>> Uniform(1, 5).cdf(2)
1/4
>>> Uniform(1, 5).cdf(4)
3/4
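For a uniform distribution on [a, b], the cdf is the linear function (x - a)/(b - a), clamped to 0 below a and to 1 above b. A minimal sketch with exact rationals for integer arguments; uniform_cdf is a hypothetical helper, not the module's method:

```python
from fractions import Fraction

def uniform_cdf(x, a, b):
    """cdf of the uniform distribution on [a, b], evaluated at x."""
    if x <= a:
        return Fraction(0)
    if x >= b:
        return Fraction(1)
    return Fraction(x - a, b - a)

print(uniform_cdf(2, 1, 5), uniform_cdf(4, 1, 5))
```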
Generate a symmetric (p*100)% confidence interval.
>>> from sympy import Rational
>>> from sympy.statistics import Uniform
>>> U = Uniform(1, 2)
>>> U.confidence(1)
(1, 2)
>>> U.confidence(Rational(1,2))
(5/4, 7/4)
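Because the density is constant, a symmetric interval containing probability p is simply the middle fraction p of [a, b]. A short sketch with exact rationals, using the hypothetical helper uniform_confidence:

```python
from fractions import Fraction

def uniform_confidence(p, a, b):
    """Middle portion of [a, b] that carries probability p (0 <= p <= 1)."""
    mid = Fraction(a + b, 2)
    half_width = Fraction(p) * Fraction(b - a, 2)
    return mid - half_width, mid + half_width

half_interval = uniform_confidence(Fraction(1, 2), 1, 2)
full_interval = uniform_confidence(1, 1, 2)
print(half_interval, full_interval)
```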
Create a uniform distribution fit to the mean and standard deviation of the given distribution or sample.
Examples
>>> from sympy.statistics import Uniform
>>> Uniform.fit([1, 2, 3, 4, 5])
Uniform(-sqrt(6) + 3, sqrt(6) + 3)
>>> Uniform.fit([1, 2])
Uniform(-sqrt(3)/2 + 3/2, sqrt(3)/2 + 3/2)
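A uniform distribution on [a, b] has mean (a + b)/2 and standard deviation (b - a)/sqrt(12), so matching a mean m and standard deviation s gives a = m - sqrt(3)*s and b = m + sqrt(3)*s. A numeric sketch of this moment matching, assuming the population standard deviation as the outputs above suggest (fit_uniform is a hypothetical name):

```python
import math
import statistics

def fit_uniform(data):
    """Return (a, b) of the uniform distribution matching mean and stddev."""
    m = statistics.fmean(data)
    s = statistics.pstdev(data)  # population standard deviation
    half_width = math.sqrt(3.0) * s
    return m - half_width, m + half_width

a5, b5 = fit_uniform([1, 2, 3, 4, 5])  # compare with 3 -/+ sqrt(6)
a2, b2 = fit_uniform([1, 2])           # compare with 3/2 -/+ sqrt(3)/2
print((a5, b5), (a2, b2))
```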
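PDF(func, (x, a, b)) represents a continuous probability distribution with density func on the interval (a, b). If func is not normalized so that integrate(func, (x, a, b)) == 1, it can be normalized using the PDF.normalize() method.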
Examples
>>> from sympy import Symbol, exp, oo
>>> from sympy.statistics.distributions import PDF
>>> from sympy.abc import x
>>> a = Symbol('a', positive=True)
>>> exponential = PDF(exp(-x/a)/a, (x,0,oo))
>>> exponential.pdf(x)
exp(-x/a)/a
>>> exponential.cdf(x)
1 - exp(-x/a)
>>> exponential.mean
a
>>> exponential.variance
a**2
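Return the cumulative distribution function as an expression in x. As the examples show, the function passed to PDF is integrated from the lower bound as given, without being normalized first.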
Examples
>>> from sympy.statistics.distributions import PDF
>>> from sympy import exp, oo
>>> from sympy.abc import x, y
>>> PDF(exp(-x/y), (x,0,oo)).cdf(4)
y - y*exp(-4/y)
>>> PDF(2*x + y, (x, 10, oo)).cdf(0)
-10*y - 100
Normalize the probability density function so that integrate(self.pdf(x), (x, a, b)) == 1.
Examples
>>> from sympy import Symbol, exp, oo
>>> from sympy.statistics.distributions import PDF
>>> from sympy.abc import x
>>> a = Symbol('a', positive=True)
>>> exponential = PDF(exp(-x/a), (x,0,oo))
>>> exponential.normalize().pdf(x)
exp(-x/a)/a
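Normalization divides the function by its integral over the domain. For exp(-x/a) on (0, oo) that integral is a, which gives exp(-x/a)/a. The following numeric sketch checks this for a = 2 with a simple midpoint rule on a truncated domain. The helper midpoint_integral and the truncation point are choices of this example, not part of the PDF class:

```python
import math

def midpoint_integral(f, lo, hi, steps=200_000):
    """Approximate the integral of f over [lo, hi] with the midpoint rule."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

a = 2.0
upper = 50.0 * a  # exp(-50) is negligible, so this stands in for infinity

def unnormalized(x):
    return math.exp(-x / a)

# Integral of the raw function: should be close to a.
norm = midpoint_integral(unnormalized, 0.0, upper)
# After dividing by that integral, the total mass should be close to 1.
total = midpoint_integral(lambda x: unnormalized(x) / norm, 0.0, upper)

print(norm, total)
```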
Return the probability distribution of the random variable func(x). Currently only some simple injective functions are supported.
Examples
>>> from sympy.statistics.distributions import PDF
>>> from sympy import oo
>>> from sympy.abc import x, y
>>> PDF(2*x + y, (x, 10, oo)).transform(x, y)
PDF(0, ((_w,), x, x))