# The science of finance/Probabilities and evaluation of risks

## Probabilities in economics

To give an objective meaning to a probability, it is necessary and sufficient that it be measurable. For that it is necessary and sufficient that an experiment be reproducible. By reproducing the same experiment many times, the probabilities of its results are measured from their frequencies of appearance. But in economics experiments are never reproducible. Each economic event is unique. All agents are always different, and the conditions in which they are placed are never exactly the same. Therefore, there can be no objective meaning for economic probabilities. But probabilistic models are of fundamental importance for economics. Where is the error? Are probabilistic models in economics always false because the probabilities in economics have no objective significance?

The uncertainty of games of chance can be considered purely probabilistic, provided they are not rigged, because they are generally reproducible experiments. Economic uncertainty is not strictly probabilistic, but economic decision-making about risk is very similar to gamblers' decision-making. When we face a risk, we reason as if we made a bet with reality. When we attribute probabilities to expected economic events, they can never be true and accurate, but they can still be estimates that lead to better decisions. An insurance company, for example, must estimate probabilities of death. It does so from the statistics of recorded deaths. If it did not do so, it would not be able to anticipate its payment obligations and it would run the risk of bankruptcy.

Probabilistic models in economics can never be strictly true, but they may be sufficiently similar to reality to merit comparison. All that is needed to justify a model is a relevant analogy. And models can be useful even when they are very wrong, because the gap between the model and reality can itself be very informative.

## The measurement of risk

Risk is commonly referred to as a measurable magnitude: the risk is zero in a situation of certainty and increases when uncertainty increases. The ratio of the standard deviation to the average profit of a project is often a good measure of its risk because it increases with uncertainty and because it makes it possible to compare projects of different average profits. But a single magnitude can never suffice to account for the diversity of risks. Standard deviation measures the spread of a probability distribution, but the shape of this distribution is also important for assessing risk. It determines how the probabilities are distributed between very high and moderate gains and between moderate and very high losses. The standard deviation measures spread, but it says nothing about the diversity of shapes. It is therefore useful especially for comparing distributions of the same shape. If for example two distributions are normal (Gaussian), the riskier is clearly the one with the greater standard deviation. But if distributions have different shapes, comparing their standard deviations is not necessarily a good way to compare their risks. In addition, the standard deviation is not the only quantity that measures the spread of a distribution. The mean of the absolute values of the deviations from the mean is sometimes a more natural estimate of the spread than the square root of the mean of the squares of the deviations from the mean, which gives great weight to large deviations at the expense of small ones.
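The two measures of spread discussed above can be compared on a small example. The following is a minimal sketch: the two projects and their equally likely outcomes are hypothetical, and the function names are chosen for illustration only.

```python
from statistics import mean, pstdev

def coefficient_of_variation(profits):
    """Standard deviation divided by the average profit."""
    return pstdev(profits) / mean(profits)

def mean_absolute_deviation(profits):
    """Mean of the absolute values of the deviations from the mean."""
    m = mean(profits)
    return mean([abs(x - m) for x in profits])

# Hypothetical projects; each listed outcome is assumed equally likely.
project_a = [8, 9, 10, 11, 12]    # several small deviations around 10
project_b = [2, 12, 12, 12, 12]   # same mean, one large shortfall

for name, profits in [("A", project_a), ("B", project_b)]:
    sd = pstdev(profits)
    mad = mean_absolute_deviation(profits)
    print(name,
          "sd/mean =", round(coefficient_of_variation(profits), 4),
          "| mean abs deviation =", round(mad, 4),
          "| sd / mean abs deviation =", round(sd / mad, 4))
```

Both projects have the same average profit, but the ratio of the standard deviation to the mean absolute deviation is larger for project B, because squaring gives more weight to its single large deviation.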

When we measure the spread of a distribution, we average over both the chances of winning, when the return is above its average, and the chances of loss, when it is below. The risks of loss are particularly important when one questions the viability of a company or project. We are then particularly attentive to the left side of the probability distribution, while the spread of the right side, the chances of winning, tells us nothing about viability.

Value at risk is an indicator of the risk of loss that is widely used to study the viability of companies.

The p value at risk of a random variable ${\displaystyle X}$ is by definition the number ${\displaystyle V_{X}^{R}(p)}$ such that ${\displaystyle Pr(X\leq -V_{X}^{R}(p))=p}$. The positive values of ${\displaystyle X}$ are profits, the negative ones are losses.

${\displaystyle Pr(X\leq x)=F_{X}(x)}$ is the probability that ${\displaystyle X\leq x}$. ${\displaystyle F_{X}}$ is the cumulative distribution function of the random variable ${\displaystyle X}$. ${\displaystyle F_{X}(x)=\int _{-\infty }^{x}f_{X}(y)dy}$ where ${\displaystyle f_{X}}$ is the probability density of ${\displaystyle X}$.

${\displaystyle V_{X}^{R}(p)=-F_{X}^{-1}(p)}$

The value at risk makes it possible to estimate the loss that may be incurred. We know that with a probability ${\displaystyle 1-p}$, the loss will not be greater than ${\displaystyle V_{X}^{R}(p)}$. ${\displaystyle p}$ is usually a small number, ${\displaystyle 0.05=5\%}$ or less. Losses that have a probability lower than ${\displaystyle p}$ can be considered extraordinary losses. The value at risk is used to estimate the maximum amount of ordinary losses, but it does not say anything about extraordinary losses. If losses have a probability lower than ${\displaystyle p}$, they have no influence on ${\displaystyle V_{X}^{R}(p)}$. Two companies, one of which is exposed to gigantic but rare losses, and not the other, may have the same value at risk.
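The definition ${\displaystyle V_{X}^{R}(p)=-F_{X}^{-1}(p)}$ can be illustrated with a short sketch. The normal case uses the inverse cumulative distribution function from Python's standard library. For discrete outcomes the equation ${\displaystyle Pr(X\leq -V)=p}$ may have no exact solution, so the sketch assumes the usual generalized inverse (the smallest outcome whose cumulative probability reaches ${\displaystyle p}$). The two companies and their figures are hypothetical.

```python
from statistics import NormalDist

def value_at_risk_normal(mean, std, p):
    """p value at risk of a normally distributed profit: -F^{-1}(p)."""
    return -NormalDist(mean, std).inv_cdf(p)

def value_at_risk_discrete(outcomes, p):
    """p value at risk for a list of (profit, probability) pairs.

    Assumption: F^{-1}(p) is taken as the smallest profit whose
    cumulative probability is at least p.
    """
    cumulative = 0.0
    for profit, probability in sorted(outcomes):
        cumulative += probability
        if cumulative >= p:
            return -profit
    raise ValueError("probabilities sum to less than p")

# Hypothetical companies: identical except for a rare, extreme loss.
company_a = [(-100, 0.01), (-5, 0.09), (10, 0.90)]
company_b = [(-1_000_000, 0.01), (-5, 0.09), (10, 0.90)]

print("normal profit, mean 1, sd 2:", round(value_at_risk_normal(1.0, 2.0, 0.05), 4))
print("company A:", value_at_risk_discrete(company_a, 0.05))
print("company B:", value_at_risk_discrete(company_b, 0.05))
```

The two companies have the same 5% value at risk although company B is exposed to a far larger rare loss, which is the limitation described above.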

Risk indicators such as the standard deviation, the mean of the absolute values of the deviations from the mean, or the value at risk can never provide complete information on the risks to be faced. To be completely informed about the risk, it is necessary to know the probability density of the expected profits and losses or, equivalently, their cumulative distribution function.

## The price of risk

In general, it is believed that risk decreases the value of a project. The risks of low profit and especially of loss are generally dissuasive, even if the chances of a high profit make the average profit substantial. Of two projects that have the same average profit, the riskier is usually the less valuable. This decrease in value is a risk premium. In order for a risky project to be attractive, it must generally yield more than the risk-free interest rate: it must have a sufficient average profit to offset risk taking. The risk premium can be calculated from the surplus profit rate required to offset the risk, i.e. the difference between the required rate of profit and the risk-free interest rate.

Risk aversion is not universal. There are many exceptions. The risk may not diminish the value of a project but instead be sought for itself. In most games of chance, for example, one always loses on average, the average profits are negative, but even very small chances of a large gain are enough to convince the gamblers to participate. The value they attribute to the sought risk offsets their average loss.

Since economic agents are not equal in the face of risks, they evaluate them very differently. This promotes risk trading. The same risk that is very dissuasive for one agent may matter very little to another. The second then has an interest in selling the first a guarantee against that risk.

## Discounting uncertain projects

A risk-adjusted discount rate is sometimes defined by adding to the risk-free discount rate the surplus profit rate which measures the risk premium. But to calculate the net present value of an uncertain project, this definition makes sense only if there is only one initial cost and one final revenue. If there are uncertain losses, it is obviously foolish to discount them at a rate higher than the risk-free rate, since that would make them look smaller. And when uncertain payments are staggered over time, there is usually no reason to discount them at the same risk-adjusted rate, because they may have very different uncertainties.

The discount rate is the risk-free interest rate. It is estimated from the rates of investments considered safe, but this does not prevent it from being applied to the evaluation of uncertain projects. It is an exchange rate between the money paid today and the money paid later, a time exchange rate. Whether a project is certain and risk-free or very uncertain and very risky should not change anything in the way of discounting the payments. To evaluate risky projects, all revenues and costs must be discounted at the risk-free investment rate, because this is the best way to add payments staggered over time. Whether or not these payments were predicted with certainty does not change the case. The discount rate depends on the economic reality at a given moment, not on the projects to which it is applied. It is the same for all projects facing the same reality, whether they are risky or not.

If a project is risky, and if the probabilities of revenue and costs are known, we can calculate with the true discount rate, not a risk-adjusted rate, an average discounted value of revenues, cash advances and other costs, and therefore an average profit, an average profit rate and an average net present value, but this is not the net present value of the project, because it does not take into account the risk premium.

The net present value of an uncertain project is its average net present value minus the risk premium. The risk premium is the present value of the average surplus profit required to offset the risk. It is the present value of the difference between the average profit required and the profit that would be obtained if one placed one's capital at the risk-free interest rate. Since the average net present value is the present value of the average surplus profit, the net present value of an uncertain project is the present value of the difference between the average surplus profit and the surplus profit required to offset the risk, hence the present value of the difference between the average profit and the profit required to offset the risk.
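As an illustration, here is a minimal sketch for the simplest case only: one certain initial cost and one uncertain revenue received a year later. All figures, and in particular the required surplus profit rate, are hypothetical inputs; the text does not say how that rate is obtained at this point.

```python
def average_npv(initial_cost, mean_revenue, risk_free_rate):
    """Average revenue discounted at the risk-free rate, minus the initial cost."""
    return -initial_cost + mean_revenue / (1 + risk_free_rate)

def risk_premium(initial_cost, required_surplus_rate, risk_free_rate):
    """Present value of the surplus profit required to offset the risk."""
    return initial_cost * required_surplus_rate / (1 + risk_free_rate)

def npv_uncertain(initial_cost, mean_revenue, risk_free_rate, required_surplus_rate):
    """Net present value of the uncertain project: average NPV minus risk premium."""
    return (average_npv(initial_cost, mean_revenue, risk_free_rate)
            - risk_premium(initial_cost, required_surplus_rate, risk_free_rate))

# Hypothetical one-year project.
cost, mean_revenue = 100.0, 112.0
r_f, surplus_rate = 0.02, 0.05

print("average NPV :", round(average_npv(cost, mean_revenue, r_f), 4))
print("risk premium:", round(risk_premium(cost, surplus_rate, r_f), 4))
print("NPV         :", round(npv_uncertain(cost, mean_revenue, r_f, surplus_rate), 4))
```

In this one-period case the result equals the present value of the difference between the average profit, 12, and the profit required to offset the risk, 7, both discounted at the risk-free rate.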

## Statistical independence, anticorrelation and risk reduction

There are two techniques to reduce risk. One makes use of the statistical independence of economic events, the other their anticorrelations.

• When many projects are statistically independent, their sum is much less risky than each individual project. A simple example suffices to be convinced: if ${\displaystyle n}$ statistically independent projects have the same expected average profit ${\displaystyle \pi }$ with the same standard deviation ${\displaystyle \sigma }$, ${\displaystyle \sigma /\pi }$ measures the risk of an individual project. Their sum has a mean expected profit ${\displaystyle n\pi }$ with a standard deviation ${\displaystyle \sigma {\sqrt {n}}}$, because the variance of a sum of independent variables is the sum of their variances. The risk of the sum of all projects is therefore measured by ${\displaystyle {\frac {\sigma {\sqrt {n}}}{n\pi }}={\frac {\sigma }{\pi {\sqrt {n}}}}}$ which approaches zero if ${\displaystyle n}$ is very large. In general, projects do not have the same expected average profits or the same risks, but if they are statistically independent, the risk of their sum is generally much lower than the risk of each of them taken in isolation, and approaches zero when the number of projects is very large.
• Two projects are anticorrelated, or negatively correlated, when the success of one is correlated with the failure of the other. The covariance of their profits is negative. For example, a rise in the price of a stock is correlated with a decline in the value of a put option on that stock. Positive and negative correlations between the values of assets and the corresponding options make it possible to build risk-free portfolios, which form the basis of the option pricing methodology of Merton, Black and Scholes presented later. Positive correlations can be exploited by betting on a decline, which can be done by selling short, buying put options or selling call options. Selling short means selling borrowed shares while committing to buy them back later in order to return them. Risk hedging strategies generally involve betting at the same time on the rise of certain assets and on the decline of others. Assets are often positively correlated: their values often increase or decrease at the same time. A bet on the rise of some is therefore often anticorrelated with a bet on the decline of others. Hedging strategies thus reduce the risks associated with changes in the value of assets. One tries to win on both counts, both when the market is rising and when it is falling.
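The effect of statistical independence described in the first point can be checked numerically. This sketch only applies the rule that the variance of a sum of independent variables is the sum of their variances; the values of ${\displaystyle \pi }$ and ${\displaystyle \sigma }$ are hypothetical.

```python
import math

def relative_risk_of_sum(means, stds):
    """Standard deviation over mean for a sum of independent projects."""
    total_mean = sum(means)
    # Independence: the variance of the sum is the sum of the variances.
    total_std = math.sqrt(sum(s ** 2 for s in stds))
    return total_std / total_mean

pi, sigma = 10.0, 5.0  # hypothetical average profit and standard deviation

for n in (1, 100, 10_000):
    ratio = relative_risk_of_sum([pi] * n, [sigma] * n)
    print(n, "projects: risk of the sum =", round(ratio, 4),
          "| sigma / (pi * sqrt(n)) =", round(sigma / (pi * math.sqrt(n)), 4))
```

The function also accepts projects with different means and standard deviations, as long as they are assumed independent.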

## The casino economy

A project is usually affected by many uncertain events that can increase or decrease its profit. Two types of events must be distinguished, those that affect only the current profit of the project, unforeseen costs or revenues that momentarily affect the profit of a project but have no influence on its future profits, and those that permanently affect the project's ability to make a profit. The effects of events of the first type must be added to obtain their cumulative effect, while those of the second type must be multiplied.

Suppose that a project is affected only by events of the second type: many random events ${\displaystyle i}$ can influence its value ${\displaystyle V}$ during a time interval ${\displaystyle dt}$. Each event ${\displaystyle i}$ if it occurs multiplies ${\displaystyle V}$ by a factor ${\displaystyle R_{i}>1}$ if it promotes the success of the project and ${\displaystyle <1}$ if it is unfavorable. We assume that the events ${\displaystyle i}$ have probabilities ${\displaystyle p_{i}=\rho _{i}dt}$ and that they are independent. The variation of ${\displaystyle \ln V}$ is then a sum of random variations ${\displaystyle \ln R_{i}}$. According to the central limit theorem, if these variations are very numerous, small compared to their sum and independent, the distribution of probabilities of this sum is a normal law. Let ${\displaystyle \mu dt}$ be its average and ${\displaystyle \sigma ^{2}dt}$ its variance. If the project environment is constant, ${\displaystyle \mu }$ and ${\displaystyle \sigma }$ do not depend on ${\displaystyle t}$. On a time ${\displaystyle T}$, the variation of ${\displaystyle \ln V}$ then follows a normal distribution of mean ${\displaystyle \mu T}$ and variance ${\displaystyle \sigma ^{2}T}$.

Let ${\displaystyle V_{0}}$ be the initial value of the project. Since ${\displaystyle \ln V}$ follows a normal distribution of mean ${\displaystyle \ln V_{0}+\mu T}$ and of variance ${\displaystyle \sigma ^{2}T}$, ${\displaystyle V}$ follows a log-normal law whose probability density is:

${\displaystyle f(V)={\frac {1}{V\sigma {\sqrt {T}}{\sqrt {2\pi }}}}e^{-(\ln {\frac {V}{V_{0}}}-\mu T)^{2}/2T\sigma ^{2}}}$

The average value of ${\displaystyle V}$ is ${\displaystyle V_{0}e^{(\mu +\sigma ^{2}/2)T}}$ and its standard deviation is ${\displaystyle V_{0}e^{(\mu +\sigma ^{2}/2)T}{\sqrt {e^{\sigma ^{2}T}-1}}}$. If we measure the risk by the ratio of the standard deviation to the mean, we obtain ${\displaystyle {\sqrt {e^{\sigma ^{2}T}-1}}}$, which increases very quickly with ${\displaystyle T}$.
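These closed-form expressions can be compared with a simulation. The sketch below draws ${\displaystyle \ln(V/V_{0})}$ from a normal distribution with mean ${\displaystyle \mu T}$ and variance ${\displaystyle \sigma ^{2}T}$ and compares the sample mean and standard deviation of ${\displaystyle V}$ with the formulas. The parameter values, the sample size and the random seed are arbitrary choices.

```python
import math
import random

def lognormal_mean(v0, mu, sigma, t):
    """Average of V: V0 * exp((mu + sigma^2 / 2) * T)."""
    return v0 * math.exp((mu + sigma ** 2 / 2) * t)

def lognormal_std(v0, mu, sigma, t):
    """Standard deviation of V: mean * sqrt(exp(sigma^2 * T) - 1)."""
    return lognormal_mean(v0, mu, sigma, t) * math.sqrt(math.exp(sigma ** 2 * t) - 1)

v0, mu, sigma, t = 100.0, 0.10, 0.15, 5.0  # hypothetical values
rng = random.Random(0)                     # fixed seed for repeatability
n = 100_000

# ln(V / V0) is normal with mean mu*T and standard deviation sigma*sqrt(T).
samples = [v0 * math.exp(rng.gauss(mu * t, sigma * math.sqrt(t))) for _ in range(n)]

sample_mean = math.fsum(samples) / n
sample_std = math.sqrt(math.fsum((x - sample_mean) ** 2 for x in samples) / n)

print("mean: formula", round(lognormal_mean(v0, mu, sigma, t), 2),
      "| simulation", round(sample_mean, 2))
print("std : formula", round(lognormal_std(v0, mu, sigma, t), 2),
      "| simulation", round(sample_std, 2))
```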

Suppose now that a project is affected only by events of the first type. Many random events ${\displaystyle i}$ can influence the current profit ${\displaystyle d\Pi }$ during a time interval ${\displaystyle dt}$. Each event ${\displaystyle i}$ if it occurs increases ${\displaystyle d\Pi }$ by an amount ${\displaystyle d\Pi _{i}>0}$ if it is a receipt and ${\displaystyle <0}$ if it is a cost. We assume that the events ${\displaystyle i}$ have probabilities ${\displaystyle p_{i}=\rho _{i}dt}$ and that they are independent. The current profit ${\displaystyle d\Pi }$ during ${\displaystyle dt}$ is then a sum of random variations ${\displaystyle d\Pi _{i}}$. According to the central limit theorem, if these variations are very numerous, small compared to their sum and independent, the distribution of probabilities of this sum ${\displaystyle d\Pi =\sum _{i}d\Pi _{i}}$ is a normal law. Let ${\displaystyle \mu dt}$ be its mean and ${\displaystyle \sigma ^{2}dt}$ its variance. If the project environment is constant, ${\displaystyle \mu }$ and ${\displaystyle \sigma }$ do not depend on ${\displaystyle t}$. Over a duration ${\displaystyle T}$, the profit ${\displaystyle \Pi }$ follows a normal distribution with mean ${\displaystyle \mu T}$ and variance ${\displaystyle \sigma ^{2}T}$. The ratio of the standard deviation to the mean is ${\displaystyle {\frac {\sigma }{\mu {\sqrt {T}}}}}$ and approaches zero when ${\displaystyle T}$ is large.

Over long periods of time, the risk caused by events of the second type tends to dominate that of events of the first type. To evaluate the risks we can therefore often ignore events of the first type, which leads us to retain a log-normal distribution. But this rule is not universal. If the risks associated with events of the first type are important, they should not be ignored, especially if the time considered is short.
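The opposite behavior of the two ratios over time can be tabulated. In this sketch the parameters of the two models are unrelated hypothetical values: ${\displaystyle \sigma }$ for the multiplicative model refers to ${\displaystyle \ln V}$, while ${\displaystyle \mu }$ and ${\displaystyle \sigma }$ for the additive model refer to the profit itself.

```python
import math

def multiplicative_ratio(sigma, t):
    """Std over mean of V for events of the second type: sqrt(exp(sigma^2 T) - 1)."""
    return math.sqrt(math.exp(sigma ** 2 * t) - 1)

def additive_ratio(mu, sigma, t):
    """Std over mean of the profit for events of the first type: sigma / (mu sqrt(T))."""
    return sigma / (mu * math.sqrt(t))

sigma_log = 0.15                      # hypothetical volatility of ln V per year
mu_profit, sigma_profit = 10.0, 20.0  # hypothetical profit mean and std per year

for years in (1, 4, 25, 100):
    print(years, "years:",
          "multiplicative =", round(multiplicative_ratio(sigma_log, years), 3),
          "| additive =", round(additive_ratio(mu_profit, sigma_profit, years), 3))
```

With these values the additive ratio is the larger one over short horizons and the multiplicative ratio the larger one over long horizons, which matches the caveat that events of the first type should not be ignored when the time considered is short.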

All the assumptions that justify the normal distribution of ${\displaystyle \Pi }$ or the log-normal distribution of ${\displaystyle V}$ are a priori very doubtful. The ${\displaystyle d\Pi _{i}}$ and the ${\displaystyle \ln R_{i}}$ are not necessarily small compared to their sum, because a single event can have a great influence on the success or failure of a project. Nor are they necessarily independent, because failure can lead to more failures and success to more success, or failure can precede success through resilience, or success can lead to failure, because greatness sometimes precedes the fall. Moreover, there is generally no reason to assume that the probabilities ${\displaystyle p_{i}=\rho _{i}dt}$ are constant over time, because economic events depend on circumstances that vary. In fact, there is usually no reason to believe that these probabilities can be properly defined, because economic events are never exactly reproducible. The real economy can be very different from the casino economy, which is just a mathematical model that can help us understand reality but can also mislead us.

For a log-normal distribution, the ratio ${\displaystyle {\sqrt {e^{\sigma ^{2}T}-1}}}$ of the standard deviation to the mean is not necessarily a good indicator of the risk, especially when the distribution is very flattened, for values of ${\displaystyle \sigma ^{2}T>1}$. In general, one prefers ${\displaystyle \sigma {\sqrt {T}}}$, which is also an indicator of the dispersion of expected values around the mean. ${\displaystyle \sigma {\sqrt {T}}}$ is the standard deviation of the logarithmic yield. It is called volatility. If time is measured in years, ${\displaystyle \sigma }$ is the annual volatility.

In this section we have reasoned about the value of a project, but the same reasoning applies to prices on the stock market. In that case the log-normal distribution is a fairly good approximation, but it slightly underestimates the probabilities of large deviations from the mean (Luenberger 1997).

To reason more easily, we will make several simplifying hypotheses in the following:

• Stock price changes are always represented by log-normal distributions whose parameters ${\displaystyle \mu }$ and ${\displaystyle \sigma }$ do not vary.
• We reason on stocks that do not pay dividends. The profits are therefore only the capital gains. This amounts to assuming that the paid dividends are systematically reinvested.
• Inflation is ignored. This means that we reason on the real values and not on the nominal ones.
• Transaction costs and taxes are ignored.
• We assume that there is a single risk-free interest rate. We therefore ignore its term structure. Above all, we ignore that even the "risk free" interest rate is risky because even if there is no risk of default by the borrower, there is always a risk of inflation.

These simplifying hypotheses make it possible to concentrate the reasoning on some of the most important points, but if we want to apply the conclusions to the real world, we must of course be aware of the complications that the theory ignores.

## Arithmetic, logarithmic and average yields

Arithmetic yield, or simply yield, is the rate of profit of an asset. If ${\displaystyle V_{i}}$ is its initial value and ${\displaystyle V_{f}}$ its final value, the profit is ${\displaystyle V_{f}-V_{i}}$ and the profit rate is ${\displaystyle r={\frac {V_{f}-V_{i}}{V_{i}}}}$ therefore ${\displaystyle V_{f}=V_{i}(1+r)}$

The logarithmic yield is ${\displaystyle R=\ln({\frac {V_{f}}{V_{i}}})=\ln(1+r)}$ therefore ${\displaystyle V_{f}=V_{i}e^{R}}$

When ${\displaystyle r\ll 1}$, ${\displaystyle R\approx r}$

The logarithmic yield is more convenient than the arithmetic yield when one has to compose yields for successive periods:

${\displaystyle r_{12}=(1+r_{1})(1+r_{2})-1=r_{1}+r_{2}+r_{1}r_{2}}$

while

${\displaystyle R_{12}=\ln(e^{R_{1}}e^{R_{2}})=R_{1}+R_{2}}$

Arithmetic yield is more convenient than logarithmic yield when calculating a portfolio yield:

${\displaystyle r_{pf}={\frac {\sum _{n=1}^{N}V_{ni}r_{n}}{\sum _{n=1}^{N}V_{ni}}}=\sum _{n=1}^{N}\omega _{n}r_{n}}$

where the ${\displaystyle V_{ni}}$ are the initial values of the ${\displaystyle N}$ assets that make up the portfolio and ${\displaystyle r_{n}}$ their respective yields. ${\displaystyle \omega _{n}}$ is the weight of the asset ${\displaystyle n}$ in the portfolio.

${\displaystyle R_{pf}=\ln(1+r_{pf})\neq \sum _{n=1}^{N}\omega _{n}R_{n}}$
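The composition and portfolio rules above can be checked on a small example. This is a sketch with hypothetical yields and portfolio values; the function names are illustrative.

```python
import math

def log_from_arithmetic(r):
    """Logarithmic yield R = ln(1 + r)."""
    return math.log(1 + r)

def portfolio_yield(initial_values, yields):
    """Arithmetic yield of a portfolio: weighted sum of the asset yields."""
    total = sum(initial_values)
    return sum(v * r for v, r in zip(initial_values, yields)) / total

# Two successive periods (hypothetical arithmetic yields).
r1, r2 = 0.10, -0.05
r12 = (1 + r1) * (1 + r2) - 1                             # arithmetic yields compose by product
R12 = log_from_arithmetic(r1) + log_from_arithmetic(r2)   # logarithmic yields simply add
print("r12 =", round(r12, 6), "| R12 =", round(R12, 6),
      "| ln(1 + r12) =", round(math.log(1 + r12), 6))

# A two-asset portfolio (hypothetical initial values and yields).
values, yields = [100.0, 300.0], [0.10, 0.02]
r_pf = portfolio_yield(values, yields)
weights = [v / sum(values) for v in values]
weighted_log = sum(w * log_from_arithmetic(r) for w, r in zip(weights, yields))
print("r_pf =", round(r_pf, 6),
      "| R_pf =", round(math.log(1 + r_pf), 6),
      "| weighted sum of R_n =", round(weighted_log, 6))
```

The weighted sum of the logarithmic yields differs from the logarithmic yield of the portfolio, which is why the arithmetic yield is the convenient one for portfolios.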

When ${\displaystyle R}$ is a random variable, the average of the logarithmic yield is not the logarithmic yield calculated from the average profit. That is why the logarithmic average yield ${\displaystyle R_{m}}$ is defined not as the average of the logarithmic yield but as the logarithm of the ratio between the average final value and the initial value:

${\displaystyle R_{m}=\ln({\frac {\langle V_{f}\rangle }{V_{i}}})\neq \langle R\rangle }$

If ${\displaystyle {\frac {V_{f}}{V_{i}}}}$ has a log-normal distribution with parameters ${\displaystyle \mu }$ and ${\displaystyle \sigma }$, the logarithmic average yield is ${\displaystyle \mu +\sigma ^{2}/2}$ because ${\displaystyle e^{\mu +\sigma ^{2}/2}=\langle {\frac {V_{f}}{V_{i}}}\rangle ={\frac {\langle V_{f}\rangle }{V_{i}}}}$, but the average of the logarithmic yield is ${\displaystyle \mu }$, because ${\displaystyle \ln({\frac {V_{f}}{V_{i}}})}$ has a normal distribution with mean ${\displaystyle \mu }$. The standard deviation of the logarithmic yield is ${\displaystyle \sigma }$. It measures the dispersion of the logarithmic yield around its mean ${\displaystyle \mu }$, not around the logarithmic average yield ${\displaystyle \mu +\sigma ^{2}/2}$, but the difference is often quite small. For a risky asset, ${\displaystyle \mu =0.1}$ and ${\displaystyle \sigma =0.15}$ are typical values for the annual logarithmic yield, and ${\displaystyle \mu +\sigma ^{2}/2\approx 0.11}$.
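The difference between the logarithmic average yield and the average of the logarithmic yield shows up clearly in a two-outcome example. In this sketch the asset and its probabilities are hypothetical; the last lines reuse the values ${\displaystyle \mu =0.1}$ and ${\displaystyle \sigma =0.15}$ quoted above.

```python
import math

# Hypothetical asset: V_f / V_i is 1.5 or 0.5 with equal probability.
ratios = [1.5, 0.5]
probabilities = [0.5, 0.5]

# Average of the logarithmic yield: <R> = <ln(V_f / V_i)>.
average_of_log_yield = sum(p * math.log(x) for p, x in zip(probabilities, ratios))

# Logarithmic average yield: R_m = ln(<V_f> / V_i).
log_average_yield = math.log(sum(p * x for p, x in zip(probabilities, ratios)))

print("<R> =", round(average_of_log_yield, 4), "| R_m =", round(log_average_yield, 4))

# Log-normal case with the values quoted in the text.
mu, sigma = 0.10, 0.15
log_average_yield_lognormal = mu + sigma ** 2 / 2
print("mu =", mu, "| mu + sigma^2 / 2 =", round(log_average_yield_lognormal, 5))
```

In the two-outcome case the average final value equals the initial value, so ${\displaystyle R_{m}}$ is zero, while the average of the logarithmic yield is negative.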

## The evolution of the expected value of an asset

It is assumed that the changes in the price of an asset are random and are described by the log-normal distribution presented above.

If ${\displaystyle V_{0}}$ is the present value of the asset, its future value ${\displaystyle V_{T}}$ on date ${\displaystyle T}$ has a log-normal probability density:

${\displaystyle f(V)={\frac {1}{V\sigma {\sqrt {T}}{\sqrt {2\pi }}}}e^{-(\ln V-\ln V_{0}-\mu T)^{2}/2T\sigma ^{2}}}$

The average, or expected value, of future prices is

${\displaystyle \langle V_{T}\rangle =e^{\ln V_{0}+\mu T+\sigma ^{2}T/2}=V_{0}e^{(\mu +\sigma ^{2}/2)T}}$

## The Capital Asset Pricing Model

The Capital Asset Pricing Model (CAPM) is a simplified model that shows how idealized financial markets could evaluate risk premiums. The main lesson of this model is that a risk premium does not depend on the standard deviation of the value of an asset but on its covariance with the general economic conditions. This measures a non-eliminable risk while the standard deviation includes a risk eliminable by diversification.

If all economic projects were statistically independent, we could always eliminate their risks through diversification. A mutual fund could build an almost risk-free portfolio from a very large number of risky assets. Since the risk would be almost eliminated, its cost would be negligible and the risk premiums would be almost zero. All risky projects would then be attractive as soon as their yield is at least the risk-free interest rate. But economic projects are not usually independent. On the contrary, they generally depend on the same economic conditions, because all agents are sensitive to general prosperity or its absence, so they are generally correlated with each other. It is not always possible to eliminate risk by diversification. Risky projects whose risk cannot be eliminated must have a higher average yield. The Capital Asset Pricing Model measures non-eliminable risks and the risk premiums they generate. It is based on some simplifying assumptions:

• Agents always measure the performance of a portfolio based on its average yield, and the risk based on the standard deviation of that yield. They are all equally informed about average asset yields, their standard deviations and their covariances.
• They can lend and borrow at the risk free rate freely and without cost.
• They can always build a diversified portfolio. All weighted sums of assets are potential portfolios.

### The half-line of optimal portfolios

Let ${\displaystyle r_{m}}$ be a yield (arithmetic, not logarithmic) higher than the risk-free interest rate ${\displaystyle r_{f}}$. Let ${\displaystyle \Omega (r_{m})}$ be the set of all risky portfolios whose average yield is ${\displaystyle r_{m}}$. Let ${\displaystyle \sigma _{min}(r_{m})}$ be the standard deviation of the yield of an optimal portfolio of ${\displaystyle \Omega (r_{m})}$: for any portfolio in ${\displaystyle \Omega (r_{m})}$ the standard deviation of its yield is greater than or equal to ${\displaystyle \sigma _{min}(r_{m})}$. Then the half-line of optimal portfolios in the half-plane ${\displaystyle (\sigma ,r)}$ is given by the equation:

${\displaystyle r=r_{f}+(r_{m}-r_{f}){\frac {\sigma }{\sigma _{min}(r_{m})}}}$

The points ${\displaystyle (\sigma ,r)}$ of all possible portfolios are on or below this half-line. All points on this half-line are the ${\displaystyle (\sigma ,r)}$ of optimal portfolios. They are the least risky for a given expected yield, and the most profitable for a given risk.

Proof: Let us first show that all the points of the half-line are possible portfolios. Consider a portfolio consisting of ${\displaystyle \alpha }$ at the risk-free rate and ${\displaystyle (1-\alpha )}$ of the optimal portfolio at the rate ${\displaystyle r_{m}}$. Its average yield is ${\displaystyle \alpha r_{f}+(1-\alpha )r_{m}}$, the standard deviation of this yield is ${\displaystyle (1-\alpha )\sigma _{min}(r_{m})}$, therefore it is on the half-line of the optimal portfolios. If ${\displaystyle 0\leq \alpha \leq 1}$, ${\displaystyle r}$ is between ${\displaystyle r_{f}}$ and ${\displaystyle r_{m}}$. If ${\displaystyle \alpha <0}$, ${\displaystyle r>r_{m}}$. In this case, one borrows funds at the risk-free rate ${\displaystyle r_{f}}$ to invest them at the risky rate ${\displaystyle r_{m}}$, and increases the average yield by leverage.

Let ${\displaystyle (\sigma ,r)}$ be a point above the half-line of optimal portfolios and assume that it represents a possible portfolio. By the same argument as above, the half-line that starts from ${\displaystyle (0,r_{f})}$ and goes through ${\displaystyle (\sigma ,r)}$ would also represent possible portfolios. In particular ${\displaystyle (\sigma {\frac {r_{m}-r_{f}}{r-r_{f}}},r_{m})}$ would represent a possible portfolio. Since ${\displaystyle (\sigma ,r)}$ is above the half-line of optimal portfolios:

${\displaystyle {\frac {r-r_{f}}{\sigma }}>{\frac {r_{m}-r_{f}}{\sigma _{min}(r_{m})}}}$

hence

${\displaystyle \sigma _{min}(r_{m})>\sigma {\frac {r_{m}-r_{f}}{r-r_{f}}}}$

But this contradicts the definition of ${\displaystyle \sigma _{min}(r_{m})}$, which is the smallest of the standard deviations of the possible portfolios with the same yield ${\displaystyle r_{m}}$. So ${\displaystyle (\sigma ,r)}$ cannot represent a possible portfolio if it is above the half-line of optimal portfolios.

It can be concluded that:

${\displaystyle {\frac {\sigma _{min}(r_{m})}{\sigma _{min}(r_{M})}}={\frac {r_{m}-r_{f}}{r_{M}-r_{f}}}}$

for all ${\displaystyle r_{m}}$ and ${\displaystyle r_{M}}$ greater than ${\displaystyle r_{f}}$, since ${\displaystyle (\sigma _{min}(r_{m}),r_{m})}$ and ${\displaystyle (\sigma _{min}(r_{M}),r_{M})}$ are both on the half-line of optimal portfolios.
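The first step of the proof can be checked numerically: every mix of the risk-free asset and an optimal portfolio lies on the half-line. The rates and the standard deviation in this sketch are hypothetical, and the formula used for the standard deviation of the mix assumes ${\displaystyle \alpha \leq 1}$.

```python
def mix(alpha, r_f, r_m, sigma_min):
    """(sigma, r) of a portfolio with alpha at the risk-free rate and (1 - alpha) in the optimal portfolio."""
    r = alpha * r_f + (1 - alpha) * r_m
    sigma = (1 - alpha) * sigma_min  # valid for alpha <= 1
    return sigma, r

def half_line(sigma, r_f, r_m, sigma_min):
    """Average yield on the half-line of optimal portfolios for a given sigma."""
    return r_f + (r_m - r_f) * sigma / sigma_min

r_f, r_m, sigma_min = 0.02, 0.08, 0.15  # hypothetical values

for alpha in (1.0, 0.5, 0.0, -0.5):     # alpha < 0 means borrowing at r_f (leverage)
    sigma, r = mix(alpha, r_f, r_m, sigma_min)
    print("alpha =", alpha,
          "| sigma =", round(sigma, 4),
          "| r =", round(r, 4),
          "| half-line r =", round(half_line(sigma, r_f, r_m, sigma_min), 4))
```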

### The hypothesis of optimal markets

The market portfolio includes in principle all the assets, risky or not, that can be traded in an economy. It includes all the wealth that can make a profit, securities or real estate, as soon as they have a market price. Its average yield depends on the prosperity of the economy. Its standard deviation represents the general risk that all agents face, because it represents the risk of a general lack of prosperity. Because the market portfolio is the most diversified, it eliminates all the risks that can be eliminated through diversification. Very diversified portfolios, such as the S&P 500, also eliminate most of the risks that can be eliminated through diversification.

The hypothesis of optimal markets postulates that the market portfolio is on the half-line of optimal portfolios.

This hypothesis is obviously wrong, since agents are not always rational when they invest their money. But as a first approximation we can assume that very diversified portfolios like the S&P 500 are not very different from an optimal portfolio. We then place the half-line of optimal portfolios by estimating the average yield of the S&P 500, or another well-diversified and well-chosen portfolio, and its standard deviation.

### The covariance with the general economic conditions

If we assume that the market portfolio is optimal and if ${\displaystyle r_{M}}$ is its average return, ${\displaystyle \sigma _{min}(r_{M})}$ is the standard deviation ${\displaystyle \sigma _{M}}$ of its yield.

The ${\displaystyle \beta _{i}}$ of an asset ${\displaystyle i}$ is by definition:

${\displaystyle \beta _{i}={\frac {Cov(x_{i},x_{M})}{\sigma _{M}^{2}}}}$

where ${\displaystyle x_{i}}$ and ${\displaystyle x_{M}}$ are the random yields of the asset ${\displaystyle i}$ and the market portfolio respectively.

${\displaystyle \beta _{i}}$ is proportional to the covariance with the yield of the market portfolio, so with general economic conditions.

We will show that

${\displaystyle \beta _{i}={\frac {r_{i}-r_{f}}{r_{M}-r_{f}}}}$

where ${\displaystyle r_{i}}$ is the average yield of the asset ${\displaystyle i}$.

Consider a portfolio that contains a fraction ${\displaystyle \alpha }$ of the asset ${\displaystyle i}$ and ${\displaystyle (1-\alpha )}$ of an optimal portfolio with the same average return ${\displaystyle r_{i}}$. The average yield of this portfolio is ${\displaystyle r_{i}}$ for every ${\displaystyle \alpha }$. Let ${\displaystyle v(\alpha )}$ be the variance of its yield. Since the optimal portfolio has the smallest variance among all portfolios with average yield ${\displaystyle r_{i}}$, ${\displaystyle v(\alpha )}$ is minimal at ${\displaystyle \alpha =0}$, so ${\displaystyle {\frac {dv}{d\alpha }}(0)=0}$.

Let ${\displaystyle x_{i}}$ be the random yield of the asset ${\displaystyle i}$ and ${\displaystyle x_{opt}}$ the random yield of the optimal portfolio with the same average return.

${\displaystyle r_{i}=E(x_{i})=E(x_{opt})}$

${\displaystyle {\frac {dv}{d\alpha }}={\frac {d}{d\alpha }}[\alpha ^{2}\sigma _{i}^{2}+2\alpha (1-\alpha )Cov(x_{i},x_{opt})+(1-\alpha )^{2}\sigma _{opt}^{2}]}$

${\displaystyle =2\alpha \sigma _{i}^{2}+2(1-2\alpha )Cov(x_{i},x_{opt})+(2\alpha -2)\sigma _{opt}^{2}}$

${\displaystyle {\frac {dv}{d\alpha }}(0)=2Cov(x_{i},x_{opt})-2\sigma _{opt}^{2}=0}$

hence

${\displaystyle Cov(x_{i},x_{opt})=\sigma _{opt}^{2}=\sigma _{min}(r_{i})^{2}}$

The portfolio with yield ${\displaystyle x_{opt}}$ can consist of ${\displaystyle {\frac {r_{i}-r_{f}}{r_{M}-r_{f}}}=\beta _{i}'}$ of the market portfolio and ${\displaystyle {\frac {r_{M}-r_{i}}{r_{M}-r_{f}}}=(1-\beta _{i}')}$ of a risk-free asset. So

${\displaystyle Cov(x_{i},x_{opt})=\beta _{i}'Cov(x_{i},x_{M})}$

Now

${\displaystyle Cov(x_{i},x_{opt})=\sigma _{min}(r_{i})^{2}=\beta _{i}'^{2}\sigma _{M}^{2}}$

Therefore

${\displaystyle \beta _{i}'={\frac {Cov(x_{i},x_{M})}{\sigma _{M}^{2}}}=\beta _{i}}$

### ${\displaystyle \beta }$ measures the risk which cannot be eliminated by diversification

We define the random variable ${\displaystyle \epsilon _{i}}$ by

${\displaystyle x_{i}=r_{f}+\beta _{i}(x_{M}-r_{f})+\epsilon _{i}}$

Then

${\displaystyle E(\epsilon _{i})=E(x_{i})-\beta _{i}E(x_{M}-r_{f})-r_{f}=0}$

${\displaystyle Cov(\epsilon _{i},x_{M})=Cov(x_{i}-\beta _{i}x_{M}-(1-\beta _{i})r_{f},x_{M})=Cov(x_{i},x_{M})-\beta _{i}Cov(x_{M},x_{M})=Cov(x_{i},x_{M})-\beta _{i}\sigma _{M}^{2}=0}$

${\displaystyle \sigma _{i}^{2}=\beta _{i}^{2}\sigma _{M}^{2}+\sigma _{\epsilon _{i}}^{2}}$

The variance of ${\displaystyle x_{i}}$ is the sum of two terms. ${\displaystyle \beta _{i}^{2}\sigma _{M}^{2}}$ represents the risk that cannot be eliminated by diversification. ${\displaystyle \sigma _{\epsilon _{i}}^{2}}$ represents the risk that can be eliminated by diversification, because it is the variance of a random variable that is not correlated with the market portfolio.
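This decomposition can be checked numerically. The following Python sketch simulates yields with purely illustrative parameters (market yield of mean 8% and standard deviation 15%, a ${\displaystyle \beta }$ of 1.3, a specific risk of 10%, a risk-free rate of 2%; none of these values comes from the text), estimates ${\displaystyle \beta _{i}}$ from the sample covariance, and verifies that the residual ${\displaystyle \epsilon _{i}}$ is uncorrelated with the market and that the two variance terms add up.

```python
import random

random.seed(0)
n = 20_000
r_f = 0.02            # assumed risk-free rate
true_beta = 1.3       # assumed beta used to generate the data

# Simulated yields (illustrative parameters, not market data)
x_M = [random.gauss(0.08, 0.15) for _ in range(n)]
x_i = [r_f + true_beta * (m - r_f) + random.gauss(0.0, 0.10) for m in x_M]


def mean(xs):
    return sum(xs) / len(xs)


def cov(xs, ys):
    """Sample covariance (dividing by n); cov(xs, xs) is the variance."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


var_M = cov(x_M, x_M)
beta = cov(x_i, x_M) / var_M          # beta_i = Cov(x_i, x_M) / sigma_M^2

# Residual epsilon_i defined with the estimated beta
eps = [xi - r_f - beta * (m - r_f) for xi, m in zip(x_i, x_M)]

systematic = beta ** 2 * var_M        # not eliminable by diversification
specific = cov(eps, eps)              # eliminable by diversification

print(f"estimated beta        : {beta:.3f}")
print(f"Cov(eps, x_M)         : {cov(eps, x_M):.2e}")
print(f"Var(x_i)              : {cov(x_i, x_i):.5f}")
print(f"systematic + specific : {systematic + specific:.5f}")
```

Because ${\displaystyle \beta _{i}}$ is computed from the same sample, the covariance of the residual with the market vanishes up to rounding errors, and the two printed variances coincide.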

### The risk premium depends only on the covariance with the general economic conditions

The risk premium of an asset ${\displaystyle i}$ is the present value of the surplus profit required to offset its risk. It is obtained from the required surplus profit rate ${\displaystyle r_{i}-r_{f}}$. Since ${\displaystyle r_{i}-r_{f}=\beta _{i}(r_{M}-r_{f})}$, the risk premium of an asset ${\displaystyle i}$ depends only on ${\displaystyle \beta _{i}}$, and therefore only on the covariance with the general economic conditions.

If ${\displaystyle \beta _{i}=0}$, the risk premium of the asset ${\displaystyle i}$ is zero. It is possible that this asset is also very risky, with a very high ${\displaystyle \sigma _{i}}$, but this risk can be eliminated by diversification with other assets whose ${\displaystyle \beta }$ is zero. That is why the risk premium is zero.

### When ${\displaystyle \beta }$ is negative

If ${\displaystyle \beta _{i}}$ is negative, ${\displaystyle r_{i}=r_{f}+\beta _{i}(r_{M}-r_{f})<r_{f}}$. Why agree to hold the risky asset ${\displaystyle i}$ when it exposes us on average to profits lower than the risk-free profit, or to losses?

The ${\displaystyle \beta }$ of a portfolio is the weighted average of the ${\displaystyle \beta _{j}}$ of the assets ${\displaystyle j}$ of which it is composed, because the covariance of a sum of random variables with the same variable is the sum of their covariances with that variable. An asset ${\displaystyle i}$ whose ${\displaystyle \beta _{i}}$ is negative decreases the ${\displaystyle \beta }$ of a portfolio in which it is included, and therefore the risk that cannot be eliminated by diversification. The decrease of the yield is offset by the decrease of the risk.

A put option is an example of an asset with negative ${\displaystyle \beta }$ if the ${\displaystyle \beta }$ of the underlying asset is positive, because a put option is negatively correlated with the underlying asset. The ${\displaystyle \beta }$ can even be negative enough for the average yield to be negative. Such an option is more expensive to buy than it yields on average. Why then buy put options? Because they reduce the risk of a portfolio that would be very risky without them.
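As a minimal illustration of this trade-off, the Python sketch below computes the ${\displaystyle \beta }$ of a portfolio as a weighted average and the average yield given by ${\displaystyle r=r_{f}+\beta (r_{M}-r_{f})}$. The rates, the weights and the ${\displaystyle \beta }$ of the hedging asset are hypothetical values chosen only for the example.

```python
def portfolio_beta(weights, betas):
    """Beta of a portfolio: weighted average of the betas of its assets."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * b for w, b in zip(weights, betas))


def capm_yield(beta, r_f, r_M):
    """Average yield given by r = r_f + beta * (r_M - r_f)."""
    return r_f + beta * (r_M - r_f)


r_f, r_M = 0.02, 0.08      # assumed risk-free and market average yields
hedge_beta = -3.0          # hypothetical beta of a put-like asset

unhedged = portfolio_beta([1.0, 0.0], [1.0, hedge_beta])
hedged = portfolio_beta([0.9, 0.1], [1.0, hedge_beta])

print(f"average yield of the hedge   : {capm_yield(hedge_beta, r_f, r_M):+.2%}")
print(f"unhedged: beta = {unhedged:.2f}, yield = {capm_yield(unhedged, r_f, r_M):.2%}")
print(f"hedged  : beta = {hedged:.2f}, yield = {capm_yield(hedged, r_f, r_M):.2%}")
```

With these assumed values the hedging asset has a negative average yield, yet placing 10% of the portfolio in it lowers both the ${\displaystyle \beta }$ and the average yield of the whole.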

### The three kinds of risk

• If ${\displaystyle \beta _{i}>0}$, the risk of the asset ${\displaystyle i}$ is positively correlated with the market. It is a risk that cannot be eliminated without reducing profits.
• If ${\displaystyle \beta _{i}=0}$, the risk of the asset ${\displaystyle i}$ is uncorrelated with the market and can be eliminated by diversification without reducing profits.
• If ${\displaystyle \beta _{i}<0}$, the risk of the asset ${\displaystyle i}$ is negatively correlated with the market. Anticorrelations of ${\displaystyle i}$ with the market reduce the risk of a portfolio whose ${\displaystyle \beta }$ is positive while reducing its yield.

### Optimizing a portfolio

The CAPM can be used even if the optimal markets hypothesis is abandoned because it provides a method for optimizing a portfolio:

Having initially chosen a sensible portfolio, the S&P 500 for example, we begin by estimating its average yield ${\displaystyle r_{I}}$ and its variance ${\displaystyle \sigma _{I}^{2}}$. For each asset ${\displaystyle i}$ that it does not contain, we then estimate its average yield ${\displaystyle r_{i}}$ and its covariance ${\displaystyle cov_{i}}$ with the initial portfolio. We then calculate the ${\displaystyle \beta _{i}}$ of this asset relative to the initial portfolio:

${\displaystyle \beta _{i}={\frac {cov_{i}}{\sigma _{I}^{2}}}}$

If ${\displaystyle {\frac {r_{i}-r_{f}}{r_{I}-r_{f}}}>\beta _{i}}$ then it is valuable to incorporate the asset ${\displaystyle i}$ in the initial portfolio. If the new assets so chosen are well diversified, this method gives a better portfolio than the initial one. It can have both higher yield and lower risk. By iterating the process, one can hope to find an optimal portfolio and beat the market.
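The selection rule can be sketched in a few lines of Python. The function names and all the numerical estimates below are hypothetical; the sketch also assumes ${\displaystyle r_{I}>r_{f}}$, so that the ratio is well defined.

```python
def beta_relative(cov_i, var_I):
    """Beta of an asset relative to the initial portfolio: cov_i / sigma_I^2."""
    return cov_i / var_I


def worth_adding(r_i, cov_i, r_I, var_I, r_f):
    """True when (r_i - r_f) / (r_I - r_f) > beta_i (assumes r_I > r_f)."""
    return (r_i - r_f) / (r_I - r_f) > beta_relative(cov_i, var_I)


r_f = 0.02                       # assumed risk-free rate
r_I, var_I = 0.10, 0.20 ** 2     # assumed estimates for the initial portfolio

# Hypothetical candidates: name -> (estimated average yield, covariance with I)
candidates = {
    "A": (0.07, 0.010),          # beta = 0.25, yield ratio = 0.625
    "B": (0.09, 0.050),          # beta = 1.25, yield ratio = 0.875
}

selected = [
    name
    for name, (r_i, cov_i) in candidates.items()
    if worth_adding(r_i, cov_i, r_I, var_I, r_f)
]
print("assets worth adding:", selected)
```

Asset A is retained although its average yield is lower than that of B, because its covariance with the initial portfolio is much smaller.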

The CAPM was introduced by Jack Treynor (1961, 1962), William F. Sharpe (1964), John Lintner (1965) and Jan Mossin (1966) independently, building on the earlier work of Harry Markowitz on diversification and modern portfolio theory. (Wikipedia)

## Expected values of yields are usually not measurable

To measure the expected value ${\displaystyle \mu }$ of a random variable ${\displaystyle X}$ with standard deviation ${\displaystyle \sigma }$, we must have at least ${\displaystyle 100({\frac {\sigma }{\mu }})^{2}}$ independent measurements.

Proof: The average ${\displaystyle \mu _{n}}$ measured with ${\displaystyle n}$ independent measurements is itself a random variable:

${\displaystyle \mu _{n}={\frac {1}{n}}\sum _{i=1}^{n}X_{i}}$ where the ${\displaystyle X_{i}}$ are independent random variables with the same law as ${\displaystyle X}$.

The expected value of ${\displaystyle \mu _{n}}$ is obviously ${\displaystyle \mu }$ and its standard deviation is

${\displaystyle \sigma _{n}={\frac {\sigma }{\sqrt {n}}}}$

For the measurement to be meaningful, ${\displaystyle \sigma _{n}}$ must be small compared to ${\displaystyle \mu }$. With ${\displaystyle {\frac {\sigma _{n}}{\mu }}<{\frac {1}{10}}}$, we can hope to have a correct estimate of ${\displaystyle \mu }$, but this remains a very imprecise measurement, because there is a non-negligible probability, approximately 0.05, that ${\displaystyle \mu _{n}}$ deviates from ${\displaystyle \mu }$ by more than 20%.

${\displaystyle {\frac {\sigma _{n}}{\mu }}<{\frac {1}{10}}}$ requires ${\displaystyle n>100({\frac {\sigma }{\mu }})^{2}}$

The annual yields of stocks are generally between 6% and 30% and sometimes lower. 12% is a typical value. The standard deviations of these yields are generally between 10% and 60%. 15% is a typical value. So ${\displaystyle \sigma /\mu \approx 1}$ or higher. It would take more than a century of annual yield measurements to get a rough estimate of its expected value. But we do not have a century, only a few years, because there is no reason for the expected value of a yield to remain constant for more than a few years. If we measure the monthly return, we increase the number of measurements by a factor of 12 but ${\displaystyle (\sigma /\mu )^{2}}$ also increases by a factor of 12. So the conclusion is not changed. In general, expected values of yields are not measurable (Luenberger 1998).
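The orders of magnitude can be reproduced with a short Python sketch. It uses the typical values quoted above (12% and 15%); the conversion to monthly figures assumes independent months, with a mean divided by 12 and a standard deviation divided by ${\displaystyle {\sqrt {12}}}$, which is an approximation introduced here for illustration.

```python
import math


def required_measurements(mu, sigma, precision=0.1):
    """Number n of independent measurements beyond which sigma_n / mu < precision."""
    return (sigma / (mu * precision)) ** 2


mu_year, sigma_year = 0.12, 0.15      # typical values quoted in the text
n_years = required_measurements(mu_year, sigma_year)

# Monthly yields under the independence assumption stated above
mu_month = mu_year / 12
sigma_month = sigma_year / math.sqrt(12)
n_months = required_measurements(mu_month, sigma_month)

# Probability that a normal variable deviates from its mean by more than
# two standard deviations (20% of mu when sigma_n / mu = 1/10)
p_two_sigma = math.erfc(2 / math.sqrt(2))

print(f"annual measurements needed : {n_years:.0f} years")
print(f"monthly measurements needed: {n_months:.0f} months = {n_months / 12:.0f} years")
print(f"P(deviation > 2 sigma_n)   : {p_two_sigma:.4f}")
```

Measuring monthly multiplies the required number of measurements by 12, so the required number of years is unchanged.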

The measured average yields are not reliable estimates of the theoretical average yields, unless the investments are low risk, because then ${\displaystyle \sigma /\mu }$ is small compared to 1. This implies that a probabilistic model of a variation of yield cannot be directly confronted with reality, since one cannot measure its most important parameter, the expected value of the yield. This is not surprising since, in general, theoretical probabilities have no objective significance in economics, as economic events are not reproducible.

Since probabilistic models of yield variations cannot be directly confronted with reality, one is tempted to conclude that they have no scientific value, but this conclusion is exaggerated, because these models can still lead to observable predictions. They are sometimes very useful to explain our observations even if we cannot measure all their parameters.

The CAPM assumes that agents make their investment choices knowing the expected values of yields. Since these are usually not measurable, agents cannot know them. In fact, these expected values of yields are purely theoretical and have no real existence. But it should not be concluded that the CAPM is completely unrealistic. Agents estimate expected yields and make decisions based on their estimates. If the agents are properly informed, the expected yields may be reasonably good estimates of the yields finally achieved.

## Risk reduction by temporal diversification

Consider a very risky investment that allows one to multiply one's capital by a factor ${\displaystyle k>2}$ over a short period of time with a probability ${\displaystyle 0.5}$ or to lose everything. Its average yield is ${\displaystyle k/2-1}$. But if we invest all our capital for ${\displaystyle n}$ successive periods, we have a probability ${\displaystyle 1-(1/2)^{n}}$ of being ruined. If ${\displaystyle n}$ is large, ruin is almost certain and we will not have benefited from the average yield. Is it not possible to take advantage of this average yield without taking risks?

Instead of risking all our capital, we can decide to manage it dynamically and to bet each period only a fraction ${\displaystyle \alpha }$ on this very risky investment. At each period, the average yield is only ${\displaystyle \alpha (k/2-1)}$. We have one chance in two of multiplying our capital by ${\displaystyle 1-\alpha }$ and one chance in two of multiplying it by ${\displaystyle 1+(k-1)\alpha }$. Let ${\displaystyle V_{0}=1}$ be the initial value of the capital and ${\displaystyle V_{n}}$ its value after ${\displaystyle n}$ periods. The logarithmic yield after ${\displaystyle n}$ periods ${\displaystyle \ln(V_{n}/V_{0})=\ln(V_{n})}$ is a sum of ${\displaystyle n}$ independent random variables, each with mean ${\displaystyle [\ln(1-\alpha )+\ln(1+(k-1)\alpha )]/2}$ and variance ${\displaystyle [\ln(1-\alpha )-\ln(1+(k-1)\alpha )]^{2}/4}$. When ${\displaystyle n}$ is large, the distribution of ${\displaystyle \ln V_{n}}$ tends to a normal distribution with mean ${\displaystyle n[\ln(1-\alpha )+\ln(1+(k-1)\alpha )]/2}$ and variance ${\displaystyle n[\ln(1-\alpha )-\ln(1+(k-1)\alpha )]^{2}/4}$.

The ratio ${\displaystyle {\frac {|\ln(1-\alpha )-\ln(1+(k-1)\alpha )|}{{\sqrt {n}}[\ln(1-\alpha )+\ln(1+(k-1)\alpha )]}}}$ of the standard deviation to the mean of the logarithmic yield tends to zero when ${\displaystyle n}$ is very large. If the average logarithmic yield per period is positive, ${\displaystyle [\ln(1-\alpha )+\ln(1+(k-1)\alpha )]/2>0}$, then a profit becomes almost certain when ${\displaystyle n}$ is very large. This condition is equivalent to ${\displaystyle (1-\alpha )(1+(k-1)\alpha )>1}$, which holds when ${\displaystyle 0<\alpha <(k-2)/(k-1)}$. In that case this dynamic management strategy delivers an almost certain profit as soon as the number of periods is large enough.

This risk reduction strategy is based on time diversification. It assumes that profits over a given period are independent of previous periods. This reduces risk by adding risky but statistically independent profits.
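The following Python sketch illustrates the strategy by simulation. The values ${\displaystyle k=3}$, ${\displaystyle \alpha =0.2}$, the number of periods and the number of simulated histories are arbitrary choices made for the example, and the function names are hypothetical.

```python
import math
import random


def log_growth(alpha, k):
    """Average logarithmic yield per period when betting a fraction alpha."""
    return 0.5 * (math.log(1 - alpha) + math.log(1 + (k - 1) * alpha))


def simulate_log_capital(alpha, k, n, rng):
    """ln(V_n) after n independent periods, starting from V_0 = 1."""
    log_v = 0.0
    for _ in range(n):
        if rng.random() < 0.5:
            log_v += math.log(1 + (k - 1) * alpha)   # the bet is won
        else:
            log_v += math.log(1 - alpha)             # the bet is lost
    return log_v


k = 3.0
threshold = (k - 2) / (k - 1)    # profit is almost certain below this fraction
alpha = 0.2                      # assumed fraction, below the threshold

rng = random.Random(1)
n = 1000
runs = [simulate_log_capital(alpha, k, n, rng) for _ in range(200)]
share_with_profit = sum(v > 0 for v in runs) / len(runs)

print(f"threshold (k-2)/(k-1)          : {threshold:.2f}")
print(f"average log yield per period   : {log_growth(alpha, k):+.4f}")
print(f"same with alpha = 0.6          : {log_growth(0.6, k):+.4f}")
print(f"share of histories with profit : {share_with_profit:.2%}")
```

Above the threshold the average logarithmic yield per period becomes negative, and the capital then shrinks almost surely even though each bet has a positive average yield.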

## The formulas of Black and Scholes

The formulas of Black and Scholes make it possible to calculate the present value of a European call or put option from the present value ${\displaystyle S_{0}}$ of the underlying asset, its volatility ${\displaystyle \sigma }$, the time ${\displaystyle T}$ to maturity, the option's strike price ${\displaystyle K}$, and the risk-free interest rate ${\displaystyle r}$.

The easiest way to demonstrate them is to reason on a risk-neutral economy, in which risk premiums are always zero. This hypothesis is obviously completely false but, very surprisingly, it leads to correct formulas. The Black and Scholes equation presented below does not require the hypothesis of risk neutrality and justifies the formulas of the same name. More generally, the assumption of risk neutrality often makes it possible to correctly evaluate derivative products which are nevertheless very risky, because these products are valued on the basis of the present prices of their underlying assets. These market prices take into account the required risk premiums. If risk premiums change, because agents are more afraid of risk for example, market prices also change, but their relation with the prices of derivatives does not change (Hull 2011). This is why risk premiums can often be ignored when valuing derivative prices from the underlying prices. The binomial tree model presented later justifies this more clearly.

### Risk neutrality and value of an asset

It is assumed that agents are risk neutral. The present value of an asset is then equal to the average of the present values of its anticipated values:

${\displaystyle S_{0}=\langle e^{-rT}S_{T}\rangle }$

where ${\displaystyle r}$ is the risk-free interest rate. We deduce for a stock whose price variations are determined by a log-normal distribution with parameters ${\displaystyle \mu }$ and ${\displaystyle \sigma }$ :

${\displaystyle \langle S_{T}\rangle =e^{rT}S_{0}=e^{(\mu +\sigma ^{2}/2)T}S_{0}}$

Hence

${\displaystyle r=\mu +\sigma ^{2}/2}$

### The present value of a European call option

If ${\displaystyle S_{T}}$ is the anticipated value of the underlying asset, ${\displaystyle max(S_{T}-K,0)}$ is the anticipated value ${\displaystyle C_{T}}$ of a call option whose strike price is ${\displaystyle K}$. Risk neutrality requires that its present value ${\displaystyle C_{0}}$ is the average of the present values of its anticipated values:

${\displaystyle C_{0}={\frac {e^{-rT}}{\sigma {\sqrt {T}}{\sqrt {2\pi }}}}\int _{K}^{+\infty }{\frac {S-K}{S}}e^{-(\ln {\frac {S}{S_{0}}}-\mu T)^{2}/2T\sigma ^{2}}dS}$

${\displaystyle ={\frac {e^{-rT}}{\sigma {\sqrt {T}}{\sqrt {2\pi }}}}(\int _{K}^{+\infty }e^{-(\ln {\frac {S}{S_{0}}}-\mu T)^{2}/2T\sigma ^{2}}dS-\int _{K}^{+\infty }{\frac {K}{S}}e^{-(\ln {\frac {S}{S_{0}}}-\mu T)^{2}/2T\sigma ^{2}}dS)}$

${\displaystyle =C_{+}-C_{-}}$

With ${\displaystyle x=\ln {\frac {S}{S_{0}}}}$, ${\displaystyle S=S_{0}e^{x}}$, ${\displaystyle dS=S_{0}e^{x}dx}$

${\displaystyle C_{+}={\frac {S_{0}e^{-rT}}{\sigma {\sqrt {T}}{\sqrt {2\pi }}}}\int _{\ln {\frac {K}{S_{0}}}}^{+\infty }e^{x-(x-\mu T)^{2}/2T\sigma ^{2}}dx}$

Now ${\displaystyle x-(x-\mu T)^{2}/2T\sigma ^{2}=-(x-(\mu +\sigma ^{2})T)^{2}/2T\sigma ^{2}+(\mu +\sigma ^{2}/2)T}$

Hence

${\displaystyle C_{+}=S_{0}e^{(-r+\mu +\sigma ^{2}/2)T}{\frac {1}{\sigma {\sqrt {T}}{\sqrt {2\pi }}}}\int _{\ln {\frac {K}{S_{0}}}}^{+\infty }e^{-(x-(\mu +\sigma ^{2})T)^{2}/2T\sigma ^{2}}dx=S_{0}{\frac {1}{\sigma {\sqrt {T}}{\sqrt {2\pi }}}}\int _{\ln {\frac {K}{S_{0}}}}^{+\infty }e^{-(x-(\mu +\sigma ^{2})T)^{2}/2T\sigma ^{2}}dx}$

since ${\displaystyle r=\mu +\sigma ^{2}/2}$

With ${\displaystyle y=(x-(\mu +\sigma ^{2})T)/\sigma {\sqrt {T}}}$, ${\displaystyle dx=\sigma {\sqrt {T}}dy}$

${\displaystyle C_{+}=S_{0}{\frac {1}{\sqrt {2\pi }}}\int _{(\ln {\frac {K}{S_{0}}}-(\mu +\sigma ^{2})T)/\sigma {\sqrt {T}}}^{+\infty }e^{-y^{2}/2}dy}$

Let ${\displaystyle N(.)}$ be the cumulative distribution function of the standard normal law:

${\displaystyle N(x)={\frac {1}{\sqrt {2\pi }}}\int _{-\infty }^{x}e^{-y^{2}/2}dy}$

Since ${\displaystyle {\frac {1}{\sqrt {2\pi }}}\int _{x}^{+\infty }e^{-y^{2}/2}dy={\frac {1}{\sqrt {2\pi }}}\int _{-\infty }^{-x}e^{-y^{2}/2}dy=N(-x)}$

${\displaystyle C_{+}=S_{0}N[(\ln {\frac {S_{0}}{K}}+(\mu +\sigma ^{2})T)/\sigma {\sqrt {T}}]=S_{0}N[(\ln {\frac {S_{0}}{K}}+(r+\sigma ^{2}/2)T)/\sigma {\sqrt {T}}]}$

${\displaystyle C_{-}={\frac {e^{-rT}}{\sigma {\sqrt {T}}{\sqrt {2\pi }}}}\int _{K}^{+\infty }{\frac {K}{S}}e^{-(\ln {\frac {S}{S_{0}}}-\mu T)^{2}/2T\sigma ^{2}}dS={\frac {Ke^{-rT}}{\sigma {\sqrt {T}}{\sqrt {2\pi }}}}\int _{\ln {\frac {K}{S_{0}}}}^{+\infty }e^{-(x-\mu T)^{2}/2T\sigma ^{2}}dx}$

With ${\displaystyle z=(x-\mu T)/\sigma {\sqrt {T}}}$, ${\displaystyle dx=\sigma {\sqrt {T}}dz}$

${\displaystyle C_{-}=Ke^{-rT}{\frac {1}{\sqrt {2\pi }}}\int _{(\ln {\frac {K}{S_{0}}}-\mu T)/\sigma {\sqrt {T}}}^{+\infty }e^{-z^{2}/2}dz}$

${\displaystyle =Ke^{-rT}N[(\ln {\frac {S_{0}}{K}}+\mu T)/\sigma {\sqrt {T}}]=Ke^{-rT}N[(\ln {\frac {S_{0}}{K}}+(r-\sigma ^{2}/2)T)/\sigma {\sqrt {T}}]}$

Finally

${\displaystyle C_{0}=S_{0}N[(\ln {\frac {S_{0}}{K}}+(r+\sigma ^{2}/2)T)/\sigma {\sqrt {T}}]-Ke^{-rT}N[(\ln {\frac {S_{0}}{K}}+(r-\sigma ^{2}/2)T)/\sigma {\sqrt {T}}]}$

This is the Black and Scholes formula for European call options.

### The present value of a European put option

If ${\displaystyle S_{T}}$ is the anticipated value of the underlying asset, ${\displaystyle max(K-S_{T},0)}$ is the anticipated value ${\displaystyle P_{T}}$ of a put option whose exercise price is ${\displaystyle K}$. Risk neutrality requires that its present value ${\displaystyle P_{0}}$ is the average of the present values of its anticipated values:

${\displaystyle P_{0}={\frac {e^{-rT}}{\sigma {\sqrt {T}}{\sqrt {2\pi }}}}\int _{0}^{K}{\frac {K-S}{S}}e^{-(\ln {\frac {S}{S_{0}}}-\mu T)^{2}/2T\sigma ^{2}}dS=P_{+}-P_{-}}$

${\displaystyle P_{+}={\frac {e^{-rT}}{\sigma {\sqrt {T}}{\sqrt {2\pi }}}}\int _{0}^{K}{\frac {K}{S}}e^{-(\ln {\frac {S}{S_{0}}}-\mu T)^{2}/2T\sigma ^{2}}dS={\frac {Ke^{-rT}}{\sigma {\sqrt {T}}{\sqrt {2\pi }}}}\int _{-\infty }^{\ln {\frac {K}{S_{0}}}}e^{-(x-\mu T)^{2}/2T\sigma ^{2}}dx}$

${\displaystyle =Ke^{-rT}{\frac {1}{\sqrt {2\pi }}}\int _{-\infty }^{(\ln {\frac {K}{S_{0}}}-\mu T)/\sigma {\sqrt {T}}}e^{-z^{2}/2}dz}$

${\displaystyle =Ke^{-rT}N[(\ln {\frac {K}{S_{0}}}-\mu T)/\sigma {\sqrt {T}}]=Ke^{-rT}N[(\ln {\frac {K}{S_{0}}}-(r-\sigma ^{2}/2)T)/\sigma {\sqrt {T}}]}$

${\displaystyle P_{-}={\frac {e^{-rT}}{\sigma {\sqrt {T}}{\sqrt {2\pi }}}}\int _{0}^{K}e^{-(\ln {\frac {S}{S_{0}}}-\mu T)^{2}/2T\sigma ^{2}}dS=S_{0}e^{(-r+\mu +\sigma ^{2}/2)T}{\frac {1}{\sigma {\sqrt {T}}{\sqrt {2\pi }}}}\int _{-\infty }^{\ln {\frac {K}{S_{0}}}}e^{-(x-(\mu +\sigma ^{2})T)^{2}/2T\sigma ^{2}}dx}$

${\displaystyle =S_{0}{\frac {1}{\sqrt {2\pi }}}\int _{-\infty }^{(\ln {\frac {K}{S_{0}}}-(\mu +\sigma ^{2})T)/\sigma {\sqrt {T}}}e^{-y^{2}/2}dy}$

${\displaystyle =S_{0}N[(\ln {\frac {K}{S_{0}}}-(\mu +\sigma ^{2})T)/\sigma {\sqrt {T}}]=S_{0}N[(\ln {\frac {K}{S_{0}}}-(r+\sigma ^{2}/2)T)/\sigma {\sqrt {T}}]}$

Finally

${\displaystyle P_{0}=Ke^{-rT}N[(\ln {\frac {K}{S_{0}}}-(r-\sigma ^{2}/2)T)/\sigma {\sqrt {T}}]-S_{0}N[(\ln {\frac {K}{S_{0}}}-(r+\sigma ^{2}/2)T)/\sigma {\sqrt {T}}]}$

This is the Black and Scholes formula for European put options.
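The two formulas can be transcribed directly into Python. This is a minimal sketch using only the standard library; the function names and the numerical parameters are assumptions made for the example. The last lines check the put-call parity ${\displaystyle C_{0}-P_{0}=S_{0}-Ke^{-rT}}$, a standard identity that is not derived in this text but that both formulas must satisfy.

```python
import math


def N(x):
    """Cumulative distribution function of the standard normal law."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(S0, K, r, sigma, T):
    """Black and Scholes value of a European call option."""
    s = sigma * math.sqrt(T)
    d_plus = (math.log(S0 / K) + (r + sigma ** 2 / 2) * T) / s
    d_minus = (math.log(S0 / K) + (r - sigma ** 2 / 2) * T) / s
    return S0 * N(d_plus) - K * math.exp(-r * T) * N(d_minus)


def bs_put(S0, K, r, sigma, T):
    """Black and Scholes value of a European put option."""
    s = sigma * math.sqrt(T)
    a = (math.log(K / S0) - (r - sigma ** 2 / 2) * T) / s
    b = (math.log(K / S0) - (r + sigma ** 2 / 2) * T) / s
    return K * math.exp(-r * T) * N(a) - S0 * N(b)


# Illustrative parameters (assumed): one-year options at the money
S0, K, r, sigma, T = 100.0, 100.0, 0.05, 0.2, 1.0
call = bs_call(S0, K, r, sigma, T)
put = bs_put(S0, K, r, sigma, T)

print(f"call value      : {call:.4f}")
print(f"put value       : {put:.4f}")
print(f"C0 - P0         : {call - put:.4f}")
print(f"S0 - K e^(-rT)  : {S0 - K * math.exp(-r * T):.4f}")
```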

## The Black and Scholes equation

In principle, risk hedging strategies can be used to build risk-free portfolios with options and shares or other assets. The technique is simply to compensate for increases and decreases. Any increase above the risk-free rate for one or more items in the portfolio is offset by a decrease in other items, or a lower increase than the risk-free rate. Such a portfolio is therefore remunerated at the risk-free rate. Increases and decreases compensate mechanically. But risk-free portfolios are the same in an economy that pays risk premiums as in an economy that does not pay for them, and the Black and Scholes formulas determine their composition. Different formulas would necessarily lead to portfolios without risk of different compositions. Since risk-free portfolios are the same whether or not the economy is risk neutral, the validity of the Black and Scholes formulas does not depend on the existence of risk premiums.

More precisely, we will show from the existence of risk-free portfolios that the function that determines the value of an option must satisfy a partial differential equation, the Black and Scholes equation. The formulas of the same name are the only solutions of this equation that satisfy the boundary conditions.

Let ${\displaystyle V(t,S)}$ be the value of an option based on the current price ${\displaystyle S}$ of the underlying asset, at time ${\displaystyle t}$. ${\displaystyle V(t,S)}$ can be known empirically if it is a market price. A priori it could depend on risk premiums, because buying and selling options are risky transactions. The reasoning of Black and Scholes about a risk-free portfolio shows that ${\displaystyle V(t,S)}$ is in fact independent of risk premiums.

A risk-free portfolio is created at time ${\displaystyle t}$ by holding an option and ${\displaystyle -{\frac {\partial V}{\partial S}}}$ units of the underlying. If ${\displaystyle {\frac {\partial V}{\partial S}}>0}$ the underlying asset must be sold short.

As before, we assume that the random variations of the underlying asset are described by a log-normal distribution.

The ${\displaystyle dS}$ variation of ${\displaystyle S}$ during a time interval ${\displaystyle dt}$ is the sum of two terms:

${\displaystyle {\frac {dS}{S}}=mdt+\sigma \epsilon {\sqrt {dt}}}$

where ${\displaystyle \epsilon }$ is a standard normal random variable.

To compute ${\displaystyle dV}$, the formula ${\displaystyle dV=({\frac {\partial V}{\partial t}}+{\frac {\partial V}{\partial S}}{\frac {dS}{dt}})dt}$ is not appropriate, because ${\displaystyle {\frac {dS}{dt}}}$ diverges when ${\displaystyle dt}$ tends to zero.

The following reasoning on orders of magnitude is not rigorous but it leads to an exact formula, which can be proved with Ito's lemma.

${\displaystyle dV={\frac {\partial V}{\partial t}}dt+{\frac {\partial V}{\partial S}}dS+{\frac {1}{2}}{\frac {\partial ^{2}V}{\partial S^{2}}}dS^{2}}$

${\displaystyle {\frac {dS^{2}}{S^{2}}}=m^{2}dt^{2}+2m\sigma \epsilon dt^{3/2}+\sigma ^{2}\epsilon ^{2}dt}$

The first two terms are negligible compared to the third when ${\displaystyle dt}$ tends to zero.

Assuming that we can replace ${\displaystyle \epsilon ^{2}}$ with its average value ${\displaystyle 1}$, and writing ${\displaystyle dW=\epsilon {\sqrt {dt}}}$ for the random term, we obtain:

${\displaystyle dV=({\frac {\partial V}{\partial t}}+mS{\frac {\partial V}{\partial S}}+S^{2}{\frac {\sigma ^{2}}{2}}{\frac {\partial ^{2}V}{\partial S^{2}}})dt+\sigma S{\frac {\partial V}{\partial S}}dW}$

The change in value of the risk-free portfolio is

${\displaystyle d\Pi =dV-{\frac {\partial V}{\partial S}}dS=({\frac {\partial V}{\partial t}}+S^{2}{\frac {\sigma ^{2}}{2}}{\frac {\partial ^{2}V}{\partial S^{2}}})dt}$

The random term ${\displaystyle dW}$ has disappeared. This confirms that the portfolio is risk free.

${\displaystyle \Pi }$ must vary at the risk-free rate ${\displaystyle r}$:

${\displaystyle d\Pi =r\Pi dt=r(V-{\frac {\partial V}{\partial S}}S)dt}$

We obtain

${\displaystyle {\frac {\partial V}{\partial t}}+S^{2}{\frac {\sigma ^{2}}{2}}{\frac {\partial ^{2}V}{\partial S^{2}}}=r(V-{\frac {\partial V}{\partial S}}S)}$

This is the Black and Scholes equation:

${\displaystyle {\frac {\partial V}{\partial t}}+S^{2}{\frac {\sigma ^{2}}{2}}{\frac {\partial ^{2}V}{\partial S^{2}}}+rS{\frac {\partial V}{\partial S}}-rV=0}$

One can verify that the formulas of Black and Scholes are solutions of the equation of the same name. This equation does not require the hypothesis of a risk-neutral environment. The formulas of Black and Scholes are therefore true in an environment that is not indifferent to risk. The binomial tree method provides a simpler explanation of this result.
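This verification can be done numerically. The sketch below evaluates the call formula as a function ${\displaystyle V(t,S)}$, with ${\displaystyle S}$ in place of ${\displaystyle S_{0}}$ and the remaining time ${\displaystyle T-t}$ in place of ${\displaystyle T}$, approximates its partial derivatives by central finite differences, and computes the left-hand side of the equation. The parameters, the step sizes and the function names are assumptions made for the example.

```python
import math

# Illustrative parameters (assumed)
K, r, sigma, T = 100.0, 0.05, 0.2, 1.0


def N(x):
    """Cumulative distribution function of the standard normal law."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def call_value(t, S):
    """Black and Scholes call value at time t < T for an underlying price S."""
    tau = T - t                                   # remaining time to maturity
    s = sigma * math.sqrt(tau)
    d_plus = (math.log(S / K) + (r + sigma ** 2 / 2) * tau) / s
    return S * N(d_plus) - K * math.exp(-r * tau) * N(d_plus - s)


def pde_residual(t, S, h=0.01, dt=1e-4):
    """Left-hand side of the Black and Scholes equation, by finite differences."""
    V = call_value(t, S)
    V_t = (call_value(t + dt, S) - call_value(t - dt, S)) / (2 * dt)
    V_S = (call_value(t, S + h) - call_value(t, S - h)) / (2 * h)
    V_SS = (call_value(t, S + h) - 2 * V + call_value(t, S - h)) / h ** 2
    return V_t + 0.5 * sigma ** 2 * S ** 2 * V_SS + r * S * V_S - r * V


prices = (80.0, 100.0, 120.0)
residuals = [pde_residual(0.3, S) for S in prices]
for S, res in zip(prices, residuals):
    print(f"S = {S:6.1f}   residual = {res:+.2e}")
```

The residuals are not exactly zero because of the finite-difference approximation, but they are very small compared with the individual terms of the equation.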

## Binomial trees and risk neutrality

The following examples are very simple and very unrealistic, but they are sufficient to understand why the relation between option and stock prices does not depend on risk premiums:

It is assumed that a share worth 100 today can be worth 110 or 90 after one period, with probabilities p and 1-p respectively. We want to determine the present price P of a put option whose exercise price is 110. It is assumed for simplicity that the risk-free interest rate is zero. The present price of a portfolio that contains one share and one option is 100 + P. Its future value is 110 in all cases. It is therefore risk free. We deduce that P = 10. We do not need to know the probability of a rise to know the price of the option. We do not need to know the risks taken by investors and their risk premiums.

More generally, a single period binomial tree is defined by the following parameters: p, u, d and r. p is the probability that the price of the stock is multiplied by u (up) after one period, 1-p the probability that it is multiplied by d (down). r is the risk-free interest rate over one period.

It is assumed that the present price of the share is 100, that u > 1 and d < 1. To evaluate the present price P of a put option whose exercise price is K, we reason on a portfolio that contains ${\displaystyle \Delta }$ shares and one option. We assume that ${\displaystyle 100d\leq K\leq 100u}$. The current price of the portfolio is ${\displaystyle 100\Delta +P}$. Its future value is either ${\displaystyle 100u\Delta }$ or ${\displaystyle 100d\Delta +K-100d}$. This portfolio is risk free when

${\displaystyle 100u\Delta =100d\Delta +K-100d}$

that is to say

${\displaystyle \Delta ={\frac {K-100d}{100(u-d)}}}$

Since it is safe, one must have

${\displaystyle (100\Delta +P)(1+r)=100u\Delta }$

so

${\displaystyle P={\frac {100\Delta (u-1-r)}{1+r}}={\frac {(K-100d)(u-1-r)}{(u-d)(1+r)}}}$

To find the present price C of a call option whose exercise price is K, we reason on a portfolio obtained by selling ${\displaystyle \Delta }$ shares short and buying one option. Selling short is selling shares that have been borrowed by committing to buy them later to return them. The current price of the portfolio is ${\displaystyle -100\Delta +C}$. Its future value is either ${\displaystyle -100u\Delta +100u-K}$ or ${\displaystyle -100d\Delta }$. This portfolio is risk free when

${\displaystyle -100u\Delta +100u-K=-100d\Delta }$

that is to say

${\displaystyle \Delta ={\frac {K-100u}{100(d-u)}}}$

Since it is safe, one must have

${\displaystyle (-100\Delta +C)(1+r)=-100d\Delta }$

so

${\displaystyle C={\frac {100\Delta (1+r-d)}{1+r}}={\frac {(K-100u)(1+r-d)}{(d-u)(1+r)}}}$

As with the first example, we do not need to know the probability p of a rise to know the price of the options. So we do not need to know the risk premiums to evaluate the put and call options.
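The two one-period formulas can be written as a short Python sketch. The share price is passed as a parameter `S` instead of the fixed value 100, which is a generalization introduced here; the function names and the second set of parameters are hypothetical. Note that the probability p appears nowhere in the code.

```python
def put_price(S, K, u, d, r):
    """One-period binomial price of a put; returns (price, number of shares held)."""
    delta = (K - S * d) / (S * (u - d))
    price = S * delta * (u - 1 - r) / (1 + r)
    return price, delta


def call_price(S, K, u, d, r):
    """One-period binomial price of a call; returns (price, shares sold short)."""
    delta = (K - S * u) / (S * (d - u))
    price = S * delta * (1 + r - d) / (1 + r)
    return price, delta


# First example of the text: u = 1.1, d = 0.9, r = 0, K = 110
P_first, delta_first = put_price(100.0, 110.0, 1.1, 0.9, 0.0)
print(f"first example: P = {P_first:.4f}, Delta = {delta_first:.4f}")

# Second example with assumed parameters
S, K, u, d, r = 100.0, 100.0, 1.2, 0.9, 0.05
P, delta_put = put_price(S, K, u, d, r)
C, delta_call = call_price(S, K, u, d, r)
print(f"put  : P = {P:.4f}, Delta = {delta_put:.4f}")
print(f"call : C = {C:.4f}, Delta = {delta_call:.4f}")

# The hedged put portfolio has the same future value after a rise or a fall,
# and this value is its present price capitalized at the risk-free rate.
value_up = S * u * delta_put
value_down = S * d * delta_put + K - S * d
print(f"future values of the put portfolio: {value_up:.4f}, {value_down:.4f}")
print(f"capitalized present price         : {(S * delta_put + P) * (1 + r):.4f}")
```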

It is of course perfectly unrealistic to assume that a share can only have two values after one period. But it is enough to reason on multiperiod binomial trees to make the model very realistic, as soon as the number of periods is large enough. With two periods, three possible future values are obtained: 100uu, 100ud, and 100dd. With N periods, we obtain N + 1 possible future values. If N is very large, the variations of the share price follow a log-normal law. We can prove the formulas of Black and Scholes by reasoning on binomial trees. The independence of the price of options with respect to risk premiums established with binomial trees is therefore much more realistic than what the simplicity of the model suggests at the outset.

The article by Cox, Ross and Rubinstein, Option pricing, a simplified approach (1979) is the pioneering article that showed the value of this model.

## The average yield of a put option may be negative

A put option has a negative ${\displaystyle \beta }$ if the ${\displaystyle \beta }$ of the underlying stock is positive. We show on an example that the ${\displaystyle \beta }$ can be negative enough for the average yield to be negative:

Consider a stock whose average yield (arithmetic annual) is 10% and volatility 20%:

${\displaystyle \mu =0.08}$

${\displaystyle \sigma =0.2}$

${\displaystyle e^{\mu +\sigma ^{2}/2}-1\approx 0.1}$

We reason on a six-month put option whose strike price is equal to the current price of the share increased by its average yield over six months.

Let ${\displaystyle S_{0}=100}$ be the present price of the share. The strike price of the option is

${\displaystyle K=S_{0}e^{(\mu +\sigma ^{2}/2)T}=100e^{(0.08+0.2^{2}/2)0.5}\approx 105.13}$

The present price of the option is obtained with the formula of Black and Scholes:

${\displaystyle P_{0}=Ke^{-rT}N[(\ln {\frac {K}{S_{0}}}-(r-\sigma ^{2}/2)T)/\sigma {\sqrt {T}}]-S_{0}N[(\ln {\frac {K}{S_{0}}}-(r+\sigma ^{2}/2)T)/\sigma {\sqrt {T}}]}$

With ${\displaystyle r=0.02}$, we obtain ${\displaystyle P_{0}\approx 8.02}$.

The option will yield ${\displaystyle \langle P_{f}\rangle }$ on average on the day of exercise:

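${\displaystyle \langle P_{f}\rangle =KN[(\ln {\frac {K}{S_{0}}}-\mu T)/\sigma {\sqrt {T}}]-S_{0}e^{(\mu +\sigma ^{2}/2)T}N[(\ln {\frac {K}{S_{0}}}-(\mu +\sigma ^{2})T)/\sigma {\sqrt {T}}]}$

${\displaystyle \approx 5.93}$

The average is taken over the real distribution of ${\displaystyle S_{T}}$, so the average growth factor of the share is ${\displaystyle e^{(\mu +\sigma ^{2}/2)T}}$ and not ${\displaystyle e^{rT}}$.

The average annual logarithmic yield of the option is therefore

${\displaystyle 2\ln {\frac {5.93}{8.02}}\approx -0.60=-60\%}$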

## Mathematical supplements

### The normal law

#### Probability density

• A random variable ${\displaystyle X}$ follows a standard normal law when its probability density is:

${\displaystyle f_{X}(x)={\frac {1}{\sqrt {2\pi }}}e^{-x^{2}/2}}$

${\displaystyle {\frac {1}{\sqrt {2\pi }}}}$ is a normalization factor which ensures that the total probability is equal to one:

${\displaystyle \int _{-\infty }^{+\infty }f_{X}(x)dx={\frac {1}{\sqrt {2\pi }}}\int _{-\infty }^{+\infty }e^{-x^{2}/2}dx=1}$

It is proved by the following reasoning:

${\displaystyle \int _{-\infty }^{+\infty }\int _{-\infty }^{+\infty }e^{-(x^{2}+y^{2})}dxdy=\int _{-\infty }^{+\infty }e^{-x^{2}}dx\int _{-\infty }^{+\infty }e^{-y^{2}}dy=(\int _{-\infty }^{+\infty }e^{-x^{2}}dx)^{2}}$

${\displaystyle =\int _{0}^{2\pi }\int _{0}^{+\infty }e^{-r^{2}}rd\phi dr=2\pi [-{\frac {1}{2}}e^{-r^{2}}]_{0}^{+\infty }=\pi }$

hence ${\displaystyle \int _{-\infty }^{+\infty }e^{-x^{2}}dx={\sqrt {\pi }}}$

With ${\displaystyle x=y/{\sqrt {2}}}$, ${\displaystyle dx=dy/{\sqrt {2}}}$, we get ${\displaystyle \int _{-\infty }^{+\infty }e^{-x^{2}}dx={\frac {1}{\sqrt {2}}}\int _{-\infty }^{+\infty }e^{-y^{2}/2}dy}$

hence

${\displaystyle \int _{-\infty }^{+\infty }e^{-y^{2}/2}dy={\sqrt {2\pi }}}$

• A random variable ${\displaystyle X}$ follows a centered normal law when its probability density is:

${\displaystyle f_{X}(x)={\frac {1}{\sigma {\sqrt {2\pi }}}}e^{-x^{2}/2\sigma ^{2}}}$

With ${\displaystyle y=x/\sigma }$, ${\displaystyle dx=\sigma dy}$, we check that ${\displaystyle \int _{-\infty }^{+\infty }f_{X}(x)dx={\frac {\sigma }{\sigma {\sqrt {2\pi }}}}\int _{-\infty }^{+\infty }e^{-y^{2}/2}dy=1}$

• A random variable ${\displaystyle X}$ follows a normal law when its probability density is:

${\displaystyle f_{X}(x)={\frac {1}{\sigma {\sqrt {2\pi }}}}e^{-(x-\mu )^{2}/2\sigma ^{2}}}$

With ${\displaystyle y=x-\mu }$, ${\displaystyle dy=dx}$, we check that ${\displaystyle \int _{-\infty }^{+\infty }f_{X}(x)dx={\frac {1}{\sigma {\sqrt {2\pi }}}}\int _{-\infty }^{+\infty }e^{-y^{2}/2\sigma ^{2}}dy=1}$

The red curve is the standard normal distribution.

#### Expected value

A random variable ${\displaystyle X}$ is centered when its mean, or expected value ${\displaystyle E(X)}$, is zero. It is easy to verify that centered normal laws have a zero mean since their probability densities are even. ${\displaystyle E(X)=\int _{-\infty }^{+\infty }xf_{X}(x)dx}$ is the integral of an odd function and is therefore zero.

With ${\displaystyle y=x-\mu }$, we obtain for the expected value of the normal distribution:

${\displaystyle E(X)={\frac {1}{\sigma {\sqrt {2\pi }}}}\int _{-\infty }^{+\infty }xe^{-(x-\mu )^{2}/2\sigma ^{2}}dx={\frac {1}{\sigma {\sqrt {2\pi }}}}\int _{-\infty }^{+\infty }(y+\mu )e^{-y^{2}/2\sigma ^{2}}dy=\mu }$

#### Variance and standard deviation

• We obtain the variance of a standard normal random variable ${\displaystyle X}$ with an integration by parts:

${\displaystyle Var(X)={\frac {1}{\sqrt {2\pi }}}\int _{-\infty }^{+\infty }x^{2}e^{-x^{2}/2}dx={\frac {1}{\sqrt {2\pi }}}([-xe^{-x^{2}/2}]_{-\infty }^{+\infty }-\int _{-\infty }^{+\infty }-e^{-x^{2}/2}dx)=1}$

• With ${\displaystyle y=x/\sigma }$, ${\displaystyle dx=\sigma dy}$, we then obtain the variance of the centered normal law:

${\displaystyle Var(X)={\frac {1}{\sigma {\sqrt {2\pi }}}}\int _{-\infty }^{+\infty }x^{2}e^{-x^{2}/2\sigma ^{2}}dx={\frac {1}{\sigma {\sqrt {2\pi }}}}\int _{-\infty }^{+\infty }\sigma ^{2}y^{2}e^{-y^{2}/2}\sigma dy=\sigma ^{2}}$

• The variance of the normal distribution is identical:

${\displaystyle Var(X)={\frac {1}{\sigma {\sqrt {2\pi }}}}\int _{-\infty }^{+\infty }(x-\mu )^{2}e^{-(x-\mu )^{2}/2\sigma ^{2}}dx={\frac {1}{\sigma {\sqrt {2\pi }}}}\int _{-\infty }^{+\infty }y^{2}e^{-y^{2}/2\sigma ^{2}}dy=\sigma ^{2}}$

${\displaystyle \sigma }$ is therefore the standard deviation ${\displaystyle {\sqrt {Var(X)}}}$ of ${\displaystyle X}$.

For the normal distribution, ${\displaystyle \sigma }$ is also a reasonable estimate of the mean of the absolute values of the deviations from the mean, which is about ${\displaystyle 0.8\sigma }$:

${\displaystyle {\frac {1}{\sqrt {2\pi }}}\int _{-\infty }^{+\infty }|x|e^{-x^{2}/2}dx=2{\frac {1}{\sqrt {2\pi }}}\int _{0}^{+\infty }xe^{-x^{2}/2}dx}$

${\displaystyle ={\sqrt {\frac {2}{\pi }}}[-e^{-x^{2}/2}]_{0}^{+\infty }={\sqrt {\frac {2}{\pi }}}}$

hence

${\displaystyle {\frac {1}{\sigma {\sqrt {2\pi }}}}\int _{-\infty }^{+\infty }|x-\mu |e^{-(x-\mu )^{2}/2\sigma ^{2}}dx=\sigma {\sqrt {\frac {2}{\pi }}}\approx 0.8\sigma }$
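As a sanity check on the mean, the variance and the mean absolute deviation of the normal law, the three integrals can be evaluated numerically. This is only a sketch under stated assumptions: the parameter values `mu = 1.5` and `sigma = 2.0` are hypothetical, the helper `integrate` is a simple midpoint rule, and the integration range is truncated to twelve standard deviations on each side of the mean.

```python
import math

def integrate(f, a, b, n=20_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

mu, sigma = 1.5, 2.0  # hypothetical parameters, chosen for illustration

def normal_pdf(x):
    """Density of the normal law with mean mu and standard deviation sigma."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# Assumption: the density is negligible beyond 12 standard deviations.
lo, hi = mu - 12 * sigma, mu + 12 * sigma

mean = integrate(lambda x: x * normal_pdf(x), lo, hi)
variance = integrate(lambda x: (x - mu) ** 2 * normal_pdf(x), lo, hi)
mean_abs_dev = integrate(lambda x: abs(x - mu) * normal_pdf(x), lo, hi)

print(mean, mu)                                       # expected value
print(variance, sigma ** 2)                           # variance
print(mean_abs_dev, sigma * math.sqrt(2 / math.pi))   # about 0.8 * sigma
```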

### The lognormal law

#### Probability density

Let ${\displaystyle X}$ be a random variable such that ${\displaystyle \ln X}$ follows a normal law with mean ${\displaystyle \mu }$ and standard deviation ${\displaystyle \sigma }$.

The cumulative distribution function of ${\displaystyle X}$ is

${\displaystyle F_{X}(x)=Pr(X\leq x)=Pr(\ln X\leq \ln x)=F_{\ln X}(\ln x)}$

The probability density of ${\displaystyle X}$ is therefore

${\displaystyle f_{X}(x)={\frac {d}{dx}}F_{X}(x)={\frac {1}{x}}f_{\ln X}(\ln x)={\frac {1}{x\sigma {\sqrt {2\pi }}}}e^{-{\frac {(\ln x-\mu )^{2}}{2\sigma ^{2}}}}}$

Some log-normal density functions with identical parameter ${\displaystyle \mu }$ but differing parameters ${\displaystyle \sigma }$
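The passage from the cumulative distribution function to the density can be illustrated numerically: a finite-difference derivative of ${\displaystyle F_{X}}$ should agree with the closed-form ${\displaystyle f_{X}}$. The sketch below assumes hypothetical parameters `mu = 0.0` and `sigma = 0.5`, and expresses the normal cumulative distribution function with the error function `math.erf`; the function names are illustrative only.

```python
import math

mu, sigma = 0.0, 0.5  # hypothetical parameters, chosen for illustration

def normal_cdf(z):
    """Cumulative distribution function of the standard normal law."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def lognormal_cdf(x):
    """F_X(x) = F_{ln X}(ln x), with ln X normal(mu, sigma)."""
    return normal_cdf((math.log(x) - mu) / sigma)

def lognormal_pdf(x):
    """Closed-form density obtained by differentiating F_X."""
    return math.exp(-(math.log(x) - mu) ** 2 / (2 * sigma ** 2)) / (
        x * sigma * math.sqrt(2 * math.pi)
    )

def numerical_derivative(f, x, h=1e-5):
    """Central finite difference, an approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

for x in (0.5, 1.0, 2.0):
    print(x, numerical_derivative(lognormal_cdf, x), lognormal_pdf(x))
```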

#### Expected value

${\displaystyle \langle X\rangle =\int _{0}^{+\infty }xf_{X}(x)dx={\frac {1}{\sigma {\sqrt {2\pi }}}}\int _{0}^{+\infty }e^{-(\ln x-\mu )^{2}/2\sigma ^{2}}dx}$

With ${\displaystyle y=\ln x}$, ${\displaystyle x=e^{y}}$, ${\displaystyle dy={\frac {dx}{x}}}$, ${\displaystyle dx=e^{y}dy}$

${\displaystyle \langle X\rangle ={\frac {1}{\sigma {\sqrt {2\pi }}}}\int _{-\infty }^{+\infty }e^{y-(y-\mu )^{2}/2\sigma ^{2}}dy}$

Now ${\displaystyle y-(y-\mu )^{2}/2\sigma ^{2}=-(y-(\mu +\sigma ^{2}))^{2}/2\sigma ^{2}+\mu +\sigma ^{2}/2}$

Hence

${\displaystyle \langle X\rangle =e^{\mu +\sigma ^{2}/2}{\frac {1}{\sigma {\sqrt {2\pi }}}}\int _{-\infty }^{+\infty }e^{-(y-(\mu +\sigma ^{2}))^{2}/2\sigma ^{2}}dy}$

${\displaystyle =e^{\mu +\sigma ^{2}/2}}$
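The result ${\displaystyle \langle X\rangle =e^{\mu +\sigma ^{2}/2}}$ can be compared with a direct numerical evaluation of the integral in the variable ${\displaystyle y=\ln x}$. This is a minimal sketch: the parameters `mu = 0.1` and `sigma = 0.4` are hypothetical, `integrate` is a simple midpoint rule, and the range is truncated to twelve standard deviations around ${\displaystyle \mu }$, where the integrand is assumed negligible outside.

```python
import math

def integrate(f, a, b, n=20_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

mu, sigma = 0.1, 0.4  # hypothetical parameters, chosen for illustration

def integrand(y):
    """e^y times the normal density of y = ln x."""
    return math.exp(y - (y - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# Assumption: the integrand is negligible outside this truncated range.
numeric_mean = integrate(integrand, mu - 12 * sigma, mu + 12 * sigma)
closed_form_mean = math.exp(mu + sigma ** 2 / 2)

print(numeric_mean, closed_form_mean)
```

Note that the mean of ${\displaystyle X}$ is larger than ${\displaystyle e^{\mu }}$: the factor ${\displaystyle e^{\sigma ^{2}/2}}$ comes from completing the square in the exponent.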

#### Variance and standard deviation

${\displaystyle \langle X^{2}\rangle =\int _{0}^{+\infty }x^{2}f_{X}(x)dx={\frac {1}{\sigma {\sqrt {2\pi }}}}\int _{0}^{+\infty }xe^{-(\ln x-\mu )^{2}/2\sigma ^{2}}dx}$

With ${\displaystyle y=\ln x}$, ${\displaystyle x=e^{y}}$, ${\displaystyle dx=e^{y}dy}$

${\displaystyle \langle X^{2}\rangle ={\frac {1}{\sigma {\sqrt {2\pi }}}}\int _{-\infty }^{+\infty }e^{2y-(y-\mu )^{2}/2\sigma ^{2}}dy}$

Now ${\displaystyle 2y-(y-\mu )^{2}/2\sigma ^{2}=-(y-(\mu +2\sigma ^{2}))^{2}/2\sigma ^{2}+2(\mu +\sigma ^{2})}$

Hence

${\displaystyle \langle X^{2}\rangle =e^{2(\mu +\sigma ^{2})}{\frac {1}{\sigma {\sqrt {2\pi }}}}\int _{-\infty }^{+\infty }e^{-(y-(\mu +2\sigma ^{2}))^{2}/2\sigma ^{2}}dy}$

${\displaystyle =e^{2(\mu +\sigma ^{2})}}$

${\displaystyle Var(X)=\langle X^{2}\rangle -\langle X\rangle ^{2}=e^{2(\mu +\sigma ^{2})}-(e^{\mu +\sigma ^{2}/2})^{2}}$

${\displaystyle =(e^{2\mu +\sigma ^{2}})(e^{\sigma ^{2}}-1)}$

The standard deviation of ${\displaystyle X}$ is therefore

${\displaystyle {\sqrt {Var(X)}}=e^{\mu +\sigma ^{2}/2}{\sqrt {e^{\sigma ^{2}}-1}}}$
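The closed-form mean and standard deviation of the lognormal law can also be compared with a simulation. The sketch below is an illustration under assumptions, not a definitive implementation: the function name `lognormal_moments`, the parameters `mu = 0.1` and `sigma = 0.4`, the seed and the sample size are all hypothetical choices. The samples are drawn as ${\displaystyle e^{Y}}$ with ${\displaystyle Y}$ normal, so the sample statistics only approach the formulas up to sampling error.

```python
import math
import random

def lognormal_moments(mu, sigma):
    """Mean, variance and standard deviation of X when ln X is normal(mu, sigma)."""
    mean = math.exp(mu + sigma ** 2 / 2)
    variance = math.exp(2 * mu + sigma ** 2) * (math.exp(sigma ** 2) - 1)
    return mean, variance, math.sqrt(variance)

mu, sigma = 0.1, 0.4        # hypothetical parameters
rng = random.Random(12345)  # fixed seed so that the run is repeatable

# Draw X = exp(Y) with Y normal(mu, sigma).
samples = [math.exp(rng.gauss(mu, sigma)) for _ in range(200_000)]

sample_mean = sum(samples) / len(samples)
sample_var = sum((x - sample_mean) ** 2 for x in samples) / len(samples)
sample_std = math.sqrt(sample_var)

mean, variance, std = lognormal_moments(mu, sigma)
print("mean:", sample_mean, "vs", mean)
print("std: ", sample_std, "vs", std)
```

The ratio of the standard deviation to the mean is ${\displaystyle {\sqrt {e^{\sigma ^{2}}-1}}}$, which depends only on ${\displaystyle \sigma }$.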