This report gives detailed information about the “Correlation Setting” popup window (see Figure 1). The window is opened by clicking the radio button labelled “Known dep” in the main screen of Statool. We list the approaches used to calculate the theoretical ranges of the correlation and of the expectation of XY. We also demonstrate how constraints for linear programming are obtained from the different setting values.
This section illustrates how to figure out the possible range of the expectation of XY if the marginal distributions of X and Y are known.
Assume that the marginal distributions of X and Y are known, as listed in the following table.
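| Y ↓ X → | $X_1$ | $X_2$ | … | $X_m$ | Marginal of Y |
| --- | --- | --- | --- | --- | --- |
| $Y_1$ | $P_{11}$ | $P_{21}$ | … | $P_{m1}$ | $Py_1$ |
| … | … | … | … | … | … |
| $Y_n$ | $P_{1n}$ | $P_{2n}$ | … | $P_{mn}$ | $Py_n$ |
| Marginal of X | $Px_1$ | $Px_2$ | … | $Px_m$ | 1 |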
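According to the definition of EXY we have

$$EXY = \sum_{i=1}^{m} \sum_{j=1}^{n} X_i Y_j P_{ij}.$$

Here $X_i = [\underline{x}_i, \overline{x}_i]$ and $Y_j = [\underline{y}_j, \overline{y}_j]$ are interval values. Based on interval multiplication,

$$X_i Y_j = \left[\min(\underline{x}_i \underline{y}_j,\ \underline{x}_i \overline{y}_j,\ \overline{x}_i \underline{y}_j,\ \overline{x}_i \overline{y}_j),\ \max(\underline{x}_i \underline{y}_j,\ \underline{x}_i \overline{y}_j,\ \overline{x}_i \underline{y}_j,\ \overline{x}_i \overline{y}_j)\right].$$

Let $l_{ij} = \min(\underline{x}_i \underline{y}_j,\ \underline{x}_i \overline{y}_j,\ \overline{x}_i \underline{y}_j,\ \overline{x}_i \overline{y}_j)$ and $h_{ij} = \max(\underline{x}_i \underline{y}_j,\ \underline{x}_i \overline{y}_j,\ \overline{x}_i \underline{y}_j,\ \overline{x}_i \overline{y}_j)$.
Then

$$EXY = \left[\sum_{i=1}^{m} \sum_{j=1}^{n} l_{ij} P_{ij},\ \sum_{i=1}^{m} \sum_{j=1}^{n} h_{ij} P_{ij}\right].$$

There are also constraints on the $P_{ij}$'s from the marginal distributions. These are the row and column constraints, as follows:
$\sum_{i=1}^{m} P_{ij} = Py_j$ for j=1 to n
$\sum_{j=1}^{n} P_{ij} = Px_i$ for i=1 to m
Here, only the $P_{ij}$, i=1 to m, j=1 to n, are unknown. Our objective is to find the minimum and maximum values possible for EXY. Since each $P_{ij}$ is non-negative, the minimum value of EXY is obtained by minimizing $\sum_{i=1}^{m} \sum_{j=1}^{n} l_{ij} P_{ij}$ and the maximum value of EXY is obtained by maximizing $\sum_{i=1}^{m} \sum_{j=1}^{n} h_{ij} P_{ij}$. Therefore two linear programs are constructed to get the minimum and maximum values of EXY.
Minimum value:
Minimize $\sum_{i=1}^{m} \sum_{j=1}^{n} l_{ij} P_{ij}$
Subject to:
$\sum_{i=1}^{m} P_{ij} = Py_j$ for j=1 to n
$\sum_{j=1}^{n} P_{ij} = Px_i$ for i=1 to m
Maximum value:
Maximize $\sum_{i=1}^{m} \sum_{j=1}^{n} h_{ij} P_{ij}$
Subject to:
$\sum_{i=1}^{m} P_{ij} = Py_j$ for j=1 to n
$\sum_{j=1}^{n} P_{ij} = Px_i$ for i=1 to m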
After solving these two linear programs, the minimum and maximum values of EXY are obtained and recorded as Emin and Emax. These values are presented in the “Expectation of XY” subwindow of the “Correlation Setting” popup window.
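The two linear programs can be illustrated on the smallest non-trivial case. The following is a minimal sketch, not Statool's implementation: it assumes X and Y each take exactly two interval values, so the marginal constraints leave a single free joint probability and each linear objective is extreme at an endpoint of its feasible range. The function names `interval_product` and `exy_range_2x2` are hypothetical.

```python
def interval_product(a, b):
    """Product of two intervals given as (low, high) tuples."""
    corners = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return min(corners), max(corners)


def exy_range_2x2(xs, ys, px, py):
    """Return (Emin, Emax) when X and Y each take two interval values.

    xs, ys: lists of two (low, high) intervals; px, py: their probabilities.
    """
    # Coefficients of the two objectives: low and high ends of X_i * Y_j.
    low = [[interval_product(x, y)[0] for y in ys] for x in xs]
    high = [[interval_product(x, y)[1] for y in ys] for x in xs]

    def joint(t):
        # The marginal constraints leave one free parameter t = P_11.
        return [[t, px[0] - t], [py[0] - t, 1.0 - px[0] - py[0] + t]]

    def objective(coef, t):
        p = joint(t)
        return sum(coef[i][j] * p[i][j] for i in range(2) for j in range(2))

    # Non-negativity of all four cells bounds t from below and above.
    t_ends = (max(0.0, px[0] + py[0] - 1.0), min(px[0], py[0]))
    # A linear objective over an interval attains its extremes at an endpoint.
    e_min = min(objective(low, t) for t in t_ends)
    e_max = max(objective(high, t) for t in t_ends)
    return e_min, e_max
```

For larger tables the same two objectives and constraints would be handed to a general linear programming solver; the endpoint shortcut only works because the 2×2 case has one degree of freedom.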
Although the marginal distributions don’t determine the exact correlation between two random variables, they often constrain it to some extent. In the following, we will show how to compute the possible correlation range from the marginal distributions.
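From the definition of correlation,

$$\rho = \frac{EXY - EX \cdot EY}{\sqrt{Var(X)\,Var(Y)}},$$

where Var(X) and Var(Y) are the variances of X and Y. Rearranging,

$$EXY = \rho \sqrt{Var(X)\,Var(Y)} + EX \cdot EY.$$

From the previous section, the theoretical range of EXY from the definition of EXY is from Emin to Emax. Here we have another formula for EXY, from the definition of correlation. We consider computing the possible range of EXY from this new formula. EXY can be written as

$$EXY = \rho \sqrt{\left(\sum_{i=1}^{m} X_i^2 Px_i - \Big(\sum_{i=1}^{m} X_i Px_i\Big)^2\right)\left(\sum_{j=1}^{n} Y_j^2 Py_j - \Big(\sum_{j=1}^{n} Y_j Py_j\Big)^2\right)} + \Big(\sum_{i=1}^{m} X_i Px_i\Big)\Big(\sum_{j=1}^{n} Y_j Py_j\Big).$$

We define the function F(X,Y) as the right-hand side of this expression. This is an interval-valued function. We write the corresponding real function as

$$F(x,y) = \rho \sqrt{\left(\sum_{i=1}^{m} x_i^2 Px_i - \Big(\sum_{i=1}^{m} x_i Px_i\Big)^2\right)\left(\sum_{j=1}^{n} y_j^2 Py_j - \Big(\sum_{j=1}^{n} y_j Py_j\Big)^2\right)} + \Big(\sum_{i=1}^{m} x_i Px_i\Big)\Big(\sum_{j=1}^{n} y_j Py_j\Big),$$

where $x_i \in X_i$, i=1 to m, $y_j \in Y_j$, j=1 to n, and $\rho \in [-1, 1]$.
In this function, there are n+m+1 variables and every variable is restricted to its specified interval range. We can use an optimization method to find the minimum and maximum values of F(x,y) and record them as Fmin and Fmax. (This is a nonlinear optimization problem.)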
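To make the optimization problem concrete, here is a rough sketch that evaluates F(x,y) on a coarse grid over the box of admissible values. It assumes F has the form ρ·sqrt(Var(X)·Var(Y)) + EX·EY and that ρ ranges over [-1, 1] by default; the names `f_value` and `f_range_grid` are hypothetical. A grid search only samples the box, so it can underestimate the true spread of F; a proper nonlinear optimizer would be needed for reliable values of Fmin and Fmax.

```python
import math
from itertools import product


def f_value(x, y, px, py, rho):
    """F(x, y) = rho * sqrt(Var(X) * Var(Y)) + EX * EY for point values."""
    ex = sum(xi * p for xi, p in zip(x, px))
    ey = sum(yj * p for yj, p in zip(y, py))
    vx = sum(xi * xi * p for xi, p in zip(x, px)) - ex * ex
    vy = sum(yj * yj * p for yj, p in zip(y, py)) - ey * ey
    # Clamp tiny negative variances caused by floating-point rounding.
    return rho * math.sqrt(max(vx, 0.0) * max(vy, 0.0)) + ex * ey


def grid(interval, steps):
    """Evenly spaced sample points across a (low, high) interval."""
    lo, hi = interval
    return [lo + (hi - lo) * k / steps for k in range(steps + 1)]


def f_range_grid(xs, ys, px, py, rho_range=(-1.0, 1.0), steps=4):
    """Approximate (Fmin, Fmax) by sampling all n+m+1 variables on a grid."""
    axes = [grid(iv, steps) for iv in list(xs) + list(ys) + [rho_range]]
    n_x = len(xs)
    values = [
        f_value(pt[:n_x], pt[n_x:-1], px, py, pt[-1])
        for pt in product(*axes)
    ]
    return min(values), max(values)
```

The cost grows as (steps+1)^(n+m+1), so this is only practical for a handful of intervals.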
Now we have two ranges for EXY from the two formulas. Since both are true, we exploit both by intersecting them. Call the low and high bounds of the intersection Gmin and Gmax. Then

$$Gmin = \max(Emin, Fmin)$$

and

$$Gmax = \min(Emax, Fmax).$$

The values of $x_i$ and $y_j$ used to compute Fmin and Fmax are used again here to compute the bounds on $\rho$.
Since we just want to get a safe range for correlation, not necessarily the narrowest possible range, we are done.
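The intersection step is simple enough to write out. This is a small sketch with a hypothetical helper name, `intersect`; it treats each range as a (low, high) pair and reports an error when the two ranges do not overlap, a case the text does not discuss.

```python
def intersect(range_a, range_b):
    """Intersection of two closed ranges given as (low, high) tuples."""
    low = max(range_a[0], range_b[0])
    high = min(range_a[1], range_b[1])
    if low > high:
        # Assumption: disjoint ranges indicate inconsistent inputs.
        raise ValueError("ranges do not overlap")
    return low, high
```

Calling `intersect((Emin, Emax), (Fmin, Fmax))` would then give the pair (Gmin, Gmax).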
A more accurate range for the correlation can be obtained directly by computing the minimum and maximum of

$$\rho = \frac{EXY - EX \cdot EY}{\sqrt{Var(X)\,Var(Y)}}.$$

This is a complex nonlinear optimization problem. This range is presented in the “Correlation Coefficient Subwindow” of the “Correlation Setting” popup window.
The theoretical ranges of the mean and variance of each operand are calculated by the program. These values are obtained directly from the definitions.
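From the definition, the expectation of random variable X is

$$EX = \sum_{i=1}^{m} X_i Px_i.$$

Since each $X_i = [\underline{x}_i, \overline{x}_i]$ is an interval value and each $Px_i$ is non-negative,

$$EX = \left[\sum_{i=1}^{m} \underline{x}_i Px_i,\ \sum_{i=1}^{m} \overline{x}_i Px_i\right].$$

So the bounds on EX are obtained. A similar method is used to handle operand Y. The bounds on EY are $\sum_{j=1}^{n} \underline{y}_j Py_j$ and $\sum_{j=1}^{n} \overline{y}_j Py_j$.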
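The bounds on the mean follow directly from the interval endpoints, as in this short sketch. The function name `mean_bounds` is hypothetical, and the code assumes each interval is a (low, high) pair with non-negative probabilities.

```python
def mean_bounds(intervals, probs):
    """Low and high bounds on the expectation of an interval-valued variable.

    Because every probability is non-negative, the lowest mean uses every
    low endpoint and the highest mean uses every high endpoint.
    """
    low = sum(iv[0] * p for iv, p in zip(intervals, probs))
    high = sum(iv[1] * p for iv, p in zip(intervals, probs))
    return low, high
```

The same function serves for both operands: pass the X intervals with the X marginal probabilities, or the Y intervals with the Y marginal probabilities.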
Variances of X and Y are a little more complex to obtain. Based on the definition, the variance of X is

$$Var(X) = \sum_{i=1}^{m} X_i^2 Px_i - \Big(\sum_{i=1}^{m} X_i Px_i\Big)^2.$$

Here each $X_i$ is an interval value. This is a problem of evaluating an interval function. We define a real function

$$V(x) = \sum_{i=1}^{m} x_i^2 Px_i - \Big(\sum_{i=1}^{m} x_i Px_i\Big)^2$$

with each $x_i \in X_i$, i=1 to m. Since all $Px_i$ are known, an optimization method can be used to compute the minimum and maximum values of V(x), recorded as VXmin and VXmax. A similar method is used for the variance of Y. Let

$$V(y) = \sum_{j=1}^{n} y_j^2 Py_j - \Big(\sum_{j=1}^{n} y_j Py_j\Big)^2$$

with $y_j \in Y_j$, j=1 to n. Then the bounds on the variance of Y are obtained, recorded as VYmin and VYmax. These ranges are presented in the “Mean and Variance Subwindow” of the “Correlation Setting” popup window.
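One possible way to carry out this optimization is sketched below; the text does not say which method Statool uses, so this is an assumption. It takes V(x) to be the sum of x_i² Px_i minus the squared mean, which is a convex function of x when the probabilities sum to 1. The maximum over the box is therefore at a corner, and the minimum can be found by coordinate descent. The names `variance`, `variance_max` and `variance_min` are hypothetical.

```python
from itertools import product


def variance(x, p):
    """V(x) = sum(p_i * x_i^2) - (sum(p_i * x_i))^2 for point values x."""
    mean = sum(xi * pi for xi, pi in zip(x, p))
    return sum(pi * xi * xi for xi, pi in zip(x, p)) - mean * mean


def variance_max(intervals, p):
    """Maximum of V over the box: a convex function peaks at a corner."""
    return max(variance(corner, p) for corner in product(*intervals))


def variance_min(intervals, p, sweeps=200):
    """Minimum of V over the box by cyclic coordinate descent."""
    x = [(lo + hi) / 2.0 for lo, hi in intervals]
    for _ in range(sweeps):
        for i, (lo, hi) in enumerate(intervals):
            if p[i] >= 1.0:
                continue  # one value carries all the mass; V is always 0
            # Setting dV/dx_i = 0 gives x_i = (sum of other p_k x_k) / (1 - p_i).
            rest = sum(pk * xk for k, (pk, xk) in enumerate(zip(p, x)) if k != i)
            target = rest / (1.0 - p[i])
            x[i] = min(max(target, lo), hi)  # clip to the interval
    return variance(x, p)
```

Corner enumeration visits 2^m points, so it suits small problems; the coordinate descent sweep count is a fixed budget rather than a convergence test.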
In this section, we demonstrate how to get extra constraints if the user sets the range of correlation in the “Correlation Coefficient Subwindow” of the “Correlation Setting” popup window.
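From section 2,

$$\sum_{i=1}^{m} \sum_{j=1}^{n} l_{ij} P_{ij} \le EXY \le \sum_{i=1}^{m} \sum_{j=1}^{n} h_{ij} P_{ij},$$

where $l_{ij}$ and $h_{ij}$ are the low and high bounds of the interval product $X_i Y_j$, since each $P_{ij}$ is non-negative.
From section 3,

$$EXY = \rho \sqrt{Var(X)\,Var(Y)} + EX \cdot EY.$$

Using the real function

$$F(x,y) = \rho \sqrt{\left(\sum_{i=1}^{m} x_i^2 Px_i - \Big(\sum_{i=1}^{m} x_i Px_i\Big)^2\right)\left(\sum_{j=1}^{n} y_j^2 Py_j - \Big(\sum_{j=1}^{n} y_j Py_j\Big)^2\right)} + \Big(\sum_{i=1}^{m} x_i Px_i\Big)\Big(\sum_{j=1}^{n} y_j Py_j\Big)$$

with $x_i \in X_i$, i=1 to m, $y_j \in Y_j$, j=1 to n, and the given range for correlation $\rho \in [\rho_{min}, \rho_{max}]$, the minimum and maximum values of F(x,y) can be calculated by nonlinear optimization as in section 3. Call them Fmin and Fmax.
Based on Berleant & Zhang [1], two inequalities are defined:

$$\sum_{i=1}^{m} \sum_{j=1}^{n} l_{ij} P_{ij} \le Fmax$$

and

$$\sum_{i=1}^{m} \sum_{j=1}^{n} h_{ij} P_{ij} \ge Fmin.$$

These two inequalities form two extra constraints for linear programming since only the $P_{ij}$'s are unknown.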
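As an illustration of how such constraints could be passed to a linear programming solver, the sketch below builds them as rows of a system A·p ≤ b. It assumes the two inequalities take the overlap form: the sum of l_ij·P_ij is at most Fmax, and the sum of h_ij·P_ij is at least Fmin, where l_ij and h_ij are the low and high ends of the interval product X_i·Y_j. The function names and the row-major flattening of P_ij are hypothetical choices, not Statool's.

```python
def interval_product(a, b):
    """Product of two intervals given as (low, high) tuples."""
    corners = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return min(corners), max(corners)


def correlation_constraints(xs, ys, f_min, f_max):
    """Return (A_ub, b_ub) such that A_ub @ p <= b_ub encodes both inequalities.

    p is the joint distribution flattened with the X index varying slowest.
    """
    low_row, neg_high_row = [], []
    for x in xs:
        for y in ys:
            lo, hi = interval_product(x, y)
            low_row.append(lo)
            # sum(h * p) >= f_min is rewritten as sum(-h * p) <= -f_min.
            neg_high_row.append(-hi)
    return [low_row, neg_high_row], [f_max, -f_min]
```

These two rows would be appended to the marginal equality constraints before each of the minimizing and maximizing linear programs is solved.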
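If the user sets “EXY range” in the “Expectation of EXY” subwindow of the “Correlation Setting” popup window, the values that the user provides, Fmin and Fmax, are used directly to define two constraints:

$$\sum_{i=1}^{m} \sum_{j=1}^{n} l_{ij} P_{ij} \le Fmax$$

and

$$\sum_{i=1}^{m} \sum_{j=1}^{n} h_{ij} P_{ij} \ge Fmin,$$

where $l_{ij}$ and $h_{ij}$ are the low and high bounds of the interval product $X_i Y_j$.
These constraints were justified in section 5 and in Berleant & Zhang [1].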
The user can set the mean and/or variance in the “Mean and Variance Subwindow” of the “Correlation Setting” popup window. Consider the formula

$$EXY = \rho \sqrt{Var(X)\,Var(Y)} + EX \cdot EY.$$

If the means and variances of X and Y are known, the value of EXY can be calculated if the correlation is also known. From section 5, the range for correlation is computable. We can use this range of correlation to calculate the range of EXY. Clearly, the computed EXY is an interval, not a real number. Let the low bound of EXY be called Fmin and the high bound be called Fmax. Then

$$\sum_{i=1}^{m} \sum_{j=1}^{n} l_{ij} P_{ij} \le Fmax$$

and

$$\sum_{i=1}^{m} \sum_{j=1}^{n} h_{ij} P_{ij} \ge Fmin,$$

where $l_{ij}$ and $h_{ij}$ are the low and high bounds of the interval product $X_i Y_j$. These constraints are then used by Statool.
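The range of EXY can be computed from a correlation range together with means and variances using plain interval arithmetic, as in the sketch below. It assumes the formula EXY = ρ·sqrt(Var(X)·Var(Y)) + EX·EY and treats every input as a (low, high) interval, with a known point value passed as an interval of zero width. The function names are hypothetical. Interval arithmetic gives a safe enclosure that may be wider than the tightest possible range.

```python
import math


def interval_product(a, b):
    """Product of two intervals given as (low, high) tuples."""
    corners = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return min(corners), max(corners)


def exy_interval(rho, ex, ey, vx, vy):
    """Enclosure (Fmin, Fmax) of rho * sqrt(vx * vy) + ex * ey.

    All arguments are (low, high) intervals; vx and vy must be non-negative.
    """
    # For non-negative intervals the product and square root are monotone.
    sd_product = (math.sqrt(vx[0] * vy[0]), math.sqrt(vx[1] * vy[1]))
    covariance = interval_product(rho, sd_product)
    mean_product = interval_product(ex, ey)
    return covariance[0] + mean_product[0], covariance[1] + mean_product[1]
```

For example, a user who knows the mean of X exactly as 2 would pass `ex=(2, 2)` while leaving the other arguments as ranges.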
In some situations, the user may know partial information about both the correlation and the mean, the variance, or both. Here is how the user can choose values for the correlation and for the mean and/or variance.
First, the user should click on the checkbox button labelled “Input data in both the correlation subwindow and the mean and variance subwindow” in the “Correlation, Mean, and Variance Subwindow” of the “Correlation Setting” popup window. Then the user can set values in both the “Correlation” and “Mean, Variance” subwindows of the “Correlation Setting” popup window.
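In section 7, we describe the situation where the mean and/or variance are known. If the correlation is also input, we can directly use all three in the formula

$$EXY = \rho \sqrt{Var(X)\,Var(Y)} + EX \cdot EY$$

to get the value of EXY. If either the mean or the variance is missing, a default range for it may be obtained as described in section 4. Let the low bound of EXY be called Fmin and the high bound be called Fmax. Then

$$\sum_{i=1}^{m} \sum_{j=1}^{n} l_{ij} P_{ij} \le Fmax$$

and

$$\sum_{i=1}^{m} \sum_{j=1}^{n} h_{ij} P_{ij} \ge Fmin,$$

where $l_{ij}$ and $h_{ij}$ are the low and high bounds of the interval product $X_i Y_j$.
These extra constraints are then added to the LP calls.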
[1] D. Berleant and J. Zhang, “Using correlation to improve envelopes around derived distributions,” Reliable Computing, in press as of 3/27/03

Figure 1. "Correlation Setting" popup window.