Transforming the Area under the Normal Curve (AUC) into Cohen’s d, Pearson’s r_{pb}, Odds-Ratio, and Natural Log Odds-Ratio: Two Conversion Tables


The importance of the area under the normal curve (AUC) as a measure of the effect size (ES) in applied and forensic research has been highlighted in several articles, and Tables for converting the AUC into Pearson’s r_{pb} and Cohen’s d have been published. No Tables have been published for converting the AUC into other frequently used effect size indexes, such as the odds-ratio (OR) and Ln(OR). In this article, two Tables for 500 three-digit AUC values provide the equivalent values of four ESs: Cohen’s d, Pearson’s r_{pb}, OR, and Ln(OR). In the case of r_{pb}, the values are given for four base rates (.5, .3, .2, and .1). In addition, the formulas for transforming the AUC into these effect sizes are provided, so that researchers can calculate values that have not been included in the Tables.



Many research domains of life, social, and experimental sciences report their findings in terms of an effect size (ES) estimate. Nevertheless, there is not a single agreed definition of ES. In this regard,

The use of an ES was recommended by the American Psychological Association (APA) Task Force on Statistical Inference.

The number of ES measures is large, and the choice of one coefficient or another depends, in part, on the traditions of a particular science or research area. The three most relevant factors for selecting a specific ES are: (a) the research question and the goals of the study; (b) the design type (e.g., experimental, correlational, cross-sectional, longitudinal, and so on); and (c) the nature of the variables and how they are measured (e.g., continuous, categorical, true dichotomous, dichotomized, and so on).

The most common ES estimates are the correlation coefficient (e.g., Pearson’s r), Cohen’s d, and z. Nevertheless, many other ES indexes are available to researchers. For example, ratios, the odds-ratio (OR), the natural log odds-ratio, Ln(OR), and relative operating characteristic (ROC) curves, among others, are used for reporting findings.

For example, the family of Pearson correlation coefficients is widely used in validity studies in work and organizational (W/O) psychology (e.g., Lado &

Nevertheless, even within these research areas and domains, and although one specific ES index is more frequently employed, empirical studies use a variety of indexes for reporting their findings. For this reason, over the years, formulas have been developed for converting one ES coefficient into others. For example, there are formulas for converting r into d, and r into z, and vice versa.

A problem with some empirical studies is that they do not report the ES and instead report the results in terms of other statistics, such as F, t, χ^{2}, and so on. For this reason, researchers need to transform these values into an ES index. It is also important that researchers can transform the results reported in one metric into a different metric. For example, it is relatively common and easy for researchers to transform t, F, and χ^{2} into Pearson’s r or into Cohen’s d, and Pearson’s r into Cohen’s d or Cohen’s d into Pearson’s r (see r_{pb}, into OR). For example,

Therefore, converting one ES into another is technically possible and several transformation formulas are available to researchers. However, many of these transformations require a number of intermediate conversions that increase the computational work, and many of the conversions are not available by default in the most popular statistical packages (e.g., SPSS, SAS). Furthermore, some of these intermediate transformations involve long power series. As a side effect, in meta-analytic studies some researchers exclude primary studies from the final database because they cannot transform the original statistical coefficient into the effect size used in the meta-analytic method (typically r and d in psychology).

The objective of this paper is to provide researchers and practitioners with Tables for converting AUC values into Cohen’s d, r_{pb} for different base rates, OR, and Ln(OR). At present, researchers and practitioners can use the Table constructed by Rice and Harris, which gives Cohen’s d and r_{pb} values for 137 AUC values. However, there are no Tables for converting AUC values directly into OR and Ln(OR) (and vice versa), nor for r_{pb} at base rates other than 50%.

In a pioneering effort, ^{2}) into ES and, particularly, they highlighted the importance of the area under the curve (AUC) as an index in forensic psychology research where it is commonly used. The area under the ROC curve is also frequently used in diagnostic areas (e.g., radiology, clinical microbiology) and in psychophysics, among other research and practice fields (see

A ROC curve describes the functional relationship between the proportion of times that an alternative (say, h) is chosen when it occurs and the proportion of times that this alternative is chosen when another alternative (say, f) occurs. In signal-detection theory (SDT) terminology, h is termed hit rate (or success proportion, or true-positive proportion), and f is termed false-alarm rate (or failure proportion, or false-positive proportion). The two most popular indices of discrimination are d′ and A_{z}, which describe how accurately experimental subjects and professionals perform a discrimination task.

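In the original proposal of the index, d′ was defined as d′ = (μ_{h} − μ_{f})/σ_{f}, where μ_{h} and μ_{f} are the means of the distributions for h and f, and σ_{f} is the standard deviation of the distribution for f.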

This equation shows the difference between the means of the two probability density functions (PDF) in terms of the standard deviation of the PDF for f (or, equivalently, for h). The d′ index assumes equal-variance normal distributions. However, as there are cases in which the assumption of equal variances does not hold, an index termed Δm was proposed, which is consistent with unequal-variance normal distributions.

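It is obvious that if σ_{h} and σ_{f} are equal, then d′ and Δm are equal too. Interestingly, the Δm index is algebraically equivalent to Glass’ Δ coefficient, in which the difference between the treatment and control group means is divided by the standard deviation of the control group.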

Therefore, Glass’ Δ and Δm are analogous. In addition, Glass’ Δ can be considered a special case of Cohen’s d, because if the standard deviation of the control group (σ_{C}) is equal to the standard deviation of the treatment group (σ_{T}), Cohen’s d and Glass’ Δ are equal; therefore d′, Δm, Glass’ Δ, and Cohen’s d are equal too. Otherwise, Cohen’s d can be smaller or larger than Glass’ Δ depending on the value of the ratio σ_{T}/σ_{C}. Because Cohen’s d is standardized by the pooled standard deviation, if the ratio is greater than 1 the pooled standard deviation exceeds σ_{C} and Cohen’s d will be smaller than Glass’ Δ; if the ratio is smaller than 1, Cohen’s d will be larger than Glass’ Δ.
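The relation between the two indices can be checked numerically. The following is a minimal sketch, assuming equal group sizes so that the pooled standard deviation is the root mean square of σ_{T} and σ_{C}; the function names and the example means and standard deviations are hypothetical and are not taken from the article.

```python
from math import sqrt


def glass_delta(mean_t: float, mean_c: float, sd_c: float) -> float:
    """Glass' delta: mean difference in units of the control-group SD."""
    return (mean_t - mean_c) / sd_c


def cohens_d(mean_t: float, mean_c: float, sd_t: float, sd_c: float) -> float:
    """Cohen's d with a pooled SD (assumption: equal group sizes)."""
    pooled_sd = sqrt((sd_t ** 2 + sd_c ** 2) / 2.0)
    return (mean_t - mean_c) / pooled_sd


# Hypothetical example values: (mean_t, mean_c, sd_t, sd_c)
for sd_t, sd_c in [(2.0, 2.0), (3.0, 2.0), (1.0, 2.0)]:
    d = cohens_d(10.0, 8.0, sd_t, sd_c)
    delta = glass_delta(10.0, 8.0, sd_c)
    print(f"sd_t/sd_c = {sd_t / sd_c:.2f}  d = {d:.3f}  Glass delta = {delta:.3f}")
```

With equal standard deviations the two values coincide; they diverge as soon as the ratio σ_{T}/σ_{C} departs from 1.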

The second SDT index, A_{z}, can range from .5 to 1 and it assumes any form of distribution that can be transformed monotonically to the normal distribution. As A_{z} is the area under the normal curve, if the two variances are equal, A_{z} = Φ(d′/√2), where Φ is the standard normal cumulative distribution function,

and, therefore, if the two variances are unequal

These last two equations are particularly relevant because, due to the equivalence among d′, Δm, Glass’ Δ, and Cohen’s d shown above, they provide a way to transform the AUC into Cohen’s d. In addition, as

A point that must be mentioned is that, in some cases, the observed A_{z} values are underestimates of the true A_{z}, and a correction of A_{z} for selection bias may be needed.

In order to facilitate comparisons across follow-up studies that report different ES indexes, Rice and Harris published a Table for converting AUC values into Cohen’s d and r_{pb}. One limitation of Rice and Harris’ Table is that it includes values for a 50% base rate only; in other words, p = q = .5. However, base rates different from 50% are not uncommon in practice and research. The r_{pb} value corresponding to a specific d value depends partially on the base rate. Therefore, for the same d value, different r_{pb} values can exist depending on the base rate. Moreover, as soon as the base rate differs from 50%, the r_{pb} values decrease: the further the base rate departs from 50% in either direction, the smaller the r_{pb}.

To create the Table,

Next, to transform the Cohen’s d value into the equivalent r_{pb} value, they used the well-known formula:

On this point, it is important to take into account that the conversion of d to a correlation always produces r_{pb}, but r_{pb} is the appropriate correlation only when a true binary variable is used in the study. In other words, the nature of the dichotomization must be considered. If the dichotomization is natural (e.g., treatment vs. non-treatment; men vs. women), then the conversion from d produces r_{pb}, but if the dichotomization is artificial (i.e., a continuous variable is divided into two groups), then the conversion of d yields an underestimate of r. In that case, d would first be converted to r_{pb}, and r_{pb} would then be converted to r. This clarification is relevant because as

Consequently, the d values under the first research design (i.e., true experiment, true binary variable) and the second one (i.e., artificial dichotomization of a continuous variable) provide different information and different estimates of the ES. For example, a Pearson’s r of .50 equals an r_{pb} of .40 (for p = q = .50) and equals a d of .87.

When a continuous variable is dichotomized, r_{pb} underestimates r: r_{pb} equals r multiplied by y/√(p·q), where √(p·q) is the standard deviation of the dichotomized variable (p = proportion of cases in the higher group or hits and q = 1 − p) and y is the ordinate of the normal-distribution curve at the point of division between p and q. The underestimation factor, √(p·q)/y, varies from about 1.25 when p = .5 to about 3.73 when p = .99.
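The size of this underestimation can be reproduced with a short sketch. It uses only the Python standard library; the function names are illustrative assumptions, and the correction shown is the usual point-biserial to biserial adjustment, r = r_{pb}·√(p·q)/y, which is what the factor described above implies.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, SD 1)


def dichotomization_factor(p: float) -> float:
    """Return sqrt(p*q)/y, the factor by which r_pb underestimates r."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    q = 1.0 - p
    cut = _STD_NORMAL.inv_cdf(p)   # point of division between p and q
    y = _STD_NORMAL.pdf(cut)       # ordinate of the normal curve at that point
    return sqrt(p * q) / y


def rpb_to_r(r_pb: float, p: float) -> float:
    """Correct r_pb for artificial dichotomization (result may exceed 1)."""
    return r_pb * dichotomization_factor(p)


for p in (0.5, 0.7, 0.9, 0.99):
    print(f"p = {p:.2f}  factor = {dichotomization_factor(p):.2f}")
```

For p = .5 the factor is about 1.25, so a Pearson’s r of .50 corresponds to an r_{pb} of about .40, in line with the example given earlier.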

Two new extended Tables have been created using a process of six steps. In each step, a new transformation of the AUC is made and a different ES estimate is given. In addition to z, the new Tables provide the estimates of AUC as Cohen’s d for dichotomous variables, r_{pb}, OR, and the Ln(OR) for continuous and dichotomous variables.
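The first transformations in this process (AUC into the normal deviate z, and z into Cohen’s d) can be sketched as follows. This is an illustration rather than the procedure actually used to build the Tables: it assumes equal-variance normal distributions, so that AUC = Φ(d/√2) and therefore d = √2·z, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, SD 1)


def auc_to_z(auc: float) -> float:
    """Normal deviate z such that Phi(z) equals the AUC."""
    if not 0.0 < auc < 1.0:
        raise ValueError("AUC must lie strictly between 0 and 1")
    return _STD_NORMAL.inv_cdf(auc)


def auc_to_d(auc: float) -> float:
    """Cohen's d under the equal-variance assumption: d = sqrt(2) * z."""
    return sqrt(2.0) * auc_to_z(auc)


def d_to_auc(d: float) -> float:
    """Inverse transformation: AUC = Phi(d / sqrt(2))."""
    return _STD_NORMAL.cdf(d / sqrt(2.0))


for auc in (0.500, 0.600, 0.700, 0.800, 0.900):
    print(f"AUC = {auc:.3f}  z = {auc_to_z(auc):.3f}  d = {auc_to_d(auc):.3f}")
```

Because the mapping is monotonic, `d_to_auc` recovers the original AUC from any d produced by `auc_to_d`, which is convenient for values not listed in a table.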

The process of construction of

Where

Or it can be made using one of the functions listed by

The third step involved the conversion of Cohen’s d into r_{pb}. This was done for base rates of .5, .3, .2, and .1. The equation is:

r_{pb} = d / √(d^{2} + K)

where K = 1/(p·q). Therefore, the K values for p = .5, .3, .2, and .1 are 4, 4.76, 6.25, and 11.11, respectively. The K values clearly illustrate the decline of r_{pb} as the p values decrease.
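As an illustration of this step, the sketch below computes K and r_{pb} for the four base rates, assuming the standard conversion r_{pb} = d/√(d² + K) with K = 1/(p·q). The function names and the example value d = 1.0 are hypothetical.

```python
from math import sqrt


def k_factor(p: float) -> float:
    """K = 1 / (p * q), with q = 1 - p."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return 1.0 / (p * (1.0 - p))


def d_to_rpb(d: float, p: float = 0.5) -> float:
    """Point-biserial correlation equivalent to Cohen's d at base rate p."""
    return d / sqrt(d * d + k_factor(p))


d_example = 1.0  # hypothetical Cohen's d used only for illustration
for p in (0.5, 0.3, 0.2, 0.1):
    print(f"p = {p:.1f}  K = {k_factor(p):.2f}  r_pb = {d_to_rpb(d_example, p):.3f}")
```

For a fixed d, the printed r_{pb} values shrink as p moves away from .5, because K grows from 4 to 11.11.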

and

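In summary, the first Table gives the conversion of AUC values into z, Cohen’s d, and r_{pb}, for four base rates, and the second Table gives the conversion of AUC values into z, Cohen’s d, Ln(OR), and OR.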

Note. AUC = area under the normal curve; z = normal deviate of AUC based on the assumption that X is normally distributed in both groups; d = transformation of z into Cohen’s d; r_{pb} = transformation of Cohen’s d into the point-biserial correlation; p = base rate or proportion of the treatment (experimental) group; q = 1 − p.

Note. AUC = area under the normal curve; z = normal deviate of AUC based on the assumption that X is normally distributed in both groups; d = transformation of z into Cohen’s d; Ln(OR) = transformation of Cohen’s d into the natural logarithm of OR; OR = transformation of Cohen’s d into OR.
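The conversion of Cohen’s d into Ln(OR) and OR described in the Note above can be sketched as follows. The equations for these steps are not reproduced in this text, so the sketch assumes the logistic-distribution conversion commonly used in meta-analysis, Ln(OR) = d·π/√3 and OR = exp(Ln(OR)); whether the Tables were built with exactly this formula is an assumption, and the function names are hypothetical.

```python
from math import exp, pi, sqrt


def d_to_ln_or(d: float) -> float:
    """Natural log odds-ratio from Cohen's d (assumed: Ln(OR) = d * pi / sqrt(3))."""
    return d * pi / sqrt(3.0)


def d_to_or(d: float) -> float:
    """Odds-ratio obtained by exponentiating the log odds-ratio."""
    return exp(d_to_ln_or(d))


def ln_or_to_d(ln_or: float) -> float:
    """Inverse conversion: Cohen's d from a log odds-ratio."""
    return ln_or * sqrt(3.0) / pi


for d in (0.2, 0.5, 0.8):
    print(f"d = {d:.1f}  Ln(OR) = {d_to_ln_or(d):.3f}  OR = {d_to_or(d):.3f}")
```

Under this assumption a d of 0 corresponds to an OR of 1, and the conversion is symmetric around that point on the log scale.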

Reporting ESs is an increasingly common practice in research, and Pearson’s r and Cohen’s d are two of the most frequently used indices of ES. However, a large variety of ES indices are used, and some of them are used mainly in specific research domains. The AUC is one of the indices most typically reported in forensic psychology research and in diagnostic and clinical research (e.g., radiology). The odds-ratio (OR) and its varieties (e.g., Ln(OR), risk ratio) are also commonly found in articles reporting research findings. To encourage applied and forensic psychology researchers to report an ES, and to facilitate the conversion of a variety of ES indices to a common metric, this article tabulated the conversion from AUC into Cohen’s d, OR, Ln(OR), and r_{pb} for base rates of 50%, 30%, 20%, and 10% for 500 three-digit AUC values. Rice and Harris’s Table did not include r_{pb} values for base rates other than 50%, nor ESs other than Cohen’s d and r_{pb}. Therefore, the new Tables can be seen, to a certain extent, as an extension of Rice and Harris’s Table.

Researchers should be aware of when it is correct and appropriate to transform a specific ES into another one and when the transformation is not appropriate, even though the formulas can be applied in both cases. For example, d′ should not be transformed into OR when a within-group design is used, as OR requires a design with two independent samples. With regard to the transformation of Cohen’s d into a correlation coefficient, the nature (true binary vs. dichotomized) of the independent variable dictates what type of correlation coefficient should be obtained. Therefore, we recommend that researchers make reasonable use of the transformation formulas.

In summary, this article presents two new Tables for obtaining four types of ESs from AUC values and, simultaneously, for converting one ES index into several others. In addition, researchers and practitioners can calculate the ES of interest themselves, using the formulas included here, for values not given in the Tables.
