An Approach of Power Estimation for Linear Mixed Models for Clinical Studies
Alcon (China) Ophthalmic Product Co., Ltd., Beijing, China
To cite this article:
Weijia Feng. An Approach of Power Estimation for Linear Mixed Models for Clinical Studies. Science Journal of Applied Mathematics and Statistics. Vol. 4, No. 2, 2016, pp. 59-63. doi: 10.11648/j.sjams.20160402.17
Received: February 25, 2016; Accepted: April 7, 2016; Published: April 11, 2016
Abstract: This study aims to demonstrate a practical way of power estimation for linear mixed models in clinical studies. Approximation methods using z and t statistics are discussed and compared to the simulated results. It was found that the approximation methods generally provide a slight overestimation of power, relative to simulated results using the Kenward and Roger estimation of degree of freedom. However, results of approximation methods can be informative in certain scenarios. In conclusion, the z approximation and t approximation with a residual degree of freedom can be useful in certain situations. Simulation method can serve as a general solution.
Keywords: Power, Sample Size, Linear Mixed Model, Clinical Studies
Linear mixed models and generalized linear models are popular choices in the analyses of clinical trial data, especially for the studies with longitudinal designs (Cook and DeMets, 2007; Laird and Ware, 1982). At the study initiation stage, since the process of study planning is usually complex and involves various financial factors and parties, a quick estimation of sample size is considered very useful for the clinical team and decision makers.
Unfortunately, unlike the simpler cases of comparing group means or group proportions, such as classical t test of two sample means from normal populations, or the z test, for which the sample size calculation (Lachin, 1981; Chow et al., 2003; Julious, 2003) is even possible without using a calculator, the estimation of sample size for mixed linear models is typically complex and the approach is study-based (Dang et al., 2008; Chen et al., 2008; Vierron and Giraudeau, 2006; Heo et al., 2010; Kain et al., 2015; Candel and van Breukelen, 2015). Even for the cases of methods with some degree of generality (Lu et al., 2009; Liu and Liang, 1997; Self and Mauritsen, 1988), the actual implementation is more or less limited given their complexity.
In this report, the author presents an approach that is of a practical nature and show that approximation methods can be useful in the special, but definitely not rare cases, where the simplicity of estimation is important. Meanwhile as a general solution, a simulation method can be implemented which is not much more complicated than the proposed approximation methods.
A general linear mixed model is defined as bellow:
Where Y is a n.. by 1 vector. n.. is the total number of observations. X is the design matrix for the fixed effects. The fix effects are represented by an m by 1 vector β, where m is the number of effects. Z is the coefficient for the random effect vector γ. Error term ε is a n. by 1 vector.
It is assumed that γ and ε are normally distributed with zero means. The covariance matrix between the random vectors is , where G and R are the variance covariance matrix of γ and ε respectively. Zeros represent zero matrices of appropriate dimensions.
Define V as the variance-covariance matrix of y, then
In longitudinal clinical studies, usually researchers are primarily interested in the comparison of marginal means representing the treatment effect for all repeated measures (or visits), or for some repeated measures, or at one particular measure (Lu et al., 2009).Utilizing the mixed model, such marginal means can be compared by identifying and analyzing the linear combination of the fixed effect β, i.e., , when a vector l is properly defined.
A typical two-sided test of equivalence can take the form H0: lβ=0 and H1: lβ≠0. Whereas a non-inferiority test could take the form H0: lβ≤Δ and H1: lβ>Δ. where Δ represents a non-inferiority margin.
To determine the power of such tests, a straightforward approach is to consider the following statistic: t = (Lu et al., 2009; Heo et al., 2010), where SE standards for the standard error. When V (2) is unknown, the distribution of t is unknown (McCulloch and Searle, 2001). However, one can assume that using an estimate of V, t follows a central t distribution under the null hypothesis of the two sided test with an unknown degree of freedom (Rencher and Schaalje, 2008; Kenward and Roger, 1997). In such a manner, one can perform a t test with an estimated degree of freedom.
It is then natural to apply the method commonly used to determine the power of t tests to get the power estimates. This takes the approximate form (Chow et. al, 2008):
where α, β are type I and type II error rates. T(X,v|B) is the cumulative t distribution function for X, with v degree of freedom and a non-centrality parameter of B. is the ()x 100-th quantile of the t distribution of v degree of freedom.
Alternatively, a Z test for large samples (Rencher and Schaalje, 2008) can be similarly defined by assuming an asymptotic normal distribution of the test statistic z = , in this case the power can be approximately formulated as (Chow et. al, 2008),:
where Φ is the cumulative distribution function of the standard normal variable. is the ()x100-th quantile of the standard normal distribution.
Consider the maximum likelihood estimate of (Lu et al., 2009), the can be specified (McCulloch and Searle, 2001) by taking the square root of
The terms on the right hand side as well as the raw effect , are assumed to be known in the sense that the effect size is known, to be able to produce estimations of power.
3. A Simulation Study of a Typical Problem
In a hypothetical single-center, randomized study, subjects either receive the study drug or the placebo in a randomized manner, and are followed at two study visits. A continuous response (which could be the reduction of blood pressure, for example, if this is an antihypertensive drug study) is measured at the each of two study visits. A linear mixed model can be specified in the following manner using the (1):
where the last four elements of the vector represent the fixed effect for treatment of investigated drug and placebo, and the time effect of first visit and second visit, respectively. The model does not include an interaction effect for simplicity. Also this is for the consideration that an interaction effect is not always present for clinical trial data.
If there is no missing observation, the submatrices of X can be arranged as above. The capitalized subscript indicates the treatment (A=Investigated drug, B=Placebo) and i means the i=th subject in the respective treatment group. If there are missing observations for a subject, the number of rows for submatrix corresponding to the subject will be less than the number of visits.
The matrix X can be defined as ). The Z matrix is of the block diagonal form:
where J2 is a 2 by 1 vector of 1s, if there is no missing observations. If there are missing observations, 1 should replace J2 if one of the two observations is missing.
A random effect of subject is defined. For simplicity, define that:
where is the identify matrix of the rank n, which is number of subjects. and are variance components.
Unless otherwise specified, simulations were performed by using the following parameters: , , .
In every scenario where the sample size, the treatment assignment ratio and the missing data proportion etc. are varied, 1000 samples were generated using the linear mixed model. Data were fitted using PROC MIXED in SAS 9.3 software with a correct specification of the model and various options of degree of freedom methods.
The least squares means (Khuri, 2010) of the effect "treatment" were estimated and a test of equivalence of the treatment least squares means was performed as described earlier in the text. The simulated power is the proportion of rejected tests to the total number of tests performed. Analytically, the difference of least squares means for the treatment effect is , where .
|N||Df RES||Df KD Median||Power Z||Power T RES||Power T KD||Simulation RES||Simulation KD|
Table 1 shows the power estimates using z test, and t test method for the data without missing observations. The degree of freedom based on the residual method is shown. The median of degree of freedom based on the Kenward-Roger (KR) method is also shown. The t estimation of power given KR degree of freedom is based on such a median. Taking the simulated power as the reference, the overestimation of power by z test method can be observed. For the estimated degree of freedom method, the residual method leads to an overestimation of power relative to the KR method. There is corresponding change of power when the parameters of the model changes. As an example, by comparing the block at the top with the block at the middle of table 1, the effect of ratio of subject number between two treatments on power can be identified. The block at the bottom of table 1 shows the drop of power when the effect size decreases.
|N||Df RES||Df KD Median||Power Z||Power T RES||Power T KD||Simulation RES||Simulation KD|
|Ratio=1, LSDiff=1, P missing=0.1|
|Ratio=1, LSDiff=1, P missing=0.2|
Table 2 demonstrates the power estimates when some of the observations are missing. In this case missing completed at random is assumed, i.e. the missing status of an observation is independent of the observed or non-observed value of the data. As the percentage of the missing data increases, there is a corresponding drop of power as predicted. In line with table 1, the overestimation of power by the z test method and the overestimation of the degree of freedom by residual degree of freedom are demonstrated. When the missing data is introduced, one of the results in simulation is that the estimated degree of freedom tends to deviate further from an integer value, as can be seen for the median degree of freedoms of the KR method.
As demonstrated in the results, method of z test gives overestimation of the power. This is due to the assumption of large sample. Residual degree of freedom method is based on the assumption that correlational structure of the data is ignored (Schaalje et al., 2002). KR method is based on corrected estimation of , where is the estimation of β, taking into account the underestimation by using the asymptotic variance covariance matrix () for , and the bias of estimation when the estimated variance covariance matrix is used in estimating (Kenward and Roger, 1997; McCulloch and Searle, 2001). There are other methods of identifying the degree of freedoms, such containment methods, Satterthwaite method (Schaalje et al., 2002), those methods are not evaluated in the current study.
Assume that the KR method is the method of choice in data analysis, it is obvious that the most general approach for the power estimation is to get simulated result using the very method. From the perspective of the t approximation method, simulation is basically needed to get the estimated degree of freedom. This is because that, for the Satterthwaite type of degree of freedom (Satterthwaite, 1946; Kenward and Roger, 1997; Rencher and Schaalje, 2008), both the estimated variance covariance matrix of y, and the sample variance covariance matrix of the variance components are needed besides the knowledge of other terms which are specified anyway for the effect size. Even if the estimated variance covariance matrix of y can be assumed known, the sample variance covariance matrix of the variance components would still need to be estimated typically by using the residual maximum likelihood (REML) method (Rencher and Schaalje, 2008).
In comparison, no simulation is needed for both the method of z approximation and residual degree of freedom. For the later, the only knowledge needed is the total number of observations and rank of X. In practice, those methods can become useful when the simulation is considered too time consuming. As has been demonstrated, even an overestimation of power is predictable, given the uncertainty already supplied by the effect size estimations, the overestimation may be regarded insignificant.
In terms of generality, as long as that the model parameters are fully defined, there is no restriction of the applicability besides those of the linear mixed models. The specification of design matrices will need some programming efforts for data with a desired pattern of imbalance, but those can usually be handled. The rest of operations, such as finding the rank of design matrices, and generalized inverses can all be automated as well using for example, PROC IML in SAS software. The l vector (as appeared in (2)) can be identified and checked for estimability. The G and R matrices can take any form, whenever they can be fully specified and well-defined.
The approximation of the power/ sample size for a linear mixed model can be flexible. When a quick estimation is needed with rough effect size estimates, the z approximation method and the t approximation with residual degree of freedom method can be used. The method of simulation with an appropriate choice of denominator degree of freedom can be used as a general solution.