The Discrimination Between Subjects with Mild Cognitive Impairment and Normal Controls
Seventy-one subjects, comprising 32 cognitively normal volunteers and 39 subjects with mild cognitive impairment (MCI), were included in an analysis of the ability of the Miro platform to distinguish subjects with MCI from normal subjects.

The subjects with MCI were a heterogeneous group with varying levels of performance. Some were referred by community neuropsychologists in Northern California; others were identified by researchers at Johns Hopkins University. To account for the functional range among the MCI subjects, experts identified 21 MCI subjects as “High Functioning MCI” and 18 as “MCI”. “High Functioning MCI” includes subjects who perform within the normal range on standard tests of cognitive function but who present with complaints of perceived cognitive deficits. “MCI” includes subjects whose performance on standardized tests falls within the range of Mild Cognitive Impairment. The automated MCI Risk Score classification was developed with data from the 32 Normal Control subjects and the 18 subjects with MCI. Both the MCI and High Functioning MCI groups were analyzed according to the MCI Risk Score.
Group                  F/M (%)   Mean Age (Range)   N
Normal                 83/17     65.4 (49–89)       32
MCI                    47/53     70.4 (51–92)       18
High Functioning MCI   70/30     77.4 (52–95)       21
Standardized versions of basic variable scores were combined to form an MCI Risk Score. This score is designed specifically to distinguish the performance of normal subjects from that of subjects with Mild Cognitive Impairment. Basic variable scores were standardized against the Normal Control data set to have a mean of zero and a standard deviation of one. For the small subset of Miro modules with non-equivalent versions, basic scores were standardized independently, per version. These included: (1) Picture Description, wherein each picture to be described produces a unique lexicon; (2) Category Fluency, wherein each category to be explored produces an independent word list; and (3) Letter Fluency, wherein each letter used to initiate word production varies in difficulty and produces a unique word-list length.
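The standardization step amounts to a z-score computed against the Normal Control reference group. A minimal sketch (function and variable names are illustrative, not the Miro platform's):

```python
import numpy as np

def standardize_to_controls(scores, control_scores):
    """Z-score raw scores relative to a normal-control reference group:
    subtract the control mean and divide by the control standard
    deviation, so controls have mean 0 and standard deviation 1."""
    mu = np.mean(control_scores)
    sigma = np.std(control_scores, ddof=1)
    return (np.asarray(scores, dtype=float) - mu) / sigma

# Example: the control group defines the reference distribution.
controls = [10.0, 12.0, 14.0, 16.0, 18.0]
z = standardize_to_controls([14.0, 20.0], controls)
```

For modules with non-equivalent versions, the same function would be applied separately to each version's control data.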

Prior to combining variables into aggregate scores, each standardized variable score was quantile-normalized to mitigate undue influence of outliers or peculiar distributions of any individual score. Missing values were imputed using low-rank matrix completion¹. If a subject participated in multiple assessments (as for test-retest reliability), only initial (T1) assessment results were included in the discrimination analysis. The process for combining normalized variables to form an MCI Risk Score is described below:
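The exact quantile-normalization procedure is not specified here; a common choice, sketched below as an assumption, is the rank-based inverse-normal transform, which replaces each value with the standard-normal quantile of its rank and so tames outliers by construction:

```python
import numpy as np
from scipy.stats import norm, rankdata

def quantile_normalize(x):
    """Rank-based inverse-normal transform: map each value to the
    standard-normal quantile of its (rescaled) rank.  Outliers are
    pulled in and peculiar distributions are reshaped toward normality."""
    x = np.asarray(x, dtype=float)
    ranks = rankdata(x)                      # ranks 1..n, ties averaged
    # Blom-style rescaling keeps quantiles strictly inside (0, 1).
    p = (ranks - 0.375) / (len(x) + 0.25)
    return norm.ppf(p)

# Example: the extreme value 1000.0 maps to a modest normal quantile.
y = quantile_normalize([3.0, 1.0, 4.0, 1.5, 1000.0])
```

Rank order is preserved, but the distance between the outlier and the rest of the sample no longer dominates any aggregate score.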

The risk score was developed using L1-, L2-regularized “elastic net” logistic regression², a modified form of logistic regression. In this application, the combination of normalized input variables was optimized based on the log-odds estimate that each individual’s performance pattern matched a predefined MCI performance pattern. The MCI performance pattern was defined by the MCI subject group mean and standard deviation per variable. The effect of the L1 penalty was to exclude input variables from the risk score if they were not particularly useful for inferring the odds of an individual being categorized as MCI. The effect of the L2 penalty was seen in situations where several highly correlated variables each predict the odds of being categorized as MCI: rather than picking a single input from the set of correlated input scores, a weighted combination was used, possibly smoothing out noise or measurement error. Because the data set was limited, rather than using a portion of the data to optimize the weights on the two penalties, the relative weights were set equal (alpha = 0.5) and the overall penalty strength was set to 0.2 (lambda = 0.2). The fitted model maximizes the likelihood of the data subject to a penalty on both: a) the sum of the absolute values of the weights on each input variable (the L1 penalty); and b) the sum of the squares of the weights (the L2 penalty).
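Assuming an off-the-shelf implementation such as scikit-learn (the cited reference describes the glmnet formulation), the fit can be sketched as follows. The data here are synthetic stand-ins for the normalized Miro variables, and the mapping from glmnet's lambda to scikit-learn's C is only approximate:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Synthetic stand-in: 50 subjects x 8 normalized variables; label 1
# marks an MCI-like performance pattern.
X = rng.normal(size=(50, 8))
y = (X[:, 0] + 0.5 * X[:, 1] + 0.3 * rng.normal(size=50) > 0).astype(int)

# Elastic net logistic regression: l1_ratio=0.5 mirrors alpha=0.5
# (equal weight on the L1 and L2 penalties).  glmnet's lambda does not
# map one-to-one onto scikit-learn's C; C ~ 1/(n * lambda) is a rough
# correspondence for lambda = 0.2.
model = LogisticRegression(penalty="elasticnet", solver="saga",
                           l1_ratio=0.5, C=1.0 / (50 * 0.2),
                           max_iter=5000)
model.fit(X, y)

# The L1 component can zero out uninformative variables entirely;
# the L2 component spreads weight across correlated ones.
n_active = int(np.count_nonzero(model.coef_))
log_odds = model.decision_function(X)   # per-subject log-odds scores
```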

For an individual, the weighted combination of input variables produced by the fitted penalized logistic regression is treated as a new score: that individual’s risk score for being categorized as MCI. Because the same Normal and MCI subjects were used to learn the model (i.e., the weights on the input variables combined into the risk score), leave-one-out cross-validation was used to evaluate the risk scores. Each of the Normal and MCI subjects was in turn left out of the analysis, the weights on the input variables for the risk score were re-estimated from the remaining subjects, and the resulting model was used to calculate the risk score for the left-out subject. MCI Risk Scores were calculated for all subjects.
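The leave-one-out procedure can be sketched as follows (synthetic data; the elastic net settings are illustrative). Each subject's risk score is produced by a model that never saw that subject:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut

rng = np.random.default_rng(1)
X = rng.normal(size=(30, 5))
y = (X[:, 0] > 0).astype(int)   # toy labels standing in for Normal vs MCI

# Leave-one-out: refit the penalized model without each subject, then
# score that held-out subject with the refit weights.
risk_scores = np.empty(len(y))
for train_idx, test_idx in LeaveOneOut().split(X):
    model = LogisticRegression(penalty="elasticnet", solver="saga",
                               l1_ratio=0.5, C=1.0, max_iter=5000)
    model.fit(X[train_idx], y[train_idx])
    # The log-odds (decision_function) serves as the risk score.
    risk_scores[test_idx] = model.decision_function(X[test_idx])
```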
The high intra-class correlation (0.79) for the MCI Risk Score indicates that it is a reliable measure of individuals’ performance. The AUC of the MCI Risk Score for separating normal subjects from clinically impaired MCI subjects was 0.94. When the High Functioning MCI and MCI subjects were combined into a single group, the AUC was 0.87.
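Computing such an AUC from cross-validated risk scores can be sketched with scikit-learn. The scores below are illustrative, and the sign convention is an assumption: since the reported group means put Normal subjects above MCI subjects on this scale, the MCI label is discriminated by the negated score:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Illustrative cross-validated risk scores; label 1 = MCI.
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])
scores = np.array([1.6, 1.2, 0.9, 1.8, -0.5, -0.2, 0.1, -0.9])

# Negate so that higher values indicate higher odds of MCI.
auc = roc_auc_score(labels, -scores)
```

An AUC of 1.0 means the two groups are perfectly separated by a threshold; 0.5 means the score carries no discriminating information.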
Discrimination Between MCI, High Functioning MCI, and Normal Controls
For the 32 Normal Control subjects, the cross-validated MCI Risk Scores had a mean of 1.49 and a standard deviation of 0.87. For the 18 MCI subjects, the risk scores had a mean of -0.44 and a standard deviation of 0.92. For the 21 High Functioning MCI subjects, the risk scores had a mean of 0.41 and a standard deviation of 0.77. For the combined group of MCI subjects, the risk scores had a mean of 0.03 and a standard deviation of 0.92.
Preliminary results show notable improvement in the sensitivity and specificity of discriminating between healthy participants and those with mild cognitive impairment, as compared to comparator clinical and research practices. While the leave-one-out machine learning method curtails potential over-fitting of the model, the collection of larger data sets will allow further exploration of alternative approaches, for example the use of separate training and test sets. It is expected that increased data set size will generally correspond to improved performance.
1 Jian-Feng Cai, Emmanuel J. Candès, and Zuowei Shen (2010). A Singular Value Thresholding Algorithm for Matrix Completion. SIAM Journal on Optimization, 20(4), 1956–1982.
2 Jerome Friedman, Trevor Hastie, and Rob Tibshirani (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–22.