MiroHealth

Groundbreaking science at your fingertips

The brain is our most vital organ.

We help you treat it that way.

**Our Science**

The discrimination between subjects with mild cognitive impairment and normal controls

Miro Versions

v2.0 MiroMind

v1.0 On-device speech recognition algorithms

v1.0 Machine learning algorithms

v2.0 Reference data set

Seventy-one subjects, comprising 32 cognitively normal volunteers and 39 subjects with mild cognitive impairment (MCI), were included in an analysis of the ability of the Miro platform to distinguish subjects with MCI from normal subjects.

The subjects with MCI were a heterogeneous group with varying levels of performance. Some were referred by community neuropsychologists in Northern California and others were identified by researchers at Johns Hopkins University. To account for the functional range in the MCI subjects, experts identified 21 MCI subjects as being “High Functioning MCI” and 18 subjects as “MCI”. “High Functioning MCI” includes subjects who perform within the normal range on standard tests of cognitive function, but who present with complaints of perceived cognitive deficits. “MCI” includes subjects whose performance on standardized tests falls within the range of Mild Cognitive Impairment. Automated MCI Risk Score classification was developed with data from the 32 Normal Control subjects and the 18 subjects with MCI. Both MCI and High Functioning MCI groups were analyzed according to the MCI Risk Score.

Demographics

| GROUP | % F/M | MEAN AGE (RANGE) | N (TOTAL = 70) |
|---|---|---|---|
| Normal | 83/17 | 65.4 (49-89) | 32 |
| MCI | 47/53 | 70.4 (51-92) | 17 |
| High Functioning MCI | 70/30 | 77.4 (52-95) | 21 |

Methods

Standardized versions of basic variable scores were combined to form an MCI Risk Score. This score is designed specifically to distinguish the performance of normal subjects from that of subjects with Mild Cognitive Impairment. Basic variable scores were standardized against the Normal Control data set to have a mean of zero and a standard deviation of one. For a small subset of Miro modules with non-equivalent versions, basic scores were standardized independently, per version. These were: Picture Description, wherein each picture to be described produces a unique lexicon; Category Fluency, wherein each category to be explored produces an independent word list; and Letter Fluency, wherein each letter to initiate word production varies in difficulty and produces a unique word-list length.
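The standardization step can be illustrated with a minimal sketch. It assumes a sample standard deviation (the text does not say which estimator was used), and the function names, version labels and raw scores are hypothetical, chosen only for illustration.

```python
from statistics import mean, stdev

def standardize_to_controls(control_scores, scores):
    """Z-score `scores` using the mean and SD of the control group only."""
    mu = mean(control_scores)
    sigma = stdev(control_scores)  # assumption: sample SD; the source does not specify
    return [(s - mu) / sigma for s in scores]

def standardize_per_version(controls_by_version, scores_by_version):
    """Standardize each test version against controls who took that same version."""
    return {
        version: standardize_to_controls(controls_by_version[version], scores)
        for version, scores in scores_by_version.items()
    }

# Hypothetical raw scores for one module with two non-equivalent versions.
controls_by_version = {"A": [10.0, 12.0, 14.0], "B": [20.0, 25.0, 30.0]}
patients_by_version = {"A": [8.0, 12.0], "B": [15.0]}

z = standardize_per_version(controls_by_version, patients_by_version)
print(z)
```

Standardizing per version keeps a score on a harder picture, category or letter from being read as a deficit: each score is compared only with controls who saw the same stimulus.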

Prior to combining variables into aggregate scores, each standardized variable score was quantile-normalized to mitigate the undue influence of outliers or peculiar distributions of any individual scores. Missing values were imputed using low-rank matrix completion.^{1} If a subject participated in multiple assessments (as for test-retest reliability), only initial (T1) assessment results were included in the discrimination analysis. The process for combining normalized variables to form an MCI Risk Score is described below.

The risk score was developed using L1-, L2-regularized “elastic net” logistic regression,^{2} which is a modified version of logistic regression. In this application of elastic net regression, the combination of normalized input variables was optimized based on the log-odds estimates of each individual’s performance pattern correlating with a predefined MCI performance pattern. The MCI performance pattern was defined by the MCI subject group mean and standard deviation per variable. The effect of the L1 penalty was to exclude input variables from the risk score if they were not particularly useful for inferring the odds of individuals being categorized as MCI. The effect of the L2 penalty was seen in situations where several highly correlated variables each predict the odds of being categorized as MCI. With an L2 penalty, rather than picking a single input from the set of correlated input scores, a weighted combination was used, possibly smoothing out noise or measurement error. Because the data set was limited, rather than using a portion of the data to optimize the weights put on the two penalties, the relative weights put on the two penalties were set equal (alpha=0.5) and the overall strength of the penalties was set to 0.2 (lambda=0.2). The fitted weights maximize the likelihood of the data subject to a penalty on both: a) the sum of the absolute values of the weights put on each of the input variables (the L1 penalty); and b) the sum of the squares of the weights (the L2 penalty).

For an individual, the weighted combination of input variables corresponding to the learned penalized logistic regression is treated as a new score: that individual’s risk score for being categorized as MCI. To evaluate these risk scores for the Normal and MCI subjects used to learn the model (that is, the weights on the input variables that are combined into the risk score), leave-one-out cross-validation was used. Each of the Normal or MCI subjects in turn was left out of the analysis, the weights on the input variables for the risk score were re-estimated in the remaining subjects, and the resulting model was used to calculate the risk score for the left-out subject. MCI Risk Scores were calculated for all subjects.
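The fitting and evaluation procedure above can be sketched in plain Python. This is an illustration, not the implementation used for the analysis. It makes several assumptions: the penalized objective is scaled as (1/n) × negative log-likelihood + lambda × [alpha × L1 + (1 − alpha)/2 × L2] with an unpenalized intercept; the optimizer is simple proximal gradient descent; MCI is coded as 1, so a higher score means "more MCI-like" (the sign convention of the reported scores may differ); and the two-variable data set is invented.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(v, t):
    # Proximal operator of the L1 penalty: shrink toward zero, clip at zero.
    return math.copysign(max(abs(v) - t, 0.0), v)

def fit_elastic_net_logistic(X, y, alpha=0.5, lam=0.2, step=0.1, n_iter=2000):
    """Proximal-gradient fit of elastic net logistic regression (assumed objective scaling)."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        # Residuals (predicted probability minus label) at the current weights.
        resid = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) - yi
                 for row, yi in zip(X, y)]
        b -= step * sum(resid) / n  # intercept is not penalized
        for j in range(p):
            # Gradient of the smooth part: log-loss plus the L2 penalty.
            grad = sum(r * row[j] for r, row in zip(resid, X)) / n
            grad += lam * (1.0 - alpha) * w[j]
            # The L1 penalty is handled by soft-thresholding.
            w[j] = soft_threshold(w[j] - step * grad, step * lam * alpha)
    return w, b

def loocv_risk_scores(X, y, **fit_kwargs):
    """Score each subject with a model fitted on all other subjects."""
    scores = []
    for i in range(len(X)):
        w, b = fit_elastic_net_logistic(X[:i] + X[i + 1:], y[:i] + y[i + 1:], **fit_kwargs)
        scores.append(b + sum(wj * xj for wj, xj in zip(w, X[i])))
    return scores

# Invented, already-normalized inputs: column 0 is informative, column 1 is mostly noise.
X = [[-1.2, 0.3], [-0.8, -0.4], [-1.0, 0.1], [-0.6, -0.2],
     [0.9, 0.2], [1.1, -0.3], [0.7, 0.4], [1.3, -0.1]]
y = [0, 0, 0, 0, 1, 1, 1, 1]  # 0 = Normal Control, 1 = MCI (assumed coding)

scores = loocv_risk_scores(X, y)
print([round(s, 2) for s in scores])
```

Because each subject's score comes from a model that never saw that subject, the spread between the two groups' scores is a less optimistic estimate of discrimination than scores computed from a single model fitted to everyone.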

Results

The high intra-class correlation (0.79) for the MCI Risk Score indicates that it is a reliable measure of individuals’ performance. The AUC of the MCI Risk Score for separating normal subjects from clinically impaired MCI subjects is 0.92. When High Functioning MCI and MCI subjects were combined into a single group, the AUC is 0.87.
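For readers less familiar with AUC: it equals the probability that a randomly chosen subject from one group scores higher than a randomly chosen subject from the other group, with ties counted as one half. The sketch below computes it by direct pairwise comparison. The function name and the toy scores are illustrative only and are not study data.

```python
def auc(scores_high_group, scores_low_group):
    """Fraction of cross-group pairs in which the 'high' group member scores higher.

    Ties count as half a win. This is the Mann-Whitney form of the AUC.
    """
    wins = 0.0
    for a in scores_high_group:
        for b in scores_low_group:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(scores_high_group) * len(scores_low_group))

# Toy scores, not study data.
toy_auc = auc([3, 4, 5], [1, 2, 3])
print(toy_auc)
```

An AUC of 0.5 means the score does no better than chance at ordering the two groups, and 1.0 means every subject in one group outscores every subject in the other.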

Discrimination between MCI, High Functioning MCI and Normal Controls

For the 32 Normal Control subjects, the cross-validated MCI risk scores had a mean value of 1.49, with a standard deviation of 0.87. For the 18 MCI subjects, the risk scores had a mean value of -0.44, with a standard deviation of 0.92. For the 21 High Functioning MCI subjects, the risk scores had a mean value of 0.41, with a standard deviation of 0.77. For the combined group of MCI subjects, the risk scores had a mean value of 0.03, with a standard deviation of 0.92.
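As a rough consistency check, the group means and standard deviations above can be related to the reported AUCs. If the scores in each group were normally distributed (an assumption made here, not a claim in the analysis), the AUC would be Φ((μ₁ − μ₂) / √(σ₁² + σ₂²)), where Φ is the standard normal cumulative distribution function. The sketch below applies that formula to the reported values. The function names are illustrative.

```python
import math

def normal_cdf(z):
    # Standard normal cumulative distribution function via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def binormal_auc(mean_hi, sd_hi, mean_lo, sd_lo):
    """AUC implied by two normal score distributions (assumption: normality)."""
    return normal_cdf((mean_hi - mean_lo) / math.sqrt(sd_hi ** 2 + sd_lo ** 2))

# Reported group statistics: Normal Controls vs. MCI, and vs. the combined MCI group.
auc_mci = binormal_auc(1.49, 0.87, -0.44, 0.92)
auc_combined = binormal_auc(1.49, 0.87, 0.03, 0.92)
print(round(auc_mci, 3), round(auc_combined, 3))
```

Both implied values fall within about 0.02 of the reported AUCs of 0.92 and 0.87, so the summary statistics and the AUCs are mutually consistent under this simplifying assumption.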

Discussion

Preliminary results show notable improvement in the sensitivity and specificity of discriminating between healthy participants and those with mild cognitive impairment, compared with comparator clinical and research practices. While the leave-one-out cross-validation method curtails potential over-fitting of the model, the collection of larger data sets will allow further exploration of alternative approaches, for example the use of separate training and test sets. It is expected that increased data set size will generally correspond to improved performance.

^{1} Jian-Feng Cai, Emmanuel J. Candès, and Zuowei Shen. (2010). A Singular Value Thresholding Algorithm for Matrix Completion. SIAM J. Optim., 20(4), 1956–1982.

^{2} Jerome Friedman, Trevor Hastie, and Rob Tibshirani. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–22.
