T001-001.xlsx (Subject 01, Simulation 01)
Frame | Time | Anger | Contempt | Disgust | Fear | Joy | Sad | Surprise | Neutral | ID |
---|---|---|---|---|---|---|---|---|---|---|
0 | 0.0000 | 0.0101 | 0.0218 | 0.0043 | 0.0541 | 0.5260 | 0.0959 | 0.0010 | 0.2868 | T001-001 |
1 | 0.0333 | 0.0101 | 0.0218 | 0.0043 | 0.0541 | 0.5260 | 0.0959 | 0.0010 | 0.2868 | T001-001 |
2 | 0.0667 | 0.0101 | 0.0218 | 0.0043 | 0.0541 | 0.5260 | 0.0959 | 0.0010 | 0.2868 | T001-001 |
3 | 0.1000 | 0.0080 | 0.0187 | 0.0032 | 0.0375 | 0.5353 | 0.1050 | 0.0011 | 0.2911 | T001-001 |
4 | 0.1333 | 0.0091 | 0.0380 | 0.0158 | 0.0036 | 0.6902 | 0.0177 | 0.0004 | 0.2252 | T001-001 |
5 | 0.1667 | 0.0104 | 0.0450 | 0.0139 | 0.0030 | 0.7157 | 0.0162 | 0.0003 | 0.1955 | T001-001 |
Start | End | Event.Switch | Event.Type | Event | ID |
---|---|---|---|---|---|
86.5 | 246.50 | 1 | 1 | Analytical Questions | T001-005 |
508.5 | 657.50 | 1 | 2 | Mathematical Questions | T001-005 |
107.5 | 269.25 | 1 | 3 | Emotional Questions | T001-006 |
521.0 | 674.75 | 1 | 3 | Emotional Questions | T001-006 |
81.0 | 240.00 | 1 | 4 | Texting | T001-007 |
510.0 | 671.00 | 1 | 4 | Texting | T001-007 |
Sample of Processed Data Showing an Event Transition
Subject | Trial | Age | Gender | Frame | Time | Event.Switch | Event | Action | Anger | Contempt | Disgust | Fear | Joy | Sad | Surprise | Neutral |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
T001 | 007 | Y | M | 2427 | 80.900 | 0 | No Event | 0 | 0.0909 | 0.0575 | 0.4205 | 3e-04 | 0.0011 | 0.1343 | 0 | 0.2954 |
T001 | 007 | Y | M | 2428 | 80.933 | 0 | No Event | 0 | 0.0612 | 0.0397 | 0.4293 | 4e-04 | 0.0011 | 0.1630 | 0 | 0.3052 |
T001 | 007 | Y | M | 2429 | 80.967 | 0 | No Event | 0 | 0.1034 | 0.0963 | 0.3186 | 2e-04 | 0.0013 | 0.0856 | 0 | 0.3946 |
T001 | 007 | Y | M | 2430 | 81.000 | 1 | Texting | 4 | 0.0363 | 0.4976 | 0.0171 | 1e-04 | 0.0024 | 0.0069 | 0 | 0.4396 |
T001 | 007 | Y | M | 2431 | 81.033 | 1 | Texting | 4 | 0.0059 | 0.7285 | 0.0027 | 4e-04 | 0.0068 | 0.0063 | 0 | 0.2493 |
T001 | 007 | Y | M | 2432 | 81.067 | 1 | Texting | 4 | 0.0058 | 0.6890 | 0.0035 | 4e-04 | 0.0077 | 0.0068 | 0 | 0.2868 |
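The transition in the sample above can be reproduced by matching each frame's time against the event windows. The following is a minimal sketch, assuming 30 frames per second (Time = Frame / 30) and windows that include both endpoints; `label_frames` is a hypothetical helper, not the processing code used for the study.

```r
# Event windows for simulation T001-007, copied from the event table
events <- data.frame(
  Start = c(81.0, 510.0),
  End = c(240.0, 671.0),
  Event = c("Texting", "Texting"),
  Event.Type = c(4, 4),
  stringsAsFactors = FALSE
)

# Frames 2427-2432, assuming 30 frames per second
frames <- data.frame(Frame = 2427:2432)
frames$Time <- round(frames$Frame / 30, 3)

# Hypothetical helper: flag every frame that falls inside an event window
label_frames <- function(frames, events) {
  frames$Event.Switch <- 0
  frames$Event <- "No Event"
  frames$Action <- 0
  for (i in seq_len(nrow(events))) {
    inside <- frames$Time >= events$Start[i] & frames$Time <= events$End[i]
    frames$Event.Switch[inside] <- 1
    frames$Event[inside] <- events$Event[i]
    frames$Action[inside] <- events$Event.Type[i]
  }
  frames
}

labeled <- label_frames(frames, events)
print(labeled)
```

Under these assumptions, frame 2430 (Time 81.000) is the first frame flagged as Texting, which matches the sample.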
Reproducible Research
Summary
Baseline Trial
Model Proposal
Neural Network Advantages
Tenets of Feed-Forward Neural Networks:
Neural Network Components
Step 1: Initialize the model with random weights.
Step 2: Calculate the hidden node values and the output node prediction.
Step 3: Update the weights based on the prediction error.
Step 4: Repeat Step 2 with the updated weights to refresh the hidden nodes and output prediction.
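The four steps can be sketched in base R for a network with one hidden layer. This is an illustration only: it uses plain gradient descent on a toy logical-OR problem, whereas nnet optimizes its weights with the BFGS method. The data, layer size, learning rate, and all names below are assumptions.

```r
set.seed(1)
sigmoid <- function(z) 1 / (1 + exp(-z))

# Toy data (not study data): two inputs, target is logical OR
X <- matrix(c(0, 0, 0, 1, 1, 0, 1, 1), ncol = 2, byrow = TRUE)
y <- c(0, 1, 1, 1)
n_hidden <- 3
rate <- 0.5

# Step 1: initialize the weights at random (first row holds the bias)
W1 <- matrix(runif((ncol(X) + 1) * n_hidden, -0.5, 0.5), ncol = n_hidden)
W2 <- matrix(runif(n_hidden + 1, -0.5, 0.5), ncol = 1)

# Step 2: compute hidden node values and the output node prediction
forward <- function(X, W1, W2) {
  H <- sigmoid(cbind(1, X) %*% W1)
  p <- sigmoid(cbind(1, H) %*% W2)
  list(H = H, p = as.vector(p))
}
loss <- function(p, y) -mean(y * log(p) + (1 - y) * log(1 - p))

out <- forward(X, W1, W2)
initial_loss <- loss(out$p, y)

for (iter in 1:2000) {
  # Step 3: update the weights from the error gradient (backpropagation)
  d_out <- matrix((out$p - y) / nrow(X), ncol = 1)
  grad_W2 <- t(cbind(1, out$H)) %*% d_out
  d_hid <- (d_out %*% t(W2[-1, , drop = FALSE])) * out$H * (1 - out$H)
  grad_W1 <- t(cbind(1, X)) %*% d_hid
  W2 <- W2 - rate * grad_W2
  W1 <- W1 - rate * grad_W1
  # Step 4: repeat Step 2 with the updated weights
  out <- forward(X, W1, W2)
}
final_loss <- loss(out$p, y)
```

Comparing `initial_loss` with `final_loss` shows how much the repeated updates reduced the cross-entropy error.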
General Model Form
\[ \begin{align*} nnet(Texting \sim & \text{ } Subject + Age + Gender + Anger + Contempt \text{ } + \\ & \text{ } Disgust + Fear + Joy + Sad + Surprise + Neutral)\\ \end{align*} \]
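As a rough illustration of this model form, the call below fits nnet through its formula interface. It is a sketch on made-up data: the three predictors, the rule that generates `Texting`, and the `size`, `decay`, and `maxit` values are all assumptions chosen to keep the example small.

```r
library(nnet)

set.seed(42)
n <- 200

# Toy likelihood scores (not study data)
toy <- data.frame(
  Contempt = runif(n),
  Neutral = runif(n),
  Joy = runif(n)
)

# Hypothetical labeling rule so that the network has something to learn
toy$Texting <- factor(as.integer(toy$Contempt > toy$Neutral))

# Same formula style as the general model, with fewer predictors
fit <- nnet(Texting ~ Contempt + Neutral + Joy, data = toy,
            size = 5, decay = 0.1, maxit = 100, trace = FALSE)

# Predicted class labels and in-sample accuracy
pred <- predict(fit, toy, type = "class")
accuracy <- mean(pred == toy$Texting)
```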
Modeling Strategy
Train the same general model on various slices of the data to see what works best.
Twelve training and testing sets were created by combining the six Data Processing methods with the two Data Split methods.
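The twelve combinations can be enumerated directly. A minimal sketch, assuming the method labels are the ones that appear in the model performance table; the variable names are illustrative.

```r
# Labels as they appear in the model performance table
processing <- c("Original", "Differencing", "Moving Avg",
                "½ Sec Cut", "½ Sec Diff", "½ Sec Cut Stat")
data_split <- c("365 Split", "Entire Sim")

# One row per training/testing set; the split varies fastest
combos <- expand.grid(Data.Split = data_split,
                      Data.Processing = processing,
                      stringsAsFactors = FALSE)
combos$Model <- paste("Model", seq_len(nrow(combos)))
print(combos[, c("Model", "Data.Processing", "Data.Split")])
```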
Data Processing
Data Split
Statistical Software
R's nnet package for feed-forward neural networks.
The Caret Package:
Validation Testing and Performance
Model Search Parameters
Model Performance with 100 Iteration Limit
Model | Data Processing | Data Split | MaxItr | Size | Decay | Training | Testing | AUC |
---|---|---|---|---|---|---|---|---|
Model 1: | Original | 365 Split | 100 | 50 | .10 | .776 | .693 | .754 |
Model 2: | Original | Entire Sim | 100 | 50 | .20 | .766 | .766 | .862 |
Model 3: | Differencing | 365 Split | 100 | 25 | .10 | .538 | .529 | .556 |
Model 4: | Differencing | Entire Sim | 100 | 25 | .10 | .548 | .548 | .610 |
Model 5: | Moving Avg | 365 Split | 100 | 10 | .00 | .503 | .503 | .530 |
Model 6: | Moving Avg | Entire Sim | 100 | 10 | .00 | .524 | .524 | .573 |
Model 7: | ½ Sec Cut | 365 Split | 100 | 50 | .10 | .819 | .694 | .766 |
Model 8: | ½ Sec Cut | Entire Sim | 100 | 50 | .10 | .801 | .789 | .881 |
Model 9: | ½ Sec Diff | 365 Split | 100 | 25 | .00 | .664 | .631 | .683 |
Model 10: | ½ Sec Diff | Entire Sim | 100 | 50 | .00 | .500 | .500 | .486 |
Model 11: | ½ Sec Cut Stat | 365 Split | 100 | 50 | .10 | .855 | .727 | .795 |
Model 12: | ½ Sec Cut Stat | Entire Sim | 100 | 50 | .20 | .833 | .815 | .900 |
Model Performance with Extended Iteration Limits
Model | Data Processing | Data Split | MaxItr | Size | Decay | Training | Testing | AUC |
---|---|---|---|---|---|---|---|---|
Model 8: | ½ Sec Cut | Entire Sim | 250 | 50 | .00 | .844 | .825 | .907 |
Model 8: | ½ Sec Cut | Entire Sim | 500 | 50 | .10 | .847 | .825 | .911 |
Model 8: | ½ Sec Cut | Entire Sim | 1000 | 50 | .10 | .853 | .828 | .917 |
Model 12: | ½ Sec Cut Stat | Entire Sim | 250 | 50 | .10 | .872 | .837 | .915 |
Model 12: | ½ Sec Cut Stat | Entire Sim | 500 | 50 | .20 | .880 | .839 | .912 |
Model 12: | ½ Sec Cut Stat | Entire Sim | 1000 | 50 | .10 | .900 | .840 | .913 |
## Load caret, which provides trainControl() and train()
library(caret)
## Set 10-fold cross-validation
fit.control = trainControl(method = "cv", number = 10)
## Create the tuning grid of weight decay and hidden layer size
search.grid = expand.grid(decay = c(0, .1, .2),
                          size = c(1, 10, 25, 50))
## Limit the iterations and the number of weights
maxIt = 1000; maxWt = 15000
## Fit the network on every predictor except Time
fit = train(Texting ~ . - Time, data = mdl.08.train,
            method = "nnet",
            trControl = fit.control,
            tuneGrid = search.grid,
            MaxNWts = maxWt,
            maxit = maxIt)
40255 samples, 12 predictors, 2 classes: '0', '1'
Resampling: Cross-Validated (10 fold)
Summary of sample sizes: 36229, 36229, 36229, ...
Resampling results across tuning parameters:
------------------------------
Decay Size Accuracy Kappa
------------------------------
0.0 1 0.643 0.302
0.0 10 0.793 0.566
0.0 25 0.819 0.625
0.0 50 0.833 0.655
0.1 1 0.667 0.306
0.1 10 0.814 0.612
0.1 25 0.834 0.655
0.1 50 0.841 0.669
0.2 1 0.681 0.310
0.2 10 0.814 0.611
0.2 25 0.830 0.646
0.2 50 0.834 0.654
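By default caret keeps the tuning combination with the highest resampled accuracy. The sketch below re-enters the table above as a data frame and applies that rule by hand; the object names are illustrative.

```r
# Cross-validation results copied from the table above
results <- data.frame(
  decay = rep(c(0, 0.1, 0.2), each = 4),
  size = rep(c(1, 10, 25, 50), times = 3),
  Accuracy = c(0.643, 0.793, 0.819, 0.833,
               0.667, 0.814, 0.834, 0.841,
               0.681, 0.814, 0.830, 0.834),
  Kappa = c(0.302, 0.566, 0.625, 0.655,
            0.306, 0.612, 0.655, 0.669,
            0.310, 0.611, 0.646, 0.654)
)

# Keep the row with the highest cross-validated accuracy
best <- results[which.max(results$Accuracy), ]
print(best)
```

The selected row is decay = 0.1 with size = 50, the same setting reported for Model 8 at 1000 iterations.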
Reference
0 1
Pred 0 22736 4616
1 2943 14208
Accuracy : 0.8393
95% CI : (0.8356, 0.8429)
No Information Rate : 0.5856
Kappa : 0.6649
Sensitivity : 0.7656
Specificity : 0.8914
Pos Pred Value : 0.8331
Neg Pred Value : 0.8431
Prevalence : 0.4144
Balanced Accuracy : 0.8285
Area Under Curve (AUC): 0.917
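The statistics above all derive from the four cells of a confusion matrix. The sketch below shows the standard definitions with a hypothetical helper, `class_metrics`, applied to made-up counts. It also checks that the reported Balanced Accuracy is the mean of the reported Sensitivity and Specificity.

```r
# Hypothetical helper: standard two-class metrics from confusion matrix cells
class_metrics <- function(tn, fp, fn, tp) {
  total <- tn + fp + fn + tp
  sens <- tp / (tp + fn)
  spec <- tn / (tn + fp)
  c(Accuracy = (tp + tn) / total,
    Sensitivity = sens,
    Specificity = spec,
    PosPredValue = tp / (tp + fp),
    NegPredValue = tn / (tn + fn),
    Prevalence = (tp + fn) / total,
    BalancedAccuracy = (sens + spec) / 2)
}

# Made-up counts for illustration only (not study results)
m <- class_metrics(tn = 50, fp = 10, fn = 5, tp = 35)
print(round(m, 4))

# Balanced accuracy from the reported sensitivity and specificity
reported_balanced <- (0.7656 + 0.8914) / 2
print(reported_balanced)
```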
Balanced Accuracy by Subject
Subject | T022 | T035 | T086 | T083 | T074 | T018 | T007 | T006 | T020 | T088 | T012 | T032 | T044 | T009 | T064 | T003 | T011 | T082 | T060 | Top |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Train | 0.976 | 0.956 | 0.970 | 0.946 | 0.934 | 0.930 | 0.950 | 0.953 | 0.929 | 0.921 | 0.926 | 0.893 | 0.926 | 0.915 | 0.940 | 0.887 | 0.894 | 0.902 | 0.904 | .921 |
Test | 0.964 | 0.945 | 0.943 | 0.936 | 0.933 | 0.927 | 0.924 | 0.922 | 0.921 | 0.910 | 0.904 | 0.902 | 0.900 | 0.898 | 0.895 | 0.885 | 0.884 | 0.882 | 0.880 | .913 |
GenderMale | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 0 | 10 |
AgeOld | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 7 |
Subject | T013 | T081 | T080 | T016 | T079 | T051 | T039 | T015 | T046 | T066 | T076 | T010 | T005 | T008 | T042 | T024 | T077 | T001 | T029 | Middle |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Train | 0.911 | 0.954 | 0.893 | 0.881 | 0.888 | 0.816 | 0.883 | 0.881 | 0.871 | 0.851 | 0.856 | 0.843 | 0.859 | 0.868 | 0.840 | 0.797 | 0.796 | 0.850 | 0.816 | .860 |
Test | 0.878 | 0.877 | 0.873 | 0.869 | 0.860 | 0.855 | 0.851 | 0.848 | 0.839 | 0.838 | 0.830 | 0.827 | 0.824 | 0.820 | 0.817 | 0.807 | 0.806 | 0.797 | 0.790 | .832 |
GenderMale | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 9 |
AgeOld | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 6 |
Subject | T017 | T014 | T054 | T031 | T040 | T061 | T033 | T019 | T021 | T036 | T084 | T004 | T073 | T002 | T041 | T025 | T047 | T034 | Bottom |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Train | 0.784 | 0.805 | 0.803 | 0.788 | 0.792 | 0.787 | 0.776 | 0.738 | 0.764 | 0.785 | 0.804 | 0.722 | 0.749 | 0.730 | 0.707 | 0.758 | 0.719 | 0.669 | .760 |
Test | 0.783 | 0.782 | 0.762 | 0.753 | 0.752 | 0.747 | 0.742 | 0.739 | 0.728 | 0.722 | 0.718 | 0.715 | 0.703 | 0.694 | 0.691 | 0.689 | 0.665 | 0.638 | .723 |
GenderMale | 1 | 0 | 1 | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 9 |
AgeOld | 0 | 0 | 1 | 1 | 1 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 11 |
Summary by Age and Gender
Tier | Male | Female | Old | Young | Young Male | Young Female | Old Male | Old Female |
---|---|---|---|---|---|---|---|---|
Top | 35.7% | 32.1% | 29.1% | 37.5% | 35.7% | 41.1% | 35.7% | 9.1% |
Middle | 32.1% | 35.7% | 25.0% | 40.6% | 35.7% | 41.1% | 28.5% | 36.3% |
Bottom | 32.1% | 32.1% | 45.8% | 21.8% | 28.5% | 17.6% | 35.7% | 54.5% |
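Each percentage in this table is the share of a demographic group that falls in a given tier, so every column sums to roughly 100%. The sketch below recomputes the Male, Female, Old, and Young columns from the GenderMale and AgeOld counts in the Balanced Accuracy by Subject tables. The variable names are illustrative, and the table's values of 29.1% and 21.8% appear to be truncated rather than rounded.

```r
# Subjects per tier and the GenderMale / AgeOld totals from the tier tables
tier_size <- c(Top = 19, Middle = 19, Bottom = 18)
male <- c(Top = 10, Middle = 9, Bottom = 9)
old <- c(Top = 7, Middle = 6, Bottom = 11)

# Counts per tier for each demographic group
counts <- cbind(Male = male,
                Female = tier_size - male,
                Old = old,
                Young = tier_size - old)

# Share of each group (column) that lands in each tier (row)
pct <- 100 * sweep(counts, 2, colSums(counts), "/")
print(pct)
```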
Evaluating Differences in Age and Gender
******************************************************************
Levene's Test for Homogeneity of Variance (Center = Median)
******************************************************************
Df F value Pr(>F)
group 1 0.7054 0.4047
54
******************************************************************
General Linear Model
******************************************************************
Deviance Residuals:
Min 1Q Median 3Q Max
-0.18434 -0.06279 0.01139 0.05423 0.16550
Coefficients: Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.77983 0.02471 31.558 <2e-16 ***
Old Male 0.04272 0.03302 1.294 0.2015
Young Female 0.07307 0.03171 2.304 0.0252 *
Young Male 0.05548 0.03302 1.680 0.0989 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for gaussian family taken to be 0.006717054)
Null deviance: 0.38640 on 55 degrees of freedom
Residual deviance: 0.34929 on 52 degrees of freedom
AIC: -115.4
******************************************************************
Shapiro-Wilk Normality Test
******************************************************************
data: mdl$residuals
W = 0.98393, p-value = 0.6586
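Output of this kind can be produced with a few base R calls. The sketch below runs on made-up accuracies for 56 subjects, so its numbers will not match the output above. Levene's test with a median center is computed as a one-way ANOVA on absolute deviations from the group medians. Testing it on Age is an assumption, since the output does not say which two-level factor was used.

```r
set.seed(11)

# Toy data (not study results): 56 subjects in four age-by-gender groups
toy <- data.frame(
  Age = rep(c("Old", "Young"), each = 28),
  Gender = rep(rep(c("Female", "Male"), each = 14), times = 2),
  stringsAsFactors = FALSE
)
toy$Group <- factor(paste(toy$Age, toy$Gender))
toy$BalAcc <- runif(56, 0.6, 1)

# Levene's test (center = median): ANOVA on absolute deviations from medians
toy$AbsDev <- abs(toy$BalAcc - ave(toy$BalAcc, toy$Age, FUN = median))
levene <- anova(lm(AbsDev ~ Age, data = toy))

# Gaussian linear model; "Old Female" is the reference level (the intercept)
mdl <- glm(BalAcc ~ Group, data = toy, family = gaussian)

# Shapiro-Wilk normality test on the model residuals
sw <- shapiro.test(mdl$residuals)

print(levene)
print(summary(mdl))
print(sw)
```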
Neutral, Surprise, and Anger were the most important emotions for successfully identifying texting.
Joy, Contempt, and Fear were the least important emotions for identifying texting.
Young Females had the best overall testing performance while Older Females had the worst overall testing performance.
After extending the training iterations, the difference between Model 8 (½ Sec Cut) and Model 12 (½ Sec Cut Stat) is negligible. This suggests that the average likelihood score carries much more information than the other descriptive statistics (SD, min, max, IQR).
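To make the comparison concrete, the sketch below summarizes one emotion score over half-second windows. It assumes 30 frames per second, so each window holds 15 frames, and it assumes that "½ Sec Cut" keeps only the window mean while "½ Sec Cut Stat" adds the other statistics. The data and all names are illustrative.

```r
fps <- 30
frames_per_window <- fps / 2  # 15 frames in half a second

# Toy likelihood scores (not study data)
set.seed(7)
toy <- data.frame(Frame = 0:44, Contempt = runif(45))
toy$Window <- toy$Frame %/% frames_per_window

# Hypothetical helper: descriptive statistics for one window
half_sec_stats <- function(x) {
  c(mean = mean(x), sd = sd(x), min = min(x), max = max(x), iqr = IQR(x))
}

# One row per half-second window, one column per statistic
cut_stat <- t(sapply(split(toy$Contempt, toy$Window), half_sec_stats))

# Mean only, as assumed for the "½ Sec Cut" data
cut_mean <- cut_stat[, "mean"]
print(cut_stat)
```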