* CITE AS: Zyphur, M. J., & Li, J. (2019). EDM for N of 1. Stata .do file retrieved online. * CONTACT: mzyphur@unimelb.edu.au * NOTES ON THE .DO FILE AND RESULTS: * This runs EDM analyses for N=1 time-series data (tsset) as follows: * 1) Initial simplex projections (SP), which is diagnostic and determines the E used in subsequent analyses * 2) Predictability at different prediction steps (TP) into the future using simplex projections * 3) S-mapping (SMAP) to test for nonlinearity at different theta values * 4) Coprediction to evaluate predictability for all unique variable pairs using simplex projections * 5) Convergent cross mapping (CCM) or xmap to evaluate causal effects vs. common drivers, etc. * Note 1: Requires these "scc install" packages: lgraph, labutil, heatplot, palettes, colrspace, findname, gtools * Note 2: If computations fail for any analysis, no record is created in saved results, so check output for errors * Note 3: User must first address each local macro below following the ALL CAPS HEADINGS * Note 4: The length of variable names must be 8 characters or less * EDM COMMAND OPTIONS AND SAVE RESULTS local options = "" // Options for the EDM command can be entered here with spaces between them; ensure options apply to both 'explore' and 'xmap' subcommands * Note 1: For example this can be left bank to use the parallel processing plugin or enter "mata" to run using original mata code * Note 2: If the default plugin is used, users may enter "nthreads(#)" to set the number of threads # for the plugin (overriding the default thread count) * Note 3: To create equivalent runs set a seed() option as per usual such as seed(12345) local saveresults = "yes" // Will save all results into local Excel file EDMresults.xlsx local useresults = "rho" // Set "rho" or "mae" to choose outputting rho or MAE results; consider matching to 'testval' and 'Ecriterion' below * Note 1: this option will also be used for any plots selected below * SET VARIABLE AND TIME VARIABLE INFO * Note 1: This added precision will simplify various equality checks below unab variables: // Create varlist by entering variable names after colon * Note 1: Coprediction produces: (a) results in a 'wide' format as square matrices of these variables; and * (b) results in a 'long' format with all variable pairs stacked and coded numerically (and as strings) for plotting local timevar = "" // Name of tsset time variable entered in quotes local dropzeros = "yes" // Set "yes" or "no" to drop E-length runs of zero for analyses AFTER the initial simplex projections * Note 1: Deyle et al., 2016a (Global... influenza): "long stretches of zero-values... will contain little or no information" * "we exclude all stretches of zeros at least... maximum tested E... from the prediction set to improve the sensitivity of the analysis." * Note 2: Dropping E-length runs of zero can speed up estimation, often with only small differences in results local prestd = "z." // Set "z." to standardize variables as z-scores for analysis, or leave blank "" for no standardization * Note 1: Standardization can help coprediction/CCM in small panel-specific (within) sample sizes with very different scales/variances * Note 2: Standardization helps interpretation of MAE (on a scale 0-1); for rho analogue this allows constructing 1 - MAE * Note 3: Standardization done before differencing, so to retain this MAE interpretation difference the variable before analysis with "z." * Note 4: Standardization done for whole variable, not on a per-panel ID basis; may want to pre-standardize per panel before analysis * Note 4: Standardization only essential when doing multivariate EDM embeddings, which is not programmed here local forcing = "force" // Set "force" to adjust analyses for overlapping points/values on manifold; otherwise set blank "" * Note 1: If forcing left blank "" then many overlapping points/values will cause the analysis to be skipped * SET MISSING DATA FEATURES AND EXTRA VARIABLES TO INCLUDE IN MANIFOLD local time_gap = "" // Set "dt" to include information about varying time gaps (i.e., unequal intervals of observation) or leave blank as "" * Note 1: This will include E - 1 variables that measure time gaps as dt (e.g., t1 - t2), with weighting equal to measured variable as SDx/SDdt * Note 2: If time gaps are equal but data are missing, then setting "dt" will incorporate info about missingness as time gap data * Note 3: In this case missing data will still be dropped unless the "missing" option below is used * Note 4: If dt is set then all CCM analyses are contemporaneous and the lag/leads options below are automatically reset to 0 local tgap_weight = "" // Set to override automatic weight (not recommended without prior exploratory analyses) or leave blank "" for default * Note 1: For optimal weight, first explore prediction quality in simplex projection with automatic weight and then vary this weight +/- increments manually * Note 2: This is generally not recommended in combination with the "extra" option below, as different variables may have different optimal weights local missings = "" // Set "allowmissing" to keep missing data in the manifold, with distance set to SDx where x is measured variable * Note 1: This is analogous to an imputation strategy with missing values replaced by whatever results in nearest-neighbor distances = SD*2/sqrt(pi) * Note 2: If data are missing and this is not set, then those data will be dropped to produce a manifold with up to E missing data vectors * Note 3: With unequal time gaps with "dt" and missing data, both options can be used to keep all data in for example CCM where x is missing but y is not local miss_distance = "" // Set to override automatic missing datance SD*2/sqrt(pi) (not recommended without exploratory analysis) or leave blank "" for default * Note 1: For optimal distance, exploratory analyses can be conducted using simplex projection but default may be advised to reflect uncertainty * Note 2: As with tgap_weight this is not recommended with the "extra" option below, as different variables may have different optimal distances local extras = "" // Set to include any additional susbtantive variables in the manifold (i.e., a multivariate embedding) * Note 1: To ensure equal weighting of variables set prestd = "z." above and standardization is automatically applied to variables listed as extras here * Note 2: When a difference operator is applied to main variables it is also automatically applied to the variables listed as extras local extras_raw = "" // Set to include any extra variables that should not be differenced during analyses, such as time-invariant variables * Note 1: With missing data using missing = "allowmissing", extras_raw can include missing data info as binary indicator of missingness * Note 2: The prestd = "z." prefix is not applied to extras_raw so if standardization is desired it should be done manuall using the z. prefix (e.g., "z.x") * SET INITIAL SIMPLEX PROJECTION (SP) METHOD * This analysis uses the default 50/50 random split of the sample for analysis local SPrun = "yes" // Set "yes" or "no" to run initial simplex projections; results required, so only set "no" if previously run * Note 1: That results from this analysis will be used to set E for all subsequent analyses * Note 2: Plots and results saved as variables are named SP_'par'_'var' * 'par' is parameter name (e.g., rho, mae, etc.) and 'var' is variable name local eminimum = 2 // User sets minimum E used in exploratory simplex projections, 2 by default local emaximum = 20 // User sets maximum E used in exploratory simplex projections, 20 by default * Note 1: Set eminimum=emaximum overrides 'Ecriterion' to fix E at one value (not recommended as it doesn't choose E based on results) local drange = "0/1" // Set difference operators for initial evaluations, such as 0/1 for raw and first-differenced * Note 1: Should be presented as a range, so if no differencing required set "0/0" or if first only set "1/1" * Note 2: This value is automatically set to 0 if 'dt' used for time gaps, as differencing is not straightforward local dropzeros_SP = "yes" // Set "yes" or "no" to drop spells of zeros of length maximum E -- same notes as 'dropzeros' * Note 1: If "yes" then zeros will also be excluded from the permutations used for hypothesis tests of SP results * Note 1: If "yes" then options below are rendered irrelevant and are overridden local SPsampler = "rep" // Set "rep", "crossfold", or "full" * Note 1: Set "rep" for 50/50 random library/manifold splits, which is default * Note 2: Set "crossfold" for k-fold cross-validation * Note 3: With T < ~50 set "full" for leave-one-out cross-validation using full sample for training/manifold reconstruction and testing/prediction local SPreps = 50 // Sets number of replications for 50/50 random splits or the number of groups k for k-fold cross-validation; must be > 1 * Note 1: When SPsampler = "full" the SPreps option is overridden so is irrelevant * CHOOSE METHOD FOR SELECTING E FOR ALL SUBSEQUENT ANALYSES, AND SET THE DESIRED DIFFERENCE OPERATOR local Ecriterion = "rho" // Set "rho", "mae", "dmae", "R2", or "dR2" for E settings per panel ID * Note 1: "rho" and "mae" options choose max(rho) or min(MAE) across the range of E analyzed for each panel ID * Note 2: Selecting "rho" or "mae" offers enhanced descriptive and diagnostic plots of max(rho)-min(MAE) relationship * Note 3: Use MAE over rho for short time series where rho-E relationship noisy (MAE less sensitive to outliers; Deyle et al., 2013, online appendix) * Note 4: "dmae" uses proportion mae change (mae(E) - mae(E+1))/mae(E), starting at minimum E and choosing E+1 if difference > 'dmaeval' * Note 5: "R2" uses R2 change rho^2(E+1) - rho^2(E), starting at minimum E and choosing E+1 if difference > 'R2val' * Note 6: "dR2" uses proportion R2 change (rho^2(E+1) - rho^2(E))/rho^2(E), starting at minimum E and choosing E+1 if difference > 'dR2val' * Note 7: Consider "dmae", "R2", or "dR2" if prediction improves as E-->infinity, meaning there is no true max(rho) or min(MAE), merely max E analyzed * Note 8: In the unlikely case of negative rho, "dmae", "R2", and "dR2" account for the sign of rho, measuring improvements as rho moves towards 1 * Note 9: "dmae", "R2", and "dR2" methods compare E and E+1 starting at E=2, so if E=3 does not improve R2, then E=2 chosen even if big improvements at E>3 local dmaeval = ".05" // If Ecriterion = "dmae", dmaeval is the cutoff that proportion change in mae must exceed to reject E for E+1 (5% default) local R2val = ".01" // If Ecriterion = "R2", R2val is the cutoff that change in R2 must exceed to reject E for E+1 (.01 default) local dR2val = ".05" // If Ecriterion = "dR2", dR2val is the cutoff that proportion change in R2 must exceed to reject E for E+1 (5% default) local diff = 0 // Set difference operator for all subsequent analyses; * Note 1: Setting to 1 can eliminate any fixed between-group differences and linear trends * SET PREDICTION STEPS INTO THE FUTURE (TP) USING SIMPLEX PROJECTION METHOD * This analysis uses the default 50/50 random split of the sample for analysis local TPrun = "yes" // Set "yes" or "no" to run step-ahead prediction analysis * Note 1: Plots and results saved as variables are named TP_'par'_'var' * 'par' is parameter name (e.g., rho, mae, etc.) and 'var' is variable name local TPmax = 5 // Set max value for step-ahead prediction (tp; treated as numlist), set as 1/'TPmax' local TPsampler = "rep" // Set "rep", "crossfold", or "full" * Note 1: Set "rep" for 50/50 random library/manifold splits, which is default * Note 2: Set "crossfold" for k-fold cross-validation * Note 3: With T < ~50 set "full" for leave-one-out cross-validation using full sample for training/manifold reconstruction and testing/prediction local TPreps = 50 // Sets number of replications for 50/50 random splits or the number of groups k for k-fold cross-validation; must be > 1 * Note 1: When SPsampler = "full" the SPreps option is overridden so is irrelevant * SET S-MAPPING (SMAP) METHOD * This analysis uses the default 50/50 random split of the sample for analysis local SMAPrun = "yes" // Set "yes" or "no" to run SMAPs * Note 1: Plots and results saved as variables are named SMAP_'par'_'var' * 'par' is parameter name (e.g., rho, mae, etc.) and 'var' is variable name local theta = "0 .01 .1 .5 1 2 3 5" // Theta values to test nonlinearity (as numlist), default "0 .01 .1 .5 1 1.5 2 5" local kval = -1 // Number of neighbors to use, for all possible set to -1 but here set to 70 local SMAPsampler = "rep" // Set "rep", "crossfold", or "full" * Note 1: Set "rep" for 50/50 random library/manifold splits, which is default * Note 2: Set "crossfold" for k-fold cross-validation * Note 3: With T < ~50 set "full" for leave-one-out cross-validation using full sample for training/manifold reconstruction and testing/prediction local SMAPreps = 50 // Sets number of replications for 50/50 random splits or the number of groups k for k-fold cross-validation; must be > 1 * Note 1: When SPsampler = "full" the SPreps option is overridden so is irrelevant * SET CO-PREDICTION (COPRED) * This analysis automatically uses the "full" sample option to form the manifolds used for predictions local COPREDrun = "yes" // Set "yes" or "no" to run Copredictions * SET CONVERGENT CROSS MAPPING (CCM) OR XMAP METHOD local CCMrun = "yes" // Set "yes" or "no" to run CCM local lib1 = 2 // First library size E+'lib1' to test convergence; lib1 must be > 1 for minimum of E+1 neighbors and 1 target vector on manifold * Note 1: These libraries smaller than the maximum library size formed by randomly sampling E-length vectors of observations local CCMreps1 = 100 // Number of replications for the smallest library size lib(E+2) often used to find 95% CI, default 200; must be > 0 * Note 1: Many replications should not take very long with a small library size L, but compute times will grow exponentially as L increases local lib2 = 3 // Next library size beyond lib1, takes the form E+'lib2' so should be > lib1, numlist default 5 local lib3 = 10 // Next library size beyond lib2, takes the form E+'lib3' so should be > lib2, numlist default 10 local lib4 = 20 // Next library size beyond lib3, takes the form E+'lib4' so should be > lib3, numlist default 20 local CCMreps2 = 50 // Number of replications for the next set of library sizes lib(E+lib2 E+lib3 E+lib4), default 50; must be > 0 local fraction1 = 3 // Denominator for mid library sizes where N/fraction1, default 3 so library size is rounded 1/3 total possible local fraction2 = 2 // Denominator for mid library sizes where N/fraction2, default 2 so library size is rounded 1/2 total possible local CCMreps3 = 20 // Number of replications for the next set of library sizes lib(N/fraction1 N/fraction2), default 10; must be > 0 local CCMlags = "10" // Largest lag for CCM as "'CCMlags'(-1)1"; consider default of 'emaximum'+1 but can be any value > 0 * Note 1: For no lagged CCM analyses set to "NA" local CCMleads = "5" // largest lead for testing effects vs. common causes as numlist "0/'CCMleads'" * Note 1: For no lead analysis set to 0, which in combination with CCMlags = "NA" will run CCM only contemporaneously * SET SMAP OPTION FOR ESTIMATING LOCAL LINEAR EFFECTS FROM CCM local CCMSMAP = "yes" // Set "yes" or "no" to compute coefficients using SMAP method local CCMSMAPk = "-1" // Set to number of neighbors to use for deriving CCM SMAP coefficients, -1 is default as this uses all possible neighbors * Note 1: With large sample sizes compute times can be reduced by setting to some fraction of sample size * SET HYPOTHESIS TESTING OPTIONS * Note 1: Unless otherwise noted, tests are one-tailed to correspond with EDM hypotheses (e.g., rho > 0) * Note 2: Bootstrap uses random 50/50 splits to create a distribution for rho or mae; e.g., tests if two boostrapped rhos are different * Note 3: Permuting randomizes time index to form NHST-type pivotal null distribution; e.g., tests if observed rho > 95th percentile of null * Note 4: Permuting uses 'full' samples option whenever possible to ensure uncertainty relates only to randomized the time index * Note 5: When MAE is used the direction of tests is reversed; e.g., is observed MAE < 5th percentile of permuted null for p < .05? * Note 6: Replications are done per panel ID, so choosing many reps will be computationally heavy with large N and T * Note 8: All test results are graphed, but are saved only if the plots for their analyses are also saved below under PLOT OPTIONS * Note 9: All tests include p < .05 and .01 equivalents, but to interpret p < .01 tests consider many replications (e.g., >= 1,000) * Note 10: All plots are saved if their respective plotting options further below indicate plots should be saved local testval = "rho" // Set "rho" or "mae" to use for all tests; consider matching to 'Ecriterion' * Note 1: This parameter applies to all tests local SPtest = "yes" // Set "yes" or "no" for simplex projection test of prediction quality at chosen E using permutation-based null * Note 1: This test uses the 'full' sample option rather than the initial 50/50 random split for simplex projections * Note 2: If dropping zeros = "yes" this test drops zeros in relation to the optimal E found in initial simplex projections rather than 'emaximum' local SPtestreps = 1000 // Number of permutation replications to generate null distribution for SPtest * Note 1: With many initial bootstrap replications used for simplex projection, this can save substantial time with large N and T local SPtestplot = "yes" // Set "yes" or "no" to finalize and plot the results of the SPtest local SMAPtest = "yes" // Set "yes" or "no" for bootstrap test of improved prediction at best theta>0 versus theta=0 * Note 1: The seed for paired bootstrapped S-map tests is seed(12345) unless 'options' list contains a different seed() value local SMAPtestreps = 1000 // Number of bootstrap replications for SMAPtest * Note 1: This test reuses any SMAP bootstrap reps 'SMAPreps', so new reps for theta=0 and theta>0 are 2(SMAPtestreps - SMAPreps) local SMAPtestplot = "yes" // Set "yes" or "no" to finalize and plot the results of the SMAPtest local StepwiseCCMtest = "yes" // Set "yes" or "no" to make CCM testing and SMAP coefficient computations conditional on assumption checks * Note 1: This sequentially does the following: * 1) tests convergence only if optimal prediction at lag>=0; * 2) runs permutation test at max L only if convergence found at 95%; and * 3) limits SMAP coefficient analysis using CCM method to cases that pass permutation test at max L at 95% * Note 2: See: Ye et al. (2015). Distinguishing time-delayed causal interactions using convergent cross mapping. Scientific Reports. * Note 3: See: Wang et al., (2019). Improved CCM for variable causality detection in complex systems. Control Engineering Practice. local CCMctest = "yes" // Set "yes" or "no" for bootstrap test of convergence from min library size E+'lib1' versus max * Note 1: This test constructs a CI around rho/mae at minimum library size E+'lib1' and checks if max library size rho/mae outside the CI * Note 2: Best test of convergence typically has 'lib1' from above set to minimum value of 2 * Note 3: In some cases manifold reconstruction problems arrise with too many zeros and small libraries; if so consider lib1 > 2 local CCMctestreps = 1000 // Number of bootstrap replications at minimum library size for CCMctest * Note 1: This test reuses any CCM bootstrap reps 'CCMreps1', so new reps = CCMconvreuse - CCMreps1 local CCMctestplot = "yes" // Set "yes" or "no" to finalize and plot the results of the CCMctestplot local CCMLtest = "yes" // Set "yes" or "no" for permutation test of CCM predictions at max library better than null local CCMLtestreps = 1000 // Number of permutation replications for CCMLtest local CCMLtestplot = "yes" // Set "yes" or "no" to finalize and plot the results of the CCMLtest * SET GENERAL PLOT OPTIONS local plots_on = "yes" // Set "yes" or "no" to enable plotting; this does not affect hypothesis test plots, which are on if tests = "yes" * Note 1: When useresults="rho" or "mae" with "z." standardization, the y-axis of many plots scaled with range: -.25(.25)1 local exportplot = "png" // Set as blank "" to not export graphs, or to export specify format "png", "tif", "pdf", "emf", "wmf", "eps" or "ps" local CIs = 95 // Set 95 or 90 for 90% or 95% CI error bars; enter 0 for no CIs, which is advised with many panel IDs (N > 20) * Note 1: This option does not affect the way hypothesis tests are plotted, and instead merely pertains to descriptive output * Note 2: All confidence intervals are symetric as they use SD values, so not used for hypothesis testing merely description * Note 3: CIs for convergent cross mapping (CCM) not available * ANALYSIS-SPECIFIC PLOT OPTIONS * Note 1: Plots of hypothesis tests produced if plots are requested for their analyses (e.g., SPgraph = "yes" will plot 'SPtest' results) local SPgraph = "yes" // Set "yes" or "no" for initial simplex projection plots of average rho (y-axis) by E (x-axis) local SPsave = "yes" // Set "yes" or "no" to save and replace or not save the initial simplex projection plots local SPlinefit = "" // Set "lfit", "qfit", or "fpfit" for linear, quadratic, or fractional polynomial fit; default is nothing local TPgraph = "yes" // Set "yes" or "no" for plots of average rho (y-axis) across prediction time tp (x-axis) local TPsave = "yes" // Set "yes" or "no" to save and replace or not save the prediction time plots local TPlinefit = "" // Set "lfit", "qfit", or "fpfit" for linear, quadratic, or fractional polynomial fit; default is nothing local SMAPgraph = "yes" // Set "yes" or "no" for plots of average rho (y-axis) for SMAP theta values (x-axis) local SMAPsave = "yes" // Set "yes" or "no" to save and replace or not save the S-map plots local SMAPlinefit = "" // Set "lfit", "qfit", or "fpfit" for linear, quadratic, or fractional polynomial fit; default is nothing * Note 1: For SP, TP, and SMAP the "separate(.001)" option is used to reduce clutter, but can set smaller (larger) for more (fewer) panels * Note 2: For SP, TP, and SMAP the "nolegend" option used to reduce clutter, but can be set to "legend(size(vsmall))" if of interest local COPREDgraph = "yes" // Set "yes" or "no" for plots of every variable pair using the coprediction method * Note 1: Coprediction plots are: 1) heatmap; and 2) two types of bar graphs (switching predictor-predictee settings) * Note 2: No CIs are programmed because coprediction methods use the full dataset by default, with leave-one-out predictions local COPREDsave = "yes" // Set "yes" or "no" to save and replace or not save the coprediction plots local COPREDvalues = "yes" // Set "yes" or "no" to display average rho or MAE values in heatmat plot cells local COPREDdecimals = 2 // Set number of decimals to display in values of heatmap plot * Note 1: With more than 10 variables this could be set to 1 for readability local CCMgraph = "yes" // Set "yes" or "no" for plots of average rho (y-axis) for CCM library size (x-axis) at lag = 0 local CCMsave = "yes" // Set "yes" or "no" to save and replace or not save the CCM plots local CCMlinefit = "fpfit" // Set "lfit", "qfit", or "fpfit" for linear, quadratic, or fractional polynomial fit; default is nothing * Note 1: Line fitting helps when different panel IDs have different E or N values, and thus different minimum and maximum library sizes * Note 2: If library sizes and prediction quality differs across panel IDs, grand-mean lines will be unhelpful without line-fitting local CCMlaggraph = "yes" // Set "yes" or "no" for plots of average rho (y-axis) for CCM at different lags (x-axis) local CCMlagsave = "yes" // Set "yes" or "no" to save and replace or not save the CCM plots local CCMlaglinefit = "" // Set "lfit", "qfit", or "fpfit" for linear, quadratic, or fractional polynomial fit; default is nothing local CCMSMAPgraph = "yes" // Set "yes" or "no" for plots of CCM coefficients derived from the SMAP method local CCMSMAPsave = "yes" // Set "yes" or "no" to save and replace or not save the CCM SMAP plots local CCMSMAPlinefit = "" // Set "lfit", "qfit", or "fpfit" for linear, quadratic, or fractional polynomial fit; default is nothing local CCMSMAPeffects = "10" // Set maximum number of effects to plot, where theoretical max is E but plots get cluttered with more than 5 * Note 1: Each effect is really one of E columns in the E-dimensional vectors used for SMAP predictions * Note 2: Because each vector is a lagged embedding, this is like the number of 'lagged effects' to plot from L0 to L'CCMSMAPeffects'-1 * "****************************" * "* *" * "* Begin Analysis Scripts *" * "* *" * "****************************" findname `variables', any(length("@") > 8) if "`r(varlist)'"!="" { ******************************************************************************************* * Note1: Sorry, you have at least one variable with a name longer than 8 characters. * * Note2: To solve this perhaps use clonevar and give it a shorter name. * * Note3: Also, please call your elected official to voice concerns about global warming! * ******************************************************************************************* exit } * Housekeeping set type double tempname A A1 A2 B B Blong C C2 C Clong D Dlong E Elong N Nlong Z1 Z2 Z21 Z22 Z3 RhoGM MaeGM /// Bpart Wpart O Amat CoPred1 CoPred2 NewMat Copredrho Copredmae Predrho Levelslove Test1 tempvar tempstD tempdt1 absvaluE randoM newtimE obsnumbeR obsnO panelvar if "`tgap_weight'"=="" loc tweight = "" else loc tweight = "dtweight(`tgap_weight')" if "`miss_distance'"=="" loc mdistance = "" else loc mdistance = "missingdistance(`miss_distance')" loc extraz1 = wordcount("`extras'") loc extraz2 = wordcount("`extras_raw'") loc extraz=`extraz1'+`extraz2' set more off if "`time_gap'" == "dt" { loc drange = 0 loc diff = 0 } if "`CCMlags'" == "NA" & "`CCMleads'"=="0" { loc CCMlaggraph = "no" } * Treat all panels as part of the same dynamic system local panelrange = "1/1" // Range of panel IDs to analyze (as a numlist of values on the panel ID variable), here 1-50 gen `panelvar'=1 * Parse 'extras' list of variables to apply standardization and difference operators loc exd = `diff' foreach dval of numlist `drange' { if `dval' > `diff' loc exd = `dval' } tokenize `extras' forval dn = 0/`exd' { loc extras`dn' = "" loc i = 1 while "``i''" != "" { loc extras`dn' = "`extras`dn'' `prestd'd`dn'.``i''" loc i = `i' + 1 } } * Override default S-map test seed setting if seed appears in 'options' list tokenize `options' loc smapseed = "seed(12345)" loc i = 1 while "``i''" != "" { loc first=substr("``i''",1,4) if "`first'" == "seed" loc smapseed = "" loc i = `i' + 1 } * Create variables to allow dropping zeros in initial Simplex Projections tsset if "`dropzeros_SP'"=="yes" { foreach var of varlist `variables' { tempvar `var'_nonzeroSPuse `var'_nonzeroSP qui gen ``var'_nonzeroSPuse' = . qui gen ``var'_nonzeroSP' = -1 replace ``var'_nonzeroSP' = 0 if `var' == 0 replace ``var'_nonzeroSP' = sum(``var'_nonzeroSP') replace ``var'_nonzeroSPuse' = 1 if ``var'_nonzeroSP'[_n] < ``var'_nonzeroSP'[_n-`emaximum'-`diff'] tsset loc `var'dropzesSP = "& \`\`var'_nonzeroSP'[_n] < \`\`var'_nonzeroSP'[_n-\`emaximum'-\`dval']" loc `var'dropzesSPtest = "& \`\`var'_nonzeroSPuse' == 1" } } else { foreach var of varlist `variables' { loc `var'dropzesSP = "" loc `var'dropzesSPtest = "" } } * "*************************************" * "* *" * "* Initial Simplex Projection (SP) *" * "* *" * "*************************************" * "Note 1: Results are used to set E for all subsequent analyses using max(rho) or min(MAE)" * "Note 2: Plots and results saved as variables are named SP_'par'_'var' * "The 'par' is parameter name (e.g., rho, mae, etc.) and 'var' is variable name" if "`SPrun'"=="yes" { if "`SPsampler'" == "full" { loc SPreps = 2 loc SPsampler = "rep" loc fully = "full" } loc enum = `emaximum'-`eminimum'+1 /* Number of E values analyzed, used to create matrices to save results */ loc enumlong = `enum'*`SPreps' foreach var of varlist `variables' { tempname SP`var' SPlong`var' mata: `SP`var''=J(0,9,.) mata: `SPlong`var''=J(0,7,.) foreach id of numlist `panelrange' { mata: `C'=J(1,`enum',`id') /* Set matrix with panel ID identifiers, linked to saved results below */ mata: `Clong'=J(1,`enumlong',`id') foreach dval of numlist `drange' { /* Set difference operators used for analysis */ di "-------------------------------" di "Below D = `dval', VAR = `var'" di "-------------------------------" capture noisily { edm explore `prestd'd`dval'.`var' if `panelvar'==`id' ``var'dropzesSP', e(`eminimum'/`emaximum') `SPsampler'(`SPreps') `fully' `forcing' `time_gap' `missings' extra(`extras`dval'' `extras_raw') `tweight' `mdistance' reportrawe `options' } capture { mata: `D'=J(1,`enum',`dval') /* Set matrix with difference operator, linked to saved results below */ mata: `Dlong'=J(1,`enumlong',`dval') mata: `N'=J(1,`enum',st_numscalar("e(N)")) mata: `Nlong'=J(1,`enumlong',st_numscalar("e(N)")) mata: `B'=st_matrix("e(summary)") mata: `E'=`C'\ `D'\ `B''\ `N' mata: `SP`var''=`SP`var''\ `E'' mata: `Blong'=st_matrix("e(explore_result)") mata: `Elong'=`Clong'\ `Dlong'\ `Blong''\ `Nlong' mata: `SPlong`var''=`SPlong`var''\ `Elong'' } } } capture quietly { getmata (SP`var'*)=`SP`var'' (SPlong`var'*)=`SPlong`var'', force double label variable SP`var'1 "Simplex Projection panel ID of 1" rename SP`var'1 SP_id_`var' label variable SP`var'2 "Simplex Projection difference operator" rename SP`var'2 SP_d_`var' recast byte SP_d_`var' label variable SP`var'3 "Simplex Projection E value" rename SP`var'3 SP_e_`var' recast int SP_e_`var' drop SP`var'4 label variable SP`var'5 "Simplex Projection average rho (`SPreps' `SPsampler's)" rename SP`var'5 SP_rho_`var' label variable SP`var'6 "Simplex Projection rho SD (`SPreps' `SPsampler's)" rename SP`var'6 SP_rhoSD_`var' label variable SP`var'7 "Simplex Projection average MAE (`SPreps' `SPsampler's)" rename SP`var'7 SP_mae_`var' label variable SP`var'8 "Simplex Projection MAE SD (`SPreps' `SPsampler's)" rename SP`var'8 SP_maeSD_`var' label variable SP`var'9 "Simplex Projection sample size as usable observations - max E" rename SP`var'9 SP_N_`var' recast long SP_N_`var' label variable SPlong`var'1 "Simplex Projection long (all `SPsampler's), panel ID of 1" rename SPlong`var'1 SPlong_id_`var' label variable SPlong`var'2 "Simplex Projection long (all `SPsampler's), difference operator" rename SPlong`var'2 SPlong_d_`var' recast byte SPlong_d_`var' label variable SPlong`var'3 "Simplex Projection long (all `SPsampler's), E value" rename SPlong`var'3 SPlong_e_`var' recast int SPlong_e_`var' drop SPlong`var'4 label variable SPlong`var'5 "Simplex Projection long (all `SPsampler's), rho" rename SPlong`var'5 SPlong_rho_`var' label variable SPlong`var'6 "Simplex Projection long (all `SPsampler's), MAE" rename SPlong`var'6 SPlong_mae_`var' label variable SPlong`var'7 "Simplex Projection long (all `SPsampler's), sample size as usable observations - max E" rename SPlong`var'7 SPlong_N_`var' } } } clear mata * Initial simplex projection plots if "`SPgraph'"=="yes" & "`plots_on'" =="yes" { preserve foreach var of varlist `variables' { * Confidence intervals for full plots if `CIs'==95 | `CIs'==90 loc a="errortype(ci(`CIs'))" else loc a="" * Use rho or MAE labeling scheme if "`useresults'" == "rho" loc z="&rho" else if "`useresults'" == "mae" loc z="stSerif:MAE" * Set scale for rho and standardized MAE results if "`useresults'" == "rho" loc ylabelscale="ylabel(-.25(.25)1) yscale(range(-.2 1))" else if "`useresults'" == "mae" & "`prestd'" == "z." loc ylabelscale="ylabel(-.25(.25)1) yscale(range(-.2 1))" else loc ylabelscale="" foreach dval of numlist `drange' { * Saving results if "`SPsave'"=="yes" loc b="saving(SP_`useresults'_`var'_d`dval', replace)" else loc b="" * Plot command lgraph SPlong_`useresults'_`var' SPlong_e_`var' SPlong_id_`var' if SPlong_d_`var'==`dval', /// `a' `b' separate(.001) fit(`SPlinefit') `ylabelscale' /// title({stSerif:SP: Results for `var'}, color(black)) /// ytitle({stSerif:Prediction Accuracy} {`z'}) xtitle({stSerif:Dimensionality E}) /// note({stSerif:Note: Grand-mean line is bolded -- differencing at d`dval'}) /// loptions(1 lwidth(*6)) foptions(1 lwidth(*6)) graphregion(color(white)) bgcolor(white) /// name(SP_`useresults'_`var'_d`dval', replace) nolegend if "`exportplot'" != "" { graph export SP_`useresults'_`var'_d`dval'.`exportplot', replace } serset clear } } restore } serset clear clear mata * Identify E at max(rho), min(MAE), proportional change in mae, change in R2, or proportional change dR2 for each variable preserve * First case is only one E value or "chosen E" used for all analyses if `eminimum'==`emaximum' { foreach var of varlist `variables' { foreach id of numlist `panelrange' { loc emax`var'`id'=`eminimum' } } } * Second case is selecting E based on maximum rho or minimum MAE else if "`Ecriterion'"=="rho" | "`Ecriterion'"=="mae" { qui gen long `obsnO'=_n foreach var of varlist `variables' { foreach id of numlist `panelrange' { capture { quietly summarize SP_`Ecriterion'_`var' if SP_id_`var'==`id' & SP_d_`var'==`diff', meanonly if "`Ecriterion'"=="rho" loc a="max" else if "`Ecriterion'"=="mae" loc a="min" sca `Levelslove'=r(`a') qui levelsof `obsnO' if SP_`Ecriterion'_`var' == `Levelslove' & SP_id_`var'==`id' & SP_d_`var'==`diff' loc emax`var'`id'=SP_e_`var'[`r(levels)'] } } } } * Third case is selecting E based on change in R2 or proportion change in R2 (dR2) else if "`Ecriterion'" == "R2" | "`Ecriterion'"=="dR2" { qui gen long `obsnO'=_n foreach var of varlist `variables' { tempvar R2`var' eR2`var' qui gen `R2`var''=. qui gen `eR2`var''=. replace `eR2`var''=2 if SP_e_`var'==2 qui gen rhosign`var'=1 qui replace rhosign`var'=-1 if SP_rho_`var'<0 gen max_rho_`var'=. label variable max_rho_`var' "The rho using `Ecriterion' criterion at d`diff'" gen emax_rho_`var'=. label variable emax_rho_`var' "The E using `Ecriterion' criterion at d`diff'" capture { if "`Ecriterion'"=="R2" { qui bysort SP_id_`var' SP_d_`var' (SP_e_`var'): /// replace `R2`var''=SP_rho_`var'^2*rhosign`var'-SP_rho_`var'[_n-1]^2*rhosign`var'[_n-1] /// if SP_e_`var' > 2 & SP_d_`var'==`diff' qui bysort SP_id_`var' SP_d_`var' (SP_e_`var'): /// replace `eR2`var''=SP_e_`var' if `R2`var''>`R2val' & !missing(`eR2`var''[_n-1]) & SP_d_`var'==`diff' tsset } else if "`Ecriterion'"=="dR2" { qui bysort SP_id_`var' SP_d_`var' (SP_e_`var'): /// replace `R2`var''=(SP_rho_`var'^2*rhosign`var'-SP_rho_`var'[_n-1]^2*rhosign`var'[_n-1])/SP_rho_`var'[_n-1]^2 /// if SP_e_`var' > 2 & SP_d_`var'==`diff' qui bysort SP_id_`var' SP_d_`var' (SP_e_`var'): /// replace `eR2`var''=SP_e_`var' if `R2`var''>`dR2val' & !missing(`eR2`var''[_n-1]) & SP_d_`var'==`diff' tsset } } foreach id of numlist `panelrange' { capture { qui summarize `eR2`var'' if SP_id_`var'==`id' & SP_d_`var'==`diff', meanonly sca `Levelslove'=r(max) qui levelsof `obsnO' if `eR2`var'' == `Levelslove' & SP_id_`var'==`id' & SP_d_`var'==`diff' loc emax`var'`id'=`eR2`var''[`r(levels)'] qui replace max_rho_`var' = SP_rho_`var' if SP_id_`var'==`id' & SP_d_`var'==`diff' & SP_e_`var'==`eR2`var''[`r(levels)'] qui replace emax_rho_`var' = SP_e_`var' if SP_id_`var'==`id' & SP_d_`var'==`diff' & SP_e_`var'==`eR2`var''[`r(levels)'] } } } } * Fourth case for selecting E is proportional change in MAE (dmae) else if "`Ecriterion'" == "dmae" { qui gen long `obsnO'=_n foreach var of varlist `variables' { qui gen dmae`var'=. qui gen emae`var'=. replace emae`var'=2 if SP_e_`var'==2 qui gen rhosign`var'=1 qui replace rhosign`var'=-1 if SP_rho_`var'<0 gen max_mae_`var'=. label variable max_mae_`var' "The mae using `Ecriterion' criterion at d`diff'" gen emax_mae_`var'=. label variable emax_mae_`var' "The E using `Ecriterion' criterion at d`diff'" qui bysort SP_id_`var' SP_d_`var' (SP_e_`var'): /// replace dmae`var'=(SP_mae_`var'[_n-1]*rhosign`var'[_n-1]-SP_mae_`var'*rhosign`var')/(SP_mae_`var'[_n-1]*rhosign`var'[_n-1]) /// if SP_e_`var' > 2 & SP_d_`var'==`diff' qui bysort SP_id_`var' SP_d_`var' (SP_e_`var'): /// replace emae`var'=SP_e_`var' if dmae`var'>`dmaeval' & !missing(emae`var'[_n-1]) & SP_d_`var'==`diff' tsset foreach id of numlist `panelrange' { qui summarize emae`var' if SP_id_`var'==`id' & SP_d_`var'==`diff', meanonly sca `Levelslove'=r(max) qui levelsof `obsnO' if emae`var' == `Levelslove' & SP_id_`var'==`id' & SP_d_`var'==`diff' loc emax`var'`id'=emae`var'[`r(levels)'] qui replace max_mae_`var' = SP_mae_`var' if SP_id_`var'==`id' & SP_d_`var'==`diff' & SP_e_`var'==emae`var'[`r(levels)'] qui replace emax_mae_`var' = SP_e_`var' if SP_id_`var'==`id' & SP_d_`var'==`diff' & SP_e_`var'==emae`var'[`r(levels)'] } } } restore * Create variables to allow dropping zeros given the selected E tsset if "`dropzeros'"=="yes" { foreach var of varlist `variables' { tempvar `var'_nonzerouse `var'_nonzero qui gen ``var'_nonzerouse' = . qui gen ``var'_nonzero' = -1 replace ``var'_nonzero' = 0 if `var' == 0 replace ``var'_nonzero' = sum(``var'_nonzero') sort `timevar' foreach id of numlist `panelrange' { replace ``var'_nonzerouse' = 1 if ``var'_nonzero'[_n] < ``var'_nonzero'[_n-`emax`var'`id''-`diff'] } tsset loc `var'dropzes = "& \`\`var'_nonzerouse' == 1" loc `var'dropzes1 = "& \`\`var1'_nonzerouse' == 1" } } else { foreach var of varlist `variables' { loc `var'dropzes = "" loc `var'dropzes1 = "" } } * SP Prediction Tests using `emax`var'`id'' if "`SPtest'" == "yes" { tsset gen `randoM'=. gen `newtimE'=. foreach var of varlist `variables' { tempname SPtestl`var' mata: `SPtestl`var''=J(0,6,.) tempvar d`diff'`var' r`var' gen `d`diff'`var''=d`diff'.`var' gen `r`var''=. foreach i of numlist 1/`SPtestreps' { replace `randoM'=uniform() if "`dropzeros_SP'"=="yes" { bysort ``var'_nonzeroSPuse' (`randoM'): replace `newtimE'=_n if ``var'_nonzeroSPuse'==1 bysort ``var'_nonzeroSPuse' (`timevar'): replace `r`var''=`d`diff'`var''[`newtimE'] if ``var'_nonzeroSPuse'==1 tsset } else { sort `randoM' replace `newtimE'=_n sort `timevar' replace `r`var''=`d`diff'`var''[`newtimE'] tsset } foreach id of numlist `panelrange' { if "`emax`var'`id''"!="" { if `emax`var'`id''>0 { di "----------------------------------------------" di "Below VAR = `var', Rep == `i' of `SPtestreps'" di "----------------------------------------------" capture { edm explore `prestd'`r`var'' if `panelvar'==`id' ``var'dropzes', e(`emax`var'`id'') full `forcing' `time_gap' `missings' extra(`extras`diff'' `extras_raw') `tweight' `mdistance' reportrawe `options' } capture { mata: `Clong'=`id' mata: `Nlong'=st_numscalar("e(N)") mata: `Blong'=st_matrix("e(explore_result)") mata: `Elong'=`Clong'\ `Blong''\ `Nlong' mata: `SPtestl`var''=`SPtestl`var''\ `Elong'' } } } } } capture quietly { getmata (SPtestl`var'*)=`SPtestl`var'', force double label variable SPtestl`var'1 "Simplex Projection test (all reps), panel ID = 1" rename SPtestl`var'1 SPtestl_id_`var' label variable SPtestl`var'2 "Simplex Projection test (all reps), E value" rename SPtestl`var'2 SPtestl_e_`var' recast int SPtestl_e_`var' drop SPtestl`var'3 label variable SPtestl`var'4 "Simplex Projection test (all reps), rho" rename SPtestl`var'4 SPtestl_rho_`var' label variable SPtestl`var'5 "Simplex Projection test (all reps), MAE" rename SPtestl`var'5 SPtestl_mae_`var' label variable SPtestl`var'6 "Simplex Projection test (all reps), sample size as usable observations- max E" rename SPtestl`var'6 SPtestl_N_`var' drop `d`diff'`var'' `r`var'' } } drop `randoM' `newtimE' } * Hypothesis testing calculations and plots if "`SPtestplot'" == "yes" { preserve if "`testval'"=="rho" loc pctiles = "50, 95, 99" else if "`testval'"=="mae" loc pctiles = "1, 5, 50" foreach var of varlist `variables' { tempname SPpct`var' mata: `SPpct`var'' = J(0,5,.) loc `var'counter=0 loc `var't95counter=0 loc `var't99counter=0 foreach id of numlist `panelrange' { if "`emax`var'`id''"!="" { if `emax`var'`id''>0 { loc `var'counter=``var'counter'+1 capture { qui edm explore `prestd'd`diff'.`var' if `panelvar'==`id' ``var'dropzes', e(`emax`var'`id'') full `forcing' `time_gap' `missings' extra(`extras`diff'' `extras_raw') `tweight' `mdistance' reportrawe `options' mat `Amat' = e(explore_result) if "`testval'"=="rho" sca `Test1' = `Amat'[1,3] else if "`testval'"=="mae" sca `Test1' = `Amat'[1,4] _pctile SPtestl_`testval'_`var' if SPtestl_id_`var'==`id', percentiles(`pctiles') if "`testval'"=="rho" & `Test1'-r(r2) > 0 loc `var't95counter=``var't95counter'+1 if "`testval'"=="rho" & `Test1'-r(r3) > 0 loc `var't99counter=``var't99counter'+1 if "`testval'"=="mae" & `Test1'-r(r2) < 0 loc `var't95counter=``var't95counter'+1 if "`testval'"=="mae" & `Test1'-r(r1) < 0 loc `var't99counter=``var't99counter'+1 mata: `Test1'=st_numscalar("`Test1'") mata: `A' = J(1,4,`id') mata: `B' = (`pctiles', 110) mata: `C' = (st_numscalar("r(r1)"), st_numscalar("r(r2)"), st_numscalar("r(r3)"), `Test1') mata: `C2' = (`Test1'-st_numscalar("r(r1)"), `Test1'-st_numscalar("r(r2)"), `Test1'-st_numscalar("r(r3)"), 0) mata: `D' = J(1,4,st_numscalar("e(N)")) mata: `E' = `A'\ `B'\ `C'\ `C2'\ `D' mata: `SPpct`var''=`SPpct`var''\ `E'' } } } } capture quietly { getmata (SPpct`var'*)=`SPpct`var'', force double label variable SPpct`var'1 "SP Percentile panel ID of 1" rename SPpct`var'1 SPpct`var' label variable SPpct`var'2 "SP Percentile" rename SPpct`var'2 SPpct`var'_pctile label variable SPpct`var'3 "SP test values" rename SPpct`var'3 SPpct`var'_vals label variable SPpct`var'4 "SP test Values Minus Observed" rename SPpct`var'4 SPpct`var'_valsc label variable SPpct`var'5 "SP Percentile N" rename SPpct`var'5 SPpct`var'_N } capture { if "`testval'"=="rho" loc fivepct = 95 else if "`testval'"=="mae" loc fivepct = 5 if "`testval'"=="rho" loc onepct = 99 else if "`testval'"=="mae" loc onepct = 1 if "`testval'"=="rho" loc onesuffix = "th" else if "`testval'"=="mae" loc onesuffix = "st" if "`testval'" == "rho" loc z="&rho" else if "`testval'" == "mae" loc z="stSerif:MAE" loc pct95rnd = round(100*``var't95counter'/``var'counter', 1) loc pct95rnd2: di %3.0f `pct95rnd' loc pct99rnd = round(100*``var't99counter'/``var'counter', 1) loc pct99rnd2: di %3.0f `pct99rnd' sum SPpct`var'_vals if SPpct`var'_pctile==110, meanonly loc testvalGM=r(mean) loc testvalGMr=round(`testvalGM',.001) loc testvalGMd: di %8.3f `testvalGMr' sum SPpct`var'_vals if SPpct`var'_pctile==`onepct', meanonly loc onepctGM=r(mean) loc onepctGMr=round(`onepctGM',.001) loc onepctGMd: di %8.3f `onepctGMr' sum SPpct`var'_vals if SPpct`var'_pctile==`fivepct' , meanonly loc fivepctGM=r(mean) loc fivepctGMr=round(`fivepctGM',.001) loc fivepctGMd: di %8.3f `fivepctGMr' sum SPpct`var'_vals if SPpct`var'_pctile==50, meanonly loc nullmeanGM=r(mean) loc nullmeanGMr=round(`nullmeanGM',.001) loc nullmeanGMd: di %6.3f `nullmeanGMr' if "`testval'" == "rho" loc notess=`"note("{stSerif:Note 1: y=0 is permuted} {&rho}{stSerif:, reject H0 if the 99th and/or 95th percentiles are positive}" "{stSerif:Note 2: Of ``var'counter' panel IDs, ``var't95counter' (i.e., `pct95rnd2'%) show p < .05, and ``var't99counter' (i.e., `pct99rnd2'%) show p < .01}" "{stSerif:Note 3: Raw grand-means are: Observed} {&rho} {stSerif:= `testvalGMd'; `onepct'% = `onepctGMd'; `fivepct'% = `fivepctGMd'; Null mean = `nullmeanGMd'}")"' else if "`testval'" == "mae" loc notess=`"note("{stSerif:Note 1: y=0 is permuted MAE, reject H0 if the 1st and/or 5th percentiles are negative}" "{stSerif:Note 2: Of ``var'counter' panel IDs, ``var't95counter' (i.e., `pct95rnd2'%) show p < .05, and ``var't99counter' (i.e., `pct99rnd2'%) show p < .01}" "{stSerif:Note 3: Raw grand-means are: Observed MAE = `testvalGMd'; `onepct'% = `onepctGMd'; `fivepct'% = `fivepctGMd'; Null mean = `nullmeanGMd'}")"' if "`SPsave'" == "yes" loc savers1="saving(SP_test_`var'_`testval'1_d`diff', replace)" if "`SPsave'" == "yes" loc savers2="saving(SP_test_`var'_`testval'2_d`diff', replace)" tempvar titled`var' gen `titled`var'' = "" label variable `titled`var'' "Percentiles of Permuted Null Distribution" replace `titled`var''="Permuted Mean Diff" if SPpct`var'_pctile==50 replace `titled`var''="`fivepct'th Pctile Diff" if SPpct`var'_pctile==`fivepct' replace `titled`var''="`onepct'`onesuffix' Pctile Diff" if SPpct`var'_pctile==`onepct' qui labmask SPpct`var'_pctile, values(`titled`var'') twoway (scatter SPpct`var'_valsc SPpct`var' if SPpct`var'_pctile==50) /// (scatter SPpct`var'_valsc SPpct`var' if SPpct`var'_pctile==`fivepct') /// (scatter SPpct`var'_valsc SPpct`var' if SPpct`var'_pctile==`onepct'), /// legend(label(1 {stSerif:Permuted Mean Diff}) label(2 {stSerif:`fivepct'th Pctile Diff}) label(3 {stSerif:`onepct'`onesuffix' Pctile Diff})) /// title({stSerif:SP: `var' Confidence Intervals Centered on Permuted} {`z'}, color(black)) ytitle({stSerif:Prediction Accuracy} {`z'}) /// /*xlab(`panelrange')*/ name(SP_test_`var'_`testval'1_d`diff', replace) yline(0) `notess' graphregion(color(white)) `savers1' xline(`panelrange', lwidth(vthin) lpattern(dash)) if "`exportplot'" != "" { graph export SP_test_`var'_`testval'1_d`diff'.`exportplot', replace } serset clear lgraph SPpct`var'_valsc SPpct`var' SPpct`var'_pctile if SPpct`var'_pctile!=110, graphregion(color(white)) /// yline(0) ytitle({stSerif:Prediction Accuracy} {`z'}) title({stSerif:SP: `var' Confidence Intervals Centered on Permuted} {`z'}, color(black)) /// /*xlab(`panelrange')*/ name(SP_test_`var'_`testval'2_d`diff', replace) `notess' `savers2' xline(`panelrange', lwidth(vthin) lpattern(dot)) if "`exportplot'" != "" { graph export SP_test_`var'_`testval'2_d`diff'.`exportplot', replace } serset clear } } restore } clear mata serset clear * "**************************************" * "* *" * "* Time of Prediction (TP) Analysis *" * "* *" * "**************************************" * "Note 1: Predictability at different prediction steps (tp) into the future using simplex projections" * "Note 2: plots and results saved as variables are named TP_'par'_'var'" * "'par' is parameter name (e.g., rho, mae, etc.) and 'var' is variable name" if "`TPrun'"=="yes" { if "`TPsampler'" == "full" { loc TPreps = 2 loc TPsampler = "rep" loc fully = "full" } foreach var of varlist `variables' { tempname TP`var' TPlong`var' mata: `TP`var''=J(0,9,.) mata: `TPlong`var''=J(0,7,.) foreach id of numlist `panelrange' { if "`emax`var'`id''"!="" { if `emax`var'`id''>0 { mata: `C'=`id' mata: `Clong'=J(1,`TPreps',`id') forval i = 1/`TPmax' { di "--------------------------------" di "Below VAR = `var', tp==`i'" di "--------------------------------" capture noisily { edm explore `prestd'd`diff'.`var' if `panelvar'==`id' ``var'dropzes', e(`emax`var'`id'') `TPsampler'(`TPreps') `fully' tp(`i') `forcing' `time_gap' `missings' extra(`extras`diff'' `extras_raw') `tweight' `mdistance' reportrawe `options' } capture { mata: `D'=`i' mata: `N'=st_numscalar("e(N)") mata: `B'=st_matrix("e(summary)") mata: `E'=`C'\ `D'\ `B''\ `N' mata: `TP`var''=`TP`var''\ `E'' } capture { mata: `Dlong'=J(1,`TPreps',`i') mata: `Nlong'=J(1,`TPreps',st_numscalar("e(N)")) mata: `Blong'=st_matrix("e(explore_result)") mata: `Elong'=`Clong'\ `Dlong'\ `Blong''\ `Nlong' mata: `TPlong`var''=`TPlong`var''\ `Elong'' } } } } } capture quietly { getmata (TP`var'*)=`TP`var'', force double label variable TP`var'1 "Test of prediction tp steps ahead (TP), ID = 1" rename TP`var'1 TP_id_`var' label variable TP`var'2 "tp step-ahead value from 1 to `TPmax'" rename TP`var'2 TP_tp_`var' recast int TP_tp_`var' label variable TP`var'3 "tp E value (max rho from original SP)" rename TP`var'3 TP_e_`var' recast int TP_e_`var' drop TP`var'4 label variable TP`var'5 "tp average rho (`TPreps' `TPsampler's)" rename TP`var'5 TP_rho_`var' label variable TP`var'6 "to rho SD (`TPreps' `TPsampler's)" rename TP`var'6 TP_rhoSD_`var' label variable TP`var'7 "tp average MAE (`TPreps' `TPsampler's)" rename TP`var'7 TP_mae_`var' label variable TP`var'8 "tp MAE SD (`TPreps' `TPsampler's)" rename TP`var'8 TP_maeSD_`var' label variable TP`var'9 "tp sample size as usable observations (to form E-dimensional vectors) - E" rename TP`var'9 TP_N_`var' getmata (TPlong`var'*)=`TPlong`var'', force double label variable TPlong`var'1 "Simplex Projection long (all `TPsampler's), panel ID = 1" rename TPlong`var'1 TPlong_id_`var' label variable TPlong`var'2 "tp step-ahead value long (all `TPsampler's), from 1 to `TPmax'" rename TPlong`var'2 TPlong_tp_`var' recast int TPlong_tp_`var' label variable TPlong`var'3 "Simplex Projection long (all `TPsampler's), E value" rename TPlong`var'3 TPlong_e_`var' recast int TPlong_e_`var' drop TPlong`var'4 label variable TPlong`var'5 "Simplex Projection long (all `TPsampler's), rho" rename TPlong`var'5 TPlong_rho_`var' label variable TPlong`var'6 "Simplex Projection long (all `TPsampler's), MAE" rename TPlong`var'6 TPlong_mae_`var' label variable TPlong`var'7 "Simplex Projection long (all `TPsampler's), sample size as usable observations - max E" rename TPlong`var'7 TPlong_N_`var' } } } clear mata * Prediction step (TP) plots if "`TPgraph'"=="yes" & "`plots_on'" =="yes" { * Generate grand-mean as a separate panelvar ID preserve foreach var of varlist `variables' { * Confidence intervals for full plots if `CIs'==95 | `CIs'==90 loc a="errortype(ci(`CIs'))" else loc a="" * Use rho or MAE labeling scheme if "`useresults'" == "rho" loc z="&rho" else if "`useresults'" == "mae" loc z="stSerif:MAE" * Set scale for rho and standardized MAE results if "`useresults'" == "rho" loc ylabelscale="ylabel(-.25(.25)1) yscale(range(-.2 1))" else if "`useresults'" == "mae" & "`prestd'" == "z." loc ylabelscale="ylabel(-.25(.25)1) yscale(range(-.2 1))" else loc ylabelscale="" * Saving results if "`TPsave'"=="yes" loc b="saving(TP_`var'_`useresults'_d`diff', replace)" else loc b="" * Plot command lgraph TPlong_`useresults'_`var' TPlong_tp_`var' TPlong_id_`var', /// `a' `b' separate(.001) fit(`TPlinefit') `ylabelscale' /// title({stSerif:TP: Results for `var'}, color(black)) /// ytitle({stSerif:Prediction Accuracy} {`z'}) xtitle({stSerif:Prediction Step in Future tp}) /// note({stSerif:Note: Time of prediction (tp) grand-mean line is bolded -- differencing at d`diff'}) /// loptions(1 lwidth(*6)) foptions(1 lwidth(*6)) graphregion(color(white)) bgcolor(white) /// name(TP_`var'_`useresults'_d`diff', replace) nolegend xlabel(1(1)`TPmax') if "`exportplot'" != "" { graph export TP_`var'_`useresults'_d`diff'.`exportplot', replace } serset clear } restore } serset clear clear mata * "*******************************************" * "* *" * "* S-Mapping (SMAP) Test of Nonlinearity *" * "* *" * "*******************************************" * "Note 1: As theta increases from 0, estimation is more state-dependent (i.e.. nonlinear)" * "Note 2: Plots and results saved as variables named SMAP_'par'_'var'" * "The 'par' is parameter name (e.g., rho, mae, etc.) and 'var' is variable name" if "`SMAPrun'"=="yes" { if "`SMAPsampler'" == "full" { loc SMAPreps = 2 loc SMAPsampler = "rep" loc fully = "full" } local count_theta: word count `theta' foreach var of varlist `variables' { tempname SMAP`var' SMAPlong`var' mata: `SMAP`var''=J(0,7,.) mata: `SMAPlong`var''=J(0,5,.) foreach id of numlist `panelrange' { if "`emax`var'`id''"!="" { if `emax`var'`id''>0 { mata: `C'=J(1,`count_theta',`id') loc newSMAPreps = `SMAPreps'*`count_theta' mata: `Clong'=J(1,`newSMAPreps',`id') di "----------------------------------" di "Below VAR = `var', SMAP" di "----------------------------------" capture noisily { edm explore `prestd'd`diff'.`var' if `panelvar'==`id' ``var'dropzes', e(`emax`var'`id'') `fully' `SMAPsampler'(`SMAPreps') alg(smap) theta(`theta') k(`kval') `forcing' `time_gap' `missings' extra(`extras`diff'' `extras_raw') `tweight' `mdistance' reportrawe `options' } capture { mata: `B'=st_matrix("e(summary)") mata: `Blong'=st_matrix("e(explore_result)") mata: `E'=`C'\ `B'' mata: `Elong'=`Clong'\ `Blong'' mata: `SMAP`var''=`SMAP`var''\ `E'' mata: `SMAPlong`var''=`SMAPlong`var''\ `Elong'' } } } } capture quietly { getmata (SMAP`var'*)=`SMAP`var'', force double label variable SMAP`var'1 "SMAP results, ID = 1" rename SMAP`var'1 SMAP_id_`var' label variable SMAP`var'2 "SMAP E value (max rho from original SP)" rename SMAP`var'2 SMAP_e_`var' recast int SMAP_e_`var' label variable SMAP`var'3 "SMAP theta value" rename SMAP`var'3 SMAP_theta_`var' label variable SMAP`var'4 "SMAP average rho (`SMAPreps' `SMAPsampler's)" rename SMAP`var'4 SMAP_rho_`var' label variable SMAP`var'5 "SMAP rho SD (`SMAPreps' `SMAPsampler's)" rename SMAP`var'5 SMAP_rhoSD_`var' label variable SMAP`var'6 "SMAP average MAE (`SMAPreps' `SMAPsampler's)" rename SMAP`var'6 SMAP_mae_`var' label variable SMAP`var'7 "SMAP MAE SD (`SMAPreps' `SMAPsampler's)" rename SMAP`var'7 SMAP_maeSD_`var' getmata (SMAPlong`var'*)=`SMAPlong`var'', force double label variable SMAPlong`var'1 "SMAP long results, ID = 1" rename SMAPlong`var'1 SMAPlong_id_`var' label variable SMAPlong`var'2 "SMAP long E value (max rho from original SP)" rename SMAPlong`var'2 SMAPlong_e_`var' recast int SMAPlong_e_`var' label variable SMAPlong`var'3 "SMAP long theta value" rename SMAPlong`var'3 SMAPlong_theta_`var' label variable SMAPlong`var'4 "SMAP long rho (`SMAPreps' `SMAPsampler's)" rename SMAPlong`var'4 SMAPlong_rho_`var' label variable SMAPlong`var'5 "SMAP long mae (`SMAPreps' `SMAPsampler's)" rename SMAPlong`var'5 SMAPlong_mae_`var' } } } clear mata * S-map analysis plots if "`SMAPgraph'"=="yes" & "`plots_on'" =="yes" { * Generate grand-mean as a separate panelvar ID preserve foreach var of varlist `variables' { * Confidence intervals for full plots if `CIs'==95 | `CIs'==90 loc a="errortype(ci(`CIs'))" else loc a="" * Use rho or MAE labeling scheme if "`useresults'" == "rho" loc z="&rho" else if "`useresults'" == "mae" loc z="stSerif:MAE" * Set scale for rho and standardized MAE results if "`useresults'" == "rho" loc ylabelscale="ylabel(-.25(.25)1) yscale(range(-.2 1))" else if "`useresults'" == "mae" & "`prestd'" == "z." loc ylabelscale="ylabel(-.25(.25)1) yscale(range(-.2 1))" else loc ylabelscale="" * Saving results if "`SMAPsave'"=="yes" loc b="saving(SMAP_`var'_`useresults'_d`diff', replace)" else loc b="" * Plot command lgraph SMAPlong_`useresults'_`var' SMAPlong_theta_`var' SMAPlong_id_`var', /// `a' `b' separate(.001) fit(`SMAPlinefit') `ylabelscale' /// title({stSerif:SMAP: Results for `var'}, color(black)) /// ytitle({stSerif:Prediction Accuracy} {`z'}) xtitle({stSerif:Degree of State-Dependence or Nonlinearity {&theta}}) /// note({stSerif:Note: Grand-mean line is bolded and greater theta means greater nonlinearity -- differencing at d`diff'}) /// loptions(1 lwidth(*6)) foptions(1 lwidth(*6)) graphregion(color(white)) bgcolor(white) /// name(SMAP_`var'_`useresults'_d`diff', replace) nolegend if "`exportplot'" != "" { graph export SMAP_`var'_`useresults'_d`diff'.`exportplot', replace } serset clear } restore } serset clear clear mata * SMAP Prediction Tests using `emax`var'`id'' if "`SMAPtest'"=="yes" { tsset * Identify theta > 0 with best predictions qui gen long `obsnO'=_n foreach var of varlist `variables' { foreach id of numlist `panelrange' { capture { quietly summarize SMAP_`testval'_`var' if SMAP_id_`var'==`id' & SMAP_theta_`var' > 0, meanonly if "`testval'"=="rho" loc a="max" else if "`testval'"=="mae" loc a="min" sca `Levelslove'=r(`a') qui levelsof `obsnO' if SMAP_`testval'_`var' == `Levelslove' & SMAP_id_`var'==`id' tempname theta`var'`id' sca `theta`var'`id''=SMAP_theta_`var'[`r(levels)'] } } } drop `obsnO' * Run SMAPs with theta = 0 and theta = `theta`var'`id'' if needed foreach var of varlist `variables' { tempname SMAPtestl`var' mata: `SMAPtestl`var''=J(0,6,.) foreach id of numlist `panelrange' { capture confirm scalar `theta`var'`id'' if _rc==0 & "`emax`var'`id''"!="" { if `emax`var'`id''>0 { foreach i of numlist 0 `=`theta`var'`id''' { di "-------------------------------------------------------------------------------" di "S-map test theta==`i', Variable == `var', SMAP test" di "-------------------------------------------------------------------------------" capture { qui edm explore `prestd'd`diff'.`var' if `panelvar'==`id' ``var'dropzes', e(`emax`var'`id'') `smapseed' alg(smap) rep(`SMAPtestreps') theta(`i') k(`kval') `forcing' `time_gap' `missings' extra(`extras`diff'' `extras_raw') `tweight' `mdistance' reportrawe `options' } capture { mata: `Blong'=st_matrix("e(explore_result)") mata: `Clong'=J(1,`SMAPtestreps',`id') mata: `Dlong'=(1..`SMAPtestreps') mata: `Elong'=`Clong'\ `Dlong'\ `Blong'' mata: `SMAPtestl`var''=`SMAPtestl`var''\ `Elong'' } } } } } capture quietly { getmata (SMAPtestl`var'*)=`SMAPtestl`var'', force double label variable SMAPtestl`var'1 "SMAP long test results, ID = 1" rename SMAPtestl`var'1 SMAPtestl_id_`var' recast long SMAPtestl_id_`var' label variable SMAPtestl`var'2 "SMAP long test results, paired replication number" rename SMAPtestl`var'2 SMAPtestl_rep_`var' recast long SMAPtestl_rep_`var' label variable SMAPtestl`var'3 "SMAP long test E value (optimal E from original SP)" rename SMAPtestl`var'3 SMAPtestl_e_`var' recast int SMAPtestl_e_`var' label variable SMAPtestl`var'4 "SMAP long test theta value (optimal theta > 0 given `testval')" rename SMAPtestl`var'4 SMAPtestl_theta_`var' recast float SMAPtestl_theta_`var' label variable SMAPtestl`var'5 "SMAP long test rho (`SMAPreps' reps)" rename SMAPtestl`var'5 SMAPtestl_rho_`var' label variable SMAPtestl`var'6 "SMAP long test mae (`SMAPreps' reps)" rename SMAPtestl`var'6 SMAPtestl_mae_`var' } } } * SMAP hypothesis testing calculations and plots if "`SMAPtestplot'" == "yes" { * Identify theta > 0 with best predictions qui gen long `obsnO'=_n foreach var of varlist `variables' { foreach id of numlist `panelrange' { capture { quietly summarize SMAP_`testval'_`var' if SMAP_id_`var'==`id' & SMAP_theta_`var' > 0, meanonly if "`testval'"=="rho" loc a="max" else if "`testval'"=="mae" loc a="min" sca `Levelslove'=r(`a') qui levelsof `obsnO' if SMAP_`testval'_`var' == `Levelslove' & SMAP_id_`var'==`id' loc besttheta`var'=SMAP_theta_`var'[`r(levels)'] } } } drop `obsnO' preserve if "`testval'"=="rho" loc pctiles = "1, 5, 50" else if "`testval'"=="mae" loc pctiles = "50, 95, 99" foreach var of varlist `variables' { loc `var'counter=0 loc `var't95counter=0 loc `var't99counter=0 tempname SMAPpct`var' tempvar SMAPtestval`var' mata: `SMAPpct`var'' = J(0,3,.) gen `SMAPtestval`var''=. capture { qui bysort SMAPtestl_id_`var' SMAPtestl_rep_`var' (SMAPtestl_theta_`var'): replace `SMAPtestval`var''=SMAPtestl_`testval'_`var'-SMAPtestl_`testval'_`var'[_n-1] if SMAPtestl_theta_`var' > 0 & SMAPtestl_theta_`var'[_n-1]==0 & SMAPtestl_theta_`var'!=. } foreach id of numlist `panelrange' { loc `var'counter=``var'counter'+1 capture { _pctile `SMAPtestval`var'' if SMAPtestl_id_`var'==`id', percentiles(`pctiles') if "`testval'"=="rho" & r(r1) > 0 loc `var't99counter=``var't99counter'+1 if "`testval'"=="rho" & r(r2) > 0 loc `var't95counter=``var't95counter'+1 if "`testval'"=="mae" & r(r2) < 0 loc `var't95counter=``var't95counter'+1 if "`testval'"=="mae" & r(r3) < 0 loc `var't99counter=``var't99counter'+1 mata: `A' = J(1,3,`id') mata: `B' = (`pctiles') mata: `C' = (st_numscalar("r(r1)"), st_numscalar("r(r2)"), st_numscalar("r(r3)")) mata: `E' = `A'\ `B'\ `C' mata: `SMAPpct`var''=`SMAPpct`var''\ `E'' } } drop `SMAPtestval`var'' capture quietly { getmata (SMAPpct`var'*)=`SMAPpct`var'', force double label variable SMAPpct`var'1 "SMAP Percentile panel ID of 1" rename SMAPpct`var'1 SMAPpct`var' label variable SMAPpct`var'2 "SMAP Percentile" rename SMAPpct`var'2 SMAPpct`var'_pctile label variable SMAPpct`var'3 "SMAP test values" rename SMAPpct`var'3 SMAPpct`var'_vals } capture { if "`testval'"=="mae" loc fivepct = 95 else if "`testval'"=="rho" loc fivepct = 5 if "`testval'"=="mae" loc onepct = 99 else if "`testval'"=="rho" loc onepct = 1 if "`testval'"=="mae" loc onesuffix = "th" else if "`testval'"=="rho" loc onesuffix = "st" if "`testval'" == "rho" loc z="&rho" else if "`testval'" == "mae" loc z="stSerif:MAE" loc pct95rnd = round(100*``var't95counter'/``var'counter', 1) loc pct95rnd2: di %3.0f `pct95rnd' loc pct99rnd = round(100*``var't99counter'/``var'counter', 1) loc pct99rnd2: di %3.0f `pct99rnd' sum SMAPpct`var'_vals if SMAPpct`var'_pctile==`onepct', meanonly loc onepctGM=r(mean) loc onepctGMr=round(`onepctGM',.0001) loc onepctGMd: di %9.4f `onepctGMr' sum SMAPpct`var'_vals if SMAPpct`var'_pctile==`fivepct' , meanonly loc fivepctGM=r(mean) loc fivepctGMr=round(`fivepctGM',.0001) loc fivepctGMd: di %9.4f `fivepctGMr' sum SMAPpct`var'_vals if SMAPpct`var'_pctile==50, meanonly loc nullmeanGM=r(mean) loc nullmeanGMr=round(`nullmeanGM',.0001) loc nullmeanGMd: di %9.4f `nullmeanGMr' if "`testval'" == "rho" loc notess=`"note("{stSerif:Note 1:} {&rho} {stSerif:at best theta > 0, here `besttheta`var'', minus} {&rho} {stSerif:at theta = 0, positive values support nonlinearity}" "{stSerif:Note 2: Of ``var'counter' panel IDs, ``var't95counter' (i.e., `pct95rnd2'%) show p < .05, and ``var't99counter' (i.e., `pct99rnd2'%) show p < .01}" "{stSerif:Note 3: Grand-means of differences are: `onepct'% = `onepctGMd'; `fivepct'% = `fivepctGMd'; Mean = `nullmeanGMd'}" "{stSerif:Note 4: Missing percentiles implies manifold reconstruction problems (e.g., missing data or many zeros)}")"' else if "`testval'" == "mae" loc notess=`"note("{stSerif:Note 1: MAE at best theta > 0 minus MAE at theta = 0, negative values support nonlinearity}" "{stSerif:Note 2: Of ``var'counter' panel IDs, ``var't95counter' (i.e., `pct95rnd2'%) show p < .05, and ``var't99counter' (i.e., `pct99rnd2'%) show p < .01}" "{stSerif:Note 3: Grand-means of differences are: `onepct'% = `onepctGMd'; `fivepct'% = `fivepctGMd'; Mean = `nullmeanGMd'}" "{stSerif:Note 4: Missing percentiles implies manifold reconstruction problems (e.g., missing data or many zeros)}")"' if "`SMAPsave'" == "yes" loc savers1="saving(SMAPtest_`var'_`testval'1_d`diff', replace)" if "`SMAPsave'" == "yes" loc savers2="saving(SMAPtest_`var'_`testval'2_d`diff', replace)" tempvar titled`var' gen `titled`var'' = "" label variable `titled`var'' "Percentiles of Difference: (theta > 0) - (theta == 0)" replace `titled`var''="Mean of Difference" if SMAPpct`var'_pctile==50 replace `titled`var''="`fivepct'th Percentile" if SMAPpct`var'_pctile==`fivepct' replace `titled`var''="`onepct'`onesuffix' Percentile" if SMAPpct`var'_pctile==`onepct' qui labmask SMAPpct`var'_pctile, values(`titled`var'') drop `titled`var'' twoway (scatter SMAPpct`var'_vals SMAPpct`var' if SMAPpct`var'_pctile==50) /// (scatter SMAPpct`var'_vals SMAPpct`var' if SMAPpct`var'_pctile==`fivepct') /// (scatter SMAPpct`var'_vals SMAPpct`var' if SMAPpct`var'_pctile==`onepct'), /// legend(label(1 {stSerif:Average Difference}) label(2 {stSerif:`fivepct'th Pctile Diff}) label(3 {stSerif:`onepct'`onesuffix' Pctile Diff})) /// title({stSerif:SMAP: `var' Confidence Intervals of Difference in} {`z'}, color(black)) ytitle({stSerif:Prediction Accuracy} {`z'}) /// /*xlab(`panelrange')*/ name(SMAPtest_`var'_`testval'1_d`diff', replace) yline(0) `notess' graphregion(color(white)) `savers1' xline(`panelrange', lwidth(vthin) lpattern(dash)) if "`exportplot'" != "" { graph export SMAPtest_`var'_`testval'1_d`diff'.`exportplot', replace } serset clear lgraph SMAPpct`var'_vals SMAPpct`var' SMAPpct`var'_pctile, graphregion(color(white)) /// yline(0) ytitle({stSerif:Prediction Accuracy} {`z'}) title({stSerif:SMAP: `var' Confidence Interval of Difference in} {`z'}, color(black)) /// /*xlab(`panelrange')*/ name(SMAPtest_`var'_`testval'2_d`diff', replace) `notess' `savers2' xline(`panelrange', lwidth(vthin) lpattern(dot)) if "`exportplot'" != "" { graph export SMAPtest_`var'_`testval'2_d`diff'.`exportplot', replace } serset clear } } restore } clear mata serset clear * "*******************************************" * "* *" * "* Coprediciton Using Simplex Projection *" * "* *" * "*******************************************" * "Note 1: Coprediction to evaluate predictability for all unique variable pairs using simplex projections" * "Note 2: Results saved in 'Long' format where predictor and outcome coded in two variables with rho and mae in two others" * "Note 3: Results also saved in a 'Wide' format that is essentially a correlation matrix of results * "Note 4: Columns of these matrices index the predictor var1 used to construct the E-dimensional manifold" * "Note 5: Rows index the predictee var2 that is predicted by the predictor var1" * "Note 6: Thus, diagonal elements of these matrices are typical simplex projection rhos (a variable predicting itself)" * "Note 7: panelvar ID saved as variable to tie each matrix to its panel ID" * "Note 8: Wide format plots and results saved as variables named CoPred_'var'" * "The 'var' is the variable name that serves as the predictor of other variables" * "Note 9: The 'variables' list determines the Wide format outcome/predictee variable for each row of results" * "Given the 'variables' list here, there are 'nv' unique rows of results in order of 'variables' list" if "`COPREDrun'"=="yes" { loc nv: word count `variables' /* Number of variables "nv" to form matrices for coprediction results */ loc copredmatsize = `nv'+2 /* Set columns of coprediction matrix, which is nv+1 to include panel IDs as the first column */ mata: `CoPred1'=J(0,`copredmatsize',.) /* Generate matrix to hold panel IDs and nv*nv square matrix of results */ mata: `CoPred2'=J(0,`copredmatsize',.) /* Generate matrix to hold panel IDs and nv*nv square matrix of results */ mata: `NewMat'=J(0,5,.) foreach id of numlist `panelrange' { loc aa=2 /* Setup counters linked to matrix elements to contain results */ loc bb=0 loc aaa=0 mata: `A1'=J(1,`nv',`id') /* Set first column to person ID from panelvar */ mata: `A2'=(1..`nv') mata: `B'=J(`nv',`nv',.) /* Create nv*nv matrix to contain each person's coprediction matrix */ mata: `C'=`A1'\ `A2'\ `B' mata: `Copredrho'=`C'' mata: `Copredmae'=`C'' foreach var1 of varlist `variables' { loc aa=`aa'+1 /* This is column number in coprediction matrix after panel ID, meaning the predictor var1 */ loc aaa=`aaa'+1 /* This codes for predictor in the non-matrix saved results NewMat as varlist variables order */ foreach var2 of varlist `variables' { loc bb=`bb'+1 /* Predictee var2: 1=ihap; 2=irlx; ...; 8=iNA */ loc cc=`bb'-(`aa'-3)*`nv' /* This is row number in coprediction matrix for predictee var2 */ di "-------------------------------------------------" di "Below predictor==`var1', predictee==`var2'" di "-------------------------------------------------" capture noisily { if "`emax`var1'`id''"!="" { if `emax`var1'`id''>0 { edm explore `prestd'd`diff'.`var1' if `panelvar'==`id' ``var1'dropzes1', e(`emax`var1'`id'') copredictvar(`prestd'd`diff'.`var2') copredict(`var1'_pred_`var2'_`panelvar') full `forcing' `time_gap' `missings' extra(`extras`diff'' `extras_raw') `tweight' `mdistance' reportrawe `options' } } } if "`time_gap'"=="dt" { preserve drop if missing(`var2') } capture { gen `tempdt1' = `var2'[_n+1] if `panelvar'==`id' & `panelvar'[_n+1]==`id' /* Here is a change */ qui cor d`diff'.`tempdt1' `var1'_pred_`var2'_`panelvar' if `panelvar'==`id' /* Here is a change */ mata: `Predrho'=st_matrix("r(C)") mata: `Copredrho'[`cc',`aa']=`Predrho'[2,1] } capture { if "`prestd'" == "z." { qui egen `tempstD' = std(d`diff'.`tempdt1') qui gen `absvaluE'=abs(`var1'_pred_`var2'_`panelvar'-`tempstD') if `panelvar'==`id' /* Here is a change */ summarize `absvaluE', meanonly mata: `Copredmae'[`cc',`aa']=st_numscalar("r(mean)") mata: `NewMat'=`NewMat'\(`id',`aaa',`cc',`Predrho'[2,1],st_numscalar("r(mean)")) drop `absvaluE' `tempstD' `var1'_pred_`var2'_`panelvar' `tempdt1' } else { qui gen `absvaluE'=abs(`var1'_pred_`var2'_`panelvar'-d`diff'.`tempdt1') if `panelvar'==`id' /* Here is a change */ summarize `absvaluE', meanonly mata: `Copredmae'[`cc',`aa']=st_numscalar("r(mean)") mata: `NewMat'=`NewMat'\(`id',`aaa',`cc',`Predrho'[2,1],st_numscalar("r(mean)")) drop `absvaluE' `var1'_pred_`var2'_`panelvar' `tempdt1' } } if "`time_gap'"=="dt" { restore } } } mata: `CoPred1'=`CoPred1'\ `Copredrho' mata: `CoPred2'=`CoPred2'\ `Copredmae' } capture quietly { getmata (CoPred1*)=`CoPred1',force double getmata (CoPred2*)=`CoPred2',force double getmata (NewMat*)=`NewMat',force double label variable CoPred11 "Coprediction, ID from `panelvar'" label variable CoPred12 "Coprediction, predictee" label variable NewMat1 "Coprediction ID from `panelvar'" label variable NewMat2 "Coprediction Predictor" label variable NewMat3 "Coprediction Predictee" label variable NewMat4 "Coprediction rho" label variable NewMat5 "Coprediction MAE" drop CoPred21 CoPred22 rename CoPred11 CoPredWide rename CoPred12 CoPredWide_Predictee recast long CoPredWide_Predictee rename NewMat1 CoPredLong rename NewMat2 CoPredLong_Predictor recast long CoPredLong_Predictor rename NewMat3 CoPredLong_Predictee recast long CoPredLong_Predictee rename NewMat4 CoPredLong_rho rename NewMat5 CoPredLong_mae tostring CoPredWide_Predictee, generate(CoPredWide_Predictee_String) tostring CoPredLong_Predictor, generate(CoPredLong_Predictor_String) tostring CoPredLong_Predictee, generate(CoPredLong_Predictee_String) } * Replace string numeric values with predictee variable names loc a=0 foreach var of varlist `variables' { loc a=`a'+1 loc varnum`a' = "`var'" } foreach i of numlist 1/`nv' { qui replace CoPredWide_Predictee_String="`prestd'`varnum`i''" if CoPredWide_Predictee_String=="`i'" qui replace CoPredLong_Predictor_String="`prestd'`varnum`i''" if CoPredLong_Predictor_String=="`i'" qui replace CoPredLong_Predictee_String="`prestd'`varnum`i''" if CoPredLong_Predictee_String=="`i'" } qui labmask CoPredWide_Predictee, values(CoPredWide_Predictee_String) qui labmask CoPredLong_Predictor, values(CoPredLong_Predictor_String) qui labmask CoPredLong_Predictee, values(CoPredLong_Predictee_String) * Label variables with predictor names loc a=2 foreach var of varlist `variables' { loc a=`a'+1 loc varnum`a' = "`var'" } foreach i of numlist 3/`copredmatsize' { label variable CoPred1`i' "Coprediction rho `prestd'`varnum`i'' as predictor, ID `panelvar'" label variable CoPred2`i' "Coprediction MAE `prestd'`varnum`i'' as predictor, ID `panelvar'" rename CoPred1`i' CoPredWide_rho_`varnum`i'' rename CoPred2`i' CoPredWide_mae_`varnum`i'' } } clear mata * Coprediction plots if "`COPREDgraph'"=="yes" & "`plots_on'" =="yes" { if "`COPREDvalues'" == "yes" { loc a="values(format(%9.`COPREDdecimals'f))" } else loc a="" if "`COPREDsave'"=="yes" { loc b="saving(COPREDheatmap_`useresults'_d`diff', replace)" loc c="saving(COPREDbar_`useresults'_d`diff'_xpredictee, replace)" loc d="saving(COPREDbar_`useresults'_d`diff'_xpredictor, replace)" } else { loc b="" loc c="" loc d="" } * Use rho or MAE labeling scheme if "`useresults'" == "rho" loc z="&rho" else if "`useresults'" == "mae" loc z="stSerif:MAE" * Coprediction heatmap plot heatplot CoPredLong_`useresults' CoPredLong_Predictee_String CoPredLong_Predictor_String, `a' `b' /// colors(hcl, bluered reverse) discrete fast graphregion(color(white)) yscale(reverse) /// title({stSerif:Coprediction Results Heatmap}, color(black)) name(COPREDheatmap_`useresults'_d`diff', replace) if "`exportplot'" != "" { graph export COPREDheatmap_`useresults'_d`diff'.`exportplot', replace } serset clear * Coprediction line plots graph bar CoPredLong_`useresults' if !missing(CoPredLong_`useresults'), `c' /// over(CoPredLong_Predictor_String) over(CoPredLong_Predictee_String) nofill asyvars /// title({stSerif:Coprediction Results with Predictee on x-Axis}, color(black)) ytitle({stSerif:Prediction Accuracy} {`z'}) /// legend(subtitle("Predictor Variable")) name(COPREDbar_`useresults'_d`diff'_xpredictee, replace) ymtick(##4, grid) graphregion(color(white)) if "`exportplot'" != "" { graph export COPREDbar_`useresults'_d`diff'_xpredictee.`exportplot', replace } serset clear graph bar CoPredLong_`useresults' if !missing(CoPredLong_`useresults'), `d' /// over(CoPredLong_Predictee_String) over(CoPredLong_Predictor_String) nofill asyvars /// title({stSerif:Coprediction Results with Predictor on x-Axis}, color(black)) ytitle({stSerif:Prediction Accuracy} {`z'}) /// legend(subtitle("Predictee Variable")) name(COPREDbar_`useresults'_d`diff'_xpredictor, replace) ymtick(##4, grid) graphregion(color(white)) if "`exportplot'" != "" { graph export COPREDbar_`useresults'_d`diff'_xpredictor.`exportplot', replace } serset clear } serset clear clear mata * "********************************************" * "* *" * "* Convergent Cross Mapping (CCM) or XMAP *" * "* *" * "********************************************" * "Note 1: For evaluating causal effects vs. common drivers, etc." if "`CCMrun'"=="yes" { foreach var1 of varlist `variables' { foreach var2 of varlist `variables' { if "`var1'" != "`var2'" { tempname ID_`var1'_pred_`var2'_CCM mata: `ID_`var1'_pred_`var2'_CCM' = J(0,6,.) foreach id of numlist `panelrange' { if "`emax`var1'`id''"!="" { if `emax`var1'`id''>0 { di "-------------------------------------------------" di "Below CCM Cause==`var1', Outcome==`var2'" di "-------------------------------------------------" if "`CCMlags'"!="NA" & "`time_gap'"!="dt" { forval z=`CCMlags'(-1)1 { capture { di "-------------------------------------------------" di "Below CCM Cause==`var1', Outcome==`var2'" di "-------------------------------------------------" edm xmap `prestd'd`diff'.`var2' `prestd'd`diff'l`z'.`var1' if `panelvar'==`id' ``var1'dropzes1', e(`emax`var1'`id'') direction(oneway) `forcing' `time_gap' `missings' extra(`extras`diff'' `extras_raw') `tweight' `mdistance' `options' } capture { mata: `Z1'=st_matrix("e(xmap_1)") mata: `Z2'=(`id')\(-`z')\ `Z1'' mata: `ID_`var1'_pred_`var2'_CCM' = `ID_`var1'_pred_`var2'_CCM'\ `Z2'' } } } else if "`CCMlags'"!="NA" & "`time_gap'"=="dt" { preserve drop if missing(`var1') & missing(`var2') forval z=`CCMlags'(-1)1 { gen `var1'llag`z' = `var1'[_n-`z'] if `panelvar'==`panelvar'[_n-`z'] capture { di "-------------------------------------------------" di "Below CCM Cause==`var1', Outcome==`var2'" di "-------------------------------------------------" edm xmap `prestd'd`diff'.`var2' `prestd'd`diff'.`var1'llag`z' if `panelvar'==`id' ``var1'dropzes1', e(`emax`var1'`id'') direction(oneway) `forcing' `time_gap' `missings' extra(`extras`diff'' `extras_raw') `tweight' `mdistance' `options' } capture { mata: `Z1'=st_matrix("e(xmap_1)") mata: `Z2'=(`id')\(-`z')\ `Z1'' mata: `ID_`var1'_pred_`var2'_CCM' = `ID_`var1'_pred_`var2'_CCM'\ `Z2'' } } restore } if "`time_gap'"!="dt" { forval z=0/`CCMleads' { capture { di "-------------------------------------------------" di "Below CCM Cause==`var1', Outcome==`var2'" di "-------------------------------------------------" edm xmap `prestd'd`diff'.`var2' `prestd'd`diff'f`z'.`var1' if `panelvar'==`id' ``var1'dropzes1', e(`emax`var1'`id'') direction(oneway) `forcing' `time_gap' `missings' extra(`extras`diff'' `extras_raw') `tweight' `mdistance' `options' } capture { if `z'==0 loc idNfull = e(N) mata: `Z1'=st_matrix("e(xmap_1)") mata: `Z2'=(`id')\(`z')\ `Z1'' mata: `ID_`var1'_pred_`var2'_CCM' = `ID_`var1'_pred_`var2'_CCM'\ `Z2'' } } } else if "`time_gap'"=="dt" { preserve drop if missing(`var1') & missing(`var2') forval z=0/`CCMleads' { gen `var1'forwlag`z' = `var1'[_n+`z'] if `panelvar'==`panelvar'[_n+`z'] capture { di "-------------------------------------------------" di "Below CCM Cause==`var1', Outcome==`var2'" di "-------------------------------------------------" edm xmap `prestd'd`diff'.`var2' `prestd'd`diff'.`var1'forwlag`z' if `panelvar'==`id' ``var1'dropzes1', e(`emax`var1'`id'') direction(oneway) `forcing' `time_gap' `missings' extra(`extras`diff'' `extras_raw') `tweight' `mdistance' `options' } capture { if `z'==0 loc idNfull = e(N) mata: `Z1'=st_matrix("e(xmap_1)") mata: `Z2'=(`id')\(`z')\ `Z1'' mata: `ID_`var1'_pred_`var2'_CCM' = `ID_`var1'_pred_`var2'_CCM'\ `Z2'' } } restore } capture noisily { loc lib11 = `emax`var1'`id''+`lib1' di "-------------------------------------------------" di "Below CCM Cause==`var1', Outcome==`var2'" di "-------------------------------------------------" edm xmap `prestd'd`diff'.`var2' `prestd'd`diff'.`var1' if `panelvar'==`id' ``var1'dropzes1', e(`emax`var1'`id'') oneway lib(`lib11') rep(`CCMreps1') `forcing' `time_gap' `missings' extra(`extras`diff'' `extras_raw') `tweight' `mdistance' `options' } capture { mata: `Z21' = J(1,`CCMreps1',`id') mata: `Z22' = J(1,`CCMreps1',0) mata: `Z1'=st_matrix("e(xmap_1)") mata: `Z3'=`Z21'\ `Z22'\ `Z1'' mata: `ID_`var1'_pred_`var2'_CCM' = `ID_`var1'_pred_`var2'_CCM'\ `Z3'' } capture noisily { loc lib22 = `emax`var1'`id''+`lib2' loc lib33 = `emax`var1'`id''+`lib3' loc lib44 = `emax`var1'`id''+`lib4' di "-------------------------------------------------" di "Below CCM Cause==`var1', Outcome==`var2'" di "-------------------------------------------------" edm xmap `prestd'd`diff'.`var2' `prestd'd`diff'.`var1' if `panelvar'==`id' ``var1'dropzes1', e(`emax`var1'`id'') oneway lib(`lib22' `lib33' `lib44') rep(`CCMreps2') `forcing' `time_gap' `missings' extra(`extras`diff'' `extras_raw') `tweight' `mdistance' `options' } capture { loc matsizetemp2 = 3*`CCMreps2' mata: `Z21' = J(1,`matsizetemp2',`id') mata: `Z22' = J(1,`matsizetemp2',0) mata: `Z1'=st_matrix("e(xmap_1)") mata: `Z3'=`Z21'\ `Z22'\ `Z1'' mata: `ID_`var1'_pred_`var2'_CCM' = `ID_`var1'_pred_`var2'_CCM'\ `Z3'' } capture noisily { loc libthird = round(`idNfull'/`fraction1') loc libhalf = round(`idNfull'/`fraction2') di "-------------------------------------------------" di "Below CCM Cause==`var1', Outcome==`var2'" di "-------------------------------------------------" edm xmap `prestd'd`diff'.`var2' `prestd'd`diff'.`var1' if `panelvar'==`id' ``var1'dropzes1', e(`emax`var1'`id'') oneway lib(`libthird' `libhalf') rep(`CCMreps3') `forcing' `time_gap' `missings' extra(`extras`diff'' `extras_raw') `tweight' `mdistance' `options' } capture { loc matsizetemp3 = 2*`CCMreps3' mata: `Z21' = J(1,`matsizetemp3',`id') mata: `Z22' = J(1,`matsizetemp3',0) mata: `Z1'=st_matrix("e(xmap_1)") mata: `Z3'=`Z21'\ `Z22'\ `Z1'' mata: `ID_`var1'_pred_`var2'_CCM' = `ID_`var1'_pred_`var2'_CCM'\ `Z3'' } } } } * Now save results to variables for plotting etc. capture quietly { getmata (ID_`var1'_pred_`var2'_CCM*)=`ID_`var1'_pred_`var2'_CCM', force double label variable ID_`var1'_pred_`var2'_CCM1 "CCM `var1' causing `var2', ID = 1" rename ID_`var1'_pred_`var2'_CCM1 CCM_id_`var1'_`var2' label variable ID_`var1'_pred_`var2'_CCM2 "CCM `var1' causing `var2' time lag, negative values imply a regular lag and positive values imply forward stepping" rename ID_`var1'_pred_`var2'_CCM2 CCM_lag_`var1'_`var2' recast int CCM_lag_`var1'_`var2' drop ID_`var1'_pred_`var2'_CCM3 label variable ID_`var1'_pred_`var2'_CCM4 "CCM `var1' causing `var2', library size" rename ID_`var1'_pred_`var2'_CCM4 CCM_L_`var1'_`var2' recast long CCM_L_`var1'_`var2' label variable ID_`var1'_pred_`var2'_CCM5 "CCM `var1' causing `var2', rho" rename ID_`var1'_pred_`var2'_CCM5 CCM_rho_`var1'_`var2' label variable ID_`var1'_pred_`var2'_CCM6 "CCM `var1' causing `var2', MAE" rename ID_`var1'_pred_`var2'_CCM6 CCM_mae_`var1'_`var2' } } } } } clear mata * Convergent cross mapping (CCM) or xmap analysis plots if "`CCMgraph'"=="yes" & "`plots_on'" =="yes" { * Generate grand-mean as a separate panelvar ID preserve * Use rho or MAE labeling scheme if "`useresults'" == "rho" loc z="&rho" else if "`useresults'" == "mae" loc z="stSerif:MAE" * Set scale for rho and standardized MAE results if "`useresults'" == "rho" loc ylabelscale="ylabel(-.25(.25)1) yscale(range(-.2 1))" else if "`useresults'" == "mae" & "`prestd'" == "z." loc ylabelscale="ylabel(-.25(.25)1) yscale(range(-.2 1))" else loc ylabelscale="" * Generate grand mean for inclusion as a note on the plots gen long `obsnO'=_n foreach var1 of varlist `variables' { foreach var2 of varlist `variables' { if "`var1'" != "`var2'" { tempvar GMCCM`var1'`var2' gen `GMCCM`var1'`var2''=. foreach id of numlist `panelrange' { capture { quietly summarize CCM_L_`var1'_`var2' if CCM_id_`var1'_`var2'==`id' & CCM_lag_`var1'_`var2'==0 & !missing(CCM_`useresults'_`var1'_`var2'), meanonly loc levelz=r(max) qui levelsof `obsnO' if CCM_L_`var1'_`var2' == `levelz' & CCM_id_`var1'_`var2'==`id' & CCM_lag_`var1'_`var2'==0 replace `GMCCM`var1'`var2''=CCM_`useresults'_`var1'_`var2'[`r(levels)'] if CCM_id_`var1'_`var2'==`id' & CCM_lag_`var1'_`var2'==0 } } sum `GMCCM`var1'`var2'', meanonly loc CCMrhoz=r(mean) loc CCMrhoz1=round(`CCMrhoz',.001) loc CCMrhoz11: di %6.3f `CCMrhoz1' * Saving results if "`CCMsave'"=="yes" { loc b="saving(CCM_`var1'_`var2'_`useresults'_l0, replace)" else loc b="" } * Plot command lgraph CCM_`useresults'_`var1'_`var2' CCM_L_`var1'_`var2' CCM_id_`var1'_`var2' /// if CCM_lag_`var1'_`var2'==0 & !missing(CCM_`useresults'_`var1'_`var2'), /// /// errortype(ci(`CIs')) `b' separate(.001) fit(`CCMlinefit') `ylabelscale' /// title({stSerif:CCM: Results for `var1' CCM causing `var2'}, color(black)) /// ytitle({stSerif:Prediction Accuracy} {`z'}) xtitle({stSerif:Library Size L}) /// note({stSerif:Note: Grand-mean line is bolded -- differencing at d`diff'}) /// loptions(1 lwidth(*6)) foptions(1 lwidth(*6)) graphregion(color(white)) bgcolor(white) /// name(CCM_`var1'_`var2'_`useresults'_l0, replace) nolegend /// note({stSerif:Note: Grand Mean of} {&rho} {stSerif:at Maximum Library Size = `CCMrhoz11' -- differencing at d`diff'}) if "`exportplot'" != "" { graph export CCM_`var1'_`var2'_`useresults'_l0.`exportplot', replace } serset clear } } } restore } clear mata serset clear * Convergent cross mapping (CCM) or xmap LAGGED analysis plots if "`CCMlaggraph'"=="yes" & "`plots_on'" =="yes" { * Generate grand-mean as a separate panelvar ID preserve foreach var1 of varlist `variables' { foreach var2 of varlist `variables' { if "`var1'" != "`var2'" { * Find max library size L for each panel ID and each lag, so only max L results are plotted qui gen CCM_L_`var1'_`var2'max=. quietly levelsof CCM_lag_`var1'_`var2', local(levels) foreach lag of local levels { foreach id of numlist `panelrange' { quietly summarize CCM_L_`var1'_`var2' if CCM_id_`var1'_`var2'==`id' & CCM_lag_`var1'_`var2'==`lag' qui replace CCM_L_`var1'_`var2'max=r(max) if CCM_id_`var1'_`var2'==`id' & CCM_lag_`var1'_`var2'==`lag' } } } } } * Use rho or MAE labeling scheme if "`useresults'" == "rho" loc z="&rho" else if "`useresults'" == "mae" loc z="stSerif:MAE" * Set scale for rho and standardized MAE results if "`useresults'" == "rho" loc ylabelscale="ylabel(-.25(.25)1) yscale(range(-.2 1))" else if "`useresults'" == "mae" & "`prestd'" == "z." loc ylabelscale="ylabel(-.25(.25)1) yscale(range(-.2 1))" else loc ylabelscale="" foreach var1 of varlist `variables' { foreach var2 of varlist `variables' { if "`var1'" != "`var2'" { * Saving results if "`CCMlagsave'"=="yes" { loc b="saving(CCM_`var1'_`var2'_`useresults'_lags, replace)" else loc b="" * Plot command lgraph CCM_`useresults'_`var1'_`var2' CCM_lag_`var1'_`var2' CCM_id_`var1'_`var2' /// if !missing(CCM_`useresults'_`var1'_`var2') & CCM_L_`var1'_`var2'max==CCM_L_`var1'_`var2', /// /// errortype(ci(`CIs')) `b' separate(.001) fit(`CCMlaglinefit') `ylabelscale' /// title({stSerif:CCM: Lagged results for `var1' CCM causing `var2'}, color(black)) /// ytitle({stSerif:Prediction Accuracy} {`z'}) xtitle({stSerif:Time Lag (Negative is Past)}) /// note({stSerif:Note: Grand-mean line is bolded -- differencing at d`diff'}) /// loptions(1 lwidth(*6)) foptions(1 lwidth(*6)) graphregion(color(white)) bgcolor(white) /// name(CCM_`var1'_`var2'_`useresults'_lags, replace) nolegend xlabel(-`CCMlags'(1)`CCMleads') if "`exportplot'" != "" { graph export CCM_`var1'_`var2'_`useresults'_lags.`exportplot', replace } serset clear } } } } restore } clear mata serset clear * Setup check for max(rho) or min(mae) at lag=0 or greater as assumption check before testing convergence if "`CCMctest'" == "yes" { gen long `obsnO'=_n tempname Xit Lag_best foreach var1 of varlist `variables' { foreach var2 of varlist `variables' { capture confirm variable CCM_id_`var1'_`var2' if !_rc { loc count`var1'`var2' = 0 quietly levelsof CCM_id_`var1'_`var2', local(levels) foreach id of local levels { loc count`var1'`var2'`id'=0 if "`StepwiseCCMtest'" == "yes" { capture quietly { summarize CCM_L_`var1'_`var2' if CCM_id_`var1'_`var2'==`id' & CCM_lag_`var1'_`var2'==0 & !missing(CCM_`testval'_`var1'_`var2'), meanonly loc levelz=r(max) levelsof `obsnO' if CCM_L_`var1'_`var2' == `levelz' & CCM_id_`var1'_`var2'==`id' & CCM_lag_`var1'_`var2'==0 & !missing(CCM_`testval'_`var1'_`var2') loc levelo = `r(levels)' } capture { putmata `Xit'=(CCM_`testval'_`var1'_`var2' CCM_lag_`var1'_`var2') if CCM_id_`var1'_`var2'==`id' & (CCM_lag_`var1'_`var2'!=0 | _n==`levelo'), replace omitmissing } if "`testval'"=="rho" { mata: _sort(`Xit',-1) } else if "`testval'"=="mae" { mata: _sort(`Xit',1) } mata: st_numscalar("`Lag_best'",`Xit'[1,2]) if `Lag_best' < 1 { loc count`var1'`var2'=`count`var1'`var2''+1 loc count`var1'`var2'`id'=`count`var1'`var2'`id''+1 } } else { loc count`var1'`var2'=`count`var1'`var2''+1 loc count`var1'`var2'`id'=`count`var1'`var2'`id''+1 } } } } } drop `obsnO' } * CCM Tests for convergence comparing bootstrapped small library size predictions to full results if "`CCMctest'" == "yes" { tsset gen long `obsnO' = _n loc newCCMctestreps = `CCMctestreps' - `CCMreps1' * Reuse previously computed bootstraps at 'lib1' library size foreach var1 of varlist `variables' { foreach var2 of varlist `variables' { if "`count`var1'`var2''"!="" { if `count`var1'`var2''>0 { tempname reuseCCM reuseCCM`var1'`var2' CCMconvetest mata: `reuseCCM`var1'`var2''=J(0,3,.) mata: `CCMconvetest'=J(0,5,.) quietly levelsof CCM_id_`var1'_`var2', local(levels) foreach id of local levels { if "`count`var1'`var2'`id''"!="" { if `count`var1'`var2'`id''>0 { tempname CCML2`var1'`var2'`id' Lmin`var1'`var2'`id' capture quietly { summarize CCM_L_`var1'_`var2' if CCM_id_`var1'_`var2'==`id' & CCM_lag_`var1'_`var2'==0 & !missing(CCM_`testval'_`var1'_`var2'), meanonly loc levelz=r(max) loc levelzmin=r(min) levelsof `obsnO' if CCM_L_`var1'_`var2' == `levelz' & CCM_id_`var1'_`var2'==`id' & CCM_lag_`var1'_`var2'==0 & !missing(CCM_`testval'_`var1'_`var2') sca `CCML2`var1'`var2'`id''=CCM_`testval'_`var1'_`var2'[`r(levels)'] sca `Lmin`var1'`var2'`id''=`levelzmin' } loc lib11 = `emax`var1'`id''+`lib1' if `Lmin`var1'`var2'`id''==`lib11' { capture { putmata `reuseCCM'=(CCM_id_`var1'_`var2' CCM_rho_`var1'_`var2' CCM_mae_`var1'_`var2') if CCM_id_`var1'_`var2' == `id' & CCM_L_`var1'_`var2'==`Lmin`var1'`var2'`id'' & CCM_lag_`var1'_`var2' == 0, omitmissing replace mata: `reuseCCM`var1'`var2''=`reuseCCM`var1'`var2''\ `reuseCCM' } * Top up existing bootstraps at 'lib1' if needed if `newCCMctestreps' > 0 { di "-------------------------------------------------------------------------------------------" di "Reps are `newCCMctestreps', CCM Cause==`var1', Outcome==`var2'" di "-------------------------------------------------------------------------------------------" capture { edm xmap `prestd'd`diff'.`var2' `prestd'd`diff'.`var1' if `panelvar'==`id' ``var1'dropzes1', e(`emax`var1'`id'') oneway lib(`lib11') rep(`newCCMctestreps') `forcing' `time_gap' `missings' extra(`extras`diff'' `extras_raw') `tweight' `mdistance' `options' } capture { mata: `Z21' = J(1,`newCCMctestreps',`id') mata: `Z1'=st_matrix("e(xmap_1)") mata: `Z3'=`Z21'\ `Z1'' mata: `CCMconvetest' = `Z3'' } capture { mata: `reuseCCM`var1'`var2''= `reuseCCM`var1'`var2''\ (`CCMconvetest'[|1,1 \ .,1|],`CCMconvetest'[|1,4 \ .,4|],`CCMconvetest'[|1,5 \ .,5|]) } } } } } } capture { getmata (CCMctest_`var1'_`var2'*)=`reuseCCM`var1'`var2'', force double label variable CCMctest_`var1'_`var2'1 "CCM convergence test of `testval' at 'lib1' (all reps), panel ID from `panelvar'" rename CCMctest_`var1'_`var2'1 CCMctest_id_`var1'_`var2' recast long CCMctest_id_`var1'_`var2' label variable CCMctest_`var1'_`var2'2 "CCM convergence test of `testval' at 'lib1' (all reps), rho" rename CCMctest_`var1'_`var2'2 CCMctest_rho_`var1'_`var2' label variable CCMctest_`var1'_`var2'3 "CCM convergence test of `testval' at 'lib1' (all reps), mae" rename CCMctest_`var1'_`var2'3 CCMctest_mae_`var1'_`var2' } } } } } drop `obsnO' } * Hypothesis testing calculations and plots if "`CCMctestplot'" == "yes" { preserve gen long `obsnO'=_n if "`testval'"=="rho" loc pctiles = "50, 95, 99" else if "`testval'"=="mae" loc pctiles = "1, 5, 50" foreach var1 of varlist `variables' { foreach var2 of varlist `variables' { capture confirm variable CCMctest_id_`var1'_`var2' if !_rc { tempname CCML2pct`var1'`var2' mata: `CCML2pct`var1'`var2'' = J(0,5,.) loc `var1'`var2'tcounter=0 loc `var1'`var2't95counter=0 loc `var1'`var2't99counter=0 quietly levelsof CCMctest_id_`var1'_`var2', local(levels) foreach id of local levels { loc `var1'`var2'tcounter=``var1'`var2'tcounter'+1 tempname CCML2`var1'`var2'`id' capture quietly { summarize CCM_L_`var1'_`var2' if CCM_id_`var1'_`var2'==`id' & CCM_lag_`var1'_`var2'==0 & !missing(CCM_`testval'_`var1'_`var2'), meanonly loc levelz=r(max) levelsof `obsnO' if CCM_L_`var1'_`var2' == `levelz' & CCM_id_`var1'_`var2'==`id' & CCM_lag_`var1'_`var2'==0 & !missing(CCM_`testval'_`var1'_`var2') sca `CCML2`var1'`var2'`id''=CCM_`testval'_`var1'_`var2'[`r(levels)'] } capture { _pctile CCMctest_`testval'_`var1'_`var2' if CCMctest_id_`var1'_`var2'==`id', percentiles(`pctiles') if "`testval'"=="rho" & `CCML2`var1'`var2'`id''-r(r2) > 0 loc `var1'`var2't95counter=``var1'`var2't95counter'+1 if "`testval'"=="rho" & `CCML2`var1'`var2'`id''-r(r3) > 0 loc `var1'`var2't99counter=``var1'`var2't99counter'+1 if "`testval'"=="mae" & `CCML2`var1'`var2'`id''-r(r2) < 0 loc `var1'`var2't95counter=``var1'`var2't95counter'+1 if "`testval'"=="mae" & `CCML2`var1'`var2'`id''-r(r1) < 0 loc `var1'`var2't99counter=``var1'`var2't99counter'+1 mata: `CCML2`var1'`var2'`id''=st_numscalar("`CCML2`var1'`var2'`id''") mata: `A' = J(1,4,`id') mata: `B' = (`pctiles', 110) mata: `C' = (st_numscalar("r(r1)"), st_numscalar("r(r2)"), st_numscalar("r(r3)"), `CCML2`var1'`var2'`id'') mata: `C2' = (`CCML2`var1'`var2'`id''-st_numscalar("r(r1)"), `CCML2`var1'`var2'`id''-st_numscalar("r(r2)"), `CCML2`var1'`var2'`id''-st_numscalar("r(r3)"), 0) capture confirm scalar e(N) if !_rc { mata: `D' = J(1,4,st_numscalar("e(N)")) } else { mata: `D' = J(1,4,.) } mata: `E' = `A'\ `B'\ `C'\ `C2'\ `D' mata: `CCML2pct`var1'`var2''=`CCML2pct`var1'`var2''\ `E'' } } capture { getmata (CCML2pct`var1'`var2'*)=`CCML2pct`var1'`var2'', force double label variable CCML2pct`var1'`var2'1 "CCM convergence test Percentile panel ID" rename CCML2pct`var1'`var2'1 CCML2pct`var1'`var2' label variable CCML2pct`var1'`var2'2 "CCM convergence test Percentile" rename CCML2pct`var1'`var2'2 CCML2pct`var1'`var2'_pctile label variable CCML2pct`var1'`var2'3 "CCM convergence test values" rename CCML2pct`var1'`var2'3 CCML2pct`var1'`var2'_vals label variable CCML2pct`var1'`var2'4 "CCM convergence test Values Minus Observed" rename CCML2pct`var1'`var2'4 CCML2pct`var1'`var2'_valsc label variable CCML2pct`var1'`var2'5 "CCM convergence test Percentile N" rename CCML2pct`var1'`var2'5 CCML2pct`var1'`var2'_N } capture { if "`testval'"=="rho" loc fivepct = 95 else if "`testval'"=="mae" loc fivepct = 5 if "`testval'"=="rho" loc onepct = 99 else if "`testval'"=="mae" loc onepct = 1 if "`testval'"=="rho" loc onesuffix = "th" else if "`testval'"=="mae" loc onesuffix = "st" if "`testval'" == "rho" loc z="&rho" else if "`testval'" == "mae" loc z="stSerif:MAE" loc pct95rnd = round(100*``var1'`var2't95counter'/``var1'`var2'tcounter', 1) loc pct95rnd2: di %3.0f `pct95rnd' loc pct99rnd = round(100*``var1'`var2't99counter'/``var1'`var2'tcounter', 1) loc pct99rnd2: di %3.0f `pct99rnd' sum CCML2pct`var1'`var2'_vals if CCML2pct`var1'`var2'_pctile==110, meanonly loc testvalGM=r(mean) loc testvalGMr=round(`testvalGM',.001) loc testvalGMd: di %8.3f `testvalGMr' sum CCML2pct`var1'`var2'_vals if CCML2pct`var1'`var2'_pctile==`onepct', meanonly loc onepctGM=r(mean) loc onepctGMr=round(`onepctGM',.001) loc onepctGMd: di %8.3f `onepctGMr' sum CCML2pct`var1'`var2'_vals if CCML2pct`var1'`var2'_pctile==`fivepct' , meanonly loc fivepctGM=r(mean) loc fivepctGMr=round(`fivepctGM',.001) loc fivepctGMd: di %8.3f `fivepctGMr' sum CCML2pct`var1'`var2'_vals if CCML2pct`var1'`var2'_pctile==50, meanonly loc nullmeanGM=r(mean) loc nullmeanGMr=round(`nullmeanGM',.001) loc nullmeanGMd: di %6.3f `nullmeanGMr' if "`testval'" == "rho" loc notess=`"note("{stSerif:Note 1: y=0 is} {&rho}{stSerif: at Min L, reject H0 if 99th and/or 95th percentiles are positive}" "{stSerif:Note 2: Of ``var1'`var2'tcounter' panel IDs, ``var1'`var2't95counter' (i.e., `pct95rnd2'%) show p < .05, and ``var1'`var2't99counter' (i.e., `pct99rnd2'%) show p < .01}" "{stSerif:Note 3: Raw grand-means: Observed} {&rho} {stSerif:= `testvalGMd'; `onepct'% = `onepctGMd'; `fivepct'% = `fivepctGMd'; Min L mean = `nullmeanGMd'}")"' else if "`testval'" == "mae" loc notess=`"note("{stSerif:Note 1: y=0 is MAE at Min L, reject H0 if 1st and/or 5th percentiles are negative}" "{stSerif:Note 2: Of ``var1'`var2'tcounter' panel IDs, ``var1'`var2't95counter' (i.e., `pct95rnd2'%) show p < .05, and ``var1'`var2't99counter' (i.e., `pct99rnd2'%) show p < .01}" "{stSerif:Note 3: Raw grand-means: Observed MAE = `testvalGMd'; `onepct'% = `onepctGMd'; `fivepct'% = `fivepctGMd'; Min L mean = `nullmeanGMd'}")"' if "`CCMsave'" == "yes" loc savers1="saving(CCMctest_`var1'_`var2'_`testval'1, replace)" if "`CCMsave'" == "yes" loc savers2="saving(CCMctest_`var1'_`var2'_`testval'2, replace)" tempvar titled2`var1'`var2' gen `titled2`var1'`var2'' = "" label variable `titled2`var1'`var2'' "Percentiles of Bootstrapped Minimum Library Size" replace `titled2`var1'`var2''="Mean Diff at Min L" if CCML2pct`var1'`var2'_pctile==50 replace `titled2`var1'`var2''="`fivepct'th Pctile Diff" if CCML2pct`var1'`var2'_pctile==`fivepct' replace `titled2`var1'`var2''="`onepct'`onesuffix' Pctile Diff" if CCML2pct`var1'`var2'_pctile==`onepct' qui labmask CCML2pct`var1'`var2'_pctile, values(`titled2`var1'`var2'') twoway (scatter CCML2pct`var1'`var2'_valsc CCML2pct`var1'`var2' if CCML2pct`var1'`var2'_pctile==50) /// (scatter CCML2pct`var1'`var2'_valsc CCML2pct`var1'`var2' if CCML2pct`var1'`var2'_pctile==`fivepct') /// (scatter CCML2pct`var1'`var2'_valsc CCML2pct`var1'`var2' if CCML2pct`var1'`var2'_pctile==`onepct'), /// legend(label(1 {stSerif:Mean Diff at Min L}) label(2 {stSerif:`fivepct'th Pctile Diff}) label(3 {stSerif:`onepct'`onesuffix' Pctile Diff})) /// title({stSerif:CCM: Convergence test of `var1' CCM causing `var2'}, color(black)) ytitle({stSerif:Prediction Accuracy} {`z'}) /// /*xlab(`panelrange')*/ name(CCMctest_`var1'_`var2'_`testval'1, replace) yline(0) `notess' graphregion(color(white)) `savers1' xline(`panelrange', lwidth(vthin) lpattern(dash)) if "`exportplot'" != "" { graph export CCMctest_`var1'_`var2'_`testval'1.`exportplot', replace } serset clear lgraph CCML2pct`var1'`var2'_valsc CCML2pct`var1'`var2' CCML2pct`var1'`var2'_pctile if CCML2pct`var1'`var2'_pctile!=110, graphregion(color(white)) /// yline(0) ytitle({stSerif:Prediction Accuracy} {`z'}) title({stSerif:CCM: Convergence test of `var1' CCM causing `var2'}, color(black)) /// /*xlab(`panelrange')*/ name(CCMctest_`var1'_`var2'_`testval'2, replace) `notess' `savers2' xline(`panelrange', lwidth(vthin) lpattern(dot)) if "`exportplot'" != "" { graph export CCMctest_`var1'_`var2'_`testval'2.`exportplot', replace } serset clear } } } } restore } serset clear clear mata * Setup check for convergence before testing effect as max L using permutation-based null (i.e., surrogate data) if "`CCMLtest'" == "yes" { gen long `obsnO'=_n if "`testval'"=="rho" loc pctiles = "50, 95, 99" else if "`testval'"=="mae" loc pctiles = "1, 5, 50" foreach var1 of varlist `variables' { foreach var2 of varlist `variables' { capture confirm variable CCMctest_id_`var1'_`var2' if !_rc { loc count`var1'`var2' = 0 quietly levelsof CCMctest_id_`var1'_`var2', local(levels) foreach id of local levels { loc count`var1'`var2'`id'=0 if "`StepwiseCCMtest'" == "yes" { tempname CCML2`var1'`var2'`id' capture quietly { summarize CCM_L_`var1'_`var2' if CCM_id_`var1'_`var2'==`id' & CCM_lag_`var1'_`var2'==0 & !missing(CCM_`testval'_`var1'_`var2'), meanonly loc levelz=r(max) levelsof `obsnO' if CCM_L_`var1'_`var2' == `levelz' & CCM_id_`var1'_`var2'==`id' & CCM_lag_`var1'_`var2'==0 & !missing(CCM_`testval'_`var1'_`var2') sca `CCML2`var1'`var2'`id''=CCM_`testval'_`var1'_`var2'[`r(levels)'] } capture { _pctile CCMctest_`testval'_`var1'_`var2' if CCMctest_id_`var1'_`var2'==`id', percentiles(`pctiles') if "`testval'"=="rho" & r(r2)-`CCML2`var1'`var2'`id'' < 0 loc count`var1'`var2'=`count`var1'`var2''+1 if "`testval'"=="rho" & r(r2)-`CCML2`var1'`var2'`id'' < 0 loc count`var1'`var2'`id'=`count`var1'`var2'`id''+1 if "`testval'"=="mae" & r(r2)-`CCML2`var1'`var2'`id'' > 0 loc count`var1'`var2'=`count`var1'`var2''+1 if "`testval'"=="mae" & r(r2)-`CCML2`var1'`var2'`id'' > 0 loc count`var1'`var2'`id'=`count`var1'`var2'`id''+1 } } else { loc count`var1'`var2'=`count`var1'`var2''+1 loc count`var1'`var2'`id'=`count`var1'`var2'`id''+1 } } } } } drop `obsnO' } * CCM Tests for prediction at maximum library size using permutation if "`CCMLtest'" == "yes" { tsset * First permute for 'var1', run CCM analyses, and obtain the 'testval' at lag==0 and max L gen `randoM'=. gen long `newtimE'=. gen long `obsnO'=_n foreach var1 of varlist `variables' { tempvar d`diff'`var1' r`var1' gen `d`diff'`var1''=d`diff'.`var1' gen `r`var1''=. foreach var2 of varlist `variables' { if "`count`var1'`var2''"!="" { if `count`var1'`var2''>0 { tempname CCMLtest`var1'`var2' mata: `CCMLtest`var1'`var2''=J(0,5,.) foreach i of numlist 1/`CCMLtestreps' { replace `randoM'=uniform() if "`dropzeros'"=="yes" { bysort ``var1'_nonzerouse' (`randoM'): replace `newtimE'=_n if ``var1'_nonzerouse'==1 bysort ``var1'_nonzerouse' (`timevar'): replace `r`var1''=`d`diff'`var1''[`newtimE'] if ``var1'_nonzerouse'==1 tsset } else { sort `randoM' replace `newtimE'=_n sort `timevar' replace `r`var1''=`d`diff'`var1''[`newtimE'] tsset } capture quietly levelsof CCMctest_id_`var1'_`var2', local(levels) foreach id of local levels { if "`count`var1'`var2'`id''"!="" { if `count`var1'`var2'`id''>0 { capture quietly { summarize CCM_L_`var1'_`var2' if CCM_id_`var1'_`var2'==`id' & CCM_lag_`var1'_`var2'==0 & !missing(CCM_`testval'_`var1'_`var2'), meanonly loc levelz=r(max) levelsof `obsnO' if CCM_L_`var1'_`var2' == `levelz' & CCM_id_`var1'_`var2'==`id' & CCM_lag_`var1'_`var2'==0 tempname CCML`var1'`var2'`id' sca `CCML`var1'`var2'`id''=CCM_`testval'_`var1'_`var2'[`r(levels)'] } di "--------------------------------------------------------------------------------" di "Rep == `i' of `CCMLtestreps', CCM Cause==`var1', Outcome==`var2'" di "--------------------------------------------------------------------------------" capture { qui edm xmap `prestd'd`diff'.`var2' `prestd'`r`var1'' if `panelvar'==`id' ``var1'dropzes1', e(`emax`var1'`id'') oneway `forcing' `time_gap' `missings' extra(`extras`diff'' `extras_raw') `tweight' `mdistance' `options' } capture { mata: `Z21' = (`id') mata: `Z1'=st_matrix("e(xmap_1)") tempname ehere sca `ehere'=e(e_actual) mata: `Z1'[1,1]=st_numscalar("`ehere'") mata: `Z3'=`Z21'\ `Z1'' mata: `CCMLtest`var1'`var2'' = `CCMLtest`var1'`var2''\ `Z3'' } } } } } capture quietly { getmata (CCMLtest_`var1'_`var2'*)=`CCMLtest`var1'`var2'', force double label variable CCMLtest_`var1'_`var2'1 "CCM test of `testval' at max library (all reps), panel ID from `panelvar'" rename CCMLtest_`var1'_`var2'1 CCMLtest_id_`var1'_`var2' label variable CCMLtest_`var1'_`var2'2 "CCM test of `testval' at max library (all reps), E value for CCM cause `var1'" rename CCMLtest_`var1'_`var2'2 CCMLtest_e_`var1'_`var2' recast int CCMLtest_e_`var1'_`var2' drop CCMLtest_`var1'_`var2'3 label variable CCMLtest_`var1'_`var2'4 "CCM test of `testval' at max library (all reps), rho" rename CCMLtest_`var1'_`var2'4 CCMLtest_rho_`var1'_`var2' label variable CCMLtest_`var1'_`var2'5 "CCM test of `testval' at max library (all reps), , MAE" rename CCMLtest_`var1'_`var2'5 CCMLtest_mae_`var1'_`var2' } } } } drop `d`diff'`var1'' `r`var1'' } drop `randoM' `newtimE' `obsnO' } * Hypothesis testing calculations and plots if "`CCMLtestplot'" == "yes" { preserve gen long `obsnO' = _n if "`testval'"=="rho" loc pctiles = "50, 95, 99" else if "`testval'"=="mae" loc pctiles = "1, 5, 50" foreach var1 of varlist `variables' { foreach var2 of varlist `variables' { capture confirm variable CCMLtest_id_`var1'_`var2' if !_rc { tempname CCMLpct`var1'`var2' mata: `CCMLpct`var1'`var2'' = J(0,5,.) loc `var1'`var2'tcounter=0 loc `var1'`var2't95counter=0 loc `var1'`var2't99counter=0 quietly levelsof CCMLtest_id_`var1'_`var2', local(levels) foreach id of local levels { * capture quietly { summarize CCM_L_`var1'_`var2' if CCM_id_`var1'_`var2'==`id' & CCM_lag_`var1'_`var2'==0 & !missing(CCM_`testval'_`var1'_`var2'), meanonly loc levelz=r(max) levelsof `obsnO' if CCM_L_`var1'_`var2' == `levelz' & CCM_id_`var1'_`var2'==`id' & CCM_lag_`var1'_`var2'==0 tempname CCML`var1'`var2'`id' sca `CCML`var1'`var2'`id''=CCM_`testval'_`var1'_`var2'[`r(levels)'] * } loc `var1'`var2'tcounter=``var1'`var2'tcounter'+1 * capture { _pctile CCMLtest_`testval'_`var1'_`var2' if CCMLtest_id_`var1'_`var2'==`id', percentiles(`pctiles') if "`testval'"=="rho" & `CCML`var1'`var2'`id''-r(r2) > 0 loc `var1'`var2't95counter=``var1'`var2't95counter'+1 if "`testval'"=="rho" & `CCML`var1'`var2'`id''-r(r3) > 0 loc `var1'`var2't99counter=``var1'`var2't99counter'+1 if "`testval'"=="mae" & `CCML`var1'`var2'`id''-r(r2) < 0 loc `var1'`var2't95counter=``var1'`var2't95counter'+1 if "`testval'"=="mae" & `CCML`var1'`var2'`id''-r(r1) < 0 loc `var1'`var2't99counter=``var1'`var2't99counter'+1 mata: `CCML`var1'`var2'`id''=st_numscalar("`CCML`var1'`var2'`id''") mata: `A' = J(1,4,`id') mata: `B' = (`pctiles', 110) mata: `C' = (st_numscalar("r(r1)"), st_numscalar("r(r2)"), st_numscalar("r(r3)"), `CCML`var1'`var2'`id'') mata: `C2' = (`CCML`var1'`var2'`id''-st_numscalar("r(r1)"), `CCML`var1'`var2'`id''-st_numscalar("r(r2)"), `CCML`var1'`var2'`id''-st_numscalar("r(r3)"), 0) capture confirm scalar e(N) if !_rc { mata: `D' = J(1,4,st_numscalar("e(N)")) } else { mata: `D' = J(1,4,.) } mata: `E' = `A'\ `B'\ `C'\ `C2'\ `D' mata: `CCMLpct`var1'`var2''=`CCMLpct`var1'`var2''\ `E'' * } } capture quietly { getmata (CCMLpct`var1'`var2'*)=`CCMLpct`var1'`var2'', force double label variable CCMLpct`var1'`var2'1 "CCM max L test Percentile panel ID" rename CCMLpct`var1'`var2'1 CCMLpct`var1'`var2' label variable CCMLpct`var1'`var2'2 "CCM max L test Percentile" rename CCMLpct`var1'`var2'2 CCMLpct`var1'`var2'_pctile label variable CCMLpct`var1'`var2'3 "CCM max L test values" rename CCMLpct`var1'`var2'3 CCMLpct`var1'`var2'_vals label variable CCMLpct`var1'`var2'4 "CCM max L test Values Minus Observed" rename CCMLpct`var1'`var2'4 CCMLpct`var1'`var2'_valsc label variable CCMLpct`var1'`var2'5 "CCM max L Percentile N" rename CCMLpct`var1'`var2'5 CCMLpct`var1'`var2'_N } capture { if "`testval'"=="rho" loc fivepct = 95 else if "`testval'"=="mae" loc fivepct = 5 if "`testval'"=="rho" loc onepct = 99 else if "`testval'"=="mae" loc onepct = 1 if "`testval'"=="rho" loc onesuffix = "th" else if "`testval'"=="mae" loc onesuffix = "st" if "`testval'" == "rho" loc z="&rho" else if "`testval'" == "mae" loc z="stSerif:MAE" loc pct95rnd = round(100*``var1'`var2't95counter'/``var1'`var2'tcounter', 1) loc pct95rnd2: di %3.0f `pct95rnd' loc pct99rnd = round(100*``var1'`var2't99counter'/``var1'`var2'tcounter', 1) loc pct99rnd2: di %3.0f `pct99rnd' sum CCMLpct`var1'`var2'_vals if CCMLpct`var1'`var2'_pctile==110, meanonly loc testvalGM=r(mean) loc testvalGMr=round(`testvalGM',.001) loc testvalGMd: di %8.3f `testvalGMr' sum CCMLpct`var1'`var2'_vals if CCMLpct`var1'`var2'_pctile==`onepct', meanonly loc onepctGM=r(mean) loc onepctGMr=round(`onepctGM',.001) loc onepctGMd: di %8.3f `onepctGMr' sum CCMLpct`var1'`var2'_vals if CCMLpct`var1'`var2'_pctile==`fivepct' , meanonly loc fivepctGM=r(mean) loc fivepctGMr=round(`fivepctGM',.001) loc fivepctGMd: di %8.3f `fivepctGMr' sum CCMLpct`var1'`var2'_vals if CCMLpct`var1'`var2'_pctile==50, meanonly loc nullmeanGM=r(mean) loc nullmeanGMr=round(`nullmeanGM',.001) loc nullmeanGMd: di %6.3f `nullmeanGMr' if "`testval'" == "rho" loc notess=`"note("{stSerif:Note 1: y=0 is permuted} {&rho}{stSerif:, reject H0 if 99th and/or 95th percentiles are positive}" "{stSerif:Note 2: Of ``var1'`var2'tcounter' panel IDs, ``var1'`var2't95counter' (i.e., `pct95rnd2'%) show p < .05, and ``var1'`var2't99counter' (i.e., `pct99rnd2'%) show p < .01}" "{stSerif:Note 3: Raw grand-means are: Observed} {&rho} {stSerif:= `testvalGMd'; `onepct'% = `onepctGMd'; `fivepct'% = `fivepctGMd'; Null mean = `nullmeanGMd'}")"' else if "`testval'" == "mae" loc notess=`"note("{stSerif:Note 1: y=0 is permuted MAE, reject H0 if 1st and/or 5th percentiles are negative}" "{stSerif:Note 2: Of ``var1'`var2'tcounter' panel IDs, ``var1'`var2't95counter' (i.e., `pct95rnd2'%) show p < .05, and ``var1'`var2't99counter' (i.e., `pct99rnd2'%) show p < .01}" "{stSerif:Note 3: Raw grand-means are: Observed MAE = `testvalGMd'; `onepct'% = `onepctGMd'; `fivepct'% = `fivepctGMd'; Null mean = `nullmeanGMd'}")"' if "`CCMsave'" == "yes" loc savers1="saving(CCMLtest_`var1'_`var2'_`testval'1, replace)" if "`CCMsave'" == "yes" loc savers2="saving(CCMLtest_`var1'_`var2'_`testval'2, replace)" tempvar titled`var1'`var2' gen `titled`var1'`var2'' = "" label variable `titled`var1'`var2'' "Percentiles of Permuted Null Distribution" replace `titled`var1'`var2''="Permuted Mean Diff" if CCMLpct`var1'`var2'_pctile==50 replace `titled`var1'`var2''="`fivepct'th Pctile Diff" if CCMLpct`var1'`var2'_pctile==`fivepct' replace `titled`var1'`var2''="`onepct'`onesuffix' Pctile Diff" if CCMLpct`var1'`var2'_pctile==`onepct' qui labmask CCMLpct`var1'`var2'_pctile, values(`titled`var1'`var2'') twoway (scatter CCMLpct`var1'`var2'_valsc CCMLpct`var1'`var2' if CCMLpct`var1'`var2'_pctile==50) /// (scatter CCMLpct`var1'`var2'_valsc CCMLpct`var1'`var2' if CCMLpct`var1'`var2'_pctile==`fivepct') /// (scatter CCMLpct`var1'`var2'_valsc CCMLpct`var1'`var2' if CCMLpct`var1'`var2'_pctile==`onepct'), /// legend(label(1 {stSerif:Permuted Mean Diff}) label(2 {stSerif:`fivepct'th Pctile Diff}) label(3 {stSerif:`onepct'`onesuffix' Pctile Diff})) /// title({stSerif:CCM: Test of `var1' CCM causing `var2' at max L}, color(black)) ytitle({stSerif:Prediction Accuracy} {`z'}) /// /*xlab(`panelrange')*/ name(CCMLtest_`var1'_`var2'_`testval'1, replace) yline(0) `notess' graphregion(color(white)) `savers1' xline(`panelrange', lwidth(vthin) lpattern(dash)) if "`exportplot'" != "" { graph export CCMLtest_`var1'_`var2'_`testval'1.`exportplot', replace } serset clear lgraph CCMLpct`var1'`var2'_valsc CCMLpct`var1'`var2' CCMLpct`var1'`var2'_pctile if CCMLpct`var1'`var2'_pctile!=110, graphregion(color(white)) /// yline(0) ytitle({stSerif:Prediction Accuracy} {`z'}) title({stSerif:CCM: Test of `var1' CCM causing `var2' at max L}, color(black)) /// /*xlab(`panelrange')*/ name(CCMLtest_`var1'_`var2'_`testval'2, replace) `notess' `savers2' xline(`panelrange', lwidth(vthin) lpattern(dot)) if "`exportplot'" != "" { graph export CCMLtest_`var1'_`var2'_`testval'2.`exportplot', replace } serset clear } } } } restore } clear mata serset clear * Setup before computing SMAP coefficients check for significant prediction at max L using permutation-based null (i.e., surrogate data) if "`CCMSMAP'" == "yes" { gen long `obsnO'=_n if "`testval'"=="rho" loc pctiles = "50, 95, 99" else if "`testval'"=="mae" loc pctiles = "1, 5, 50" foreach var1 of varlist `variables' { foreach var2 of varlist `variables' { capture confirm variable CCMLtest_id_`var1'_`var2' if !_rc { loc ncount`var1'`var2' = 0 quietly levelsof CCMLtest_id_`var1'_`var2', local(levels) foreach id of local levels { loc ncount`var1'`var2'`id'=0 if "`StepwiseCCMtest'" == "yes" { tempname CCML2`var1'`var2'`id' capture quietly { summarize CCM_L_`var1'_`var2' if CCM_id_`var1'_`var2'==`id' & CCM_lag_`var1'_`var2'==0 & !missing(CCM_`testval'_`var1'_`var2'), meanonly loc levelz=r(max) levelsof `obsnO' if CCM_L_`var1'_`var2' == `levelz' & CCM_id_`var1'_`var2'==`id' & CCM_lag_`var1'_`var2'==0 & !missing(CCM_`testval'_`var1'_`var2') sca `CCML2`var1'`var2'`id''=CCM_`testval'_`var1'_`var2'[`r(levels)'] } capture { _pctile CCMLtest_`testval'_`var1'_`var2' if CCMLtest_id_`var1'_`var2'==`id', percentiles(`pctiles') if "`testval'"=="rho" & r(r2)-`CCML2`var1'`var2'`id'' < 0 loc ncount`var1'`var2'=`ncount`var1'`var2''+1 if "`testval'"=="rho" & r(r2)-`CCML2`var1'`var2'`id'' < 0 loc ncount`var1'`var2'`id'=`ncount`var1'`var2'`id''+1 if "`testval'"=="mae" & r(r2)-`CCML2`var1'`var2'`id'' > 0 loc ncount`var1'`var2'=`ncount`var1'`var2''+1 if "`testval'"=="mae" & r(r2)-`CCML2`var1'`var2'`id'' > 0 loc ncount`var1'`var2'`id'=`ncount`var1'`var2'`id''+1 } } else { loc ncount`var1'`var2'=`ncount`var1'`var2''+1 loc ncount`var1'`var2'`id'=`ncount`var1'`var2'`id''+1 } } } } } drop `obsnO' } * Compute CCM coefficients using SMAP methods if "`CCMSMAP'" == "yes" { tsset foreach var1 of varlist `variables' { foreach var2 of varlist `variables' { if "`ncount`var1'`var2''"!="" { if `ncount`var1'`var2''>0 { summarize CCMLtest_e_`var1'_`var2', meanonly loc levelz=r(max) forval i=0/`levelz' { gen CCM_SMAP_`var1'_`var2'_b`i'=. } quietly levelsof CCMLtest_id_`var1'_`var2', local(levels) foreach id of local levels { if "`ncount`var1'`var2'`id''"!="" { if `ncount`var1'`var2'`id''>0 { di "--------------------------------------------------------------------------------" di "CCM SMAP Cause==`var1', Outcome==`var2'" di "--------------------------------------------------------------------------------" loc a = 0 capture noisily { edm xmap `prestd'd`diff'.`var1' `prestd'd`diff'.`var2' if `panelvar'==`id' ``var1'dropzes1', e(`emax`var1'`id'') direction(oneway) `forcing' `time_gap' `missings' extra(`extras`diff'' `extras_raw') `tweight' `mdistance' alg(smap) k(`CCMSMAPk') savesmap(z`var1'`var2') `options' } if "`time_gap'"=="dt" loc dtextras = `emax`var1'`id''-1 else loc dtextras = 0 loc a = `emax`var1'`id''+`extraz'+`dtextras' forval i=0/`levelz' { capture confirm variable z`var1'`var2'1_b`i'_rep1 if !_rc { replace CCM_SMAP_`var1'_`var2'_b`i'=z`var1'`var2'1_b`i'_rep1 if z`var1'`var2'1_b`i'_rep1 != . if `a'==`levelz' { local labelz : variable label z`var1'`var2'1_b`i'_rep1 label variable CCM_SMAP_`var1'_`var2'_b`i' "`labelz'" } drop z`var1'`var2'1_b`i'_rep1 } } } } } } } } } } serset clear clear mata * CCM SMAP plots if "`CCMSMAPgraph'"=="yes" & "`plots_on'" =="yes" { preserve foreach var1 of varlist `variables' { foreach var2 of varlist `variables' { capture confirm variable CCM_SMAP_`var1'_`var2'_b0 if !_rc { summarize CCMLtest_e_`var1'_`var2', meanonly loc levelz=r(max) loc a = "" loc `var1'`var2'b = "" loc c = `levelz'-`extraz' loc d = "" loc d`var1'`var2' = "" loc zz`var1'`var2' = "" loc nope="" loc h=0 capture forval i=1/`levelz' { loc labeldt : variable label CCM_SMAP_`var1'_`var2'_b`i' loc first=substr("`labeldt'",1,2) if `i' == 1 | (`i' > `extraz'+1 & "`first'"!="dt") { loc h=`h'+1 replace CCM_SMAP_`var1'_`var2'_b`i'=. if CCM_SMAP_`var1'_`var2'_b`i'==0 sum CCM_SMAP_`var1'_`var2'_b`i', meanonly if r(N) != 0 & `h' <= `CCMSMAPeffects' { loc a = "`a'(kdensity CCM_SMAP_`var1'_`var2'_b`i', lwidth(medthick))" loc d = "`d' CCM_SMAP_`var1'_`var2'_b`i'" loc d`var1'`var2' = "`d`var1'`var2'' CCM_SMAP_`var1'_`var2'_b`i'" loc zz`var1'`var2' = "`zz`var1'`var2'' `timevar' CCM_SMAP_`var1'_`var2'_b`i'" loc nope="`nope' no" loc e = `h'-1 loc `var1'`var2'b = `"``var1'`var2'b' `h' "Lag `e'""' } } } if "`nope'"!="" { if "`CCMSMAPsave'" == "yes" loc savers1="saving(CCM_SMAP_`var1'_`var2'_Marg, replace)" if "`CCMSMAPsave'" == "yes" loc savers2="saving(CCM_SMAP_`var1'_`var2'_Tim1, replace)" if "`CCMSMAPsave'" == "yes" loc savers3="saving(CCM_SMAP_`var1'_`var2'_Tim2, replace)" twoway `a', xline(0) legend(order(``var1'`var2'b') textfirst col(1) position(3)) xtitle({stSerif:Local Linear Slope}) /// name(CCM_SMAP_`var1'_`var2'_Marg, replace) `savers1' title({stSerif:Marginal CCM SMAP Effect of `var1' on `var2'}, color(black)) /// ytitle({stSerif:Density}) graphregion(color(white)) note({stSerif:Note: Differencing at d`diff' and missing coefficient variables reflect problems with SMAP estimates}) /// if "`exportplot'" != "" { graph export CCM_SMAP_`var1'_`var2'_Marg.`exportplot', replace } serset clear } } } } restore } /* foreach var1 in `variables' { foreach var2 in `variables' { capture confirm variable CCM_SMAP_`var1'_`var2'_b0 if !_rc { if "`d`var1'`var2''"!="" { preserve capture forval i=1/`levelz' { loc labeldt : variable label CCM_SMAP_`var1'_`var2'_b`i' loc first=substr("`labeldt'",1,2) if `i' == 1 | (`i' > `extraz'+1 & "`first'"!="dt") { replace CCM_SMAP_`var1'_`var2'_b`i'=. if CCM_SMAP_`var1'_`var2'_b`i'==0 } } tempvar meanZ egen `meanZ'= rowmean(`d`var1'`var2'') drop if `meanZ'==. tsset tsfill tsline `d`var1'`var2'', xtitle({stSerif:Time Variable}) ytitle({stSerif:Local Linear Slopes}) /// graphregion(color(white)) name(CCM_SMAP_`var1'_`var2'_Tim1, replace) cmissing(`nope') lwidth(medthick ..) /// title({stSerif:CCM SMAP Effect Averages of `var1' on `var2'}, color(black)) `savers2' /// legend(order(``var1'`var2'b') textfirst col(1) position(3)) note({stSerif:Note: Differencing at d`diff' and missing coefficient variables reflect problems with SMAP estimates}) if "`exportplot'" != "" { graph export CCM_SMAP_`var1'_`var2'_Tim1.`exportplot', replace } restore } } } } */ /* foreach var1 of varlist `variables' { foreach var2 of varlist `variables' { capture confirm variable CCMLtest_id_`var1'_`var2' if !_rc { preserve stack `zz`var1'`var2'', into(`timevar' CCM_SMAP_`var1'_`var2') clear gen Lag = _stack-1 drop if CCM_SMAP_`var1'_`var2' ==. | (CCM_SMAP_`var1'_`var2'==0 & Lag==0) lgraph CCM_SMAP_`var1'_`var2' `timevar' Lag, statistic(`CCMSMAPstat') /// separate(.005) fit(`CCMSMAPlinefit') /* errortype(`CCMSMAPerrortype') */ /// title({stSerif:CCM SMAP Effect Averages of `var1' on `var2'}, color(black)) /// ytitle({stSerif:Local Linear Slopes}) xtitle({stSerif:Time Variable}) /// note({stSerif:Note: Differencing at d`diff'}) `savers3' /// graphregion(color(white)) bgcolor(white) nomarker /// name(CCM_SMAP_`var1'_`var2'_Tim2, replace) legend(order(`b') textfirst col(1) position(3)) if "`exportplot'" != "" { graph export CCM_SMAP_`var1'_`var2'_Tim2.`exportplot', replace } restore } } } */ * local CCMSMAPerrortype = "ci(95)" // Set errortype from lgraph command, here default is 95% ci but can set minmax for example or quantiles * Note 1: For no CI enter 0 * Save all results if "`saveresults'"=="yes" { putexcel set EDMresults.xlsx, replace * Save SP results capture qui { foreach dval of numlist `drange' { putexcel set EDMresults.xlsx, sheet("SP_d`dval'") modify putexcel A1 = "Simplex projection results using `useresults' and differencing at d`dval' (d0 = raw data, no differencing; d1 = first-differencing; etc.)" putexcel A3 = "No results exist, check data and/or EDM input" mata: spresults = `eminimum'::`emaximum' loc list = "E" foreach var of varlist `variables' { capture confirm variable SPlong_`useresults'_`var' if !_rc { mata: mm = J(0,1,.) foreach e of numlist `eminimum'/`emaximum' { qui sum SPlong_`useresults'_`var' if SPlong_d_`var' == `dval' & SPlong_e_`var' == `e', meanonly loc mn = r(mean) mata: mm = mm \ `mn' } mata: spresults = spresults , mm loc list = "`list' `var'" } } } mata: st_matrix("spresults", spresults) matrix colnames spresults = `list' putexcel A2 = matrix(spresults), names loc num = `emaximum'+2 loc num2 = `num'+1 putexcel A`num' = "Note: results are averaged across panels if running 'multiple EDM' with N>1" putexcel A`num2' = "for additional information such as sample sizes or SDs see the SP variables saved to the data file" } * Save SP hypothesis test results capture qui { putexcel set EDMresults.xlsx, sheet("SP_test") modify putexcel A1 = "SP hypothesis test for prediction accuracy based on `testval' against permuted/surrogate data replications as the null distribution" putexcel A3 = "No results exist, check data and/or EDM input" mata: SPtest = J(7,0,.) loc list = "" preserve if "`testval'"=="rho" loc pctiles = "50, 95, 99" else if "`testval'"=="mae" loc pctiles = "1, 5, 50" foreach var of varlist `variables' { tempname SPpct`var' mata: `SPpct`var'' = J(0,5,.) loc `var'counter=0 loc `var't95counter=0 loc `var't99counter=0 foreach id of numlist `panelrange' { if "`emax`var'`id''"!="" { if `emax`var'`id''>0 { loc `var'counter=``var'counter'+1 capture { edm explore `prestd'd`diff'.`var' if `panelvar'==`id' ``var'dropzes', e(`emax`var'`id'') full `forcing' `time_gap' `missings' extra(`extras`diff'' `extras_raw') `tweight' `mdistance' reportrawe `options' mat `Amat' = e(explore_result) if "`testval'"=="rho" sca `Test1' = `Amat'[1,3] else if "`testval'"=="mae" sca `Test1' = `Amat'[1,4] _pctile SPtestl_`testval'_`var' if SPtestl_id_`var'==`id', percentiles(`pctiles') if "`testval'"=="rho" & `Test1'-r(r2) > 0 loc `var't95counter=``var't95counter'+1 if "`testval'"=="rho" & `Test1'-r(r3) > 0 loc `var't99counter=``var't99counter'+1 if "`testval'"=="mae" & `Test1'-r(r2) < 0 loc `var't95counter=``var't95counter'+1 if "`testval'"=="mae" & `Test1'-r(r1) < 0 loc `var't99counter=``var't99counter'+1 mata: `Test1'=st_numscalar("`Test1'") mata: `A' = J(1,4,`id') mata: `B' = (`pctiles', 110) mata: `C' = (st_numscalar("r(r1)"), st_numscalar("r(r2)"), st_numscalar("r(r3)"), `Test1') mata: `C2' = (`Test1'-st_numscalar("r(r1)"), `Test1'-st_numscalar("r(r2)"), `Test1'-st_numscalar("r(r3)"), 0) mata: `D' = J(1,4,st_numscalar("e(N)")) mata: `E' = `A'\ `B'\ `C'\ `C2'\ `D' mata: `SPpct`var''=`SPpct`var''\ `E'' } } } } capture quietly { getmata (SPpct`var'*)=`SPpct`var'', force double label variable SPpct`var'1 "SP Percentile panel ID" rename SPpct`var'1 SPpct`var' label variable SPpct`var'2 "SP Percentile" rename SPpct`var'2 SPpct`var'_pctile label variable SPpct`var'3 "SP test values" rename SPpct`var'3 SPpct`var'_vals label variable SPpct`var'4 "SP test Values Minus Observed" rename SPpct`var'4 SPpct`var'_valsc label variable SPpct`var'5 "SP Percentile N" rename SPpct`var'5 SPpct`var'_N } capture { if "`testval'"=="rho" loc fivepct = 95 else if "`testval'"=="mae" loc fivepct = 5 if "`testval'"=="rho" loc onepct = 99 else if "`testval'"=="mae" loc onepct = 1 if "`testval'"=="rho" loc onesuffix = "th" else if "`testval'"=="mae" loc onesuffix = "st" if "`testval'" == "rho" loc z="&rho" else if "`testval'" == "mae" loc z="stSerif:MAE" loc pct95rnd = round(100*``var't95counter'/``var'counter', 1) loc pct95rnd2: di %3.0f `pct95rnd' loc pct99rnd = round(100*``var't99counter'/``var'counter', 1) loc pct99rnd2: di %3.0f `pct99rnd' sum SPpct`var'_vals if SPpct`var'_pctile==110, meanonly loc testvalGM=r(mean) loc `var'SPfullresult = r(mean) loc testvalGMr=round(`testvalGM',.001) loc testvalGMd: di %8.3f `testvalGMr' sum SPpct`var'_vals if SPpct`var'_pctile==`onepct', meanonly loc onepctGM=r(mean) loc onepctGMr=round(`onepctGM',.001) loc onepctGMd: di %8.3f `onepctGMr' sum SPpct`var'_vals if SPpct`var'_pctile==`fivepct' , meanonly loc fivepctGM=r(mean) loc fivepctGMr=round(`fivepctGM',.001) loc fivepctGMd: di %8.3f `fivepctGMr' sum SPpct`var'_vals if SPpct`var'_pctile==50, meanonly loc nullmeanGM=r(mean) loc nullmeanGMr=round(`nullmeanGM',.001) loc nullmeanGMd: di %6.3f `nullmeanGMr' } mata: mm = `testvalGMd' \ `nullmeanGMd' \ `fivepctGMd' \ `onepctGMd' \ `pct95rnd2' \ `pct99rnd2' \ ``var'counter' mata: SPtest = SPtest , mm loc list = "`list' `var'" } mata: st_matrix("SPtest", SPtest) matrix colnames SPtest = `list' matrix rownames SPtest = "Observed `testval'" "Permuted mean" "Permuted `fivepct'%" "Permuted `onepct'%" "p<.05 percentage" "p<.01 percentage" "Number of panels" putexcel A2 = matrix(SPtest), names putexcel A10 = "Note: the 'observed' and permuted SP values are for the entire sample using the 'full' option with leave-one-out cross-validation" putexcel A11 = "The p<.05 and p<.01 percentages show what percentage of panel IDs reject this null H0" putexcel A12 = "So for N=1 or 'multispatial EDM' with N>1 cases this is either non-significant and 0 or significant and 100" putexcel A13 = "Note also that for 'multiple EDM' with N>1 the coefficient percentiles are computed for each panel ID and are then averaged" putexcel A14 = "For additional information see the permutation-based replications in SPtest variables" restore } * Save tp results capture qui { putexcel set EDMresults.xlsx, sheet("TP") modify putexcel A1 = "Time of prediction or tp results using `useresults'" putexcel A3 = "No results exist, check data and/or EDM input" mata: tpresults = 1::`TPmax' loc list = "tp" foreach var of varlist `variables' { capture confirm variable TPlong_`useresults'_`var' if !_rc { mata: mm = J(0,1,.) forval i=1/`TPmax' { qui sum TPlong_`useresults'_`var' if TPlong_tp_`var' == `i', meanonly loc mn = r(mean) mata: mm = mm \ `mn' } mata: tpresults mata: mm mata: tpresults = tpresults , mm loc list = "`list' `var'" } } mata: st_matrix("tpresults", tpresults) matrix colnames tpresults = `list' putexcel A2 = matrix(tpresults), names loc num = `TPmax'+3 putexcel A`num' = "Note: results are averaged across panels if running 'multiple EDM' with N>1, for additional information see the saved TP variables" } * Save S-map results capture qui { putexcel set EDMresults.xlsx, sheet("Smap") modify putexcel A1 = "S-map results using `useresults' at different theta values" putexcel A3 = "No results exist, check data and/or EDM input" local numtemp: word count `theta' mata: SMAPresults = J(0,1,.) foreach i of numlist `theta' { mata: SMAPresults = SMAPresults \ `i' } loc list = "theta" foreach var of varlist `variables' { capture confirm variable SMAPlong_`useresults'_`var' if !_rc { mata: mm = J(0,1,.) foreach i of numlist `theta' { qui sum SMAPlong_`useresults'_`var' if SMAPlong_theta_`var' == `i', meanonly loc mn = r(mean) mata: mm = mm \ `mn' } mata: SMAPresults = SMAPresults , mm loc list = "`list' `var'" } } mata: st_matrix("SMAPresults", SMAPresults) matrix colnames SMAPresults = `list' putexcel A2 = matrix(SMAPresults), names loc num = `numtemp'+3 loc num2 = `num'+1 putexcel A`num' = "Note: results are averaged across panels if running 'multiple EDM' with N>1, for additional information see the saved SMAP variables" } * Save S-map hypothesis test results capture qui { putexcel set EDMresults.xlsx, sheet("Smap_test") modify putexcel A1 = "S-map hypothesis test for nonlinearity comparing theta=0 to optimal theta>0 based on `testval'" putexcel A3 = "No results exist, check data and/or EDM input" mata: SMAPtest = J(6,0,.) loc list = "" preserve if "`testval'"=="rho" loc pctiles = "1, 5, 50" else if "`testval'"=="mae" loc pctiles = "50, 95, 99" foreach var of varlist `variables' { loc `var'counter=0 loc `var't95counter=0 loc `var't99counter=0 tempname SMAPpct`var' tempvar SMAPtestval`var' mata: `SMAPpct`var'' = J(0,3,.) gen `SMAPtestval`var''=. capture { qui bysort SMAPtestl_id_`var' SMAPtestl_rep_`var' (SMAPtestl_theta_`var'): replace `SMAPtestval`var''=SMAPtestl_`testval'_`var'-SMAPtestl_`testval'_`var'[_n-1] if SMAPtestl_theta_`var' > 0 & SMAPtestl_theta_`var'[_n-1]==0 & SMAPtestl_theta_`var'!=. } foreach id of numlist `panelrange' { loc `var'counter=``var'counter'+1 capture { _pctile `SMAPtestval`var'' if SMAPtestl_id_`var'==`id', percentiles(`pctiles') if "`testval'"=="rho" & r(r1) > 0 loc `var't99counter=``var't99counter'+1 if "`testval'"=="rho" & r(r2) > 0 loc `var't95counter=``var't95counter'+1 if "`testval'"=="mae" & r(r2) < 0 loc `var't95counter=``var't95counter'+1 if "`testval'"=="mae" & r(r3) < 0 loc `var't99counter=``var't99counter'+1 mata: `A' = J(1,3,`id') mata: `B' = (`pctiles') mata: `C' = (st_numscalar("r(r1)"), st_numscalar("r(r2)"), st_numscalar("r(r3)")) mata: `E' = `A'\ `B'\ `C' mata: `SMAPpct`var''=`SMAPpct`var''\ `E'' } } drop `SMAPtestval`var'' capture quietly { getmata (SMAPpct`var'*)=`SMAPpct`var'', force double label variable SMAPpct`var'1 "SMAP Percentile panel ID" rename SMAPpct`var'1 SMAPpct`var' label variable SMAPpct`var'2 "SMAP Percentile" rename SMAPpct`var'2 SMAPpct`var'_pctile label variable SMAPpct`var'3 "SMAP test values" rename SMAPpct`var'3 SMAPpct`var'_vals } capture { if "`testval'"=="mae" loc fivepct = 95 else if "`testval'"=="rho" loc fivepct = 5 if "`testval'"=="mae" loc onepct = 99 else if "`testval'"=="rho" loc onepct = 1 if "`testval'"=="mae" loc onesuffix = "th" else if "`testval'"=="rho" loc onesuffix = "st" if "`testval'" == "rho" loc z="&rho" else if "`testval'" == "mae" loc z="stSerif:MAE" loc pct95rnd = round(100*``var't95counter'/``var'counter', 1) loc pct95rnd2: di %3.0f `pct95rnd' loc pct99rnd = round(100*``var't99counter'/``var'counter', 1) loc pct99rnd2: di %3.0f `pct99rnd' sum SMAPpct`var'_vals if SMAPpct`var'_pctile==`onepct', meanonly loc onepctGM=r(mean) loc onepctGMr=round(`onepctGM',.0001) loc onepctGMd: di %9.4f `onepctGMr' sum SMAPpct`var'_vals if SMAPpct`var'_pctile==`fivepct' , meanonly loc fivepctGM=r(mean) loc fivepctGMr=round(`fivepctGM',.0001) loc fivepctGMd: di %9.4f `fivepctGMr' sum SMAPpct`var'_vals if SMAPpct`var'_pctile==50, meanonly loc nullmeanGM=r(mean) loc nullmeanGMr=round(`nullmeanGM',.0001) loc nullmeanGMd: di %9.4f `nullmeanGMr' } mata: mm = `nullmeanGMd' \ `fivepctGMd' \ `onepctGMd' \ `pct95rnd2' \ `pct99rnd2' \ ``var'counter' mata: SMAPtest = SMAPtest , mm loc list = "`list' `var'" } mata: st_matrix("SMAPtest", SMAPtest) matrix colnames SMAPtest = `list' matrix rownames SMAPtest = "Mean difference" "`fivepct'% difference" "`onepct'% difference" "p<.05 percentage" "p<.01 percentage" "Number of panels" putexcel A2 = matrix(SMAPtest), names putexcel A9 = "Note: values are `testval' at best theta > 0 minus `testval' at theta = 0; for rho positive values support nonlinearity but for MAE negative values support nonlinearity" putexcel A10 = "Note: the p<.05 and p<.01 percentages show what percentage of panel IDs reject this null H0" putexcel A11 = "For N=1 or 'multispatial EDM' with N>1 cases this is either non-significant and 0 or significant and 100" putexcel A12 = "For additional information see replications in SMAPtestl variables" restore } * Save coprediction results as a square but non-symmetric correlation matrix capture qui { putexcel set EDMresults.xlsx, sheet("Copredict") modify putexcel A1 = "Coprediction results as a square but non-symmetric matrix of `useresults' values" putexcel A3 = "No results exist, check data and/or EDM input" local numtemp: word count `variables' mata: Copred = J(`numtemp',`numtemp',.) forval n=1/`numtemp' { forval m=1/`numtemp' { qui sum CoPredLong_`useresults' if CoPredLong_Predictor == `n' & CoPredLong_Predictee == `m', meanonly loc mn = r(mean) mata: Copred[`m',`n']=`mn' } } mata: st_matrix("Copred", Copred) matrix colnames Copred = `variables' matrix rownames Copred = `variables' putexcel A2 = matrix(Copred), names loc num = `numtemp'+3 loc num2 = `num'+1 putexcel A`num' = "Note: results are averaged across panels if running 'multiple EDM' with N>1, for additional information see saved CoPred variables" putexcel A`num2' = "Predictor is columns and predictees are row" } * Save CCM results as a matrix (predictors are columns, outcomes are rows) with Simplex Projection results on the diagonal capture qui { preserve gen long `obsnO'=_n putexcel set EDMresults.xlsx, sheet("CMM_results_matrix") modify putexcel A1 = "CCM coprediction results as a square but non-symmetric matrix of `useresults' values, with SP results on diagonal" putexcel A3 = "No results exist, check data and/or EDM input" local numtemp: word count `variables' mata: CCM_matrix = J(`numtemp',`numtemp',.) loc n = 0 foreach var1 of varlist `variables' { loc n = `n'+1 loc m = 0 foreach var2 of varlist `variables' { loc m = `m'+1 capture confirm variable CCM_`useresults'_`var1'_`var2' if !_rc { * Find max library size L for each panel ID and each lag, so only max L results are plotted gen temp12`var1'`var2'lagz00 = . foreach id of numlist `panelrange' { summarize CCM_L_`var1'_`var2' if CCM_id_`var1'_`var2'==`id' & CCM_lag_`var1'_`var2'==0 & !missing(CCM_`testval'_`var1'_`var2'), meanonly loc levelz=r(max) levelsof `obsnO' if CCM_L_`var1'_`var2' == `levelz' & CCM_id_`var1'_`var2'==`id' & CCM_lag_`var1'_`var2'==0 replace temp12`var1'`var2'lagz00 = CCM_`testval'_`var1'_`var2' if _n ==`r(levels)' } sum temp12`var1'`var2'lagz00 loc mn = r(mean) mata: CCM_matrix[`m',`n']=`mn' } else if `m' == `n' { loc mn = ``var2'SPfullresult' mata: CCM_matrix[`m',`n']=`mn' } } } mata: st_matrix("CCM_matrix", CCM_matrix) matrix colnames CCM_matrix = `variables' matrix rownames CCM_matrix = `variables' putexcel A2 = matrix(CCM_matrix), names loc num = `numtemp'+3 loc num2 = `num'+1 loc num3 = `num'+2 putexcel A`num' = "Cause is columns and outcomes are rows; for comparison the on-diagonal elements are SP results for full sample (variable predicts its own future)" putexcel A`num2' = "Note: results are averaged across panels if running 'multiple EDM' with N>1" putexcel A`num3' = "Note: For additional information see the CCM variables saved to the data file including CCM_lag variables" restore } * Save CCM results at the requested lags and leads capture qui { preserve gen long `obsnO'=_n loc numtemp = `CCMlags'+`CCMleads'+1 loc list = "Lag" mata: CCMlags = -`CCMlags'::`CCMleads' putexcel set EDMresults.xlsx, sheet("CCM_lags") modify putexcel A1 = "CCM prediction accuracy using `useresults' at the requested lag levels (negative lags imply an impossible backwards in time cause-effect relationship)" putexcel A3 = "No results exist, check data and/or EDM input" foreach var1 of varlist `variables' { foreach var2 of varlist `variables' { capture confirm variable CCM_`useresults'_`var1'_`var2' if !_rc { * Find max library size L for each panel ID and each lag, so only max L results are plotted gen temp12`var1'`var2'lagz00 = . mata: mm = J(0,1,.) foreach id of numlist `panelrange' { summarize CCM_L_`var1'_`var2' if CCM_id_`var1'_`var2'==`id' & CCM_lag_`var1'_`var2'==0 & !missing(CCM_`testval'_`var1'_`var2'), meanonly loc levelz=r(max) levelsof `obsnO' if CCM_L_`var1'_`var2' == `levelz' & CCM_id_`var1'_`var2'==`id' & CCM_lag_`var1'_`var2'==0 replace temp12`var1'`var2'lagz00 = CCM_`testval'_`var1'_`var2' if _n ==`r(levels)' } forval i = -`CCMlags'(1)`CCMleads' { if `i' == 0 { sum temp12`var1'`var2'lagz00 loc mn = r(mean) mata: mm = mm \ `mn' } else if `i'!=0 { capture quietly sum CCM_`testval'_`var1'_`var2' if CCM_lag_`var1'_`var2'==`i' loc mn = r(mean) mata: mm = mm \ `mn' } } mata: CCMlags mata: mm mata: CCMlags = CCMlags , mm loc list = "`list' `var1'_cause_`var2'" } } } mata: st_matrix("CCMlags", CCMlags) matrix colnames CCMlags = `list' putexcel A2 = matrix(CCMlags), names loc num = `numtemp'+3 loc num2 = `num'+1 loc num3 = `num'+2 loc num4 = `num'+3 putexcel A`num' = "Note: For additional information see the CCM variables saved to the data file including CCM_lag variables" putexcel A`num2' = "Positive lags are forward in time, so should make worse predictions, but if prediction improves this implies non-causal relationships" putexcel A`num3' = "See Ye et al. (2015) Distinguishing time-delayed causal interactions using convergent cross mapping" restore } * Save CCM hypothesis test of convergence using replications option capture qui { putexcel set EDMresults.xlsx, sheet("CCM_ctest") modify putexcel A1 = "CCM hypothesis test for convergence from minimum library size L using `testval' with `CCMctestreps' replications" putexcel A3 = "No results exist, perhaps due to optimal predictions occuring at a forward lag with StepwiseCCMtest=yes" mata: ctest = J(7,0,.) loc list = "" preserve gen long `obsnO'=_n if "`testval'"=="rho" loc pctiles = "50, 95, 99" else if "`testval'"=="mae" loc pctiles = "1, 5, 50" foreach var1 of varlist `variables' { foreach var2 of varlist `variables' { capture confirm variable CCMctest_id_`var1'_`var2' if !_rc { tempname CCML2pct`var1'`var2' mata: `CCML2pct`var1'`var2'' = J(0,5,.) loc `var1'`var2'tcounter=0 loc `var1'`var2't95counter=0 loc `var1'`var2't99counter=0 quietly levelsof CCMctest_id_`var1'_`var2', local(levels) foreach id of local levels { loc `var1'`var2'tcounter=``var1'`var2'tcounter'+1 tempname CCML2`var1'`var2'`id' capture quietly { summarize CCM_L_`var1'_`var2' if CCM_id_`var1'_`var2'==`id' & CCM_lag_`var1'_`var2'==0 & !missing(CCM_`testval'_`var1'_`var2'), meanonly loc levelz=r(max) levelsof `obsnO' if CCM_L_`var1'_`var2' == `levelz' & CCM_id_`var1'_`var2'==`id' & CCM_lag_`var1'_`var2'==0 & !missing(CCM_`testval'_`var1'_`var2') sca `CCML2`var1'`var2'`id''=CCM_`testval'_`var1'_`var2'[`r(levels)'] } capture { _pctile CCMctest_`testval'_`var1'_`var2' if CCMctest_id_`var1'_`var2'==`id', percentiles(`pctiles') if "`testval'"=="rho" & `CCML2`var1'`var2'`id''-r(r2) > 0 loc `var1'`var2't95counter=``var1'`var2't95counter'+1 if "`testval'"=="rho" & `CCML2`var1'`var2'`id''-r(r3) > 0 loc `var1'`var2't99counter=``var1'`var2't99counter'+1 if "`testval'"=="mae" & `CCML2`var1'`var2'`id''-r(r2) < 0 loc `var1'`var2't95counter=``var1'`var2't95counter'+1 if "`testval'"=="mae" & `CCML2`var1'`var2'`id''-r(r1) < 0 loc `var1'`var2't99counter=``var1'`var2't99counter'+1 mata: `CCML2`var1'`var2'`id''=st_numscalar("`CCML2`var1'`var2'`id''") mata: `A' = J(1,4,`id') mata: `B' = (`pctiles', 110) mata: `C' = (st_numscalar("r(r1)"), st_numscalar("r(r2)"), st_numscalar("r(r3)"), `CCML2`var1'`var2'`id'') mata: `C2' = (`CCML2`var1'`var2'`id''-st_numscalar("r(r1)"), `CCML2`var1'`var2'`id''-st_numscalar("r(r2)"), `CCML2`var1'`var2'`id''-st_numscalar("r(r3)"), 0) capture confirm scalar e(N) if !_rc { mata: `D' = J(1,4,st_numscalar("e(N)")) } else { mata: `D' = J(1,4,.) } mata: `E' = `A'\ `B'\ `C'\ `C2'\ `D' mata: `CCML2pct`var1'`var2''=`CCML2pct`var1'`var2''\ `E'' } } capture { getmata (CCML2pct`var1'`var2'*)=`CCML2pct`var1'`var2'', force double label variable CCML2pct`var1'`var2'1 "CCM convergence test Percentile panel ID" rename CCML2pct`var1'`var2'1 CCML2pct`var1'`var2' label variable CCML2pct`var1'`var2'2 "CCM convergence test Percentile" rename CCML2pct`var1'`var2'2 CCML2pct`var1'`var2'_pctile label variable CCML2pct`var1'`var2'3 "CCM convergence test values" rename CCML2pct`var1'`var2'3 CCML2pct`var1'`var2'_vals label variable CCML2pct`var1'`var2'4 "CCM convergence test Values Minus Observed" rename CCML2pct`var1'`var2'4 CCML2pct`var1'`var2'_valsc label variable CCML2pct`var1'`var2'5 "CCM convergence test Percentile N" rename CCML2pct`var1'`var2'5 CCML2pct`var1'`var2'_N } capture { if "`testval'"=="rho" loc fivepct = 95 else if "`testval'"=="mae" loc fivepct = 5 if "`testval'"=="rho" loc onepct = 99 else if "`testval'"=="mae" loc onepct = 1 if "`testval'"=="rho" loc onesuffix = "th" else if "`testval'"=="mae" loc onesuffix = "st" if "`testval'" == "rho" loc z="&rho" else if "`testval'" == "mae" loc z="stSerif:MAE" loc pct95rnd = round(100*``var1'`var2't95counter'/``var1'`var2'tcounter', 1) loc pct95rnd2: di %3.0f `pct95rnd' loc pct99rnd = round(100*``var1'`var2't99counter'/``var1'`var2'tcounter', 1) loc pct99rnd2: di %3.0f `pct99rnd' sum CCML2pct`var1'`var2'_vals if CCML2pct`var1'`var2'_pctile==110, meanonly loc testvalGM=r(mean) loc testvalGMr=round(`testvalGM',.001) loc testvalGMd: di %8.3f `testvalGMr' sum CCML2pct`var1'`var2'_vals if CCML2pct`var1'`var2'_pctile==`onepct', meanonly loc onepctGM=r(mean) loc onepctGMr=round(`onepctGM',.001) loc onepctGMd: di %8.3f `onepctGMr' sum CCML2pct`var1'`var2'_vals if CCML2pct`var1'`var2'_pctile==`fivepct' , meanonly loc fivepctGM=r(mean) loc fivepctGMr=round(`fivepctGM',.001) loc fivepctGMd: di %8.3f `fivepctGMr' sum CCML2pct`var1'`var2'_vals if CCML2pct`var1'`var2'_pctile==50, meanonly loc nullmeanGM=r(mean) loc nullmeanGMr=round(`nullmeanGM',.001) loc nullmeanGMd: di %6.3f `nullmeanGMr' } mata: mm = `testvalGMd' \ `nullmeanGMd' \ `fivepctGMd' \ `onepctGMd' \ `pct95rnd2' \ `pct99rnd2' \ ``var1'`var2'tcounter' mata: ctest = ctest , mm loc list = "`list' `var1'_cause_`var2'" } } } mata: st_matrix("ctest", ctest) matrix colnames ctest = `list' matrix rownames ctest = "Observed `testval'" "Replicated mean" "Replicated `fivepct'%" "Replicated `onepct'%" "p<.05 percentage" "p<.01 percentage" "Number of panels" putexcel A2 = matrix(ctest), names putexcel A10 = "Note: the p<.05 and p<.01 percentages show what percentage of panel IDs reject this null H0" putexcel A11 = "For N=1 or 'multispatial EDM' with N>1 cases this is either non-significant and 0 or significant and 100" putexcel A12 = "Note also that for 'multiple EDM' with N>1 the coefficient percentiles are computed for each panel ID and are then averaged" putexcel A13 = "For additional information see replications in CCMctest variables" restore } * Save hypothesis test of rho or mae (`testval') at max L capture qui { putexcel set EDMresults.xlsx, sheet("CCM_Ltest") modify putexcel A1 = "CCM hypothesis test for `testval' at maximum library size L test using `CCMLtestreps' permuted/surrogate data replications" putexcel A3 = "No results exist, perhaps due to previous hypothesis tests being non-significant with StepwiseCCMtest=yes" mata: Ltest = J(7,0,.) loc list = "" preserve gen long `obsnO' = _n if "`testval'"=="rho" loc pctiles = "50, 95, 99" else if "`testval'"=="mae" loc pctiles = "1, 5, 50" foreach var1 of varlist `variables' { foreach var2 of varlist `variables' { capture confirm variable CCMLtest_id_`var1'_`var2' if !_rc { mata: mm = J(0,1,.) tempname CCMLpct`var1'`var2' mata: `CCMLpct`var1'`var2'' = J(0,5,.) loc `var1'`var2'tcounter=0 loc `var1'`var2't95counter=0 loc `var1'`var2't99counter=0 quietly levelsof CCMLtest_id_`var1'_`var2', local(levels) foreach id of local levels { quietly summarize CCM_L_`var1'_`var2' if CCM_id_`var1'_`var2'==`id' & CCM_lag_`var1'_`var2'==0 & !missing(CCM_`testval'_`var1'_`var2'), meanonly loc levelz=r(max) levelsof `obsnO' if CCM_L_`var1'_`var2' == `levelz' & CCM_id_`var1'_`var2'==`id' & CCM_lag_`var1'_`var2'==0 tempname CCML`var1'`var2'`id' sca `CCML`var1'`var2'`id''=CCM_`testval'_`var1'_`var2'[`r(levels)'] loc `var1'`var2'tcounter=``var1'`var2'tcounter'+1 _pctile CCMLtest_`testval'_`var1'_`var2' if CCMLtest_id_`var1'_`var2'==`id', percentiles(`pctiles') if "`testval'"=="rho" & `CCML`var1'`var2'`id''-r(r2) > 0 loc `var1'`var2't95counter=``var1'`var2't95counter'+1 if "`testval'"=="rho" & `CCML`var1'`var2'`id''-r(r3) > 0 loc `var1'`var2't99counter=``var1'`var2't99counter'+1 if "`testval'"=="mae" & `CCML`var1'`var2'`id''-r(r2) < 0 loc `var1'`var2't95counter=``var1'`var2't95counter'+1 if "`testval'"=="mae" & `CCML`var1'`var2'`id''-r(r1) < 0 loc `var1'`var2't99counter=``var1'`var2't99counter'+1 mata: `CCML`var1'`var2'`id''=st_numscalar("`CCML`var1'`var2'`id''") mata: `A' = J(1,4,`id') mata: `B' = (`pctiles', 110) mata: `C' = (st_numscalar("r(r1)"), st_numscalar("r(r2)"), st_numscalar("r(r3)"), `CCML`var1'`var2'`id'') mata: `C2' = (`CCML`var1'`var2'`id''-st_numscalar("r(r1)"), `CCML`var1'`var2'`id''-st_numscalar("r(r2)"), `CCML`var1'`var2'`id''-st_numscalar("r(r3)"), 0) capture confirm scalar e(N) if !_rc { mata: `D' = J(1,4,st_numscalar("e(N)")) } else { mata: `D' = J(1,4,.) } mata: `E' = `A'\ `B'\ `C'\ `C2'\ `D' mata: `CCMLpct`var1'`var2''=`CCMLpct`var1'`var2''\ `E'' } capture quietly { getmata (CCMLpct`var1'`var2'*)=`CCMLpct`var1'`var2'', force double label variable CCMLpct`var1'`var2'1 "CCM max L test Percentile panel ID" rename CCMLpct`var1'`var2'1 CCMLpct`var1'`var2' label variable CCMLpct`var1'`var2'2 "CCM max L test Percentile" rename CCMLpct`var1'`var2'2 CCMLpct`var1'`var2'_pctile label variable CCMLpct`var1'`var2'3 "CCM max L test values" rename CCMLpct`var1'`var2'3 CCMLpct`var1'`var2'_vals label variable CCMLpct`var1'`var2'4 "CCM max L test Values Minus Observed" rename CCMLpct`var1'`var2'4 CCMLpct`var1'`var2'_valsc label variable CCMLpct`var1'`var2'5 "CCM max L Percentile N" rename CCMLpct`var1'`var2'5 CCMLpct`var1'`var2'_N } capture { if "`testval'"=="rho" loc fivepct = 95 else if "`testval'"=="mae" loc fivepct = 5 if "`testval'"=="rho" loc onepct = 99 else if "`testval'"=="mae" loc onepct = 1 if "`testval'"=="rho" loc onesuffix = "th" else if "`testval'"=="mae" loc onesuffix = "st" if "`testval'" == "rho" loc z="&rho" else if "`testval'" == "mae" loc z="stSerif:MAE" loc pct95rnd = round(100*``var1'`var2't95counter'/``var1'`var2'tcounter', 1) loc pct95rnd2: di %3.0f `pct95rnd' loc pct99rnd = round(100*``var1'`var2't99counter'/``var1'`var2'tcounter', 1) loc pct99rnd2: di %3.0f `pct99rnd' sum CCMLpct`var1'`var2'_vals if CCMLpct`var1'`var2'_pctile==110, meanonly loc testvalGM=r(mean) loc testvalGMr=round(`testvalGM',.001) loc testvalGMd: di %8.3f `testvalGMr' sum CCMLpct`var1'`var2'_vals if CCMLpct`var1'`var2'_pctile==`onepct', meanonly loc onepctGM=r(mean) loc onepctGMr=round(`onepctGM',.001) loc onepctGMd: di %8.3f `onepctGMr' sum CCMLpct`var1'`var2'_vals if CCMLpct`var1'`var2'_pctile==`fivepct' , meanonly loc fivepctGM=r(mean) loc fivepctGMr=round(`fivepctGM',.001) loc fivepctGMd: di %8.3f `fivepctGMr' sum CCMLpct`var1'`var2'_vals if CCMLpct`var1'`var2'_pctile==50, meanonly loc nullmeanGM=r(mean) loc nullmeanGMr=round(`nullmeanGM',.001) loc nullmeanGMd: di %6.3f `nullmeanGMr' } mata: mm = `testvalGMd' \ `nullmeanGMd' \ `fivepctGMd' \ `onepctGMd' \ `pct95rnd2' \ `pct99rnd2' \ ``var1'`var2'tcounter' mata: Ltest = Ltest , mm loc list = "`list' `var1'_cause_`var2'" } } } mata: st_matrix("Ltest", Ltest) matrix colnames Ltest = `list' matrix rownames Ltest = "Observed `testval'" "Permuted mean" "Permuted `fivepct'%" "Permuted `onepct'%" "p<.05 percentage" "p<.01 percentage" "Number of panels" putexcel A2 = matrix(Ltest), names putexcel A10 = "Note: the p<.05 and p<.01 percentages show what percentage of panel IDs reject this null H0" putexcel A11 = "For N=1 or 'multispatial EDM' with N>1 cases this is either non-significant and 0 or significant and 100" putexcel A12 = "Note also that for 'multiple EDM' with N>1 the coefficient percentiles are computed for each panel ID and are then averaged" putexcel A13 = "For additional information see the permutation-based replications in CCMLtest variables" putexcel A14 = "If StepwiseCCMtest=yes then sample sizes can be smaller compared to convergence test, thus average `testval' reported may differ" restore } * Save average CCM SMAP Results capture qui { putexcel set EDMresults.xlsx, sheet("CCM_SMAP") modify putexcel A1 = "CCM S-map coefficients averaged across time with 5th and 95th percentiles, mean, and SD values" putexcel A3 = "No results exist, perhaps due to previous hypothesis tests being non-significant with StepwiseCCMtest=yes" loc list = "Lag" loc j = 0 loc jj = 0 foreach var1 of varlist `variables' { quietly summarize SMAP_e_`var1', meanonly if `jj' < r(max) loc jj = r(max) foreach var2 of varlist `variables' { capture confirm variable CCM_SMAP_`var1'_`var2'_b0 if !_rc { quietly summarize CCMLtest_e_`var1'_`var2', meanonly loc CCMLeval`var1'`var2' = r(max) if `j' < r(max) loc j = r(max) } } } mata: CCMSMAPresults = 0::`jj'-1 mata: CCMSMAPresults[1,1]=0 foreach var1 of varlist `variables' { foreach var2 of varlist `variables' { loc h = 0 mata: mm = J(0,4,.) capture confirm variable CCM_SMAP_`var1'_`var2'_b0 if !_rc { forval i=1/`j' { if `CCMLeval`var1'`var2'' >= `i' { loc labeldt : variable label CCM_SMAP_`var1'_`var2'_b`i' loc first=substr("`labeldt'",1,2) if `i' == 1 | (`i' > `extraz'+1 & "`first'"!="dt") { loc h = `h'+1 qui sum CCM_SMAP_`var1'_`var2'_b`i', detail loc fpct = r(p5) loc mn = r(mean) loc npct = r(p95) loc sd = r(sd) mata: mm mata: mm = mm \ `fpct' , `mn' , `npct' , `sd' mata: mm } } } if `h' < `jj' { loc bb = `jj' - `h' forval i= 1/`bb' { mata: mm = mm \ .,.,.,. } } mata: CCMSMAPresults = CCMSMAPresults , mm loc list = "`list' `var1'_cause_`var2'_5% `var1'_cause_`var2'_mean `var1'_cause_`var2'_95% `var1'_cause_`var2'_SD" } } } mata: st_matrix("CCMSMAPresults", CCMSMAPresults) matrix colnames CCMSMAPresults = `list' matrix colnames CCMSMAPresults = `list' putexcel A2 = matrix(CCMSMAPresults), names loc num = `jj'+3 putexcel A`num' = "Note: For additional information see the CCM_SMAP variables saved to the data file" } } /* di "************************************************************************************" di "* RESULTS ARE SAVED TO FILE EDMresults.xlsx *" di "* FOR ADDITIONAL INFO/RESULTS SEE NEW SAVED VARIABLES *" di "* SORRY FOR ALL THE PLOTS -- I GOT A BIT CARRIED AWAY *" di "* ALSO PLEASE CALL YOUR ELECTED OFFICIALS TO VOICE CONCERNS ABOUT GLOBAL WARMING *" di "************************************************************************************" */