The core component of all four of these analyses (ANOVA, ANCOVA, MANOVA, and MANCOVA) is the first in the list, the ANOVA. An "Analysis of Variance" (ANOVA) tests three or more groups for mean differences based on a continuous (i.e. scale or interval) response variable (a.k.a. dependent variable). The term "factor" refers to the variable that distinguishes this group membership. Race, level of education, and treatment condition are examples of factors.

There are two main types of ANOVA: (1) a "one-way" ANOVA compares levels (i.e. groups) of a single factor based on a single continuous response variable (e.g. comparing test score by 'level of education') and (2) a "two-way" ANOVA compares levels of two or more factors for mean differences on a single continuous response variable (e.g. comparing test score by both 'level of education' and 'zodiac sign'). In practice, you will see one-way ANOVAs more often, and when the term ANOVA is used generically, it often refers to a one-way ANOVA. Henceforth in this blog entry, I use the term ANOVA to refer to the one-way flavor.

One-way ANOVA has one continuous response variable (e.g. Test Score) compared by three or more levels of a factor variable (e.g. Level of Education).

Two-way ANOVA has one continuous response variable (e.g. Test Score) compared by more than one factor variable (e.g. Level of Education and Zodiac Sign).
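To make the one-way case concrete, here is a minimal sketch of the F statistic computed by hand in Python. The scores and group names are made up for illustration, and the helper `one_way_anova_f` is a hypothetical name, not something from SPSS. It only shows the arithmetic behind the test (between-group variance divided by within-group variance); it does not produce a p-value.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    # Between-groups sum of squares: group size times the squared distance
    # of each group mean from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: squared distance of each score
    # from its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Made-up test scores for three levels of the factor "level of education".
high_school = [70, 72, 68, 74]
bachelors = [78, 80, 76, 82]
graduate = [85, 88, 84, 87]

f_stat, df_b, df_w = one_way_anova_f([high_school, bachelors, graduate])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F means the group means differ by more than the spread within the groups would suggest. In practice you would let SPSS (or another package) compute this and report the associated p-value.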

ALSO CHECK OUT: Wikieducator has a nice set of slides explaining the distinctions between one-way and two-way ANOVA.

The obvious difference between ANOVA and ANCOVA is the letter "C", which stands for 'covariance'. Like ANOVA, "Analysis of Covariance" (ANCOVA) has a single continuous response variable. Unlike ANOVA, ANCOVA compares a response variable by both a factor and a continuous independent variable (e.g. comparing test score by both 'level of education' and 'number of hours spent studying'). The term for the continuous independent variable (IV) used in ANCOVA is "covariate".

ANCOVA compares a continuous response variable (e.g. Test Score) by levels of a factor variable (e.g. Level of Education), controlling for a continuous covariate (e.g. Number of Hours Spent Studying).

ANCOVA is also commonly used to describe analyses with a single response variable, continuous IVs, and no factors. Such an analysis is also known as a regression. In fact, you can get almost identical results in SPSS by conducting this analysis using either the "Analyze > Regression > Linear" dialog menus or the "Analyze > General Linear Model (GLM) > Univariate" dialog menus.

A key (but not the only) difference between these methods is that you get slightly different output tables. Also, regression requires that the user dummy-code factors, while GLM handles dummy coding through the "contrasts" option. The linear regression command in SPSS also allows for variable entry in hierarchical blocks (i.e. stages).
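For readers unfamiliar with dummy coding, the idea can be sketched in a few lines of Python. The function name `dummy_code` and the example data are hypothetical; this only illustrates what the 0/1 indicator columns look like, not how SPSS builds them internally.

```python
def dummy_code(values, reference):
    """Return one 0/1 indicator column per level, skipping the reference level."""
    levels = sorted(set(values) - {reference})
    return {level: [1 if v == level else 0 for v in values] for level in levels}

# Made-up factor with three levels; "high school" is the reference group.
education = ["high school", "bachelors", "graduate", "bachelors"]
dummies = dummy_code(education, reference="high school")

for level, column in dummies.items():
    print(level, column)
```

A factor with k levels needs k - 1 dummy variables, because the reference level is represented by all of the dummies being 0.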

The obvious difference between ANOVA and a "Multivariate Analysis of Variance" (MANOVA) is the “M”, which stands for multivariate. In basic terms, a MANOVA is an ANOVA with two or more continuous response variables. Like ANOVA, MANOVA has both a one-way flavor and a two-way flavor. The number of factor variables involved distinguishes a one-way MANOVA from a two-way MANOVA.

One-way MANOVA compares two or more continuous response variables (e.g. Test Score and Annual Income) by a single factor variable (e.g. Level of Education).

Two-way MANOVA compares two or more continuous response variables (e.g. Test Score and Annual Income) by two or more factor variables (e.g. Level of Education and Zodiac Sign).

When comparing two or more continuous response variables by a single factor, a one-way MANOVA is appropriate (e.g. comparing ‘test score’ and ‘annual income’ together by ‘level of education’). A two-way MANOVA also entails two or more continuous response variables, but compares them by at least two factors (e.g. comparing ‘test score’ and ‘annual income’ together by both ‘level of education’ and ‘zodiac sign’).

A more subtle way that MANOVA differs from ANOVA is that MANOVA can compare levels of a factor that has only two levels (a.k.a. binary). When dealing with a single response variable and a binary factor (e.g. gender), one uses an independent-samples t-test. However, a t-test cannot estimate differences for more than one response variable together, so a MANOVA fills that need.
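For the single-response, binary-factor case mentioned above, the independent-samples t statistic can be sketched as follows. This is an illustrative pooled-variance version with made-up scores; `pooled_t` is a hypothetical helper name, and no p-value is computed.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(group_a, group_b):
    """Return (t, df) for an independent-samples t-test with pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df

# Made-up test scores for the two levels of a binary factor.
group_one = [70, 72, 68, 74]
group_two = [78, 80, 76, 82]

t_stat, df = pooled_t(group_one, group_two)
print(f"t({df}) = {t_stat:.2f}")
```

Squaring this pooled t gives the F that a one-way ANOVA would report for the same two groups, which is why the t-test is usually treated as the two-group special case.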

Like ANOVA and ANCOVA, the main difference between MANOVA and MANCOVA is the “C,” which again stands for “covariance.” Both a MANOVA and MANCOVA feature two or more response variables, but the key difference between the two is the nature of the IVs. While a MANOVA can include only factors, an analysis evolves from MANOVA to MANCOVA when one or more covariates are added to the mix.

MANCOVA compares two or more continuous response variables (e.g. Test Scores and Annual Income) by levels of a factor variable (e.g. Level of Education), controlling for a covariate (e.g. Number of Hours Spent Studying).

SPSS NOTE: When running either a MANOVA or MANCOVA, SPSS produces tables that show whether response variables (on the whole) vary by levels of your factor(s). SPSS also produces a table that presents follow-up univariate analyses (i.e. one response variable at a time - ANOVA/ANCOVA). This table shows which response variables in particular vary by level of the factors tested. In most cases, we are only concerned with this table when we find significant differences in the initial multivariate (a.k.a. omnibus) test. In other words, we first determine if our set of response variables differ by levels of our factor(s) and then explore which are driving any significant differences we find.

Imagine we collected a score from every person in your town that measured how much they wanted ice cream at the particular moment of data collection (let's say scores could range from 1 to 100, with 100 meaning REALLY WANT ice cream). Further, let's pretend we did this once a day for 5 days. Our within-subject effect would be a measure of how much individuals in our sample tended to change on their wanting of ice cream over the five days.

Each colored line represents individuals' trend line for change over time in liking of ice cream (each person's within-subject effect). The black line represents the average of the individual trend lines in the sample (which is the sample's within-subject effect).
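The picture described above can be approximated with a short Python sketch: fit a trend line to each person's five daily scores, then average the slopes. The data and names (`scores`, `slope`) are made up for illustration. A real repeated-measures analysis would test this effect formally rather than simply averaging slopes.

```python
from statistics import mean

def slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator

days = [1, 2, 3, 4, 5]
# Made-up "wanting ice cream" scores, one list of five days per person.
scores = {
    "person_a": [40, 45, 50, 55, 60],
    "person_b": [80, 78, 76, 74, 72],
    "person_c": [20, 23, 26, 29, 32],
}

# Each person's own change over time (the colored lines).
individual_slopes = {person: slope(days, s) for person, s in scores.items()}
# The average change in the sample (the black line).
sample_slope = mean(list(individual_slopes.values()))

print(individual_slopes)
print(sample_slope)
```

Here two people want ice cream more as the week goes on and one wants it less, so the sample's average trend is a modest increase per day.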

Between-persons (or between-subjects) effects, by contrast, examine differences between individuals. This can be between groups of cases when the independent variable (IV) is categorical or between individuals when the IV is continuous. These types of effects can be observed in either the univariate context or the multivariate context (including repeated measures). Either way, between-subjects effects determine if respondents differ on the dependent variable (DV), depending on their group (males vs. females, young vs. old, etc.) or depending on their score on a particular continuous IV.

For example, let's return to our ice cream anecdote. If we want to test whether respondents are more likely to want ice cream if they score highly on an IQ test, we are testing for between-subjects effects. In this example, we are seeing if persons with different IQs also have correspondingly different scores for "wanting ice cream". Of course, the correct answer here is obviously yes.

Townspeople with a higher IQ tended to like ice cream more than those with a comparatively lower IQ, as the blue trend line shows. This is a between-subjects effect - it is comparing "ice cream liking" between people with various levels of intelligence (IQ).
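By contrast with a within-subject trend, a between-subjects effect compares people to each other. A minimal sketch, with made-up IQ values and scores: collapse each person's five days into one average, then see how that average changes across people with different IQs. The variable names are hypothetical.

```python
from statistics import mean

# Made-up data: one IQ score and five daily "wanting ice cream" scores per person.
iq = [85, 100, 115, 130]
daily_scores = [
    [38, 40, 42, 40, 40],
    [55, 54, 56, 55, 55],
    [60, 61, 59, 60, 60],
    [85, 84, 86, 85, 85],
]

# Collapse the repeated measurements so each person contributes one value.
person_means = [mean(days) for days in daily_scores]

# Least-squares slope of average wanting on IQ, computed across people.
iq_bar, wanting_bar = mean(iq), mean(person_means)
numerator = sum((x - iq_bar) * (y - wanting_bar) for x, y in zip(iq, person_means))
denominator = sum((x - iq_bar) ** 2 for x in iq)
between_slope = numerator / denominator

print(person_means)
print(round(between_slope, 3))
```

A positive slope here corresponds to the upward blue trend line: people with higher IQs tend to have higher average scores, regardless of how any one person changes from day to day.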

**Editorial Note:** *Stats Make Me Cry is owned and operated by Dr. Jeremy J. Taylor. The site offers many free statistical resources (e.g. a blog, SPSS video tutorials, and R video tutorials), as well as fee-based statistical consulting and dissertation consulting services to individuals from a variety of disciplines all over the world.*

APA Format Table Example Before and After

Pictured (**above**) are examples of standard SPSS tables (**left**) and tables produced in SPSS after a few adjustments (**right**) to the settings. The table on the right more closely aligns with APA format than the table on the left in several ways:

- The title has been changed from center justified and **bold** to left justified, *italics*, and NOT bold (**[1] above-right**; APA format).

- The table borders have been adjusted appropriately (details of specific changes to follow shortly).

- The default font type and size have been changed to Times New Roman 12pt.

The adjustments to SPSS needed to produce tables like the ones on the right only have to be made once. After that, SPSS applies them automatically, and all of your future tables will be ready for insertion into your APA manuscript immediately after analysis. The necessary changes can be accomplished in 3 steps:

- Produce an initial table for alteration (using any analysis; a simple frequency table is sufficient).

- Create a custom "Table Look Style", by "Editing" the initial table's "Look Style" and saving the changes as a custom "style" ("APA Table" seems like a reasonable choice).

- Adjust your SPSS settings (options) so that SPSS recognizes your newly created "Look Style" as the default table "Look Style".

From there, you can simply run your analyses as you typically would and your tables should be formatted in APA format. Let's get into the specifics about how to accomplish these three steps...

**1) PRODUCE INITIAL TABLE**

The first step to make your SPSS adjustment is to produce an initial table for editing. For our purposes, a simple frequency table does the trick (in the SPSS drop-down menus, navigate to: Analyze > Descriptive Statistics > Frequencies). Once your table is produced (**below**), right click on the table and click on "Edit Content" and then either "In Viewer" or "In Separate Window" (it doesn't really matter which you choose, for our purposes).

SPSS Edit Content Menu

Once your table is in "editing" mode (**below**), right click again and click on "TableLooks..."

Next, the "TableLooks" screen (**below**) should pop-up.

Under "TableLooks Files:", change the selection to "CompactAcademicTimesRoman" (**[1] below**).

While simply making that switch gets us a lot closer to APA format than the "default" SPSS table, we can improve the settings to get us much closer with a few additional changes.

NOTE: "CompactAcademicTimesRoman" is the closest "TableLook" to APA on its own, but luckily we can alter its attributes and save the changes!

Once you've clicked on "CompactAcademicTimesRoman", click on the "Edit Look..." button (**[2] above**). After clicking on "Edit Look...", the "Table Properties" screen should pop-up (**below**).

Within the "Table Properties" screen, we are going to adjust elements of both the "Cell Formats" tab (**above**) and the "Borders" tab (**[1] below**).

First, the "Cell Formats" tab (**above**):

On the "Cell Formats" screen, you are able to adjust: the table's "Text" (font), the "Alignment" (justifications) of the text, the background color (which we will not be adjusting), and the "Inner Margins". We will only be changing the "Text" and "Alignment" settings. We'll deal with the "Text" first.

The default size of all text in SPSS tables is 8 pt (**[4] above**), while the appropriate APA format font size is 12 pt, so the first thing we'll need to do is change all of the text in the table from 8 pt (**[4] above**) to 12 pt.

**Unfortunately, you are required to change each text element separately** by either clicking on the element in the "Sample" table on the right side of the screen (**[1] above**), or by selecting different elements in the "Area" drop-down menu (**[2] above**).

**For example**, click on the "**Table Title**" (**[3] above**) in the "Sample" table to edit that element. After clicking on the element, simply adjust the attributes on the left side of the screen.

**NOTE**: to comply with APA format for table titles, change your font size from 8 pt. to 12 pt. (**[4] above**), make it italics and not bold (**[5] above**), and click on "Left Alignment" (**[6] above**).

Next, switch to the "Borders" tab (**[1] below**).

Once in the "Borders" tab, there are three elements that we are going to adjust:

- Top inner frame (**[2] above**)

- Bottom inner frame

- Data area top

To adjust the "Top inner frame", highlight it in the Border menu section (**[1] below**). Next, click on the "Style" drop-down menu (**[2] below**) and change the style from the double line (not APA format) to the single thin line (**[3] below**; second from the bottom; complies with APA format).

SPSS TableLooks Screen Example Cell Table Properties: Top Inner Frame

Next, repeat the style adjustment for the "Bottom inner frame" (**[1] below**).

Again, repeat the style adjustment for the "Data area top" (**[1] below**).

Next, click the "Apply" button (**[2] above**), followed by the "OK" button (**[3] above**).

**2) CREATE CUSTOM TABLE LOOK STYLE**

After clicking the "OK" button, you should find yourself back at the "TableLooks" screen (**[1] below**). On this screen click on "Save As" (**[2] below**).

In the "Save As" dialogue screen (**below**), give your newly created table "Look" a name, preferably something self-explanatory and easy to remember. As you can see, I chose to call it "APA Table" (**[1] below**).

**Before clicking "Save"**, make sure you are saving the "TableLook" file in the correct directory:

**On a Mac**, the "Looks" directory can be found in the "SPSS" folder (or PASWStatistics folder; **[1] below**) within "Applications" (**[2] below**).

**On a PC**, the "Looks" directory can be found at **C:>Documents and Settings> Program Files>SPSS**

Once inside the "Looks" folder (**below**), you should see various other "TableLooks" files (the files end in ".stt"). If you see that, you know you are in the right folder. From here, check to make sure your "File Name" is what you want it to be and then click "Save" (**[1] below**).

After you've clicked "Save", you should find yourself back in the "TableLooks" dialogue screen (**below**). Also, you should now see a newly available "TableLook" in the "TableLook Files:" area (**[1] below**) (the one you saved above). Next, simply click on that to highlight it (**[1] below**) and then click the "OK" button (**[2] below**).

After clicking "OK", the "TableLooks" screen should disappear and the initial table you created should again be visible, but its format should now reflect the changes we've made and it should more closely resemble APA format (**below**)!

While you certainly could choose to do all of those steps for every table you produce from now until forever, that wouldn't be a very efficient use of your time. Instead, let's change the default SPSS settings to **automatically** use our newly created "TableLook" for all tables that are created in the future.

**3) ADJUST SPSS TABLE "LOOK STYLE" SETTINGS (OPTIONS)**

To adjust the SPSS "TableLook" settings, go to "Options" (**[1] below**), which you'll find under the "Edit" menu.

With the "Options" dialogue screen now visible, select the "Pivot Tables" tab (**[1] below**). Next, select our newly created "Table Look" (I called mine "APA table"; **[2] below**).

**On a side note**, I'd also suggest changing the "Copying wide tables to the clipboard in rich text format" option (**[3] below**) to "Shrink width to fit". Making this change will prevent SPSS from wrapping tables that are too wide for your page to another row (making them appear as two tables, even though they are really just two parts of the same table). I personally find that very irritating. Instead, this will tell SPSS to adjust the width of the cells in the table so that the table can fit within the margins of the page.

**Finally**, click on the "Apply" button (**[4] below**), followed by the "OK" button (**[5] below**). You should now be done and all future tables should be produced in APA format (or closer to it anyway). Happy table making!

on 2011-06-08 20:45 by Jeremy Taylor

A few sharp readers have made a great point about this post: If you have a version of SPSS that is licensed by a University, the instructions may not work. Specifically, when you try to create a "new look", it will likely display an error message that says you don't have "access" to the directory (or something like that).

Thanks to one of our readers (Benjamin Telkamp), we have a solution! Benjamin discovered that you can save the "new look" as one of the existing looks in SPSS (just pick one that you don't think you'll be needing). Thanks for the tip, Benjamin!

Reproducible research (RR) provides a road map through a study. In other words, conducting RR means engaging in a transparent analytic process. This prospect can be both exciting and terrifying.

Cartoon by Sidney Harris (The New Yorker)

Many of us harbor a burning fear that we will be "found out" as someone trying to "fake it until we make it." No matter how many goals we reach or degrees we earn, a hint of doubt remains. This doubt can keep us humble, vigilant, and hungry for professional growth. For some, a potent mix of doubt and fear is fuel for ambition. Unfortunately, this same combination can also make RR a scary concept.

The RR movement champions accuracy and clarity over "progress" and polish. Unfortunately, for too long much of the research world prioritized the latter over the former. According to a 2013 Economist article, replication studies show that as few as 11 to 25% of selected published biomedical findings held up when re-tested. Similar doubts were cast on psychological priming research. Though there may be many explanations for these findings, science conducted as a black box creates only uncertainty and doubt.

We must learn from these revelations. Though intimidating, RR is a critical step toward restoring confidence in the research community. As colleagues, we must reward transparency with constructive feedback and support. As researchers, we must develop thick skin when feedback is critical. For RR to gain momentum, we must find a delicate balance between constructive feedback and mutual respect for the scientific process. That process is rooted in replication and falsification. Researchers must be free to champion transparency without fear of ridicule or embarrassment. After all, every scientific discovery stands on the shoulders of many disproven theories.

There are many excellent sources of information available. For those looking to educate themselves on this topic, I will post some useful resources below. Promoting RR can be as simple as making your analyses' syntax (i.e. code) publicly available (e.g. in a linked footnote). When possible, making your data available along with your code verifies both your process and results. There are many ways one can promote RR, but the most important step is to educate yourself. Make it a priority to learn what RR is, how it works, and how you can apply it to your own work. In addition to some valuable resources, I've attempted to explain a few key terms below (some used in this blog and some not). In the open-source spirit, please share any other RR terms you'd like discussed. As my work is iterative and undoubtedly flawed, please also let me know if I've gone astray with my explanations below!

**Reproducible Research** - research that is conducted with systematic procedures and documentation, such that it can be reproduced either by the original researcher or by an independent party, for the purposes of verifying the study's results and/or conclusions. According to CRAN, the goal of reproducible research "is to tie specific instructions to data analysis and experimental data so that scholarship can be recreated, better understood and verified."

**Minimum-Threshold Journals** - journals that seek to publish as much quality science as possible, regardless of "statistical significance." These journals' inclusion criteria focus on a study's methodological rigor, instead of whether results are statistically significant. Despite their inclusive intentions, up to 50% of submitted articles are rejected because of poor methodological rigor, so perhaps their threshold is more accurately characterized as "re-calibrated" rather than "minimum." PLoS ONE is a prominent example of a "minimum-threshold journal."

**Literate Programming** - a method of programming (often statistical analysis) that combines typical programming code, graphics, and natural language into a document that is more easily understood by individuals other than the programmer. The product of literate programming is often an HTML page that presents an analytic process through a combination of natural language narrative, programming code snippets, and table or graphical output of analysis results. Well-produced literate programming allows a reader to completely reproduce an analysis as they read along, using embedded links to download data and code snippets to produce the results presented in the document.

CRAN Task View: Reproducible Research. *CRAN R-Project*, 2014. Web. 20 Jan. 2014. http://cran.r-project.org/web/views/ReproducibleResearch.html

Knuth, Donald E. "Literate programming." *The Computer Journal*, 27.2 (1984): 97-111.

Trouble at the lab. *The Economist*, 2013. Web. 19 Oct. 2013. http://www.economist.com/news/briefing/21588057-scientists-think-science-self-correcting-alarming-degree-it-not-trouble

Yihui Xie (2013). knitr: A general-purpose package for dynamic report generation in R. R package version 1.5. http://cran.r-project.org/web/packages/knitr/index.html

**Reproducible Research.net** - a repository of RR resources and links. See more at: http://reproducibleresearch.net/index.php/Main_Page

**The Comprehensive R Archive Network (CRAN) RR Page** - resources for RR with R, which is an open-source statistical programming software platform. See more at: http://cran.r-project.org/web/views/ReproducibleResearch.html

**Johns Hopkins Free Online RR Course (with video)** - Learn the concepts and tools behind reporting modern data analyses in a reproducible manner. This is the fifth course in the Johns Hopkins Data Science Specialization. See more at: https://www.coursera.org/course/repdata

**Johns Hopkins Free Online Courses (with video) on Data Science** - A set of 9 free online courses in data science. Topics include: 1) The Data Scientist’s Toolbox - Get yourself set up, 2) R programming - Learn to code, 3) Getting and Cleaning Data - You need data. Get some. 4) Exploratory Data Analysis - What’s that in my data? 5) Reproducible Research - Did you do what you think you did? 6) Statistical Inference - You don’t have infinite money. Try sampling. 7) Regression Models - The duct tape of data science. 8) Practical Machine Learning - Predict the future with data. Easy. 9) Developing Data Products - There better be an app for that data.

See more at: http://jhudatascience.org

**knitr: Elegant, flexible and fast dynamic report generation with R** - The knitr package was designed to be a transparent engine for dynamic report generation with R, solve some long-standing problems in Sweave, and combine features in other add-on packages into one package (knitr ≈ Sweave + cacheSweave + pgfSweave + weaver + animation::saveLatex + R2HTML::RweaveHTML + highlight::HighlightWeaveLatex + 0.2 * brew + 0.1 * SweaveListingUtils + more). See more at: http://yihui.name/knitr/

**The Sweave Homepage** - Sweave is a tool that allows to embed the R code for complete data analyses in latex documents. The purpose is to create dynamic reports, which can be updated automatically if data or analysis change. Instead of inserting a prefabricated graph or table into the report, the master document contains the R code necessary to obtain it. When run through R, all data analysis output (tables, graphs, etc.) is created on the fly and inserted into a final latex document. The report can be automatically updated if data or analysis change, which allows for truly reproducible research. See more at: http://www.stat.uni-muenchen.de/~leisch/Sweave/

**Science Exchange** - a marketplace for scientific collaboration, where researchers can order experiments from the world's best labs and/or have their own research replicated. See more at: https://www.scienceexchange.com

*NOTE: This tutorial was created by Jared Knowles, doctoral candidate from the University of Wisconsin and owner of **Jared Knowles: From Data to Decisions**. Mr. Knowles is not affiliated with Stats Make Me Cry in any way and written consent was obtained before this tutorial was posted.*

Here is the handout associated with this presentation: CLICK HERE FOR HANDOUT

Here is the R Code associated with this presentation: CLICK HERE FOR R CODE

*This video was created by Dr. Roger Peng, professor at the Johns Hopkins Bloomberg School of Public Health and author at Simply Statistics. Dr. Peng is not affiliated with Stats Make Me Cry in any way and written consent was obtained before this video was posted.*

To demonstrate this task, I'm using one of the sample datasets that comes with SPSS, named "demo_cs.sav". To start, let's assume that we've already found an interaction effect **(see figure below)**. In this case, we've run a model in which income and gender are predictive of the price of one's vehicle. The figure below also shows us that income and gender interact to predict the price of one's car (p<.001), so we have an effect to explore/plot!

The significant interaction term indicates that there is a moderating effect to explore graphically!

As you may or may not know, the above analysis can be run using either the GLM menu dialog or the regression dialog in SPSS. A key difference between the two is that you'll need to manually create the interaction term using the regression method, whereas the GLM will allow you to specify the interaction in the "Model..." dialog **(see 1 in figure below)**.
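For the regression route, "manually creating the interaction term" just means computing a new variable that is the product of the two predictors. A minimal Python sketch with made-up values (the variable names and the 0/1 coding of gender are assumptions for illustration, not taken from "demo_cs.sav"):

```python
# Made-up values: income in thousands, gender dummy-coded as 0 or 1.
income = [25, 40, 60, 85]
gender = [0, 1, 0, 1]

# The interaction term is the case-by-case product of the two predictors.
income_x_gender = [i * g for i, g in zip(income, gender)]

print(income_x_gender)
```

The product variable is then entered into the regression alongside the two original predictors, so the model contains both main effects and the interaction.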

Click on the "Model..." button to specify main effects and interactions in a Univariate General Linear Model (GLM). Click on "Plots" to produce effect plots, but this only works for categorical/binary predictors (Fixed Factors). How do you do this when a predictor is continuous? Read on...

In the GLM dialog (above) you might've also noticed that there is a "Plots" button that you can click (see 2 in figure above). That seems promising, but you may be disappointed to find that it is only helpful if both predictors are binary or categorical (Fixed Factors in Univariate GLM). If either of the predictors in the interaction you wish to explore graphically is continuous (Covariate in Univariate GLM), then that predictor won't be available to create a plot in the "Plots" dialog (see figure below).

Only the "Fixed Factor(s)" predictor is available in the "Univariate: Profile Plots" dialogue

To obtain the plot you are seeking when one of your predictors is continuous (Covariate in Univariate GLM), you simply need to save your predicted values during analysis and plot them using "Graphs > Legacy Dialogs > Scatter/Dot...".

Let's walk through our example. Whether you used the GLM - Univariate analysis or the Regression - Linear analysis the first step is the same: return to your analysis dialog and click on the "Save..." button (GLM - Univariate example on left below, Regression-Linear example on right below).

Click "Save..." and then click on the "Unstandardized" box in the "Predicted Values" options.

After re-running your analysis while saving the predicted values, you will find a new variable in your dataset named "PRE_1" (which stands for "Predicted Values"; see figure below).

NOTE: each time you re-run this analysis, a new version of this variable will be created with a new numeric suffix (PRE_2, PRE_3, etc.). You will use this new variable to plot your effects.

Next navigate to "Graphs > Legacy Dialogs > Scatter/Dot..." (see figure below).

Graphs > Legacy Dialogs > Scatter/Dot...

Once in the "Scatter/Dot..." dialog, move the newly-created predicted values variable (PRE_1) to the Y-Axis (predicted value for price of car in our example), your continuous predictor to the X-Axis (income in our example) and your categorical variable (gender in our example) to the "Set Markers By" field (see figure below). When done (set "Titles" and change "Options" as desired), click "OK".

Plot "predicted values" from regression or Univariate GLM to explore interaction effects.

You now have your plot, but you'll probably notice immediately that you are missing your trend/regression lines to compare your effects **(see figure left below)**! We need to make some slight modifications here. To add these lines: double click on the plot in the output viewer (or right click and choose "Edit Content > In Separate Window"). Once your new plot editor window appears **(circled in figure center below)**, click on the "Add Fit Line at Subgroups" button.

Upon clicking on "Add Fit Line at Subgroups", you should now see your trend lines for each group (i.e. moderator group; our example has only 2 groups, but this would work the same if there were more than 2 groups).
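To see why plotting the saved predicted values reveals the interaction, here is a minimal Python sketch. The coefficients are invented for illustration (they are not the estimates from "demo_cs.sav"), and `predict` and `fit_line` are hypothetical helper names. It generates predicted values from a model with an interaction term and then fits one line per group, which is roughly what the subgroup fit lines do on the scatterplot.

```python
from statistics import mean

# Invented coefficients: intercept, income, gender, and income-by-gender interaction.
B0, B_INCOME, B_GENDER, B_INTERACTION = 10.0, 0.30, 2.0, 0.10

def predict(income, gender):
    """Predicted car price from a model that includes the interaction term."""
    return B0 + B_INCOME * income + B_GENDER * gender + B_INTERACTION * income * gender

def fit_line(xs, ys):
    """Least-squares (slope, intercept) of ys regressed on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return slope, y_bar - slope * x_bar

incomes = [20, 40, 60, 80]
lines = {}
for gender in (0, 1):
    # Analogous to the saved PRE_1 values for one gender group.
    predicted = [predict(x, gender) for x in incomes]
    lines[gender] = fit_line(incomes, predicted)

for gender, (slope, intercept) in lines.items():
    print(f"gender={gender}: slope={slope:.2f}, intercept={intercept:.2f}")
```

The two groups end up with different slopes, and the gap between them equals the interaction coefficient. Non-parallel lines on the plot are the visual signature of the moderating effect.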