Visualising Independence

File format

Raw format

A CSV file, with variables as columns and observations as rows. If the first row is not a header that gives variable names, you'll need to deselect the 'File contains a header' option.



Counts format

This format allows you to read in a table of counts directly. The first two lines should give the names of the two factors in the table; the order of these two lines does not matter.

The third line should give the names (levels) of the column factor. From line 4 onwards, each line specifies one row of the table: the first value gives the level of the row factor, and the remaining values give the counts.
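As an illustration of the layout described above, here is a minimal Python sketch that reads a counts-format file. The example data, the comma separator and the function name parse_counts are assumptions made for illustration; they are not taken from the tool itself.

```python
import csv
import io

# Hypothetical counts-format file. Comma-separated values are an assumption.
EXAMPLE = """\
Smoker
Gender
Female,Male
No,40,35
Yes,10,15
"""


def parse_counts(text):
    """Parse counts-format text into (factor_names, column_levels, table)."""
    rows = [r for r in csv.reader(io.StringIO(text)) if r]  # skip blank lines
    # Lines 1-2: the names of the two factors.
    factor_names = (rows[0][0].strip(), rows[1][0].strip())
    # Line 3: the levels of the column factor.
    column_levels = [v.strip() for v in rows[2]]
    # Line 4 onwards: row level followed by one count per column level.
    table = {}
    for row in rows[3:]:
        level = row[0].strip()
        counts = [int(v) for v in row[1:]]
        if len(counts) != len(column_levels):
            raise ValueError(
                f"row {level!r} has {len(counts)} counts, "
                f"expected {len(column_levels)}"
            )
        table[level] = counts
    return factor_names, column_levels, table


names, cols, table = parse_counts(EXAMPLE)
```

Here table maps each row level to its list of counts, in the same order as cols.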



Using this tool

First upload a file. Specify the data type of your file using the Data Type options.

For raw CSV files, the factors in the dataset will be detected automatically, and all factors with no more than 5 levels will be displayed in the factor selection boxes. A numerically-coded variable will be recognised as a factor as long as its levels are non-negative integers and there are no more than 5 of them. Numeric variables that do not meet these conditions are ignored. We restrict the number of factor levels because a large number of levels makes the diagrams more difficult to interpret.
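The detection rule described above can be sketched roughly as follows. This is not the tool's own code: the function names are hypothetical, values are assumed to arrive as strings read from the CSV, and the handling of missing values or columns that mix text and numbers is an assumption.

```python
MAX_LEVELS = 5  # limit stated in the text above


def is_number(value):
    """Return True if the string can be read as a number."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def is_displayed_factor(values, max_levels=MAX_LEVELS):
    """Decide whether a column of string values would be offered as a factor."""
    levels = set(values)
    if len(levels) > max_levels:
        return False
    if all(is_number(v) for v in levels):
        # Numerically-coded column: every level must be a non-negative integer.
        return all(float(v) >= 0 and float(v).is_integer() for v in levels)
    # Assumption: any column containing non-numeric text is treated as a factor.
    return True
```

For example, a column coded 0, 1, 2 would qualify, while a column of measurements such as 1.5 and 2.0 would be ignored.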

You can display counts for each cell (and column total) by selecting the Counts option under 'Show data labels'. Alternatively, you can display proportions within each column by selecting the Proportions option.
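To make the two kinds of label concrete, the sketch below computes column totals and within-column proportions for a small table of counts. The table values and the function name column_proportions are hypothetical, and this is only one plausible way to arrive at the numbers the labels show.

```python
def column_proportions(table):
    """table is a list of rows of counts.

    Returns (column_totals, proportions), where each proportion is the cell
    count divided by the total of its column.
    """
    col_totals = [sum(col) for col in zip(*table)]
    proportions = [
        [count / total if total else 0.0 for count, total in zip(row, col_totals)]
        for row in table
    ]
    return col_totals, proportions


# Hypothetical 2x2 table: rows are levels of one factor, columns of the other.
totals, props = column_proportions([[40, 35], [10, 15]])
```

Within each column the proportions sum to 1, which is what makes columns with different totals comparable.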

Tick the 'Show independence' checkbox to show what we would expect to see if the two factors were independent.
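As a rough guide to what independence implies, the sketch below uses the standard definition of expected counts: row total multiplied by column total, divided by the grand total. The table values and the function name expected_counts are hypothetical, and the text above does not state exactly how the tool draws this.

```python
def expected_counts(table):
    """Expected cell counts if the row and column factors were independent."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    if grand_total == 0:
        raise ValueError("table has no observations")
    # Under independence, each cell is row total * column total / grand total.
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical 2x2 table: rows are levels of one factor, columns of the other.
expected = expected_counts([[40, 35], [10, 15]])
```

Under independence the within-column proportions are the same in every column, so any visible difference between columns in the observed data suggests an association between the two factors.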