Visualizing Statistical Significance – and Effect Sizes!

Ann Emery recently posted an awesome blog post on visualizing statistical significance. Starting with a table of statistics with lots of numbers and asterisks (*), she ended up with this lovely version:

Ann K. Emery:

Here’s what our final makeover looked like. We decided to focus on the big-picture findings. So, we used empty squares to represent variables that weren’t statistically significant and filled-in squares to represent variables that were. We used p ≤ .05 as our cutoff here; anything at .05 or lower got filled in and anything above .05 remained empty.

I responded saying I loved it, but noted that I’m often asked to also show effect sizes, and mentioned a couple of ideas for how she could show them visually as well. Bogdan Miku explored this with confidence intervals and effect sizes in another blog post, but I wanted to share how I thought about doing it using a table I often use with one of my clients.

Original version, revised à la Emery

With this client, we often want to answer the questions, “For whom?” or “In what cases?” Here, that meant exploring for whom and in what cases after-school program quality is rated higher. We explored this by school type (public and charter), gender (female and male), grade level, and students’ reasons for joining the program, an important variable we often explore in our evaluations.

Often, my PI asks me, “But which of these are meaningful differences?” By asking this question, she is asking about the effect sizes of these differences. I would often then add a narrative explaining these differences, but we all know that our clients don’t necessarily read the fine print in our reports! So with that, here is my version using Emery’s method shown above. Note that I had to move the text explaining where the differences were up into the table header. This also let me remove the table note I’d always add explaining what P, C, F, M, Int, and Ext meant.

Visualizing Effect Sizes

My first step in visualizing the effect sizes was calculating the exact effect sizes. Given that these are comparisons between either two or four groups, I decided to use eta-squared (you can find an awesome Excel spreadsheet by Daniel Lakens for calculating effect sizes here). This could easily be transformed into Cohen’s d if I preferred, but for the purpose of this exercise I just left it at eta-squared.
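If you’d rather compute eta-squared in code than in a spreadsheet, it’s just the between-groups sum of squares divided by the total sum of squares. Here’s a minimal sketch; the ratings below are hypothetical, not the actual survey data:

```python
def eta_squared(groups):
    """Eta-squared = SS_between / SS_total for a one-way group comparison."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    # Total variability around the grand mean
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    # Variability of group means around the grand mean, weighted by group size
    ss_between = sum(
        len(g) * ((sum(g) / len(g)) - grand_mean) ** 2 for g in groups
    )
    return ss_between / ss_total

# Hypothetical quality ratings for two groups (e.g., public vs. charter)
public = [4.0, 3.5, 4.5, 4.0]
charter = [3.0, 3.5, 2.5, 3.0]
print(round(eta_squared([public, charter]), 3))
```

The same function handles the four-group comparisons (grade level, reason for joining) by passing four lists instead of two.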

My first attempt, and what I originally suggested to Ann, was to simply vary the sizes of the boxes:

Ultimately, I did not like this. It was hard to see the differences well and I could only imagine how trying to explain the sizes in a table note would go. So I scrapped that and ultimately ended up with the following:

Rather than varying the size of the boxes, I varied the number of boxes. This is much easier to read visually. You can clearly tell that the factor that best explained quality ratings was students’ reason for joining, but there’s also a nice effect of school type. Gender and grade level, while statistically significant, are not very meaningfully different.


Personally, I’m really stoked that I finally got around to doing this. I’m sad that I didn’t do it for the final report I finished a couple of months ago, but I look forward to using it in subsequent reports. The only issue I have is that the directions often flip around in the tables, so the header row (e.g., public > charter) doesn’t always hold for every row below it. Sometimes public > charter, but sometimes charter > public. I’m not entirely sure how I’d reconcile that. Perhaps I could use a different color to show that the relationship runs the other way. What are your thoughts?
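One way to prototype the different-color idea is to decide each cell’s fill color from which group’s mean is higher. This is just a sketch of that rule; the group labels and colors are placeholders, not anything from the actual report:

```python
def cell_style(mean_a, mean_b, label_a="public", label_b="charter"):
    """Return (direction label, fill color) for a cell comparing two groups."""
    if mean_a > mean_b:
        return f"{label_a} > {label_b}", "blue"    # matches the header direction
    if mean_b > mean_a:
        return f"{label_b} > {label_a}", "orange"  # flipped direction
    return "no difference", "gray"

print(cell_style(4.1, 3.7))
print(cell_style(3.2, 3.9))
```

A second color does add one more thing for readers to decode, so it may be worth testing against the alternative of simply writing the direction into each row label.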