Under the Radar: Simplifying the Representation of Latent Class Characteristics

Abstract

In this visualization, we show how to use radar plots to represent the class-specific posterior response probabilities from Latent Class Analysis results. These plots allow for a simple representation of the class differences in the distributions across the modeled indicators. We illustrate the utility of this approach with results from a published model of women’s employment and family life circumstances. In doing so, we show how to avoid some of the pitfalls common to radar plots, and provide example code that allows other researchers to readily adapt this approach to present their own results.

Latent Class Analysis (LCA)

Sociologists have increasingly utilized a variety of cluster-based analytic strategies, especially those that incorporate the estimation of relationships between latent traits and observed behaviors. One particularly common form of this type of model is Latent Class Analysis, or LCA (McCutcheon 1987).

LCA uses observable indicators to yield unobserved, or latent, probabilities of endorsing response patterns \(y_1, \dots, y_k\), accounting for the covariance among observed indicators. Its basic form is defined by
\[P(Y=y)=\sum_t P(X=t)\, P(Y=y \mid X=t)\] where \(P(X = t)\) denotes the probability of belonging to class \(t\) and \(P(Y = y \mid X = t)\) is the probability of having response pattern \(y\) conditional on membership in class \(t\).
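The mixture formula above can be illustrated with a small numerical sketch. Everything in it is hypothetical: the two classes, the three binary indicators, and the values in `gammas` and `rhos` are made up for illustration and do not come from the published model. The sketch also assumes local independence (items are independent within a class), a standard LCA assumption that the text above does not state explicitly.

```python
from itertools import product

# Hypothetical two-class model with three binary indicators (illustrative values only).
gammas = [0.6, 0.4]  # P(X = t): class membership probabilities
# rhos[t][j] = P(item j = 1 | X = t): class-specific item-response probabilities
rhos = [
    [0.9, 0.8, 0.7],
    [0.2, 0.3, 0.1],
]


def pattern_probability(y, gammas, rhos):
    """Return P(Y = y) = sum_t P(X = t) * P(Y = y | X = t).

    Assumes local independence, so P(Y = y | X = t) is the product of the
    item-level probabilities within class t.
    """
    total = 0.0
    for gamma_t, rho_t in zip(gammas, rhos):
        conditional = 1.0
        for y_j, rho_tj in zip(y, rho_t):
            # Probability of the observed response on item j within class t.
            conditional *= rho_tj if y_j == 1 else 1.0 - rho_tj
        total += gamma_t * conditional
    return total


# Probability of every possible response pattern across the three items.
patterns = list(product([0, 1], repeat=3))
probs = {y: pattern_probability(y, gammas, rhos) for y in patterns}
```

Because the patterns are exhaustive and mutually exclusive, the values in `probs` sum to 1, which is a quick sanity check on any set of gammas and rhos.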

In reporting LCA results, it is common to disclose both the probability of class membership (gammas) and the distribution of class-specific item-response probabilities (rhos). For the latter, once a model includes more than a few variables, even with a limited number of items and few classes, it is difficult to concisely and clearly represent the full set of response patterns across classes.

Alternatives for presenting LCA results have typically included (a) tables reporting each class-by-item response-category probability, sometimes highlighting patterns across the observed variable distributions and item probabilities; (b) line or bar charts that group classes together; and, occasionally, (c) approaches that summarize some of that information, e.g., with ternary plots separating out the differences in response patterns across classes (Bakk and le Roux 2017). Rhos inherently include three dimensions of interest: (1) response-item distributions, (2) class memberships, and (3) the differences in (1) by (2). None of the existing approaches for presenting these results optimizes across all three dimensions simultaneously; each prioritizes some over the others. For example, tables can provide detailed item-response distributions but make it difficult to see the class differences across them. Simple line or bar charts can make the distributions visible, but can only cluster classes together (making comparisons across them difficult) or variables together (making class compositions difficult to see). Ternary plots, in turn, make the differences between classes easy to perceive but obscure the underlying item-response distributions.

Visualizing LCA Results with Radar Plots

Here, we suggest that radar plots (Sievert 2018) accomplish all three of these aims simultaneously. We demonstrate their use with a previously published result and provide replicable code, incorporating these optimizations, that others can adapt to present their own results.

The radar plots presented in Figure 1 use the LCA solution from Lippert and Damaske (2019) on young adult women’s work and family formation trajectories.¹ For presenting results in the radar plot format, we choose to employ two options.² First, we normalize each variable’s value into a single index ranging from 0 to 1. Normalization requires multiplying the item-specific response probabilities by a scaling factor determined by the order and number of categories within each item. For illustration purposes, the variables in the presented model include dichotomous, trichotomous, and other ordered variables. Second, we reverse code indicators as necessary to ensure all normalized values are oriented in the same direction.³
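The two options just described can be sketched in a few lines. This is one plausible reading of the normalization step, not the authors' companion code: it assumes that the scaling factor for the k-th of K ordered categories (counting from 0) is k / (K - 1), so that the index is the expected position on a 0 to 1 scale. The function name `normalized_index` and the example probabilities are hypothetical.

```python
def normalized_index(probs, reverse=False):
    """Collapse one item's ordered response probabilities into a 0-1 index.

    Assumption (not confirmed by the source): category k of K ordered
    categories, counted from 0, is weighted by k / (K - 1).
    With reverse=True the result is subtracted from 1 (reverse coding).
    """
    n_categories = len(probs)
    if n_categories < 2:
        raise ValueError("an item needs at least two response categories")
    index = sum(p * k / (n_categories - 1) for k, p in enumerate(probs))
    return 1.0 - index if reverse else index


# Hypothetical class-specific response probabilities for three items.
dichotomous = normalized_index([0.3, 0.7])         # probability of the top category
trichotomous = normalized_index([0.2, 0.3, 0.5])   # middle category counts half
reversed_item = normalized_index([0.2, 0.3, 0.5], reverse=True)
```

Under this reading, a dichotomous item's index is simply the probability of its top-coded category, and reverse coding mirrors the index around 0.5 so that all axes point in the same substantive direction.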

Figure 1. Radar Plot of LCA Results. This visualization represents class-specific posterior response probabilities from the LCA result in Lippert and Damaske (2019). In the top panel, each class is represented with a radar trace of a different color. Each plotted value corresponds to the likelihood of class members being in the top-coded category (rhos are normalized for all indicators and reverse-coded for number of children). You can click to highlight particular classes, or double-click to hide all others. The bottom panel presents the class-membership probabilities (gammas).
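A figure of this kind can be assembled from one closed polygon per class. The sketch below is a Python illustration rather than the authors' companion code; it builds plain dictionaries that follow plotly's `scatterpolar` trace schema. The class names, axis labels, and values are hypothetical, as are the helper names `radar_trace` and `show_radar`. Fixing the radial axis to the 0 to 1 range keeps every axis on the same scale.

```python
def radar_trace(name, values, axes):
    """Build one radar trace as a plain dict in plotly's scatterpolar schema."""
    if len(values) != len(axes):
        raise ValueError("need exactly one value per axis")
    return {
        "type": "scatterpolar",
        # Repeat the first point so the polygon closes on itself.
        "r": list(values) + [values[0]],
        "theta": list(axes) + [axes[0]],
        "fill": "toself",
        "name": name,
    }


# Hypothetical axis labels and normalized class-specific values (0-1).
axes = ["Employed full time", "Partnered", "No children in care"]
classes = {
    "Class A": [0.9, 0.2, 0.8],
    "Class B": [0.3, 0.7, 0.1],
}

traces = [radar_trace(name, values, axes) for name, values in classes.items()]
# A shared, fixed radial range keeps all indicators comparable.
layout = {"polar": {"radialaxis": {"visible": True, "range": [0, 1]}}}


def show_radar(traces, layout):
    """Render the figure; requires the third-party plotly package."""
    import plotly.graph_objects as go

    go.Figure(data=traces, layout=layout).show()
```

Calling `show_radar(traces, layout)` opens the interactive figure when plotly is installed. Keeping the traces as plain dictionaries makes it easy to inspect or export the values before plotting.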

Of the seven classes identified, four are characterized by full-time employment (orange, green, red, and purple polygons) and three by lower engagement with paid labor (grey, pink, and brown). Two classes, professional workers with and without children, were similar with respect to their employment status, propensity for well-paid high-skilled work, and job decision latitude, but differed by their relationship status and whether they had children in their care (see green vs. red polygons). Further distinctions between the identified classes are visible by comparing the distributions of item posterior response probabilities across classes.

Our approach provides a means for visually representing LCA results,⁴ which allows readers to easily compare class-level differences in the distributions of the item-specific response categories. We hope that the companion code will allow future researchers to more easily present LCA results in their own research.

References

Bakk, Zsuzsa, and Niel J. le Roux. 2017. “Visualizing Latent Class Models with Analysis-of-Distance Biplots.” Sociological Methodology 47 (1): 345–78. https://doi.org/10.1177/0081175017717048.

Lippert, Adam M., and Sarah Damaske. 2019. “Finding Jobs, Forming Families, and Stressing Out? Work, Family, and Stress among Young Adult Women in the United States.” Social Forces, December. https://doi.org/10.1093/sf/soy117.

McCutcheon, Allen L. 1987. Latent Class Analysis. Quantitative Applications in the Social Sciences. SAGE.

Sievert, Carson. 2018. plotly for R. https://plotly-r.com.

NOTE: Appendices with data to replicate Figure 1, and code for adapting the approach to your own results are available at: https://github.com/jimiadams/LCA-Viz.

¹ For the complete table of the posterior probabilities included in this solution, see the .csv file in the Appendices.
² These options can be turned on/off in the provided code.
³ In the case presented here, this required subtracting the result of the normalization step from 1 for the variable indicating number of children in one’s care.
⁴ This approach is likely also readily adaptable to the presentation of results from other clustering-based approaches.

jimi adams & Adam M. Lippert

8/1/2019