Tarid Wongvorachan: Visualizing Item Analysis: Creating Interactive Plots for Educational Assessments

Tarid Wongvorachan

Introduction

While working on an educational assessment project with colleagues who were not familiar with psychometrics, I faced the challenge of presenting item analysis results, such as item difficulty and item discrimination, in an easy-to-understand way. This led me to the idea of creating interactive plots to visualize these results effectively.
In this post, I will demonstrate how to create interactive item analysis plots using the following R packages: ggplot2, dplyr, plotly, and DT.
The dataset used in this example is a synthesized, anonymized, and de-identified dataset from an educational assessment. The test items are dichotomous, meaning they have two possible responses (e.g., correct or incorrect). The test was administered to both French and English speakers. There are two types of items in the dataset: field test items and operational items.
Operational test items are scored and counted toward the examinees’ total scores, while field test items are administered to determine their suitability for use in the operational setting. Essentially, field test items do not count toward the final score, but we need examinees to complete them to assess whether they are too difficult, too easy, or functioning as intended.
Here is an overview of our dataset. It contains seven columns (named Item, Difficulty, Discrimination, N, language, item_type, and flag in the code):
- Item Name: The name of the test item.
- Difficulty: The difficulty level of the item.
- Discrimination: The discrimination index of the item.
- Sample Size: The number of examinees who took the item.
- Language: The language of the test (French or English, coded in the data as "French" and "Non-French").
- Item Type: The type of item (Field Test or Operational).
- Item Flag: Indicates whether the item is good ("Good"), requires caution ("Caution"), or is not good ("Poor").
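The dataset itself is not included with this post. For readers who want to run the code, a stand-in data frame with the column names that the code in this post relies on could be simulated roughly as follows. This is only a sketch: every value is random and purely illustrative, and the flag rule is a simplified placeholder rather than the rule used for the real data.

```r
# Simulated stand-in for the real dataset (all values are made up)
set.seed(123)  # make the random values reproducible
n_items <- 40

df <- data.frame(
  Item = paste0("item_", seq_len(n_items)),
  Difficulty = round(runif(n_items, min = 0.30, max = 0.95), 2),
  Discrimination = round(runif(n_items, min = -0.05, max = 0.50), 2),
  N = sample(200:500, n_items, replace = TRUE),
  language = rep(c("French", "Non-French"), each = n_items / 2),
  item_type = rep(c("field test", "operational"), times = n_items / 2),
  stringsAsFactors = FALSE
)

# Placeholder flag with two levels only; the real data also uses "Caution"
df$flag <- ifelse(df$Discrimination > 0.20 & df$Difficulty > 0.50,
                  "Good", "Poor")

head(df)
```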
Item difficulty is a measure of how challenging a test item is. Under classical test theory, it is the proportion of examinees who answered the item correctly, so a higher difficulty value means the item is easier. Conversely, a lower difficulty value means the item is more difficult.
Item discrimination in this dataset is represented by the point-biserial correlation. This value measures the correlation between the item score (coded as 1 for correct and 0 for incorrect) and the total score each test taker receives. A higher point-biserial correlation indicates that the item is better at differentiating between high-performing and low-performing examinees.
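To make the two statistics concrete, here is a minimal base R sketch that computes them from a small 0/1 response matrix. The matrix is hypothetical and made up for illustration; it is not part of the dataset in this post. The sketch also assumes the uncorrected total score (the item itself is included in the total), which is what the description above suggests.

```r
# Hypothetical 0/1 response matrix: 6 examinees (rows) by 3 items (columns)
responses <- matrix(
  c(1, 1, 0,
    1, 0, 0,
    1, 1, 1,
    0, 0, 0,
    1, 1, 1,
    1, 0, 1),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("item1", "item2", "item3"))
)

# Total score per examinee
total_score <- rowSums(responses)

# Item difficulty: proportion of examinees answering each item correctly
difficulty <- colMeans(responses)

# Item discrimination: point-biserial correlation, i.e. the Pearson
# correlation between each 0/1 item score and the total score
discrimination <- apply(responses, 2, function(item) cor(item, total_score))

item_stats <- data.frame(
  Item = colnames(responses),
  Difficulty = difficulty,
  Discrimination = discrimination,
  N = nrow(responses)
)
item_stats
```

In this toy example, item1 is answered correctly by five of the six examinees, so its difficulty value is about 0.83, which makes it the easiest of the three items.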

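# Load the packages used in this post (df is assumed to be loaded already)
library(ggplot2)
library(dplyr)
library(plotly)
library(DT)

# Show the item statistics as an interactive table, 10 rows per page
datatable(df, options = list(pageLength = 10))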

I will subset the total dataset into four sub-dataframes: item property data for French test takers, item property data for Non-French test takers, field test items, and operational test items.

# Subset Non-French
df_non_french <- df %>% filter(language == "Non-French")

# Subset French
df_french <- df %>% filter(language == "French")

# Subset Field Test Item
df_field_test <- df %>% filter(item_type == "field test")

# Subset Operational Item
df_operational <- df %>% filter(item_type == "operational")

Presenting the entire table to your audience may not be ideal for conveying which items are good and which are poor, because it is hard to grasp the overall picture from rows of numbers. That’s why I decided to plot the data in a single chart and make it interactive, so that people can check the details of each data point.
For these plots, I added dashed reference lines for both item discrimination and item difficulty. Suggested cut-off points for an acceptable point-biserial correlation vary from 0.15 to 0.30; I settled on 0.20 as a middle ground.
For item difficulty, a value of 0.9 means the item is very easy. A value of 0.5 is sometimes considered moderately difficult (Thompson, 2020), but in the context of this study it means the item is very difficult, because examinees need to score at least 70% to pass the assessment.
The two dashed lines divide the plot into quadrants. Items in quadrant 1 (upper right), with discrimination above 0.20 and a difficulty value above 0.50, have desirable properties, while items in the other quadrants are flagged as poor. Items near the dashed lines are flagged for caution.
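The quadrant logic can be written down as a small function. The sketch below is one possible reading of it, not the rule used to create the flags in this dataset: flag_item is a hypothetical helper, and margin (how close to a dashed line counts as "near") is an assumed value chosen only for illustration.

```r
# Hypothetical helper: classify items from their statistics.
# disc_cut and diff_cut are the two reference lines used in the plots;
# margin is an ASSUMED tolerance for "near the dashed lines".
flag_item <- function(discrimination, difficulty,
                      disc_cut = 0.20, diff_cut = 0.50, margin = 0.05) {
  # Near either reference line
  near_line <- abs(discrimination - disc_cut) <= margin |
    abs(difficulty - diff_cut) <= margin
  # Upper-right quadrant: above both reference lines
  in_upper_right <- discrimination > disc_cut & difficulty > diff_cut

  ifelse(near_line, "Caution",
         ifelse(in_upper_right, "Good", "Poor"))
}

flags <- flag_item(discrimination = c(0.35, 0.22, 0.05),
                   difficulty     = c(0.80, 0.75, 0.40))
flags
# "Good" "Caution" "Poor"
```

Under this reading, any item close to either line is marked "Caution" regardless of its quadrant; a stricter rule could reserve "Caution" for items that are near a line and still inside the upper-right quadrant.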

All Data Points

# Define color mapping
flag_colors <- c("Good" = "green", "Caution" = "orange", "Poor" = "red")

# Create ggplot
p <- ggplot(df, aes(x = Discrimination, y = Difficulty, color = flag)) +
  geom_point(size = 1, aes(text = paste("Item:", Item,
                                        "<br>N:", N,
                                        "<br>Language:", language,
                                        "<br>Item Type:", item_type))) +
  geom_vline(xintercept = 0.20, linetype = "dashed", color = "blue") +
  geom_hline(yintercept = 0.50, linetype = "dashed", color = "blue") + 
  scale_color_manual(values = flag_colors) +
  labs(title = "Difficulty vs. Discrimination (all)",
       x = "Discrimination",
       y = "Difficulty") +
  theme_minimal()

# Convert ggplot to plotly
ggplotly(p)

Language

Non-French

# Create ggplot
p <- ggplot(df_non_french, aes(x = Discrimination, y = Difficulty, color = flag)) +
  geom_point(size = 1, aes(text = paste("Item:", Item,
                                        "<br>N:", N,
                                        "<br>Language:", language,
                                        "<br>Item Type:", item_type))) +
  geom_vline(xintercept = 0.20, linetype = "dashed", color = "blue") +
  geom_hline(yintercept = 0.50, linetype = "dashed", color = "blue") + 
  scale_color_manual(values = flag_colors) +
  labs(title = "Difficulty vs. Discrimination (Non-French)",
       x = "Discrimination",
       y = "Difficulty") +
  theme_minimal()

# Convert ggplot to plotly
ggplotly(p)

French

# Create ggplot
p <- ggplot(df_french, aes(x = Discrimination, y = Difficulty, color = flag)) +
  geom_point(size = 1, aes(text = paste("Item:", Item,
                                        "<br>N:", N,
                                        "<br>Language:", language,
                                        "<br>Item Type:", item_type))) +
  geom_vline(xintercept = 0.20, linetype = "dashed", color = "blue") +
  geom_hline(yintercept = 0.50, linetype = "dashed", color = "blue") + 
  scale_color_manual(values = flag_colors) +
  labs(title = "Difficulty vs. Discrimination (French)",
       x = "Discrimination",
       y = "Difficulty") +
  theme_minimal()

# Convert ggplot to plotly
ggplotly(p)

Item Type

Field Test Item

# Create ggplot
p <- ggplot(df_field_test, aes(x = Discrimination, y = Difficulty, color = flag)) +
  geom_point(size = 1, aes(text = paste("Item:", Item,
                                        "<br>N:", N,
                                        "<br>Language:", language,
                                        "<br>Item Type:", item_type))) +
  geom_vline(xintercept = 0.20, linetype = "dashed", color = "blue") +
  geom_hline(yintercept = 0.50, linetype = "dashed", color = "blue") + 
  scale_color_manual(values = flag_colors) +
  labs(title = "Difficulty vs. Discrimination (Field Test)",
       x = "Discrimination",
       y = "Difficulty") +
  theme_minimal()

# Convert ggplot to plotly
ggplotly(p)

Operational Item

# Create ggplot
p <- ggplot(df_operational, aes(x = Discrimination, y = Difficulty, color = flag)) +
  geom_point(size = 1, aes(text = paste("Item:", Item,
                                        "<br>N:", N,
                                        "<br>Language:", language,
                                        "<br>Item Type:", item_type))) +
  geom_vline(xintercept = 0.20, linetype = "dashed", color = "blue") +
  geom_hline(yintercept = 0.50, linetype = "dashed", color = "blue") + 
  scale_color_manual(values = flag_colors) +
  labs(title = "Difficulty vs. Discrimination (Operational Item)",
       x = "Discrimination",
       y = "Difficulty") +
  theme_minimal()

# Convert ggplot to plotly
ggplotly(p)
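The five plots above differ only in the data frame and the title, so the repeated code could be wrapped in a function. The following is a sketch of that idea rather than the code used to produce the plots in this post; plot_item_analysis and its label argument are hypothetical names, and the data frame passed in is assumed to have the same columns as df.

```r
library(ggplot2)
library(plotly)

# Same color mapping as in the plots above
flag_colors <- c("Good" = "green", "Caution" = "orange", "Poor" = "red")

# Hypothetical helper: build the interactive difficulty vs. discrimination
# plot for any data frame with the columns Item, N, language, item_type,
# Discrimination, Difficulty, and flag
plot_item_analysis <- function(data, label) {
  p <- ggplot(data, aes(x = Discrimination, y = Difficulty, color = flag)) +
    geom_point(size = 1, aes(text = paste("Item:", Item,
                                          "<br>N:", N,
                                          "<br>Language:", language,
                                          "<br>Item Type:", item_type))) +
    geom_vline(xintercept = 0.20, linetype = "dashed", color = "blue") +
    geom_hline(yintercept = 0.50, linetype = "dashed", color = "blue") +
    scale_color_manual(values = flag_colors) +
    labs(title = paste0("Difficulty vs. Discrimination (", label, ")"),
         x = "Discrimination",
         y = "Difficulty") +
    theme_minimal()

  # Convert ggplot to plotly
  ggplotly(p)
}

# Example calls, assuming the subsets created earlier exist:
# plot_item_analysis(df_french, "French")
# plot_item_analysis(df_operational, "Operational Item")
```

ggplot2 may warn about the unknown text aesthetic in geom_point; the mapping is still carried over to the plotly tooltip, as in the original plots.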

Concluding remark

When communicating data-driven insights, presenting results through interactive charts can be highly beneficial. This approach allows our audience to grasp the information at a glance.
By making the charts interactive, we enable users to explore the details of each data point, enhancing their understanding and making the insights more accessible.
I kept this post relatively short to be concise and efficient. Thank you very much for visiting!
