## Overview

The **goal** of this **building block** is to provide a guide to create basic forest plots, compare the results of different studies, and assess the significance of their pooled effects.

`Forest plots`

are a visual representation that summarises findings from various scientific studies that investigate a common research question. They find significant application in the field of `meta-analysis`

, a type of statistical analysis that combines and examines results from a number of independent studies.
More practically, forest plots identify a statistic that is common to such set of studies and report the various instances of that statistic. This, in turn, allows to compare the different results and the significance of the overall pooled summary effect.

Among the **benefits** of forest plots, we find:

- clear and concise
`visual`

`representation`

of results; `effect`

`size`

and`confidence`

`interval`

comparison across different studies;- overall, useful tool to evaluate the consistency and strength of evidence, identify potential sources of bias, and make informed judgments about the effect of interventions or exposures.

## Forest Plots in R

One of the most popular R packages used for forest plots is forestploter. Compared to other packages (e.g., forestplot), `forestploter`

focuses entirely on forest plots, which are treated as a table. Moreover, it allows to control for graphical parameters with a theme and to have confidence intervals spread across multiple columns and divided by groups.

### Generate Dataset

The code blocks below shows how to `generate`

a `dataset`

and create a `basic`

`layout`

for a forest plot.

**Load**the**required packages**and**generate a simulated dataset**.

```
library(forestploter)
library(dplyr)
data <- data.frame(
Study = c("Study A", "Study B", "Study C", "Study D", "Study E"),
Group1 = c(200, 150, 180, 250, 120),
Group2 = c(150, 170, 160, 240, 130),
smd = c(0.51, 1.27, -0.54, 0.81, 0.87),
CI_Lower = c(-0.25, -0.74, -1.6, 0.55, -0.1),
CI_Upper = c(1.25, 1.4, -0.1, 2.1, 1.3)
)
```

- Calculate
**standard errors**and**weights**to be used in the derivation of the**pooled standard error**and of the**overall pooled effect**.

```
data$se <- (data$CI_Upper - data$CI_Lower) / (2 * 1.96)
data$Weight <- 1/data$se^2
data$Weight <- data$Weight / sum(data$Weight)
pooled_effect <- round(sum(data$smd * data$Weight), 2)
data$Weight <- round(100 * (data$Weight), 2)
data$n <- (data$Group1 + data$Group2)
n_studies <- 5
pooled_se <- sqrt(
(sum((data$n - 1) * data$se^2)) / ((sum(data$n) - n_studies))
)
```

- Calculate the
**confidence interval**for the pooled summary effect, insert**empty cells**to match the forest plot’s graphical representation, and include a final**summary column**.

```
z_score <- qnorm(0.975)
lower_bound <- round(pooled_effect - z_score * pooled_se, 2)
upper_bound <- round(pooled_effect + z_score * pooled_se, 2)
data$` ` <- paste(rep(" ", 30), collapse = " ")
data$`SMD (95% CI)` <- paste(data$smd, " [", data$CI_Lower, ", ", data$CI_Upper, "]", sep = "")
```

**Reorder**the**columns**and keep only those necessary for the final forest plot. Generate a**totals**row to append to the original dataset and make any**adjustments**necessary to ensure that variables appear neat.

```
data <- data %>%
select(Study, Group1, Group2, smd, CI_Lower, CI_Upper, se, ` `, Weight, `SMD (95% CI)`)
totals <- c(" ", sum(data$Group1), sum(data$Group2), pooled_effect, lower_bound,
upper_bound, pooled_se, " ", sum(data$Weight),
paste(pooled_effect, " [", lower_bound, ", ", upper_bound, "]", sep = ""))
data <- rbind(data, totals)
data$Weight <- paste(data$Weight, "%")
data[nrow(data), 1] <- "Overall"
```

### Create Forest Plot Theme

**Generate**and**customise**forest plot theme.

```
tm <- forest_theme(base_size = 10,
# Graphical parameters of confidence intervals
ci_pch = 15,
ci_col = "#0e8abb",
ci_fill = "red",
ci_alpha = 1,
ci_lty = 1,
ci_lwd = 2,
ci_Theight = 0.2,
# Graphical parameters of reference line
refline_lwd = 1,
refline_lty = "dashed",
refline_col = "grey20",
# Graphical parameters of vertical line
vertline_lwd = 1,
vertline_lty = "dashed",
vertline_col = "grey20",
# Graphical parameters of diamond shaped summary CI
summary_fill = "#006400",
summary_col = "#006400")
```

Type `help(forest_theme)`

in your R terminal for more info about the `forest_theme()`

function arguments.

### Generate Forest Plot

- Once dataset and theme are set,
**generate forest plot**.

```
# Final data manipulation part
data$CI_Upper <- as.numeric(data$CI_Upper)
data$CI_Lower <- as.numeric(data$CI_Lower)
data$smd <- as.numeric(data$smd)
data$se <- as.numeric(data$se)
# Forest plot
pt <- forest(data[,c(1:3, 8:10)],
est = data$smd,
lower = data$CI_Lower,
upper = data$CI_Upper,
sizes = 0.8,
is_summary = c(rep(FALSE, nrow(data)-1), TRUE),
ci_column = 4,
ref_line = 0,
arrow_lab = c("Favours Group 1", "Favours Group 2"),
xlim = c(-2, 2),
ticks_at = c(-2, -1, 0, 1, 2),
xlab = "Standardised Mean Difference",
theme = tm)
plot(pt)
```

Type `help(forest)`

in your R terminal for more info about the `forest()`

function arguments.

This is what the `final output`

should look like:

## Interpret Forest Plots

The following is an explanation of how to **interpret** the figure above.

**Study**: results source.**Group 1/2**: number of participants in the study. Normally the two groups are split between`treatment`

and`control`

groups.**Squares**(red): effect size of the indvidual studies. In this example, the effect size is represented by the`standardised mean difference`

between the averages of the two groups. Other possible effect sizes are`mean`

`difference`

,`odds`

`ratio`

, or`hazard`

`ratio`

.**Horizontal**(blue)**lines**:`95%`

`confidence`

`intervals`

(CI). The interpretation is that we are 95% confident that the true value of the effect size lies between the lower and upper bounds. The wider the CI the less precise the study.**Diamond**(green):`pooled`

`summary`

`effect`

of all the studies included in the meta-analysis. The middle points of the diamond represent the pooled effect, while the points on the sides its 95% confidence interval.**Dotted Line**: this line is known as`line`

`of`

`no`

`effect`

and it is plotted at the exact point where, relative to the effect size chosen in the analyis, there is no difference between the estimates of the two groups. If the effect size is based on a difference, the line of no effect will be at 0, whereas if the effect size is based on a ratio, the line of no effect will be at 1. This line is very useful to`interpret`

the`results`

, in fact, if the CI intersects the line, the results are**NOT significant**. In this case the pooled summary effect is not significant.**Weight**: study weight is proportional to study precision and it represents the`influence`

of each individual`study`

on the pooled effect size. More practically, when the standard error of the estimate of a study increases, its weight decreases. An alternative approach is to use a weight that is positively correlated with sample size.**SMD (95% CI)**: summary of effect size and confidence intervals in the graphical representation.