Synthetic Pupil Popularity Dataset — pop_syn

A synthetic two-level dataset with pupils nested within schools, generated to mimic the structure and parameters of the popular dataset from Hox (2010). It can be used to demonstrate multilevel bootstrap methods without depending on external data sources.

Usage

pop_syn

Format

A data frame with 2000 rows and 5 variables:

pupil: Pupil identification number within school (integer).
school: School identification number, 1–100 (integer).
popular: Pupil popularity score on a 0–10 scale (numeric).
sex: Pupil sex: 0 = boy, 1 = girl (integer).
texp: Teacher experience in years (numeric).

Source

Simulated from parameters estimated from the popular dataset in: Hox, J. J. (2010). Multilevel analysis: Techniques and applications (2nd ed.). Routledge. Data available at https://stats.oarc.ucla.edu/stat/stata/examples/mlm_ma_hox/popular.dta.

Details

The dataset was generated by fitting a two-level random-intercept model, with sex and texp as fixed effects, to the original popular data and then simulating new data from the estimated parameters:

Fixed effects: intercept = 3.56, sex = 0.84, texp = 0.093
School-level random intercept SD: 0.69
Residual SD: 0.68

The simulated continuous scores were rounded to the nearest integer and clamped to the [0, 10] range to match the original Likert-type scale.
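
The generating process described above can be sketched in base R as follows. This is a minimal illustration, not the script that produced pop_syn. The seed, the balanced design of 20 pupils per school, the 50/50 sex split, the uniform range for teacher experience, and the choice to hold texp constant within a school are all assumptions made for the example; only the fixed effects and the two standard deviations come from this page.

```r
# Sketch of a two-level random-intercept simulation (base R only).
set.seed(2010)                      # ASSUMPTION: arbitrary seed for reproducibility

n_schools <- 100
n_pupils  <- 20                     # ASSUMPTION: balanced design, 2000 / 100
n_total   <- n_schools * n_pupils

# Parameters quoted in the Details section
b0        <- 3.56                   # intercept
b_sex     <- 0.84                   # fixed effect of sex
b_texp    <- 0.093                  # fixed effect of teacher experience
sd_school <- 0.69                   # school-level random intercept SD
sd_resid  <- 0.68                   # residual SD

school <- rep(seq_len(n_schools), each = n_pupils)
pupil  <- rep(seq_len(n_pupils), times = n_schools)

# ASSUMPTION: predictor distributions are illustrative only
sex         <- rbinom(n_total, size = 1, prob = 0.5)      # 0 = boy, 1 = girl
texp_school <- sample(2:25, n_schools, replace = TRUE)    # one value per school
texp        <- as.numeric(texp_school[school])

u <- rnorm(n_schools, mean = 0, sd = sd_school)           # school intercepts
e <- rnorm(n_total,   mean = 0, sd = sd_resid)            # pupil-level residuals

# Continuous score: fixed part + school effect + residual
y <- b0 + b_sex * sex + b_texp * texp + u[school] + e

# Round to the nearest integer, then clamp to [0, 10]
popular <- pmin(pmax(round(y), 0), 10)

sim <- data.frame(pupil, school, popular, sex, texp)
```

The resulting sim data frame has the same columns as pop_syn, so it can serve as a stand-in when checking code against this structure. Values will differ from pop_syn because the predictor distributions and seed here are illustrative.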