A synthetic two-level dataset with pupils nested within schools, generated to mimic the structure and parameters of the popular dataset from Hox (2010). It can be used to demonstrate multilevel bootstrap methods without depending on external data sources.

Usage

pop_syn

Format

A data frame with 2000 rows and 5 variables:

pupil

Pupil identification number within school (integer).

school

School identification number, 1–100 (integer).

popular

Pupil popularity score on a 0–10 scale (numeric).

sex

Pupil sex: 0 = boy, 1 = girl (integer).

texp

Teacher experience in years (numeric).

Source

Simulated from parameters estimated from the popular dataset in: Hox, J. J. (2010). Multilevel analysis: Techniques and applications (2nd ed.). Routledge. Data available at https://stats.oarc.ucla.edu/stat/stata/examples/mlm_ma_hox/popular.dta.

Details

The dataset was generated by fitting a two-level random intercept model to the original popular data and simulating new data from the estimated parameters:

  • Fixed effects: intercept = 3.56, sex = 0.84, texp = 0.093

  • School-level random intercept SD: 0.69

  • Residual SD: 0.68

The simulated continuous scores were rounded to the nearest integer and clamped to the [0, 10] range to match the original Likert-type scale.
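The generation steps described above can be sketched in base R as follows. This is an illustrative reconstruction, not the code used to build pop_syn. The seed, the balanced design of 20 pupils per school, and the distributions of sex and texp are all assumptions made for the example, and `sim` is a hypothetical object name. Only the fixed effects, the two standard deviations, and the round-and-clamp step come from the description.

```r
# Hypothetical sketch: NOT the code used to build pop_syn.
set.seed(2010)                      # arbitrary seed (assumption)

n_schools <- 100
n_pupils  <- 20                     # assumption: balanced design, 100 * 20 = 2000 rows
n_total   <- n_schools * n_pupils

# Parameters quoted in the Details section
b0        <- 3.56
b_sex     <- 0.84
b_texp    <- 0.093
sd_school <- 0.69
sd_resid  <- 0.68

# Identifiers: pupils numbered within school
school <- rep(seq_len(n_schools), each = n_pupils)
pupil  <- rep(seq_len(n_pupils), times = n_schools)

# Assumed predictor distributions (not documented in the source)
sex         <- rbinom(n_total, size = 1, prob = 0.5)      # 0 = boy, 1 = girl
texp_school <- sample(2:25, n_schools, replace = TRUE)    # one value per school
texp        <- as.numeric(texp_school[school])

# Random effects: one intercept per school, one residual per pupil
u <- rnorm(n_schools, mean = 0, sd = sd_school)
e <- rnorm(n_total,   mean = 0, sd = sd_resid)

# Continuous score from the random intercept model
latent <- b0 + b_sex * sex + b_texp * texp + u[school] + e

# Round to the nearest integer, then clamp to the [0, 10] range
popular <- pmin(pmax(round(latent), 0), 10)

sim <- data.frame(pupil = pupil, school = school, popular = popular,
                  sex = sex, texp = texp)
```

Calling `str(sim)` should show a data frame with the same five columns and types listed under Format. Because rounding and clamping discard some information, a model refitted to data generated this way will typically recover the parameters only approximately.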