Comparison of the Accuracy of Stratified Random Sampling and Simple Random Sampling Methods in National Assessment (AN)
DOI: https://doi.org/10.59188/eduvest.v5i6.51460
Keywords: standard error, mean square error, national assessment, simple random sampling, stratified random sampling
Abstract
Sampling methods are crucial for large-scale assessments. International surveys such as PISA, TIMSS, and PIRLS use stratified random sampling (StRS) to improve estimation accuracy, ensure that all subpopulations are represented, and keep administration efficient. Similarly, Indonesia's National Assessment (AN) applies StRS, dividing populations by school size, class size, and gender. However, the accuracy of the AN sampling method, including its reliability and validity, has not been tested since its implementation in 2021. This study compares the reliability and validity of the AN sampling method with those of simple random sampling (SRS). Reliability is assessed by the consistency of estimates across repeated sampling, indicated by small standard errors (SE) and narrow confidence intervals (CI). Validity, meaning how accurately sample estimates reflect population parameters, is evaluated through mean square error (MSE). Using AN data from 1.9 million of the 4.2 million junior high school students, the analysis shows no significant differences between StRS and SRS in national population parameters. Both methods produce similar mean estimates (55) and standard deviations (10.7). However, StRS shows greater variability in weights, reflecting its ability to account for the sampling structure. At the school level, StRS outperforms SRS, yielding narrower CI and MSE ranges, which indicates superior reliability. Although the MSE differences are statistically significant, their practical impact is minor because the effect size is small and the dataset is large. These results suggest that StRS is more reliable for school-level reporting.
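The comparison described in the abstract, repeated sampling under each design followed by SE and MSE of the resulting estimates, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the three strata, their sizes and means, the sample size, and the function names are hypothetical and are not taken from the AN data or from the study's actual procedure. The sketch uses proportional allocation for StRS, which may differ from the allocation used in AN.

```python
import random
import statistics


def make_population(rng, strata_sizes, strata_means, sd):
    """Build a synthetic population: stratum label -> list of scores."""
    return {
        h: [rng.gauss(strata_means[h], sd) for _ in range(size)]
        for h, size in strata_sizes.items()
    }


def srs_estimate(rng, all_scores, n):
    """Mean of one simple random sample of size n (without replacement)."""
    return statistics.fmean(rng.sample(all_scores, n))


def strs_estimate(rng, population, n):
    """Weighted mean from one stratified sample with proportional allocation."""
    total = sum(len(scores) for scores in population.values())
    estimate = 0.0
    for scores in population.values():
        weight = len(scores) / total          # stratum share of the population
        n_h = max(1, round(n * weight))       # sample size drawn in this stratum
        estimate += weight * statistics.fmean(rng.sample(scores, n_h))
    return estimate


def compare(reps=500, n=200, seed=0):
    """Repeat both designs and return the empirical SE and MSE of each."""
    rng = random.Random(seed)
    # Hypothetical strata (for example, by school size); not real AN values.
    sizes = {"small": 2000, "medium": 5000, "large": 3000}
    means = {"small": 45.0, "medium": 55.0, "large": 65.0}
    population = make_population(rng, sizes, means, sd=8.0)
    all_scores = [x for scores in population.values() for x in scores]
    true_mean = statistics.fmean(all_scores)

    designs = {
        "SRS": lambda: srs_estimate(rng, all_scores, n),
        "StRS": lambda: strs_estimate(rng, population, n),
    }
    results = {}
    for name, draw in designs.items():
        estimates = [draw() for _ in range(reps)]
        results[name] = {
            # SE: spread of the estimates across repeated samples
            "se": statistics.stdev(estimates),
            # MSE: average squared distance from the known population mean
            "mse": statistics.fmean((e - true_mean) ** 2 for e in estimates),
        }
    return results


if __name__ == "__main__":
    for design, stats in compare().items():
        print(design, stats)
```

Because the hypothetical strata differ in their means, the stratified design removes the between-stratum component of the sampling variance, so its SE and MSE should come out smaller than those of SRS in this toy setting. How large the gain is in practice depends on how strongly the stratification variables relate to the outcome.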