IQB National Assessment Study 2012 (IQB-LV 2012)



Table of contents

Project description

Blank data sets


Notes on the use of the data




Data Set Published on: 01.12.2015
Version: v4
Current Version Available Since: 24.06.2019
Survey Period: 2012
Sample: Students in grade 9 (N=44,500); Classes (N=2,109); Schools (N=1,326)
Survey Unit: Principals
Measured Competencies: Mathematics, Biology, Chemistry, Physics
Region: Germany, Baden-Wuerttemberg, Bavaria, Berlin, Brandenburg, Bremen, Hamburg, Hesse, Mecklenburg-Western Pomerania, Lower Saxony, North Rhine-Westphalia, Rhineland-Palatinate, Saarland, Saxony, Saxony-Anhalt, Schleswig-Holstein, Thuringia
Principal Investigators: Pant, Prof. Dr. Hans Anand; Stanat, Prof. Dr. Petra
Data Producers: Institut zur Qualitätsentwicklung im Bildungswesen (IQB)
Funded by: Standing Conference of the Ministers of Education and Cultural Affairs of the Länder in the Federal Republic of Germany
Related Studies: IQB-BT 2018 (DOI: 10.5159/IQB_BT_2018_v1), PISA 2012 (DOI: 10.5159/IQB_PISA_2012_v5)
Suggested Citation: Pant, H. A., Stanat, P., Hecht, M., Heitmann, P., Jansen, M., Lenski, A. E., Penk, C., Pöhlmann, C., Roppelt, A., Schroeders, U., & Siegle, T. (2015). IQB-Ländervergleich Mathematik und Naturwissenschaften 2012 (IQB-LV 2012) [IQB National Assessment Study 2012 (IQB-LV 2012)] (Version 4) [Data set]. Berlin: IQB – Institut zur Qualitätsentwicklung im Bildungswesen.
Restriction Notice: Cognitive abilities must not be used as a dependent variable in the analyses.

Users of the data set should always cite the scale manual:

Lenski, A. E., Hecht, M., Penk, C., Milles, F., Mezger, M., Heitmann, P., Stanat, P., & Pant, H. A. (2016). IQB-Ländervergleich 2012. Skalenhandbuch zur Dokumentation der Erhebungsinstrumente. [IQB National Assessment Study 2012 scaling manual. Documentation of the survey instruments]. Berlin: Humboldt-Universität zu Berlin, Institut zur Qualitätsentwicklung im Bildungswesen.


Project description

The National Assessment Study in Mathematics and Science 2012 (IQB Ländervergleich Mathematik und Naturwissenschaften 2012) is a nationwide large-scale assessment commissioned by the Standing Conference of the Ministers of Education and Cultural Affairs of the Länder (KMK) in the Federal Republic of Germany. The study was designed to assess students’ achievement in mathematics, biology, chemistry and physics and to evaluate the extent to which they meet the educational standards in the German Länder. About 44,500 grade 9 students participated in the study. In mathematics, all general competencies and content areas were tested and averaged to obtain a global score of mathematics proficiency. In science, by contrast, only the two competence areas “content knowledge” and “scientific inquiry” were assessed for biology, chemistry and physics (see also Educational standards by subject). In addition to the achievement tests, the study used questionnaires for students, teachers and principals to assess curricular and extracurricular learning opportunities and to investigate which structures can be used to optimize learning processes. Indicators of the students’ reading abilities and basic cognitive abilities were also assessed. (IQB)


Blank data sets

For a first overview of the data sets and their variables, dummy data sets containing the variables used and the value labels relating to them are provided for download here.




Notes on the use of the data

Are the competence estimators of the PISA, IGLU and IQB studies comparable with each other?

In principle, the achievement tests used in German large-scale assessment studies (PISA, IGLU and IQB studies) correlate highly, but the underlying competence models differ. The IQB tests are based on the educational standards of the Standing Conference of the Ministers of Education and Cultural Affairs of the Länder in the Federal Republic of Germany (Kultusministerkonferenz, KMK) and are therefore more closely aligned with the German school curriculum than the PISA tests.

Comparability can be tested using IRT methods based on studies in which both PISA and IQB items were used. Some studies for comparison are, for example

The extent of comparability must be considered separately for reading and mathematical literacy and for secondary and primary education. Although it can be assumed that differences between federal states are captured well by both measures, it is unfortunately not possible to analyse trends on a common metric.

How many classes per school are included in the sample in the IQB studies?

In the IQB studies, one class per school is usually included in the sample. Exceptions are made for some federal states and for some types of schools (e.g. special education schools). Information on sampling in the studies can be found in the results reports or scale manuals.

Here is a brief summary of the sampling procedure:

  • National Assessment Study 2008/2009: In regular schools, one 9th grade class per school was included; the entire class took part in the test. Special education schools were not part of the sample.
  • National Assessment Study 2011: In regular schools, one 4th grade class per school was included; the entire class took part in the test. At special education schools, all students in 4th grade with special needs in the area of learning, language, or emotional and social development participated across all classes.
  • National Assessment Study 2012: In grammar schools ("Gymnasium"), one 9th grade class per school was included in the study; in other school types (with the exception of special education schools), two classes per school (if available) were included. The entire classes took part in the test. At special education schools, all students in 9th grade with special needs in the area of learning, language, or emotional and social development participated across all classes.
  • IQB Trends in Student Achievement 2015: In regular schools, one 9th grade class per school was included in the sample; the entire class took part in the test. At special education schools, all students in 9th grade with special needs in the area of learning, language, or emotional and social development participated in the study.
  • IQB Trends in Student Achievement 2016: In regular schools, one 4th grade class per school was included; the entire class took part in the test. At special education schools, all students in 4th grade with special needs in the area of learning, language, or emotional and social development participated across all classes.
  • IQB Trends in Student Achievement 2018: In grammar schools ("Gymnasium"), one 9th grade class per school was included in the study; in other school types (with the exception of special education schools), two classes per school (if available) were included. The entire classes took part in the test. At special education schools, all students in 9th grade with special needs in the area of learning, language, or emotional and social development participated across all classes.

Is it possible to establish links between students and teachers via the link data set?

The information on the subjects taught in the link data set was collected with the following question in the teachers' questionnaire: "Do you teach the classes/courses in the subjects mathematics, biology, chemistry, physics or natural sciences?" This information alone does not make it possible to determine beyond doubt whether a teacher taught a specific student; in fact, some students were taught by several teachers. An unambiguous assignment is possible via the course names in the teacher data set (variable names: luntflvteil01_1_FDZ to luntflvteil24_1_FDZ) and the student data set (variable names: tkursdiffdeu_FDZ, tkursdiffmat_FDZ, tkursdiffbio_FDZ, tkursdiffche_FDZ, tkursdiffphy_FDZ, tkursdiffnwi_FDZ). These variables can be used via remote access. Even so, not all students and teachers can be matched unambiguously. On this challenge, see chapter 12 (especially subchapter 12.6) of the report on the study, which is available online (in German only).
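The matching idea described above can be sketched roughly as follows. This is only an illustration under several assumptions: the identifiers `idclass` and `idteach` are hypothetical placeholders (the actual ID variable names are not given here), and the sketch assumes that course names are coded identically in the teacher and student data sets, which would need to be checked against the scale manual.

```python
# Course-name variables named in the text above.
TEACHER_COURSE_VARS = [f"luntflvteil{i:02d}_1_FDZ" for i in range(1, 25)]
STUDENT_COURSE_VARS = [
    "tkursdiffdeu_FDZ", "tkursdiffmat_FDZ", "tkursdiffbio_FDZ",
    "tkursdiffche_FDZ", "tkursdiffphy_FDZ", "tkursdiffnwi_FDZ",
]


def course_names(record, variables):
    """Collect the non-missing course names stored in the given variables."""
    return {record[v] for v in variables if record.get(v) not in (None, "")}


def candidate_teachers(student, teachers, class_key="idclass", teacher_key="idteach"):
    """Return IDs of teachers in the student's class who share a course name.

    class_key and teacher_key are hypothetical names, not documented variables.
    """
    wanted = course_names(student, STUDENT_COURSE_VARS)
    hits = []
    for teacher in teachers:
        if teacher.get(class_key) != student.get(class_key):
            continue  # only compare teachers and students of the same class
        if wanted & course_names(teacher, TEACHER_COURSE_VARS):
            hits.append(teacher[teacher_key])
    return hits
```

A student for whom this returns more than one teacher, or none, is one of the cases that cannot be assigned unambiguously.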

What is the reliability of the scales BEFKI figural (wle.gff), C-test (wle.ctest) and highest ISEI of the family (HISEI)?

The WLE-reliabilities of the BEFKI (figural) and C-test scales are: BEFKI: 0.701; C-test: 0.884. Unfortunately, we cannot report a reliability coefficient for the highest ISEI in the family (HISEI), since this indicator was recorded via only one item.

How do I deal with missing values in multi-level models?

We recommend, as described in Chapter 10 of the report on the study, using the Full Information Maximum Likelihood (FIML) approach to deal with missing values in multi-level models. The IQB analyses first used the plausible values (PVs) available in this data set and only then applied FIML in Mplus.

Further methodological notes on dealing with missing values in multi-level models can be found in the following publications:

  • Grund, S., Lüdtke, O., & Robitzsch, A. (2018). Multiple imputation of missing data for multilevel models: Simulations and recommendations. Organizational Research Methods. doi: 10.1177/1094428117703686
  • Lüdtke, O., Robitzsch, A., & Grund, S. (2017). Multiple imputation of missing data in multilevel designs: A comparison of different strategies. Psychological Methods, 22, 141–165. doi: 10.1037/met0000096
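When an analysis is run once per plausible value, the per-PV results are commonly combined with Rubin's rules. The sketch below illustrates that pooling step only; it is not the FIML procedure recommended above and not necessarily the exact procedure used in the IQB analyses. The function name and inputs are illustrative assumptions.

```python
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool one parameter estimated separately on each plausible value.

    estimates:  point estimates, one per plausible value
    std_errors: the corresponding standard errors
    Returns (pooled estimate, pooled standard error) following Rubin's rules.
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need at least two estimates and matching standard errors")
    pooled = mean(estimates)
    within = mean(se ** 2 for se in std_errors)  # average sampling variance
    between = variance(estimates)                # variance across PVs (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total ** 0.5
```

For this data set, `estimates` would hold 15 values, one for each plausible value.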

Of the more than 44,000 students in the sample, about 40% have no performance data. What are the reasons for this?

The high percentage of missing values is due to the fact that not all students were presented with all competency tests; instead, a multiple matrix sampling design was used. The missing values are therefore largely "missing by design". More information on the test design can be found in Chapter 4 and Chapter 13 of the report on this study (in German only).

Is it possible to record the age of students (to the day) in the IQB studies?

Information on the year and month of birth of students is collected as standard in the IQB studies and is available for re-analyses and secondary analyses of the data. However, for reasons of data protection, the exact date of birth was not recorded and is not available in the data sets. The exact test date is also not included in most data sets. Often, however, the data sets contain an age variable that was calculated from the year and month of birth relative to the test date.
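As a rough illustration of how an age variable can be derived from year and month of birth, the following sketch computes age in years with monthly resolution. The exact formula used for the age variables in the data sets is not documented here, and the test year and month are assumed inputs.

```python
def age_in_years(birth_year, birth_month, test_year, test_month):
    """Age at testing in years, resolved to whole months (no day information)."""
    for month in (birth_month, test_month):
        if not 1 <= month <= 12:
            raise ValueError("month must be between 1 and 12")
    months = (test_year - birth_year) * 12 + (test_month - birth_month)
    return months / 12
```

For example, a student born in March 1997 and tested in June 2012 would be assigned an age of 15.25 years.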

What are the new variables in the students' data set in version 4?

In the fourth version, new variables on learning time in biology, chemistry, physics and natural sciences have been added to the students' data set. These are the number of learning hours per semester from the 5th to the 9th grade in each tested subject. In addition, the variables on the cumulative number of learning hours per week in the school years from grade 5 to grade 9 in each tested subject have been updated. Furthermore, the variables on the level of competence in mathematics, biology, chemistry and physics have been corrected.

Here you can find an overview of the new variables:

Learning Time

Biology: pstdbio051.r, pstdbio052.r, pstdbio061.r, pstdbio062.r, pstdbio071.r, pstdbio072.r, pstdbio081.r, pstdbio082.r, pstdbio091.r, pstdbio092.r

Chemistry: pstdche051.r, pstdche052.r, pstdche061.r, pstdche062.r, pstdche071.r, pstdche072.r, pstdche081.r, pstdche082.r, pstdche091.r, pstdche092.r

Physics: pstdphy051.r, pstdphy052.r, pstdphy061.r, pstdphy062.r, pstdphy071.r, pstdphy072.r, pstdphy081.r, pstdphy082.r, pstdphy091.r, pstdphy092.r

Natural Sciences: pstdnws051.r, pstdnws052.r, pstdnws061.r, pstdnws062.r, pstdnws071.r, pstdnws072.r, pstdnws081.r, pstdnws082.r, pstdnws091.r, pstdnws092.r


Here you can find an overview of the updated variables:

Learning Time

lzbio, lzche and lzphy

Competency Levels

Mathematics (global scale): pv_1_GL_stufe - pv_15_GL_stufe

Biology (content knowledge): pv_1_BF_stufe - pv_15_BF_stufe

Biology (scientific inquiry): pv_1_BE_stufe - pv_15_BE_stufe

Chemistry (content knowledge): pv_1_CF_stufe - pv_15_CF_stufe

Chemistry (scientific inquiry): pv_1_CE_stufe - pv_15_CE_stufe

Physics (content knowledge): pv_1_PF_stufe - pv_15_PF_stufe

Physics (scientific inquiry): pv_1_PE_stufe - pv_15_PE_stufe
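The variable names listed above follow regular patterns, so they can be generated rather than typed by hand. The sketch below reproduces the names exactly as listed; the helper function names are illustrative only.

```python
def learning_time_vars(subject):
    """Per-semester learning-time variables for grades 5 to 9.

    subject is one of the abbreviations used above: "bio", "che", "phy", "nws".
    """
    return [
        f"pstd{subject}{grade:02d}{semester}.r"
        for grade in range(5, 10)
        for semester in (1, 2)
    ]


def competency_level_vars(scale, n_pv=15):
    """Competency-level variables for one scale, e.g. "GL", "BF" or "PE"."""
    return [f"pv_{i}_{scale}_stufe" for i in range(1, n_pv + 1)]


SCALES = ["GL", "BF", "BE", "CF", "CE", "PF", "PE"]
ALL_LEVEL_VARS = [name for scale in SCALES for name in competency_level_vars(scale)]
```

`learning_time_vars("bio")` yields the ten biology variables from pstdbio051.r to pstdbio092.r, and `ALL_LEVEL_VARS` contains the 105 competency-level variables (7 scales with 15 plausible values each).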

How are the data sets of the National Assessment Study 2012 linked to PISA 2012?

The data sets of the National Assessment Study 2012 can be linked with the data sets of the 9th graders in the PISA 2012 study, which are also available at the FDZ of the IQB. The classes of the PISA 2012 study (n = 9,998; two classes per school) participated in the competence testing of the IQB on the second day of testing. The data sets are linked via the variable idstud_FDZ.

Unfortunately, it is not possible to link the other PISA waves with the data of the IQB studies because the ID variables cannot be recoded uniformly.
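A minimal sketch of the linkage via idstud_FDZ, assuming both data sets have been read into lists of dictionaries (one per student). Only idstud_FDZ is taken from the text; the function name and the "pisa_" prefix for the PISA columns are illustrative choices to avoid name collisions.

```python
def link_on_student_id(lv_rows, pisa_rows, key="idstud_FDZ"):
    """Inner join of IQB-LV 2012 rows with PISA 2012 rows on the student ID."""
    pisa_by_id = {row[key]: row for row in pisa_rows}
    linked = []
    for row in lv_rows:
        match = pisa_by_id.get(row[key])
        if match is None:
            continue  # student did not take part in PISA 2012
        merged = dict(row)
        for name, value in match.items():
            if name != key:
                merged[f"pisa_{name}"] = value  # prefix marks the PISA origin
        linked.append(merged)
    return linked
```

Only students present in both data sets are returned, which matches the situation that just the PISA classes took part in both studies.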



Selected literature is listed here (PDF; as of September 2023).

The Ländervergleich 2012 report (in German only), a summary (English and German versions) and more information can be found here (PDF).


Sachse, K. A., Weirich, S., Mahler, N. & Rjosk, C. (2023). Explaining performance decline over the course of taking comprehensive proficiency tests: the roles of effort and omission propensity. International Journal of Testing.


Lenz, S., Stanat, P. & Rjosk, C. (2022). Schulische Segregation und ihre Veränderung im Zuge von Schulstrukturreformen in Berlin, Bremen und Hamburg. ZSE : Zeitschrift für Soziologie der Erziehung und Sozialisation, 42(1), 54–72.

Schneider, R., Gentrup, S., Jansen, M. & Stanat, P. (2022). Kohortentrends in schulfachbezogenen Selbstkonzepten und Interessen bei Mädchen und Jungen. Zeitschrift für Pädagogische Psychologie, 50, 182.

Schotte, K., Rjosk, C., Edele, A., Hachfeld, A. & Stanat, P. (2022). Do teachers’ cultural beliefs matter for students’ school adaptation? A multilevel analysis of students’ academic achievement and psychological school adjustment. Social Psychology of Education, 25(1), 75–112.

Winkler, O., Jansen, M. & Edele, A. (2022). Warum gibt es in Ostdeutschland weniger einwanderungsbezogene Bildungsungleichheit? Bedingungen der Bildungsbeteiligung und Lesekompetenz von Heranwachsenden mit Einwanderungsgeschichte in Ost- und Westdeutschland. Zeitschrift für Soziologie, 51(2), 131–153.



Grewenig, E. (2021). School Track Decisions and Teacher Recommendations: Evidence from German State Reforms (ifo Working Papers 353). München: ifo Institut - Leibniz-Institut für Wirtschaftsforschung an der Universität München (ifo Institut).

Milles, F. & Jansen, M. (2021). Die Bedeutung von Unterrichtsmerkmalen für das mathematische Selbstkonzept und für die Moderation des Big-Fish-Little-Pond Effekts. In R. Lazarides & D. Raufelder (Hrsg.), Motivation in unterrichtlichen fachbezogenen Lehr-Lernkontexten. Perspektiven aus Pädagogik, Psychologie und Fachdidaktiken (Springer eBook Collection, Bd. 10, 1. Aufl., S. 299–329). Wiesbaden: Springer Fachmedien.

Müller, M. (2021). Perspektiven für die dritte Phase: Eine Analyse des Fortbildungsverhaltens von Lehrkräften in Baden-Württemberg. Universität Tübingen.


Edele, A., Jansen, M., Schachner, M. K., Schotte, K., Rjosk, C. & Radmann, S. (2020). School track and ethnic classroom composition relate to the mainstream identity of adolescents with immigrant background in Germany, but not their ethnic identity. International Journal of Psychology : Journal International De Psychologie, 55(5), 754–768.

Kuschel, J., Richter, D. & Lazarides, R. (2020). Wie relevant ist die gesetzliche Fortbildungsverpflichtung für Lehrkräfte? Eine empirische Untersuchung zur Fortbildungsteilnahme in verschiedenen deutschen Bundesländern. Zeitschrift für Bildungsforschung, 38(4), 915.

Müller, M., Baust, C., Fleck, P. B., Werner-Neumann, E., Pachner, A. & Schmidt-Hertha, B. (2020). Leitlinien für die universitäre Lehrerfort- und -weiterbildung. Zeitschrift Hochschule und Weiterbildung (ZHWB), 2020(2), 52–58.


Bergbauer, A. B. (2019). Conditions and consequences of education - microeconometric analyses - Dissertation. Ludwig-Maximilians-Universität München, München.

Jansen, M., Schroeders, U., Lüdtke, O. & Marsh, H. W. (2019). The dimensional structure of students’ self-concept and interest in science depends on course composition. Learning and Instruction, 60(4), 20–28.

Krohmer, K. (2019). Explanatory missing propensity models as an instrument for item evaluation - Unveröffentlichte Masterarbeit. Otto-Friedrich-Universität Bamberg, Bamberg.

Lenski, A. E., Richter, D. & Lüdtke, O. (2019). Using the theory of planned behavior to predict teachers’ likelihood of taking a competency-based approach to instruction. European Journal of Psychology of Education, 34(1), 169–186.


Autorengruppe Bildungsberichterstattung. (2018). Bildung in Deutschland 2018. Ein indikatorengestützter Bericht mit einer Analyse zu Bildung und Migration. Bielefeld: wbv.

Jores, D. & Leiss, L. M. (2018). Die Kontroversen um PISA und Co. sowie eine Untersuchung zur Einstellung von Lehrkräften und Schulleitern zu Large Scale Assessments - Unveröffentlichte Bachelorarbeit. Johannes-Gutenberg-Universität Mainz, Mainz.

Milles, F. (2018). Das Selbstkonzept im Fach Mathematik - Effekte von Unterrichtsmerkmalen und Moderation des big-fish-little-pond Effekts - Unveröffentlichte Masterarbeit. Humboldt-Universität zu Berlin, Berlin.

Pluschnikov, M., Gianneres, S. & Sicking, T. (2018). Eine empirische Analyse des einschulungsbedingten Geburtsmonatseffektes auf die kognitive Leistungsfähigkeit und den beruflichen Erfolg in Deutschland - Unveröffentlichte Seminararbeit. Westfälische Wilhelms-Universität Münster, Münster.

Richter, E., Richter, D. & Marx, A. (2018). Was hindert Lehrkräfte an Fortbildungen teilzunehmen? Eine empirische Untersuchung der Teilnahmebarrieren von Lehrkräften der Sekundarstufe I in Deutschland. Zeitschrift für Erziehungswissenschaft, 21(5), 1021–1043.

Schmitt, N. (2018). Kompetenzorientierter Unterricht und eingesetzte Schulbücher im Fach Mathematik - Unveröffentlichte Bachelorarbeit. Johannes-Gutenberg-Universität Mainz, Mainz.


Wurster, S., Richter, D. & Lenski, A. E. (2017). Datenbasierte Unterrichtsentwicklung und ihr Zusammenhang zur Schülerleistung. Zeitschrift für Erziehungswissenschaft, 20(4), 628–650.


Autorengruppe Bildungsberichterstattung. (2016). Bildung in Deutschland 2016. Ein indikatorengestützter Bericht mit einer Analyse zu Bildung und Migration. Bielefeld: Bertelsmann.

Huebener, M., Kuger, S. & Marcus, J. (2016). Increased Instruction Hours and the Widening Gap in Student Performance (1st ed.) (DIW Discussion Paper 1561). Berlin: Deutsches Institut für Wirtschaftsforschung.

Jansen, M., Lüdtke, O. & Schroeders, U. (2016). Evidence for a positive relation between interest and achievement: Examining between-person and within-person variation in five domains. Contemporary Educational Psychology, 46, 116–127.

Jansen, M. & Stanat, P. (2016). Achievement and Motivation in Mathematics and Science: The Role of Gender and Immigration Background. International Journal of Gender, Science and Technology, 8, 4–18.

Krohmer, K. (2016). Effekte der sprachlichen Komplexität von Mathematiktestaufgaben für Schülerinnen und Schüler mit niedrigem sozioökonomischen Status - Unveröffentlichte Bachelorarbeit. Freie Universität Berlin, Berlin.

Lenski, A. E., Hecht, M., Penk, C., Milles, F., Mezger, M., Heitmann, P., Stanat, P. & Pant, H. A. (2016). IQB-Ländervergleich 2012. Skalenhandbuch zur Dokumentation der Erhebungsinstrumente. Berlin: Humboldt-Universität zu Berlin, Institut zur Qualitätsentwicklung im Bildungswesen.

Weirich, S., Hecht, M., Penk, C., Roppelt, A. & Böhme, K. (2016). Item Position Effects Are Moderated by Changes in Test-Taking Effort. Applied Psychological Measurement, 41(2), 115–129.

Winkler, T. (2016). Soziale Herkunft und schulischer Erfolg: Eine empirische Studie zur moderierenden Wirkung bildungspolitischer Maßnahmen - Unveröffentlichte Bachelorarbeit.


Hecht, M., Weirich, S., Siegle, T. & Frey, A. (2015). Effects of design properties on parameter estimation in large-scale assessments. Educational and Psychological Measurement, 75(6), 1021–1044.

Hecht, M., Weirich, S., Siegle, T. & Frey, A. (2015). Modeling booklet effects for nonequivalent group designs in large-scale assessment. Educational and Psychological Measurement, 75(4), 568–584.

Jansen, M., Schroeders, U., Lüdtke, O. & Marsh, H. W. (2015). Contrast and assimilation effects of dimensional comparisons in five subjects: An extension of the I/E model. The Journal of Educational Psychology, 107(4), 1086–1101.

Pant, H. A., Stanat, P., Hecht, M., Heitmann, P., Jansen, M., Lenski, A. E., Penk, C., Pöhlmann, C., Roppelt, A., Schroeders, U. & Siegle, T. (2015). IQB-Ländervergleich in Mathematik und den Naturwissenschaften 2012 (IQB-LV 2012) (Version 4) [Datensatz]. Berlin: IQB - Institut zur Qualitätsentwicklung im Bildungswesen.

Penk, C. & Schipolowski, S. (2015). Is it all about value? Bringing back the expectancy component to the assessment of test-taking motivation. Learning and Individual Differences, 42, 27–35.

Schroeders, U., Schipolowski, S. & Wilhelm, O. (2015). Age-related changes in the mean and covariance structure of fluid and crystallized intelligence in childhood and adolescence. Intelligence, 48, 15–29.


Jansen, M., Schroeders, U., Lüdtke, O. & Pant, H. A. (2014). Interdisziplinäre Beschulung und die Struktur des akademischen Selbstkonzepts in den naturwissenschaftlichen Fächern. Zeitschrift für Pädagogische Psychologie, 28(1-2), 43–49.

Van den Ham, A.-K., Nissen, A., Ehmke, T., Sälzer, C. & Roppelt, A. (2014). Mathematische Kompetenz in PISA, IQB-Ländervergleich und NEPS - Drei Studien, gleiches Konstrukt? Unterrichtswissenschaft, 42(4), 321–341.


Pant, H. A., Stanat, P., Schroeders, U., Roppelt, A., Siegle, T. & Pöhlmann, C. (Hrsg.). (2013). IQB-Ländervergleich 2012. Mathematische und naturwissenschaftliche Kompetenzen am Ende der Sekundarstufe I - Ergebnisbericht. Münster: Waxmann.
