Conference Agenda

Overview and details of the sessions of this conference. Please select a date or location to show only sessions on that day or at that location. Please select a single session for a detailed view (with abstracts and downloads, if available).

Please note that all times are shown in the time zone of the conference. The current conference time is: 17th May 2024, 05:01:32am GMT

 
 
Session Overview
Session
09 SES 16 A: Understanding Learning Outcomes and Equity in Diverse Educational Contexts
Time:
Friday, 25/Aug/2023:
1:30pm - 3:00pm

Session Chair: Kajsa Yang Hansen
Location: Gilbert Scott, EQLT [Floor 2]

Capacity: 120 persons

Paper Session

Presentations
09. Assessment, Evaluation, Testing and Measurement
Paper

Investigation of Factors Related to Immigrant Students' Mathematics Performance in PISA 2018

Ayse Akkir1, Serkan Arikan2

1Bogazici University, Turkiye; 2Bogazici University, Turkiye

Presenting Author: Akkir, Ayse

There is a large body of research on the performance gap between immigrant and native students (Arikan, Van de Vijver, & Yagmur, 2017; Martin, Liem, Mok, & Xu, 2012; Pivovarova & Powers, 2019; Rodríguez et al., 2020). It is therefore critical to identify variables that could be related to immigrant students' performance, and numerous studies have examined such variables. Some emphasize immigrants' resilience (Rodríguez et al., 2020), while others focus on exposure to bullying (Karakus, Courtney, & Aydin, 2022; Ponzo, 2013). Native students were found to score higher than immigrant students on three indicators of wellbeing: positive affect, self-efficacy/resilience, and a sense of belonging to the school (Rodríguez et al., 2020). Investigating student- and country-level factors that predict immigrant students' performance may assist policymakers in taking education-related action.

Thus, this study focuses on identifying student- and country-level variables that are associated with the mathematics performance of immigrant students, using PISA 2018 data. Student-level variables were chosen based on Walberg's theory of academic achievement (Walberg, 2004), according to which a student's success is shaped by their characteristics and their environment. The main psychological factors influencing academic achievement fall into three categories: student aptitude, instruction, and psychological environment. Student aptitude refers to a student's capacity, growth, drive, or predisposition for sustained perseverance in academic work. Instruction covers both the quantity and the quality of instructional time. Psychological environment refers to students' morale and their perceptions of their peers in the classroom and at home (Walberg, 2004). Country-level variables, on the other hand, were chosen based on prior research: the Migrant Integration Policy Index (MIPEX) has been found to be associated with achievement (Arikan et al., 2017; He et al., 2017), as has the Human Development Index (HDI) (Arikan et al., 2020).

The research questions of the current study are:

RQ1: Which student-level (motivation to master tasks, resilience, cognitive flexibility/adaptivity, exposure to bullying, sense of belonging, discriminating school climate, students’ attitudes toward immigrants) and country-level (MIPEX and HDI) variables could predict mathematics performance of immigrant students and native students across European countries in PISA 2018?

RQ2: Is there a statistically significant difference between the mathematics performance of first-generation immigrant students, second-generation immigrant students and native students across European countries in PISA 2018?

RQ3: Is there a statistically significant difference between the mathematics performance of first-generation immigrant students, second-generation immigrant students and native students after controlling for economic, social and cultural status (ESCS)?


Methodology, Methods, Research Instruments or Sources Used
Participants
Participants are immigrant students (first- and second-generation) and native students who took the PISA assessment in 2018. Students who were born in another country and whose parents were also born in another country are considered first-generation. Second-generation students are those who were born in the country of assessment but whose parents were born elsewhere. Native students are those with at least one parent born in the assessment country (OECD, 2019). Data from 14 European countries were included, namely Croatia, Estonia, Germany, Greece, Iceland, Ireland, Italy, Latvia, Malta, Portugal, Serbia, Slovenia, Spain, and Switzerland.

Measures
PISA not only measures the performance of students but also gathers data about their backgrounds through questionnaires. Student-level variables were chosen from the student questionnaires: motivation to master tasks, resilience, cognitive flexibility/adaptivity, exposure to bullying, sense of belonging, discriminating school climate, and students' attitudes toward immigrants. At the country level, the Migrant Integration Policy Index and the Human Development Index were used. Economic, social and cultural status (ESCS) was used as a control variable.

Data Analysis
To answer the first research question, multilevel regression analysis will be used to investigate which student-level and country-level variables predict the mathematics performance of immigrant and native students; the multilevel regression analyses will be run in MPLUS 7.4. For the second research question, independent samples t-tests will be used to compare the performance of immigrant and native students. The sample weights and plausible values will be included in the analyses, using the IDB Analyzer, to obtain unbiased results (Rutkowski, Gonzalez, Joncas, & Von Davier, 2010). To answer the third research question, propensity score matching will be performed first, and related comparisons will then examine the performance gap between immigrant and native students after controlling for economic, social and cultural status. The MatchIt R package (Ho, Imai, King, & Stuart, 2011) will be used for propensity score matching.
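The weighting-and-plausible-values logic that tools such as the IDB Analyzer apply can be sketched in a few lines. The following Python sketch uses entirely synthetic data (the sample size, score distribution, and weights are invented, not taken from PISA); a full analysis would also use the BRR replicate weights supplied with PISA to estimate sampling variance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for PISA columns: five plausible values (PVs) of
# mathematics performance and final student weights.
n = 1000
pvs = rng.normal(480, 95, size=(5, n))
weights = rng.uniform(0.5, 2.0, size=n)

def weighted_mean(x, w):
    return np.sum(w * x) / np.sum(w)

# Rubin's combining rules: run the estimate once per PV, average the
# per-PV results, and track the between-PV (imputation) variance, which
# is later added to the sampling variance.
estimates = np.array([weighted_mean(pv, weights) for pv in pvs])
point_estimate = estimates.mean()
imputation_var = estimates.var(ddof=1)
print(round(point_estimate, 1))
```

The same pattern (estimate per plausible value, then pool) applies to any statistic, not just means.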

Conclusions, Expected Outcomes or Findings
The intraclass correlation will be reported to partition the variation in immigrant students’ math performance by country-level and student-level differences. Moreover, R-square will be used to understand explained variances in mathematics performance by student-level and country-level variables of the current study. Then, student- and country-level variables that are significantly related to mathematics performance will be reported.
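The intraclass correlation mentioned above is simply the share of total score variance that lies between countries. A minimal sketch on synthetic data (all numbers invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(4)

# 14 synthetic "countries", each with its own mean score, plus
# student-level noise around that mean.
n_countries, n_per = 14, 200
country_means = rng.normal(480, 20, n_countries)
scores = np.concatenate([rng.normal(m, 90, n_per) for m in country_means])
groups = np.repeat(np.arange(n_countries), n_per)

# ICC = between-country variance / total variance.
between = np.var([scores[groups == g].mean() for g in range(n_countries)])
total = np.var(scores)
icc = between / total
print(round(icc, 3))
```

In a multilevel model this quantity is estimated from the fitted variance components rather than computed from raw group means, but the interpretation is the same.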

Multiple independent samples t-tests will be used to test whether statistically significant differences exist between the mathematics performance of first-generation immigrant students, second-generation immigrant students, and native students. First, the mathematics performance of first-generation immigrant students and native students will be compared; then second-generation immigrant students and native students; and finally first-generation and second-generation immigrant students. Confidence intervals, t-values, and effect sizes will be presented. Since the sample weights and plausible values had to be included in the analyses, ANOVA could not be used; the IDB Analyzer will be used because it accounts for sample weights and plausible values. Applying multiple t-tests increases the chance of a Type I error, so the Bonferroni adjustment will be used to lower the likelihood of false-positive findings. The adjustment divides the alpha level by the number of t-tests (Napierala, 2012): here, the alpha level (0.05) is divided by the number of t-tests (3).
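The Bonferroni adjustment described above is simple arithmetic:

```python
# Bonferroni adjustment for the three pairwise comparisons
# (first-gen vs native, second-gen vs native, first- vs second-gen).
alpha = 0.05   # overall significance level
n_tests = 3
adjusted_alpha = alpha / n_tests
print(round(adjusted_alpha, 4))  # each test is evaluated at 0.0167
```

Each pairwise t-test is then judged significant only if its p-value falls below the adjusted threshold.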

Propensity score matching will be used to investigate the performance difference between immigrant and native students after ESCS has been controlled for. Immigrant and native students will be matched on their economic, social and cultural status scores so that the groups are similar with regard to ESCS. The performance of immigrant and native students will then be compared using t-tests, and the effect sizes of the performance difference before and after matching will be compared.
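The matching step can be illustrated with a simplified nearest-neighbour sketch. This is not the MatchIt implementation the authors will use; it is a greedy 1:1 matching on an estimated propensity score, written in Python with synthetic data (group sizes and ESCS distributions are invented):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Synthetic illustration: immigrant students tend to have lower ESCS
# than native students before matching.
n_imm, n_nat = 200, 800
escs_imm = rng.normal(-0.5, 1.0, n_imm)
escs_nat = rng.normal(0.2, 1.0, n_nat)
X = np.concatenate([escs_imm, escs_nat]).reshape(-1, 1)
group = np.concatenate([np.ones(n_imm), np.zeros(n_nat)])  # 1 = immigrant

# Step 1: estimate propensity scores P(immigrant | ESCS).
ps = LogisticRegression().fit(X, group).predict_proba(X)[:, 1]

# Step 2: greedy 1:1 nearest-neighbour matching on the propensity
# score, without replacement.
imm_idx = np.where(group == 1)[0]
nat_pool = list(np.where(group == 0)[0])
matched_nat = []
for i in imm_idx:
    j = min(range(len(nat_pool)), key=lambda k: abs(ps[nat_pool[k]] - ps[i]))
    matched_nat.append(nat_pool.pop(j))

# The ESCS gap between the groups should shrink after matching.
gap_before = escs_imm.mean() - escs_nat.mean()
gap_after = escs_imm.mean() - X[matched_nat, 0].mean()
print(round(gap_before, 2), round(gap_after, 2))
```

Once the groups are balanced on ESCS, the remaining performance difference can be compared with the unmatched difference, as the authors propose.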

References
Arikan, S., Van de Vijver, F. J., & Yagmur, K. (2017). PISA mathematics and reading performance differences of mainstream European and Turkish immigrant students. Educational Assessment, Evaluation and Accountability, 29(3), 229-246.
Arikan, S., van de Vijver, F. J., & Yagmur, K. (2020). Mainstream and immigrant students’ primary school mathematics achievement differences in European countries. European Journal of Psychology of Education, 35(4), 819-837.
Ho, D., Imai, K., King, G., & Stuart, E. A. (2011). MatchIt: Nonparametric Preprocessing for Parametric Causal Inference. Journal of Statistical Software, 42(8), 1–28. doi:10.18637/jss
IEA (2022). Help Manual for the IEA IDB Analyzer (Version 5.0). Hamburg, Germany. (Available from www.iea.nl)
Karakus, M., Courtney, M., & Aydin, H. (2022). Understanding the academic achievement of the first-and second-generation immigrant students: A multi-level analysis of PISA 2018 data. Educational Assessment, Evaluation and Accountability, 1–46.
Martin, A. J., Liem, G. A., Mok, M., & Xu, J. (2012). Problem solving and immigrant student mathematics and science achievement: Multination findings from the Programme for International Student Assessment (PISA). Journal of Educational Psychology, 104(4), 1054.
Napierala, M. A. (2012). What is the Bonferroni correction? Aaos Now, 40-41.
OECD (2019), PISA 2018 Results (Volume III): What School Life Means for Students’ Lives, PISA, OECD Publishing, Paris, https://doi.org/10.1787/acd78851-en.
Pivovarova, M., & Powers, J. M. (2019). Generational status, immigrant concentration and academic achievement: comparing first and second-generation immigrants with third-plus generation students. Large- scale Assessments in Education, 7(1), 1-18.
Ponzo, M. (2013). Does bullying reduce educational achievement? An evaluation using matching estimators. Journal of Policy Modeling, 35(6), 1057–1078.
Rodríguez, S., Valle, A., Gironelli, L. M., Guerrero, E., Regueiro, B., & Estévez, I. (2020). Performance and well-being of native and immigrant students. Comparative analysis based on PISA 2018. Journal of Adolescence, 85, 96–105.
Rutkowski, L., Gonzalez, E., Joncas, M., & Von Davier, M. (2010). International large-scale assessment data: Issues in secondary analysis and reporting. Educational Researcher, 39(2), 142-151.
Walberg, H. J. (2004). Improving educational productivity: An assessment of extant research. The LSS Review, 3(2), 11-14.


09. Assessment, Evaluation, Testing and Measurement
Paper

Differences in Component Reading Skills Profiles for Native and Immigrant Fifteen-Year-Old Students in Sweden

Camilla Olsson, Monica Rosén

University of Gothenburg

Presenting Author: Olsson, Camilla

Today, societies in many countries are multilingual. Multilingualism can contribute to success in school and later in working life. An individual's language development affects their reading development (Kirsch et al., 2002). In a learning context such as school, reading skills are thus effective tools for obtaining, organizing, and using information in various fields (Artelt, Schiefele, & Schneider, 2001). Reading is a multi-component process (Grabe, 2009), and many students in middle school have difficulty moving from "learning to read" to "reading to learn". Fluency, prior knowledge, experience, and word knowledge are important, since students are expected to read about subjects unfamiliar to them, in which words and linguistic structures are more complex (Wharton-McDonald & Erickson, 2016). For a student who reads school material in a second language (L2), the reading process becomes even more complex. Grabe (2009) summarizes the major overall differences between reading in a first language (L1) and a second language (L2) as "linguistic and processing differences, developmental and educational differences and sociocultural and institutional differences" (Grabe, 2009, p. 130). Research has also shown that it takes at least four to five years before an individual can use their second language (L2) as a school language (Cummins, 2017; Thomas & Collier, 2002). In PISA 2018, students in Sweden performed above the OECD average on the reading literacy test. However, students with a foreign background, both those born in Sweden and those born abroad, performed at a lower level than native Swedish students (National Agency for Education, 2019). Educators in several countries have expressed concern about how education for first- and second-generation immigrant students is designed (Cummins, 2011).

In this study, PISA 2018 data were used to investigate the effects of two components on reading literacy performance for multilingual fifteen-year-old students in a Swedish context: the component defined in PISA as reading fluently (reading fluency) and the students' perception of the usefulness of reading strategies for memorizing and understanding texts (UNDREM; awareness of the usefulness of reading strategies). The aim is to gain a better understanding of similarities and differences in the students' component skills reading profiles (CSRP), defined in this study as learners' relative development of different reading subskills, between categories of students with different language backgrounds.

Two research questions were posed:

  • What is the relative importance of reading fluency and awareness of the usefulness of reading strategies for memorizing and understanding texts for overall reading performance among native, second-generation, and first-generation students?
  • Are there similarities and differences between the three categories of students regarding the effects of reading fluency and awareness of the usefulness of reading strategies on the processes of locating information, understanding, and evaluating and reflecting?

To analyse the results, the component skills approach to reading was used. In this approach, the overall multicomponent reading process is divided into lower-level and higher-level processes (Grabe, 2009; Koda, 2005). The approach can show whether and how these processes interact and how much each contributes, individually and collectively, to reading comprehension for both L1 and L2 readers (Grabe, 2009; Koda, 2005). In this study, reading fluency is assumed to be related to lower-level processes and awareness of the usefulness of reading strategies to higher-level processes. Both components are described in theory and research as important in the reading process (Grabe, 2009; Koda, 2005). Knowledge about differences and similarities in the students' component reading profiles can potentially inform the future development of reading instruction for multilingual students.


Methodology, Methods, Research Instruments or Sources Used
This study is based on a secondary analysis of data generated in PISA 2018, when reading was the main subject for the third time. In the Programme for International Student Assessment (PISA), students' knowledge of reading literacy, mathematics, and science is examined just before students are about to leave compulsory school (OECD, 2019). The Swedish sample consisted of 5,504 students, of whom 4,283 were native students, 556 were second-generation students, and 499 were first-generation students. The observed independent variables were the variable defined in PISA as "reading fluently" (reading fluency) and the students' self-reported index variable UNDREM (meta-cognition: understanding and remembering; reading strategies). Data preparation and management were performed in SPSS 28, and the analyses were carried out in Mplus Version 8 (Muthén & Muthén, 1998-2017). Weights were used according to common practice in the analysis of PISA data. To investigate the differences and similarities in the students' component reading profiles (defined in this study as learners' relative development of reading subskills) between the categories of students, five separate multigroup path analyses were conducted. In the first two analyses, each student's overall PISA score was used as the dependent variable. In models three to five, the scores on the processes measured in PISA (locating information, understanding, and evaluating and reflecting) were used as dependent variables. All 10 plausible values for each of the processes were used. In the first path analysis, the test-takers were divided into two groups: native students (born in Sweden) and first-generation students (born abroad with parents who were also born abroad). In the following models, the three categories of students (native, second generation, and first generation) were modelled separately.
Additionally, the relations between the students' reading fluency and awareness of the usefulness of reading strategies, on the one hand, and the various reading proficiency levels defined in the PISA assessment, on the other, were compared and visualized for the different categories of students.
Conclusions, Expected Outcomes or Findings
The results revealed significant differences between students born in Sweden and those born abroad in the effects on reading literacy performance of the lower-level process related to reading fluency (β = 0.434 for native and second-generation students; β = 0.631 for first-generation students) and the higher-level process related to awareness of the usefulness of reading strategies (β = 0.349 for native and second-generation students; β = 0.222 for first-generation students). In model two, where all three categories of students were included, the effects of the two components on reading literacy performance were almost similar for native students (β(RF) = 0.422, β(RS) = 0.346) and second-generation students (β(RF) = 0.491, β(RS) = 0.339), while for first-generation students the effect of reading fluency was much larger (β(RF) = 0.631, β(RS) = 0.222). When the relation between reading fluency and students' perceptions of the usefulness of reading strategies was compared with the proficiency levels defined in PISA, the results showed that first-generation students have a different distribution of higher- and lower-level processes up to proficiency level three (between 480 and 553 score points on the PISA test) than native and second-generation students. The results thus indicate that the groups of students have different component skills reading profiles and appear to rely on partly different processes at several reading proficiency levels. The patterns with regard to both reading fluency and awareness of the usefulness of reading strategies are similar for native and second-generation students but different for first-generation students, indicating that the relative importance of reading fluency and awareness of the usefulness of reading strategies differs for first-generation students.
References
Artelt, C., Schiefele, U., & Schneider, W. (2001). Predictors of reading literacy. European Journal of Psychology of Education, 26(3), 363.

Grabe, W. (2009). Reading in a second language: Moving from theory to practice. Cambridge: Cambridge University Press.

Kirsch, I., de Jong, J., La Fontaine, D., McQueen, J., Mendelovits, J., & Monseur, C. (2002). Reading for change: Performance and engagement across countries. Paris: Organisation for Economic Co-operation and Development.

Koda, K. (2005). Insights into second language reading: A cross-linguistic approach. Cambridge: Cambridge University Press.

Muthén, L. K., & Muthén, B. O. (1998-2017). Mplus user's guide (8th ed.). Los Angeles, CA: Muthén & Muthén.

National Agency for Education. (2019). PISA 2018, 15-åringars kunskaper i läsförståelse, matematik och naturvetenskap. Stockholm: National Agency for Education.

OECD. (2019). PISA 2018 Assessment and Analytical Framework. Paris: OECD Publishing.
 https://doi.org/10.1787/b25efab8-en.

Wharton-McDonald, R., & Erickson, J. (2016). Reading comprehension in the middle grades: Characteristics, challenges, and effective supports. In S. E. Israel (Ed.), Handbook of research on reading comprehension (pp. 353-376). New York: Guilford Publications.


09. Assessment, Evaluation, Testing and Measurement
Paper

Gender Gap and Differentiating Trends in Learning Outcomes in Estonia and Finland

Arto Ahonen

University of Jyväskylä, Finland

Presenting Author: Ahonen, Arto

In PISA 2018, girls outperformed boys in reading by almost 30 score points. However, the size of the gender gap did not seem to be related to the average performance. In 16 out of the 25 countries and economies whose mean score was above the OECD average, the difference in the reading performance between boys and girls was smaller than the average gender gap across OECD countries (OECD, 2019b). Among these high-performing countries, the difference between girls' and boys' performance ranged from 13 score points in B-S-J-Z (China) to 52 in Finland.

In societies where gender equality is more established, girls often perform better in reading and maths (Scheeren, van de Werfhorst, & Bol, 2018). This paper examines the gender gap in learning in two well-performing neighbouring countries, Finland and Estonia. In both countries, gender equality is well established across the sectors of society. In Finland, there has been a declining trend in students' PISA performance in all core assessment domains (reading, mathematics, and science) since 2009. At the same time, the gender gap in Finland has shifted to favour girls in mathematics and science (OECD, 2019a). Meanwhile, in Estonia, the country's average performance has increased in reading and mathematics and remained stable in science. In Estonia, the gender gap has also narrowed in reading, is neutral in science, and has developed to favour boys in mathematics.

Even though gender differences are probably the most commonly examined educational outcomes, it remains unclear what the underlying causes of the existing differences are. Maccoby and Jacklin (1974) concluded from their extensive review that, whilst some patterns persist, for example female superiority in verbal skills and male superiority in mathematical skills, it is not easy to untangle the influence of stereotyping on individuals' perceptions of, and behaviour towards, events and objects. According to them, it was also challenging to separate whether, and to what extent, innate or learned behaviours underpin the development of behavioural or cognitive gender differences. A focus on masculinity in crisis is potentially fruitful, however, because it shifts the emphasis away from structural factors in post-industrial societies, which position boys as inevitable 'losers'. Instead, it would be necessary to explore the characteristics of masculinity that inhibit boys as learners and citizens, and how these might be challenged (Epstein et al., 1998).

There is substantial variation in gender differences, but no equal starting point, given the considerable differences between countries in, for example, their provision of preschool education, age of entry into formal schooling, age of school tracking, community resources such as libraries, training of teachers, and general learning cultures (Topping et al., 2003). From this societal and educational-structure point of view, Estonia and Finland are very similar, so it is not easy to adduce which factors have the most significant influence and why. Previous research has shown that the socioeconomic status of students' families has a somewhat differentiated effect on performance by gender (Van Hek, Buchmann, & Kraaykamp, 2019; Autor et al., 2019). Also, students' motivation and self-efficacy are among the strongest correlates of their performance across PISA studies, specifically in Finland and Estonia (Lee & Stankov, 2018; Lau & Ho, 2022).

The following research questions were formulated to examine these topics: How do motivation and self-efficacy predict girls’ and boys’ proficiency in Finland and Estonia in PISA cycles from 2006 to 2018? Could the gender gap explain the differentiating trajectories of a country's educational outcomes?


Methodology, Methods, Research Instruments or Sources Used
Finnish and Estonian data were first compared with the IDB Analyzer, using SPSS. Linear regression analyses of the predictors of girls' and boys' country average scores, calculated with ten plausible values, were conducted separately for the PISA cycles 2006, 2009, 2012, 2015, and 2018, which had mathematics, science, and reading literacy as their main domains. Descriptive statistics were calculated and presented for each cycle. The predicting factors of self-efficacy and motivation (joy/liking of the main-domain school subject) were examined as computed variables with Weighted Likelihood Estimate (WLE) values. The ESCS index was used as an indicator of students' socioeconomic background, either as a control covariate or as a predicting variable, to examine the possible differentiated effect it may have on proficiency by gender. Finally, regression analysis was conducted to form a predictive model for girls' and boys' proficiency in every domain, for both Finland and Estonia.
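The per-plausible-value regression with pooled coefficients that the IDB Analyzer performs can be sketched as follows. This Python sketch uses synthetic data: the predictor names mirror those in the text, but the sample size, coefficients, and noise level are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Regress each of 10 plausible values (PVs) on self-efficacy, motivation
# and ESCS, then average the coefficients across PVs, which is the
# pooling step the IDB Analyzer automates.
n, n_pv = 500, 10
X = np.column_stack([
    np.ones(n),            # intercept
    rng.normal(size=n),    # self-efficacy (WLE-like scale, synthetic)
    rng.normal(size=n),    # motivation (WLE-like scale, synthetic)
    rng.normal(size=n),    # ESCS (synthetic)
])
true_beta = np.array([500.0, 25.0, 15.0, 30.0])  # invented coefficients
betas = []
for _ in range(n_pv):
    y = X @ true_beta + rng.normal(0, 80, n)  # one synthetic PV of scores
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    betas.append(b)
pooled = np.mean(betas, axis=0)  # pooled point estimates
print(np.round(pooled, 1))
```

Standard errors would additionally combine the within-PV sampling variance (from replicate weights) with the between-PV variance of the coefficients.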
Conclusions, Expected Outcomes or Findings
The preliminary results reveal that while gender differences in Finland were not as evident in the first two cycles of PISA as later on, motivation towards the assessed domain was higher than in the later cycles. The motivational factors were also stronger predictors of main-domain proficiency in Finland than in Estonia in the earlier cycles, 2006 and 2009. In the recent cycles, 2015 and 2018, self-efficacy was the strongest predictor in both Finland and Estonia. It appears that the level of motivational factors has declined in Finland but remained stable or slightly increased in Estonia. Finally, the applied regression models predicted more of the variance for girls than for boys in each major domain in each cycle.
References
Autor, D., Figlio, D., Karbownik, K., Roth, J., & Wasserman, M. (2019). Family disadvantage and the gender gap in behavioural and educational outcomes. American Economic Journal: Applied Economics, 11(3), 338-381. https://doi.org/10.1257/app.20170571
Epstein, D., Ellwood, J., Hey, V. & Maw, J., 1998. Failing boys? Issues in gender and achievement. Buckingham: Open University Press.
Van Hek, M., Buchmann, C., & Kraaykamp, G. 2019. Educational Systems and Gender Differences in Reading: A Comparative Multilevel Analysis. European Sociological Review 35 (2), 169–186. https://doi.org/10.1093/esr/jcy054
Lau, KC., Ho, SC. 2022. Attitudes Towards Science, Teaching Practices, and Science Performance in PISA 2015: Multilevel Analysis of the Chinese and Western Top Performers. Research in Science Education 52, 415–426 https://doi.org/10.1007/s11165-020-09954-6
Lee, J., & Stankov, L. 2018. Non-cognitive predictors of academic achievement: Evidence from TIMSS and PISA. Learning and Individual Differences 65 (3), 50–64.
Maccoby, E.E. & Jacklin, C.N., 1974. The psychology of sex differences. Stanford: Stanford University Press.
OECD (2019a). PISA 2018 Results. Volume I: What Students Know and Can Do. Paris: OECD Publishing.

OECD 2019b. PISA 2018 Results. Volume II. Where All Students Can Succeed. Paris: OECD Publishing.

Scheeren, L., van de Werfhorst, H., & Bol, T. (2018). The gender revolution in context: How later tracking in education benefits girls. Social Forces, 97(1), 193-220. https://doi.org/10.1093/sf/soy025


09. Assessment, Evaluation, Testing and Measurement
Paper

Alternative Indicators of Economic, Cultural, and Social Status for Monitoring Equity: A Construct Validity Approach

Alejandra Osses, Raymond J. Adams, Ursula Schwantner

Australian Council for Educ. Research, Australia

Presenting Author: Osses, Alejandra

Background: Young people’s economic, cultural and social status (ECSS) is one of the most prevalent constructs used for studying equity of educational outcomes. National, regional and international large-scale assessments have furthered the quantitative research concerning the relationship between economic, cultural, and social background indicators and educational outcomes (Broer et al., 2019; Lietz et al., 2017; OECD, 2018).

However, there are theoretical and analytical limitations in the use of existing ECSS indicators from large-scale assessments for the purpose of monitoring equity in education (Osses et al., forthcoming). Theoretical limitations relate to inconsistencies in how the ECSS construct is defined and operationalised, which pose significant challenges for comparing results between large-scale assessments and limit the usability of findings in addressing policy issues concerning equity in education. For example, Osses et al. (2022) demonstrated that using alternative approaches to constructing an ECSS indicator leads to different judgements about education systems in terms of equity of learning achievement.

Analytical limitations relate to the validity and reliability of ECSS indicators used in large-scale assessments. Whilst studies often explore reliability, cross-national invariance, and other psychometric properties of ECSS indicators, information about the performance of alternative indicators is not provided. In fact, no studies were found that compare the performance of alternative ECSS indicators constructed by large-scale assessments; Oakes and Rossi (2003) provide a comparable example from health research.

Objective: This paper focuses on analysing the properties of two ECSS indicators constructed using alternative theoretical and analytical approaches, applied to the same student sample. Evidence on validity is provided to evaluate the relative merits and the comparability of the two indicators for monitoring equity in education.

Method: This study analyses the properties of students’ ECSS indicators constructed by PISA and TIMSS with the aim of providing evidence concerning the validity and comparability of these two indicators. The novelty of the methodological approach lies in estimating both indicators for the same sample of students – those in PISA 2018, and thus analysing the merits of each analytical approach.

Indicators are analysed in terms of their content (i.e., evaluating alignment between the theoretical construct, the indicators, and the items chosen for their operationalisation) and their internal consistency. The indicators' internal structure is investigated using confirmatory factor analysis and item response modelling, in relation to model fit and the precision with which the indicators measure the ECSS construct, that is, targeting and reliability. The use of plausible values as a strategy to reduce error in making inferences about the population of interest is also explored.

Preliminary results show that the TIMSS-like indicator constructed using PISA 2018 data may benefit from a clearer definition of the underlying construct and from theoretical support for evaluating the adequacy of the indicators chosen in its operationalisation. In terms of internal consistency, results indicate that items in the TIMSS-like indicator are "too easy" for the PISA population of interest and, although the response data show a reasonable fit to the measurement model, the chosen items provide an imprecise measurement of students' ECSS.

Three key conclusions emerge from the preliminary results. First, large-scale assessments should devote more effort to clearly defining, and providing theoretical support for, the construct of students’ ECSS. Second, items used in summary indicators of ECSS should be carefully inspected, not only in terms of their reliability but also in terms of the adequacy of their response categories and their fit to the measurement model. Third, the use of plausible values should be considered in order to avoid bias and improve the precision of population estimates. The PISA indicator is currently being analysed.


Methodology, Methods, Research Instruments or Sources Used
This work extends the analysis in Osses et al. (2022) to investigate the properties of two alternative ECSS indicators constructed with the same student sample using PISA 2018 data. The first indicator corresponds to the PISA Economic, Social, and Cultural Status index (hereinafter, PISA_ESCS). The second indicator is constructed by recoding PISA data to obtain variables that are identical to those used in the TIMSS Home Educational Resources scale for grade 8 students (hereinafter, PISA_HER) and following the procedures detailed in the TIMSS 2019 technical report (Martin, von Davier, & Mullis, 2020). Two main aspects of validity (AERA et al., 2014) are evaluated: evidence on indicators’ content and internal structure.
Evidence on indicators’ content
Evaluating alignment between the construct, indicators and items chosen for its operationalisation allows determining whether scores can be interpreted as a representation of individuals’ ECSS. This is typically referred to as evidence of content relevance and representation (AERA et al., 2014; Cizek, 2020; Messick, 1994). To investigate content relevance and representation, a review of published documentation of PISA and TIMSS was undertaken in relation to theoretical underpinning, conceptualisation and operationalisation of each indicator.
Evidence on indicators’ internal structure
The modelling approach of each indicator is analysed in relation to the appropriateness of the analytical steps followed in its construction. The PISA_ESCS is the arithmetic mean of three components – highest parental education, highest parental occupation, and home possessions – the latter being a latent variable indicator constructed using Item Response Modelling (IRM) (OECD, 2020). The PISA_HER is constructed by applying an IRM to three items: highest parental education, study support items at home, and number of books at home.
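The arithmetic-mean construction of a PISA_ESCS-style index can be illustrated with a minimal sketch. The component names and values below are invented for illustration; in PISA the three components are first standardised over the calibration sample, and home possessions is itself an IRM-based scale score rather than a raw value.

```python
# Hypothetical mini-sample: three already-standardised components per student
# (highest parental education, highest parental occupation, home possessions).
# Component names are illustrative, not PISA's actual variable names.
students = [
    {"pared": 0.8, "hisei": 1.2, "homepos": 0.5},
    {"pared": -0.4, "hisei": -1.0, "homepos": 0.1},
    {"pared": 1.5, "hisei": 0.3, "homepos": 1.1},
]

def escs_like(s):
    # Equally weighted arithmetic mean of the three standardised components,
    # mirroring the averaging step described for the PISA_ESCS.
    return (s["pared"] + s["hisei"] + s["homepos"]) / 3

scores = [escs_like(s) for s in students]
```

The equal weighting is the salient design choice here: each component contributes one third of the index, regardless of how strongly it relates to the latent construct, which is precisely what a latent variable model such as the IRM used for PISA_HER does differently.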
The internal structure of the indicators is investigated using the analytical tools provided by the modelling approaches used in PISA and TIMSS, in relation to model fit and the precision with which the indicators measure the ECSS construct – that is, targeting and reliability.
Confirmatory factor analysis – with a specification and constraints that match the indicator construction method used by PISA (OECD, 2020) – is used to investigate the internal structure of PISA_ESCS, including model fit and reliability. IRM is used to investigate the internal structure of PISA_HER and of the home possessions scale – a component of the PISA_ESCS. Within the IRM analysis, item targeting, model fit, and the reliability of estimates are investigated. The use of plausible values, as opposed to weighted likelihood estimates, is also explored (OECD, 2009; Wu, 2005).
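The rationale for plausible values over point estimates can be shown with a small simulation. This is a toy normal-normal model, not the actual PISA/TIMSS machinery: point estimates carry measurement error and inflate the population variance, posterior means (EAP-style) shrink it, while values drawn from the posterior recover it on average (the logic behind Wu, 2005).

```python
import random
import statistics

random.seed(42)
N = 20000
s = 0.8           # assumed measurement SD, i.e. a fairly unreliable scale
s2 = s * s

theta = [random.gauss(0, 1) for _ in range(N)]   # true latent values, var = 1
x = [t + random.gauss(0, s) for t in theta]      # error-laden observed scores

# Normal-normal model: prior N(0, 1), x | theta ~ N(theta, s2).
post_var = s2 / (1 + s2)
eap = [xi / (1 + s2) for xi in x]                # shrunken posterior means
pv = [m + random.gauss(0, post_var ** 0.5) for m in eap]  # one plausible value

# Expected (approximately): var(x) near 1 + s2 = 1.64 (inflated),
# var(eap) near 1 / (1 + s2) = 0.61 (shrunken), var(pv) near 1 (recovered).
v_x, v_eap, v_pv = (statistics.variance(z) for z in (x, eap, pv))
```

The point for contextual indicators is the same as for ability: if the goal is an unbiased population variance (eg, of ECSS), plausible values are preferable to WLE-style point estimates.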

Conclusions, Expected Outcomes or Findings
Indicators’ content: The published documentation of PISA and TIMSS provides different levels of depth in the theoretical argument underpinning the ECSS construct. Although the items used in both summary scales are typical of operationalisations of ECSS, neither assessment specifies a conceptual model relating theory to the operationalisation of the construct.
Indicators’ internal structure: Preliminary results relate to PISA_HER indicator; PISA_ESCS indicator is currently being analysed.
Items in the PISA_HER scale are relatively easy for PISA students, with most thresholds located in the lower region of the scale – ie, below the mean latent attribute estimate of 1.63. PISA_HER items fit well together (ie, have similar discrimination) and the response data fit the partial credit model (mean-square fit statistics close to 1). However, the person separation index of the PISA_HER scale is low (0.36). Using plausible values for ability estimates is common practice in PISA and TIMSS, where the interest lies in reducing error when making inferences about the population of interest. Contextual information, however, is typically analysed using an IRM approach with point estimates (eg, WLE) to produce students’ scores. Preliminary results indicate that the analytic outcomes might be quite different if plausible values are used.
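Why a well-fitting scale can still have a low person separation index is easy to see from the formula: it is the share of observed score variance not attributable to measurement error. The sketch below uses invented WLE scores and standard errors (not the PISA_HER estimates); when the error variance is large relative to the spread of scores, the index collapses, as with the 0.36 reported above.

```python
import statistics

# Hypothetical WLE scores and their standard errors for six students.
wle = [0.2, -0.5, 1.1, 0.4, -0.9, 0.7]
se = [0.6, 0.7, 0.6, 0.65, 0.7, 0.6]

# Person separation index: (observed variance - mean error variance)
# divided by observed variance.
obs_var = statistics.variance(wle)
err_var = statistics.fmean(s * s for s in se)
psi = (obs_var - err_var) / obs_var   # low: error variance dominates
```

With this spread of scores and these standard errors, psi comes out around 0.26, illustrating how easy, poorly targeted items (large standard errors for most respondents) yield imprecise person measurement even when item fit statistics look acceptable.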
Preliminary results from this study suggest that the ECSS indicators in PISA and TIMSS require a sounder definition and operationalisation of the ECSS construct, supported by theory and empirical evidence. The analytical steps in constructing a summary indicator – ie, the measurement model – should reflect the underlying theory. For example, if the construct is theorised to be a latent variable, then the summary indicator should be constructed using a latent variable model. As large-scale assessments aim to make inferences about the population of interest, rather than about individual students, the use of plausible values should be explored in constructing contextual indicators.

References
AERA, APA, & NCME. (2014). Standards for Educational and Psychological Testing. AERA.
Broer, M., Bai, Y., & Fonseca, F. (2019). Socioeconomic Inequality and Educational Outcomes. Evidence from Twenty Years of TIMSS. SpringerOpen.
Cizek, G. J. (2020). Validity: An Integrated Approach to Test Score Meaning and Use. Routledge.
Hooper, M., Mullis, I. V. S., Martin, M. O., & Fishbein, B. (2017). TIMSS 2019 Context Questionnaire Framework. In I. V. S. Mullis & M. O. Martin (Eds.), TIMSS 2019 Assessment Frameworks. Boston College, TIMSS & PIRLS International Study Center. http://timssandpirls.bc.edu/timss2019/frameworks/
Lietz, P., Cresswell, J., Rust, K. F., & Adams, R. J. (2017). Implementation of Large‐Scale Education Assessments. John Wiley and Sons.
Martin, M. O., Mullis, I. V. S., Foy, P., & Arora, A. (2012). Methods and Procedures in TIMSS and PIRLS 2011. TIMSS & PIRLS International Study Center, Boston College. https://timssandpirls.bc.edu/methods/index.html
Martin, M. O., von Davier, M., & Mullis, I. V. S. (2020). Methods and Procedures: TIMSS 2019 Technical Report. TIMSS & PIRLS International Study Center.
Messick, S. (1994). Validity of Psychological Assessment: Validation of Inferences from Persons’ Responses and Performances as Scientific Inquiry into Score Meaning. Educational Testing Service. https://files.eric.ed.gov/fulltext/ED380496.pdf
Oakes, M., & Rossi, P. (2003). The measurement of SES in health research: Current practice and steps toward a new approach. Social Science & Medicine, 56(4), 769–784.
OECD. (2001). Knowledge and Skills for Life—First results from the OECD Programme for International Student Assessment (PISA) 2000. https://www.oecd-ilibrary.org/education/knowledge-and-skills-for-life_9789264195905-en
OECD. (2009). PISA Data Analysis Manual: SAS Second Edition. https://www.oecd.org/pisa/pisaproducts/pisadataanalysismanualspssandsassecondedition.htm
OECD. (2017). PISA 2015 Assessment and Analytical Framework: Science, Reading, Mathematic, Financial Literacy and Collaborative Problem Solving, revised edition. PISA, OECD Publishing. https://doi.org/10.1787/9789264255425-en
OECD. (2018). Equity in Education: Breaking down barriers to social mobility. OECD Publishing.
OECD. (2019). PISA 2018 Results (Volume II): Where All Students Can Succeed. https://www.oecd.org/pisa/publications/
OECD. (2020). Chapter 16. Scaling procedures and construct validation of context questionnaire data—PISA 2018. https://www.oecd.org/pisa/publications/


 