Conference Agenda

Session Overview
Location: Gilbert Scott, 253 [Floor 2]
Capacity: 40 persons
Date: Tuesday, 22/Aug/2023
9:00am - 12:00pm
00 SES 0.5 WS D: PIRLS 2021 – How to Analyze Primary Students’ Reading Literacy
Location: Gilbert Scott, 253 [Floor 2]
Session Chair: Falk Brese
Session Chair: Minge Chen
Workshop. Pre-registration required. Laptop necessary.
 
00. Central & EERA Sessions
Research Workshop

PIRLS 2021 – How to Analyze Primary Students’ Reading Literacy

Falk Brese, Minge Chen

IEA, Germany

Presenting Author: Brese, Falk; Chen, Minge

Note: Participants should bring a laptop. Lecturers will hand out USB sticks with published PIRLS material for group work. Participants could practice analysis if they have the IEA IDB Analyzer and either SPSS, SAS, or R installed on their laptop.

The primary objective of this workshop is to explore how data from international assessments can be used for research regarding outcomes and contexts of reading literacy. The workshop will put emphasis on how data from studies conducted by the IEA (International Association for the Evaluation of Educational Achievement) could provide further insights for policy and practice.

As a leading organization in the field of educational research for more than 60 years, the IEA promotes capacity building and knowledge sharing to facilitate innovation and foster quality in education. IEA studies approach the reality of educational learning outcomes in all its complexity by collecting a huge variety of background information that can be related to students’ achievement, knowledge, and attitudes.

This course will introduce participants to the IEA Progress in International Reading Literacy Study (PIRLS) 2021. The PIRLS 2021 database will be published in June 2023 and will provide a fresh and rich source for secondary research on outcomes related to reading literacy across the world, and in particular in Europe, with more than 30 European countries participating. PIRLS 2021 is the 5th cycle of IEA’s flagship study in reading literacy, following the administrations in 2001, 2006, 2011, and 2016.

The course will include an overview of PIRLS, covering its background, conceptual framework and design. It will present some key findings from the 2021 data collection. Participants will be introduced to the survey instruments and database, and be provided with access paths to data sources, technical documentation, analysis guides and software tools. There will also be a presentation about available variables such as students’ achievement, their attitudes towards reading, characteristics of their teachers who teach reading, and class- and school-level learning contexts.

With this information, participants will formulate and discuss research questions that can be addressed with PIRLS 2021 data. The instructors will be available to mentor the development of research ideas and design as well as to answer data related and technical questions. Research questions from individual attendants will be presented to all participants in order to provide opportunities to share ideas.

No prior knowledge about large-scale international studies is required. Basic knowledge about statistical analysis is not required but is an advantage.

Draft Agenda:

Introductory session – 10 min

• Introduction of participants and their research interests

• IEA – mission, studies, topics, audiences

PIRLS – 20 min

• Introduction

- Background

- Main research focus, framework

- Design

• PIRLS 2021

- Highlighted results

- Instruments, outcome variables and scales

- Access and availability of data files, technical documentation, analysis guides and software tools

Group work – 35 min

• Participants will form working groups

• Each group will receive selected questionnaire materials and information on the corresponding variables (e.g., perceptions and background) of PIRLS 2021

• Participants will develop their own research questions that could be answered with the information collected in PIRLS 2021

• Each group presents one or two research questions and gets feedback from instructors and audience

• If participants have the IEA IDB Analyzer and SPSS, SAS or R installed, they can practice analyzing the PIRLS data with support by the instructors

Example Analysis – 20 min

• Live demo of analysis of example research questions

• Discussion of analysis and results

Closing – 5 min

• Questions, summary & conclusions

• Invitation to advanced data analysis seminars and initiation of collaborative work


Methodology, Methods, Research Instruments or Sources Used
The course will begin with a brief overview of the studies conducted by the International Association for the Evaluation of Educational Achievement (IEA), followed by a more detailed introduction to the Progress in International Reading Literacy Study (PIRLS). The introductory presentation will include information on the history of the study, its conceptual underpinning and the study design.

Next, there will be a summary of key results from PIRLS 2021. Participants will get insight into the findings of one of the most well-established international comparative studies on reading literacy. As PIRLS 2021 is already the fifth cycle of IEA’s study on students’ reading literacy, the trend results will also be presented to highlight the potential of analyzing PIRLS data across time.

Afterwards, there will be practice sessions that will cover most of the workshop time. Participants will be asked to think about research questions that could be answered using PIRLS 2021 data. As input for that task, participants will be provided with links to the available survey material (student, teacher, and school questionnaires). Further, information will be given about variables and data derived from the questionnaire items, for example scale scores for latent variables, such as students’ attitudes towards reading, as well as students’ perceptions of the school climate. Then, participants will work in groups to think about and discuss possible research questions that could be explored with PIRLS 2021 data. During the group work, the workshop instructors will be available to answer questions and provide conceptual support to the groups.

Each group will present their research question(s), and share and discuss these with all participants, to enable an exchange of thoughts and ideas within the group.

Finally, by using (some of) the research questions developed by participants, the lecturers will conduct example analyses as a live demo. This will provide participants with first insights into methodological aspects and also options of analyzing quantitative data from international large-scale assessments.


Conclusions, Expected Outcomes or Findings
This workshop offers a unique opportunity for participants to learn about the concepts and results of PIRLS 2021 as one of the biggest international large-scale assessments of reading literacy in primary education. Participants will learn about the conceptual underpinning and design of the study as well as about the results and findings. During the practice part of the workshop, participants will be able to develop a good understanding about how to address PIRLS 2021 data with appropriate research questions. Finally, participants will get insights into appropriate ways of analyzing international large-scale assessment data.
 
1:15pm - 2:45pm
09 SES 01 B: COVID-19 and Education: Assessing Impacts, Methodologies, and Policy Responses
Location: Gilbert Scott, 253 [Floor 2]
Session Chair: Jana Strakova
Paper Session
 
09. Assessment, Evaluation, Testing and Measurement
Paper

EPIC - Education Preparedness Index in COVID-19: Methodology and Research

Arusyak Aleksanyan (1), Mariam Muradyan (2), Anna Malkhasyan (3), Anna Arustamyan (4), Narek Yenokyan (5), Arayik Tsaturyan (6)

(1) YSU, Armenia; (2) YSU, Armenia; (3) World Bank, Armenia; (4) Teach for Armenia; (5) Armenian Lawyer's Association, NGO; (6) KPMG Armenia

Presenting Author: Aleksanyan, Arusyak

The Education Preparedness Index in COVID-19 (EPIC) is one of the outputs of the project Enabling Learning to Happen for All Children in Emergency Crisis. The project is funded by the Global Campus of Human Rights in partnership with the Right Livelihood Award Foundation. In 2019, people all over the world were faced with new realities and adopted new rules of life, and one of those realities affected the education system. Due to the COVID-19 pandemic, many countries introduced emergency remote education applications and platforms so that students could continue learning without interruption. Under the COVID-19 crisis, remote/distance learning became a viable alternative for ensuring the continuity of the educational process. Under these circumstances, the research aimed to study the preparedness of the education system to adapt to new realities and to act in crisis conditions. To this end, a group of Armenian experts took the initiative to develop a model for assessing the education system in emergencies: the Education Preparedness Index in COVID-19 (EPIC). The EPIC assessment of preparedness for education in emergencies is a set of indicators, tools, and methods aimed at measuring education system readiness for emergencies and analyzing the effectiveness of education policy responses in times of crisis. The four main thematic areas that the framework covers are as follows:

• Policy and Legal Framework

• Coordination and Cooperation

• E-readiness

• Capacities and Resources

Each thematic area incorporates a set of indicators and sub-indicators that reveal the level of achievement and the degree of preparedness within that area.

EPIC is applicable in all emergencies entailing physical distancing and education through online means. The emergency context was derived from the conditions and limitations that appeared during the COVID-19 period, combined with other crises such as war and internal unrest, the context of disability, and some other characteristics. The framework is flexible enough to accommodate individual country cases, and these specific characteristics are considered during the assessment.

The research question of this study is as follows:

What is the level of preparedness of the country/countries to provide education in an emergency entailing a rapid shift from conventional education to distance/online education?

EPIC is based on the international principles of child rights. Among them, the 4As framework (Availability, Accessibility, Acceptability and Adaptability of Education) of the UN Human Rights Office of the High Commissioner is the cornerstone of the study. The methodology is based on the description and statements of the 4As framework in the UN CRC General Comment N13. Furthermore, EPIC defines its target group primarily as children in the 6-14 age group, following the CRC General Comment N13, the World Declaration on Education for All, the SDGs, the Minimum Standards for Education in Emergencies, the Global Study of Children Deprived of Liberty (Nowak, 2019), Chronic Crises and Early Reconstruction (2004), the Dakar Education for All (EFA) framework, and the Sphere Project’s Humanitarian Charter.


Methodology, Methods, Research Instruments or Sources Used
To achieve the research goals, a triangulation strategy was employed. The purpose of triangulation is to use diverse methods that support each other in explaining and interpreting the data. Thus, the calculation and analysis of the index are based on three data sources: 1. a quantitative survey of teachers and students; 2. expert interviews; 3. statistical data.
1. The quantitative research includes conducting a representative quantitative survey among students and teachers of secondary schools to assess and calculate the e-Readiness sub-index. The survey covers the following three thematic areas, indicating the readiness of schools for distance education:
• Technological readiness for distance learning
• Social-psychological readiness for e-learning
• Cognitive readiness for online education
Each of the above-mentioned areas has its sub-indicators and a series of relevant questions for teachers and students. We adopted a standardized online questionnaire hosted on Google; the link was provided to school administrators, who then disseminated it among students and teachers via private messaging systems. The questionnaire consists mainly of Likert-scale questions. On average, it took respondents 20 minutes to complete the survey. The responses were initially collected in a spreadsheet on Google Drive. Once the survey was complete, all answers were exported into an SPSS file and analyzed.
2. As a qualitative method, expert interviews were used. The experts were selected based on their experience in policy/strategy development and their skills in implementation and monitoring. The expert interview questionnaire covers questions related to the existing regulations and policies, capacities, coordination of involved parties, technological availability, and delivered models of e-learning in an emergency. To this end, interviews were conducted with representatives of the Government, independent experts, involved CSOs and international organizations. An important element of the approach is the saturation method, in which the number of experts is determined by the information collected.
3. The statistical data stage of the study involves collecting statistical data. Where data are missing for the reporting year, the most recent available data can be used.
In the final stage, all the data are standardized, on the basis of which the INDEX is calculated on a scale from 0 to 100, where 0 is the lowest level of preparedness and 100 is the highest.
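As an illustration of this final step, the sketch below rescales raw indicator values to a common 0-100 range and averages them into sub-indices and an overall index. The source does not state which standardization or weighting EPIC uses, so min-max rescaling, equal weights, the function names, and all example values are assumptions made for illustration only.

```python
from statistics import mean


def rescale(value, lowest, highest):
    """Min-max rescale a raw indicator value to the 0-100 range."""
    if highest == lowest:
        raise ValueError("indicator range must not be empty")
    return 100.0 * (value - lowest) / (highest - lowest)


def preparedness_index(areas):
    """Compute an equally weighted index (assumed weighting).

    areas maps a thematic area name to a list of
    (raw value, lowest possible, highest possible) tuples.
    Returns (overall index, dict of sub-indices), all on a 0-100 scale.
    """
    sub_indices = {
        name: mean(rescale(v, lo, hi) for v, lo, hi in indicators)
        for name, indicators in areas.items()
    }
    return mean(sub_indices.values()), sub_indices


# Hypothetical indicator values, not taken from the EPIC study.
example_areas = {
    "Policy and Legal Framework": [(3, 1, 5), (1, 0, 1)],
    "Coordination and Cooperation": [(2, 1, 5)],
    "E-readiness": [(4, 1, 5), (60, 0, 100)],
    "Capacities and Resources": [(0.5, 0, 1)],
}

index, sub_indices = preparedness_index(example_areas)
```

Averaging within each thematic area first keeps an area with many indicators from dominating the overall score; a weighted scheme would be an equally plausible design.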

Conclusions, Expected Outcomes or Findings
Finally, an emergency such as the Covid-19 pandemic has raised awareness of the need for schools and education systems to be prepared for different emergencies. This global crisis made it clear that countries should develop a coping strategy for mitigating the adverse impact of the pandemic and should identify and provide additional support to the most vulnerable groups. This challenge is an opportunity for schools that lack an emergency strategy to develop one and to use it when such situations arise. School closures have shown that preparedness for online teaching and learning is not only a trend but also a must for success in the educational process. The effectiveness of distance learning, and of the educational process in general, depends mostly on the professional and pedagogical skills of the teaching community, the willingness of teachers to constantly improve and develop, teachers’ creative thinking, motivation to teach at school, etc. Different international studies have shown that an effective education system requires a highly qualified pedagogical community. The effectiveness of the education system is largely measured by the achievements of the students, and those achievements depend significantly on the professional and pedagogical skills and capacities of teachers. Thus, education systems successfully meet the challenges of emergencies if they regularly evaluate and monitor the system's preparedness for emergencies.
References
1. Bensalah, Kacem. 2002. “Guidelines for education in situations of emergency and crisis: EFA strategic planning”. UNESCO.
2. Çağatay, İhsan Ulus. 2020. “Emergency Remote Education vs. Distance Education”. European Commission.
3. Chebib, Kinda. 2020. “Education For All in the Time of COVID-19: How EdTech can be Part of the Solution”.
4. Committee on Economic, Social and Cultural Rights. 2020. “Statement on the Coronavirus Disease (COVID-19) Pandemic and Economic, Social and Cultural Rights”.
5. Reimers, Fernando M., Andreas Schleicher. 2020. “A framework to guide an education response to the Covid-19 Pandemic of 2020”. Harvard Graduate School of Education.
6. Humanitarian Practice Network. 2006. “Implementing minimum standards for education in emergencies: lessons from Aceh”.
7. INEE (Inter-Agency Network for Education in Emergencies). 2010. “Minimum Standards for Education: Preparedness, Response, Recovery”. Accessed March 8, 2021.
8. INEE. 2004. “Minimum Standards for Education in Emergencies, Chronic Crises and Early Reconstruction”. DS Print.
9. Lasi, Masri bin Abdul. 2021. “Online Distance Learning Perception and Readiness During Covid-19 Outbreak: A Research Review”. International Journal of Academic Research in Progressive Education and Development. 28 February.
10. Nicolai, Susan. 2003. “Education in Emergencies: A toolkit for starting and managing education in emergencies”. Save the Children.
11. OECD. 2020. “Education Response to Covid-19: Implementing a Way Forward”. Working Paper No. 224. 9 July.
12. Penna, Maria Pietronilla, Vera Stara. 2007. “The failure of e-learning: why should we use a learner centred design”. Journal of e-Learning and Knowledge Society. January.
13. Phan, Thanh Thi Ngoc, Ly Thi Thao Dang. 2017. “Teacher Readiness for Online Teaching: A Critical Review”. June.
14. UN. 2020. “Policy Brief: Education during COVID-19 and beyond”. August.
15. UNESCO. 2016. “Education 2030: Incheon Declaration and Framework for Action for the implementation of Sustainable Development Goal 4: Ensure inclusive and equitable quality education and promote lifelong learning”.
16. UNESCO. 2020b. “COVID-19 Education Response, How Many Students are at Risk of not Returning to School”. Advocacy paper, 30 July.
17. UNESCO. 2020c. “Covid-19 Education Response. Education Sector Issue Notes. Supporting teachers and education personnel during times of crisis”. Issue note no. 2.2, April.
18. UNESCO. 2020a. “COVID-19 Education Response, Distance learning strategies in response to COVID-19 school closures”. Issue note no. 2.1, April.
19. UNICEF. 2020. “Education and COVID-19 report”.
20. World Bank. 2021. “Urgent, Effective Action Required to Quell the Impact of COVID-19 on Education Worldwide”.


09. Assessment, Evaluation, Testing and Measurement
Paper

Assessing Distance Learning in Primary Education of Kazakhstan during the COVID-19 Pandemic: Evidence from PIRLS-2021

Nazym Smanova

JSC “Information-Analytical Center”, Kazakhstan

Presenting Author: Smanova, Nazym

The COVID-19 pandemic has led to near-universal closing of schools at all levels worldwide, with negative consequences for all participants in the educational process. School support plays a key role in mitigating the negative effect of school closure during the pandemic on student learning achievement. The present study assesses the possible mediating role of school support in the effects of COVID-19 disruption on primary students' learning achievement through analysis of data collected in the IEA’s Progress in International Reading Literacy Study (PIRLS 2021).

In Kazakhstan, as in many countries, regular schooling has been disrupted since the COVID-19 pandemic broke out in March 2020. In 2021, 77% of all school students started the new academic year via distance learning. In response to the pandemic, the government of Kazakhstan undertook a set of systemic measures: distance learning was implemented using online platforms and services, as well as audio and telework; computer equipment and Internet cards were provided free of charge to students in need; and professional development courses on distance learning were offered to about 347 thousand teachers.

So far, some studies have pointed to significant losses in students' knowledge in Kazakhstan during the school closures caused by the pandemic (IAC, 2020; Dzhaksylykov, 2020). Researchers in the USA found that many children in Year 2 and Year 3 in their study lost momentum on fundamental skills such as reading, with the difficulty of creating a language-rich environment on Zoom being one of the primary reasons (Domingue et al., 2021). In another study conducted in the UK, attainment gaps were found for both Year 1 and Year 2 students, with the most profound effects on students from a disadvantaged background (Rose et al., 2021).

The following research questions will guide the study:

  1. How has school closure during the pandemic period affected the reading achievement of Kazakhstani primary grade students?
  2. Does school support (providing access to digital devices, delivering printed and online learning materials, organizing online activities, providing technical and methodological support for teachers) mediate the relationship between school closure due to COVID-19 and reading achievement?
  3. Does the mediation effect of school support vary across different socio-demographic groups?

Methodology, Methods, Research Instruments or Sources Used
A quantitative research method will be employed to evaluate the impact of the selected parameters on student performance using multilevel modelling techniques. The data for this study come from the Kazakhstan sample in the IEA PIRLS 2021 database. PIRLS 2021 is the only international large-scale assessment conducted during the COVID-19 school disruption. It provides contextual information about how remote instruction was organized in schools, including information about distance learning resources available for students, methodological and technical support for teachers, etc.
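A minimal sketch of the mediation logic behind research question 2 is given here. It uses single-level ordinary least squares with the product-of-coefficients approach, not the multilevel models the study plans to use, and all variable names and values are hypothetical toy data rather than PIRLS 2021 results.

```python
from statistics import mean


def cov(x, y):
    """Sample covariance of two equally long sequences."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def mediation(x, m, y):
    """Return (a, b, direct, total) for the paths x -> m -> y.

    a      : slope of m regressed on x
    b      : slope of m in the regression of y on x and m
    direct : slope of x in the regression of y on x and m
    total  : slope of y regressed on x alone
    """
    vx, vm = cov(x, x), cov(m, m)
    cxm, cxy, cmy = cov(x, m), cov(x, y), cov(m, y)
    denom = vx * vm - cxm ** 2  # zero only if x and m are perfectly collinear
    a = cxm / vx
    b = (cmy * vx - cxy * cxm) / denom
    direct = (cxy * vm - cmy * cxm) / denom
    total = cxy / vx
    return a, b, direct, total


# Hypothetical school-level toy data (not PIRLS values).
closure_weeks = [0, 2, 4, 6, 8, 10]
school_support = [5, 4, 4, 3, 2, 2]
reading_score = [520, 515, 505, 500, 490, 480]

a, b, direct, total = mediation(closure_weeks, school_support, reading_score)
indirect = a * b  # part of the closure effect that runs through school support
```

In ordinary least squares the total effect splits exactly into the direct and indirect parts (total = direct + a * b), which is what makes the indirect term interpretable as the mediated share. A multilevel version would additionally account for students being nested in schools.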
Conclusions, Expected Outcomes or Findings
The study is expected to show how school resources could mediate the relationship between school closure and academic performance across SES groups, using the national questionnaire data from PIRLS 2021. The study will inform the ongoing development of an effective mechanism for designing and implementing the educational recovery program in Kazakhstan.
References
1. JSC Information and Analytical Center. (2020). Analytical report on the monitoring of learning using distance technologies in general schools in the framework of emergency distance learning in Kazakhstan [Unpublished research]. https://iac.kz/
2. Bokayev, B., Torebekova, Z., Abdykalikova, M., & Davletbayeva, Z. (2021). Exposing policy gaps: The experience of Kazakhstan in implementing distance learning during the COVID-19 pandemic. Transforming Government: People, Process and Policy, 15(2). Retrieved from https://doi.org/10.1108/TG-07-2020-0147
3. Dzhaksylykov, S. (2020). Distance learning diaries: How was the “distance” school term from students and parents’ point of view. Retrieved from https://drive.google.com/file/d/1a-wo91IsG_puveH2mUVCU9XliZIcC8_2/view
4. Domingue, B. W., Hough, H. J., Lang, D., & Yeatman, J. (2021). Changing Patterns of Growth in Oral Reading Fluency during the COVID-19 Pandemic. Policy Analysis for California Education, Working Paper. Retrieved from https://edpolicyinca.org/sites/default/files/2021-03/wp_domingue_mar21-0.pdf


09. Assessment, Evaluation, Testing and Measurement
Paper

Capturing the Educational and Economic Impacts of School Closures in Poland

Tomasz Gajderowicz (1), Maciej Jakubowski (1), Sylwia Wrona (1), Harry Patrinos (2)

(1) University of Warsaw, Poland; (2) World Bank

Presenting Author: Wrona, Sylwia

COVID-19 led to strict lockdown measures, which included school closures in most countries. As a result, more than 1.5 billion students were out of school for weeks or months (UNESCO, 2022). The loss of schooling is expected to negatively impact children's cognitive development, even if distance learning modes are enacted. The loss of in-person teaching could also lead to inequality since the only remaining relevant input is parental involvement during school closures (Agostinelli et al., 2022). Most studies document significant learning loss. In Europe, the average learning loss is almost a quarter of a school year, but the estimates are available mainly for Western European countries (Donnelly and Patrinos, 2021). Worldwide the loss is even greater, especially in lower-income countries (Patrinos et al., 2022). Poland is an interesting case because it represents countries in Eastern Europe where school closures lasted longer, and research on learning loss is scarce.


Methodology, Methods, Research Instruments or Sources Used
To properly estimate the effect of school closures and to distinguish it from the effect of the 2016 structural changes, we compare the expected and actual achievement of three cohorts of students in secondary schools. We assume students should gain a minimum of 0.1 standard deviation (SD) during one year of education. That is a conservative assumption, but it is in line with previous studies comparing 15-, 16-, and 17-year-old student results on the PISA scale in Poland (Jakubowski et al., 2022). International evidence indicates the gains should be larger, around 0.2 SD, which would make our results more significant as they increase the expected achievement (Avvisati and Givord, 2021). We also assume students tested in autumn (10th grade in TICKS 2021) have a similar achievement to those tested one grade below in the spring (9th-grade assessment in PISA 2003-2018). Assuming any achievement progress between spring and autumn would make our results even more significant.
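The comparison of expected and actual achievement can be sketched as simple arithmetic. The function names and the cohort values below are hypothetical; only the assumed yearly gains of 0.1 SD and 0.2 SD come from the text, and this is not the authors' estimation procedure.

```python
def expected_score(baseline_sd, years, gain_per_year_sd=0.1):
    """Expected achievement (in SD units) after `years` of schooling."""
    return baseline_sd + years * gain_per_year_sd


def learning_loss_sd(baseline_sd, actual_sd, years, gain_per_year_sd=0.1):
    """Gap between expected and actual achievement, in SD units."""
    return expected_score(baseline_sd, years, gain_per_year_sd) - actual_sd


# Hypothetical cohort: baseline at 0.00 SD, observed one year later at -0.05 SD.
loss_conservative = learning_loss_sd(0.0, -0.05, years=1)
loss_international = learning_loss_sd(0.0, -0.05, years=1, gain_per_year_sd=0.2)
```

With these made-up inputs the gap grows from about 0.15 SD to about 0.25 SD when the assumed yearly gain rises from 0.1 to 0.2 SD, which is the sense in which a larger assumed gain strengthens the estimated loss.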

Conclusions, Expected Outcomes or Findings
The Polish success story of rapid social and economic progress relied strongly on human capital improvement. Unfortunately, this factor is now under significant distress. Significant learning losses have been experienced by Polish students due to the COVID-19-induced school closures. In mathematics and science, the learning losses are equal to more than a year's worth of schooling, even though schools were closed for only part of an academic year. In addition, we show that the 2016 reforms also had a negative impact on student learning. These skills losses are likely to affect the future economic success of the students as well as the country as a whole. Future earnings are projected to decline by PLN 74,693 (more than US$15,000) per year for the affected students. The country would then lose the equivalent of 7.2% of GDP over time.

References
Avvisati, F., Givord, P. (2021). How much do 15-year-olds learn over one year of schooling? An international comparison based on PISA. OECD Education Working Papers No. 257.
Carlana, M., La Ferrara, E. (2021). Apart But Connected: Online Tutoring and Student Outcomes During the COVID-19 Pandemic. CEPR Discussion Paper No. DP15761.
Donnelly, R., Patrinos, H. A. (2021). Learning loss during COVID-19: An early systematic review. Prospects 1-9.
Drucker, L.F., Horn, D. & Jakubowski, M. (2022). The labour market effects of the Polish educational reform of 1999. Journal of Labour Market Research 56, 13.
Fryer Jr, R.G., Howard-Noveck, M. (2020). High-dosage tutoring and reading achievement: evidence from New York City. Journal of Labor Economics 38(2): 421-452.
Hanushek, E. A., & Woessmann, L. (2010). The high cost of low educational performance: The long-run economic impact of improving PISA outcomes. OECD Publishing, France.
Jakubowski M., Gajderowicz T., Wrona S. (2022). Achievement of Secondary School Students after Pandemic Lockdown and Structural Reforms of Education System. Evidence Institute and City of Warsaw research report.
Jakubowski M., Patrinos H., Porta E., Wisniewski J. (2016). The Effects of Delaying Tracking in Secondary School: Evidence from the 1999 Education Reform in Poland. Education Economics 24(6).
Patrinos, H.A., Vegas, E., Carter-Rau, R. (2022). An Analysis of COVID-19 Student Learning Loss. Policy Research Working Paper No. 10033, World Bank.
Psacharopoulos, G., Collis, V., Patrinos, H.A. and Vegas, E. (2021). The COVID-19 cost of school closures in earnings and income across the world. Comparative Education Review 65(2): 271-287.
UNESCO (2022). UNESCO map on school closures. Retrieved March 2022 from https://covid19.uis.unesco.org/
 
3:15pm - 4:45pm
09 SES 02 B: Exploring Mathematical Development, Self-Concept, and Achievement in Education
Location: Gilbert Scott, 253 [Floor 2]
Session Chair: Trude Nilsen
Paper Session
 
09. Assessment, Evaluation, Testing and Measurement
Paper

The Development of Mathematical Thinking Skills and Mathematical Self-concept from the Third Grade to the End of Basic Education

Natalija Gustavson (1), Satu Koivuhovi (2), Mari-Pauliina Vainikainen (3), Mikko Asikainen (1)

(1) University of Helsinki, Finland; (2) University of Turku, Finland; (3) Tampere University, Finland

Presenting Author: Gustavson, Natalija

One of the basic skills for success in the knowledge society is the ability to learn. The Finnish national learning to learn (L2L) assessment program was launched in the mid-1990s as a result of a worldwide interest in the measurement of cross-curricular competences (Hautamäki et al., 2013).

The Learning to learn longitudinal assessment brings significant value: it gathers sufficient information about learning outcomes and monitors changes in students’ competence during basic education (Hoskins & Deakin Crick, 2010).

In Finland, L2L is assessed by administering cognitive tasks measuring general reasoning and thinking skills, and self-evaluation scales measuring beliefs and attitudes towards learning (Hautamäki et al., 2002). The concept of mathematical thinking is a traditional part of Learning to learn assessment.

Children’s learning-related beliefs, self-concept and interest in a particular subject play an important role in their school performance, particularly in mathematics. In educational research, academic self-concept has been defined as students' perception of themselves within the academic environment (Marsh, 1990; Marsh & Scalas, 2010).

In this regard, it is of interest how the development of mathematical thinking occurs in schoolchildren during schooling and what other factors can influence the development and improvement of mathematical thinking.

Of particular interest in the present study was the development of mathematical thinking skills and mathematical self-concept (Marsh et al., 1988), as part of learning-related beliefs, from the third grade to the ninth grade, when basic education is completed.

The main purpose of this study is to answer the following questions:

  1. How do pupils' mathematical thinking skills and mathematical self-concept develop during the comprehensive school years from the third to the ninth grade?
  2. Does mathematical self-concept in the third and sixth grades predict the level of mathematical thinking skills in the sixth and ninth grades?
  3. How do gender and mother’s education explain the level differences and change of mathematical self-concept?

Methodology, Methods, Research Instruments or Sources Used
Data (N = 2200) were drawn from a longitudinal Learning-To-Learn study in which a whole age cohort of third graders from the capital area of Finland was followed until the end of comprehensive school. Data collection consisted of three measurement points (2016, when pupils were in the third grade; 2018, when they were in the sixth grade; and 2021, when they were in the ninth grade).
The measures used were based on the framework of the Finnish learning to learn test (Hautamäki et al., 2002).
Mathematical thinking skills were measured with two task types. The first task type, the Hidden Arithmetical Operators task (Arithmetical Operations for short), was developed by Demetriou and his colleagues (Demetriou et al., 1991). In each item there were one to four hidden operators (e.g., (5 a 3) b 4 = 6).
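To illustrate what solving such an item involves, the following minimal sketch searches all operator pairs for the example item (5 a 3) b 4 = 6. It assumes the hidden operators are drawn from the four basic arithmetic operations, which the abstract does not state; the function name is hypothetical.

```python
from itertools import product
import operator

# Assumption: each hidden operator is one of the four basic operations.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}

def solve_hidden_operators(x, y, z, target):
    """Return every (a, b) pair for which (x a y) b z equals target."""
    solutions = []
    for a, b in product(OPS, repeat=2):
        try:
            value = OPS[b](OPS[a](x, y), z)
        except ZeroDivisionError:
            continue  # skip operator pairs that divide by zero
        if abs(value - target) < 1e-9:
            solutions.append((a, b))
    return solutions

print(solve_hidden_operators(5, 3, 4, 6))  # [('-', '+')]
```

Under this assumption the example item has a single solution: a is subtraction and b is addition.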
In the second task type, Invented Mathematical Concepts (Sternberg et al., 2001), two invented mathematical concepts, lag and sev, were conditionally defined (for example, if a > b, lag means subtraction, otherwise multiplication, etc.). After this, the student was given a problem to solve (for example, how much is 4 lag 7 sev 10 lag 3), where the definitions had to be applied.

Mathematical self-concept was measured with a scale based on Marsh’s work on academic self-concept (Marsh et al., 1988).  The scale consisted of three items on a seven-point Likert scale ranging from one (not true at all) to seven (very true).
Data were analysed with SPSS 24 for descriptive statistics and Mplus 7.2 for linear growth curve models. First, we analysed the development of mathematical thinking skills and mathematical self-concept in the whole sample, after which differences in development by gender and mother’s education level were examined.
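The idea behind a linear growth curve, an initial level (intercept) plus a rate of change (slope) for each pupil, can be sketched as follows. This is a simplified per-pupil least-squares illustration, not the latent growth model estimated in Mplus; the time coding (years since the third-grade measurement) and the scores are assumptions made up for the example.

```python
from statistics import mean

# Assumed time coding: years since the grade 3 measurement (grades 3, 6, 9).
TIME = [0, 3, 6]

def growth_line(scores, time=TIME):
    """Least-squares intercept and slope for one pupil's repeated scores."""
    t_bar, y_bar = mean(time), mean(scores)
    sxx = sum((t - t_bar) ** 2 for t in time)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(time, scores))
    slope = sxy / sxx
    return y_bar - slope * t_bar, slope

# Hypothetical self-concept scores on a 1-7 scale at grades 3, 6 and 9.
pupils = {"A": [6.0, 5.0, 4.0], "B": [5.0, 5.0, 5.0]}
for name, scores in pupils.items():
    intercept, slope = growth_line(scores)
    print(name, round(intercept, 2), round(slope, 3))
```

In a latent growth model the intercepts and slopes are treated as latent variables with their own means and variances, which is what allows predictors such as gender to explain differences in level and change.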

Conclusions, Expected Outcomes or Findings
Preliminary analyses showed that students’ mathematical thinking skills improved over time whereas self-concept in mathematics decreased statistically significantly from the third to the ninth grade.  This result aligns with earlier international findings of the decline of self-beliefs by age.
The linear growth curve model fitted the data well (RMSEA = .035; CFI = .992; TLI = .973).
The initial level of self-concept in the third grade statistically significantly predicted the student's success in the sixth-grade mathematical thinking test. The sixth-grade mathematical thinking skills test score correlated significantly with the slope of mathematical self-concept, indicating that the development of pupils’ mathematical self-concept differed depending on their performance in the mathematical thinking skills test. Students who did well in the sixth-grade test of mathematical thinking skills experienced a milder decrease in their mathematical self-concept than other students.

Mathematical thinking skills test score at sixth grade, initial level of mathematical self-concept at third grade as well as the slope of mathematical self-concept predicted statistically significantly the test result in mathematical thinking skill test at ninth grade.  Overall, the model explained about 38% of the variance of ninth grade mathematical thinking skills test result.  
Gender was a statistically significant predictor of children’s mathematical self-concept. Boys' mathematical self-concept was stronger than that of girls. In addition, girls experienced a stronger decline in their self-concept over time than boys did.

References
Bong, M., & Skaalvik, E. M. (2003). Academic self-concept and self-efficacy: How different are they really? Educational Psychology Review, 15(1), 1–40.
Demetriou, A., Platsidou, M., Efklides, A., Metallidou, Y., & Shayer, M. (1991). The development of quantitative-relational abilities from childhood to adolescence: Structure, scaling, and individual differences. Learning and Instruction, 1, 19–43.
Guay, F., Marsh, H. W., & Boivin, M. (2003). Academic self-concept and academic achievement: Developmental perspectives on their causal ordering. Journal of Educational Psychology, 95(1), 124–136. https://doi.org/10.1037/0022-0663.95.1.124
Hautamäki, J., Arinen, P., Eronen, S., Hautamäki, A., Kupiainen, S., Lindblom, B., & Scheinin, P. (2002). Assessing learning-to-learn: A framework. National Board of Education, Evaluation 4/2002.
Hautamäki, J., Kupiainen, S., Marjanen, J., Vainikainen, M.-P., & Hotulainen, R. (2013). Oppimaan oppiminen peruskoulun päättövaiheessa: Tilanne vuonna 2012 ja muutos vuodesta 2001 [Learning to learn at the end of basic education: Situation in 2012 and change from 2001]. University of Helsinki. Department of Teacher Education Research Report 347. Unigrafia.
Hoskins, B., & Deakin Crick, R. (2010). Competences for learning to learn and active citizenship: Different currencies or two sides of the same coin? European Journal of Education, 45(1), 121–137.
Marsh, H. W., Byrne, B. M., & Shavelson, R. J. (1988). A Multifaceted Academic Self-Concept: Its Hierarchical Structure and Its Relation to Academic Achievement. Journal of Educational Psychology, 80(3), 366–380. https://doi.org/10.1037/0022-0663.80.3.366
Marsh, H. W. (1990). The structure of academic self-concept: The Marsh/Shavelson model. Journal of Educational Psychology, 82(4), 623–636. https://doi.org/10.1037/0022-0663.82.4.623
Marsh, H. W., Scalas, L. F., & Nagengast, B. (2010). Longitudinal tests of competing factor structures for the Rosenberg Self-Esteem Scale: Traits, ephemeral artifacts, and stable response styles. Psychological Assessment, 22(2), 366–381. https://doi.org/10.1037/a0019225
Sternberg, R., Castejon, J.L., Prieto, M.D., Hautamäki, J., & Grigorenko, E. (2001). Confirmatory factor analysis of the Sternberg Triarchic Abilities Test in three international samples. European Journal of Psychological Assessment, 17, 1-16.
Vainikainen, M.-P., & Hautamäki, J. (2022). Three Studies on Learning to Learn in Finland: Anti-Flynn Effects 2001-2017. Scandinavian Journal of Educational Research, 66(1), 43–58. https://doi.org/10.1080/00313831.2020.1833240


09. Assessment, Evaluation, Testing and Measurement
Paper

Tracking in English and Mathematics: Consequences for Compulsory School Students’ Self-Concept

Thea Klapp, Jan-Eric Gustafsson, Stefan Johansson

University of Gothenburg

Presenting Author: Klapp, Thea

The study's overall purpose is to explore the formation of student academic self-concept (ASC) in the subjects of English and mathematics. ASC is commonly defined as self-perceived academic ability and is related to cognitive and non-cognitive outcomes such as academic engagement, goal-setting, task choice, persistence and effort, intrinsic motivation, strategy use, academic achievement, and future career selection (Bong & Skaalvik, 2003; Marsh et al., 2019). When students perceive their previous experiences of academic activities to be positive and when they perceive that they are capable of managing future academic activities, it is thus an advantage that goes beyond immediate academic success. Rather, ASC has been shown to have prolonged effects (Marsh et al., 2001).

Because ASC frequently has been shown to be important for student success, much research has been dedicated to explaining how it is formed. The main explanation is the big-fish-little-pond effect (BFLPE), which posits that equally abled students perceive their abilities differently depending on their context (Marsh et al., 2008). A student in a high-achieving context would rate their ability to be lower than a student in a lower-achieving context, even if both students have the same abilities.

In 1962, tracking was introduced in the subjects of English and mathematics in all secondary schools in Sweden (Grades 7-9). With recommendations from teachers, students were to choose between advanced and general courses in the two subjects (Marklund, 1985). The general courses were easier and given at a slower pace than the advanced courses and tended to have lower class-average achievement. Tracking is no longer a formal practice in Swedish compulsory education, but it commonly occurs when teachers organise education in Sweden and internationally (Trautwein et al., 2006).

ASC is a well-researched area, but so far, only a few studies have conducted longitudinal analyses to investigate effects over time. There is also a need for studies that look at how ASC is affected by school systems with some form of tracking (i.e., ability stratification, ability grouping etc.). The specific purpose of the study is to explore the effects of non-tracking and tracking in secondary school on ASC in upper secondary schools. With longitudinal data from the 1980s and 90s, ASC will be measured in Grade 6 (pre-tracking) and Grade 10 (post-tracking).

Previous Research

In a longitudinal study, Marsh et al. (2001) compared students from former East and West Germany (N = 2 778). They found that when East and West Germany reunited and the schools merged, the students who had attended the selective and ability-stratified schools in West Germany were more strongly affected by the negative BFLPE when compared to the East German students. Before the reunification, East German students had not experienced an ability-stratified school system. The difference between the merged students decreased with time when the former East German students became integrated with the more selective school system. Overall, the findings of Marsh et al. (2001) indicate that school policies and systems may have an impact on the formation of student ASC.

Similarly, Liem et al. (2013) found that compulsory school students in low-ability streams in English and mathematics had higher self-concepts than students in high-ability streams when student achievement was controlled for. However, Herrmann et al. (2016) investigated the German within-school track system (N = 1 330) and found that the negative BFLPE for students in the advanced mathematics track disappeared when they controlled for positive assimilation effects. The positive assimilation effect is similar to the basking-in-reflected-glory (BIRG) effect; both refer to the notion that attending a high-achieving class or school positively affects ASC.


Methodology, Methods, Research Instruments or Sources Used
Participants and procedure

   Data will be retrieved from the Swedish longitudinal project Evaluation through Follow-up (UGU), compiled by Statistics Sweden (Härnqvist, 2000). The sampling was a two-step stratified procedure, where municipalities were selected in the first step and classes in the second step. The UGU samples are nationally representative of their respective populations. Four birth cohorts will be used in the study: 1967 (N = 9 104), 1972 (N = 9 498), 1977 (N = 4 293), and 1982 (N = 8 805). Cohorts 1967, 1972, and 1977 experienced tracking and will be merged to get a bigger sample. Cohort 1982 did not experience ability-streamed courses and will function as a control group.

   UGU consists of register, survey, and test data. Survey data was first collected in Grade 6 and then for a second time in upper secondary school. For cohorts 1967, 1972, and 1977 the second data collection occurred in Grade 10 and for cohort 1982 it occurred in Grade 12. Survey data from Grades 6 and 10/12 will be used to measure ASC pre- and post-tracking.
To deal with missing data, calibration weights and full information maximum likelihood (FIML) estimation will be used to correct for bias due to non-participation.

Measures and variables
  
   Cohorts 1967, 1972, and 1977 answered identical questions in Grade 10, while cohort 1982 answered similar but not identical questions. Measures of ASC will be constructed to be as similar as possible between the three earlier cohorts and cohort 1982. Factors will be created with indicators of students’ ASC, for example, “What kind of arithmetic skills do you think you have?” and “Did you experience any problems with arithmetic in secondary school?”.
Achievement will be operationalized by grade point average (GPA) from Grade 9 and by cognitive ability from Grade 6. Cognitive ability will be measured with three tests measuring students’ verbal, spatial, and inductive abilities. Gender and parental education will also be included.

Method of Analysis

   First, descriptive analyses will be calculated. Measurement models will then be constructed in Mplus with confirmatory factor analysis (CFA), to create latent variables for ASC in Grades 6 and 10/12. Lastly, longitudinal structural equation modelling (LSEM) will be used. The tracking system enables a quasi-experimental research design, which in turn makes it possible to investigate the effect of tracking on subsequent ASC with LSEM and the control group that did not experience tracking.

Conclusions, Expected Outcomes or Findings
    Regarding the possible outcomes of the study, two contradictory effects are relevant to consider: the previously mentioned BFLPE and the basking-in-reflected-glory (BIRG) effect. The BIRG effect predicts that when students perceive their school or class (i.e., their reference group) to have high status, it affects their self-concepts positively (Marsh et al., 2000). The glory of attending a high-status group thus reflects on the individuals in the group, regardless of individual achievement level. In contrast, the BFLPE predicts that attending a high-achieving group affects students’ self-concept negatively, because of negative social comparison processes. Even if both effects concern the formation of self-concept, research has indicated that the BFLPE is the more dominant of the two (Marsh et al., 2000). That is, the negative social comparison effect tends to have a greater impact on students’ self-concept than the positive effect of attending a high-status group.
  
   In the present study, the BFLPE hypothesis would be that students who attended the advanced courses in English and mathematics reported lower ASC in Grade 10 because their ASCs were negatively affected by the comparisons with high-ability peers in secondary school. However, it may also be that the BIRG effect is present rather than the BFLPE, which would mean that students in the advanced courses express higher ASC due to reflected glory.

References
Bong, M., & Skaalvik, E. M. (2003). Academic Self-Concept and Self-Efficacy: How
   Different Are They Really? Educational Psychology Review, 15(1), 1–40.

Herrmann, J., Schmidt, I., Kessels, U., & Preckel, F. (2016). Big fish in big ponds:
   Contrast and assimilation effects on math and verbal self‐concepts of students in
   within‐school gifted tracks. British Journal of Educational Psychology, 86(2), 222–
   240.

Härnqvist, K. (2000). Evaluation through follow-up. A longitudinal program for
   studying education and career development. In C.-G. Janson (Ed.), Seven
   Swedish longitudinal studies in behavioral science (pp. 76–114). Stockholm:
   Forskningsrådsnämnden.

Liem, G. A. D., Marsh, H. W., Martin, A. J., McInerney, D. M., & Yeung, A. S.
   (2013). The Big-Fish-Little-Pond Effect and a National Policy of Within-School
   Ability Streaming: Alternative Frames of Reference. American Educational
   Research Journal, 50(2), 326–370.

Marklund, S. (1985). Skolsverige 1950-1975 D. 4 Differentieringsfrågan.
   Stockholm: Liber Utbildningsförlaget.

Marsh, H. W., Köller, O., & Baumert, J. (2001). Reunification of East and West
   German School Systems: Longitudinal Multilevel Modeling Study of the Big-Fish-
   Little-Pond Effect on Academic Self-Concept. American Educational Research
   Journal, 38(2), 321–350.

Marsh, H. W., Kong, C., & Hau, K. (2000). Longitudinal multilevel models of the
   big-fish-little-pond effect on academic self-concept: Counterbalancing contrast
   and reflected-glory effects in Hong Kong schools. Journal of Personality and
   Social Psychology, 78(2), 337–349.

Marsh, H. W., Pekrun, R., Parker, P. D., Murayama, K., Guo, J., Dicke, T., & Arens,
   A. K. (2019). The murky distinction between self-concept and self-efficacy:
   Beware of lurking jingle-jangle fallacies. Journal of Educational Psychology,
   111(2), 331–353.

Marsh, H. W., Seaton, M., Trautwein, U., Lüdtke, O., Hau, K. T., O’Mara, A. J., &
   Craven, R. G. (2008). The Big-fish–little-pond-effect Stands Up to Critical
   Scrutiny: Implications for Theory, Methodology, and Future Research.
   Educational Psychology Review, 20(3), 319–350.

Trautwein, U., Lüdtke, O., Marsh, H. W., Köller, O., & Baumert, J. (2006). Tracking,
   Grading, and Student Motivation: Using Group Composition and Status to Predict
   Self-Concept and Interest in Ninth-Grade Mathematics. Journal of Educational
   Psychology, 98(4), 788–806.


09. Assessment, Evaluation, Testing and Measurement
Paper

The Influence of Mathematics Self-concept and Self-efficacy on Mathematics Achievement: Comparison between the Public and Independent Schools in Sweden

Yi Ding, Alli Klapp, Kajsa Hansen

University of Gothenburg, Sweden

Presenting Author: Ding, Yi

Achievement gaps in mathematics can be found among education systems all over the world in international large-scale assessment studies (ILSAs). In almost all education systems, students’ socioeconomic status (SES) has been documented as one of the most important factors associated with achievement, known as the “socioeconomic achievement gap” (Chmielewski, 2019), while in other education systems, achievement gaps can be accounted for by gender, immigration background, ethnicity and/or urban-rural locations of schools and students (e.g., Bondy et al., 2017; Brozo et al., 2014; Song et al., 2014). In Sweden, remarkable differences can be observed between public and independent schools, and the differences might be explained by a larger share of students with well-educated parents in independent schools than in public schools (Klapp Lekholm, 2008). Taking mathematics as an example, students in independent schools perform better than students in public schools in the Programme for International Student Assessment (PISA), even after controlling for background variables. This difference in achievement held from PISA 2003 to PISA 2012 despite the sharp overall decline, and the advantage of independent schools has emerged over time (OECD, 2019).

The types of schools (private or public as categorised in PISA) are generally differentiated by the ownership of schools. Private schools refer to schools managed directly or indirectly by a non-government organisation (such as a church, trade union, business or other private institution), while public schools are managed by a public education authority, government agency, or governing board appointed by the government or elected by a public franchise (OECD, 2020). In the Swedish context, instead of private schools, it would be more accurate to use the term independent schools: schools run by private organisations that operate educational activities through a publicly funded voucher system (Yang Hansen & Gustafsson, 2016) and that may be run for profit (Wiborg, 2015).

Research also indicates that students’ motivational beliefs seem to be important for academic achievement in the Swedish education system (Klapp, 2018). Previous research has established that student self-beliefs can predict and affect academic achievement; among these beliefs, self-concept and self-efficacy are the most frequently identified (Bong & Skaalvik, 2003; Multon et al., 1991). Mathematics self-concept is an individual’s perceived competence in mathematics (OECD, 2013), and has been found to be strongly related to students’ general mathematics achievement (Bong & Skaalvik, 2003; Ma & Kishor, 1997). Mathematics self-efficacy measures students’ expectations and conviction of what can be accomplished when they need to solve pure and/or applied mathematics tasks. Students’ mathematics self-efficacy had a strong direct effect on mathematics problem-solving regardless of their general mental ability (Pajares & Kranzler, 1995).

It is well established that mathematics self-concept and self-efficacy are, to varying degrees, associated with students’ mathematics achievement. It has also been observed for many decades that student gender, socioeconomic status and immigration background influence academic achievement, directly and indirectly (e.g., Bondy et al., 2017; Schleicher, 2006). There is still uncertainty, however, regarding how the relations among mathematics self-concept, self-efficacy, student characteristics (SES, gender, immigration background) and mathematics achievement may vary for students in different types of schools (public or independent) in the Swedish education system and over the years.

The main aim of the study was to investigate the relative importance of student mathematics self-concept and self-efficacy for mathematics achievement across Swedish public and independent schools over time, concerning student characteristics such as SES, gender and immigration background.


Methodology, Methods, Research Instruments or Sources Used
The sample consists of students from Sweden who participated in PISA 2003 (N=4624, n=186 from independent schools) and PISA 2012 (N=4736, n=787 from independent schools). Mathematics self-concept (MSC) was measured by five items asking students how they feel when studying mathematics. Students reported whether they strongly agreed, agreed, disagreed or strongly disagreed with statements such as “I get good marks in mathematics” and “I learn mathematics quickly”. Mathematics self-efficacy (MSE) was measured by eight items indicating perceived mathematical abilities. The students were asked to report whether they felt very confident, confident, not very confident or not at all confident in facing pure and applied mathematical tasks, such as “calculating TV discount” and “understanding a train timetable”. Mathematics achievement, defined as mathematical literacy in PISA, captures student capability in formulating, employing and interpreting mathematics in diverse contexts (OECD, 2013). Five plausible values were generated to represent student mathematics achievement. Students were categorised into males and females in PISA. In this study, students were grouped into natives (students born in Sweden with at least one parent also born in Sweden) and non-natives (students born in Sweden to parents born outside Sweden, and students who, like their parents, were born outside Sweden). Student economic, social and cultural status (ESCS) is an index in PISA reflecting student family educational, occupational and cultural status.
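Because achievement is represented by five plausible values, any statistic is usually computed once per plausible value and the results are then pooled. The sketch below shows the commonly used pooling rules (Rubin's rules); it is an illustration under that assumption, not a description of the authors' Mplus setup, and the input numbers are made up.

```python
from statistics import mean, variance

def combine_plausible_values(estimates, sampling_variances):
    """Pool one statistic computed separately on each plausible value.

    estimates: the statistic from each plausible value.
    sampling_variances: the squared standard error from each analysis.
    Returns the pooled estimate and its standard error.
    """
    m = len(estimates)
    point = mean(estimates)               # pooled point estimate
    within = mean(sampling_variances)     # average sampling variance
    between = variance(estimates)         # variance across plausible values (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return point, total ** 0.5

# Hypothetical mean scores and sampling variances from five plausible values.
est, se = combine_plausible_values([500, 502, 498, 501, 499], [4.0] * 5)
print(est, round(se, 3))
```

The between-imputation term is what distinguishes this from simply averaging the five scores per student, which would understate the uncertainty.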
Descriptive statistics were first investigated, giving an overview of all the variables. Secondly, multi-group confirmatory factor analyses (MGCFA) were performed to examine the factor structure and measurement invariance across the two PISA cycles and across the school types (independent and public schools) in Sweden. Then, given the cluster sampling strategy in PISA and the intention of making comparisons in this study, multi-group multi-level structural equation modelling (MGSEM) was applied to study the relations between mathematics self-concept, self-efficacy and mathematics achievement, taking students’ gender and immigration background into account.
SPSS 28 was used for data management and Mplus 8 for analyses.

Conclusions, Expected Outcomes or Findings
As mentioned earlier, Swedish students in independent schools achieve higher than those in public schools despite the extraordinary decline from PISA 2003 to PISA 2012. The overall results suggest that students with high levels of mathematics self-concept and self-efficacy tend to perform better in mathematics. Students with higher economic, social and cultural status are likely to have stronger mathematics self-concept and self-efficacy and to perform better in mathematics. Immigrant students perform considerably worse than non-immigrant students in mathematics and yet they report higher mathematics self-concept and self-efficacy. Girls, although they performed equally well as or even better than boys, nevertheless hold weaker mathematics self-concept and self-efficacy. At the school level, mathematics achievement is positively associated with economic, social and cultural status. Schools with larger proportions of immigrant students seem to have lower economic, social and cultural status and mathematics achievement.
Compared to independent schools, the influence of mathematics self-efficacy is stronger than mathematics self-concept in both PISA 2003 and 2012 in public schools. Economic, social and cultural status plays a relatively less important role in mathematics self-concept, self-efficacy and achievement in public schools. Conversely, the effect of immigration background seems to be stronger in independent schools. Girls are found to have even lower levels of mathematics self-concept and self-efficacy in independent schools.
The study has significant implications for researchers and practitioners in the educational and psychological fields. Positive self-beliefs are significant representative constructs in educational psychology (Marsh et al., 2019). The results and findings from this study highlighted the important role of mathematics self-concept and self-efficacy in mathematics achievement across Swedish public and independent schools. It is important to raise teachers’ awareness of promoting students’ self-concept and self-efficacy in mathematics learning, for girls, immigrant students and students with lower SES in particular.

References
Bondy, J. M., Peguero, A. A., & Johnson, B. E. (2017). The children of immigrants’ academic self-efficacy: The significance of gender, race, ethnicity, and segmented assimilation. Education and Urban Society, 49(5), 486–517.
Bong, M., & Skaalvik, E. M. (2003). Academic self-concept and self-efficacy: How different are they really? Educational Psychology Review, 15(1), 1–40.
Brozo, W. G., Sulkunen, S., Shiel, G., Garbe, C., Pandian, A., & Valtin, R. (2014). Reading, Gender, and Engagement. Journal of Adolescent & Adult Literacy, 57(7), 584–593.
Chmielewski, A. K. (2019). The Global Increase in the Socioeconomic Achievement Gap, 1964 to 2015. American Sociological Review, 84(3), 517–544.
Klapp, A. (2018). Does academic and social self-concept and motivation explain the effect of grading on students’ achievement? European Journal of Psychology of Education, 33(2), 355–376.
Klapp Lekholm, A. (2008). Grades and grade assignment: Effects of student and school characteristics. rapport nr.: Acta Universitatis Gothoburgensis 269.
Ma, X., & Kishor, N. (1997). Attitude toward self, social factors, and achievement in mathematics: A meta-analytic review. Educational Psychology Review, 9(2), 89–120.
Marsh, H. W., Pekrun, R., Parker, P. D., Murayama, K., Guo, J., Dicke, T., & Arens, A. K. (2019). The murky distinction between self-concept and self-efficacy: Beware of lurking jingle-jangle fallacies. Journal of Educational Psychology, 111(2), 331.
Multon, K. D., Brown, S. D., & Lent, R. W. (1991). Relation of self-efficacy beliefs to academic outcomes. Journal of Counseling Psychology, 38(1), 30.
OECD. (2013). PISA 2012 assessment and analytical framework: Mathematics, reading, science, problem solving and financial literacy.
OECD. (2019). Sweden - country note - PISA 2018 results.
OECD. (2020). PISA 2018 results (volume v): effective policies, successful schools.
Pajares, F., & Kranzler, J. (1995). Self-efficacy beliefs and general mental ability in mathematical problem-solving. Contemporary Educational Psychology, 20(4), 426–443.
Schleicher, A. (2006). Where immigrant students succeed: A comparative review of performance and engagement in PISA 2003. Intercultural Education, 17(5), 507–516.
Song, S., Perry, L. B., & McConney, A. (2014). Explaining the achievement gap between Indigenous and non-Indigenous students: an analysis of PISA 2009 results for Australia and New Zealand. Educational Research and Evaluation, 20(3), 178–198.
Wiborg, S. (2015). Privatizing Education: Free School Policy in Sweden and England. Comparative Education Review, 59(3), 473–497.
Yang Hansen, K., & Gustafsson, J.-E. (2016). Causes of educational segregation in Sweden – school choice or residential segregation. Educational Research and Evaluation, 22(1–2), 23–44.
 
5:15pm - 6:45pm09 SES 03 B: Exploring the Relationship Between Student Wellbeing and Academic Resilience
Location: Gilbert Scott, 253 [Floor 2]
Session Chair: Jan-Eric Gustafsson
Paper Session
 
09. Assessment, Evaluation, Testing and Measurement
Paper

Does Social Well-Being Predict Academic Resilience? An Analysis of Swedish PISA 2018 Data

Deborah Elin Siebecke, Kajsa Yang Hansen, Maria Jarl

University of Gothenburg, Sweden

Presenting Author: Siebecke, Deborah Elin

Recent studies indicate that Sweden faces issues of decreasing educational equity (Siebecke & Jarl, 2022; Yang Hansen & Gustafsson, 2019), suggesting that the impact of socioeconomic background on achievement has increased. However, some students achieve high despite disadvantages in their socioeconomic background that place them at risk for low achievement. These students are often referred to as academically resilient and yield hope for a more equitable future. In general terms, resilience is grounded in the recognition that individuals’ responses to adversities differ (Rutter, 2012). While some struggle or fail in the face of adversity, others seem to adjust just fine. Those who demonstrate positive adaptation despite being exposed to adversities are usually considered resilient (e.g., Masten & Obradovic, 2006). The identification of supportive and risk factors can help socioeconomically disadvantaged individuals become academically successful and, thus, improve educational equity.

Previous studies have indicated that individual and external resources, such as supportive adults and peers (Fergus & Zimmerman, 2005) and a student’s sense of belonging at school (Gonzalez & Padilla, 1997) can promote a student’s resilience. According to a framework by Borgonovi and Pál (2016), these indicators - that is a student’s sense of belonging at school and their relationship with their teachers, parents, and peers – also act as subdimensions of social well-being. This may imply a relationship between academic resilience and social well-being. Yet, research on the (social) well-being of academically resilient students is scarce, especially in Sweden. While the relationship between social well-being and academic resilience is underexplored, previous research does indicate a positive albeit small relationship between well-being and achievement (Bücker et al., 2018; Kaya & Erdem, 2021). However, this relationship is not straightforward and a multidimensional conceptualization of well-being is needed to assess which aspects are particularly important for achievement (Clarke, 2020). In general, well-being is hypothesized to be a multi-dimensional construct consisting of social, physical, and mental/psychological dimensions, which can further be structured in subdimensions (Colombo, 1984). The social dimension of well-being, for instance, can be measured by including subdimensions such as the students’ relationship with peers, parents and teachers and their sense of belonging at school (Borgonovi & Pál, 2016). These subdimensions have been found to be interrelated. For instance, a student’s sense of belonging is closely related to their relationship with peers and teachers (Govorova et al., 2020).

Thus, the main objective of the present study is to investigate whether and how students’ social well-being predicts their academic resilience. The present study focuses on social well-being, as one important dimension of student well-being, and attempts to capture its complexity by not only modeling its possible relationship to academic resilience but also considering the interrelationship between subdimensions of social well-being. The study is anchored in Bronfenbrenner’s ecological theory and specifically focuses on the students’ microsystem, that is, their close interaction with their immediate environments, as well as the mesosystem, which describes the interrelation among the environments in which the student participates (Bronfenbrenner, 1979).


Methodology, Methods, Research Instruments or Sources Used
Making use of data from the Programme for International Student Assessment (PISA) from 2018, the study investigates the relationship between academic resilience and the social well-being of 15-year-old students in Sweden. After weighing the advantages and disadvantages of different approaches to operationalizing academic resilience, we decided to apply a definition-driven approach, which is said to reflect academic resilience "in its most literal sense: academic achievement despite adversity" (Rudd et al., 2021, p. 5). Thus, academically resilient students are defined as those who achieve at or above Level 3 in the PISA domains reading, mathematics, and science, despite falling in the bottom quartile of Sweden’s distribution of the Index of Economic, Social and Cultural Status (ESCS) (Agasisti et al., 2018). Level 3 corresponds to a median achievement level that is said to prepare students “for success later in life” (Agasisti et al., 2018, p. 8). This study only focuses on socioeconomically disadvantaged students, leading to a total sample size of 1337 students, 358 of whom were considered resilient. A dichotomous variable measuring academic resilience is used as a dependent variable in the present study.
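The definition-driven classification described above can be illustrated with a minimal sketch. It is not the authors' procedure: the field names are hypothetical, the Level 3 cut scores are placeholder values (the official PISA 2018 proficiency thresholds should be substituted), and sampling weights and plausible values are ignored for brevity.

```python
from statistics import quantiles

# ASSUMPTION: placeholder Level 3 lower bounds, not the official PISA 2018 cut scores.
LEVEL3_CUTOFFS = {"reading": 480.0, "math": 482.0, "science": 484.0}


def escs_bottom_quartile_cut(escs_values):
    """Return the 25th percentile of the (unweighted) national ESCS distribution."""
    return quantiles(escs_values, n=4, method="inclusive")[0]


def is_resilient(student, escs_cut, cutoffs=LEVEL3_CUTOFFS):
    """Classify one student.

    Returns None for students outside the bottom ESCS quartile (not part of
    the analytic sample), otherwise True/False depending on whether the
    student reaches the Level 3 cut score in all three domains.
    """
    if student["escs"] > escs_cut:
        return None
    return all(student[domain] >= cut for domain, cut in cutoffs.items())
```

Applied to each student record, this yields the dichotomous resilience variable; in a real analysis the quartile would be computed with student weights.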
The measure of social well-being is based on a well-being framework proposed by Borgonovi and Pál (2016) and adapted to the newer measures in PISA 2018 (for an overview, see Borgonovi, 2020). According to this framework, the social dimension of well-being can be measured using students’ self-reported data on the sense of belonging at school, exposure to bullying, teacher support, teacher feedback, and parental emotional support. Each of these subdimensions of social well-being was measured as a latent variable consisting of three to six indicators.
Data analyses were run in SPSS 29 and Mplus 8. First, confirmatory factor analysis was used to test whether the data fit the measurement models. Second, structural models based on an extensive literature review were built. The models reflect the interrelation of subdimensions of social well-being as well as their relation to academic resilience. Although the data are nested (i.e., individual data are clustered in schools), intraclass correlations were small, so a single-level model was used. Standard errors of the SEM parameters were adjusted by using the TYPE = COMPLEX command in Mplus, together with the robust maximum likelihood estimator, a cluster variable, and student weights. To evaluate model fit, local and global fit indices were consulted.
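As an illustration of the intraclass correlation that motivates the single-level model, the following sketch computes ICC(1) from a one-way analysis of variance. The abstract does not state how the intraclass correlations were obtained, so this is one common approach shown under that assumption, for unweighted data and a continuous indicator.

```python
def icc1(groups):
    """ICC(1) from a one-way ANOVA.

    groups: list of lists, one list of scores per school (cluster).
    Uses the adjusted average cluster size n0, so clusters may differ in size.
    """
    n_clusters = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Between- and within-cluster sums of squares
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (n_clusters - 1)
    ms_within = ss_within / (n_total - n_clusters)

    # Adjusted average cluster size for unbalanced designs
    n0 = (n_total - sum(len(g) ** 2 for g in groups) / n_total) / (n_clusters - 1)
    return (ms_between - ms_within) / (ms_between + (n0 - 1) * ms_within)
```

Values close to zero indicate that little of the variance lies between schools, which is the situation in which a single-level model with cluster-adjusted standard errors is usually considered defensible.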

Conclusions, Expected Outcomes or Findings
Preliminary analyses resulted in well-fitting measurement models for all tested well-being subdimensions (i.e., sense of belonging at school, exposure to bullying, teacher support, perceived teacher feedback, and parental emotional support). A structural model linking these subdimensions with each other, as well as with the dichotomous endogenous measure of academic resilience, resulted in an overall good global and local model fit. Model results confirm the interrelation of subdimensions of social well-being that was highlighted in previous research. For instance, parental support and students’ exposure to bullying significantly predict their sense of belonging at school. Yet, preliminary results suggest that only the students’ perceived support by their teachers significantly predicts their academic resilience, while the other subdimensions of well-being show no significant relationship with academic resilience.
The presentation of results includes a discussion of the study’s possible limitations due to cross-sectional data, reduced statistical power because of group sizes, and the necessary but rather artificial dichotomization of resilient vs. nonresilient students.
Even though the study focuses on academic resilience and well-being in Sweden, results can be of importance beyond the Swedish context. Issues of educational inequity and the importance of fostering student well-being are topical and prominent across Europe. For instance, in countries such as Austria, Belgium, France, and Germany, more than 15% of the variation in science performance can be explained by the student’s socioeconomic background alone (OECD, 2018). Research on the group of academically resilient students can shed light on the reasons why some students seem to defeat the odds and show positive adaptation despite adversity. Thus, more research on academic resilience and well-being is needed – in Sweden and beyond.

References
Agasisti, T., Avvisati, F., Borgonovi, F., & Longobardi, S. (2018). Academic resilience: What schools and countries do to help disadvantaged students succeed in PISA. OECD Publishing.
Borgonovi, F. (2020). Well-being in international large-scale assessments. In T. Nilsen, A. Stancel-Piątak, & J.-E. Gustafsson (Eds.), International handbook of comparative large-scale studies in education: Perspectives, methods and findings (pp. 1–26). Springer International Publishing.
Borgonovi, F., & Pál, J. (2016). A framework for the analysis of student well-being in the PISA 2015 study: Being 15 in 2015. OECD Education Working Papers, 140. OECD Publishing.
Bronfenbrenner, U. (1979). The ecology of human development: Experiments by nature and design. Harvard University Press.
Bücker, S., Nuraydin, S., Simonsmeier, B. A., Schneider, M., & Luhmann, M. (2018). Subjective well-being and academic achievement: A meta-analysis. Journal of Research in Personality, 74, 83–94.
Clarke, T. (2020). Children’s wellbeing and their academic achievement: The dangerous discourse of ‘trade-offs’ in education. Theory and Research in Education, 18(3), 263–294.
Colombo, S. A. (1984). General well-being in adolescents: Its nature and measurement. ProQuest Dissertations & Theses Global. https://search.proquest.com/dissertations-theses/general-well-being-adolescents-nature-measurement/docview/303323578/se-2?accountid=11162
Fergus, S., & Zimmerman, M. A. (2005). Adolescent resilience: A Framework for Understanding Healthy Development in the Face of Risk. 24.
Gonzalez, R., & Padilla, A. M. (1997). The Academic Resilience of Mexican American High School Students. Hispanic Journal of Behavioral Sciences, 19(3), 301–317.
Govorova, E., Benitez Baena, I., & Muñiz, J. (2020). Predicting Student Well-Being: Network Analysis Based on PISA 2018. International Journal of Environmental Research and Public Health, 17, 4014.
Kaya, M., & Erdem, C. (2021). Students’ Well-Being and Academic Achievement: A Meta-Analysis Study. Child Indicators Research, 14(5), 1743–1767.
Masten, A. S., & Obradovic, J. (2006). Competence and Resilience in Development. Annals of the New York Academy of Sciences, 1094(1), 13–27.
OECD. (2018). Equity in education breaking down barriers to social mobility. OECD Publishing.
Rudd, G., Meissel, K., & Meyer, F. (2021). Measuring academic resilience in quantitative research: A systematic review of the literature. Educational Research Review, 34, 100402.
Rutter, M. (2012). Resilience as a dynamic concept. Development and Psychopathology, 24(2), 335–344.
Siebecke, D. E., & Jarl, M. (2022). Does the material well-being at schools successfully compensate for socioeconomic disadvantages? Analysis of resilient schools in Sweden. Large-Scale Assessments in Education, 10(11), 11.
Yang Hansen, K., & Gustafsson, J.-E. (2019). Identifying the key source of deteriorating educational equity in Sweden between 1998 and 2014. International Journal of Educational Research, 93, 79–90.


09. Assessment, Evaluation, Testing and Measurement
Paper

Student Well-Being in School and Academic Achievement by TIMSS in Finland

Timo Salminen, Jonna Pulkkinen, Jenna Hiltunen, Jenni Kotila, Piia Lehtola, Juhani Rautopuro

University of Jyväskylä, Finland

Presenting Author: Salminen, Timo; Pulkkinen, Jonna

Student well-being in school can be considered as a condition that enables positive learning outcomes but also as an outcome of successful learning and students’ satisfaction at school (Morinaj & Hascher, 2022). Students’ well-being in school refers to an emotional experience characterized by the prevalence of positive feelings and cognitions towards school, persons in school and the school context over the negative ones towards school life (Hascher, 2003). According to Hascher (2003), it consists of six dimensions, three positive, i.e., positive attitudes to school, enjoyment in school, and positive academic self-concept, and three negative, i.e., worries in school, physical complaints in school, and social problems in school, that can be used as indicators of well-being.

In Finland, students’ academic well-being (e.g. Helakorpi & Kivimäki, 2021; Salmela-Aro et al., 2018, 2021) and learning performance (e.g. OECD, 2019; Mullis et al., 2020) have been declining in the last decade. For example, grade 4 students’ performance in mathematics and science decreased from 2011 to 2019, as evidenced by the Trends in International Mathematics and Science Study (TIMSS) (Mullis et al., 2020). The performance in mathematics declined by 10 points from 2011 to 2015 and by three points from 2015 to 2019. In science, the decrease from 2011 to 2019 was 15 points. When examining the international mathematics and science benchmarks (Mullis et al., 2020), these declines in learning outcomes mean that the percentage of high achievers has dropped from 49% to 42% in mathematics and from 65% to 56% in science during this period. Meanwhile, the percentage of students below the low international benchmark has grown from 2% to 5% in mathematics and from 1% to 3% in science.

Previous research has detected the interrelation between student well-being and learning performance but also the need to examine this relation, together with possible associated factors, in more detail (e.g. Bücker et al., 2018; Nilsen et al., 2022; Pietarinen et al., 2014). For example, the Programme for International Student Assessment (PISA) shows the relationship between students’ socio-economic status (SES), well-being and achievement (OECD, 2017). Further, the study using TIMSS data by Nilsen, Kaarstein and Lehre (2022) shows that a safe environment, as an aspect of school climate, and student self-concept, both indicating students’ well-being in school, declined from 2015 to 2019 and mediated the changes in mathematics achievement over time in Norway.

The above statements point out that both students’ well-being in school and academic achievements may have declined in the last decade in Finland. Thus, in this study, we ask the following research questions, using the TIMSS fourth grade assessment data:

1) How has students’ well-being in school changed, if at all, from 2011 to 2019?

2) What is the relationship between students’ SES, well-being and achievement in mathematics and science?


Methodology, Methods, Research Instruments or Sources Used
The present study is based on the three cycles of curriculum-based TIMSS assessment in Finland. The data include the 4th graders who participated in TIMSS 2011 (N = 4,638), TIMSS 2015 (N = 5,015) and TIMSS 2019 (N = 4,730). In this study, we use school climate and safety, and students’ attitudes as indicators of well-being. School climate and safety include the scales of Students’ Sense of School Belonging (3 items) and Bullying (6 items). Students’ attitudes include the scales of Students Like Learning Mathematics (5 items) and Science (4 items), and Students Confident in Mathematics (7 items) and Science (6 items). These four-point scales are from TIMSS student questionnaires. From each scale, we selected those items that were the same in all three cycles of TIMSS assessment. As an indicator of students’ SES, we used the Home Resources for Learning scale, which is scored based on the number of books at home, the number of home study supports, and the parents’ educational level as well as the level of occupation. In TIMSS data, the Home Resources for Learning scale is divided into three categories. In this study, we recoded it into two categories: (1) students with many resources, and (2) students with some or few resources. In addition to the above-mentioned scales, the variables of our study include mathematics and science achievement scores.
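Two of the data preparation steps described above, keeping only items common to all three cycles and collapsing the Home Resources for Learning scale into two categories, can be sketched as follows. The item identifiers and category labels are hypothetical and do not correspond to actual TIMSS variable names.

```python
def common_items(items_by_cycle):
    """Return the items that appear in every assessment cycle.

    items_by_cycle: dict mapping a cycle label (e.g. "2011") to a list of
    item identifiers (hypothetical names).
    """
    cycles = [set(items) for items in items_by_cycle.values()]
    return sorted(set.intersection(*cycles))


def recode_home_resources(category):
    """Collapse three categories into two.

    1 = students with many resources, 2 = students with some or few resources.
    The input labels "many", "some" and "few" are illustrative assumptions.
    """
    mapping = {"many": 1, "some": 2, "few": 2}
    return mapping[category]
```

For example, `common_items({"2011": ["a", "b", "c"], "2015": ["b", "c", "d"], "2019": ["c", "b"]})` keeps only items `b` and `c`.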

The analysis was performed in three phases. First, we conducted a confirmatory factor analysis (CFA) to examine the validity of the variables that measure well-being. Second, to answer the first research question, we computed mean variables and studied the average changes in students’ well-being from 2011 to 2019 using these mean variables. The values of mean variables ranged from 1 to 4 (the highest value indicating the most positive view). Third, to answer the second research question, we investigated the relationship between students’ SES, well-being and achievement using the structural equation modelling (SEM) approach. This analysis was conducted for mathematics and science separately for each of the three TIMSS data sets. Five plausible values representing students’ proficiency in mathematics and science (see Martin et al., 2020) were used in the analyses. A two-stage sampling design used in the TIMSS assessment (Martin et al., 2020) was considered in the analyses.  
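When an analysis is repeated once per plausible value, the results are usually pooled with Rubin's combination rules. The sketch below illustrates that general technique; it assumes that a point estimate and its sampling variance have already been obtained for each of the five plausible values (for instance from a replication method), and it is not a description of the software routine used in the study.

```python
from statistics import mean, variance


def combine_plausible_values(estimates, sampling_variances):
    """Pool per-plausible-value results with Rubin's rules.

    estimates: one point estimate per plausible value.
    sampling_variances: the sampling variance of each estimate.
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(sampling_variances)   # average sampling variance
    between = variance(estimates)       # variance across plausible values (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total ** 0.5
```

The between-imputation term reflects measurement uncertainty in the proficiency scores; ignoring it by analysing a single plausible value would understate the standard error.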

Conclusions, Expected Outcomes or Findings
The results of CFA confirmed the validity of the latent variables (i.e., sense of belonging, bullying, like learning and confidence) that are used to measure students’ well-being in this study. Overall, the students’ well-being was relatively good. Examination of the trends of means showed that there were some changes in students’ well-being from 2011 to 2019. After 2011, students’ sense of belonging increased and bullying decreased slightly. With respect to students’ attitudes, the trends between 2011 and 2019 were less clear. Between 2011 and 2015, students’ liking of mathematics grew to some extent, whereas confidence in mathematics remained unchanged. Students’ liking of science, in turn, increased from 2011 to 2015 but decreased again from 2015 to 2019. In addition, students’ confidence in science declined between 2015 and 2019.

The preliminary results of SEM showed that students’ SES is related both to well-being and achievement. As expected, students with higher SES (i.e., students with many resources for learning) also feel better and achieve higher results in mathematics and science. Students’ SES seemed to be related to achievement not only directly but also indirectly via confidence. However, there was no indirect effect via well-being variables other than confidence. This study supports earlier research on the importance of students’ well-being for learning.

In further studies, we will also examine the relationship between student well-being and academic achievement using PISA and PIRLS (Progress in International Reading Literacy Study) data collected not only before but also after the COVID-19 pandemic, which has affected, mostly negatively, students’ schooling, learning and well-being all over the world (e.g. OECD, 2021).

References
Bücker, S., Nuraydin, S., Simonsmeier, B. A., Schneider, M., & Luhmann, M. (2018). Subjective well-being and academic achievement: A meta-analysis. Journal of Research in Personality, 74, 83–94.

Hascher, T. (2003). Well-being in school – why students need social support. In P. Mayring & C. von Rhöneck (Eds.), Learning emotions – the influence of affective factors on classroom learning (pp. 127–142). Bern u.a Lang.

Helakorpi, S., & Kivimäki, H. (2021). Well-being of children and young people – School Health Promotion study 2021. Finnish Institute for Health and Welfare, Statistical Report 42/2021. https://urn.fi/URN:NBN:fi-fe2021112557144

Martin, M. O., von Davier, M., & Mullis, I. V. S. (2020). Methods and procedures: TIMSS 2019 technical report. International Association for the Evaluation of Educational Achievement (IEA).  

Morinaj, J., & Hascher, T. (2022). On the relationship between student well-being and academic achievement: A longitudinal study among secondary school students in Switzerland. Zeitschrift für Psychologie, 230(3), 201–214.

Mullis, I. V. S., Martin, M. O., Foy, P., Kelly, D. L., & Fishbein, B. (2020). TIMSS 2019 International Results in Mathematics and Science. Retrieved from Boston College, TIMSS & PIRLS International Study Center website: https://timssandpirls.bc.edu/timss2019/international-results/

Nilsen, T., Kaarstein, H., & Lehre, A. C. (2022). Trend analyses of TIMSS 2015 and 2019: school factors related to declining performance in mathematics. Large-scale Assessments in Education, 10(1), 1–19.

OECD (2017). PISA 2015 Results (Volume III): Students’ Well-Being, PISA, OECD Publishing, Paris. http://dx.doi.org/10.1787/9789264273856-en

OECD (2019). PISA 2018 Results (Volume I): What Students Know and Can Do, PISA, OECD Publishing, Paris, https://doi.org/10.1787/5f07c754-en.

OECD (2021). The State of Global Education: 18 Months into the Pandemic. https://doi.org/10.1787/1a23bb23-en

Pietarinen, J., Soini, T., & Pyhältö, K. (2014). Students’ emotional and cognitive engagement as the determinants of well-being and achievement in school. International Journal of Educational Research, 67, 40–51.

Salmela-Aro, K., Read, S., Minkkinen, J., Kinnunen, J. M., & Rimpelä, A. (2018). Immigrant status, gender, and school burnout in Finnish lower secondary school students: A longitudinal study. International Journal of Behavioral Development, 42(2), 225–236.

Salmela-Aro, K., Upadyaya, K., Vinni-Laakso, J., & Hietajärvi, L. (2021). Adolescents’ longitudinal school engagement and burnout before and during COVID-19 – The role of socio-emotional skills. Journal of Research on Adolescence, 31(3), 796–807.
 
Date: Wednesday, 23/Aug/2023
9:00am - 10:30am 00 SES 04 A: Scottish Council of Deans of Education
Location: Gilbert Scott, 253 [Floor 2]
Session Chair: Margery McMahon
Panel Discussion
 
00. Central & EERA Sessions
Panel Discussion

Scottish Council of Deans of Education - panel discussion

Margery McMahon1, Zoe Robertson2, Carrie McLennan3, Stephen Day4, Aileen Kennedy5, Lynn Gangone6

1University of Glasgow, United Kingdom; 2University of Edinburgh; 3University of Dundee; 4University of the West of Scotland; 5University of Strathclyde; 6American Association of Colleges for Teacher Education

Presenting Author: McMahon, Margery; Robertson, Zoe; McLennan, Carrie; Day, Stephen; Kennedy, Aileen; Gangone, Lynn

This panel discussion will be led by representatives of Scotland's Council of Deans of Education with input from Deans / Heads of School of Education from other contexts who will be participating in the conference. Professor Lynn M. Gangone, President of the American Association of Colleges for Teacher Education (AACTE), will contribute to the panel.

As well as setting out the role and contribution of the Scottish Council of Deans of Education, a key focus will be on discussing the challenges and opportunities experienced by those directly involved in leading teacher education faculties in universities. Themes and questions will be framed to encourage audience participation and dialogue, with each other and with the panel. While the current education reform programme in Scotland will be a focus, other areas to be explored will include shared challenges relating to teacher recruitment and retention and research priorities for teacher education.


References
Hayward, L., Baird, J.-A., Allan, S., Godfrey-Faussett, T., Hutchinson, C., MacIntosh, E., Randhawa, A., Spencer, E., & Wiseman-Orr, M. L. (2023). National qualifications in Scotland: A lightning rod for public concern about equity during the pandemic. European Journal of Education, 58, 83–97. https://doi.org/10.1111/ejed.12543

Menter, I. (2022). Maintaining quality in teacher education: a contemporary global challenge?. Child Studies, (1), 87–105. https://doi.org/10.21814/childstudies.4128

Chair
Professor Margery McMahon (Chair of the Scottish Council of Deans of Education)
Margery.mcmahon@glasgow.ac.uk
University of Glasgow
 
1:30pm - 3:00pm 09 SES 06 B: Teacher Quality and Educational Outcomes: Insights from Nordic Education Systems
Location: Gilbert Scott, 253 [Floor 2]
Session Chair: Leah Glassow
Paper Session
 
09. Assessment, Evaluation, Testing and Measurement
Paper

How is Teacher Quality Allocated in Swedish Secondary Schools? Evidence from Longitudinal Register Data

Eun Jeong Lee, Stefan Johansson, Maria Jarl

University of Gothenburg, Sweden

Presenting Author: Lee, Eun Jeong

With the introduction of a mandatory teacher license in 2011, Sweden aimed to raise teacher quality in its schools (Jarl & Rönnberg, 2019). However, the increasing need for more teachers, as well as difficulties in finding adequate positions that matched teachers’ certification, made that aim hard to fulfil (Hansson & Gustafsson, 2016). The current study investigates how different measures of teacher quality developed in Sweden during the past decade, focusing particularly on the distribution of teacher quality across disadvantaged and privileged schools.

The Swedish Education Act 2010 emphasizes equality in education. Pupils should be provided equal educational opportunities in terms of access and resources to promote their academic success, and education should be compensatory for those with special needs and disadvantaged backgrounds (Holmlund et al., 2020). The idea of compensatory allocation of resources can be related to the opportunity to learn (OTL) model, which addresses the essential inputs and processes within a school context for students to achieve intended outcomes (Elliott & Bartlett, 2016). While socioeconomic status (SES) has been shown to affect students’ achievement, there is a growing interest among scholars in what factors can compensate for low achievement and disadvantaged home backgrounds (Nilsen et al., 2020). One aspect of the OTL framework may be access to qualified teachers. However, students with low SES have been shown to have consistently less access to highly qualified teachers than their more advantaged peers (Luschei & Jeong, 2018; Glassow & Jerrim, 2022).

In a Swedish context, Hansson and Gustafsson (2016) studied how teacher quality was distributed among Swedish schools using teacher and student register data between 1994 and 2011. The study showed significant variation among schools in the number of qualified teachers, the student-teacher ratio, and the number of teachers whose teaching subjects did not match their teacher education. While schools whose students were eligible for mother tongue language learning (an indicator of immigration background) had a higher teacher density, they also had a higher number of unqualified teachers, and their teachers’ teaching subjects were more likely to differ from their subject specialization. The study also showed that teachers with high competence tended to work at schools where students had a high level of academic achievement and which provided a higher salary and a better working environment. Hansson and Gustafsson (2016) argued that the results were influenced by the decentralization and marketization reforms of 1991, which allowed municipalities and schools greater freedom in employing teachers.

In 2011, another teacher education reform was introduced to increase the quality and specialization of teacher education. The reform divided the one common teacher education program of 2001 into separate programs for different subjects and age groups (Åstrand, 2017). In parallel, the reform introduced a teacher license, and the teacher education qualification requirement for employment has been stricter since then (Jarl & Rönnberg, 2019). How the policy intention of raising teacher quality has played out alongside the need for more teachers has not been studied. An especially salient issue is how highly qualified teachers have been distributed across schools in the past decade in terms of teacher qualification and subject specialization. Against this background, we pose the following research questions:

  1. What is the proportion of teachers with a teacher license during 2013-2020 in Swedish secondary schools?
  2. What proportion of teachers have a matched position, i.e., a position in the subjects and age groups covered by their teacher license?
  3. Are there any differences across schools (school areas, school types, and student social background) in the proportion of teachers with a license and in the degree of matching?

Methodology, Methods, Research Instruments or Sources Used
This study uses Swedish population data from the Teacher Register and the Student Register kept by Statistics Sweden. These data form part of the national follow-up system for the school sector run by the Swedish National Agency for Education to provide a comprehensive picture of educational activities and support for follow-up and evaluation at national and regional levels (Alatalo et al., 2021).

The data on teachers are collected annually and include school staff with educational duties (teachers, assistant teachers and other educational staff, leisure teachers, leisure instructors, school leaders, and study and career counselors) in the school forms covered by the National Agency's National Monitoring System. The data have been collected since the late 1970s, and the structure and variables of the register have changed over the years. In the present study, we use data from 2013-2020. One reason is that the teacher register was updated in 2013 with more precise data on teachers' positions. Another reason is that we focus on the effects of introducing a compulsory teacher license in 2011. The present study uses the information on teachers' certification levels, the degree of match between teachers' licenses and positions, and their teaching experience measured in years. The teacher register also provides data on the school location (cities, suburbs, and rural areas) and school type (public or private).

Student register data is also updated annually, and in the current study, we use student cohorts born between 1997 and 2004. Data include achievement data on subject grades and national tests in school year 9. It also holds information on the parental background and immigration background of each student. This information will allow us to study if formal teacher competence is equally distributed across schools, if the compensatory allocation of teacher resources is present, and whether the degree of matched positions will increase due to the teacher license demands.

Regrettably, there is no link between students and their teachers. However, it is possible to conduct longitudinal analyses of the teacher characteristics at the school level, taking student characteristics into account.  

The analyses will mainly be carried out using descriptive statistics, using mean comparisons. Regression techniques such as growth curve modeling will also be considered where it is deemed appropriate. All data were derived from Statistics Sweden (SCB) and analyzed within the MONA (Microdata Online Access) system, which is Statistics Sweden's platform for access to microdata.
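The descriptive comparison outlined above can be sketched as a grouped proportion: the share of licensed teachers per year and school group. The record layout and the group labels are hypothetical and are not the register's actual variable names; the same pattern applies to the share of matched positions.

```python
from collections import defaultdict


def licensed_share_by_year_and_group(teacher_records):
    """Share of licensed teachers for each (year, school group) combination.

    teacher_records: iterable of dicts with the hypothetical keys
    "year", "school_group" (e.g. "disadvantaged" or "privileged")
    and "licensed" (bool).
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [licensed count, total count]
    for record in teacher_records:
        key = (record["year"], record["school_group"])
        counts[key][0] += 1 if record["licensed"] else 0
        counts[key][1] += 1
    return {key: licensed / total for key, (licensed, total) in counts.items()}
```

Comparing the resulting shares across groups within each year, and following them over the years, gives the kind of mean comparison over time that the analysis plan describes.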

Conclusions, Expected Outcomes or Findings
Results show that about 85% of teachers in grade 9 have received teacher education, although this figure does not specify whether that education covers the age group and subject they are teaching. Our preliminary analyses suggest that this general level of certification tends to decrease over time. The more precise measure, the matching between teachers’ positions (subjects taught) and their teacher license, shows that approximately 80% of the teachers are teaching a subject and grade that they were actually educated for. Furthermore, there are some notable discrepancies across schools. The proportion of teachers with a license does not differ much across privileged and disadvantaged schools, although there are indications that the gap increases over time, with disadvantaged schools at a lower level. However, the degree of matching is about 5-10% lower in disadvantaged schools, indicating difficulties in attracting teachers with adequate specializations. Regarding school type, we noted that private schools have a lower share of certified teachers and a lower degree of matching throughout the period (about 65%). The matching is particularly low at private schools where the students have a less advantaged parental background. The results suggest that there is no compensatory allocation of teacher resources in Sweden; instead, the trend points to the opposite. Further analyses will shed light on the teacher resources in different school locations and any differences for various school subjects.
References
Alatalo, T., Hansson, Å., & Johansson, S. (2021). Teachers' academic achievement: Evidence from Swedish longitudinal register data. European Journal of Teacher Education, Ahead-of-print(Ahead-of-print), 1-21.
Åstrand, B. (2017). Swedish teacher education and the issue of fragmentation: Conditions for the struggle over academic rigour and professional relevance. In Hudson, B. (Ed.), Overcoming fragmentation in teacher education policy and practice (pp. 101-152). Cambridge education research series.
Elliott, Stephen N., & Brendan J. Bartlett. (2014, Mar.3). Opportunity to Learn. In Oxford Handbooks Editorial Board (Online Eds.), Oxford Handbook Topics in Psychology. Oxford Academic. https://doi.org/10.1093/oxfordhb/9780199935291.013.70, accessed 30 Jan. 2023.
Glassow, L., & Jerrim, J. (2022). Is inequitable teacher sorting on the rise? Cross-national evidence from 20 years of TIMSS. Large-scale Assessments in Education, 10(1), 1-20.
Hansson, Å., & Gustafsson, J. (2016). Pedagogisk segregation: Lärarkompetens i den svenska grundskolan ur ett likvärdighetsperspektiv. Pedagogisk Forskning I Sverige, 21(1-2), 56-78.
Holmlund, H., Sjögren, A., & Öckert, B. (2020). Jämlikhet i möjligheter och utfall i den svenska skolan (Rapport 2020:7). Institutet för arbetsmarknads- och utbildningspolitisk utvärdering (IFAU).
Jarl, M., & Rönnberg, L. (2019). Skolpolitik : Från riksdagshus till klassrum (Tredje upplagan ed.).
Luschei, T. F., & Jeong, D. W. (2018). Is Teacher Sorting a Global Phenomenon? Cross-National Evidence on the Nature and Correlates of Teacher Quality Opportunity Gaps. Educational Researcher, 47(9), 556–576. https://doi.org/10.3102/0013189X18794401
Nilsen, T., Scherer, R., Gustafsson, J., Teig, N., & Kaarstein, H. (2020). Teachers’ role in enhancing equity – A multilevel structural equation modelling with mediated moderation. In Frønes, T., Pettersen, A., Radisić, J., & Buchholtz, N. (Eds.), Equity, Equality and Diversity in the Nordic Model of Education. Cham: Springer International Publishing AG.
Scheerens, J., & Blömeke, S. (2016). Integrating teacher education effectiveness research into educational effectiveness models. Educational Research Review, 18, 70-87. doi: 10.1016/j.edurev.2016.03.002.


09. Assessment, Evaluation, Testing and Measurement
Paper

Cultural and Linguistic Diversity, Classroom Climate, Reading Achievement and Teacher Quality in Norwegian Elementary Classrooms: A Longitudinal Investigation

Jacqueline Michelle Peterson

University of Stavanger, Norway

Presenting Author: Peterson, Jacqueline Michelle

The cultural and linguistic diversity of classrooms continues to increase significantly and steadily across OECD countries (OECD, 2020). In addition to the ongoing achievement gap, it has also been found that students from culturally and linguistically diverse backgrounds are experiencing lower levels of belongingness in schools (Cerna et al., 2021). This trend is concerning given that research has demonstrated students’ relatedness to be connected to students’ engagement, academic self-concept, motivation, willingness to seek help from peers, and academic achievement (Flook et al., 2005; Goodenow, 1993; Shim et al., 2013). In addition to individual characteristics, classroom composition has also been shown to affect the well-being and academic outcomes of students (Van Ewijk & Sleegers, 2010). However, results are mixed as to whether these compositional effects on academic achievement are positive (Cho, 2012; Rjosk, 2014) or negative (Van Ewijk & Sleegers, 2010; Rjosk et al., 2017). It is also unclear whether all students are affected similarly (Rjosk et al., 2017). Investigations into how cultural and linguistic class composition relates to students’ socioemotional outcomes have similarly produced inconclusive results (Thijs & Verkuten, 2014; Veerman et al., 2022).

On the one hand, Putnam’s (2007) constrict theory holds that the presence of cultural and linguistic diversity threatens social cohesion and increases social disorganization. Yet the finding that belongingness appears to be a function of cultural and linguistic distance (Cerna et al., 2021) and of orientation towards a majority norm reference group (Veerman et al., 2022) suggests that these in-group/out-group distinctions may not be so clear-cut. Indeed, in contrast to constrict theory, Allport’s (1954) intergroup contact theory suggests that the mixing of various ethnic groups can serve to reduce prejudice and result in greater understanding and empathy among members of varying ethnic groups over time. The extent to which these positive outcomes are realized, however, may be conditional on Allport’s (1954) criteria that group members get adequate opportunity to know each other, share a similar status position, are in a situation of collaboration, and are supported by the institution to which they belong. Applied to the context of education, both the teacher’s role and a positive, cooperative classroom climate can be seen as central components of Allport’s (1954) criteria.

To assess the extent to which these theories hold within the context of Norwegian elementary classrooms, the present study aims to assess how classroom cultural and linguistic diversity relates to students’ perception of classroom climate and reading achievement over the 2nd, 3rd, and 4th grades, and the role of teacher quality in these relations. In doing so, this study aims to address the inconclusiveness of research findings to date, to expand our understanding of these relations beyond the US to the Nordic context, and to contribute to the limited body of longitudinal studies on these relations. Guided by Jennings & Greenberg’s (2009) Prosocial Classroom Model and drawing upon Deci & Ryan’s (1985) theory of self-determination, this study addresses the following research questions:

1) How does cultural and linguistic diversity in the classroom relate to students’ perceptions of classroom climate and reading comprehension over time in Norwegian elementary schools?

2) Does teacher quality moderate relations between classroom cultural and linguistic diversity and students’ perceptions of classroom climate over time?


Methodology, Methods, Research Instruments or Sources Used
The present study uses longitudinal multilevel structural equation modeling to conduct a secondary analysis of 2800 students nested within 150 classes from 150 schools in Southwestern Norway, who comprised the control group of a larger RCT study (see Solheim et al., 2017). Measurements were carried out in June 2017, 2018, and 2019 as students were completing their first, second, and third grades.

Cultural and Linguistic background of students: Drawing on Greenberg’s (1956) monolingual weighted method, parents’ reported country of birth was categorized and weighted for its cultural and linguistic proximity to Norwegian, based upon Levenshtein distances within the Indo-European languages tree (Serva & Petroni, 2008). This resulted in a scale from 0 to 11, where 0 = Norwegian, 1-10 = Indo-European language families, and 11 = other, non-Indo-European language families. Parents’ scores were then averaged to derive a single cultural and linguistic background statistic for each student. Students’ scores were then averaged within each class to derive a cultural and linguistic diversity average for that class.
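
The two averaging steps described above can be sketched as follows. This is only an illustration under stated assumptions: the function names and the example classroom are hypothetical, and the sketch takes the 0-11 proximity scores as given rather than deriving them from Levenshtein distances.

```python
from statistics import mean

def student_background(parent_scores):
    """Average the parents' proximity scores (0 = Norwegian, 11 = non-Indo-European).

    Hypothetical helper: accepts one or more parent scores on the 0-11 scale.
    """
    for score in parent_scores:
        if not 0 <= score <= 11:
            raise ValueError("proximity scores must lie on the 0-11 scale")
    return mean(parent_scores)

def class_diversity(students):
    """Average the student-level statistics to obtain one value per class."""
    return mean(student_background(parents) for parents in students)

# Hypothetical classroom: each tuple holds the scores of one student's parents.
classroom = [(0, 0), (0, 4), (11, 11), (2, 2)]
print([student_background(p) for p in classroom])
print(class_diversity(classroom))
```

In a two-level model of the kind described, the student-level value would serve as the within-class variable and the class average as the between-class variable.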

Classroom Climate was measured using an adapted version of Rauer & Schuck’s (2003) scale of emotional and social experience in school. Students rated seven items on a 4-point Likert scale from which a latent factor of classroom climate was then derived.

Reading Comprehension was assessed using the Norwegian version of Form 2 of the Neale Analysis of Reading Ability (NARA; Neale, 1997). Students read a short text, responded to open-ended comprehension questions, and received a score of 0 = wrong or 1 = right for each answer, up to 32 possible points.

Data Analyses

The software Mplus (Version 8.8; Muthen & Muthen, 1998 – 2010) with the Maximum Likelihood Robust (MLR) estimator was used for all analyses. A doubly latent approach was first used to establish cross-level measurement invariance for classroom climate. Next, a measurement model was specified to establish configural, metric, and scalar invariance across the three time waves. A structural multilevel multi-group model was then defined, with time treated as a grouping variable. Students’ individual cultural and linguistic background was specified as a within-class variable, and the aggregated cultural and linguistic diversity of the classroom as a between-class variable. Classroom climate was regressed on cultural and linguistic diversity at both levels. Thereafter, the paths between variables were constrained over subsequent models to test for differences in these relations across the measurement waves. Future models will include reading comprehension and teacher quality.

Conclusions, Expected Outcomes or Findings
The latent factor measurement model of classroom climate, specified as a doubly latent model with cross-level measurement invariance, demonstrated acceptable model fit for each measurement wave. The time-invariance measurement models of classroom climate also demonstrated good fit and were found to be partially invariant. The multilevel multigroup structural equation model also demonstrated excellent fit (χ² = 650.216, df = 144, p < .05, RMSEA = .037, CFI = .935, SRMR within = .025, SRMR between = .043). This preliminary model revealed a negative relationship between class cultural and linguistic diversity and classroom climate at the between level across all three time waves (Wave 1: β = -.480, p < .001, R² = .23, p = .001; Wave 2: β = -.469, p < .001, R² = .22, p = .001; Wave 3: β = -.472, p < .001, R² = .223, p = .002). When these relations were constrained across measurement waves, no significant difference was observed between any of the three time points, suggesting that the negative relationship holds relatively constant. No significant relations were found at the within level.
Future analyses will include SES as a covariate, reading comprehension as an outcome, and teacher quality as a potential moderator. Though preliminary, the results point to potentially important insights for the Norwegian elementary context. The significant, unchanging negative relation between class cultural and linguistic diversity and classroom climate suggests that the premises of constrict theory may hold true and emerge early. This is troubling, as classroom climate has been shown to relate to a variety of student outcomes. Preliminary results suggest that teachers of culturally and linguistically diverse classrooms may need to take extra care to foster a collaborative classroom climate among peers of varying cultural and linguistic backgrounds.

References
Allport, G. W. (1954). The nature of prejudice. Reading, MA: Addison Wesley.
Cerna, L., Brussino, O., & Mezzanotte, C. (2021). The resilience of students with an immigrant background. https://www.oecd-ilibrary.org/content/paper/e119e91a-en
Cho, R. M. (2012). Are there peer effects associated with having English Language Learner (ELL) classmates? Evidence from the Early Childhood Longitudinal Study Kindergarten Cohort (ECLS-K). Economics of Education Review, 31(5), 629-643. https://doi.org/10.1016/j.econedurev.2012.04.006
Deci, E. L., & Ryan, R. M. (1985). Intrinsic Motivation and Self-Determination in Human Behavior (1st ed.). Springer US.
Flook, L., Repetti, R. L., & Ullman, J. B. (2005). Classroom Social Experiences as Predictors of Academic Performance. Developmental Psychology, 41(2), 319–327. https://doi.org/10.1037/0012-1649.41.2.319
Goodenow, C. (1993). Classroom belonging among early adolescent students: Relationships to motivation and achievement. The Journal of Early Adolescence, 13(1), 21–43.
Jennings, P. A., & Greenberg, M. T. (2009). The Prosocial Classroom: Teacher Social and Emotional Competence in Relation to Student and Classroom Outcomes. Review of Educational Research, 79(1), 491–525. https://doi.org/10.3102/0034654308325693
OECD (2020). International Migration Outlook 2020, OECD Publishing. https://doi.org/10.1787/ec98f531-en
Putnam, R. D. (2007). E Pluribus Unum: Diversity and Community in the Twenty-first Century. The 2006 Johan Skytte Prize Lecture. Scandinavian Political Studies, 30, 137–174. https://doi.org/10.1111/j.1467-9477.2007.00176.x
Rjosk, C., Richter, D., Hochweber, J., Lüdtke, O., Klieme, E., & Stanat, P. (2014). Socioeconomic and language minority classroom composition and individual reading achievement: The mediating role of instructional quality. Learning and Instruction, 32, 63–72. https://doi.org/10.1016/j.learninstruc.2014.01.007
Rjosk, C., Richter, D., Lüdtke, O., & Eccles, J. S. (2017). Ethnic composition and heterogeneity in the classroom: Their measurement and relationship with student outcomes. Journal of Educational Psychology, 109(8), 1188–1204. https://doi.org/10.1037/edu0000185
Shim, S. S., Kiefer, S. M., & Wang, C. (2013). Help Seeking Among Peers: The Role of Goal Structure and Peer Climate. The Journal of Educational Research, 106(4), 290–300. https://doi.org/10.1080/00220671.2012.692733
Thijs, J., & Verkuyten, M. (2014). School ethnic diversity and students’ interethnic relations. British Journal of Educational Psychology, 84(1), 1–21. https://doi.org/10.1111/bjep.12032
Van Ewijk, R., & Sleegers, P. (2010). Peer ethnicity and achievement: A meta-analysis into the compositional effect. School Effectiveness and School Improvement, 21(3), 237–265. https://doi.org/10.1080/09243451003612671
Veerman, G.-J. M., Heizmann, B., & Schachner, M. K. (2022). Conditions for cultural belonging among youth of immigrant descent in Germany, the Netherlands, Sweden and the United Kingdom. Comparative analysis of intergroup experiences and classroom contexts. Ethnic and Racial Studies, 45(16), 659–683. https://doi.org/10.1080/01419870.2022.2136010


09. Assessment, Evaluation, Testing and Measurement
Paper

Teacher Qualifications And Teaching Quality Related To Changes In Mathematics Achievement in TIMSS from 2015 To 2019

Trude Nilsen1, Hege Kaarstein2

1University of Oslo, Norway; 2University of Oslo, Norway

Presenting Author: Nilsen, Trude

The Trends in International Mathematics and Science Study (TIMSS) measures students’ competence based on the participating countries’ curricula. Changes in students’ TIMSS achievement over time within a country are hence of interest to educational policy and practice, as they may be related to, or possibly reflect, changes in contextual factors important for student learning.

In Norway, students’ performance in mathematics at grade 9 decreased from 2015 to 2019, as evidenced by TIMSS. Given that teachers’ competence and their instruction are the factors most proximal to students and key to their learning outcomes (Baumert et al., 2010; Darling-Hammond, 2000; Klieme et al., 2009; Praetorius et al., 2018), the present study seeks to investigate whether changes in teacher variables may be related to changes in the students’ mathematics achievement.

Teachers’ competence is shaped by their formal level of education, the degree to which they have subject specialization, and through participation in professional development (e.g. Desimone et al., 2013). Teacher competence has proven to be important for teaching quality and for the students’ learning outcome (e.g. Baumert et al., 2010; Jentsch & König, 2022): A competent teacher tends to provide high quality teaching.

Teaching quality reflects the teaching going on in the classroom (i.e. teachers’ behavior) (Praetorius et al., 2018). It is a broad concept including different dimensions that have been found to promote student learning (e.g. Pianta & Hamre, 2009). According to the framework of the Three Basic Dimensions (TBD) (Klieme et al., 2009; Praetorius et al., 2018), teaching quality includes the following three dimensions: 1) Classroom management, which is about arranging for effective learning in the classroom (e.g. managing noise and disruptions); 2) Supportive climate, in which the teacher for example shows interest in and respect for all students, gives (individual) feedback and helps connect new topics to what has already been learned; and 3) Cognitive activation, which is about challenging the students cognitively (e.g. students are asked to reason, interpret, solve problems).

As the TBD framework captures the main aspects of teaching quality and is extensively used in Europe and in the TIMSS contextual framework (Senden, Nilsen, & Blömeke, 2021; Mullis et al., 2020; Pianta & Hamre, 2009), it is used as the theoretical background for teaching quality in this study.

However, the quality of the teaching also depends on who is taught, on the background and composition of students (Praetorius et al., 2018). The quality of the teaching may be limited by students who for instance lack basic previous knowledge, who are uninterested, or lack basic language skills.

Knowing how important teachers and their teaching are to student learning, it is plausible that changes in teacher competence, their teaching quality and limitations to teaching quality, may be related to changes in student outcome. This study hence seeks to answer the following two research questions:

1. How have teacher competence, teaching quality and limitations to teaching quality changed from 2015 to 2019?

2. What is the relation between the changes in the predictors (i.e. teacher competence, teaching quality and limitations to teaching quality) and the changes in students’ mathematics achievement from 2015 to 2019? In other words, do changes in the predictors mediate the changes in students’ mathematics performance over time?


Methodology, Methods, Research Instruments or Sources Used
The present study includes representative samples of Norwegian grade nine students who participated in TIMSS 2015 and TIMSS 2019 along with their mathematics teachers (Nstudents=9272, Nteachers=516). Only measures identical in the two cycles were included in the study.

The following variables from the teacher questionnaire were used to measure teachers’ competence: teachers’ highest formal educational level, teachers’ subject specialization (major in mathematics and/or in mathematics education), and teachers’ participation in professional development.

For teaching quality, only two of the three dimensions were included (supportive climate and cognitive activation) as classroom management was not included in both cycles. We used the student questionnaire to measure supportive climate (5 items, e.g. “My teacher is easy to understand”), and the teacher questionnaire to measure cognitive activation (6 items, e.g. “Ask students to explain their answers”).
Limitations to teaching were measured by teacher responses (6 items, e.g. “Students lacking prerequisite knowledge”).

Methods of analysis
Data from 2015 was merged with data from 2019, adding a dummy variable labelled Time (coded 0=2015 and 1= 2019). Then Mplus version 8 was used to estimate a two-level (students and classes) mediation structural equation model (SEM) with trend data (Murnane & Willett, 2010).

The TIMSS 2019 report showed that the Norwegian grade nine students’ achievement in mathematics had declined by 9 points since the 2015 study (Mullis et al., 2020). Consequently, the unstandardized regression coefficient of the effect of Time on achievement is expected to be negative and approximately 9 points. Our question is whether the predictors (i.e., the change in teacher competence, teaching quality and limitations to teaching quality) may mediate this decline.
Using teaching quality as an example, the hypothesis is that if teaching quality has declined over time, and if teaching quality has a positive effect on achievement, it may partly mediate the effect of Time on Achievement. The mediation coefficient (the indirect effect) would be negative, thus teaching quality could be said to partly “explain” (albeit, not causally) the decline in achievement. Teaching quality would then mediate part of the decline in achievement over time.
Alternatively, if teaching quality has increased over time, and is positively related to achievement, the result would be a positive mediation (indirect effect). As a consequence of a positive indirect effect, one might possibly claim that the higher teaching quality “prevented” an even larger decline.
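
The sign logic of the two scenarios above can be illustrated with the standard product-of-coefficients view of mediation. This is a minimal sketch, not the authors' Mplus model: the function name is hypothetical and the coefficient values are invented purely for illustration.

```python
def indirect_effect(a, b):
    """Product-of-coefficients indirect effect.

    a: effect of Time (0 = 2015, 1 = 2019) on the mediator, i.e. the change
       in the mediator between the two cycles.
    b: effect of the mediator on achievement, in score points per unit.
    """
    return a * b

# Hypothetical values only; they are not estimates from the study.
b_quality = 10.0  # teaching quality positively related to achievement

# Scenario 1: teaching quality declined, so the indirect effect is negative
# and accounts for part of the decline in achievement.
declined = indirect_effect(a=-0.2, b=b_quality)

# Scenario 2: teaching quality increased, so the indirect effect is positive
# and works against the decline.
improved = indirect_effect(a=0.2, b=b_quality)

print(declined, improved)
```

The sign of the indirect effect is simply the product of the signs of the two paths, which is why a mediator that both increased over time and relates positively to achievement cannot account for a decline.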

Conclusions, Expected Outcomes or Findings
Results.
Regarding research question 1, no changes were found for teachers’ specialization and professional development. Teachers’ formal level of education, supportive climate and cognitive activation increased from 2015 to 2019. Lastly, the teachers reported a higher level of limitations to teaching in 2019 than in 2015.

The results for the second research question showed significant, positive relations from teachers’ formal level of education, supportive climate, and limitations to teaching to student achievement.
However, only the increases in teachers’ level of formal education and in supportive climate each had a positive indirect effect on achievement, mediating 1.5 and 2.3 points of the 9-point decline in achievement, respectively. The higher level of limitations to teaching had a negative indirect effect and mediated about 6 of the 9 points of the decline.
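
One way to read these figures together is through the additive decomposition of a linear mediation model, in which the total effect equals the direct effect plus the sum of the indirect effects. The sketch below applies that identity to the rounded values reported above. It assumes that the three mediators are the only indirect paths and that "about 6" means exactly 6, so the remainder it computes is an illustration and not a result from the study.

```python
def remaining_direct_effect(total, indirect_effects):
    """Direct effect implied by total = direct + sum(indirect effects)."""
    return total - sum(indirect_effects)

total_change = -9.0  # decline in achievement from 2015 to 2019, in points

indirect = {
    "formal education": 1.5,           # positive: works against the decline
    "supportive climate": 2.3,         # positive: works against the decline
    "limitations to teaching": -6.0,   # negative: contributes to the decline
}

direct = remaining_direct_effect(total_change, indirect.values())
print(round(sum(indirect.values()), 1), round(direct, 1))
```

Under these assumptions the positive and negative indirect effects partly cancel, which is consistent with the reading that the increases in education and supportive climate may have prevented a larger decline.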

Discussion and conclusion.
The increase in teachers’ level of education and teaching quality could be a result of extensive strategies implemented in Norway during the last decade, aiming to increase teachers’ competence in teaching mathematics (e.g. Ministry of Education and Research, 2015). The positive indirect effect could hence indicate that an even larger decline was prevented.
The limitations to teaching, on the other hand, had a negative indirect effect, which indicates a contribution to the decline. Increased limitations to instruction over time reflect challenges with students who, for example, do not speak the language, who are hungry at school and lack sleep, and who lack prerequisite knowledge. Our findings on this are in line with other studies (e.g. Wendelborg et al., 2020).

The present study may contribute to the field with regard to the methodology, which is robust and useful for identifying relations between changes in predictors and changes in achievement. It further contributes to practice and policy.

References
Baumert, J., Kunter, M., Blum, W., Brunner, M., et al. (2010). Teachers’ Mathematical Knowledge, Cognitive Activation in the Classroom, and Student Progress. American Educational Research Journal, 47(1), 133-180.

Darling-Hammond, L. (2000). Teacher Quality and Student Achievement: A Review of State Policy Evidence. Education Policy Analysis Archives, 8(1).

Desimone, L. M., Smith, T. M., & Phillips, K. J. R. (2013). Linking Student Achievement Growth to Professional Development Participation and Changes in Instruction: A Longitudinal Study of Elementary Students and Teachers in Title I Schools. Teachers College Record, 115(5), 1-46.

Gustafsson, J.-E. (2013). Causal inference in educational effectiveness research: a comparison of three methods to investigate effects of homework on student achievement. School Effectiveness and School Improvement, 24(3), 275-295.

Jentsch, A., & König, J. (2022). Teacher Competence and Professional Development. In T. Nilsen, A. Stancel-Piątak, & J.-E. Gustafsson (Eds.), International Handbook of Comparative Large-Scale Studies in Education: Perspectives, Methods and Findings (pp. 1167-1183). Springer International Publishing.

Klieme, E., Pauli, C., & Reusser, K. (2009). The Pythagoras Study: Investigating Effects of Teaching and Learning in Swiss and German Mathematics Classrooms. In T. Janík & T. Seidel (Eds.), The Power of Video Studies in Investigating Teaching and Learning in the Classroom (pp. 137-160). Waxmann Verlag.

Ministry of Education and Research. (2015). Competence for Quality. Strategy for professional development for teachers and school leaders towards 2025. Oslo: Ministry of Education and Research

Mullis, I. V. S., Martin, M. O., Foy, P., Kelly, D. L., & Fishbein, B. (2020). TIMSS 2019 International Results in Mathematics and Science. Boston College, TIMSS & PIRLS International Study Center. https://timssandpirls.bc.edu/timss2019/international-results/

Murnane, R. J., & Willett, J. B. (2010). Methods matter: Improving causal inference in educational and social science research. Oxford University Press.

Pianta, R. C., & Hamre, B. K. (2009). Conceptualization, Measurement, and Improvement of Classroom Processes: Standardized Observation Can Leverage Capacity. Educational Researcher, 38(2), 109-119.

Praetorius, A. K., Klieme, E., Herbert, B., & Pinger, P. (2018). Generic dimensions of teaching quality: the German framework of Three Basic Dimensions. ZDM, 50(3), 407-426.

Senden, B., Nilsen, T., & Blömeke, S. (2021). Instructional Quality: A Review of Conceptualizations, Measurement Approaches, and Research Findings. In M. Blikstad-Balas, K. Klette, & M. Tengberg (Eds.), Ways of Analyzing Teaching Quality (pp. 140-172). Scandinavian University Press.

Wendelborg, C., Dahl, T., Røe, M., & Buland, T. (2020). The Student Survey 2019 [Elevundersøkelsen 2019]. NTNU Samfunnsforskning.
 
3:30pm - 5:00pm09 SES 07 B: Exploring Student Perspectives and Teacher Experiences: Feedback in Education
Location: Gilbert Scott, 253 [Floor 2]
Session Chair: Gudrun Erickson
Paper Session
 
09. Assessment, Evaluation, Testing and Measurement
Paper

Student Perceptions of Self-generated Feedback: “It Made the Course Make Sense”.

William McGuire, David Nicol, Gemma Haywood

University of Glasgow, United Kingdom

Presenting Author: McGuire, William

In the Professional Enquiry and Decision-Making course at the University of Glasgow, part of the MEd in Professional Practice, students write a 1500-word assignment, which they often find challenging. The task is new and complex; their work must be original, and they are often unclear about the requirements associated with each assessment criterion. Rubrics, which describe what is required, are of limited help. Existing support, in which students receive peer and tutor feedback prior to assignment submission, improves outcomes but incurs high staff workload and does not necessarily foster independence. Therefore, a complementary intervention was devised in which students generated their own feedback (Nicol, 2021).

International studies have shown that feedback is an area of European or even global concern for students, even though they can create their own feedback by comparing their work against rubrics, exemplars, or peers’ work (e.g., Lipnevich et al., 2014; Nicol and McCallum, 2022). Indeed, Nicol (2021) has developed a model to explain this in which the core feedback generation mechanism is comparison, arguing that capacity-building for self-regulation requires students to develop inner feedback capability via explicit comparisons (Nicol and Selveretnam, 2022).

Prior research provides exemplars before student work to clarify requirements, although recently some have argued for their use after student work: a form of post-production feedback (To, Panadero and Carless, 2022). However, we argue that both modes support self-feedback production. Exemplars can be similar in presentation format and subject topic to the work the student has produced, or similar in format but different in topic. With this assignment, the latter enables a focus on writing (e.g., structure, argument) without distraction from content.

Five aspects of the MEd assignment served as the focus for feedback improvements: the writing of literature search strategies, literature review, ethics application, research dissemination and limitations in research designs. For each, students: (i) compared exemplars of quality work (different topic/similar format) selected from students in previous years and identified common principles; (ii) produced their own work; and (iii) compared their findings from (i) with their own work. The tutor guided students through the first comparison in class, with the second completed individually out of class (Nicol, 2021).


Methodology, Methods, Research Instruments or Sources Used
An online survey was deployed to generate quantitative data on the students’ perceptions of the extent of learning from the different comparison processes. The survey was constructed based on the findings of two focus groups. The process of reflexive thematic analysis designed by Braun and Clarke (2006, 2012, 2014, 2019) and developed in Braun and Clarke (2020) was deployed to identify, analyse, and report on emergent themes within the data sets (Braun and Clarke, 2006: 79). The use of a Big Q approach enabled us to use both qualitative and quantitative data, which mapped onto our research design to test a theory and to let the data lead.
Conclusions, Expected Outcomes or Findings
Students were extremely positive about this approach. The before comparison clarified understanding of task requirements thereby reducing anxiety and enabled them to generate feedback while producing their own work, although this did reduce the need for the second comparison. We will discuss how to address this issue. Most reported that delineating the comparison process raised awareness that they could take more agency over feedback processes.
Results had already been excellent on this programme, but post-intervention results were outstanding, with 17/30 students being awarded first class marks in their dissertations. The use of partial exemplars, which were more palatable for students, proved helpful. The use of exemplars both as part of the feedback design protocol and as part of the peer and tutor review process was felt to be beneficial by participants.
Next steps include scaling the protocol up and out through further testing with a larger cognate group, such as a PGDE class or an M Educ class in which numbers are much higher. Another possibility would be to trial the process in a non-cognate group or even to trial the use of non-exemplars. Perhaps the greatest benefit beyond student satisfaction and attainment is the potential to develop much greater student agency.

References
Alfieri, L., Nokes-Malach, T.J., and Schunn, C.D. (2013). Learning through case comparisons: A meta-analytic review, Educational Psychologist, 48:2, 87-113, DOI: 10.1080/00461520.2013.775712

Braun, V., Clarke, V. (2006). Using thematic analysis in psychology. Qualitative Research in Psychology, 3(2), 77–101, DOI: 10.1191/1478088706qp063oa

Braun, V., Clarke, V. (2012). Thematic analysis. In: Cooper, H., Camic, P.M., Long, D.L., Panter, A.T., Rindskopf, D., Sher, K.J. (eds.) APA Handbook of Research Methods in Psychology, Research Designs, vol. 2, pp. 57–71. American Psychological Association, Washington.

Braun, V., Clarke, V. (2014) Thematic analysis. In: Teo, T. (ed.) Encyclopaedia of Critical Psychology, pp. 1947–1952. Springer, New York.

Braun, V., Clarke, V. (2019). Reflecting on reflexive thematic analysis. Qualitative Research in Sport, Exercise and Health. Volume 11, 2019 – Issue 4, DOI: 10.1080/2159676X.2019.1628806

Braun, V., Clarke, V. (2020). One size fits all? What counts as quality practice in (reflexive) thematic analysis? Qual. Res. Psychology, DOI: 10.1080/14780887.2020.1769238

Lipnevich, A.A., McCallen, L.N., Miles, K.P., and Smith, J.K. (2014). Mind the gap! Students’ use of exemplars and detailed rubrics as formative assessment. Instructional Science, 42(4) pp. 539–559, DOI: 10.1007/s11251-013-9299-9

Nicol, D. (2021). ‘The power of internal feedback: exploiting natural comparison processes’, Assessment & Evaluation in Higher Education, 46(5) pp. 756-778, DOI: 10.1080/02602938.2020.1823314

Nicol, D., and McCallum, S. (2022). Making internal feedback explicit: exploiting the multiple comparisons that occur during peer review, Assessment & Evaluation in Higher Education, 47(3) pp. 424-443, DOI: 10.1080/02602938.2021.1924620

Nicol, D., and Selveretnam, G. (2022). Making internal feedback explicit: harnessing the comparisons students make during two-stage exams, Assessment & Evaluation in Higher Education, 47(4) pp. 507-522, DOI: 10.1080/02602938.2021.1934653

To, J., Panadero, E., and Carless, D. (2022). A systematic review of the educational uses and effects of exemplars, Assessment & Evaluation in Higher Education, 47(8) pp. 1167-1182, DOI: 10.1080/02602938.2021.2011134


09. Assessment, Evaluation, Testing and Measurement
Paper

Does Gender predict Upper Secondary School Students’ Perceptions of Teacher Feedback?

Katharina Dreiling, Ariane S. Willems

University of Göttingen, Germany

Presenting Author: Dreiling, Katharina

A key assumption of established models of school effectiveness and improvement is that aspects of teaching quality significantly affect the development of students’ competencies and attitudes (Kyriakides & Creemers, 2008). In particular, the power of feedback as a component of teaching quality has been stressed (Hattie & Timperley, 2007; Lipnevich & Smith, 2018; Seidel & Shavelson, 2007). Drawing on the feedback theory of Hattie and Timperley (2007), three dimensions of feedback quality are distinguished: On the task level, feedback informs learners about their actual state of learning and/or performance. On the process level, feedback provides information on the progress students have made toward meeting their learning goals and gives hints on how to improve. On the self-regulation level, feedback encourages students to regulate and evaluate their own learning process. Additionally, Willems and Dreiling (in press) suggested a dialogue-related dimension of feedback involving peers as a source of feedback in evaluating students’ performances. Existing studies show that the quality of feedback affects learning outcomes on both cognitive (e.g., achievement) and motivational levels (e.g., intrinsic motivation) (Rakoczy et al., 2008; Wisniewski et al., 2020). The impact of feedback, however, is not necessarily positive, which indicates that individual students differ considerably in the ways that they perceive and use the feedback they receive (Wisniewski et al., 2020). In current social constructivist models, the learner is assumed to be an active agent in receiving, perceiving, and processing feedback information (Thurling et al., 2013). Recently, Lipnevich et al. (2016) proposed a student interaction model of feedback that highlights how feedback is received by the learner and how subsequent action on feedback is influenced by the learner’s individual characteristics.
Hence, examining students’ perception of feedback and its determinants has been the focus of much recent feedback research (Lipnevich & Lopera-Quendo, 2022; Winstone et al., 2017).

There is also evidence that points to gender differences in the perception and processing of feedback (Chen et al., 2011; Hoya, 2021). Yet, studies on gender differences in perceptions of feedback are limited, primarily because of the lack of existing instruments that measure multiple dimensions of feedback perception (Lipnevich & Lopera-Quendo, 2022). Against this background, we aim to investigate whether boys and girls differ in their perception of feedback in German language classes. We adopt a multidimensional view on feedback (Strijbos et al., 2021) by differentiating simple and elaborated dimensions of feedback quality that influence how the feedback is perceived and used for further learning. In order to make meaningful comparisons of means across gender groups, measurement invariance of the instrument must be established (Millsap, 2011). Thus, the purpose of the current study is threefold. First, we discuss the validation of an instrument to measure multiple dimensions of perceived feedback quality. Second, we examine the measurement invariance of the feedback perception questionnaire across gender and investigate mean differences in the feedback perception scales between gender groups. Third, we explore whether the assumed relations between gender and feedback perception exist even under control of individual performance as well as the students’ intrinsic learning motivation.


Methodology, Methods, Research Instruments or Sources Used
The presented results are based on data from the German study FeeHe (‘Feedback in the context of heterogeneity’). To the best of our knowledge, FeeHe is the first study in which different theoretically and empirically derived dimensions of feedback are systematically measured from the perspective of high school students in German language classes. A repeated-measures design with two measurement points was used to investigate students’ perception of teacher feedback and the interplay between the perceived feedback and the students’ individual characteristics. At the beginning of a school semester (t1), a total of n = 810 students (mean age = 16.69 [SD = .84]; female = 53.8%) attending the 11th and 12th grade in 49 German language courses participated in the questionnaire study. After one school semester (t2), n = 696 of the students (mean age = 17.17 [SD = .90]; female = 55.2%) were surveyed again. To assess the students’ perception of teacher feedback, we developed a new instrument which distinguishes four dimensions of perceived feedback quality: (i) a task-oriented dimension, (ii) a process-oriented dimension, (iii) a self-regulation-oriented dimension and (iv) a dialogue-oriented dimension. Each dimension of feedback perception was assessed by four items. The internal consistencies of the scales are satisfactory to good (t1: .68≤α≤.72; t2: .76≤α≤.80). Student gender data were gathered from teacher interviews at t1. To assess students’ performance, the teachers were asked to provide each student’s current grade in German. Grades range from 1 (excellent) to 6 (insufficient/fail). For the analyses reported in this paper, grades were recoded so that higher numbers represent better performance. Students’ intrinsic motivation for German was assessed by a scale consisting of six items, measured at the beginning of the school semester (t1). The internal consistency of the scale was very good (α=.93).
All scales were answered on a 4-point Likert scale ranging from 1 (totally agree) to 4 (totally disagree).
To examine the structural validity of the feedback perception scales, Confirmatory Factor Analyses were conducted for each measurement point. Measurement invariance and gender differences in feedback perception were explored by applying Multigroup Confirmatory Factor Analysis. Subsequently, Longitudinal Structural Equation Modeling was used to investigate how students’ gender, performance and intrinsic learning motivation predict the initial level (t1) and changes in feedback perceptions over time (t2-t1).
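Two of the data preparation steps named above, recoding grades and reporting internal consistency as Cronbach's α, can be illustrated with a minimal sketch. It is not the authors' analysis code: the function names and the example responses are hypothetical, and the standard formula for α (based on item variances and the variance of the sum score) is assumed.

```python
from statistics import variance


def reverse_code(grade, low=1, high=6):
    """Flip a German grade so that higher numbers mean better performance (1 -> 6, 6 -> 1)."""
    return low + high - grade


def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    items: one list of scores per item, all of equal length
    (position i in every list belongs to the same respondent).
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # sum score per respondent
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical responses of five students to a four-item scale (1-4 Likert format).
example_items = [
    [1, 2, 2, 3, 4],
    [1, 2, 3, 3, 4],
    [2, 2, 2, 4, 4],
    [1, 1, 3, 3, 4],
]

if __name__ == "__main__":
    print([reverse_code(g) for g in (1, 3, 6)])
    print(round(cronbach_alpha(example_items), 2))
```

With real data, each inner list would hold one questionnaire item across all respondents; items that all measure the same construct consistently push α toward 1.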

Conclusions, Expected Outcomes or Findings
The results on the factor structure of individual students' perceptions of feedback quality are in line with previous research and strengthen the distinction of perceived dimensions of feedback quality (Hattie & Timperley, 2007; Willems & Dreiling, in press). These results show that students are generally able to distinguish between the four dimensions of feedback quality in their ratings. However, this distinction is not perfect as indicated by the high correlations between the task-oriented and process-oriented dimension (t1: r = .83, t2: r = .84). We argue that this is not merely a measurement issue, but rather reflects a teacher practice of providing concurrent feedback on student achievement and progress. Concerning the second research question, Multigroup Confirmatory Factor Analysis revealed measurement invariance of the identified factor structure across gender groups. Given that measurement invariance was established, the measures of students’ perceptions can be used to compare means of perceived feedback quality between boys and girls. Contrary to our expectations, we could not find any gender-specific mean differences in upper secondary school students’ perceptions of various dimensions of feedback quality. Results from Longitudinal Structural Equation Modeling revealed that initial intrinsic learning motivation and performance are significant predictors of interindividual differences in the initial level and change of feedback perceptions.
Overall, our results highlight that differences in perceptions of feedback quality can be explained by students’ individual motivational and cognitive learning characteristics rather than by their gender, and that such interindividual differences in perceptions must be taken into account when examining the effectiveness of feedback.

References
Chen, Y., Thompson, M.S., Kromrey, J.D., & Chang, G.H. (2011). Relations of student perceptions of teacher oral feedback with teacher expectancies and student self-concept. The Journal of Experimental Education, 79(4), 452–477.
Hattie, J., & Timperley, H. (2007). The Power of Feedback. Review of Educational Research, 77(1), 81–112.
Hoya, F. (2021). Unterschiede in der Wahrnehmung positiven und negativen Feedbacks von Mädchen und Jungen im Leseunterricht der Grundschule. Unterrichtswissenschaft, 49, 423–441.
Kyriakides, L., & Creemers, B.P.M. (2008). Using a multidimensional approach to measure the impact of classroom level factors upon student achievement. School Effectiveness and School Improvement, 19(2), 183–205.
Lipnevich, A. A., Berg, D. A. G., & Smith, J. K. (2016). Toward a model of student response to feedback. In G. T. L. Brown & L. R. Harris (Eds.), The handbook of human and social conditions in assessment (pp. 169–185). New York: Routledge.
Lipnevich, A. A., & Smith, J. K. (Eds.). (2018). The Cambridge handbook of instructional feedback. Cambridge: Cambridge University Press.
Lipnevich, A. A., & Lopera-Oquendo, C. (2022). Receptivity to instructional feedback: A validation study in the secondary school context in Singapore. European Journal of Psychological Assessment. Advance online publication.
Millsap, R. E. (2011). Statistical approaches to measurement invariance. New York, NY: Routledge.
Rakoczy, K., Klieme, E., Bürgermeister, A., & Harks, B. (2008). The interplay between student evaluation and instruction. J. Psychol. 216, 111–124.
Seidel, T., & Shavelson, R. (2007). Teaching effectiveness research in the past decade: The role of theory and research design in disentangling meta-analysis results. Review of Educational Research, 77, 454–499.
Thurlings, M., Vermeulen, M., Bastiaens, T., & Stijnen, S. (2013). Understanding feedback: A learning theory perspective. Educational Research Review, 9, 1–15.
Willems, A. S. & Dreiling, K. (in press). Erklären individuelle Motivationsprofile von Schülerinnen und Schülern Unterschiede in ihrer Feedbackwahrnehmung im Deutschunterricht der gymnasialen Oberstufe? Journal for Educational Research Online.
Winstone, N. E., Nash, R. A., Parker, M. & Rowntree, J. (2017). Supporting Learners’ Agentic Engagement with Feedback: A Systematic Review and a Taxonomy of Recipience Processes. Educational Psychologist, 52(1), 17–37.
Wisniewski, B., Zierer, K., & Hattie, J. (2020). The power of feedback revisited: A meta-analysis of educational feedback research. Frontiers in Psychology, 10, Article 3087.


09. Assessment, Evaluation, Testing and Measurement
Paper

Upper Secondary Teachers’ Experiences with Use of Video Feedback in Student Assessment

Dorthea Sekkingstad, Ann Karin Sandal

Western Norway University of Applied Sci, Norway

Presenting Author: Sekkingstad, Dorthea; Sandal, Ann Karin

Background and theoretical framework

Research in assessment recognizes formative feedback as crucial to enhancing student learning and achievement (Black et al., 2004; Black & Wiliam, 1998; The Assessment Reform Group, 1999). Feedback can be understood as information about the gap between the actual level of performance and the desired outcome of a learning process, given with the aim of reducing that gap (Ramaprasad, 1983; Hattie & Timperley, 2007). Thus, feedback comprises information to the learner about performance as well as information about how to reach the learning aims. This points to an understanding of assessment as both formative and summative and to the well-known concepts of assessment for and of learning (Wiliam, 2011). These perspectives have been developed in a comprehensive body of international assessment research and integrated into classroom practice in numerous countries (Baird et al., 2014; Hattie & Timperley, 2007). Research on formative feedback has shown promising results as regards student learning. However, effective formative feedback must be practiced within a teaching design that promotes students’ use of feedback, develops a common understanding of learning aims, and provides quality feedback, such as timely and specific feedback and detailed information about the next step in the learning process (Hattie & Timperley, 2007).

In the Norwegian context, formative assessment and assessment for learning, including students’ self-assessment, have been part of the curricula and assessment regulations since 2006. Formative assessment as a concept is an essential part of the assessment regulations, which describe when and how to assess formatively. Several national initiatives to support the implementation of formative assessment in schools have been developed since 2007 (Norwegian Directorate for Education and Training, 2018). Despite this effort, there is still a need for further development of formative assessment to support learning, and the annual student survey reveals that formative feedback decreases at higher levels of the education system (Norwegian Directorate for Education and Training, 2023). Rambøll Management Consulting (2020) points to a lack of time to give feedback and shows that grades overshadow students’ awareness of the learning potential in formative feedback.

Practical and pedagogical challenges related to implementing formative feedback in teaching design call for new tools for working with formative assessment as a resource for learning in classrooms (Heitink et al., 2016), and several studies have investigated digital tools as a means of providing effective formative feedback in education (Henderson & Phillips, 2015; Dawson et al., 2019; Mahoney et al., 2019). More specifically, several studies show that using video in formative feedback to students provides new opportunities for quality feedback and yields more timely, detailed and personalized formative feedback, which students use in further learning (Dawson et al., 2019; Kay & Bahula, 2020; Mahoney et al., 2019).

The literature review indicates that previous research on video feedback (VF) has mostly been conducted in higher education, at institutions outside the Nordic countries. Research on VF also focuses mostly on language studies and on studies where students receive feedback on written text (Bakla, 2017; Mahoney et al., 2019; Kay & Bahula, 2020). To our knowledge, there are few studies of the use of VF in formative assessment in upper secondary school and from the teachers’ perspective. This study therefore aims to investigate upper secondary teachers’ experiences with the use of video feedback to enhance students’ learning. The study focuses on whether VF provides new conditions for formative assessment. The research questions are:

a) How do teachers use video in formative assessment?

b) What kind of advantages and challenges can be identified in using video feedback?


Methodology, Methods, Research Instruments or Sources Used
Methods
Data are based on qualitative individual interviews with eight teachers in two upper secondary schools in Norway. The teacher informants have from 10 to 30 years of experience as teachers, and from two to ten years of experience with VF. The initial recruitment of informants was supported by the schools’ headmasters, followed by teachers recruiting colleagues according to the selection criterion. The informants teach a broad range of subjects, both in general study programs and vocational programs in upper secondary school, e.g., languages, social sciences, natural science and mathematics, and economy/business studies.
The semi-structured interview guide comprises five topics: 1) background and motivation for using VF, 2) the use of VF and VF as part of planning of teaching and assessment, 3) advantages and challenges in using VF, 4) experiences with other assessment tools and forms of assessment, and 5) experiences with supportive colleagues and leadership in schools.

To analyse the data, we used thematic content analysis as a flexible framework for identifying and classifying patterns in the data (Krippendorff, 2004). The analysis started with open inductive coding by two researchers, and the coding was compared and negotiated in several cycles. In the next step of the analysis process, we analysed the codes deductively and in relation to the research questions. This analysis laid the ground for establishing topics and themes. The themes are the basis of the categories, presented under Results. Drawing on Geertz (1983), the themes and categories are defined and named close to the informants’ descriptions and stories. In both analysis steps, the researchers interpreted and coded the material individually before negotiating and discussing interpretations and establishing the themes. Analysing the data in several cycles and with more than one researcher may have strengthened the validity and reliability of the study. The common interpretation of data and negotiation of codes and themes also helped to validate different interpretations of the data (Malterud, 2017). The study was conducted according to ethical guidelines in research (NSD-Norwegian centre for research data).

Conclusions, Expected Outcomes or Findings
Results and conclusion
The main findings show a formative use of VF to promote learning in the formative assessment in school subjects. VF is referred to as a new prerequisite for assessment and for increased quality feedback. The teachers report that the students engage in and use the VF during their learning processes, in contrast to written feedback, which is read and used by the students to a lesser extent. The findings also show that VF is a flexible format for giving feedback. VF provides scope and space for detailed, extensive and individual feedback, as well as information about the next step in learning. For example, the function “show and tell” in the computer program is valued by the teachers as an important tool for formative feedback. The use of VF also supports building a relationship between the students and the teacher, although the communication is asynchronous. According to the teachers, the students experience being “seen” and recognized by the teacher through the quality feedback. One of the challenges in the use of VF is related to school leadership and priorities. Important prerequisites for using VF are the funding of adequate software and access to a study office.

Although this study investigates VF with a small sample of informants, we find the results interesting because they point to VF as a tool for quality feedback and are in line with previous research. Our study investigates the use of VF from the teachers’ perspective, and the findings must be interpreted with respect to self-report bias. Further research including students as informants might bring forward important information about VF in formative feedback from a student perspective. Analysis of the videos themselves is also a relevant topic for further research on VF.



References
Baird, J-A., Hopfenbeck, T., Newton, P., Stobart, G., & Steen-Utheim, A. (2014). State of the field review. Assessment and learning. Norwegian Knowledge Centre for Education (case number 13/4697).

Bakla, A. (2017). An Overview of Screencast Feedback in EFL Writing: Fad or the Future? Conference Paper: International Foreign Language Teaching and Teaching Turkish as a Foreign Language (27-28 April 2017), Bursa, Turkey.  

Black, P. & Wiliam, D. (1998). Inside the Black Box: Raising Standards through Classroom Assessment. Phi Delta Kappan (92)1, 81-90.

Black, P. et al. (2004). Working Inside the Black Box: Assessment for Learning in the Classroom. Phi Delta Kappan. September 2004.

Dawson, P., Henderson, M., Mahoney, P., Phillips, M., Ryan, T., Boud, D., & Molloy, E. (2019). What makes for effective feedback: staff and student perspectives. Assessment & Evaluation in Higher Education, 44(1), 25-36. DOI:10.1080/02602938.2018.1467877

Geertz, C. (1983). Local Knowledge. Further Essays in Interpretive Anthropology. Basic Books.

Hattie, J. & Timperley, H. (2007). The power of feedback. Review of Educational Research, 77, 81-112.

Heitink, M.C., van der Kleij, F.M., Veldkamp, B.P., Schildkamp, K., & Kippers, W.B. (2016). A systematic review of prerequisites for implementing assessment for learning in classroom practice. Educational Research Review, 17 (2016) 50-62.

Henderson, M. & Phillips, M. (2015). Video-based feedback on student assessment: scarily personal. Australasian Journal of Educational Technology, 31(1).

Kay, R.H. & Bahula, T. (2020). A Systematic Review of the Literature on Video Feedback Used in Higher Education. Conference: EDULearn 2020 - International Conference on Education and New Learning Technologies. Seville, Spain, July 2020. DOI:10.21125/edulearn.2020.0605

Krippendorff, Klaus (2004). Content analysis: An introduction to its methodology. Sage Publications.

Mahoney, P., Macfarlane, S., & Ajjawi, R. (2019). A qualitative synthesis of video feedback in higher education. Teaching in Higher Education, 24(2), 157-179. DOI:10.1080/13562517.2018.1471457

Malterud, K. (2017). Kvalitative forskningsmetoder for medisin og helsefag (4. utg.). Universitetsforlaget.

Norwegian Directorate for Education and Training (2023). The Student Survey. https://www.udir.no/tall-og-forskning/statistikk/elevundersokelsen/
Norwegian Directorate for Education and Training. (2018). Observations on the National Assessment for Learning Programme (2010–2018). Skills development in networks. Final report 2018.

NSD-Norwegian centre for research data. https://www.nsd.no/en/find-data

Ramaprasad, A. (1983). On the definition of feedback. Behavioral Science, 28, 4–13.

Rambøll Management Consulting (2020). Vurdering i skolen. [Assessment in School]. Report.

The Assessment Reform Group (1999). Assessment for Learning: Beyond the Black Box. University of Cambridge School of Education. https://www.nuffieldfoundation.org/sites/default/files/files/beyond_blackbox.pdf

Wiliam, D. (2011). What is assessment for learning? Studies in educational evaluation, 37, 3-14.
 
5:15pm - 6:45pm 09 SES 08 B: Inclusive Education and Literacy: Perspectives, Interventions, and Assessment
Location: Gilbert Scott, 253 [Floor 2]
Session Chair: Ulrika Wolff
Paper Session
 
09. Assessment, Evaluation, Testing and Measurement
Paper

Literacy Learning for Students with Intellectual Disability Using Phonic-Based and Comprehension-Based Interventions

Lisa Palmqvist1,2, Mikael Heimann2, Jenny Samuelsson3,4,5, Gunilla Thunberg3,4, Monica Reichenberg1, Emil Holmer2

1Department of Education and Special Education, University of Gothenburg, Sweden; 2Department of Behavioural Sciences and Learning, Linköping University, Linköping, Sweden; 3Institute of Neuroscience and Physiology, Speech and Language Pathology Unit, Sahlgrenska Academy, University of Gothenburg, Sweden; 4Dart - Centre for AAC and Assistive Technology, Sahlgrenska University Hospital, Gothenburg, Sweden; 5Region Västra Götaland, Habilitation & Health, Gothenburg, Sweden

Presenting Author: Palmqvist, Lisa

To participate in today’s society, one has to be able to read and write. For a person with intellectual disability (ID), this requirement may be insurmountable, and many never become proficient readers (Cawley & Parmar, 1995; Di Blasi et al., 2019; Lemons et al., 2013; Ratz & Lenhard, 2013; Wei et al., 2011). Thus, we must provide effective reading instruction early. This study aimed to investigate reading development in individuals with ID enrolled in the Swedish compulsory school system for students with ID, using a design that is methodologically rigorous for the field. The study focused on the effects of two reading instruction strategies, phonics-based and comprehension-based, on reading development in beginning readers with ID. The Simple View of Reading (Gough & Tunmer, 1986; Hoover & Gough, 1990; Tunmer & Hoover, 2019) describes reading comprehension as the product of word recognition and language comprehension. The model is supported by meticulous research (e.g., Lervåg et al., 2018; Melby-Lervåg et al., 2012). Word recognition is the process of correctly identifying meaningful units in text and is associated with several different pre-literacy skills, including phonological awareness. Language comprehension refers to the process of binding together multiple lexical units into coherent semantic representations, using contextual, syntactical, and inferred information. Research has found that individuals with ID have poor pre-literacy skills, word recognition, and reading comprehension. The difficulties for a person with ID also include poor executive functions and working memory, and the difficulties often increase with the severity of the ID. There is a large body of research on reading ability in the population without ID, but the research for persons with ID is lagging (Dessemontet et al., 2019).
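The multiplicative form of the Simple View of Reading can be made concrete with a minimal sketch. The function name and the scaling of both components as proportions between 0 and 1 are assumptions made for illustration; they are not taken from the study.

```python
def reading_comprehension(word_recognition, language_comprehension):
    """Simple View of Reading: comprehension as the product of its two components.

    Both inputs are assumed (for this illustration) to be proportions in [0, 1].
    """
    for value in (word_recognition, language_comprehension):
        if not 0.0 <= value <= 1.0:
            raise ValueError("components are assumed to lie between 0 and 1")
    return word_recognition * language_comprehension


if __name__ == "__main__":
    # Strong language comprehension cannot compensate for absent word recognition.
    print(reading_comprehension(0.0, 0.9))
    # Moderate skill in both components yields a lower product than either alone.
    print(reading_comprehension(0.5, 0.5))
```

Because the components multiply rather than add, a deficit in either one limits reading comprehension, which is one way to motivate instruction that targets both.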
Recent studies found that similar variables are associated with word recognition and reading comprehension in individuals with ID as in students with typical development. This suggests that reading manifests similarly in both groups, which in turn would indicate that the methods known to support reading development in typically developing children should also produce positive effects on reading development in children with ID. The literature has found that phonic-based interventions are effective for teaching literacy to students with mild ID and students with severe cognitive disabilities (Ainsworth et al., 2016; Dessemontet et al., 2019). Ainsworth et al. (2016) investigated teaching phonics to students with autism, intellectual disabilities, and complex communication needs and found that children increased their letter-sound knowledge. Dessemontet et al. (2021) conducted an RCT of phonic-based instruction for students with ID and found positive results. Additionally, studies have shown promising results in combining the instruction of phonic-based and comprehension-based strategies for children with ID (e.g., Browder et al., 2012; Gustafson et al., 2007). However, the samples in these studies are often small, and the studies often lack a control group. The current project builds on these studies with a large-scale controlled study, implementing a comprehension-based and a phonic-based approach (and a combination thereof) using digital media. The aim of the current study is to help children with ID in need of AAC to reach as fluent a reading capacity as possible using three different intervention strategies: a phonic-based, a comprehension-based, or a combination of the two. In the present study, we tested three pre-registered hypotheses (Palmqvist, Samuelsson, et al., 2020):

  1. A phonics-based or a comprehension-based reading strategy improves phonological awareness.
  2. A phonics-based or a comprehension-based reading strategy improves reading ability (word recognition and reading comprehension).
  3. The combination of both reading strategies is more effective than either strategy on its own.

The hypotheses will be tested on these outcome variables: phonological awareness (1, 3), word recognition (2, 3), and reading comprehension (2, 3).


Methodology, Methods, Research Instruments or Sources Used
A total of 124 students (54 girls, 70 boys) were included in the study. They had a mean age of 13.7 years (SD = 3.3) and a mean IQ of 48 (SD = 13). Participants had to attend a Swedish special needs school and be beginning readers. In Sweden, students receive an ID diagnosis according to the ICD-11 (WHO, 2019) prior to being enrolled in the special needs curriculum. The teachers were instructed to identify students who could not read more than approximately 20 words, which was the operationalization of being a beginning reader. To make the sample representative of the students in special needs schools, no exclusion criteria were set for additional diagnoses or aetiology of the ID. The caregivers of all participating students signed informed consent prior to the testing. The study has been approved by the Swedish Ethical Review Authority (2020-06215).

The study is a longitudinal between-group study, with four time points (t1-t4; before, during, directly after the intervention, and a follow-up), and four groups: phonics-based reading strategy, comprehension-based reading strategy, both phonics-based and comprehension-based strategies (combination group), and a comparison group who received teaching-as-usual.

Before initiating testing, background information about the participants was collected by interviewing parents (e.g., diagnoses). Testing took place in a silent environment at the participants’ school. All children were assessed on general non-verbal cognitive ability (Raven’s 2; Raven et al., 2018), phonological awareness (MiniDUVAN; Wolff, 2013), word recognition (OS64 & OLAF; Magnusson & Naucler, 2010), communication skills (BAF; Frylmark, 2015), and reading comprehension (DLS Bas; Järpsten, 2004), and were allocated to one of four intervention groups. The intervention strategies were provided in a digital format (i.e., apps) and the students worked at school together with a teacher or assistant. The intervention was conducted over 12 weeks (3x30 minutes per week).

Linear mixed-effects models were used to evaluate the effects of the intervention. The outcome measures (phonological awareness, word recognition, and reading comprehension) were analyzed separately. Days were used as the time variable, starting day 1 at the date of t1. Model fit was assessed using ANOVA. The model with the best fit, indicated by χ², was chosen. Three contrasts were performed: the comparison group versus the phonics-based intervention (Hypothesis 1), the comparison group versus the comprehension-based intervention (Hypothesis 2), and the combination group versus the phonics-based and the comprehension-based intervention (Hypothesis 3).

Conclusions, Expected Outcomes or Findings
No initial differences in reading performance between groups were observed at t1. The results showed that reading improved over time, as indicated by a main effect of time across all three outcome measures, β̂ = 0.09, 95% CI [0.05, 0.13], t(336.07) = 4.81, p < .001. The combined phonics and comprehension strategy had a positive impact on phonological awareness development, β̂ = 0.09, 95% CI [0.03, 0.15], t(333.59) = 2.78, p = .006. However, no significant differences were found in word recognition or reading comprehension based on the reading instruction strategy used.
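For readers less familiar with this reporting format, the sketch below shows how an estimate, its t value and its 95% confidence interval relate to one another. It assumes a normal-approximation critical value of 1.96 and works from the rounded figures reported above; the abstract does not state how the intervals were actually computed, and the function name is hypothetical.

```python
def wald_interval(estimate, t_value, critical=1.96):
    """Approximate confidence interval from an estimate and its t value.

    The standard error is recovered as estimate / t_value; the critical
    value 1.96 is a normal approximation (an assumption of this sketch).
    """
    standard_error = estimate / t_value
    half_width = critical * standard_error
    return estimate - half_width, estimate + half_width


if __name__ == "__main__":
    # Main effect of time: estimate 0.09, t = 4.81
    low, high = wald_interval(0.09, 4.81)
    print(round(low, 2), round(high, 2))

    # Combined strategy on phonological awareness: estimate 0.09, t = 2.78
    low, high = wald_interval(0.09, 2.78)
    print(round(low, 2), round(high, 2))
```

Under these assumptions the recomputed bounds round to the reported intervals, [0.05, 0.13] and [0.03, 0.15]. The smaller t value of the second effect corresponds to a larger standard error and therefore a wider interval around the same point estimate.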

The results support the idea that systematic instruction that includes explicit teaching of phonics and comprehension is better than simpler instructional strategies, such as practicing sight-word reading. Phonics-based instruction improves the sensitivity to the sub-lexical structure of spoken words, while comprehension-based instruction leads to richer and more precise lexico-semantic representations. Combining the two strategies may allow students to apply their improved skills in a richer context, making the combination more effective than either strategy alone.

The results add to the previous literature showing that conventional literacy strategies for persons with typical development are also effective for students with ID when a combined instructional strategy is used. Teachers should prioritize intensive and methodologically rich literacy instruction for their students. Furthermore, we provide evidence that digitally-based interventions for reading are more effective than teaching-as-usual for students with ID. One reason may be that the digital format enables the instruction to be adapted for students with ID in terms of speed or response time. Additionally, the focused intervention itself might have provided more literacy instruction than teaching-as-usual, which in turn resulted in improved reading.

References
Ainsworth et al. (2016). Teaching phonics to groups of middle school students with autism, intellectual disabilities and complex communication needs. Research in Developmental Disabilities, 56, 165-176.

Browder et al. (2012). An evaluation of a multicomponent early literacy program for students with severe developmental disabilities. Remedial and Special Education, 33, 237-246.

Dessemontet et al. (2021). Effects of a phonics-based intervention on the reading skills of students with intellectual disability. Research in Developmental Disabilities, 111, 103883.

Dessemontet et al. (2019). A meta-analysis on the effectiveness of phonics instruction for teaching decoding skills to students with intellectual disability. Educational Research Review, 26, 52-70.

Gustafson et al. (2007). Phonological or orthographic training for children with phonological or orthographic decoding deficits. Dyslexia, 13, 211–229.

Cawley, J. F., & Parmar, R. S. (1995). Comparisons in reading and reading-related tasks among students with average intellectual ability and students with mild mental retardation. Education and Training in Mental Retardation and Developmental Disabilities, 118-129.

Di Blasi, F. D., Buono, S., Cantagallo, C., Di Filippo, G., & Zoccolotti, P. (2019). Reading skills in children with mild to borderline intellectual disability: a cross‐sectional study on second to eighth graders. Journal of Intellectual Disability Research, 63(8), 1023-1040.

Gough, P. B., & Tunmer, W. E. (1986). Decoding, reading, and reading disability. Remedial and special education, 7(1), 6-10.

Hoover, W. A., & Gough, P. B. (1990). The simple view of reading. Reading and writing, 2(2), 127-160.
Lemons, C., Zigmond, N., Kloo, A., Hill, D., Mrachko, A., Paterra, M., Bost, T., & Davis, S. (2013). Performance of students with significant cognitive disabilities on early-grade curriculum-based measures of word and passage reading fluency. Exceptional Children, 79(4), 408–426.

Lervåg, A., Hulme, C., & Melby‐Lervåg, M. (2018). Unpicking the developmental relationship between oral language skills and reading comprehension: It's simple, but complex. Child development, 89(5), 1821-1838.

Melby-Lervåg, M., Lyster, S. A. H., & Hulme, C. (2012). Phonological skills and their role in learning to read: a meta-analytic review. Psychological bulletin, 138(2), 322.

Ratz, C., & Lenhard, W. (2013). Reading skills among students with intellectual disabilities. Research in developmental disabilities, 34(5), 1740-1748.

Tunmer, W. E., & Hoover, W. A. (2019). The cognitive foundations of learning to read: A framework for preventing and remediating reading difficulties. Australian Journal of Learning Difficulties, 24(1), 75-93.

Wei, X., Blackorby, J., & Schiller, E. (2011). Growth in reading achievement of students with disabilities, ages 7 to 17. Exceptional Children, 78(1), 89–106.


09. Assessment, Evaluation, Testing and Measurement
Paper

The Views of Students with Disabilities on Speech and Reading Compared to Corresponding Test Results

Jenny Samuelsson1,2,3, Emil Holmer4, Jakob Åsberg Johnels2, Lisa Palmqvist4,5, Mikael Heimann4, Gunilla Thunberg2,3

1Region Västra Götaland, Habilitation & Health, Gothenburg, Sweden; 2Institute of Neuroscience and Physiology, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden; 3Dart - Centre for AAC and Assistive Technology, Sahlgrenska University Hospital, Gothenburg, Sweden; 4Department of Behavioral Science and Learning, Linköping University, Linköping, Sweden; 5Department of Education and Special Education, University of Gothenburg, Sweden

Presenting Author: Samuelsson, Jenny

The opinions of people with intellectual disabilities (ID) and communication difficulties can be challenging to capture. Pictorial support to enable communication, such as the methodological framework Talking Mats™, has been successful in soliciting the views of both adults and young people with ID (Murphy & Cameron, 2008). We believe that a constructionist way of thinking supports the idea of listening to children and trying to understand their thoughts, likes, fears, hopes, and problems, with the goal of forming a partnership. The process of guided participation, as presented by Rogoff (2003), involves children engaging in communication and acquiring knowledge through close collaboration with their peers and surroundings. This process views children as active and capable agents of change, as described by Rogoff (2003), Tomasello (2013) and Vygotsky et al. (1978).

According to the United Nations Convention on the Rights of the Child (UNCRC) and the Convention on the Rights of Persons with Disabilities (CRPD), all children have the right to express their views, and those views shall be given due weight. Children shall be provided with the opportunity to express their views in accordance with their age and maturity. To enable all children to express their views, including those with ID and communication difficulties, environmental support and adaptations are needed.

It is important to carefully consider the challenges that individuals with ID may face when expressing their views and opinions and to use strategies to support their participation and ensure that their perspectives are included and valued. Many individuals with ID and communication difficulties need Augmentative and Alternative Communication (AAC) to understand and to be understood (Beukelman, 2020). Using interviews and questionnaires developed for the general population is often complicated for individuals with ID and communication difficulties (Santoro et al., 2022). The addition of pictures in conversations, interviews, and questionnaires can compensate for cognitive and communication difficulties and support executive functions and working memory (Boström et al., 2016). Resources for pictorial support facilitate comprehension and support expression, and Talking Mats™ has been widely used together with people affected by communication difficulties for both research and clinical purposes (e.g., Breeze, 2021; Stans et al., 2019).

Research supporting literacy interventions for individuals with different diagnoses and for all ages has shown promising results, indicating that everyone should be given the opportunity to get a literacy education (Yorke et al., 2021). The enjoyment of reading and its positive relationship to reading ability has been well researched among students with typical development (e.g., Rogiers et al., 2020; Smith et al., 2012), but not in children with ID. It cannot be concluded that simply enjoying reading leads to increased literacy skills, either for students with typical development or for students with ID. Research on children’s perspectives has mostly focused on participation and goal setting, and on comparisons with proxy raters (Stans et al., 2009). Research comparing children’s views with corresponding test results is sparse.

In the current study, we interviewed students with ID in Swedish special needs schools. They all attended a reading intervention with digital apps and were interviewed before and after the intervention about their own communication and reading. The overall aim was to determine the relationship between students’ own views and their corresponding formal test results on speech sound production and word reading ability.

The study posed two research questions: (1) What are the students' own views of their speech and reading activities? And (2) Is there a positive correlation between the students' views of their speech and their speech sound production, and between their views on reading and their word reading ability?


Methodology, Methods, Research Instruments or Sources Used
A total of 116 students (65 boys and 51 girls) aged 7 to 21 years took part in this study. All the students met the inclusion criteria, which were: (1) intellectual disability, (2) need for AAC to understand and express themselves, and (3) inability to decode words independently, with identification of at most 20 sight words.
This study used the pictorial framework Talking Mats™ to enable students with ID and communication difficulties to share their views on speech and reading activities. The process includes identifying a topic, discussing related options, and using a visual scale to indicate views or feelings. A practice mat with simple questions about animals was used to validate students' understanding of the method. The students who completed the practice mat were then asked questions about their views on communication and reading activities. The three-point visual scale with pictures of facial expressions representing “like”, “neutral/in between” and “dislike” was transformed into a numbered ordinal scale. In the comparison between students' self-ratings and the test results on speech sound production and word reading, we focused on four questions related to speech (merged into one variable, Speech) and three questions related to increasing difficulty in reading ability (merged into one variable, Reading activities).
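The coding step described above can be sketched as follows. This is only an illustration: the numeric codes (1 = dislike, 2 = neutral, 3 = like), the use of a mean to merge items, and all names in the snippet are assumptions, since the abstract does not specify them.

```python
# Hypothetical coding of the three-point visual scale (assumed values).
SCALE = {"dislike": 1, "neutral": 2, "like": 3}

def merge_items(responses):
    """Average the coded responses for one student, skipping unanswered items (None)."""
    coded = [SCALE[r] for r in responses if r is not None]
    return sum(coded) / len(coded) if coded else None

# Made-up answers to the four speech questions for one student.
speech_answers = ["like", "neutral", "like", "dislike"]
speech_score = merge_items(speech_answers)
print(speech_score)
```

Averaging keeps the merged variable on the same 1 to 3 range as the single items; a sum score would be an equally plausible choice.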
For assessing speech ability, a phonological test (Assessment of phonology, Frylmark) was used. Speech was transcribed and scored as the percentage of phonemes correct out of 138 phonemes. Reliability was excellent, with an intraclass correlation coefficient of .997, and agreement was substantial, as indicated by a Cohen's kappa of .78. Two reading tests, OS 64 and OLAF, were used to assess reading ability. Both tests were shortened for the present study, with OS 64 reduced to 15 items and OLAF reduced to 13 items. In OS 64, participants were shown written words and asked to match them to pictures, while in OLAF, they were shown pictures and asked to match them to written words. The test procedures for OS 64 were adapted by using enlarged symbols for visibility. The dependent variable was the number of correct answers on both tests, with a good test-retest reliability of .780.
We used descriptive statistics to analyse the interview responses. Additionally, Pearson's correlations were used to examine the relationship between students' perceptions of their speech and reading ability and their actual test results in speech sound production and reading ability.

Conclusions, Expected Outcomes or Findings
The use of pictorial support was beneficial in enabling students with ID and communication difficulties to express their views and participate in the study. The students had positive views of their speech, with 64% having positive views towards talking to one person at a time, and 46% liking talking in groups. Many students (64%) answered positively about talking on the phone. However, less than half (43%) had positive views towards their speech being intelligible for others. The students’ views on reading activities were more varied, with 47% having positive views toward reading letters, 38% toward reading words, and 26% toward reading sentences. The results indicate that as the degree of difficulty in the activity increased, the students' ratings became less positive.
The study found a positive association between the students' views on their speech and their actual speech sound production, as well as between their views on reading activities and their tested word reading ability. The correlation coefficients were calculated using Pearson's method and were statistically significant, with small positive values of r(84) = .24, p < .05 for speech and r(104) = .21, p < .05 for reading. This indicates that the students’ views on their speech and reading abilities were in line with their actual abilities as measured by formal tests.
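As a rough plausibility check of the reported correlations, the sketch below converts each r and its degrees of freedom to a t statistic and a two-tailed p value. The helper names are illustrative, and the p value is obtained by simple numerical integration of the t density rather than from a statistics library, so it should be read as an approximation.

```python
import math

def t_from_r(r, df):
    """t statistic for a Pearson correlation with df = n - 2."""
    return r * math.sqrt(df / (1 - r * r))

def t_pdf(x, df):
    """Density of Student's t distribution."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_c) / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """Approximate two-tailed p value via Simpson's rule on [0, |t|] (steps must be even)."""
    t = abs(t)
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1 - 2 * (total * h / 3)

# Values as reported in the abstract: r(84) = .24 and r(104) = .21.
for label, r, df in [("speech", 0.24, 84), ("reading", 0.21, 104)]:
    t = t_from_r(r, df)
    print(label, round(t, 2), two_tailed_p(t, df) < 0.05)
```

Under these assumptions, both coefficients fall below the .05 threshold, which is consistent with the significance levels stated above.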
The students had a basic understanding of their own speech sound production and word reading abilities, as reflected in their views on these activities. The study highlights the challenges of including students with intellectual disabilities in research but emphasizes the importance of following the United Nations Convention on the Rights of the Child (UNCRC) and including these students in research. To overcome these challenges, the study suggests using pictorial support such as Talking Mats to facilitate communication and better understand the views of these students.

References
Beukelman, D. R. (2020). Augmentative & alternative communication: Supporting children and adults with complex communication needs (5th ed.). Baltimore, Maryland: Paul H. Brookes Publishing Co.

Boström, P., Johnels, J. Å., Thorson, M., & Broberg, M. (2016). Subjective Mental Health, Peer Relations, Family, and School Environment in Adolescents with Intellectual Developmental Disorder: A First Report of a New Questionnaire Administered on Tablet PCs. Journal of mental health research in intellectual disabilities, 9(4), 207-231. https://doi.org/10.1080/19315864.2016.1186254

Breeze, J. (2021). Including people with intellectual disabilities in the development of their own positive behaviour support plans. Tizard Learning Disability Review.

Murphy, J., & Cameron, L. (2008). The effectiveness of Talking Mats with people with intellectual disability. British journal of learning disabilities, 36(4), 232-241. https://doi.org/10.1111/j.1468-3156.2008.00490.x

Rogiers, A., Van Keer, H., & Merchie, E. (2020). The profile of the skilled reader: An investigation into the role of reading enjoyment and student characteristics. International Journal of Educational Research, 99, 101512. https://doi.org/10.1016/j.ijer.2019.101512

Rogoff, B. (2003). The cultural nature of human development. Oxford University Press.

Santoro, S. L., Donelan, K., & Constantine, M. L. (2022). Proxy-report in individuals with intellectual disability: A scoping review. Journal of applied research in intellectual disabilities : JARID.

Smith, J. K., Smith, L. F., Gilmore, A., & Jameson, M. (2012). Students' self-perception of reading ability, enjoyment of reading and reading achievement. Learning and Individual Differences, 22(2), 202-206. https://doi.org/10.1016/j.lindif.2011.04.010

Stans, S. E. A., Dalemans, R. J. P., de Witte, L. P., & Beurskens, A. J. H. M. (2019). Using Talking Mats to support conversations with communication vulnerable people: A scoping review. Technology and disability, 30(4), 153-176. https://doi.org/10.3233/TAD-180219

Tomasello, M. (2013). Origins of human communication. MIT Press.

UN Convention on the Rights of the Child (1989). https://www.ohchr.org/en/instruments-mechanisms/instruments/convention-rights-child

UN Convention on the Rights of Persons with Disabilities (2006). https://www.un.org/development/desa/disabilities/convention-on-the-rights-of-persons-with-disabilities/article-7-children-with-disabilities.html

Vygotsky, L. S., Cole, M., John-Steiner, V., Scribner, S., & Souberman, E. (1978). Mind in society: The development of higher psychological processes. Harvard University Press.


09. Assessment, Evaluation, Testing and Measurement
Paper

The Impact of Multilingualism on Children’s Reading/Writing Skills and Scholastic Performance

Linda Romanovska, Ineke Pit-Ten Cate, Sonja Ugen

University of Luxembourg, Luxembourg

Presenting Author: Romanovska, Linda

While research on multilingualism has shown both positive (e.g. inhibition; Coderre et al., 2013) and negative (e.g. vocabulary; Bialystok et al., 2008) effects on cognition and language proficiency, its influence on scholastic achievement appears to be largely negative (Hoffmann et al., 2018; Martini et al., 2021). Children in Luxembourg are educated in a multilingual educational system. In Kindergarten, the main teaching language is Luxembourgish. This switches to German for literacy acquisition in elementary school, with French taught as a second language. Despite its small size, Luxembourg is also highly multi-cultural, boasting 170 nationalities (The Government of the Grand Duchy of Luxembourg, 2023). Thus, many of the children in the school system do not speak the language(s) of instruction at home. Data from the Luxembourgish national school monitoring program reveal significant differences in German reading comprehension in grade 3 depending on the language spoken at home. Because Luxembourgish is linguistically close to German, Luxembourgish-speaking children generally perform better than children who do not speak Luxembourgish at home (Hoffmann et al., 2018; Martini et al., 2021).

Furthermore, the language-based differences in children’s scholastic performance complicate the diagnostic process for children with potential learning disorders, such as dyslexia and/or dyscalculia. In Luxembourg, the language in which children are screened and diagnosed for potential learning disorders is usually identical to the main language of instruction at school, which at the time of diagnosis (typically grade 3) is German. It is therefore difficult to distinguish poor performance based on potential difficulties with reading/writing or mathematics from poor performance based on low language proficiency in the test language. Moreover, the diagnostic tools currently employed in Luxembourg were developed in countries with primarily one language of instruction, which challenges the validity of the diagnostic process in a multilingual population (Ugen et al., 2021).

We have thus developed a comprehensive reading/writing test battery adapted to the Luxembourgish educational curriculum and multilingual environment. Children’s potential language proficiency differences in the test language (German) are taken into account using simplified instructions with reduced language load, multiple examples, varying degrees of difficulty of the test materials, as well as the construction of distinct language-group norms, depending on the language(s) spoken at home. This helps avoid over-diagnosis of reading and writing disorders in children who do not speak the language(s) of instruction at home and under-diagnosis in children who do. The developed test battery assesses children’s performance in key domains relevant for reading and writing, comprising phonological skills, (non)word and text reading (fluency and accuracy), reading comprehension, writing, and vocabulary. Furthermore, we link children’s performance in the newly developed test battery to their performance in the Luxembourgish national school monitoring program.


Methodology, Methods, Research Instruments or Sources Used
We have tested 214 children during the pre-test phase of the project (February – June 2022; age 8 – 12; M = 9.59; SD = 0.68; 95 girls) and will test approximately 735 children during the validation and norming phase (February – June 2023). All children attend grade 3 in public primary schools in Luxembourg. The distribution of classes participating in the project covers all 15 regions of the country, resulting in a representative sample of the Luxembourgish school population.

Children complete the 9 sub-tests of the novel reading/writing test battery, which includes precursor skills: Rapid Automatized Naming (RAN), non-word phoneme segmentation, non-word phoneme deletion; reading skills: word and non-word reading, text reading and comprehension; writing skills: gap dictation and text dictation; as well as a receptive vocabulary task. The vocabulary and writing skills are assessed in a group setting (all children complete the tasks together in the classroom), the precursor and reading skills are assessed individually in a quiet room in the school. The total testing time (group test + individual tests) does not exceed 90 minutes per child. All tests are conducted by trained test administrators following a standardised procedure.

The pre-test data were analysed per sub-test using Repeated Measures Analysis of Variance with language group as the between-subject factor and results of the sub-test (per category where applicable) as the within-subject factor. Significant main effects of language group were explored using post-hoc pairwise comparisons (Bonferroni-corrected t-tests). Four language groups were created based on the frequencies of the reported language(s) spoken at home: Luxembourgish/German monolingual, Luxembourgish/German bilingual, Romance language (e.g., French, Portuguese, Spanish) mono- and bilingual, Other language (e.g., English, Slavic) mono- and bilingual.
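To illustrate the post-hoc step, the sketch below lists the pairwise comparisons among the four language groups and the Bonferroni-adjusted threshold. It assumes a family-wise alpha of .05 and that all six pairs are compared, neither of which is stated in the abstract; the short group labels are shorthand for this example only.

```python
from itertools import combinations

# Shorthand labels for the four language groups described above.
groups = ["Lux/German monolingual", "Lux/German bilingual", "Romance", "Other"]

pairs = list(combinations(groups, 2))
alpha = 0.05  # assumed family-wise error rate
adjusted_alpha = alpha / len(pairs)

def bonferroni_significant(p_value):
    """True if an uncorrected pairwise p value survives the Bonferroni threshold."""
    return p_value < adjusted_alpha

print(len(pairs), round(adjusted_alpha, 4))
```

Dividing alpha by the number of comparisons is equivalent to multiplying each p value by that number and comparing it with the original alpha.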

The results of each sub-test of the novel reading/writing test battery were also correlated with children’s performance on German listening and reading comprehension in the Luxembourgish national school monitoring programme (Bonferroni corrected Pearson correlations).

Conclusions, Expected Outcomes or Findings
The results of the pre-test phase show that children who speak Luxembourgish or German at home outperform children who speak a Romance or Other language at home. In particular, significant differences between language groups were observed for: word reading accuracy (F(3,190) = 4.94, p = .003); word reading fluency (F(3,190) = 4.59, p = .004); text reading accuracy (F(3,190) = 8.73, p < .001); text reading fluency (F(3,190) = 11.50, p < .001); text comprehension (F(3,190) = 12.45, p < .001); gap dictation (F(3,180) = 10.52, p < .001); text dictation (F(3,180) = 18.22, p < .001). The significant main effects of language highlight the need for separate language group norms for screening and diagnostic purposes. The lack of main effects of language for non-word phoneme deletion, non-word phoneme segmentation, and non-word reading indicates that the sub-tests using non-words were successfully constructed to account for language proficiency effects.

Significant Pearson correlations between the school monitoring results of German listening (.28 < |𝜌| < .59) and German reading comprehension (.24 < |𝜌| < .65) and the majority of the newly developed sub-tests of the reading/writing test battery were also observed. These correlations provide a measure of construct validity, illustrating the significant link between children’s scholastic performance and performance in the novel reading/writing test battery.

We expect to replicate these initial findings with a larger sample of children during the validation and norming phase of the project and supplement our data analyses with more detailed results highlighting the distribution of scores per sub-test based on language spoken at home and its effect on scholastic performance as assessed by the Luxembourgish national school monitoring program.

References
Bialystok, E., Craik, F., & Luk, G. (2008). Cognitive Control and Lexical Access in Younger and Older Bilinguals. Journal of Experimental Psychology: Learning Memory and Cognition, 34(4), 859–873. https://doi.org/10.1037/0278-7393.34.4.859
Coderre, E. L., van Heuven, W. J. B., & Conklin, K. (2013). The timing and magnitude of Stroop interference and facilitation in monolinguals and bilinguals. Bilingualism, 16(2), 420–441. https://doi.org/10.1017/S1366728912000405
Hoffmann, D., Hornung, C., Gamo, S., Esch, P., Keller, U., & Fischbach, A. (2018). Schulische Kompetenzen von Erstklässlern und ihre Entwicklung nach zwei Jahren. In T. Lentz, I. Baumann, & A. Küpper (Eds.), Nationaler Bildungsbericht (pp. 84–96). University of Luxembourg & SCRIPT.
Martini, S., Schiltz, C., Fischbach, A., & Ugen, S. (2021). Identifying Math and Reading Difficulties of multilingual children: Effects of different cut-offs and reference group. In M. Herzog, A. Fritz-Stratmann, & E. Gürsoy (Eds.), Diversity Dimensions in Mathematics and Language Learning (pp. 200–228). De Gruyter Mouton.
The Government of the Grand Duchy of Luxembourg. (2023, January). Society and culture – Population Demographics. https://luxembourg.public.lu/en/society-and-culture/population/demographics.html
Ugen, S., Schiltz, C., Fischbach, A., & Pit-ten Cate, I. M. (2021). Lernstörungen im multilingualen Kontext. Diagnose und Hilfestellungen. Melusina Press. https://doi.org/10.26298/bg5s-ng46


09. Assessment, Evaluation, Testing and Measurement
Paper

Is Paper-based Reading Achievement a Better Predictor of Later Reading Achievement Than is Digital Reading Achievement?

Elpis Grammatikopoulou, Stefan Johansson, Monica Rosén

Göteborgs Universitet, Sweden

Presenting Author: Grammatikopoulou, Elpis

Both IEA and the Swedish National Agency for Education (Skolverket) stress the importance of reading literacy, recognizing it as an individual right and a prerequisite for participating in society, gaining knowledge of school subjects, and being able to argue and take part in decision-making situations (Skolverket, 2016; Mullis, Martin & Sainsbury, 2015). The definition of reading literacy has evolved over time from the simple view of reading as decoding and word recognition to involve further skills such as comprehension and meta-comprehension (Gough & Tunmer, 1986; Chall, 1989; Sefton-Green, Marsh, Erstad & Flewitt, 2016) and, with the entrance of digital means into everyday and school life, has grown further to involve digital and social skills as well (Coiro et al., 2008; Leu et al., 2009). The differences between paper-based and digital reading have been under close scrutiny in the recent past, indicating that the two modalities differ in several aspects besides the obvious format, such as the skills required, the processes needed and the consequences for comprehension and concentration (Delgado et al., 2018; Baron, 2017). Previous research has explored and confirmed paper-based reading ability as a predictor of later achievement (Butler et al., 1985; Sparks et al., 2013). Less is known, though, about the role of digital reading ability as a predictor of later achievement. The present study will shed light on the possible role of the difference between digital and paper-based reading assessment as a predictor of later reading achievement and general achievement. So far, most studies have focused on the individual factors that may influence achievement in reading. The present study examines the two modalities in comparison along with individual factors, such as SES and immigrant background.

IEA’s PIRLS and ePIRLS measure reading literacy in paper-based and digital form, respectively. In 2016, Swedish 10-year-olds completed both the paper-based and digital version of the reading assessment PIRLS. The correlation between students’ test scores amounted to 0.79, thus a high, yet not perfect, relationship. The correlation revealed that there are likely differences between the two formats and that some students perform better in paper-based reading whereas others perform better on the digital test. IEA analyzed the differences that emerged between PIRLS and ePIRLS results for all countries. Swedish 10-year-olds generally performed better on the digital test, but the difference was not statistically significant.

The main purpose of the current study is to investigate the predictive validity of the paper-based and digital reading scores in PIRLS by comparing these scores with the core subject marks and the overall marks in grade 6 for the students that participated in PIRLS 2016. A hypothesis is that the paper-based scores will be more closely related to the subject grades and overall grades, partly because paper-based reading was more prominent in schools at this point in time. Drawing on Clinton’s meta-analysis (2019), which suggests that paper-based reading is associated with deeper comprehension and with metacognitive processes (accuracy in prediction of achievement), we hypothesize that students who are relatively stronger in the paper-based reading test tend to have higher marks in school year 6 than students who scored better in digital reading. Against this background, two specific research questions are posed:

  1. What is the relationship between paper-based and digital reading achievement in grade 4 and core subject marks and GPA in grade 6?
  2. Is there a stronger relationship between grades in Grade 6 and the reading results in PIRLS for students who are relatively stronger in paper-based reading?

Methodology, Methods, Research Instruments or Sources Used
The present study uses Swedish register data together with PIRLS and ePIRLS scores. Sweden participated in both PIRLS and ePIRLS in 2016 with approximately 4000 students. Through the unique Swedish social security number (personnummer) that was collected from PIRLS students, a number of register variables can be related to their results in PIRLS and ePIRLS. The first subject marks are collected in grade 6, thus for PIRLS students in 2018. Compulsory school in Sweden lasts 9 years and is divided into three periods: grades 1-3, 4-6 and 7-9. The evaluation system up to grade 6 is mostly descriptive. By the end of the fall semester in grade 6, students are assigned grades according to their performance in different courses. There are 5 passing grades (A-E), while F indicates that the student has not reached the lowest benchmark for a passing grade. Students attend 16 courses (Swedish, mathematics, English, biology, physics, chemistry, technology, geography, history, religion, social studies, visual arts, music, handicrafts, physical education and home economics). Of these, Swedish, mathematics and English are the core subjects. Parental education is used as an indicator of SES and language use at home is used as an indicator of immigrant background.
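For readers unfamiliar with the mark scales used in the findings, the sketch below shows how letter grades could map onto a 0-20 point scale and a 0-320 total across 16 subjects. The point values (A = 20, B = 17.5, C = 15, D = 12.5, E = 10, F = 0) follow the standard Swedish merit-point convention and are an assumption here, since the abstract does not spell them out; the example grades are made up.

```python
# Assumed point values per letter grade (standard Swedish merit points).
GRADE_POINTS = {"A": 20.0, "B": 17.5, "C": 15.0, "D": 12.5, "E": 10.0, "F": 0.0}

def merit_value(grades):
    """Sum grade points over all subjects a student was graded in."""
    return sum(GRADE_POINTS[g] for g in grades)

# Upper bound with 16 subjects, and one made-up student profile.
max_total = merit_value(["A"] * 16)
example = merit_value(["C"] * 10 + ["B"] * 3 + ["E"] * 3)
print(max_total, example)
```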
We will conduct hierarchical regression analysis to investigate whether, and to what extent, the difference between PIRLS and ePIRLS scores can predict the overall marks and the core subject marks.
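A hierarchical regression of this kind can be sketched as two nested least-squares models, comparing R² before and after the score difference is added. Everything in the snippet is illustrative: the data are made up, the variable names are hypothetical, and the choice of parental education as the first-step predictor is an assumption rather than the authors' specification.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def r_squared(predictors, y):
    """Fit ordinary least squares with an intercept and return R^2."""
    rows = [[1.0] + list(p) for p in zip(*predictors)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

# Made-up data for eight students (not from the study).
parental_education = [1, 2, 2, 3, 3, 4, 4, 5]
score_difference = [-12, 5, -3, 8, -6, 10, 2, 15]  # ePIRLS minus PIRLS
english_mark = [10, 12.5, 10, 15, 12.5, 17.5, 15, 20]

# Step 1: background only. Step 2: background plus the score difference.
r2_step1 = r_squared([parental_education], english_mark)
r2_step2 = r_squared([parental_education, score_difference], english_mark)
print(round(r2_step1, 3), round(r2_step2, 3), round(r2_step2 - r2_step1, 3))
```

The increase in R² from step 1 to step 2 is the share of variance in the mark that the score difference explains beyond the background variable.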

Conclusions, Expected Outcomes or Findings
Swedish students had an average of 559 points on ePIRLS and 555 points on PIRLS. We calculated correlations, which indicated a strong, yet not perfect, relationship between paper-based reading and digital reading, about .80. This suggests that most of those who perform well in one format also perform well in the other. The average score was 14 (0-20) in Swedish language, 13.4 (0-20) in maths, 15 (0-20) in English and 205 (0-320) in GPA. In the first step, we calculated the difference between the two test scores (PVepirls – PVpirls). The correlation between the difference of the two tests and the Swedish language grade was not significant, while the correlations between the difference and the math and English language grades were moderate (.52 and .68 respectively) and the correlation with GPA was weak (.35).
Preliminary results showed that there is a relationship between grades in mathematics and English language and the difference between PIRLS and ePIRLS scores, indicating that students who scored better in the digital assessment achieved higher marks in mathematics and English language. However, the effect on the Swedish or the overall marks was not statistically significant. Bearing in mind the considerable predominance of the English language online (Meurant, 2009), the present results agree with previous research indicating a relationship between digital literacy and English language (Alakrash et al., 2021). The present study also confirms previous research that found a pattern of digital literacy being related to achievement in mathematics (Hu et al., 2018; Skryabin et al., 2015). Nevertheless, the non-significant relationship between digital literacy and Swedish language needs further exploration. A possible explanation is that Swedish language teachers do not focus enough on digital literacy, together with the fact that the most common language online is by far English.

References
Alakrash, H., Razak, N. A., & Krish, P. (2021). Social network sites in learning English; an investigation on attitudes, digital literacy and usage. Linguistica Antverpiensia, 2021(1), 26–43.
Baron, N. S. (2017). Reading in a digital age. Phi Delta Kappan, 99(2), 15–20.
Butler, S. R., Marsh, H. W., Sheppard, M. J., & Sheppard, J. L. (n.d.). Seven-Year Longitudinal Study of the Early Prediction of Reading Achievement.
Chall, J. S. (1989). “Learning to Read: The Great Debate” 20 Years Later: A Response to ‘Debunking the Great Phonics Myth.’ The Phi Delta Kappan, 70(7), 521–538. http://www.jstor.org/stable/20403953
Clinton, V. (2019). Reading from paper compared to screens: A systematic review and meta‐analysis. Journal of Research in Reading, 42(2), 288–325. https://doi.org/10.1111/1467-9817.12269
Coiro, J. (Ed.). (2008). Handbook of research on new literacies. Lawrence Erlbaum Associates/Taylor & Francis Group.
Delgado, P., Vargas, C., Ackerman, R., & Salmerón, L. (2018). Don’t throw away your printed books: A meta-analysis on the effects of reading media on reading comprehension. Educational Research Review, 25, 23–38. https://doi.org/10.1016/j.edurev.2018.09.003
Gough, P. B., & Tunmer, W. E. (1986). Decoding, Reading, and Reading Disability. Remedial and Special Education, 7(1), 6–10. https://doi.org/10.1177/074193258600700104
Hu, X., Gong, Y., Lai, C., & Leung, F. K. S. (2018). The relationship between ICT and student literacy in mathematics, reading, and science across 44 countries: A multilevel analysis. Computers and Education, 125, 1–13.
Leu, D. J., O’Byrne, W. I., Zawilinski, L., McVerry, J. G., & Everett-Cacopardo, H. (2009). Comments on Greenhow, Robelia, and Hughes: Expanding the New Literacies Conversation. Educational Researcher, 38(4), 264–269. https://doi.org/10.3102/0013189X09336676
Meurant, R. C. (2009). The Significance of Second Language Digital Literacy Why English-Language Digital Literacy Skills Should be Fostered in Korea. 2009 Fourth International Conference on Computer Sciences and Convergence Information Technology, 369–374. https://doi.org/10.1109/ICCIT.2009.192
Mullis, I. V. S., Martin, M. O., & Sainsbury, M. (2016). PIRLS 2016 Reading Framework.
Sefton-Green, J., Marsh, J., Erstad, O., & Flewitt, R. (n.d.). Establishing a Research Agenda for the Digital Literacy Practices of Young Children. 37.
Skolverket (2016). Att läsa och förstå. Stockholm: Skolverket.
Skryabin, M., Zhang, J., Liu, L., & Zhang, D. (2015). How the ICT development level and usage influence student achievement in reading, mathematics, and science. Computers and Education, 85, 49–58.
Sparks, R. L., Patton, J., & Murdoch, A. (2014). Early reading success and its relationship to reading achievement and reading volume: Replication of ‘10 years later’. Reading and Writing, 27(1), 189–211. https://doi.org/10.1007/s11145-013-9439-2
 
Date: Thursday, 24/Aug/2023
9:00am - 10:30am 09 SES 09 B: Advancing Assessment Methods and Insights for Education Systems
Location: Gilbert Scott, 253 [Floor 2]
Session Chair: Stefan Johansson
Paper Session
 
09. Assessment, Evaluation, Testing and Measurement
Paper

Measuring and Misrepresenting the Missing Millions: the OECD’s Assessment of out-of-School Youth in PISA for Development

Xiaomin Li

Beijing Normal University, China, People's Republic of

Presenting Author: Li, Xiaomin

As the education agenda of global agencies changed after 2015 to emphasise minimum standards of quality for all countries to be delivered by 2030, the OECD has sought to expand its most successful comparative instrument, the Programme for International Student Assessment (PISA), to include low- and middle-income countries (LMICs). In 2014, it introduced PISA for Development (PISA-D) as the means to establish PISA as a universal measure of learning, and in 2020, it declared PISA-D a success. The most innovative feature of PISA-D was that the assessment included out-of-school youth (OOSY); that task was sub-contracted to Educational Testing Service (ETS). The OOSY population is a geographically dispersed group which presents considerable challenges to any researchers seeking to access it. Given this, we ask: who did the OECD assess? More specifically, how did the OECD define the target population of the PISA-D out-of-school assessment, what was the sampling frame, and was that population accurately represented in the PISA-D OOSY sample?

Much of the existing literature on the OECD influence is based on different theoretical positions. These differences in perspective have real consequences, however, often determining which legitimation dynamics researchers see and which they overlook. In this paper, we seek to adopt a holistic approach by drawing on Suchman’s (1995) framework for analysing the multiple sources of organisational legitimacy and the means by which it is promoted and repaired.

In applying Suchman’s framework, we argue that PISA-D was a macro level exercise designed to legitimate the OECD’s extension of PISA into LMICs and to establish its role in a new arena. The incorporation of OOSY in the assessment was a key micro level endeavour which would allow the OECD to achieve that end and, if not done properly, would challenge aspects of its legitimacy. For example, not assessing sufficient OOSY would debase the quality of the OECD’s products and services; this would also damage the OECD’s moral claims with regard to monitoring the attainment of the SDGs and promoting an inclusive approach. In parallel, at the cognitive level, this would challenge the whole logic of the novelty and value of PISA-D. Overall, the successful identification and assessment of OOSY was vital to ensuring its legitimacy. This would require the OECD to either address the considerable difficulties of accessing OOSY or find a tactical solution which obscured the many challenges to its legitimacy.

Suchman (1995) also analyses how organisations respond to challenges to their legitimacy and identifies three broad approaches: (a) offer normalising accounts; (b) restructure; and (c) don’t panic. He suggested that although legitimacy crises may coalesce around performance issues, most challenges ultimately rest on failures of meaning, where ‘audiences begin to suspect that putatively desirable outputs are hazards, that putatively efficacious procedures are tricks, or that putatively genuine structures are facades’ (1995, 597). Consequently, the initial task in mending a breach of legitimacy usually will be ‘to formulate a normalising account’ that separates the threatening revelation from larger assessments of the organisation as a whole. He identified ‘justifications’ and ‘explanations’ as the two main types of normalising accounts. Suchman also noted that organisations may re-establish legitimacy through micro-level strategic restructuring, in the sense that ‘narrowly tailored changes that mesh with equally focused normalising accounts can serve as effective damage-containment techniques’ (ibid., 598).


Methodology, Methods, Research Instruments or Sources Used
To understand what challenges the OECD encountered and how it addressed them, we first draw on two categories of documents. The first comprises the UNICEF and UNESCO Institute for Statistics (UIS) publications on the Out-of-school Children Initiative (OOSCI) and Lewin’s (2011) work as part of the Consortium for Research on Educational Access, Transitions and Equity (CREATE) initiative, which provide the standard approaches to identifying OOSY and describing their characteristics, and which PISA-D draws upon. The second comprises the OECD publications which explain the PISA-D out-of-school sample design and selection plans, and which present the final results. We also draw on three interviews: with a key member of the PISA-D team at the OECD, a technical expert who had undertaken OOSY surveys, and a lead analyst from one of the piloting nations.
Conclusions, Expected Outcomes or Findings
We argue that, as an organisation with no experience with assessing OOSY and working in poorer nations, the OECD was faced by a ‘disruptive event’ (Suchman 1995) as it was unable to effectively sample youth based on their initial definition. This event, if not addressed immediately, would interrupt its ongoing PISA-D legitimation activities and may severely deplete its long-term legitimacy. In line with Suchman’s (1995) analysis of how organisations respond to challenges to their legitimacy, we demonstrate that the OECD pursued a normalising strategy by tailoring and justifying how OOSY were defined and by minimising coverage of its tactical changes. Consequently, it avoided addressing the many problems which face researchers on OOSY by quietly imposing a sampling frame which matched its available sources of data and established methodologies.
The analysis builds on our earlier work which identified the broader strategies that the OECD employed to create the legitimacy to monitor SDG 4 (Li and Morris 2022) and extends it by focusing on legitimacy maintenance and repair work. It also contributes to the important work of others who have critiqued the validity and impact of various assessments undertaken by global agencies.

References
Addey, Camilla. 2017. ‘Golden Relics & Historical Standards: How the OECD Is Expanding Global Education Governance through PISA for Development’. Critical Studies in Education 0 (0): 1–15. https://doi.org/10.1080/17508487.2017.1352006.
Auld, Euan, and Paul Morris. 2021. ‘A NeverEnding Story: Tracing the OECD’s Evolving Narratives within a Global Development Complex’. Globalisation, Societies and Education 19 (2): 183–97. https://doi.org/10.1080/14767724.2021.1882959.
Berten, John, and Matthias Kranke. 2019. ‘Studying Anticipatory Practices of International Organisations: A Framework for Analysis’. Framing Paper for Workshop on Anticipatory Governance at 6th European Workshops in International Studies. Kraków.
Carr-Hill, Roy. 2013. ‘Missing Millions and Measuring Development Progress’. World Development 46 (June): 30–44. https://doi.org/10.1016/j.worlddev.2012.12.017.
Grek, Sotiria. 2009. ‘Governing by Numbers: The PISA “Effect” in Europe’. Journal of Education Policy 24 (1): 23–37. https://doi.org/10.1080/02680930802412669.
Grey, Sue, and Paul Morris. 2018. ‘PISA: Multiple “Truths” and Mediatised Global Governance’. Comparative Education 54 (2): 109–31. https://doi.org/10.1080/03050068.2018.1425243.
Lewin, Keith. 2011. ‘Making Rights Realities: Researching Educational Access, Transitions and Equity’. Project Report. Brighton: University of Sussex. http://www.create-rpc.org/pdf_documents/Making-Rights-Realities-Keith-Lewin-September-2011.pdf.
Li, Xiaomin, and Euan Auld. 2020. ‘A Historical Perspective on the OECD’s “Humanitarian Turn”: PISA for Development and the Learning Framework 2030’. Comparative Education 56 (4): 503–21. https://doi.org/10.1080/03050068.2020.1781397.
Hamilton, Mary. 2017. ‘How International Large-Scale Skills Assessments Engage with National Actors: Mobilising Networks through Policy, Media and Public Knowledge’. Critical Studies in Education 58 (3): 280–94. https://doi.org/10.1080/17508487.2017.1330761.
Martens, Kerstin. 2007. ‘How to Become an Influential Actor - The “comparative Turn” in OECD Education Policy’. In New Arenas of Education Governance: The Impact of International Organisations and Markets on Educational Policy Making, edited by Kerstin Martens, Alessandra Rusconi, and Kathrin Leuze. Basingstoke: Macmillan.
Li, Xiaomin, and Paul Morris. 2022. ‘Generating and Managing Legitimacy: How the OECD Established Its Role in Monitoring Sustainable Development Goal 4’. Compare: A Journal of Comparative and International Education 0 (0): 1–18. https://doi.org/10.1080/03057925.2022.2142038.
Robertson, Susan L. 2020. ‘Guardians of the Future: International Organisations, Anticipatory Governance and Education’. Presented at the International Webinar on UNESCO’s and OECD’s Ambition to Govern the Future of Education, Copenhagen, April 23.
Suchman, Mark C. 1995. ‘Managing Legitimacy: Strategic and Institutional Approaches’. The Academy of Management Review 20 (3): 571–610. https://doi.org/10.2307/258788.
Zapp, Mike. 2020. ‘The Authority of Science and the Legitimacy of International Organisations: OECD, UNESCO and World Bank in Global Education Governance’. Compare: A Journal of Comparative and International Education, 1–20. https://doi.org/10.1080/03057925.2019.1702503.


09. Assessment, Evaluation, Testing and Measurement
Paper

Tinkering towards an assessment of Global Competence

Harsha Chandir, Radhika Gorur, Jill Blackmore

Deakin University, Australia

Presenting Author: Blackmore, Jill

PISA has become the “world’s premier yardstick” against which the “quality, equity and efficiency” of national education systems are evaluated (Gurría in OECD, 2018a, p. 2). PISA claims to be able to compare, on a single scale, the performance of education systems around the globe. These comparative measures have “contributed to the constituting of a global commensurate space of educational performance” (Rizvi & Lingard, 2010, p. 135), regardless of the varying political, economic, social, and cultural contexts of participating nations. Data from PISA are being used by countries to identify “gaps” in their education systems and to develop policies to “move up” on the league tables (Meyer & Benavot, 2013; Wiseman, 2013). A recent development in this space has been the OECD’s development of an internationally comparable measure of Global Competence.

The use of PISA measures as benchmarks for shaping global policies and governing education makes it important to examine the process of how “PISA knowledge” is arrived at. Developing a set of measures requires normative decisions about what the concept encompasses. The assessment of global competence provides a useful example of examining the development of this particular form of global knowledge. Given the multifaceted definitions and understandings of this term, this paper empirically examines the challenges such efforts at stabilising the definition faced, and the ways in which these were negotiated. Locating our study in the interdisciplinary field of Science and Technology Studies (STS), and deploying the concept of tinkering (Knorr Cetina, 1981), we attend to the practices that stabilised the assessment of Global Competence in PISA 2018.

To make globally acceptable knowledge, various epistemic, cultural and political perspectives are brought together in relations of mutual learning and construction, and through iterative processes of expert consultation, country feedback, committee endorsement, etc. These encounters, where diverse perspectives are brought together, have the potential to be hijacked by more outspoken or forceful participants. Moreover, these processes typically take several months, during which a range of unexpected events may occur or challenges may be posed to the successful completion of the endeavour. Tinkering is the way in which actors and events are managed – through cajoling, placating, compromising, modifying, etc. – to ensure that the project does not collapse.

Drawing on empirical data relating to the development of the assessment of global competence, we provide examples of tinkering in the development of PISA’s tests of global competence. We highlight three key tinkering moves by the OECD during the process of developing the assessment. In the first move, the OECD replaced the initial Global Competence Expert Group with another group of experts to placate the PISA Governing Board, which objected to the heavy economic slant of the first expert presentation. In the second tinkering move, the OECD retrospectively aligned the PISA assessment of global competence, and the global competence framework, with the UN SDGs. This enabled the OECD to gather more (of the right) allies to support its efforts, and provided “a moral legitimacy the OECD has not enjoyed with the traditional PISA initiative and its narrow economic focus” (Auld & Morris, 2019b, p. 11). A third tinkering move was the push by the OECD to administer the assessment even when only a minority of the countries decided to participate, arguing that more nations might join subsequent rounds.


Methodology, Methods, Research Instruments or Sources Used
This paper offers data from publicly available documents as well as semi-structured interviews with key OECD officials and members of PISA 2018’s Global Competence Expert Group, to highlight three tinkering moves. By tracing the practices of the assessment development, this study aims to understand how a particular ontology of global competence was stabilised in PISA 2018.
Conclusions, Expected Outcomes or Findings
By enrolling experts and considering feedback from countries, PISA can be said to be a collaborative and democratic global process. However, a closer examination reveals that in spaces where there is uncertainty, and when stalemates develop between different groups, decision making lies with the OECD, which tinkers to steer actors in ways that primarily benefit the organisation’s pre-determined agenda.
Tracing this process of making global knowledge of global competence allows for an exploration of “which kind of society and which idea of humanity is pursued and enacted” (d'Agnese, 2018, p. 16) in the OECD’s assessments of global competence – and more generally the PISA project. As the OECD attempts to develop other “global” measures of literacies (OECD, 2018b), it is important to open up the politics of their production. By putting centre-stage the controversies and negotiations, the processes that stabilise these assessments can be opened up to critical scrutiny.

References
Auld, E., & Morris, P. (2019a). Science by streetlight and the OECD’s measure of global competence: A new yardstick for internationalisation? Policy Futures in Education, 17(6), 677-698. https://doi.org/10.1177/1478210318819246
Auld, E., & Morris, P. (2019b). The OECD’s Assessment of Global Competence: Measuring and making global elites. In L. C. Engel, C. Maxwell, & M. Yemini (Eds.), The Machinery of School Internationalisation in Action (pp. 17-35). Routledge.
d'Agnese, V. (2018). Reclaiming education in the age of PISA: Challenging OECD’s educational order. Routledge.
Knorr Cetina, K. (1981). The manufacture of knowledge: An essay on the constructivist and contextual nature of science. Pergamon.
Meyer, H. D., & Benavot, A. (Eds.). (2013). PISA, power, and policy: The emergence of global educational governance. Symposium books.
Organisation for Economic Co-operation and Development. (2018a). PISA 2015 results in focus. https://www.oecd.org/pisa/pisa-2015-results-in-focus.pdf
Organisation for Economic Co-operation and Development. (2018b). The future of education and skills 2030: The future we want. https://www.oecd.org/education/2030/E2030%20Position%20Paper%20(05.04.2018).pdf
Rizvi, F., & Lingard, B. (2010). Globalizing education policy. Routledge.
Wiseman, A. W. (2013). Policy responses to PISA in comparative perspective. In H. D. Meyer & A. Benavot (Eds.), PISA, power, and policy: The emergence of global educational governance (pp. 303-322). Symposium books.


09. Assessment, Evaluation, Testing and Measurement
Paper

Are Students Underachieving in PISA? The Issue of Test Motivation in Low-Stakes and High-Stakes Tests

Linda Borger1, Stefan Johansson1, Rolf Strietholt2

1University of Gothenburg, Sweden; 2Technische Universität Dortmund, Germany

Presenting Author: Borger, Linda

International large-scale assessments (ILSAs) are playing an increasingly important role in decision-making and reforms, both nationally and internationally (e.g. Grek, 2009; Lindblad et al., 2018). One of the most influential ILSAs is the Programme for International Student Assessment (PISA). Given the impact PISA has on educational debate and policy, it is crucial that results are trustworthy. Yet, parallel to an increase in the number of ILSAs, there have been growing validity concerns regarding, for example, the content being tested, the influence on national educational systems, and potential bias due to lack of sample representativeness (Grek, 2009; Jerrim, 2021; Meyer & Benavot, 2013). Relatively few studies, however, have focused on whether students are motivated to do their best in ILSAs as compared with high-stakes tests.

A motive for our research is international evidence suggesting that tests of low stakes impact student motivation and effort (Finn, 2015; Wise & DeMars, 2005). Whereas the relation between high- and low-stakes testing has been studied previously, findings are inconsistent, and we know little about this relationship in Sweden. Therefore, the following study examines whether there is evidence for the hypothesis that the lack of personal consequences may bias PISA test scores downwards. Indeed, self-reports from Swedish students indicate that they do not do their best in PISA (Eklöf & Hopfenbeck, 2019). However, PISA test scores have not yet been compared to external criteria such as national test scores. The theoretical framework used to interpret the results of the present study is the expectancy-value theory (Eccles & Wigfield, 2002; Wigfield & Eccles, 2000), postulating that test motivation depends on the student's expectations of succeeding at a particular task, the value the student places on the task, and the interaction between the two (Eccles & Wigfield, 2002). The expectancy-value theory has successfully been used in previous studies to explain the test-taking motivation construct (e.g., Eklöf & Knekta, 2017).

Previous research on the relationship between national tests and PISA/TIMSS revealed moderate to high but imperfect correlations (Skolverket, 2022; Wiberg, 2019; Wiberg & Rolfsman, 2019). One possible explanation is that ILSAs have low stakes for students while national tests have high stakes. In order to test the assumption that motivation influences student achievement, we will examine whether test motivation moderates the relationship between PISA scores and the national test scores. Skolverket (2022) found a correlation of .61 between the two measures but our hypothesis is that the relationship is different for different levels of motivation to take the PISA test. With reference to the expectancy-value theory we assume that the average level of test motivation is higher for national tests since this is a test with higher stakes. For students that were particularly unmotivated to do the PISA test, the correlation with their national test scores could therefore be lower. Consequently, the study examines the following research questions: (1) What is the correlation between test motivation in PISA and PISA achievement? and (2) Is the relationship between low-stakes PISA test scores and high-stakes national test scores moderated by students’ test motivation in PISA?


Methodology, Methods, Research Instruments or Sources Used
In the 2018 Swedish PISA test, students’ personal identification numbers were collected, making it possible to link PISA test scores with register data on students’ national test grades and student background characteristics, collected from Statistics Sweden (SCB). The analyses are based on this combined dataset, including a sample of 5,504 students. The main method used was latent moderated structural equations modelling. The outcome variable is students’ PISA achievement, measured through the ten plausible values, which were analysed using the type = imputation option in Mplus, the software used. Since reading was the major domain in PISA 2018, the analyses focus on reading. However, robustness checks were conducted using PISA achievement in mathematics and science.
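
Analysing plausible values as multiply imputed datasets generally means estimating the model once per plausible value and then pooling the results. The following is a minimal Python sketch of that pooling step using Rubin's combining rules, which is the standard approach for imputation-style analyses; it is not the authors' Mplus setup. The function name and the input numbers are hypothetical and are not results from the study.

```python
from math import sqrt
from statistics import mean, variance


def pool_plausible_values(estimates, standard_errors):
    """Pool one parameter that was estimated separately on each plausible value."""
    m = len(estimates)
    if m < 2 or m != len(standard_errors):
        raise ValueError("need two or more estimates with matching standard errors")
    pooled = mean(estimates)                            # average estimate across PVs
    within = mean([se ** 2 for se in standard_errors])  # average sampling variance
    between = variance(estimates)                       # variance across PVs (n - 1)
    total = within + (1 + 1 / m) * between              # Rubin's total variance
    return pooled, sqrt(total)


# Hypothetical slope estimates from ten plausible values (illustration only).
estimates = [0.50, 0.52, 0.48, 0.51, 0.49, 0.50, 0.53, 0.47, 0.50, 0.50]
standard_errors = [0.02] * 10

pooled, pooled_se = pool_plausible_values(estimates, standard_errors)
print(round(pooled, 3), round(pooled_se, 4))
```

The pooled standard error is larger than any single-dataset standard error because it also carries the between-plausible-value variance, that is, the measurement uncertainty in the achievement scores.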

The predictors used are students’ motivation to take the PISA test, formulated as a latent variable and used as moderator in the interaction analysis, and students’ national test grade. The latent variable PISA_Motivation is measured by six statements about students’ motivation in PISA, answered on a four-point Likert scale ranging from “strongly agree” to “strongly disagree” (reverse-coded in the analyses). The scale is provided as a national option in the PISA student questionnaire and contains items intended to measure effort, e.g., “I felt motivated to do my best in the PISA test” and importance, e.g., “Doing well in the PISA test was important to me”. Cronbach's alpha for the PISA motivation scale was .90 for the six items, indicating a high internal consistency. As an indicator of a high-stakes assessment, the students’ national test grade in reading, ranging from A–F and coded numerically, was used as an observed independent variable. Student background characteristics will be used as control variables in further analyses.
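
For readers unfamiliar with the internal-consistency statistic reported above, the sketch below computes Cronbach's alpha from raw item responses with the usual formula, alpha = k/(k-1) × (1 − sum of item variances / variance of the sum score). The response matrix is invented for illustration (five hypothetical respondents, six items on a 1 to 4 scale), and the helper names are assumptions, not part of the study.

```python
from statistics import variance


def reverse_code(score, low=1, high=4):
    """Flip a Likert score so that higher values mean stronger agreement."""
    return low + high - score


def cronbach_alpha(responses):
    """responses: one list of k item scores per respondent."""
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical raw answers (1 = strongly agree ... 4 = strongly disagree).
raw = [
    [1, 1, 2, 1, 1, 2],
    [2, 2, 2, 3, 2, 2],
    [4, 4, 3, 4, 4, 4],
    [3, 3, 3, 2, 3, 3],
    [1, 2, 1, 1, 2, 1],
]
coded = [[reverse_code(score) for score in row] for row in raw]

alpha = cronbach_alpha(coded)
print(round(alpha, 2))
```

Reverse-coding every item changes the direction of the scale but not alpha itself, since item variances and the variance of the sum score are unaffected.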

In a first step, a measurement model of PISA_Motivation was estimated using confirmatory factor analysis (CFA), and model fit was ensured. Subsequently, structural models were estimated in consecutive steps (Muthén, 2012), starting with models without latent interaction, and then including both main effects and the latent interaction in the final model. The independent observed variable (national test grade) was centered prior to analysis. Model fit was evaluated using commonly used fit indices for structural equation modelling (Marsh et al., 2005). Models were estimated using MLR, and the complex option in Mplus was employed to account for the nested data structure. Analyses were weighted using the final student weight. Missing data were handled using the default method in Mplus (full information maximum likelihood).
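
The structural part of the final model can be written compactly. The notation is an assumption introduced here for illustration, since the abstract gives no symbols: y_i is student i's PISA reading achievement, g_i the national test grade, \bar{g} its mean (so g_i - \bar{g} is the centered grade), and \eta_i the latent PISA_Motivation factor.

```latex
y_i = \beta_0 + \beta_1 (g_i - \bar{g}) + \beta_2 \eta_i
      + \beta_3 (g_i - \bar{g})\,\eta_i + \varepsilon_i
```

Under this reading, the models without latent interaction correspond to fixing \beta_3 = 0, and a non-zero \beta_3 indicates that the relationship between the national test grade and PISA achievement varies with test motivation.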

Conclusions, Expected Outcomes or Findings
Results revealed a significant positive correlation between PISA_Motivation and PISA achievement (r = .15), indicating that test motivation predicts achievement. In line with Skolverket (2022), the correlation between PISA achievement in reading and the national test grade in reading was found to be around .6. When controlling for students’ reading ability, in the form of the grade on the high-stakes national test, PISA_Motivation still significantly and positively influenced PISA achievement. In the final model, a significant positive interaction was shown between PISA_Motivation and the national test grade (β = .05, p < .001), indicating that students’ motivation in PISA affects the strength of the relationship between the high-stakes national test grade and the low-stakes PISA achievement.

Graphical analyses of the interaction effects for students with different motivational levels showed that the simple slope differed particularly for students who indicated a low level of motivation in PISA and who received high grades on the high-stakes national test. The students with low motivation in PISA thus had a lower correlation between their PISA test score and their national test grade than the students who reported high motivation. This could be explained, in accordance with the expectancy-value theory, by the fact that these students put in less effort in PISA than on the national test because they do not see PISA as important to them personally. In sum, the study provides some evidence that the low-stakes nature of PISA may bias test scores for certain groups of students, in particular high achievers on the national test with low reported motivation in PISA. In the discussion, other reasons for the discrepancy between PISA test scores and national test grades will be addressed, such as differences in content, format and aims. Additionally, problematic aspects of measuring test effort with self-reported measures are considered.
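
The simple-slope logic can be sketched numerically. In an interaction model, the slope of PISA achievement on the national test grade at a given motivation level is the main-effect coefficient plus the interaction coefficient times motivation. In the Python sketch below, the interaction value of .05 is the coefficient reported in the findings; the main-effect value and the assumption that motivation is standardised (mean 0, SD 1) are hypothetical placeholders, not figures from the study.

```python
def conditional_slope(b_grade, b_interaction, motivation):
    """Slope of achievement on the national test grade at a given motivation level."""
    return b_grade + b_interaction * motivation


B_INTERACTION = 0.05  # interaction coefficient reported in the findings
B_GRADE = 0.60        # hypothetical main effect, for illustration only

# Assumed standardised motivation levels: one SD below the mean, the mean, one SD above.
slopes = {
    label: conditional_slope(B_GRADE, B_INTERACTION, z)
    for label, z in [("low", -1.0), ("mean", 0.0), ("high", 1.0)]
}
for label, slope in slopes.items():
    print(label, round(slope, 2))
```

With a positive interaction, the slope is flatter for students who report low motivation, which is the pattern the graphical analyses describe.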

References
Baumert, J., & Demmrich, A. (2001). Test motivation in the assessment of student skills: The effects of incentives on motivation and performance. European Journal of Psychology of Education, 16(3), 441–62. https://doi.org/10.1007/BF03173192

Eccles, J. S., & Wigfield, A. (2002). Motivational beliefs, values, and goals. Annual Review of Psychology, 53(1), 109-132, https://doi.org/10.1146/annurev.psych.53.100901.135153

Eklöf, H. & Knekta, E. (2017). Using large-scale educational data to test motivation theories: A synthesis of findings from Swedish studies on test-taking motivation. International Journal of Quantitative Research in Education, 4(5), 52-71.

Eklöf, H. & Hopfenbeck, T. (2019). Self-reported effort and motivation in the PISA test. In B. Maddox (Ed.), International large-scale assessments in education: insider research perspectives (pp. 121–136). Bloomsbury Academic.

Finn, B. (2015). Measuring motivation in low-stakes assessments (Research Report No. RR-15-19). Princeton, NJ: Educational Testing Service. doi:10.1002/ets2.12067

Grek, S. (2009). Governing by numbers: the PISA ‘effect’ in Europe. Journal of Education Policy, 24(1), 23-37. https://doi.org/10.1080/02680930802412669

Jerrim, J. (2021). PISA 2018 in England, Northern Ireland, Scotland and Wales: Is the data really representative of all four corners of the UK?. Review of Education, 9(3). https://doi.org/10.1002/rev3.3270

Lindblad, S., Pettersson, D., & Popkewitz, T. S. (2018). Numbers, Education and the Making of Society: International Assessments and Its Expertise. Routledge.

Marsh, H. W., Hau, K., & Grayson, D. (2005). Goodness of fit evaluation in structural equation modeling. In A. Maydeu-Olivares and J. McArdle (Eds.), Contemporary Psychometrics (pp. 275–340). Erlbaum.

Meyer, H. D., & Benavot, A. O. (Eds.). (2013). PISA, power, policy. The emergence of global educational governance. Oxford Studies in Comparative Education.

Muthén, B. (2012). Latent variable interactions. http://www.statmodel.com/download/LV%20Interaction.pdf

Skolverket. (2022). PISA 2018 och betygen. Analys av sambanden mellan svenska betyg och resultat i PISA 2018 [PISA 2018 and school grades. Analyses of the relationship between Swedish school grades and results in PISA 2018]. Skolverket.

Wiberg, M. (2019). The relationship between TIMSS mathematics achievements, grades, and national test scores. Education Inquiry, 10(4), 328-343. https://doi.org/10.1080/20004508.2019.1579626

Wiberg, M., & Rolfsman, E. (2019). The association between science achievement measures in schools and TIMSS science achievements in Sweden. International Journal of Science Education, 41(16), 2218-2232. doi:10.1080/09500693.2019.1666217

Wigfield, A., & Eccles, J. S. (2000). Expectancy-value theory of achievement motivation. Contemporary Educational Psychology, 25, 68–81. https://doi.org/10.1006/ceps.1999.1015

Wise, S. L., & DeMars, C. E. (2005). Low examinee effort in low-stakes assessment: Problems and potential solutions. Educational Assessment, 10(1), 1–17. https://doi.org/10.1207/s15326977ea1001_1


09. Assessment, Evaluation, Testing and Measurement
Paper

A Framework to Estimate and Enhance Effectiveness of Large-scale Assessments in Next Generation Learning Systems

Priyanka Sharma, Amit Kaushik

Australian Council for Educational Research, India

Presenting Author: Sharma, Priyanka

The effectiveness of initiatives in an educational context is often interpreted in terms of their impact on learning outcomes for every unit of investment. Governments invest significant amounts of money and effort in large-scale assessments (LSAs) with the intent to provide data-driven evidence to policymakers and researchers. Such evidence indicates the quality parameters of the education system in terms of learning level, equity, sustainability, and other predefined dimensions. The validity of such information is paramount due to its crucial role in decision-making for inputs, functional strategies, and goal setting for intended outputs, which, if implemented as intended, are most likely to lead to improvement in learning outcomes. Therefore, it is not an exaggeration to say that the effectiveness of LSAs can also be interpreted in terms of gains in learning outcomes or other dimensions, like any other measure. However, measuring and demonstrating the effectiveness of LSAs remains a challenge for multiple reasons beyond the complexities of efficacy and effectiveness research, such as the notion that assessment data themselves offer solutions. The authors argue that data, including assessment data, do not provide solutions; rather, they assist the policy and research community by providing valid evidence and insights that enable informed policy formulation and implementation decisions. Rich data and information provided by large-scale assessments can further be used to analyze the impact of those policy decisions.

The paper consists of three major parts. The first part reviews existing initiatives and proposes a logic model based on reasoning to estimate the effectiveness through evidence and/or counterevidence. A logic model depicts how an initiative is expected to make a difference, using explicit statements of the activities that are likely to bring about the intermediate changes and the impact the initiative intends to make. The proposed model postulates that evidence generated by LSAs at time T1, if utilized to make appropriate modifications in policy and interventions regarding inputs, processes, organizational functioning, governance, monitoring mechanisms, and outputs, can lead to learning gains per unit of investment at time T2.
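
One simple way to express "learning gains per unit of investment" is as a ratio. The symbols below are assumptions introduced for illustration and do not come from the paper: L_{T_1} and L_{T_2} are the learning levels measured by the LSA at the two time points, and I is the investment made in the LSA-informed modifications between them.

```latex
E = \frac{L_{T_2} - L_{T_1}}{I}
```

Under this reading, evidence for the model would be a positive E where LSA findings were acted upon, and counterevidence a zero or negative E under comparable investment.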

The second part builds on policy research and secondary analyses of large-scale assessments conducted in India to generate insights into the policy and practice that emerged from large-scale assessments. While the study primarily uses the assessment data and information from the National Achievement Survey (NAS) and the Annual Status of Education Report (ASER), it makes an effort to corroborate the findings with international and national LSAs in a similar context.

The third part recommends a policy implementation framework consisting of a series of steps to design system-specific strategies and monitor efforts. These steps are organized into two main phases: i) a ‘diagnostic’ phase to identify priority areas or enabling outcomes; ii) an ‘action’ phase to devise, implement and evaluate concrete policy interventions. The diagnostic phase mainly consists of cost-effective action-oriented surveys with a tiered approach, while the action phase consists of evidence-driven developmental goals and action plans for various levels of the system, alignment between all actors involved, and customized interventions at school level with continuous monitoring in the cycle of 'assess-act-assess'.

Education systems around the world have emphasized the need to transform assessments to improve learning. The proposed framework and model may be vital in designing learning systems to improve learning outcomes through effective systemwide assessments. However, there cannot be a one-size-fits-all policy mix. Feasible policy choices depend on contexts, social preferences, and political constraints. A robust and independent institutional framework, stakeholder engagement, and credible communication strategies are vital to enhancing the effectiveness of LSAs and eliminating learning poverty to achieve sustainable development goals.


Methodology, Methods, Research Instruments or Sources Used
The aim of the study was to develop a framework to assess the effectiveness of large-scale assessments, gather evidence of effectiveness, and then recommend an implementation framework to enhance the effectiveness. Accordingly, a mixed research approach was adopted. The methodology of the study has three main components:
1. A review of relevant literature on the effectiveness of LSAs, policy initiatives as a result of LSAs, and implementation research in the context of system-level assessments
2. Secondary analysis of ASER and NAS data for the pre-COVID period
3. Drafting a logic model, followed by an implementation framework to utilize the meaningful findings of LSAs to improve quality dimensions of education, based on main findings of the review  
The scope of the literature review was not limited to large-scale assessments in India; it also covered the role of international LSAs such as PISA, TIMSS, PIRLS, SACMEQ and PASEC, and national LSAs such as NAPLAN and NAEP, in educational policies and their impact. Investigators conducted the review along four key components:
• Model of intent and model of change behind LSAs  
• Use of findings of LSAs in the formulation of policy measures
• Framework of planning, implementation and monitoring of the policy initiatives that emerged from LSAs
• Effectiveness studies in LSAs or use of LSAs as a metric in education effectiveness studies
Investigators also undertook secondary analyses of ASER data since 2005 to analyze the cohort relationships associated with learning achievement in basic literacy and numeracy among the learners in the age group corresponding to grades three to eight. ASER is an annual survey report published by the education non-profit Pratham and aims to provide reliable estimates of enrolment and basic learning levels. Basic reading and basic arithmetic abilities are assessed for learners in the age group of 5-16 years. Secondary analyses of NAS data for grades 3, 5 and 8, and of learning data for a few other countries from the UNESCO Institute for Statistics (UIS), were also undertaken. A triangulation technique was then adopted to consolidate the findings.

Conclusions, Expected Outcomes or Findings
The role of LSAs as a tool to improve the quality of education was recognized in 2000 with the launch of the Programme for International Student Assessment (PISA) by the OECD. This triggered the use of LSAs as policy research in many parts of the world. However, in the past 20 years the learning levels of students in many countries have either declined or plateaued. Despite spending several years in school, millions of children are unable to achieve basic literacy and numeracy skills (ASER, 2018). More than 50% of primary school children in South Asian nations were in learning poverty even before the COVID-19 pandemic, and this number is projected to be around 80% due to COVID-19-related school closures (World Bank et al., 2022). The report of NAS 2021 has indicated a similar trend (NCERT, 2022).
The review showed that the majority of systems lack a concrete model of how LSAs are expected to affect actions and learning outcomes. Measuring learning achievement with no follow-up plan of action results in low efficacy of LSA initiatives. Experts have raised an alarm around the deepening learning crisis and recommended three complementary strategies: assess learning in order to measure and track learning better; act on the results or evidence to guide innovation and practice; and align actors to remove barriers and make the whole system work for learning (World Bank, 2018). These complementary strategies may be utilized to derive a logic model as a common wireframe for planning, implementation, and monitoring of outcomes.
The proposed tiered approach to assessments to identify priority areas followed by concrete evidence-driven policy interventions and monitoring mechanisms may enable LSAs-driven improvement in learning. The model can assist policymakers and researchers to estimate the impact of stage-specific decisions on outcomes, and disaggregate the impact of individual intermediary enablers on intended outcomes.

References
ASER Centre. (2018). Annual Status of Education Report (Rural) 2018. http://img.asercentre.org/docs/ASER%202018/Release%20Material/aserreport2018.pdf
NCERT (2019). National Achievement Survey 2017. National report to Inform Policy, Practices and Teaching Learning. National Council of Educational Research and Training. Ministry of Education. Government of India. https://nas.gov.in/report-card/2017
NCERT (2022). National Achievement Survey. National Report 2021. National Council of Educational Research and Training. Ministry of Education. Government of India. https://nas.gov.in/report-card/2021
World Bank. (2018). World Development Report 2018: Learning to Realize Education’s Promise. Washington, DC: World Bank. doi:10.1596/978-1-4648-1096-1.
World Bank, UNESCO, UNICEF, USAID, FCDO, Bill & Melinda Gates Foundation. (2022). The State of Global Learning Poverty: 2022 Update. https://www.unicef.org/reports/state-global-learning-poverty-2022.
MHRD. (2020). National Education Policy 2020. Ministry of Education (erstwhile Ministry of Human Resource Development). Government of India. https://www.education.gov.in/sites/upload_files/mhrd/files/NEP_Final_English_0.pdf
 
1:30pm - 3:00pm 28 SES 11 C: Educational inequalities and post-pandemic education
Location: Gilbert Scott, 253 [Floor 2]
Session Chair: Ofir Sheffer
Paper Session
 
28. Sociologies of Education
Paper

Where are the High Schoolers? Dwindling Participation in after-school programs

Ofir Sheffer

Kaye College, Israel

Presenting Author: Sheffer, Ofir

High schoolers are less inclined to attend and persist as members of non-formal education (NFE) organizations, despite the increasing options at their disposal (Afterschool Alliance, 2020). This development stands in contradistinction to the growing public and academic interest in such frameworks throughout the world, owing to their potential contribution to the lives of young people. More specifically, youth organizations, grassroots associations, and youth councils, inter alia, provide opportunities for teens to connect to positive role models, form social ties, and broaden their repertoire of personal skills (Polson et al., 2013). Voluntary attendance in NFE programs indeed has a positive bearing on teenagers in an array of fields, especially when they take part on a regular and ongoing basis (Fulton, 2019).

However, despite the well-documented benefits of NFE participation for older youth, their participation wanes with age. A drop in attendance raises questions as to the relevance of NFE for older youth and its compatibility with adolescents' developmental needs (Deutsch & Jones, 2013). Correspondingly, data from Israel indicate that there are disparities in the regularity and persistence of participation between adolescent boys and girls. Surveys conducted by the Israel Ministry of Education (National Authority for Measurement and Evaluation, 2015) show that boys tend to drop out of long-term civic-community programs at earlier ages than girls.

Given the declining number of participating youth, particularly boys, the study's broader objective is to examine how widespread the phenomenon is among NFE frameworks, and how different organizations identify and respond to this demographic change. As such, the study's objectives are: (a) to collect up-to-date data on older youth (16-18) participation and persistence in the past 5 years; (b) to inquire into management-level awareness of the demographic changes; (c) to locate possible explanations for the phenomenon; (d) to acquaint ourselves with the institutional attitude towards working with older youth, from a gender-oriented perspective.

Out of all the various NFE frameworks and attendant goals, the present study concentrates on organizations that center around a vision of fostering leadership and active citizenship. These outlets provide youth an opportunity to experiment with decision-making processes, formulate policy, and embrace communal values within a democratic environment (Akiva et al., 2014; Checkoway, 2011). In light of the above, this project falls under the field of civic participation and social activism. Like other scholars, I am interested in the burning social questions that pertain to the downward trend of late-teens’ participation in civic enterprise. My point of departure is that NFE frameworks that allow for experimentation with decision-making, problem-solving, and policy-making on the communal level are bound to ratchet up participation in society down the road.

The current proposal for presentation is written in response to objective (a), collecting up-to-date data on older youth (16-18) participation and persistence in the past 5 years, and objective (b), inquiring into management-level awareness of the demographic changes. On finishing the interviews, I realized that almost all the interviewees were unaware of the decrease in the participation of boys and that the number of girls exceeds that of boys by an appreciable margin. I therefore chose to present here preliminary findings revolving around organizational blindness.


Methodology, Methods, Research Instruments or Sources Used
Data were collected from seven civic-communal organizations in Israel that operate nationwide and reflect the current diversity of the educational field in terms of the gender, ethnicity, and socio-economic background of their high school participants. All organizations gave me access to current data on registration, participation, and graduation of older youth (15-18) in the past five years. It should be borne in mind that during COVID-19 most of the organizations struggled to maintain regular and consistent activity. Since data on enrollment only reflects a basic level of commitment and does not illuminate engagement (Akiva & Horner, 2016), I also collected data on participation in special activities like leadership courses and summer camps, looking to see not only who took part but also who invested themselves in the organization, took on more responsibility and formal leadership roles, and even joined the alumni organization. Segmentation of the data shows that boys make up 41% of the 16-year-olds and 36% of the 17-18-year-olds, and are more likely to drop out of the frameworks before the end of the year. They are also present in smaller numbers in training courses for leadership positions (42%), and even fewer take part in an extra service-volunteer year before the army (31%).
Additionally, I interviewed one or two representatives from every organization, each holding a senior management position. I chose to focus on the management level because of their knowledge of strategic planning and organizational challenges and their comprehensive picture of national differences from region to region. As is common in Israel, many of the interviewees were graduates of the organizations themselves, having risen within the organization from junior positions to management. This adds the value of a long-term perspective and in-depth knowledge of organizational culture, changes in goals and culture, and an insider's understanding of young people's views and patterns of participation. A total of 12 interviews have been conducted so far.
My interview guide encompassed a set of questions concerning trends of change in youth participation; strategically targeted audiences; challenges of attracting and retaining older youth; and gender differences in commitment, persistence, and motivation. By means of an inductive analysis of the data, codes from the interviews were formulated with the ATLAS.ti program. Quite early I realized that almost all the interviewees were unaware of the decrease in the participation of boys. Thereafter, the central issue of organizational blindness was axially coded. Preliminary findings are presented in this document.

Conclusions, Expected Outcomes or Findings
One explanation for organizational blindness is that many organizations in the educational field are experiencing an increase in the number of participants at younger ages. This increase is due to strategic planning aimed at new target audiences who have not yet taken part in the activities. This creates a false impression of growth in organizational membership because, among the strong and traditional populations, the number of registered members has fallen, reflecting a decrease in the commitment of some target audiences. Many of the interviewees were aware of the change in the target audience and even testified that it was a strategic decision of the organization.
The existing literature defines organizational inertia as the inability to enact internal change in the face of significant external change (Gilbert, 2005). My preliminary analysis shows that dropout is seen as an inherent feature of voluntary education systems. However, these educational organizations hold a very strong narrative as social change agents. Their funding also involves working in the socio-economic periphery, which reinforces the organizational narrative of the social change mission. This ideology creates an idealization of the mission to expand, which may be necessary for maintaining a sense of organizational identity. It also sets goals, such as expanding the target audience, because expansion means influence. This, however, is how a blind spot is created towards the weakening of the existing membership and its needs.
Social processes in which men leave frameworks and women come to predominate eventually lead to a decrease in the value and status of the framework. The question arises: what process of devaluation are non-formal education frameworks experiencing that leads boys to choose not to take part in them in late adolescence?

References
Afterschool Alliance (2020). America after 3pm: Afterschool programs in demand, policy report.
Akiva, T., Cortina, K. S., & Smith, C. (2014). Involving youth in program decision-making: How common and what might it do for youth? Journal of Youth and Adolescence, 43, 1844–1860. https://doi.org/10.1007/s10964-014-0183-y
Akiva, T. & Horner, C. G. (2016). Adolescent motivation to attend youth programs: A mixed-methods investigation. Applied Developmental Science, 20(4), 278–293. https://doi.org/10.1080/10888691.2015.1127162
Gilbert, C. G. (2005). Unbundling the structure of inertia: Resource versus routine rigidity. Academy of Management Journal, 48(5), 741–763. https://doi.org/10.5465/amj.2005.18803920
Checkoway, B. (2011). What is youth participation? Children and Youth Services Review, 33(2), 340–345. https://doi.org/10.1016/j.childyouth.2010.09.017
Deutsch, N. L. & Jones, J. N. (2008). “Show me an ounce of respect”: Respect and authority in adult-youth relationships in after-school programs. Journal of Adolescent Research, 23(6), 667–688. https://psycnet.apa.org/doi/10.1177/0743558408322250
Fulton, C. (2019). Exploring the Roles of Youth in Community Programming and Their Connections to Positive Youth Development and Involvement in Community. PhD diss., Columbus: Ohio State University.
Israel Ministry of Education (2015). Youth movements in Israel: An assessment of the relative size, policy paper [Hebrew].
Polson, E. C., Kim, Y. I., Jang, S. J., Johnson, B. R., & Smith, B. (2013). Being prepared and staying connected: Scouting’s influence on social capital and community involvement. Social Science Quarterly, 94(3), 758–776. https://doi- /10.1177/0044118X06295051


28. Sociologies of Education
Paper

Post-pandemic Continuities and Changes in Basic Education (ISCED 1) in Portugal

Teresa Teixeira Lopo1, Inês Vieira1, Paulo Sargento2, Ana António3, José Viegas Brás3, Maria Neves Gonçalves4

1Lusófona University, CeiED-OP.Edu: Observatory for Education and Training Policies, Portugal; 2ERISA-IPLUSO, Lusófona University, CEAD Francisco Suárez, Portugal; 3ESEL-IPLUSO, Lusófona University, CeiED - Interdisciplinary Research Centre for Education and Development; 4ESEL-IPLUSO, CeiED - Interdisciplinary Research Centre for Education and Development

Presenting Author: Lopo, Teresa Teixeira

In this paper, we propose to discuss the first results of an ongoing research project focused on the analysis of the post-pandemic changes introduced in the Portuguese schools of first cycle (the first four years of schooling – grades one to four) and the second cycle (the next two years – grades five and six) of basic education (ISCED 1).

At the national level, the several research works conducted on the shutdown of schools, with the imposition of confinement and of emergency remote teaching (e.g., Alves & Cabral, 2020; Benavente et al., 2020; CNE, 2021; Fernandes et al., 2021; IAVE, 2021; Martins, 2020), highlighted as main effects on the education system: 1) the worsening of school inequalities, reflected in particular in differentiated access to working conditions in the family space, to technological equipment, to knowledge and to digital literacy; 2) the loss of learning; 3) the increased risk of dropping out; 4) limitations in the development of students' emotional and social skills.

Similarly, other international studies (e.g., Bannink & Dam, 2021; Cohen-Fraade & Donahu, 2022; König et al., 2020; Mari et al., 2021; MacIntyre et al., 2020; OECD, 2021; Pirone, 2021; Zancajo et al., 2022) highlighted: 1) the relationship of the pandemic to the unravelling of social and economic inequalities and the worsening of school inequalities, particularly in Southern and Eastern Europe; 2) the effects of "fractured ecologies" (Bannink & Dam, 2021, p. 2) resulting from teachers and students no longer sharing the same physical space, amplified by computer-mediated communication, on the organization of teaching, curriculum compliance and motivation for learning; 3) the gaps in the digital skills of the teaching staff; and 4) the working conditions of teachers generated by the pandemic, from the perspective of psychological well-being, especially among those with children and with greater difficulties in balancing professional duties and personal and family life.

In the review of the scientific literature published between 2020 and 2022, we found, however, that the research conducted focused either on the analysis of the effects of the pandemic on education, as we have explained, particularly in secondary and higher education, or on the type of changes generated by crisis management plans implemented in schools.

The proposal of this project arises, precisely, from the identification of this gap and intends to answer the following research questions, considering the three central axes of analysis highlighted by this literature review:
Q1. What post-pandemic changes were implemented to mitigate school inequalities?
Q2. What post-pandemic changes can we identify in the working conditions provided to teachers and in the promotion of their well-being and mental health?
Q3. What post-pandemic changes can we identify in supporting the recovery of students' learning, in monitoring and assessing their learning?


Methodology, Methods, Research Instruments or Sources Used
To answer these questions, the research plan follows an integrative mixed-methods methodological approach (Åkerblad et al., 2020) that includes questionnaires and focus groups.

The questionnaires are being applied up to a maximum number of 300 directors and 700 teachers, based on a purposive sample of 1st cycle (N = 3 589) and 2nd cycle (N = 916) public schools, considering that: 1) survey instruments do not aim at inference of attributes for a population; 2) an intentional sample often preserves relationships between variables. Data will be processed using SPSS (closed-ended questions; descriptive and multivariate statistics) and MAXQDA (open-ended questions; content analysis) software.

Three focus groups will be conducted considering the axes of analysis of this work: 1) socioeducational inequalities; 2) working conditions, well-being and mental health of teachers; and 3) recovery of learning, monitoring and assessment of students. The recruitment of participants will seek to ensure that stakeholders are representative. A maximum of 12 participants will be invited. Each discussion session will have a maximum duration of 3 hours, divided into two parts of 1.5 hours each. The contributions will be audio-recorded, transcribed and analyzed using MAXQDA software.

Conclusions, Expected Outcomes or Findings
The first results of this work suggest that: 1) there was an extension of some of the equity measures previously implemented by the government, such as the increase in the number of students receiving social and economic aid and the implementation at a national level of an Integrated Plan for the Recovery of Learning; 2) there was a reinforcement of technological equipment in schools and of the supply of continuous teacher training in this area; 3) by contrast, the changes introduced in teachers' pedagogical practices, as well as the actions implemented to promote their well-being and mental health and to provide training in socio-emotional skills that may support their pedagogical work in the current post-pandemic context, are less expressive; 4) it is also at the level of socio-emotional skills, relationships, autonomy and communication with peers and teachers that students have shown the greatest difficulties in making up ground.

References
Åkerblad, L., Seppänen-Järvelä, R., & Haapakoski, K. (2020). Integrative strategies in mixed methods research. Journal of Mixed Methods Research, 15(2), 152-170.
Alves, J. M., & Cabral, I. (Eds.). (2020). Ensinar e aprender em tempo de COVID 19: Entre o caos e a redenção. Faculdade de Educação e Psicologia da Universidade Católica Portuguesa.
Bannink, A., & Dam, J. V. (2021). Teaching via Zoom: Emergent discourse practices and complex footings in the online/offline classroom interface. Languages, 6(3).
Benavente, A., Peixoto, P., & Gomes, R. M. (2020). Impacto da Covid-19 no sistema de ensino português. Resultados globais. OP. Edu – Observatório das Políticas de Educação e Formação.
Cohen-Fraade, S., & Donahu, M. (2022). The impact of COVID-19 on teachers’ mental health. Journal for Multicultural Education, 16(1), 18-29.
CNE. (2021). Efeitos da pandemia COVID-19 na educação: Desigualdades e medidas de equidade. CNE.
Fernandes, M. A. F., Machado, E. A., Alves, M. P., & Vieira, D. A. (2021). Ensinar em tempos de Covid-19: Um estudo com professores dos ensinos básico e secundário em Portugal. Revista Portuguesa de Educação, 34(1), 5-27.
IAVE. (2021). Estudo diagnóstico das aprendizagens Apresentação de resultados. IAVE.
König, J., Jäger-Biela, D. J., & Glutsch, N. (2020). Adapting to online teaching during COVID-19 school closure: Teacher education and teacher competence effects among early career teachers in Germany. European Journal of Teacher Education, 43(4), 608-622.
Mari, E., Lausi, G., Fraschetti, A., Pizzo, A., Baldi, M., Quaglieri, A., … Giannini, A. M. (2021). Teaching during the pandemic: A comparison in psychological wellbeing among smart working professions. Sustainability, 13(9), 4850.
MacIntyre, P. D., Gregersen, T., & Mercer, S. (2020). Language teachers’ coping strategies during the Covid-19 conversion to online teaching: Correlations with stress, wellbeing and negative emotions. System, 94, 102352.
Martins, S. C. (2020). A educação e a Covid-19: Desigualdades, experiências e impactos de uma pandemia não anunciada. In R. M. Carmo, I. Tavares, & A. F. Cândido (Eds.), Um olhar sociológico sobre a crise Covid-19 em livro (pp.37-54). Observatório das Desigualdades, CIES-ISCTE.
OECD (2021). The state of global education. 18 months into the pandemic. OECD.
Pirone, F. (2021). School closures in France in 2020: Inequalities and consequences for perceptions, practices and relationships towards and within schools. European Journal of Education, 56(4), 536-549.
Zancajo, A., Verger, A., & Bolea, P. (2022).  Digitalization and beyond: The effects of Covid-19 on post-pandemic educational policy and delivery in Europe. Policy and Society, 41(1), 111-128.
 
3:30pm - 5:00pm 28 SES 12 C: Religion in schools
Location: Gilbert Scott, 253 [Floor 2]
Session Chair: Karl Kitching
Paper Session
 
28. Sociologies of Education
Paper

Representations of the Virgin Mary in Swiss German religious school textbooks in a multi-religious society

Bernhard Rotzer

College of Teacher Education Valais, Switzerland

Presenting Author: Rotzer, Bernhard

In the sociology of knowledge according to Berger and Luckmann, it is assumed that reality is socially constituted and must be renegotiated from one generation to the next (cf. Berger & Luckmann, 1969). The institution of school is not excluded from this process, which becomes apparent time and again with the introduction of new curricula. A few years ago, Curriculum 21 was introduced in the German-speaking cantons of Switzerland, irrespective of historical and confessional boundaries, which contributed to the harmonisation of learning content. Religious education in particular underwent a change: it was replaced by the subject "Ethics, Religions, Community" (ERG). Religious education in the singular is now a thing of the past and has been transformed into the teaching of religions. The authors of Curriculum 21 take it for granted that children are surrounded by a heterogeneous environment and have to deal with many religious traditions and world views. It is still worthwhile to deal with the Christian traces in society, but a lesson that deals with religions cannot leave out other world views. Adolescents should be introduced to different religions and thus be made capable of tolerance and democracy (Lehrplan 21). In the curriculum of 2003, religious education still focused on the Bible, knowledge of which seemed indispensable for general education, and the children were supposed to get to know the Christian cultural heritage (Lehrplan 2003). Social realities with their institutional knowledge are subject to constant shifts (cf. Foucault, 1974, p. 13). These processes of change are particularly noticeable in textbooks and their contents. Drawing on the sociology of knowledge, this means that the content of textbooks can change over time. Whereas in 2000 just over 75 per cent of the Swiss still belonged to the Roman Catholic or Protestant Church, the balance of power has shifted drastically in the last 20 years.
Today, just under 54 per cent of Swiss citizens still feel they belong to a traditional Christian church (Bundesamt für Statistik). In 2023, we encounter a diverse religious landscape in Switzerland, and this presentation is based on the assumption that the diversity of Swiss social relations should also have an impact on the content of religious textbooks. Textbook contents are not random products, but rather sources that are constantly renegotiated by various actors and textbook developers in social-historical discourse (cf. Wiater, 2003).

The religious figure of Mary is subject to competing interpretive claims across denominational boundaries, so it can be assumed that perceptions of her manifest themselves in different ways in religious textbooks, depending on the social context. Using her as an example, this contribution to textbook research aims to show how representations of Mary in image and word appear in the textbooks of the 2003 curriculum and Curriculum 21, and whether or not they do justice to a diverse society. Thus, this paper is interested in the following questions:

How is the religious figure of Mary portrayed in the textbooks of curriculum 2003 and curriculum 21? Do they do justice to a heterogeneous Swiss society? Could there have been shifts in the representations of Mary in the period between 2003 and today? And if so, what social events might have contributed to this?

Since a pluralistic society changes the interrelationships among the various religious institutions and promotes ecumenical as well as interreligious exchange (cf. Berger, 2014, p. 48), it can be assumed that the representations of Mary must also be affected by this fact.


Methodology, Methods, Research Instruments or Sources Used
In order to get to the bottom of these questions, the researcher draws on grounded theory according to Glaser and Strauss, processing the textbook texts in an inductive manner (cf. Strübing, 2014). In doing so, the author makes use of a total of eight religion textbooks that were and are used under the 2003 curriculum and Curriculum 21. From the text, codes are worked out that are finally assigned cumulatively to the category of Mary and enable reliable statements on the textbook contents from both a synchronic and a diachronic perspective between 2003 and today. This qualitative content analysis will be followed by a discursive classification in order to explore the body of knowledge on Mary in religious education textbooks and its possible shifts (cf. Rössler, 2017; Mayring, 2015).

Conclusions, Expected Outcomes or Findings
When the curriculum was introduced in 2003, around 75 per cent of the Swiss counted themselves as belonging to a traditional Christian community. Since the 1970s, there have been tendencies towards religious mixing and secularisation, but in German-speaking Switzerland, the majority could be assumed to have a Christian worldview (Bundesamt für Statistik). Thus, in the religious textbooks from the 2003 curriculum, children were taught a depiction of Mary that was within the Christian-Jewish horizon. However, the authors of the textbooks did not confine themselves to the traditional biblical accounts, but added Marian stories of their own design in order to present the role of Mary to the schoolchildren in a more comprehensible way, which could be linked to pedagogical considerations (Gott hat viele Namen, 1997, p. 300). Twenty years later, the religious landscape in Switzerland has changed. In 2023, for example, just over half of the Swiss still belong to a Christian denomination, a drop of over 20 percentage points since 2000. An increase in other religious traditions and non-denominational views has emerged (Bundesamt für Statistik). The analysis of textbooks from Curriculum 21 onwards shows that representations of the Virgin Mary have opened up in favour of an interdenominational or even a cross-religious view and have expanded beyond denominational boundaries to include Islamic and Hindu representations (Blickpunkt 2, 2013, pp. 84-87). These facts show that the representations of Mary in current religious education textbooks fit the social reality of a pluralistic society. In contrast to 20 years ago, they have changed from a Christian-Jewish centred and interdenominational approach to an interreligious one. This shows that the role of Mary in religious education textbooks in German-speaking Switzerland changes with social developments and is currently compatible with a society of religious diversity.

References
Berger, P.L. (2014). The many altars of modernity. Toward a paradigm for religion in a pluralist age. Boston: De Gruyter.
Berger, P.L. & Luckmann, T. (1969). Die gesellschaftliche Konstruktion der Wirklichkeit (German edition, 26th ed., 2016). Frankfurt am Main: Fischer Verlag.
Blickpunkt 2. Religion und Kultur (2013). Lehrmittelverlag in cooperation with the Pädagogische Hochschule Zürich. Zürich: Lehrmittelverlag Zürich.
Bundesamt für Statistik (2022). Available at: https://www.bfs.admin.ch/bfs/de/home/statistiken/bevoelkerung/sprachen-religionen/religionen.html
Foucault, M. (1974). Die Ordnung des Diskurses (German edition, 14th ed., 2017). München: Carl Hanser Verlag.
Gott hat viele Namen (1997). Edited by the Lehrmittelverlag des Kantons Zürich. Zürich: Lehrmittelverlag Zürich.
Lehrplan 21 (2018). Available at: https://vs.lehrplan.ch/index.php?code=b|6|1
Lehrplan 2003 (2003). Sion: Médiathèque Valais, BCV PA 4151.
Mayring, P. (2015). Qualitative Inhaltsanalyse. Grundlagen und Techniken (12th revised ed.). Weinheim: Beltz.
Rössler, P. (2017). Inhaltsanalyse (3rd ed.). Konstanz und München: utb.
Wiater, W. (2003). Das Schulbuch als Gegenstand pädagogischer Forschung. In W. Wiater (Ed.), Schulbuchforschung in Europa – Bestandesaufnahme und Zukunftsperspektive. Beiträge zur historischen und systematischen Schulbuchforschung (pp. 11-22). Bad Heilbrunn: Klinkhardt.


28. Sociologies of Education
Paper

Employment Equality and Non-Religious Teachers in Religious Schools

Catherine Stapleton1, James Nelson2

1MIC, University of Limerick, Ireland; 2Queens University, Belfast

Presenting Author: Stapleton, Catherine; Nelson, James

Globalisation, socio-political shifts and increasing diversification of religious beliefs and practices present challenges for schools around the world. This is a time of transition, and school communities face tensions between traditional and new ways of understanding. Teachers are at the interface of this change, including how their personal identities fit within professional environments. This paper presents an investigation into non-religious teachers' experiences in traditionally religious schools in the Republic of Ireland (RoI) and Northern Ireland (NI). In the RoI and, until recently, in NI, schools with a religious ethos were exempt from employment equality legislation in relation to religion (NI Fair Employment and Treatment Order 1998; Irish Employment Equality Act 1998-2011 Section 37 (1)). Historically this has been justified on religious grounds and the right of religious schools to appoint teachers who share their beliefs. Over time, populations on both sides of the border have become more religiously diverse and there has been a significant rise in the number of people with no religious belief. Some schools have responded to this increasing plurality by changing how they describe their stated ethos; this has resulted in further uncertainty around what counts as a religious school and raises questions regarding the applicability of exemptions from equality legislation for all schools on the island. Furthermore, the continued use of exemptions from equality legislation in RoI would appear to be overly generous in comparison to other European states. The research question, therefore, was as follows:

To what extent is religion or belief a factor in the appointment or promotion of non-religious teachers in Post-Primary schools with a religious ethos on the island of Ireland?

The epistemology underpinning this research is social constructivism. Theories of identity and teacher agency, particularly ecological agency (Priestly et al. 2015), underpin the analysis of the findings.

The research methodology was qualitative and the researchers undertook semi-structured interviews with fifteen non-religious post-primary teachers. Thematic analysis supported by NVivo 10 computer software was used to analyse the data.

The key findings are that religion or belief was a factor in the appointments of all the teachers to varying degrees. In schools managed by Catholic authorities, candidates’ beliefs were explicitly taken into consideration. In other schools that hold religious values, implicit religious influences were at play in teacher appointments. It was also found that temporary contracts and probation periods meant teachers were subjected to a protracted assessment of their suitability for posts, including their ‘fit’ with a school's religious ethos. The majority of the participants felt a need to suppress their non-religious identity and conform to the schools’ religious culture, causing identity dissonance and personal ethical conflicts.


Methodology, Methods, Research Instruments or Sources Used
To answer the research question, the researchers chose to gather qualitative data from a sample of teachers in both jurisdictions. As explored in the literature review, those Post-Primary teachers who are non-religious may lack formal protections against discrimination in employment on the basis of their beliefs.
The researchers recruited Post-Primary teachers who self-identified as non-religious and had experience working in a school with a religious ethos. Initially, a number of established humanist organisations and social network groups were contacted. However, it proved challenging to find participants and the researchers asked the organisations to re-advertise. The communications office at Mary Immaculate College, Limerick was also asked to advertise the research project on its platforms. Snowball sampling was utilised, whereby participants were asked at their interview if they had colleagues who might be interested in participating in the research. This enabled a wider reach to participants who were not members of non-religious groups or social media followers. Where applicable, permission was sought from the organisation and/or network gatekeeper to share an invitation to become involved in the research. The research was advertised between June and August 2020. The criteria for selection shared in the invitation were: a non-religious worldview and experience of teaching in a Post-Primary school on the island of Ireland which had a religious ethos. In total, 15 participants were interviewed, five from NI and ten from RoI. When interviewed, 14 were currently teaching and one had left the teaching profession. Due to the restrictions of the Covid-19 pandemic, video-call software was used to facilitate the interviews.
The project received ethical approval from the SSESW Ethics Committee of Queen’s University Belfast.

Conclusions, Expected Outcomes or Findings
Religion or belief is a factor in the appointments of teachers. Similar to other studies of teachers in NI (Milliken et al. 2019), our data showed that application forms and interview processes are used by many schools to elicit the religious or non-religious identity of teachers and their level of commitment to the religious ethos of the school. We can see from our sample that the freedom to make judgments on applicants by religion is exercised explicitly by Catholic schools. Further, implicit processes are at play across other school types which remain religiously influenced, on both parts of the island. Moreover, temporary contracts and probation periods combined with a ‘chill factor’ mean teachers are subjected to a protracted assessment of their suitability.
 
In considering our findings alongside European directives focusing on proportionality and genuine occupational requirement (European Council 2000), the European Convention guidance on religious freedom (article 9) (ECHR 2021) and the United Nations Human Rights comment 22 (UNHRC 1993) on mutual respect, we found that non-religious teachers without legislated protection from discrimination can be disadvantaged in employment in a range of school types and, if they achieve employment, can experience isolation, identity dissonance and restricted agency. Using an ecological view of agency as part of the analytical frame helped to highlight how teachers, as individuals, cannot easily address discriminatory environments and practices at a structural level. Interestingly, our findings also show that schools with strong religious cultures are not exclusively denominational schools. For this reason, a system-wide review of employment practices is needed, especially if non-religious teachers are to experience equality and inclusivity as part of their professional environment.
 

 

References
Barnes, L. P. (2021). The character of Controlled schools in Northern Ireland: A complementary perspective to that of Gracie and Brown. International Journal of Christianity & Education, 205699712110089. https://doi.org/10.1177/20569971211008940

Berglund, Jenny (2014) Swedish Religion Education: Objective but Marinated in Lutheran Protestantism?, Temenos - Nordic Journal of Comparative Religion 49: 2, 165–84. https://doi.org/10.33356/temenos.9545

Bråten, O. M. H. (2014), “New social patterns: old structures? How the countries of Western Europe deal with religious plurality in education”, in Rothgangel M., Jackson R. and M. Jäggle (eds), Religious education at schools in Europe, Vol. 2: Western Europe, Vienna University Press: Göttingen

Bullivant, S., Farias, M., Lanman, J., & Lee, L. (2019). Understanding Unbelief: Atheists and agnostics around the world. https://cdn-researchkent.pressidium.com/understandingunbelief/wpcontent/uploads/sites/1816/2019/05/UUReportRome.pdf

Catholic Schools Partnership (2014), Catholic Education at Second Level in the Republic of Ireland. Looking at the Future. Dublin: Veritas

Coffman, A.N. (2015) Teacher Agency and Education Policy. The New Educator, 11(4), 322-332

Chan, A. & Stapleton, C. (2021). Religious-based bullying: International Perspectives on what it is and how to address it. In P. K. Smith & J. O’Higgins Norman (Eds.), The Wiley Blackwell Handbook of Bullying: A Comprehensive and International Review of Research and Intervention, Vol. 1 (pp. 321-341). Wiley Blackwell.

Employment Equality Act, 1998-2011, Section 37(1). Dublin Stationery Office, available http://www.irishstatutebook.ie/eli/1998/act/21/enacted/en/html

Equality Commission Northern Ireland (ECNI). (2004). The Exception of Teachers from The Fair Employment and Treatment (NI) Order 1998. https://www.equalityni.org/ECNI/media/ECNI/Publications/Delivering Equality/TeacherExceptionfromFETOInvestigReport2004.pdf

European Council. (2000). Council Directive establishing a general framework for equal treatment in employment and occupation 2000/78/EC. http://data.europa.eu/eli/dir/2000/78/oj

Franken, L. (2021). Church, State and RE in Europe: Past, Present and Future. Religion & Education. https://doi.org/10.1080/15507394.2021.1897452

Heinz, M., Davison, K., & Keane, E. (2018). ‘I will do it, but religion is a very personal thing’: teacher education applicants’ attitudes towards teaching religion in Ireland. European Journal of Teacher Education, 41(2), 232-245.

Milliken, M., Bates, J., & Smith, A. (2019). Education policies and teacher deployment in Northern Ireland: ethnic separation, cultural encapsulation and community cross-over. British Journal of Educational Studies, 1–22. https://doi.org/10.1080/00071005.2019.166608

Nelson, J. (2019). Meaning-making in religious education: a critical discourse analysis of RE departments’ web pages. British Journal of Religious Education, 41(1), 90–104. https://doi.org/10.1080/01416200.2017.1324757

Priestley, M., Biesta, G., & Robinson, S. (2015). Teacher Agency: an ecological approach. Bloomsbury Academic.

Russo, C. J. (2009). The Law and Hiring Practices in Faith-Based Schools. Journal of Research on Christian Education, 18(3), 256–271. https://doi.org/10.1080/10656210903345248

Stapleton, C. (2021). ‘Catholic education at the coalface of a kaleidoscope of identities’, Pastoral Care in Education, 39(1). DOI: 10.1080/02643944.2021.1898664. Available: https://www.tandfonline.com/eprint/EFDDHVYXVZF7PICGFGZM/full?target=10.1080/02643944.2021.1898664


28. Sociologies of Education
Paper

Education Policy and Youth Freedom of Expression on Race and Faith at School

Karl Kitching, Asli Kandemir, Reza Gholami, Md. Shajedur Rahman

University of Birmingham, United Kingdom

Presenting Author: Kitching, Karl; Kandemir, Asli

This paper is part of a mixed methods research project on the factors in and out of schools that shape young people’s expression on race and faith equality issues. Focusing here on policy discourse, the paper presents an analysis of two key questions: (1) how liberal political concepts including those of freedom of expression may be mobilised in education policy to facilitate wider right-wing political goals, and (2) how education policy in this context shapes young people’s ‘free’ public political expression - in particular on race and faith equality issues - at school. The international significance of this paper lies in its analysis of how education policy discourse is aligned with the revival of freedom of expression as a topic of right-wing political concern across the global north. This revival, it has been argued, seeks to undermine fragile race and faith equality progress through the translation of narrow ‘free speech’ and ‘cancel culture’ claims into policy and political discourse (Mondon and Winter 2020; Titley 2020).

Our focus is on education policy and politics in contemporary English education, where government figures have defined the anti-racist organising of movements such as Black Lives Matter as ‘cancelling’ freedom of expression in higher education, and as creating risks for impartiality on the teaching of equality in schools (Trilling 2020). Political figures in the US, France and Australia have made similar claims (Goldberg 2021), advancing an “antagonistic vision” of “who constitutes the public and what values should guide public discourse” (Titley 2020: 3). However, education policy texts are typically more politically measured, and freedom of expression is a far more complex phenomenon than the binary notions of ‘free speech’ and ‘cancellation’ put forward in such political discourse allow. For example, in the context of curriculum-making, mundane processes of foreclosing what is not/cannot be taught, processes of editing, and the pursuit of efficiencies and profit all play a role in shaping what can be thought, said and felt in education contexts (Mondal 2018).

The paper analyses 80 education policy texts in the English and UK policy context with a view not just to unearthing how freedom of expression is directly defined in such texts, but also to identifying the ways education policy contributes to the political, cultural and affective environment that makes certain kinds of expression possible for young people. Education policy has long been theorised in terms of discourse, i.e., a body of ideas, concepts and beliefs established as knowledge or truth, framing “what can be said, and thought, but also… who can speak, when, where, and with what authority” (Ball 1993, 14). The paper draws on this theoretical tradition to understand ‘freedom’ as existing in a complex, contextual relationship to power/constraint, rather than being its simple opposite. As notions of disciplinary power and subjectivation arising from Foucault (1975) and Butler (1990) indicate, a focus on discourse helps us see the performative, i.e., normalising, power of discourse in shaping the possibilities of everyday youth expression (Youdell 2006), alongside more commonly understood juridical/legal forms of constraint on expression (e.g. hate speech).

As such, a key analytic goal in this paper is to identify what kinds of subject positions and thus, possibilities for expression, are made available to young people through education policy texts. But freedom of expression, and questions of race and faith equality involve political passions (Youdell 2011). Therefore, drawing on affect theories, we seek to analyse how the possibility of young people and their political expression on race and faith equality becoming a particular subject and object of feeling is also created/closed down through policy discourse (Ahmed 2004; Kitching et al. 2015).


Methodology, Methods, Research Instruments or Sources Used
The paper presents a two-step thematic and discursive analysis of English education policy texts pertaining to the period 2010-2022. This period marks several Conservative-led policy changes, including the deregulation of school governance to offer schools greater budgetary and curriculum ‘freedoms’ (Academies Act 2010; Department for Education [DfE] 2016), the establishment of a statutory terrorism prevention duty in schools (Department for Education 2015), the minimising of racism as a systemic issue (Commission on Race and Ethnic Disparities 2021), the endorsement of ‘strict’ methods to manage behaviour (Timpson 2019), and the issuing of political impartiality guidelines for schools as a response to movements, e.g. for decolonisation (DfE 2022).

The data consisted of a corpus of 80 texts gathered in the areas of equality, curriculum, behaviour, safeguarding (inclusive of counter-terrorism) and inspection. These texts were identified by searching these areas in the DfE government web archive for the period. We included relevant higher education texts due to the focus on freedom of expression in this context (DfE 2021). The texts included white papers, legislation, guidance on enacting legal duties in schools, policy research reports, and press statements. While engaging a broad range of policy priorities, this approach allowed us to identify dominant discourses operating across these priorities and how they aligned with or contradicted one another. The selected texts were divided between the two presenting authors, and a two-stage analytic process was conducted. The first was a thematic analysis (Braun and Clarke 2019), which enabled the identification of the range of meanings put forward in the texts. While texts were coded under a priori categories of equality, freedom, expression, and mission of school/higher education, these categories largely helped us to organise the analysis of two separate sets of texts, allowing us to meet/communicate regularly and ‘make sense’ of each other’s coding processes. We then examined how our 265 codes overlapped and differed, to simplify and merge the codes into 31 a posteriori codes. At this point, moving towards a more deductive process, we identified five key themes as capturing the prevailing meanings advanced in the texts: truth, vulnerability, liberal equality, school excellence, and citizen-making. Drawing on samples from each of the five themes, we then conducted a second-stage analysis of the discursive strategies deployed in the texts to offer particular subject positions for young people, and ways of feeling about freedom of expression, race and faith equality and youth.

Conclusions, Expected Outcomes or Findings
The analysis found that few education policy texts directly addressed the topic of freedom of expression or the political debates noted earlier. However, multiple discursive and affective strategies typically aligned to liberal political norms were identified as offering narrow possibilities for expression to young people. As one example, a key strategy involved not simply the production of young people as vulnerable subjects, but the discursive and affective regulation of acceptable vulnerability through discourses of child/youth safeguarding and protected characteristics. There was an implicit temporal distinction drawn between current priority (gender, sexuality, age) risks which, in line with the prevailing political climate, deprioritised concerns about race inequality. At the same time, the forms of vulnerability that youth may more actively encounter (e.g. youth-led organising, dissent) were either absent, discouraged, or defined as illegal.

While processes of policy enactment will find ways to subvert and work against the above issues, we argue these discursive and affective strategies amongst others in the wider dataset powerfully work to empty liberal democratic concepts of equality and human rights of their potential to support young people’s political expression. This emptying and narrowing of the kinds of political subjects that young people can become in turn facilitates the achievement of prevailing right-wing political goals. This is not least as race and faith equality are largely depoliticised and deprioritised as protected characteristics, and any stronger representation of race or faith inequality as a live issue is designated as ‘contested’ and thus a risky basis for school-based discussion. The next phase of our research will map how these discursive and affective strategies translate into processes of policy enactment in schools and young people’s lives, through interviews with national and local policy stakeholders, and ethnographic school case studies.

References
Academies Act 2010. Available at https://www.legislation.gov.uk/ukpga/2010/32/contents
Ahmed, S. 2004. The Cultural Politics of Emotion. Edinburgh: Edinburgh University Press.
Ball, S.J. 1993. What is Policy? Texts, Trajectories and Toolboxes. Discourse: Studies in the Cultural Politics of Education 13(2): 10-17.
Braun, V. and Clarke, V. 2019. Reflecting on Reflexive Thematic Analysis. Qualitative Research in Sport, Exercise and Health 11(4): 589-597.
Butler, J. 1990. Gender Trouble: Feminism and the Subversion of Identity. London: Routledge.
Commission on Race and Ethnic Disparities. 2021. Commission on Race and Ethnic Disparities: The Report. Available at https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/974507/20210331_-_CRED_Report_-_FINAL_-_Web_Accessible.pdf
Department for Education. 2015. The Prevent Duty: Departmental Advice for Schools and Care Providers. HMSO.
Department for Education. 2016. Educational Excellence Everywhere. HMSO.
Department for Education. 2022. Political Impartiality in Schools. Available at https://www.gov.uk/government/publications/political-impartiality-in-schools/political-impartiality-in-schools
Foucault, M. 1975. Discipline and Punish: The Birth of the Prison. New York: Random House.
Goldberg, D.T. 2021. The War on Critical Race Theory. Boston Review. https://www.bostonreview.net/articles/the-war-on-critical-race-theory/
Kitching, K., O’Brien, S., Long, F., Conway, P.F., Murphy, R., and Hall, K. 2015. Knowing How to Feel About the Other? Student Teachers, and the Contingent Role of Embodiments in Educational Inequalities. Pedagogy, Culture and Society 23(2): 203-223.
Mondal, A. A. 2018. The Shape of Free Speech: Rethinking Liberal Free Speech Theory. Continuum 32(4): 503-517.
Mondon, A. and Winter, A. 2020. Reactionary Democracy: How Racism and the Populist Far Right Became Mainstream. London: Verso.
Timpson, E. 2019. Timpson Review of School Exclusion. Department for Education.
Titley, G. 2020. Is Free Speech Racist? London: Polity Books.
Trilling, D. 2020. Why is the Government Suddenly Targeting Critical Race Theory? The Guardian. 23 October. https://www.theguardian.com/commentisfree/2020/oct/23/uk-critical-race-theory-trump-conservatives-structural-inequality
Youdell, D. 2006. Impossible Bodies, Impossible Selves: Exclusions and Student Subjectivities. Dordrecht: Springer.
Youdell, D. 2011. School Trouble: Identity, Power and Politics in Education. London: Routledge.
 
5:15pm - 6:45pm 09 SES 13 B: Assessment Practices and School Development: Fostering Fairness and Effective Implementation
Location: Gilbert Scott, 253 [Floor 2]
Session Chair: Alli Klapp
Paper Session
 
09. Assessment, Evaluation, Testing and Measurement
Paper

How to Deal with the Challenge of Assessment – Performance and Assessment Culture as an Issue of School Development

Marius Diekmann, Sabine Gruehn, Carolin Kruell

University of Münster, Germany

Presenting Author: Diekmann, Marius; Kruell, Carolin

Without doubt, everyday assessment and responses to student performance are central facets of school quality and an important field of innovations in school. The development of a “new” and formative performance and assessment culture, which is suitable for initiating and supporting the acquisition and development of both subject-specific and interdisciplinary/generic competences of students (e.g. self-regulated learning, social competences), seems to be a necessary condition for the development of teaching and learning in general (cf. MfSW NRW 2009; Beutel et al. 2017; Wiliam 2018). In this context, the term “culture” refers to a fundamental change of assessment practice that is not limited to a selective use of some additional or alternative diagnostic instruments by only a few teachers (cf. Jürgens & Diekmann 2006; Box 2019, 42/143). According to Sacher (2014, 264), the development of a new, formative performance and assessment culture in schools will only succeed if it is based on a jointly formulated assessment concept in which the teaching staff fixes objectives, guiding principles and concrete agreements on assessment practice. Such an assessment concept, Sacher points out, must be implemented, regularly evaluated, discussed, and revised as an essential part of the school program. How, to what extent and how successfully individual schools undertake efforts regarding the requested change and development of performance and assessment culture(s) (cf. e.g. Winter 2012) has hardly been empirically investigated (in Germany). Most of the findings on performance and assessment culture and associated innovations relate to schools that can be described as “extraordinary”: extraordinary in the sense that they, for example, have been nominated for the (nationwide) German School Award (cf. Porsch et al. 2014, Beutel & Pant 2020) or have a special pedagogical profile (e.g. Montessori, cf. Diekmann 2018).
There are almost no empirical findings that give a broader impression of focal points, achievements, or school form/grade specific characteristics of the change in performance/assessment culture at “ordinary” schools. One exception is the set of findings obtained in the context of an external evaluation (“Qualitätsanalyse”) of schools in North Rhine-Westphalia (a federal state of Germany). During this evaluation, various methods were used to gain a comprehensive impression of the work and quality of schools. Among other things, classroom observations were conducted, and so-called school portfolios were reviewed. The school portfolios contained various documents specific to individual schools, such as school programs. A mandatory component of the school portfolios was the performance concept developed by each individual school. In summary, the performance/assessment concepts schools had to submit during the evaluation are characterized as unsatisfying and in need of development (cf. MfSW NRW 2009, 34). Unfortunately, this conclusion is not explained in detail. Regarding specific differences between school types and levels, only a few findings are reported. For example, it is pointed out that the performance/assessment concepts at secondary schools are comparatively subject-specific (compared to the performance/assessment concepts at primary schools). In contrast, performance/assessment concepts at primary schools apparently prove to be more elaborate about the formative use of individual diagnostics (cf. MfSW NRW 2016, 30-32). The following questions arise from this:
Research question 1: How are performance/assessment concepts designed in terms of scope and content? Are there any school level/form-specific priorities or features?
Research question 2: Are performance/assessment concepts embedded in a whole school approach of school development?


Methodology, Methods, Research Instruments or Sources Used
To gain fundamental insights into the questions raised above, we conducted an explorative content analysis of diverse documents dealing with performance and assessment, which we found on the homepages of 100 randomly selected primary schools and 100 randomly selected secondary schools in North Rhine-Westphalia. In addition to the texts explicitly designated as performance/assessment concepts, we have also incorporated texts and text passages that deal, for example, with grading practice in various subjects. After we downloaded the documents from the schools’ homepages from January to May 2022, we developed, tested, and revised the category system for content analysis. The categories we used were derived from the material itself (the performance/assessment concepts), from academic discussion (cf. e.g. Bohl 2018) and from guidelines given by the Educational Administration (cf. QUA-LiS NRW 2011). Statements in performance/assessment concepts or other performance/assessment-related information that could be assigned to the following (superordinate) categories were coded and counted: general and subject-specific principles and objectives of (performance) assessment; quality criteria for (performance) assessment; forms and instruments of (performance) assessment; concretization and implementation of legal requirements; performance/assessment concept in the context of school and teaching development; innovations; evaluation and revision. Analyses of variance and t-tests were used to examine whether there are significant school-level and school-form-specific differences. One advantage of document analysis is that this method is much less prone to the phenomenon of social desirability than, for example, a written survey or an interview. One of its disadvantages, however, is that the origin and authorship of the analyzed material cannot always be traced. Therefore, it is usually recommended to combine different methods of data collection.
This is what we intend to do in the next step. Based on the findings of our document analyses, we plan to conduct in-depth interviews with school administrators and written surveys of teachers.
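The group comparison described above can be illustrated with a minimal sketch. It assumes Welch's unequal-variance form of the t-test, which may differ from the variant the authors used, and the per-school counts shown are purely hypothetical placeholders, not data from the study. The function name `welch_t` is likewise an illustrative choice.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # statistics.variance computes the sample variance (divisor n - 1)
    se_a = variance(sample_a) / n_a
    se_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se_a + se_b)
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical counts of coded "innovation" statements per school document
primary_counts = [4, 6, 5, 7, 3, 6]
secondary_counts = [2, 3, 1, 4, 2, 3]

t_stat, dof = welch_t(primary_counts, secondary_counts)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

A positive t value here would indicate a higher mean count in the first group; judging significance would additionally require comparing t against the t distribution with the computed degrees of freedom.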
Conclusions, Expected Outcomes or Findings
Research question 1: The length of the performance/assessment concepts, as well as of their subject-specific parts, varies considerably within the sample, ranging from 3 to 149 and from 0 to 139 pages (primary vs. secondary schools). Performance/assessment concepts at primary schools typically consist of an interdisciplinary and a subject-specific part. The latter is usually not included in the performance/assessment concepts of secondary schools but may be found in a separate document (subject-specific performance/assessment concept). Practically all of the performance/assessment concepts contain statements on quality criteria and principles of performance measurement, to which the respective school feels (particularly) committed, as well as statements on the concretization and implementation of legal requirements, which can be found in the School Act, in examination regulations or decrees.
Research question 2: Just under half to two-thirds of the performance/assessment concepts (primary vs. secondary schools) contain basic statements about their formation. Less frequent and less extensive are references to the evaluation and revision of performance/assessment concepts. It is quite remarkable that - especially at primary schools - a connection to the individual school program/school profile is established only in exceptional cases. In contrast, innovations (e.g., use of new/formative instruments) are reported more frequently in the performance/assessment concepts of primary schools than in those of secondary schools.

To put it simply, the performance/assessment concepts analyzed largely prove to be information about the existing practice of performance measurement and assessment, some of which is specific to the school level, as well as a concretization of binding, general requirements. As programs for innovation, i.e. for the development and implementation of a "new" performance culture as suggested by Sacher (2014, 264), performance/assessment concepts seem to be (still?) little used. Examining the reasons for this finding is one of the purposes of our planned follow-up study.

References
S.-I. Beutel, K. Höhmann, H. A. Pant, M. Schratz (Hg.) (2017): Handbuch Gute Schule. Sechs Qualitätsbereiche für eine zukunftsweisende Praxis. 2. Auflage. Seelze: Kallmeyer.
Beutel, S.-I.; Pant, H. A. (2020): Lernen ohne Noten. Alternative Konzepte der Leistungsbeurteilung. Stuttgart: Kohlhammer.
Bohl, T. (2018): Ewige Baustelle? Von pädagogischer Innovation und diagnostischer Qualität. In: Lernende Schule 21 (84), 3-7.
Box, Cathy (2019): Formative Assessment in United States Classrooms. Changing the Landscape of Teaching and Learning. London: Palgrave Macmillan.
Diekmann, M. (2018): Wortgutachten, Zeugnisbriefe und Rasterzeugnisse. Zur Beurteilungspraxis an bayerischen Montessori-Schulen. In: Lernende Schule 21 (84), 30-34.
Jürgens, E.; Diekmann, M. (2006): Lernleistungen von und mit Kindern erfassen und bewerten. In: P. Hanke (Hg.): Grundschule in Entwicklung. Herausforderungen und Perspektiven für die Grundschule heute. Münster: Waxmann, 206-229.
Ministerium für Schule und Weiterbildung des Landes Nordrhein-Westfalen (MfSW NRW) (2009): Qualitätsanalyse in Nordrhein-Westfalen. Impulse für die Weiterentwicklung von Schulen. Düsseldorf.
Ministerium für Schule und Weiterbildung des Landes Nordrhein-Westfalen (MfSW NRW) (2016): Qualitätsanalyse in Nordrhein-Westfalen. Landesbericht 2016. Düsseldorf.
Qualitäts- und UnterstützungsAgentur – Landesinstitut für Schule (QUA-LiS NRW) (2011): Anlage 1.4 Checkliste – Leistungskonzept (Material Nr. 2955), verfügbar unter: schulentwicklung-nrw.de.
Porsch, R.; Ruberg, C.; Testroet, I. (2014): Elemente einer Didaktik der Vielfalt. Die Bewerbungsportfolios der Schulen. In: S.-I. Beutel, W. Beutel (Hg.): Individuelle Lernbegleitung und Leistungsbeurteilung. Lernförderung und Schulqualität an Schulen des Deutschen Schulpreises. Schwalbach/Ts.: Wochenschau Verlag, 16-87.
Sacher, W. (2014): Leistungen entwickeln, überprüfen und beurteilen. Bewährte und neue Wege für die Primar- und Sekundarstufe. 6., überarbeitete und erweiterte Auflage. Bad Heilbrunn: Klinkhardt.
Wiliam, D. (2018): Embedded Formative Assessment. Second Edition. Bloomington, IN: Solution Tree.
Winter, F. (2012): Leistungsbewertung. Eine neue Lernkultur braucht einen anderen Umgang mit den Schülerleistungen. 5., überarbeitete und erweiterte Auflage. Baltmannsweiler: Schneider.


09. Assessment, Evaluation, Testing and Measurement
Paper

Knowing without Doing: Chinese Primary Citizenship Teachers’ Perceptions and Practices of Assessment Policies

Peng Zhang, Enze Guo

IOE, UCL’s Faculty of Education and Society

Presenting Author: Zhang, Peng; Guo, Enze

Based on the policy enactment perspective (Ball et al., 2011), this study investigates how primary citizenship teachers do assessment policies in their practice and discusses its influencing factors. While the citizenship programme remains non-statutory at the primary level in many countries, such as England (Richardson, 2010), the programme is mandatory in China due to the emphasis on fostering socialist identity and moral cultivation. The programme standards mandate an assessment approach that focuses on students' 'values' and 'process performance' (Ministry of Education, 2022, p.49) rather than test scores. Advocating 'assessment for learning' (ibid., p. 50), the standards call for a greater emphasis on formative assessments. To ensure the full implementation of national education policies (Lu et al., 2018, p.113), China has established an internal agency system - the System of Pedagogical Research Officer. Primary citizenship programme pedagogical research officers are appointed by the district education authorities and have administrative powers. As intermediaries, they hold the authority and responsibility to interpret, translate, and organise citizenship assessment policies in practice.
In contrast to the popular perception in China that relegates teachers to the role of policy implementers, scholars (Braun et al., 2011; Ball, 2011) acknowledge teachers as policy enactors. As Ball (1994, p. 19) asserts, policies do not typically provide a set course of action, but rather create situations where the choices for what to do are limited or altered, or specific aims or outcomes are established. The majority of educational policies depend on their realisation through teaching, positioning teachers not merely as implementers, but as interpreters and 'translators' of policy (Perryman et al., 2017, p. 745). This act of 'translation' suggests that while teachers adhere to policy, they also make adaptive modifications.
The study reveals that all schools employ standardised tests—developed by the district's education authorities—as summative assessments for students from Year 3 onwards, despite the absence of such tests for Year 1 and 2. Notwithstanding the stipulation in assessment policies that 'assessment results should be graded rather than scored' (Ministry of Education, 2022, p. 52), test results are ultimately rendered in the form of scores. Formative assessment post Year 3 is notably sparse, predominantly consisting of verbal feedback within classes. This is the case despite a unanimous acknowledgment that the curriculum standards advocate against basing judgments of students’ learning performance solely on test results.
Teachers perceive this disjunction, being aware of but not adhering to the assessment policies, as an outcome of the 'internal disintegration of the policies', a consequence of intermediary influences. In a manner akin to the role of medieval bishops interpreting the Bible, the pedagogical research officer wields unassailable authority within their community to interpret and translate official assessment policies. Their instructions and guidelines are considered the truly applicable policies, while the national assessment policies are often disregarded as overly 'idealistic' and 'abstract'. Furthermore, formative assessment is decried as a privilege available only to economically developed regions, which have the financial means to engage national and international experts for knowledge dissemination and practical guidance. Teachers also face considerable pressure from parents. As primary schools increasingly serve as childcare providers, teachers interact more directly and frequently with parents, many of whom express scepticism towards formative assessment due to its absence of tangible scores and rankings. The teachers identify the prevailing culture of competition, or the 'rat race', endemic in contemporary China as the root of these challenges. As the country’s economy slows, societal pressure to compete escalates, underscoring the view that 'excellence is not the sole goal, but more importantly, to be better than others'.


Methodology, Methods, Research Instruments or Sources Used
One district was chosen as the case for this study. It is located in the capital of a border province, a city which, despite being officially defined as a regional centre, has significantly less economic and cultural impact than Beijing and Shanghai. The collected data comprised both interview and documentary sources. Primary school citizenship educators within this district, invited via purposive sampling, contributed to the interview data. This method ensured that participants freely expressed their authentic opinions. Thirteen teachers, covering all primary schools in this district, were interviewed across two rounds. Each participant possessed over five years of experience teaching the citizenship programme and was actively involved in student assessment practices. Due to pandemic-induced international mobility restrictions, the interviews were conducted online.

The interview data were gathered in two rounds using semi-structured interviews. In the initial phase during 2019-2020, educators were interviewed for approximately 60 minutes to understand their perspectives on formative and summative assessment. Following the introduction of the new citizenship standards in 2022, the same teachers were invited for a second round of interviews, with each session extending close to 120 minutes. The objective was to gauge their views on assessment policies and particular practices. Initially conducted in Chinese, the interviews were later translated into English. Subsequently, the interview data were subjected to thematic coding and analysis. Although this research is patently theory-driven, underpinned by the policy enactment perspective (Ball et al., 2011), the attempt was to suspend any pre-existing theoretical expectations or biases during the coding phase as far as practicable. This was not only because the study aimed to present 'open' results, displaying authentic teacher viewpoints and practices, but also because it anticipated the emergence of themes beyond existing frameworks. Thematic data were continually compared throughout the coding process until saturation was achieved.

The documentary evidence encompassed the latest Primary Citizenship Programme Standards (2022 edition), 17 examination papers pertaining to the Year 3-6 citizenship programme (September 2016 to September 2022), and the topic outlines and associated documentation for the in-service citizenship teacher training over the past four years (September 2018 to September 2022). This data was contributed by participants who believed these documents played a policy role and exerted a structural impact on their assessment practice.

Conclusions, Expected Outcomes or Findings
This study diverges from earlier studies' emphasis on teachers’ personal factors and the Confucian testing tradition (Herman et al., 2015; Poole, 2016; Yan et al., 2021), instead investigating the manner in which teachers do official assessment policies by utilising the policy enactment perspective (Ball et al., 2011). Contrary to findings from England (Braun et al., 2011) and Ireland (Skerritt et al., 2021), the agency system in China does not invariably act as a catalyst and facilitator. On the contrary, it tends to fragment national assessment policies. This study therefore disputes the widespread belief in China that inadequate policy execution is due to teachers’ incompetence. Teachers often rely on intermediaries for policy interpretation, with these interpretations significantly influencing their behaviours. Additionally, most prior assessment studies in East Asia were centred in metropolitan regions, such as Hong Kong (Yan et al., 2021) and Beijing (Lu et al., 2018). However, this study revealed that teachers in non-metropolitan areas perceive formative assessment as a cultural benefit deriving from economic development, due to easier access to pertinent resources and support for metropolitan teachers. The child-care role that China’s primary schools play exerts greater assessment pressure on teachers from parents, compared to English secondary schools (Richardson, 2010).

Encouragingly, however, change is already underway. During the second round of interviews conducted in 2022, many teachers indicated efforts being made to enhance the status of formative assessment. They expressed gratitude towards this study, as it illuminated the utility of formative strategies in advancing student progress through practical experience, despite a lack of adequate support and training.

References
Ball, S.J. (1994). Education reform: A critical and post-structural approach, Buckingham, UK: Open University Press.

Ball, S. J., Maguire, M., & Braun, A. (2011). How schools do policy: Policy enactments in secondary schools. Routledge.

Braun, A., Ball, S. J., Maguire, M., & Hoskins, K. (2011). Taking context seriously: Towards explaining policy enactments in the secondary school. Discourse: Studies in the Cultural Politics of Education, 32(4), 585-596.

Herman, J., Osmundson, E., Dai, Y., Ringstaff, C., & Timms, M. (2015). Investigating the dynamics of formative assessment: Relationships between teacher knowledge, assessment practice and learning. Assessment in Education: Principles, Policy & Practice, 22(3), 344-367.

Lu, L. T., Shen, X., & Liang, W. (2018). The composition and characteristics of practical knowledge of district and prefectural level pedagogical research officer: An example of district and prefectural level pedagogical research officer in Beijing. Teacher Education Research (教师教育研究), 30(06), 112-118.

Ministry of Education. (2022). Curriculum standards for morality and the rule of law in compulsory education. Beijing: Beijing Normal University Press.

Perryman, J., Ball, S. J., Braun, A., & Maguire, M. (2017). Translating policy: Governmentality and the reflective teacher. Journal of Education Policy, 32(6), 745-756.

Poole, A. (2016). ‘Complex teaching realities’ and ‘deep rooted cultural traditions’: Barriers to the implementation and internalisation of formative assessment in China. Cogent Education, 3(1), 1-14.

Richardson, M. (2010). Assessing the assessment of citizenship. Research Papers in Education, 25(4), 457-478.

Skerritt, C., McNamara, G., Quinn, I., O’Hara, J., & Brown, M. (2021). Middle leaders as policy translators: Prime actors in the enactment of policy. Journal of Education Policy, 1-19.

Yan, Z., & Brown, G. T. (2021). Assessment for learning in the Hong Kong assessment reform: A case of policy borrowing. Studies in Educational Evaluation, 68, 100985.


09. Assessment, Evaluation, Testing and Measurement
Paper

Investigation of Careless Responding on Self-Report Measures

Başak Erdem Kara

Anadolu University, Turkiye

Presenting Author: Erdem Kara, Başak

Scores from self-report measures are widely used in many areas of research. Because these instruments allow researchers to measure psychological constructs such as personality, attitudes, beliefs, and emotions for a large number of respondents in a short time, they are widely preferred for data collection (Alarcon & Lee, 2022; Curran, 2015; Ulitzsch et al., 2022). However, serious problems may arise when respondents do not make their best effort to select the response that accurately reflects them, which is especially common among unmotivated respondents (Rios & Soland, 2021; Schroeders et al., 2022). Individuals may respond to items without reading them, misinterpret them, or be unmotivated to think about them (Huang et al., 2012; Ward & Meade, 2022). This type of response behaviour has been described in the literature as random (Beach, 1989), careless (Meade & Craig, 2012), insufficient effort (Huang et al., 2012), or disengaged responding (Soland et al., 2019). In this study, the term ‘careless responding’ (CR) is preferred. CR is a major concern for data obtained from self-report scales in any type of research (Meade & Craig, 2012). Even when the amount is small, it may severely affect data quality and the results of a study. Careless responses may introduce measurement error, weaken the relationships between variables, and inflate the Type II error rate. They may also introduce a new source of construct-irrelevant variance and have an undesirable effect on the psychometric properties of the scale (item difficulty, average scores, test reliability, factor structure, etc.). In short, CR has the potential to weaken the validity of test scores in different ways (Beck et al., 2019; Rios & Soland, 2021).
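
As an illustration of how careless responses can weaken the relationship between variables, consider a simplified case that is not taken from the study itself. Assume a proportion p of respondents answer carelessly, that their scores on two variables X and Y are independent, and that both groups share the same means and variances. The symbols p and \rho are introduced here for illustration only.

```latex
\begin{aligned}
\operatorname{Cov}_{\text{obs}}(X, Y) &= (1 - p)\,\rho\,\sigma_X \sigma_Y + p \cdot 0, \\
\operatorname{Var}_{\text{obs}}(X) &= \sigma_X^2, \qquad \operatorname{Var}_{\text{obs}}(Y) = \sigma_Y^2, \\
r_{\text{obs}} &= \frac{\operatorname{Cov}_{\text{obs}}(X, Y)}{\sigma_X \sigma_Y} = (1 - p)\,\rho .
\end{aligned}
```

Under these assumptions, 10% careless responders would shrink a true correlation of 0.50 to an observed 0.45. The result depends on the assumptions: if careless responders differ from attentive ones in their mean scores, the mixture can also distort correlations in the other direction.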

Given these considerations, CR has become an important research topic of growing interest. One of the most important questions in CR research is how careless responses can be detected and handled to ensure the quality of survey data. Identifying careless responders and removing them from the dataset is one of the suggested ways to increase data quality. The literature describes several data screening methods, which are mainly classified into two groups: a priori and post hoc. A priori methods are planned and incorporated into the data collection process before the survey is administered. In contrast, post hoc methods are applied after data collection. They are implemented on the collected dataset and are typically based on a statistical calculation.

While several studies have focused on the effect of careless responding on datasets and compared the efficacy of CR identification methods, there is still no clear answer about the detection accuracy of these methods (Goldammer et al., 2020). Moreover, this study focuses on a priori methods, which have received less attention in previous studies.

The present study examines three a priori methods (instructed response items, reverse items and self-report items), which are explained in detail in the method section. Each of the three methods will be used to identify careless responders, who will then be removed from the dataset separately for each method, and the effects of this removal on the psychometric properties of the data will be investigated. This study addresses the following research questions:

- How are careless responders distributed across the three CR identification methods?

- How do the psychometric properties of the data (scale mean, reliability, correlation between factors, factor structure, etc.) change when careless respondents are removed according to each CR identification method?


Methodology, Methods, Research Instruments or Sources Used
The purpose of this study is to examine self-report data for careless responding, to investigate the effect of CR on the psychometric properties of the dataset, and to compare the performance of CR identification methods. Three a priori methods will be used for this purpose: instructed response, reverse and self-report items. Instructed response items instruct respondents to select one specific category; those who choose any other option are assumed to be careless. Reverse items are used as attention control items. Individuals are expected to select responses in opposite directions for an item and its reverse. Giving the same or very similar answers is taken as an indicator of CR. Lastly, self-report items directly ask individuals about their effort (e.g. ‘I put forth my best effort in responding to this survey’; Meade & Craig, 2012).
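
The three screening rules described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's procedure: the 5-point coding, the `Response` fields, the `max_gap` and `cutoff` thresholds, and all function names are assumptions made for the example. In particular, the reverse-item rule is implemented by reverse-scoring the added item and flagging large disagreement, so that identical midpoint answers are not counted as careless.

```python
from dataclasses import dataclass

SCALE_MAX = 5  # assumed 5-point Likert coding: 1 = strongly disagree, 5 = strongly agree


@dataclass
class Response:
    instructed: int    # answer to the instructed response item
    item: int          # answer to the original item
    reverse_item: int  # answer to its reverse-worded counterpart
    effort: int        # answer to the self-report effort item


def flag_instructed(r: Response, target: int = SCALE_MAX) -> bool:
    """Careless if the respondent did not pick the instructed category."""
    return r.instructed != target


def flag_reverse(r: Response, max_gap: int = 1) -> bool:
    """Careless if the item and the reverse-scored counterpart differ by more than max_gap."""
    rescored = SCALE_MAX + 1 - r.reverse_item
    return abs(r.item - rescored) > max_gap


def flag_self_report(r: Response, cutoff: int = 3) -> bool:
    """Careless if self-reported effort falls below the assumed cutoff."""
    return r.effort < cutoff


RULES = {
    "instructed": flag_instructed,
    "reverse": flag_reverse,
    "self_report": flag_self_report,
}


def flag_rates(responses: list[Response]) -> dict[str, float]:
    """Proportion of respondents flagged by each rule, computed separately."""
    n = len(responses)
    return {name: sum(rule(r) for r in responses) / n for name, rule in RULES.items()}


# Hypothetical data: (instructed, item, reverse_item, effort)
sample = [
    Response(5, 4, 2, 5),
    Response(3, 5, 5, 4),
    Response(5, 2, 2, 2),
    Response(5, 3, 3, 5),
]
print(flag_rates(sample))
```

Keeping the rules separate mirrors the plan to report the percentage of careless responders for each method on its own rather than combining the flags.
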
In this study, a self-report scale will be used for data collection. A modified version of this instrument will be created by adding one instructed response item (‘Please select ‘strongly agree’ for this item’), one reverse item and one self-report item (‘I did my best while responding to the scale’). The modified form is planned to be administered to approximately 500 students. Individuals who select a response other than the instructed one on the instructed response item will be treated as careless responders. Additionally, a reverse-worded version of one item from the original scale will be added, and individuals who choose the same or similar responses to the item and its reverse will be assumed to be careless responders. Lastly, one self-report item will be included at the end of the scale, and respondents will be classified according to their own answers. The percentage of careless responders will be calculated for each method separately, and the psychometric properties of the full data (scale mean, factor loadings, reliability, explained variance, etc.) will be examined. Careless responders will then be excluded from the dataset according to each of the three methods separately, and the three resulting datasets will be re-examined to see how the psychometric properties (scale means, reliabilities, correlations between factors, etc.) are affected by the removal. Finally, to see which CR identification method performs most efficiently and most improves data quality, the psychometric properties (reliability, factor structure, etc.) of the three resulting datasets will be compared.
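
The before-and-after comparison of reliability can be sketched as follows. The abstract does not name a reliability coefficient, so Cronbach's alpha is used here as an assumption; the function names and the tiny dataset are hypothetical and serve only to show the mechanics.

```python
from statistics import variance


def cronbach_alpha(rows: list[list[int]]) -> float:
    """Cronbach's alpha for a respondents-by-items score matrix.

    Needs at least two respondents, two items, and non-zero total-score variance.
    """
    k = len(rows[0])
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def alpha_after_removal(rows: list[list[int]], flags: list[bool]) -> float:
    """Alpha recomputed after dropping respondents flagged as careless."""
    kept = [row for row, flagged in zip(rows, flags) if not flagged]
    return cronbach_alpha(kept)


# Hypothetical scores: five consistent respondents and one erratic one.
scores = [
    [1, 1, 1],
    [2, 2, 2],
    [3, 3, 3],
    [4, 4, 4],
    [5, 5, 5],
    [5, 1, 3],
]
careless = [False, False, False, False, False, True]

print(round(cronbach_alpha(scores), 3))
print(round(alpha_after_removal(scores, careless), 3))
```

Running the same comparison once per identification method, with that method's flags, gives the three cleaned datasets whose reliabilities can then be set side by side.
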

Conclusions, Expected Outcomes or Findings
The findings of this study are important for researchers and practitioners who use self-report measures for data collection and draw conclusions from those data. Careless responses may produce ‘dirty data’ and may affect results significantly, so screening should be considered as part of data cleaning. In addition, the results will show how efficient the different a priori methods are, and some suggestions will be made on CR identification. I hope that this study will help to fill some gaps in identifying careless responding and in eliminating its effects more effectively.
References
Alarcon, G. M., & Lee, M. A. (2022). The relationship of insufficient effort responding and response styles: An online experiment. Frontiers in Psychology, 12. https://www.frontiersin.org/article/10.3389/fpsyg.2021.784375

Beck, M. F., Albano, A. D., & Smith, W. M. (2019). Person-fit as an index of inattentive responding: A comparison of methods using polytomous survey Data. Applied Psychological Measurement, 43(5), 374–387. https://doi.org/10.1177/0146621618798666

Curran, P. G. (2015). Methods for the detection of carelessly invalid responses in survey data. Journal of Experimental Social Psychology, 66, 4–19. https://doi.org/10.1016/j.jesp.2015.07.006

Goldammer, P., Annen, H., Stöckli, P. L., & Jonas, K. (2020). Careless responding in questionnaire measures: Detection, impact, and remedies. The Leadership Quarterly, 31(4). https://doi.org/10.1016/j.leaqua.2020.101384

Huang, J. L., Curran, P. G., Keeney, J., Poposki, E. M., & DeShon, R. P. (2012). Detecting and deterring insufficient effort responding to surveys. Journal of Business and Psychology, 27(1), 99–114. https://doi.org/10.1007/s10869-011-9231-8

Meade, A. W., & Craig, S. B. (2012). Identifying careless responses in survey data. Psychological Methods, 17(3), 437–455. https://doi.org/10.1037/a0028085

Rios, J. A., & Soland, J. (2021). Parameter estimation accuracy of the effort-moderated item response theory model under multiple assumption violations. Educational and Psychological Measurement.
 
Schroeders, U., Schmidt, C., & Gnambs, T. (2022). Detecting careless responding in survey data using stochastic gradient boosting. Educational and Psychological Measurement, 82(1), 29–56. https://doi.org/10.1177/00131644211004708

Ward, M. K., & Meade, A. W. (2022). Dealing with careless responding in survey data: prevention, identification, and recommended best practices. Annual Review of Psychology, 74(1). https://doi.org/10.1146/annurev-psych-040422-045007

Ulitzsch, E., Yildirim-Erbasli, S. N., Gorgun, G., & Bulut, O. (2022). An explanatory mixture IRT model for careless and insufficient effort responding in self-report measures.


09. Assessment, Evaluation, Testing and Measurement
Paper

Shaping and Inspiring a Fair Thinking in assessment. A research with pre-service and in-service teachers

Debora Aquario1, Norberto Boggino2, Elisabetta Ghedin1, Juan Gonzalez Martinez3, Griselda Guarnieri2, Teresa Maria Sgaramella1

1University of Padova, Italy; 2Universidad Nacional de Rosario, Argentina; 3Universitat de Girona, Spain

Presenting Author: Aquario, Debora; Ghedin, Elisabetta

How can we activate and create assessment systems that lead to a flourishing school where everyone is able to fulfil their potential and achieve both success and well-being? How might we shift assessment practices toward equity and justice/fairness? How do the assessment methods meet the diversity of the students? The research project SHIFT (Shaping and Inspiring a Fair Thinking in assessment) aims to investigate how a range of emerging trends within the international community can be used to answer these questions. These trends concern the literature on: (1) human capabilities (Sen, 1999) as a framework for ‘social justice’, (2) Assessment for Learning (Swaffield, 2011) as the horizon for understanding assessment, (3) Universal Design for Assessment (CAST, 2011) as the philosophy that attempts to go beyond the ‘model of adjustment’ and (4) such approaches as: fair and equitable assessment (Tierney, 2013; Montenegro & Jankowski, 2020), culturally-responsive assessment (Nortvedt et al., 2020), inclusive and universal assessment (Waterfield & West, 2006; Nieminen, 2022; Tai et al., 2021).

An increased focus on equity and justice emerges from the 2030 Agenda for Sustainable Development, where the commitment is to provide inclusive and equitable quality education at all levels, as well as from other European and international documents (OECD, 2005, 2012; UNESCO, 2015, 2022). The same concern is evident in empirical studies focused mainly on higher education contexts (Nieminen, 2022; Tai et al., 2021), and as more and more assessment researchers and practitioners engage with the equity conversation, a desire to consider these issues in the school context as well has become evident. Moreover, most assessment research is based on what can be described as a ‘technical perspective’, looking at whether assessment is efficient, reliable and valid, leaving less space for a ‘humanistic’ perspective that highlights assessment as a means to foster learning for human flourishing and for responsibility toward and within society (Swaffield, 2011; Fuller, 2012; Gergen & Gill, 2020; Hadji, 2021).

Dialogue about the paradigms of assessment is of paramount importance if assessment is to embrace a focus on equity, ethics, and humanization and meet the challenges of these times. The paradigm shift was initiated many years ago, moving from assessment of learning towards assessment for learning, with greater attention to the role of learners (opening the way to participatory approaches connecting school and community), a shift from product-focused to process-focused assessment, and a view of learning as a lifelong process rather than something done to prepare for an exam. Although these changes have been partially incorporated into the debate about educational assessment, work remains to be done to ensure the necessary attention to the issue of diversity among learners. Such an approach would strengthen the value of the shift and enlarge the potential of the assessment process to promote all students’ learning and growth, moving away from a model of adjustments, which makes specific reasonable accommodations for some students, towards assessment models that allow all students to fully participate and learn in the most equitable way.

Consistent with the theoretical framework, the research design addresses the importance of engagement, participation and opportunities for access, choosing a community-based approach combined with an appreciative one and seeking to produce a new imaginary for approaches to assessment, with implications for both cultures and practices. The aim of the Programme is to connect assessment with justice and equity through a participatory process sustaining the shift towards fair thinking in assessment.


Methodology, Methods, Research Instruments or Sources Used
SHIFT intends to give value to bottom-up research practices articulating the logic and the flow of Community-Based Participatory Research (CBPR) and Appreciative Inquiry as follows. At this point, steps 1 and 2 have been implemented.
STEP 1: Discovering & identifying the community: implemented through an initial phase consisting of different Public Engagement activities (PEa). The following activities have been carried out to raise awareness and mutual understanding (by establishing a common language and participatory multi-actor dialogues) and to promote and develop a shared assessment literacy about inclusiveness and diversity: an open webinar day devoted to the discussion of the main topics about accessibility and equity; a website; the identification of a logo for the project; initiatives specifically oriented to schools (‘dialoghi pedagogici’); a blog about the research keywords.
PEa provided the ground on which a Call to Action (CtA) was opened as a strategy for discovering engagement. It was addressed to schools of all educational levels in order to collect concerns about the key issues and to co-construct the research community, composed of a network of 7 schools (from early childhood level to middle school level). Moreover, a group of 250 prospective teachers was involved in the research with the aim of exploring the same issues with pre-service teachers and identifying different understandings and purposes of assessment in the two groups (Brown & Remesal, 2012).
STEP 2: Dreaming & Co-creating shared images of a preferred future. Based on the results of the CtA, step 2 invites participants to begin “envisioning together”: asking themselves “what might be?” and imagining how things might work well in the future. The aim is therefore to collaborate in identifying and fostering the capacity to aspire to and imagine possible future actions.
The instruments used are the following: panel discussions with teachers (one for each of the 7 schools, for a total of 60 teachers from early childhood to secondary school level); 4 panel discussions with 50 students enrolled in teacher education programs from 5 different countries (UK, Turkey, Lithuania, Netherlands and Portugal); and written interviews administered to 200 students enrolled in the teacher education program at the University of Padova. These questions guided the reflection: How might we shift assessment practices toward equity and accessibility? How might we assess for the learning and growth of all students? How do assessment methods meet the diversity of the students?

Conclusions, Expected Outcomes or Findings
First analyses from the panel discussions and written interviews with pre-service and in-service teachers show that assessment is fair when the following aspects become relevant: assessment as an integral part of the teaching strategy (use of assessment by the teacher to make teaching decisions and the consequent need to consider changes in teaching and assessment jointly and reciprocally); the “pedagogical” process as distinct from the “administrative” process (the process of reflection and use of evaluation criteria not confused with the attribution of the mark/grading); communication (need to pay attention to the communication moments, both in progress and final, with the students and their families); and the need to act for change in the field of assessment by working on all dimensions (student, family, teachers, school head) in a parallel and integrated way. Differences in narratives by pre- and in-service teachers will be presented.
Next steps 3 and 4 concern the design of innovative ways to bring into existence the preferred future participants have envisioned in the Dream step through the use of participatory videos (Boni et al., 2020) and the realisation of a narrative storytelling process for sustaining the change in assessment culture and practice towards fair assessment.
Expected final outcomes consist of:
- an open digital toolkit for a fair assessment: flexible/modular in its structure/implementation, accessible and usable. Examples of the toolkit contents: a guide for a universal approach to assessment; guidelines (with different resources, tips and examples) for designing accessible and universal assessments; good practices of fair assessment.
- multimedia open educational resources (OER, Wiley, 2006) with the aim to offer an open learning path for all those who want to be self-trained in the research topics (audiovisual material, references and readings, simulations, workshops, guides about assessment contents, multimedia resources from public engagement activities and participatory videos).

References
Aquario D. (2021). Through the lens of justice. A systematic review on equity and fairness in learning assessment. Education Sciences & Society, 2, 96-110.
Brown G. T. L., & Remesal A. (2012). Prospective teachers’ conceptions of assessment: A cross- cultural comparison. The Spanish Journal of Psychology, 15(1), 75–89.
Bushe G. R. (2012). Foundations of Appreciative Inquiry: History, Criticism, and Potential. AI Practitioner: The International Journal of AI Best Practice, 14 (1), 8-20.
Center for Applied Special Technology (2011). Universal Design for Learning (UDL). Guidelines version 2.0, CAST, Wakefield.
Hanesworth P., Bracken S. and Elkington S. (2019). A typology for a social justice approach to assessment: Learning from universal design and culturally sustaining pedagogy. Teaching in Higher Education, 24 (1), 98-114.
Heritage M., & Wylie C. (2018). Reaping the benefits of assessment for learning: achievement, identity, and equity. ZDM, 50 (4), 729–741.
Klenowski V. (2014). Towards fairer assessment. Australian Educational Researcher, 41, 445–470.
Levy, J., & Heiser, C. (2018). Inclusive assessment practice (Equity Response). Urbana, IL: University of Illinois and Indiana University, NILOA.
McArthur J. (2016). Assessment for social justice: the role of assessment in achieving social justice. Assessment & Evaluation in Higher Education, 41, 7, 967-981.
Montenegro E., & Jankowski N. A. (2020). A new decade for assessment: Embedding equity into assessment praxis. Urbana, IL: University of Illinois and Indiana University, NILOA.
Murillo F. J., Hidalgo N. (2017). Students’ conceptions about a fair assessment of their learning. Studies in Educational Evaluation, 53, 10-16.
Nortvedt, G.A., Wiese, E., Brown, M. et al. (2020). Aiding culturally responsive assessment in schools in a globalising world. Educational Assessment, Evaluation and Accountability, 32, 5–27.
Scott S., Webber C. F., Lupart J. L., Aitken N. and Scott D. E. (2014). Fair and equitable assessment practices for all students. Assessment in Education: Principles, Policy & Practice, 21,1, 52-70.
Sen A. (1999). Development as freedom. Oxford: Oxford University Press.
Stobart G. (2005). Fairness in multicultural assessment systems. Assessment in Education, 12, 3, 275–287.
Swaffield S. & Williams M. (Eds.) (2008). Unlocking assessment: Understanding for reflection and application. London: David Fulton.
Tierney R.D. (2014). Fairness as a multifaceted quality in classroom assessment. Studies in Educational Evaluation, 43, 55-69.
Zhang, J., Takacs, S., Truong L., Smulders, D., Lee, H. (2021). Assessment Design: Perspectives and Examples Informed by Universal Design for Learning. Centre for Teaching, Learning, and Innovation. Justice Institute of British Columbia.
 
Date: Friday, 25/Aug/2023
9:00am - 10:30am 09 SES 14 B: Exploring Factors Influencing Motivation, Engagement, and Attitudes in Education
Location: Gilbert Scott, 253 [Floor 2]
Session Chair: Trude Nilsen
Paper Session
 
09. Assessment, Evaluation, Testing and Measurement
Paper

The Relationship Between Parental Mindsets and Children's Motivation in Mathematics

Cecilia Thorsen1, Kajsa Yang Hansen1,2

1University West, Sweden; 2University of Gothenburg, Sweden

Presenting Author: Thorsen, Cecilia; Yang Hansen, Kajsa

Mathematics is perceived by many students as a particularly difficult subject, and many tend to experience higher levels of anxiety in relation to mathematics compared to other subjects (Goetz et al., 2007). At the same time, mathematical competencies are fundamental to several aspects of contemporary society (OECD, 2013). Fostering motivation is therefore important for supporting students who experience difficulties in mathematics, especially since motivation is a driving force for learning mathematics over time (Wigfield et al., 2016). A number of studies have shown a positive relationship between motivation and achievement in mathematics, regardless of theoretical approach (e.g., Kriegbaum et al., 2018; Prast et al., 2018). Students who are motivated also tend to engage more in mathematical activities because they find them enjoyable and interesting (Eccles & Wigfield, 2004), and the development of motivation for mathematics during elementary school is related to the choice of mathematics-intensive careers (Musu-Gillette et al., 2015).

One of the most important theories of motivation for mathematics is the Expectancy Value Theory of Motivation (EVM) proposed by Eccles, Wigfield, and colleagues. According to EVM, motivation is a function of a person's expectancy of success and the value they place on the task. Expectancy of success refers to a person's belief in his/her own ability to successfully complete a task, and value refers to the importance or relevance of the task to the person's goals or interests. Students with a higher expectancy of success and a higher value placed on mathematics tend to have higher motivation and achievement in mathematics (Wigfield et al., 2016).

Another theory relevant to motivation is Dweck's (1995) theory of implicit intelligence. The theory states that individuals can have implicit beliefs about the nature of intelligence that can be either fixed or malleable. People with a fixed mindset believe that intelligence is not changeable, whereas people with a growth mindset believe that intelligence can be developed through effort and learning. Students' implicit beliefs about intelligence are related to both academic achievement and motivation (e.g., Song et al., 2022), implying that students with a growth mindset tend to show more adaptive academic outcomes, such as higher motivation and achievement, than those with a fixed mindset (Yeager and Dweck, 2012).

Wigfield et al. (2004) hypothesized that Dweck's theory of implicit intelligence is related to EVM in that individuals who believe their abilities cannot be improved through effort will not engage in activities they believe they are not very good at. However, few studies have examined how such motivational beliefs are formed in children. Eccles and Wigfield (2020) proposed in their situated expectancy-value theory (SEVM) that beliefs and values are also shaped by social context, such as family, peers, and culture. In a study of how parental beliefs about fixedness of ability affect interactions with their children, Muenks et al. (2015) found that parents with fixed mindsets engaged in more controlling and achievement-oriented behaviors and were less likely to engage in math-related activities with their children. Although few studies have examined how parents' mindset affects their children's motivation, a study by Song et al. (2022) showed that children reported greater persistence when their parents had more of a growth mindset. Xie et al. (2022) also found that parents' mindset indirectly predicted math anxiety through their failure beliefs.

Thus, the present study aims to investigate the role of parents' beliefs about mathematical ability, i.e., their fixed or growth mindset, in fostering student motivation. Specifically, we focus on parents' beliefs of mathematical ability as innate or malleable, and whether and how parents' mindsets affect students' self-concepts about their ability, value, and achievement of mathematics.


Methodology, Methods, Research Instruments or Sources Used
Participants
Participants were about 600 elementary school students in grades 3 and 4 and their parents. Both children and parents participated in a larger study examining the development of motivation for mathematics in elementary school classrooms. Parental informed consent was obtained for each student participating in the study.

Instrument and procedures
Motivation was assessed using an instrument based on the Expectancy-Value Motivation Scale (EVMS), which included a total of 34 items in five dimensions: Competence Self-beliefs (6 items, e.g., Math is easy for me), Intrinsic Value (8 items, e.g., I like doing math), Achievement Value (7 items, e.g., Being good at math is very important to me personally), Utility Value (7 items, e.g., What I learn in math I can use in my daily life), and Cost (6 items, e.g., Doing math problems keeps me from doing other things I like). All items were answered on a 4-point scale ranging from 'a lot of times' to 'never'. In a validation study, the scale was found to be appropriate for early elementary grades and to have a good model fit consistent with expectancy-value theory. The different EVS dimensions also showed good reliability (Peixoto et al., 2022).
Parents' mindset was measured by eight items on their beliefs about mathematical ability as innate or malleable. Four items were used to measure fixed mindset (e.g., Math ability is innate) and four items were used to measure growth mindset (e.g., A child's ability in math can be improved with practice). Responses were given on a 4-point scale ranging from 'disagree' to 'agree'. Socioeconomic background was measured by parental education level.
The instruments were developed in English and translated into Swedish. Translation and back-translation procedures were used, and no discrepancies were found. The EVMS instrument was distributed in grades 3 and 4 in Sweden in spring 2022 as part of a larger study. Administration was done at school by trained research assistants using pen and paper questionnaires. Parents received a QR code and answered a digital questionnaire.
Analytic Method
Structural equation modelling (SEM) will be used to examine the relationship between parents' mindset and children's self-concept of ability, values, and achievement in mathematics. A path model will be estimated to examine the mechanisms between parents' fixed or growth mindset and children's self-concept of ability, value of mathematics, and achievement in mathematics according to the SEVM model of Eccles and Wigfield (2020).
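As an illustration only, the simplest mediation structure such a path model could contain can be written as two regression equations. The symbols are assumptions introduced here and do not come from the abstract: X denotes parents' fixed mindset, M the child's competence self-belief, and Y mathematics achievement. The model the authors will estimate may include further variables and paths.

```latex
\begin{align}
  M_i &= a\,X_i + \varepsilon_{M,i} \\
  Y_i &= c'\,X_i + b\,M_i + \varepsilon_{Y,i} \\
  \text{indirect effect} &= a\,b, \qquad
  \text{total effect} = c' + a\,b
\end{align}
```

In this sketch, a nonzero product ab would correspond to an effect of parents' mindset on achievement that runs through the child's competence self-belief, while c' captures any remaining direct effect.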

Conclusions, Expected Outcomes or Findings
It is expected that parents’ fixed intelligence beliefs will negatively affect their children’s competence self-beliefs, which in turn will affect attainment value, intrinsic value, and achievement. However, it is also possible that parents’ mindset directly affects achievement. Based on the findings of Song et al. (2022), it is also expected that the effect of parents' mindset is partially mediated by their socioeconomic background, implying that parents with lower socioeconomic backgrounds are more likely to have a fixed mindset.
References
Dweck, C. S., Chiu, C.-y., & Hong, Y.-y. (1995). Implicit theories and their role in judgments and reactions: A world from two perspectives. Psychological Inquiry, 6(4), 267–285. https://doi.org/10.1207/s15327965pli0604_1
Eccles, J. S., & Wigfield, A. (2020). From expectancy-value theory to situated expectancy-value theory: A developmental, social cognitive, and sociocultural perspective on motivation, Contemporary Educational Psychology, 61. https://doi.org/10.1016/j.cedpsych.2020.101859.
Goetz, T., Frenzel, A. C., Pekrun, R., Hall, N. C., & Lüdtke, O. (2007). Between- and within-domain relations of students' academic emotions. Journal of Educational Psychology, 99(4), 715–733. https://doi.org/10.1037/0022-0663.99.4.715
Kriegbaum, K., Becker, N., & Spinath, B. (2018). The relative importance of intelligence and motivation as predictors of school achievement: A meta-analysis. Educational Research Review, 25, 120-148.
Musu-Gillette, L.E., Wigfield, A., Harring, J.R., & Eccles, J.S. (2015). Trajectories of change in students’ self-concepts of ability and values in math and college major choice. Educational Research and Evaluation, 21(4), 343-370. https://doi.org/10.1080/13803611.2015.1057161
OECD (2013). PISA 2012 Assessment and Analytical Framework: Mathematics, Reading, Science, Problem Solving and Financial Literacy, Paris: OECD Publishing.
Peixoto, F., Radišić, J., Krstić, K., Hansen, K. Y., Laine, A., Baucal, A., Sõrmus, M., & Mata, L. (2022). Contribution to the Validation of the Expectancy-Value Scale for Primary School Students. Journal of Psychoeducational Assessment, 0(0). https://doi.org/10.1177/07342829221144868
Prast, E., Van de Weijer-Bergsma, E., Miočević, M., Kroesbergen, E., & Van Luit, J. (2018). Relations between mathematics achievement and motivation in students of diverse achievement levels. Contemporary Educational Psychology, 55, 84-96.
Song, Y., Barger, M. M., & Bub, K. L. (2022). The Association Between Parents’ Growth Mindset and Children’s Persistence and Academic Skills. Front. Educ, 6. https://doi.org/10.3389/feduc.2021.791652
Wigfield, A., Tonks, S., & Eccles, J. S. (2004). Expectancy value theory in cross-cultural perspective. In D. M. McInerney & S. Van Etten (Eds.), Big theories revisited (pp. 165-198). Greenwich, CT: Information Age Publishing.
Wigfield, A., Tonks, S., & Klauda, S. L. (2016). Expectancy-value theory. In K. R. Wentzel & A. Wigfield (Eds.), Handbook on motivation in school (2nd ed., pp. 55–76). New York: Routledge.
Xie, F., Duan, X.F., Ni, X.L., Li, L.N., & Zhang, L.B. (2022). The Impact of Parents’ Intelligence Mindset on Math Anxiety of Boys and Girls and the Role of Parents’ Failure Beliefs and Evaluation of Child’s Math Performance as Mediators. Front. Psychol, 13.  https://doi.org/10.3389/fpsyg.2022.687136
Yeager, D. S., & Dweck, C. S. (2012) Mindsets That Promote Resilience: When Students Believe That Personal Characteristics Can Be Developed, Educational Psychologist, 47(4), 302-314. https://doi.org/10.1080/00461520.2012.722805


09. Assessment, Evaluation, Testing and Measurement
Paper

Religiosity and Expected Political Engagement in the Future Among Lower-Secondary Students in 10 European Countries

Wolfram Schulz, John Ainley

ACER, Australia

Presenting Author: Schulz, Wolfram

Using data from the first two cycles of ICCS in 2009 and 2016, this paper analyses the relationship between expected political engagement and affiliation and engagement with religion as well as attitudes toward the influence of religion in society among lower-secondary students in 10 European countries. It reviews changes over time as well as associations between indicators of religious attachment among young people and indicators of intended political engagement in the future. The databases provided by ICCS offer an excellent opportunity to investigate the links between religious affiliation and beliefs among young people as motivating factors driving expected individual engagement in society.

Religion has been identified as an important influence on civic participation and engagement (see Pancer, 2015; Putnam & Campbell, 2010; Verba, Schlozman, & Brady, 1995) and research findings suggest that religious affiliation has an impact on political and social engagement among adults (see Ekström & Kvalem, 2013; Guo, Webb, Abzug, & Peck, 2013; Perks & Haan, 2011; Verba et al., 1995). Similar observations have also been recently reported based on comparative international surveys across different countries (Pew Research Center, 2019). It has been argued that religious organizations provide networks focused on political recruitment and motivation while participation in religion encourages adherents to consider features of society (a world view) that they see as desirable (Campbell, 2004; Jones-Correa & Leal, 2001; Putnam & Campbell, 2010).

Pancer (2015) presented some evidence that schools and neighborhoods may contribute to both civic engagement and religious formation among adolescents. Vermeer (2010) viewed religious education at schools as a contributor to socializing young people in ways that had civic value while Francis et al. (2015) regarded church attendance and education about religion at school as factors that nurture tolerance in a religiously diverse society. In this sense engagement with religion could also be viewed as an important part of a broader civic engagement.

Research also suggested that, even after controlling for other variables, religious tradition and attendance of religious services tend to be related to indicators of civil participation (Smidt, 1999; Storm, 2015). However, other studies have also reported negative effects of religious affiliation on democratic citizenship as manifested in lower levels of political knowledge and lack of political efficacy among strongly religious people (Scheufele, Nisbet, & Brossard, 2003). Research among US adolescents (Porter, 2013) indicates that moral identity may be positively associated with voluntary service and expressive-political involvement but negatively related to traditional-political involvement. Findings from ICCS showed that lower-secondary students with higher levels of civic knowledge were less likely to endorse religious influence in society (Schulz & Ainley, 2017; Schulz et al., 2018). Results also showed that in most countries students who attended religious services held more positive attitudes towards the desirability of religious influence on society (Schulz et al., 2010 & 2018; Schulz & Ainley, 2017).

The relationship between religious attachment and civic engagement is a phenomenon that has frequently been highlighted in other studies. This paper provides evidence about changes in religious affiliation and attitudes toward the importance of religion for society between 2009 and 2016. Further, the paper explores how these variables relate to expected participation in the future while also considering the context of the general status of religion in each participating country. Using data from an optional component of the ICCS student questionnaire, this paper investigates the extent to which lower-secondary students from 10 European countries in 2009 and 2016 were attached to a religion and endorsed its influence on society, and the extent to which their engagement with religion was related to their expected future participation.


Methodology, Methods, Research Instruments or Sources Used
The first two cycles of the International Civic and Citizenship Education Study (ICCS 2009 and 2016) have provided a data set with unique possibilities for comparative analyses of civic-related learning outcomes (Schulz et al., 2010 & 2018). In both cycles the student questionnaire included an international option on religious affiliation and engagement, as well as on attitudes toward the influence of religion in society, that was administered in a majority of participating countries.
Data from 10 European countries that participated in ICCS 2016, met IEA sampling participation standards and implemented the international option regarding religion, are included in the analyses undertaken for this paper. Further, five of these countries also participated in the corresponding option in ICCS 2009 and provide data for reviewing changes over time. As ICCS employed two-stage cluster sampling procedures, the jackknife repeated replication technique (JRR) was used for all analyses to obtain appropriate sampling errors for population means, percentages, regression coefficients, and any other population estimates.
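The replication idea behind JRR can be sketched as follows for a weighted mean. This is a minimal illustration, assuming the paired-zone scheme commonly described for IEA studies: schools are paired into zones, and in each replicate one school of a zone has its weights doubled while the other is set to zero. The function and argument names (`zones`, `replicate_flags`) are hypothetical and do not refer to variables in the ICCS database.

```python
def weighted_mean(values, weights):
    """Weighted mean of `values` using sampling weights `weights`."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight


def jrr_standard_error(values, weights, zones, replicate_flags):
    """Return (estimate, standard error) for a weighted mean using JRR.

    zones[i]           -- jackknife zone of student i's school (assumed given)
    replicate_flags[i] -- 1 if the school's weight is doubled in its zone's
                          replicate, 0 if it is set to zero (assumed given)
    """
    full_estimate = weighted_mean(values, weights)
    variance = 0.0
    for zone in sorted(set(zones)):
        # Only the weights inside the current zone are modified.
        replicate_weights = [
            w * (2.0 * flag) if z == zone else w
            for w, z, flag in zip(weights, zones, replicate_flags)
        ]
        replicate_estimate = weighted_mean(values, replicate_weights)
        variance += (replicate_estimate - full_estimate) ** 2
    return full_estimate, variance ** 0.5


# Toy data: four students from four schools paired into two zones.
toy_values = [1.0, 2.0, 3.0, 4.0]
toy_weights = [1.0, 1.0, 1.0, 1.0]
toy_zones = [1, 1, 2, 2]
toy_flags = [1, 0, 1, 0]
toy_mean, toy_se = jrr_standard_error(toy_values, toy_weights, toy_zones, toy_flags)
```

The same loop applies to percentages or regression coefficients by replacing `weighted_mean` with the statistic of interest.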
This paper will include a descriptive analysis of the extent of students’ religious attachment and their attitudes toward religious influence as well as changes between 2009 and 2016. Further, it will present results from path models that predict two forms of expected political engagement in the future: electoral (e.g. becoming informed and voting in elections) and active political participation (e.g. joining political organisations, campaigning and being a candidate). The model will include as predictor variables student characteristics (gender, religious affiliation), context variables (socioeconomic background, community size, students’ attendance of religious services), student attitudes (trust in civic institutions, citizenship self-efficacy) as well as school-related variables (student’s civic participation at school, civic knowledge). In this model, endorsement of religious influence in society will be treated both as a dependent variable and as a predictor variable for intended political participation.
To reduce the complexity of estimating this model across many countries, the path model is based mainly on manifest indicators. As civic knowledge is represented by five plausible values, a multiple-imputation procedure is applied to account for its measurement error. In the case of variables that represent latent variables, we used the IRT scales without incorporating the measurement model for each latent factor in this model. Models were estimated for each national sample separately and average results with their corresponding standard errors were also computed to provide findings at the level of the combined study.
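How estimates based on five plausible values can be combined is sketched below using the standard multiple-imputation combining rules (Rubin's rules). This is an illustration under assumptions, not the authors' code: the function name and the example numbers are invented, and the abstract does not state which exact variant of the combining rules was applied.

```python
def pool_plausible_values(estimates, sampling_variances):
    """Combine one estimate per plausible value into a pooled estimate.

    estimates          -- the same coefficient estimated once per plausible value
    sampling_variances -- the sampling variance of each of those estimates
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    pooled = sum(estimates) / m
    # Average sampling variance across plausible values.
    within = sum(sampling_variances) / m
    # Variance between plausible values reflects measurement uncertainty.
    between = sum((e - pooled) ** 2 for e in estimates) / (m - 1)
    total_variance = within + (1 + 1 / m) * between
    return pooled, total_variance ** 0.5


# Invented numbers for one path coefficient estimated with five plausible values.
example_estimates = [0.10, 0.12, 0.11, 0.09, 0.13]
example_variances = [0.0004] * 5
pooled_estimate, pooled_se = pool_plausible_values(example_estimates, example_variances)
```

The pooled standard error is larger than the sampling error of any single run because it also carries the variation between plausible values.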

Conclusions, Expected Outcomes or Findings
When looking at the extent of religious affiliation, engagement and endorsement of religious influence in society as well as at change between the first two cycles of ICCS in 2009 and 2016, there were considerable differences across participating countries. In some national contexts, majorities of students saw themselves as part of a religion and reported attending religious services at least once a month, while in other countries fewer than half of the young people identified with a religion. Results from comparisons across the first two cycles suggest slight decreases in religious affiliation and endorsement of religious influence across countries that participated in both cycles.
The results show that, after controlling for other factors, endorsement of religious influence in society was strongly related to religious affiliation, as well as to religious service attendance, and reported participation in a religious group. Endorsement of religious influence on society was associated with religious background and also appeared to be higher in countries with greater religiosity. However, knowledge and understanding of civic principles and practices was negatively related to endorsement of religious influence on society.  
There were no consistent associations between expected electoral participation and religiosity. However, expected active political participation appeared to be related to religious affiliation in almost half of the European countries that participated in ICCS 2016. In some countries, there were also weak but significant associations between religious group participation and expected active political participation.
Results also show that endorsement of religious influence in society was related to expected active political participation to a small but consistent extent. This suggests a transmitted influence of religious background on endorsement of the influence of religion in society through to expected active political participation. However, there was no evidence that endorsement of religious influence in society was related to expected electoral participation.

References
Campbell, D. E. (2004). Acts of Faith: Churches and Political Engagement. Political Behavior, 26:2, 155-180.
Ekström, G., & Kvalem, T. A. (2013). Religion and Youths’ Political Engagement: A Quantitative Approach (thesis). Göteborg University: School of Business, Economics and Law.
Francis, L., Pyke, A., & Penny, G. (2015). Christian affiliation, Christian practice, and attitudes to religious diversity: A quantitative analysis among 13- to 15-year-old female students in the UK. Journal of Contemporary Religion, 30 (2), 249-263.
Guo, C., Webb, N., Abzug, R., & Peck, L. (2013). Religious affiliation, religious attendance, and participation in social change organizations. Nonprofit and Voluntary Sector Quarterly, 42(1), 34-58.
Jones-Correa, M., & Leal, D. L. (2001). Political Participation: Does Religion Matter? Political Research Quarterly, 54:4, 751-770.
Pancer, S. M. (2015). The psychology of citizenship and civic engagement. Oxford: Oxford University Press.
Perks T, & Haan M. (2011). Youth religious involvement and adult community participation: Do levels of youth religious involvement matter? Nonprofit and Voluntary Sector Quarterly, 40(1), 107-129.
Pew Research Center (2019). Religion’s Relationship to Happiness, Civic Engagement and Health around the World.
Porter, T. J. (2013). Moral and political identity and civic involvement in adolescents. Journal of Moral Education, 42 (2), 239-255.  
Scheufele, D. A., Nisbet, M. C., & Brossard, D. (2003). Pathways to Political Participation: Religion, Communication Contexts and Mass Media. International Journal of Public Opinion Research, 15:3, 300-324.
Schulz, W., & Ainley, J. (2017). Religious engagement, attitudes toward religion and society, and expected future political participation among young people. Paper prepared for the 76th IEA International Research Conference in Prague, 28-30 June.
Schulz, W., Ainley, J., Fraillon, J., Kerr, D. & Losito, B. (2010). ICCS 2009 International Report. Civic knowledge, attitudes and engagement among lower secondary school students in thirty-eight countries. Amsterdam: IEA.
Schulz, W., Ainley, J., Fraillon, J., Losito, B., Agrusti, G., Friedman, T. (2018). Becoming Citizens in a Changing World. IEA International Civic and Citizenship Education Study 2016 International Report. Cham: Springer.
Smidt, C. (1999). Religion and civic engagement: A comparative analysis. The ANNALS of the American Academy of Political and Social Science. 565 (1), 176-192.
Storm, I. (2015). Religion, inclusive individualism, and volunteering in Europe. Journal of Contemporary Religion, 30 (2), 213-229. https://doi.org/10.1080/13537903.2015.1025542
Verba, S., Schlozman, K. L., & Brady, H. E. (1995). Voice and equality: Civic voluntarism in American politics. Cambridge, MA: Harvard University Press.
Vermeer, P. (2010). Religious education and socialization. Religious Education, 105 (1), 103-116.


09. Assessment, Evaluation, Testing and Measurement
Paper

Assessing Students’ Views About Scientific Inquiry in Sweden: A Cross-sectional Study from Primary School to Upper Secondary School

Zeynep Ünsal1, Jakob Gyllenpalm1, Carl-Johan Rundgren1, Karina Adbo2, Clara Vidal Carulla3

1Stockholm University, Sweden; 2Malmö University, Sweden; 3University of Gothenburg, Sweden

Presenting Author: Ünsal, Zeynep

Scientific Inquiry (SI) is one of the overarching goals for science education all over the world (Abd-El-Khalick et al., 2004). An understanding of SI is fundamental to scientific literacy, and involves combining content knowledge, process skills and an understanding of the processes and methods scientists use to generate new knowledge (Lederman et al., 2014). This study focuses on the last of these, which is described as learning about scientific inquiry (Hodson, 1996) and is currently often discussed in terms of learning about scientific practices (Osborne, 2014). However, this learning goal is often obscured due to the conflation between SI as a pedagogical strategy and as a content matter (Gyllenpalm & Wickman, 2011; Lunde et al., 2015). Both teaching and research have generally focused on SI as either a pedagogical strategy to learn science, or on students’ abilities to conduct scientific investigations. One reason for this is the tacit assumption that students automatically learn about scientific inquiry simply by doing inquiry. Yet, this assumption has long been challenged by a large body of research which demonstrates the need for explicit instruction about scientific inquiry as content knowledge (Lederman et al., 2019). Another problem has been the lack of valid instruments for meaningful assessment of students’ understanding about SI (Lederman et al., 2014). We address these issues by using the VASI questionnaire (Views About Scientific Inquiry) developed for this purpose and present findings from Sweden in primary, middle and secondary school. The data is a subset of a larger international project (see e.g. Lederman et al., 2019) but we focus the analysis on the progression of students’ knowledge over time in a cross-sectional study design.

In Sweden the science curriculum is specified for the school years 1-3, 4-6, 7-9 and 10-12, and divided into the subjects physics, chemistry and biology from year 1. Students begin learning about scientific inquiry in all science subjects already from the first year. A progression in students’ knowledge is then expected as the central content related to SI successively becomes more advanced in later school years (The Swedish National Agency for Education, 2022). Despite this focus on learning about SI in the curriculum, explicit teaching about SI seems to be rare in Sweden. Yet, practical activities where students are engaged in some form of scientific inquiry have a long tradition, although these are often used as a pedagogical strategy for other educational goals (Högström et al., 2012; Lunde et al., 2015).

The purpose of this study is to contribute to an increased understanding of students’ views of SI and how they can develop over time in order to better understand how this important topic can be addressed by teachers, curriculum developers, national test designers and textbook authors. In particular, the study examines the following question:

What are students’ views about scientific inquiry in Sweden in primary, middle and upper secondary school?


Methodology, Methods, Research Instruments or Sources Used
To assess students’ views of SI, Lederman et al. (2014; 2019; 2022) have developed the VASI-E (primary school) and VASI (middle and secondary school) questionnaires. Both instruments are based on aspects of SI about which there is general agreement, and that are both possible and relevant for school children to learn. These are:

(1) Scientific investigations all begin with a question and do not necessarily test a hypothesis.
(2) There is no single set or sequence of steps followed in all investigations (i.e., there is no single scientific method).
(3) All scientists performing the same procedures may not get the same results.
(4) Inquiry procedures can influence results.
(5) Research conclusions must be consistent with the data collected.
(6) Inquiry procedures are guided by the question asked.
(7) Scientific data are not the same as scientific evidence.
(8) Explanations are developed from a combination of collected data and what is already known.

The VASI-E excludes aspects 3, 4 and 7 and simplifies the remaining five. The aspects are contextualized in the instrument with age-appropriate examples.

Data consists of 481 questionnaires and 65 interviews. The VASI-E was used at the end of the 3rd grade (N=110) and the beginning of the 4th grade (N=100) in seven primary schools respectively. The VASI was used at the beginning of 7th grade (N=126) and at the end of 12th grade (N=145) in five schools respectively. Coding was initiated by reaching consensus for a sample of five questionnaires in each grade level. Each student was given a code of No Answer, Naïve, Mixed or Informed for every aspect of scientific inquiry. The coding was holistic, meaning that each questionnaire was taken as a whole, and if a student expressed an understanding of an aspect of SI on an item not intended to test this particular aspect, this was taken into account. In addition, 49 students in grades 3-4 and 16 students in grades 7 and 12 were interviewed to ensure that the coding of the instruments was accurate, and to obtain a deepened qualitative understanding of the students’ views about SI. During the interview students were given a copy of their own questionnaire as a primer to elaborate on their understanding of the questions and scientific inquiry in general.

Conclusions, Expected Outcomes or Findings
In grades 3-4 only two aspects have over 50% informed answers. These are aspect 5, Conclusions consistent with data (75%), and aspect 8, Explanations based on data and prior knowledge (55%). Both aspects were assessed by questions involving dinosaurs, a topic familiar to many students, which might have contributed here. The aspect with the most naïve answers (40%) and the fewest informed answers (16%) is aspect 2, No single scientific method. The interviews indicate that many students describe all types of scientific investigations as experiments.

In 7th grade students do not achieve 50% informed answers in any aspect. The most informed are aspect 1, Starts with a question (29.4%), aspect 5, Conclusions must be consistent with data collected (28.6%), and aspect 6, Procedures are guided by the question asked (27.8%). Students in 12th grade have more informed views than in 7th grade but the difference is not radical. Only two aspects in 12th grade have at least 50% informed answers: aspect 3, Same procedures may not yield same results (58%), and aspect 6, Procedures are guided by the question asked (51%). In both grades 7 and 12 the most naïve answers are in aspect 7, Data and evidence are not the same, with 55.6% and 41% respectively. This is interesting as both “evidence” and “data” have overlapping and ambiguous connotations in Swedish unless care is taken to be specific. At the same time, grade 12 also has more naïve answers than grade 7 on five of eight aspects.

Care must be taken when comparing primary school with middle and upper secondary school given the difference in instruments, and how these were coded relative to students’ age. However, a preliminary conclusion is that students’ views about scientific inquiry are far from satisfactory relative to the ambitions laid out in curricular documents and current understanding of this topic in science education research.

References
Abd-El-Khalick, F., BouJaoude, S., Duschl, R., Lederman, N. G., Mamlok-Naaman, R., Hofstein, A., Niaz, M., Treagust, D., & Tuan, H. (2004). Inquiry in science education: International perspectives. Science Education, 88 (3), 397–419. https://doi.org/10.1002/sce.10118.

Gyllenpalm, J., & Wickman, P.-O. (2011). ‘‘Experiments’’ and the inquiry emphasis conflation in science teacher education. Science Education, 95(5), 908–926.

Hodson, D. (1996) Laboratory work as scientific method: three decades of confusion and distortion, Journal of Curriculum Studies, 28:2, 115-135. https://doi.org/10.1080/0022027980280201.

Högström, P., Ottander, C., & Benckert, S. (2012). Laborativt arbete i grundskolans senare år: Lärares perspektiv [Laboratory work in secondary school: Teachers' perspectives]. Nordic Studies in Science Education, 6(1), 80–91. https://doi.org/10.5617/nordina.332

Lederman, J.S., Bartels, S., Jimenez, J., Lederman, N.G., Acosta, K., Adbo, K., ... Zhu, Q. (2022). An international assessment of elementary students’ views about scientific inquiry: A study made possible with development of the views about scientific inquiry-elementary (VASI-E) assessment. Paper under review submitted to Journal of Research in Science Teaching.

Lederman, J., Lederman, N., Bartels, S., Jimenez, J., Akubo, M., Aly, S., Bao, C., Blanquet, E., Blonder, R., Bologna Soares de Andrade, M., Buntting, C., Cakir, M., EL-Deghaidy, H., ElZorkani, A., Enshan, L., Gaigher, E., Guo, S., Hakanen, A., Hamed Al-Lal, S., … Zhou, Q. (2019). An international collaborative investigation of beginning seventh grade students’ understandings of scientific inquiry: Establishing a baseline. Journal of Research in Science Teaching. Published online. https://doi.org/10.1002/tea.21512.

Lederman, J. S., Lederman, N. G., Bartos, S. A., Bartels, S. L., Meyer, A. A., & Schwartz, R. S. (2014). Meaningful assessment of learners’ understandings about scientific inquiry— the views about scientific inquiry (VASI) questionnaire. Journal of Research in Science Teaching, 51(1), 65–83. https://doi.org/10.1002/tea.21125.

Lunde, T., Rundgren, C.-J., & Chang Rundgren, S. N. (2015). När läroplan och tradition möts – hur högstadielärare bemöter yttre förväntningar på undersökande arbete i naturämnesundervisningen [How lower secondary science teachers meet external expectations on inquiry-based science teaching]. NorDiNa (Nordic Studies in Science Education), 11(1), 88–101. https://doi.org/10.5617/nordina.783.

Osborne, J. (2014). Teaching scientific practices: meeting the challenge of change. Journal of Science Teacher Education, 25(2), 177–196. https://doi.org/10.1007/s10972-014-9384-1

The Swedish National Agency for Education (2022). Läroplan för grundskolan, förskoleklassen och fritidshemmet 2022 [Curriculum for the compulsory school, preschool class and the leisure-time centre 2022]. https://www.skolverket.se/undervisning/grundskolan/laroplan-och-kursplaner-for-grundskolan/kursplaner-for-grundskolan.
 
1:30pm - 3:00pm 09 SES 16 B: Exploring Methodological Advances in Educational Research and Assessment
Location: Gilbert Scott, 253 [Floor 2]
Session Chair: Erika Majoros
Paper Session
 
09. Assessment, Evaluation, Testing and Measurement
Paper

Competence Assessment Interviewer Effects in a Large-scale Educational Survey: a Replication Using NEPS Data

Andre Pirralha, Laura Löwe

LIfBi, Germany

Presenting Author: Pirralha, Andre; Löwe, Laura

Large-scale educational studies are an important resource for informing policymakers and the general public about the reach and effectiveness of diverse aspects of educational systems in several countries. Competence assessment in institutional settings (e.g. schools) is essential for collecting valid measurements of, for example, cognitive abilities or motivations. Conducting the assessment sessions requires a significant number of test administrators (TAs) to supervise and coordinate test groups in the participating schools. The TAs undergo specific training and follow a strict protocol to ensure that competence assessment sessions are standardized and comparable, so that student achievement data can be meaningfully collected. Nevertheless, TA characteristics can affect the quality of assessment scores and survey data: differences in TA behavior can give rise to interviewer effects that systematically impair the validity and comparability of competence assessment tests. While there has been a recent effort to move competence assessment to computer-assisted modes of data collection, very little research has examined whether the training sessions and protocols actually achieve the goal of preventing TA effects in the first place.

In this paper, we explore the presence and magnitude of interviewer effects on paper-and-pencil competence assessments of mathematics abilities and on survey questions in a nationally representative German longitudinal educational survey, the National Educational Panel Study (NEPS). For this purpose, we replicate Lüdtke et al. (2007), to date the only empirical investigation of TA interviewer effects we are aware of. Multilevel analyses for cross-classified data are used to decompose the variance into a component associated with differences between schools and a component associated with TAs. The results can help improve competence assessment procedures, particularly by showing whether interviewer training and protocols need to be improved, and by establishing the existence and magnitude of interviewer effects in assessment sessions under paper-and-pencil modes of data collection.
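
For orientation, a cross-classified random-effects model of the kind described here is commonly written as below. This is a generic textbook form, not necessarily the authors' exact specification; the symbols (u_j for schools, v_k for TAs) are notation chosen for this illustration.

```latex
\begin{aligned}
y_{i(jk)} &= \beta_0 + u_j + v_k + e_{i(jk)}, \\
u_j &\sim N(0, \sigma_u^2), \quad
v_k \sim N(0, \sigma_v^2), \quad
e_{i(jk)} \sim N(0, \sigma_e^2), \\
\rho_{\mathrm{TA}} &= \frac{\sigma_v^2}{\sigma_u^2 + \sigma_v^2 + \sigma_e^2}.
\end{aligned}
```

Here y_{i(jk)} is the score of student i tested in school j by TA k, and \rho_{\mathrm{TA}} is the share of total variance attributable to TAs. A value near zero would indicate negligible TA effects.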


Methodology, Methods, Research Instruments or Sources Used
To study test administrator effects in educational assessments effectively, a cross-classified data structure is necessary. If a single test administrator conducts the assessment in each school and does not conduct assessments in any other school, test administrator effects cannot be distinguished from school effects: the two are inseparably confounded. A prerequisite for separating them is therefore that at least two test administrators administer the assessment to separate groups of students in each school, with students randomly assigned to these groups. The potential to disentangle test administrator and school effects is even greater when test administrators conduct assessments in several schools. We follow the statistical procedure of Lüdtke et al. (2007) and estimate a cross-classified multilevel model with Markov Chain Monte Carlo (MCMC) estimators.
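
The confounding argument above can be illustrated with a small sketch. It is not the authors' analysis: the toy design (4 schools, 2 test groups per school, 5 students per group) and the helper `design_rank` are assumptions made for this example. The rank of the stacked school and TA indicator matrices shows how many distinct effects the design can separate.

```python
import numpy as np

def design_rank(school_ids, ta_ids):
    """Rank of the stacked school and TA indicator (dummy) matrices."""
    schools = np.unique(school_ids)
    tas = np.unique(ta_ids)
    S = (school_ids[:, None] == schools[None, :]).astype(float)
    T = (ta_ids[:, None] == tas[None, :]).astype(float)
    return np.linalg.matrix_rank(np.hstack([S, T]))

# Hypothetical toy design: 4 schools, 2 test groups per school, 5 students per group.
n_schools, groups_per_school, n_per_group = 4, 2, 5
school_of_group = np.repeat(np.arange(n_schools), groups_per_school)
group_in_school = np.tile(np.arange(groups_per_school), n_schools)

# Nested design: each school has its own single TA, who works nowhere else.
ta_nested = school_of_group.copy()

# Crossed design: each school has two TAs, and each TA works in two schools.
ta_crossed = (school_of_group + group_in_school) % n_schools

# Expand from test groups to individual students.
school_ids = np.repeat(school_of_group, n_per_group)
rank_nested = design_rank(school_ids, np.repeat(ta_nested, n_per_group))
rank_crossed = design_rank(school_ids, np.repeat(ta_crossed, n_per_group))

print("nested design rank:", rank_nested)
print("crossed design rank:", rank_crossed)
```

In the nested design the TA indicators duplicate the school indicators, so the rank equals the number of schools and no TA contrast can be separated from a school contrast. In the crossed design the rank rises to 7 (one overall level, three school contrasts, three TA contrasts), which is what makes a separate TA variance component estimable.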
Conclusions, Expected Outcomes or Findings
Overall, as in the original Lüdtke et al. (2007) paper that we replicate, the analysis found that a significant proportion of the variance in mathematics achievement and response behavior was at the school level, but much of this variance was explained by the type of school. In contrast, there were no differences in mathematics achievement or response behavior at the test administrator level. The results of the present study suggest that the procedures used to train test administrators and standardize test administration, which are largely the same procedures used in other large-scale assessment studies (e.g. PISA), were successful in ensuring that the tests were administered consistently to all student groups. This is a reassuring finding given the importance often placed on the outcomes of these kinds of assessments.
References
Blossfeld, H.-P. & Roßbach, H.-G. (Eds.). (2019). Education as a lifelong process: The German National Educational Panel Study (NEPS). Edition ZfE (2nd ed.). Springer VS.
Lüdtke, O., Robitzsch, A., Trautwein, U., Kreuter, F., & Ihme, J. M. (2007). Are there test administrator effects in large-scale educational assessments? Using cross-classified multilevel analysis to probe for effects on mathematics achievement and sample attrition. Methodology, 3(4), 149–159. https://doi.org/10.1027/1614-2241.3.4.149
OECD. (n.d.). PISA 2015 Assessment and Analytical Framework: Science, Reading, Mathematic, Financial Literacy and Collaborative Problem Solving. Retrieved December 20, 2022, from https://www.oecd.org/education/pisa-2015-assessment-and-analytical-framework-9789264281820-en.htm


09. Assessment, Evaluation, Testing and Measurement
Paper

A Semiparametric Regression Model Applicable to Causal Inference in Various Educational Research Data: Extension of Identification via Heteroskedasticity

Akihiro Hashino

The University of Tokyo, Japan

Presenting Author: Hashino, Akihiro

Causal inference is a crucial topic in empirical education research, as well as in other social sciences (Murnane & Willett, 2011). In particular, addressing endogeneity (selection bias caused by unobserved confounders) is arguably the most important issue. However, existing methods in applied research have significant limitations in terms of applicability and policy implications.

First, despite the development of causal inference methods such as panel fixed effects models, difference-in-differences, regression discontinuity designs, and instrumental variable regression, the data to which they can be applied are limited. Omitted variable bias is a common problem in observational data, and methods that address this issue have high data requirements. While large-scale survey data used in educational research, such as PISA and TALIS, can provide valuable information, the applicability of these causal inference methods to such data is limited or non-existent.

Second, even where these methods can be used, many applied studies assume a linear model or a dichotomous treatment variable. If the true relationship between the outcome variable and the treatment variable is nonlinear, the policy implications drawn from existing methods are limited or misleading. This is especially relevant in the field of education, where many treatment variables are continuous or multi-valued discrete and have nonlinear effects. Class size, school size, years of teacher experience, and teachers’ working hours are typical examples (Jerrim & Sims, 2021; Kraft & Papay, 2014). As the vast body of empirical research and accompanying discussion on the educational production function has shown, empirical findings on the nonlinear effects of class size and years of teacher experience have direct implications for the financial resources available to implement educational policy.

The question is, how can we address challenges like these that we often face?
It is necessary to develop a realistic identification strategy that can address endogeneity and nonlinearity. In this paper, I extend a model-based approach that uses identification via conditional heteroskedasticity (Klein & Vella, 2010) to address the above limitations on causal inference in education research.

Methods using conditional heteroskedasticity are not commonly used in applied research, but have been discussed in the theoretical literature. This approach models the structure of the error terms of the equations and thus differs from the usual design-based identification strategies, but it has the significant advantage of requiring relatively realistic side information for identification. Additionally, it can be easily combined with various types of existing regression models, providing more options for empirical research using observational data. I extend the linear model with this novel identification strategy to a semiparametric model (partial linear model) within a Bayesian framework and demonstrate the effectiveness of the proposed model using simulated and real data.


Methodology, Methods, Research Instruments or Sources Used
   We propose a model that extends the control function approach discussed in Klein & Vella (2010) to a semiparametric regression model within a Bayesian framework. After discussing the model and its estimation using MCMC methods, we evaluate its performance using simulated data. The simulation considers both cases where the effects of endogenous treatment variables are linear and cases where they are nonlinear.
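
The identification idea can be illustrated with a deliberately simplified simulation. This is not the proposed Bayesian semiparametric model or its MCMC estimation: it is a frequentist two-step sketch with a linear treatment effect, and it assumes that only the treatment-equation error is heteroskedastic, with a log-linear variance function. All names, coefficients, and the data-generating process are assumptions made for this example.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def simulate_and_estimate(n=20000, beta=1.0, rho=0.6, seed=0):
    """Return (naive OLS, control-function) estimates of the treatment effect."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)                      # observed covariate
    v_star = rng.standard_normal(n)                 # standardized treatment error
    # Standardized outcome error, correlated with v_star (source of endogeneity).
    u_star = rho * v_star + np.sqrt(1 - rho**2) * rng.standard_normal(n)

    s_v = np.exp(0.5 * z)                           # assumed heteroskedasticity: S_v(z)
    v = s_v * v_star                                # treatment-equation error
    u = u_star                                      # outcome error, homoskedastic here
    x = 1.0 + z + v                                 # endogenous treatment
    y = 2.0 + beta * x + 0.5 * z + u                # outcome

    ones = np.ones(n)
    Z = np.column_stack([ones, z])

    # Naive OLS ignores the correlation between x and u.
    naive = ols(np.column_stack([ones, x, z]), y)[1]

    # Step 1: first-stage residuals of the treatment equation.
    v_hat = x - Z @ ols(Z, x)
    # Step 2: log-linear variance model for the first-stage residuals.
    g = ols(Z, np.log(v_hat**2))
    s_v_hat = np.exp(0.5 * (Z @ g))
    # Step 3: add the standardized residual as a control function.
    control = v_hat / s_v_hat
    cf = ols(np.column_stack([ones, x, z, control]), y)[1]
    return naive, cf

naive_beta, cf_beta = simulate_and_estimate()
print(f"naive OLS: {naive_beta:.3f}, control function: {cf_beta:.3f}")
```

Because S_v(z) varies with z, the standardized residual is not a linear combination of x and z, so the treatment coefficient is identified without an excluded instrument. Under homoskedasticity the control term would be collinear with the other regressors and the approach would break down.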

   In addition to the simulated data, we demonstrate the usefulness of the model in an application to real data. Using data from the Teaching and Learning International Survey (TALIS) 2018, an international survey on teachers’ working environments, we apply the proposed model to analyze the impact of teachers' long working hours on well-being, job satisfaction, and efficacy. Although TALIS provides useful information for policy regarding teachers, it is difficult to apply the usual identification strategies of causal inference to it. Empirical research on teachers' subjective well-being and working environment has been conducted in several academic disciplines, including psychology, education, and epidemiology, but existing studies are highly flawed in terms of causal inference. Specifically, workload is assumed to be one of the important factors when job satisfaction, sense of efficacy, and other well-being indices are used as outcome variables, but the possibility that workload is an endogenous variable correlated with unobserved confounding factors has rarely been considered. As to nonlinearity, the question of which range of working hours has a greater impact on well-being has direct implications for the regulation of working hours and related issues. In particular, detecting nonlinear effects of working hours (e.g., an impact that increases rapidly above a certain threshold) is very important. Using the proposed model, we analyze the effect of working hours on teachers' well-being, taking into account endogeneity and nonlinear effects.

Conclusions, Expected Outcomes or Findings
   Our proposed semiparametric model, which uses an identification strategy based on conditional heteroskedasticity, offers several advantages over existing standard causal inference methods. The approach is less restricted in the range of data to which it can be applied, and it can detect nonlinear effects of treatment variables. The results from both simulated and real data demonstrate its ability to contribute to research on policy-relevant questions. In particular, the analysis of TALIS data with the proposed model revealed that existing studies underestimate the impact of teachers’ long working hours on well-being and overlook nonlinear effects.

   Furthermore, the proposed model is more flexible due to the adoption of Bayesian modeling. An example is the random effects model (hierarchical model) used in the real data analysis in this paper.

   In future research, we may consider relaxing various restrictions and extending the model to a heterogeneous treatment effects model, which would allow for the treatment effect to vary among individuals. In addition, applying this model to various other research topics is also an important avenue for future research.

References
Jerrim, J. and Sims, S. (2021). When is high workload bad for teacher wellbeing? Accounting for the non-linear contribution of specific teaching tasks. Teaching and Teacher Education, 105: 103395.

Klein, R., and Vella, F. (2010). Estimating a class of triangular simultaneous equations models without exclusion restrictions. Journal of Econometrics, 154(2), 154-164.

Kraft, M. A., & Papay, J. P. (2014). Can Professional Environments in Schools Promote Teacher Development? Explaining Heterogeneity in Returns to Teaching Experience. Educational Evaluation and Policy Analysis, 36(4), 476-500.

Murnane, R. J., and Willett, J. B. (2011). Methods Matter: Improving causal inference in educational and social science research. Oxford; New York: Oxford University Press.
 

 
Conference: ECER 2023