Conference Agenda

Overview and details of the sessions of this conference. Please select a date or location to show only sessions held on that day or at that location. Please select a single session for a detailed view (with abstracts and downloads, if available).

 
 
Session Overview
Session
09 SES 09 B: Innovative Approaches to Educational Practice and Assessment
Time:
Thursday, 29/Aug/2024:
9:30 - 11:00

Session Chair: Leonidas Kyriakides
Location: Room 012 in ΧΩΔ 02 (Common Teaching Facilities [CTF02]) [Ground Floor]

Cap: 56

Paper Session

Presentations
09. Assessment, Evaluation, Testing and Measurement
Paper

Exploring Implementation of Value Added Model in Slovenia

Gasper Cankar

NEC, Slovenia

Presenting Author: Cankar, Gasper

Value-added indicators are a more accurate method of assessing school performance since they eliminate more non-school factors (Meyer, 1997). Since 2014, Slovenian upper secondary schools in the general education track, which ends with the General Matura, have been able to examine value-added measures and track changes over time. The two time points in question are achievement at the end of Grade 9, just before entering upper secondary school, and achievement at the General Matura examinations. Since 2018, lower secondary schools can similarly check value added between Grade 6 and Grade 9 (the finishing grade) in different subjects. These measures are not part of any accountability scheme and are provided for schools’ self-evaluation purposes along with other achievement results.
Value-added measures used by Slovenian schools are calculated as the average residual between students’ actual and predicted achievement (Cankar, 2011). The usual method for calculating predicted values is the ‘median method’: the population is sorted by scores from time point 1 and divided into equal-sized groups. The median score from time point 2 in each group then serves as the predicted value for the midpoint of that group on time point 1, and predicted values for all other time point 1 scores are interpolated from these.
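As a concrete illustration of the median method described above, the following minimal Python sketch computes school value added from a student-level table; the column names (score_t1, score_t2, school_id), the number of groups, and the use of group medians as midpoints are illustrative assumptions rather than the actual NEC implementation.

```python
# Illustrative sketch of the 'median method'; names and details are assumptions,
# not the actual NEC implementation.
import numpy as np
import pandas as pd

def predict_t2(students: pd.DataFrame, n_groups: int = 20) -> pd.DataFrame:
    """Predict time-2 scores from time-1 scores using group medians."""
    students = students.sort_values("score_t1").reset_index(drop=True)
    # Divide the population, sorted by time-1 score, into equal-sized groups.
    students["group"] = pd.qcut(np.arange(len(students)), n_groups, labels=False)
    # Median time-2 score per group, anchored at the group's time-1 midpoint
    # (approximated here by the group's median time-1 score).
    anchors = students.groupby("group").agg(mid_t1=("score_t1", "median"),
                                            pred_t2=("score_t2", "median"))
    # Interpolate a predicted time-2 score for every individual time-1 score.
    students["pred_t2"] = np.interp(students["score_t1"],
                                    anchors["mid_t1"], anchors["pred_t2"])
    return students

def school_value_added(students: pd.DataFrame) -> pd.Series:
    """Value added per school = average residual (actual minus predicted score)."""
    students = predict_t2(students)
    residual = students["score_t2"] - students["pred_t2"]
    return residual.groupby(students["school_id"]).mean()
```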
This method of calculating value added has proved robust and relatively straightforward to explain to teachers and the general public. However, there are also indications that the calculations are not optimal. In this paper, we address the issue of negative national average value-added measures: as a rule, the average value added across all schools in a given year tends to be negative.

There can be many reasons for this, and within this presentation we will explore the following research questions:

Could the observed negative average value be associated with school composition factors (primarily the size of the school)?
Could we associate negative average values with school background characteristics (for example, the average income of the school’s municipality)?
Could we associate negative average values with the implementation of the value-added measure (either the median method or some other step in computation)?
Since value-added models rely on differences between time points, which increases measurement error (Papay, 2011), it is important to reduce additional sources of error. Insights from this research might therefore be useful to others working in the field of value-added models.


Methodology, Methods, Research Instruments or Sources Used
To address these research questions, we will use simple regression techniques or hierarchical linear regression where needed. Data on the external examinations and national assessments used to calculate value-added measures will come from the National Examinations Centre, while the data on municipalities will come from the Slovenian Statistical Office. We will use value-added measures for the last five years to demonstrate the stability of findings over time.
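A hedged sketch of what such an analysis could look like in Python with statsmodels is shown below; the table layout (one row per school and year) and the column names value_added, school_size, municipal_income and year are assumptions for illustration, not the study’s actual data.

```python
# Illustrative sketch only; variable names are assumptions, not the study's data.
import pandas as pd
import statsmodels.formula.api as smf

def simple_regression(schools: pd.DataFrame):
    """OLS of school value added on school size and average municipal income."""
    return smf.ols("value_added ~ school_size + municipal_income",
                   data=schools).fit()

def hierarchical_regression(schools: pd.DataFrame):
    """Mixed-effects variant with a random intercept per year, for pooled five-year data."""
    return smf.mixedlm("value_added ~ school_size + municipal_income",
                       data=schools, groups=schools["year"]).fit()
```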

Data will be used and analyzed in a responsible manner to protect individual privacy and adhere to legal requirements. This is especially important since data on whole cohorts of students will be used.

Conclusions, Expected Outcomes or Findings
Value-added models can provide important information and identify underperforming schools, as demonstrated by Ferrão and Couto (2014) in the case of Portuguese schools. We expect to provide insight into the problem and either identify the causes of the consistently negative averages or propose the further steps needed to explore and resolve the issue. As value-added measures are also present in other European countries, this research will help other researchers evaluate their value-added models and contribute to a better understanding of the field.
References
Cankar, G. (2011). Opredelitev dodane vrednosti znanja (Izhodišča, primeri in dileme) [Defining the value added of knowledge (starting points, examples and dilemmas)]. In Kakovost v šolstvu v Sloveniji (p. 431). Pedagoška fakulteta. http://ceps.pef.uni-lj.si/dejavnosti/sp/2012-01-17/kakovost.pdf

Ferrão, M., & Couto, A. (2014). The use of a school value-added model for educational improvement: A case study from the Portuguese primary education system. School Effectiveness and School Improvement, 25, 174-190. https://doi.org/10.1080/09243453.2013.785436

Koedel, C., Mihaly, K., & Rockoff, J. (2015). Value-added modeling: A review. Economics of Education Review, 47, 180-195. https://doi.org/10.1016/j.econedurev.2015.01.006

Meyer, R. (1997). Value-added indicators of school performance: A primer. Economics of Education Review, 16, 283-301. https://doi.org/10.1016/S0272-7757(96)00081-7

Papay, J. (2011). Different tests, different answers. American Educational Research Journal, 48, 163-193. https://doi.org/10.3102/0002831210362589


09. Assessment, Evaluation, Testing and Measurement
Paper

Using the Dynamic Approach to Promote Formative Assessment in Mathematics: An Experimental Study

Evi Charalambous1, Leonidas Kyriakides1, Margarita Christoforidou2, Ioannis Ioannou3

1Department of Education, University of Cyprus; 2Centre for Educational Research and Evaluation, Cyprus Pedagogical Institute; 3Department of Secondary General Education, Cyprus Ministry of Education, Sport and Youth

Presenting Author: Kyriakides, Leonidas

Teachers who use assessment for formative rather than summative purposes are more effective in promoting student learning outcomes (Chen et al., 2017; Kyriakides et al., 2020). Teachers appear to acknowledge the benefits of formative assessment; however, their assessment practice remains mainly summatively oriented (Suurtamm & Koch, 2014; Wiliam, 2017). This can partly be attributed to the fact that teachers do not receive sufficient training in classroom assessment (DeLuca & Klinger, 2010). Teacher Professional Development (TPD) programs intended to improve assessment practice have so far provided mixed results regarding their impact on teachers’ assessment skills (Chen et al., 2017), whereas many studies do not provide any empirical evidence on the impact of student assessment TPD programs on student learning outcomes (Christoforidou & Kyriakides, 2021). In this context, this study aims to explore the impact of a TPD course in formative assessment on improving teachers’ assessment skills and, through that, on promoting student learning outcomes in mathematics (cognitive and metacognitive).

During the first phase of the study, a framework that enables the determination and measurement of classroom assessment skills was developed. This framework examines assessment by looking at three main aspects. First, skills associated with the main phases of the assessment process are considered (Gardner et al., 2010; Wiliam et al., 2004): (i) appropriate assessment instruments are used to collect valid and reliable data; (ii) appropriate procedures in administering these instruments are followed; (iii) data emerging from assessment are recorded efficiently and without losing important information; (iv) assessment results are analysed, interpreted, and used in ways that can promote student learning; and (v) assessment results are reported to all intended users to help them take decisions on how to improve student learning outcomes. The second aspect of this framework concerns the fact that assessment skills are defined and measured in relation to teachers’ ability to use the main assessment techniques. Specifically, the framework looks at assessment techniques by considering two important decisions affecting the selection of a technique: a) the mode of response and b) who performs the assessment. Finally, the third aspect of the framework refers to the five measurement dimensions suggested in the Dynamic Model of Educational Effectiveness (Kyriakides et al., 2020): frequency, focus, stage, quality and differentiation. These dimensions allow us to better describe the functioning of each characteristic of an effective teacher (Scheerens, 2016).

Based on the theoretical framework and its dimensions, a questionnaire measuring teachers’ skills in assessment was developed. A study provided support for the validity of the instrument. It was also found that assessment skills can be grouped into three stages of assessment behaviour. These stages were used to make decisions about the content and design of the TPD course, which was based on the main assumptions of the Dynamic Approach (DA). First, the DA considers the importance of identifying the specific needs and priorities for improvement of each teacher or group of teachers. Second, it is acknowledged that teachers should be actively involved in their professional development to better understand how and why the factors addressed have an impact on student learning. Third, the DA holds that the Advisory and Research Team should support teachers in their efforts to develop and implement their action plans. Fourth, monitoring the implementation of teacher action plans in classroom settings is considered essential. This implies that teachers should continuously develop and improve their action plans based on the information collected through formative evaluation.


Methodology, Methods, Research Instruments or Sources Used
At the beginning of school year 2019-20, 62 secondary school teachers who taught mathematics in Grades 7, 8 and 9 in Nicosia (Cyprus) agreed to participate. These teachers were randomly split into an experimental group (n=31) and a control group (n=31). Randomization was done at the school level to avoid any spillover effect. Two classrooms per teacher were randomly selected, and all of their students (Grades 7, 8 and 9) participated in the study. The student sample comprised 2588 students from 124 classrooms. Teachers of the experimental group were invited to participate in a TPD course with a focus on student assessment. Teachers of the control group did not attend any TPD course; however, they were given the opportunity to participate in the TPD course during the next school year.
Data on teacher skills and student achievement were collected at the beginning and at the end of the TPD course. The instruments used were: (1) a teacher questionnaire, (2) a battery of curriculum-based written tests in mathematics (measuring cognitive skills), and (3) a battery of tests measuring metacognitive skills in mathematics.
To measure the impact of the TPD course on improving teachers’ assessment skills, the Extended Logistic Model of Rasch was used to analyse the data emerging from the teacher questionnaire for each measurement period. The Mann-Whitney test was then used to search for differences between the control and experimental groups in teachers’ assessment skills at the beginning and at the end of the intervention.
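A minimal sketch of this group comparison, assuming the teachers’ skill estimates are available as two arrays (names illustrative):

```python
# Illustrative sketch: compare experimental and control teachers' assessment-skill
# estimates (e.g. Rasch person measures) with a two-sided Mann-Whitney U test.
from scipy.stats import mannwhitneyu

def compare_groups(skills_experimental, skills_control):
    """Return the Mann-Whitney U statistic and p-value for the two groups."""
    return mannwhitneyu(skills_experimental, skills_control,
                        alternative="two-sided")
```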
To measure the impact of the TPD course on improving students’ cognitive learning outcomes, multilevel regression analysis was conducted to find out whether teachers employing the DA were more effective than the teachers of the control group in promoting their students’ learning outcomes in mathematics. In addition, to examine the impact of the intervention on students’ metacognitive learning outcomes, three separate multilevel regression analyses, one for each scale measuring regulation of cognition (i.e., Prediction, Planning, Evaluation), were also conducted.
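A simplified sketch of one such two-level analysis (students nested in classrooms) with statsmodels is given below; the post-test, pre-test, treatment and classroom variables are illustrative names, and the actual study may have used different software and model specifications.

```python
# Simplified two-level sketch: does the DA-based TPD (treatment) predict post-test
# mathematics achievement, controlling for the pre-test, with a random intercept
# per classroom? Variable names are assumptions.
import pandas as pd
import statsmodels.formula.api as smf

def treatment_effect(students: pd.DataFrame):
    """Mixed model of post-test achievement on treatment status and pre-test."""
    model = smf.mixedlm("post_math ~ pre_math + treatment",
                        data=students, groups=students["classroom"])
    return model.fit()
```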

Conclusions, Expected Outcomes or Findings
The Wilcoxon Signed Ranks test revealed that the mean assessment-skill scores of the experimental group teachers were higher at the end of the intervention than at the beginning (Z=4.80, p<0.001), whereas no statistically significant improvement was identified for the control group (Z=1.21, p=0.23). The Mann-Whitney test did not reveal any statistically significant difference between the control and experimental groups in the stage at which each teacher was situated at the beginning of the intervention (Z=-0.57, p=0.57), but a statistically significant difference was found at the end of the intervention (Z=2.53, p=0.011). None of the teachers in the control group moved from the stage at which they started to a more demanding stage, whereas a stepwise progression was observed in the experimental group, with 13 out of 31 teachers moving to the next, more demanding stage. Moreover, the results of all four multilevel analyses revealed that the DA had a statistically significant effect on student achievement in mathematics (in both cognitive and metacognitive learning outcomes).
Unlike most ‘one size fits all’ professional development approaches, the DA considers the importance of designing a course according to the specific needs and priorities for improvement of each group of teachers. This argument received some support, since it was found that teachers’ assessment skills can be grouped into three stages. The study also reveals that teachers can improve and ultimately progress to the next developmental stage of assessment skills by undertaking appropriate training, and it demonstrates the impact of the DA-based TPD course on both cognitive and metacognitive learning outcomes. Finally, implications for research, policy and practice are discussed.

References
Chen, F., Lui, A. M., Andrade, H., Valle, C., & Mir, H. (2017). Criteria-referenced formative assessment in the arts. Educational Assessment, Evaluation and Accountability, 29(3), 297-314.

Christoforidou, M., & Kyriakides, L. (2021). Developing teacher assessment skills: The impact of the dynamic approach to teacher professional development. Studies in Educational Evaluation, 70, 101051. https://doi.org/10.1016/j.stueduc.2021.101051

DeLuca, C., & Klinger, D. A. (2010). Assessment literacy development: Identifying gaps in teacher candidates’ learning. Assessment in Education: Principles, Policy & Practice, 17(4), 419-438. https://doi.org/10.1080/0969594X.2010.516643

Gardner, J., Harlen, W., Hayward, L., & Stobart, G. (2010). Developing teacher assessment. McGraw-Hill/Open University Press.

Kyriakides, L., Creemers, B.P.M., Panayiotou, A., & Charalambous, E. (2020). Quality and Equity in Education: Revisiting Theory and Research on Educational Effectiveness and Improvement. Routledge.

Scheerens, J. (2016). Educational effectiveness and ineffectiveness: A critical review of the knowledge base. Springer. https://doi.org/10.1007/978-94-017-7459-8
 
Suurtamm, C., & Koch, M. J. (2014). Navigating dilemmas in transforming assessment practices: experiences of mathematics teachers in Ontario, Canada. Educational Assessment, Evaluation and Accountability, 26(3), 263-287. https://doi.org/10.1007/s11092-014-9195-0

Wiliam, D. (2017). Assessment for learning: Meeting the challenge of implementation. Assessment in Education: Principles, Policy & Practice, 25(6), 686-689. https://doi.org/10.1080/0969594X.2017.1401526

Wiliam, D., Lee, C., Harrison, C., & Black, P. J. (2004). Teachers developing assessment for learning: Impact on student achievement. Assessment in Education: Principles, Policy & Practice, 11(1), 49-65. https://doi.org/10.1080/0969594042000208994


09. Assessment, Evaluation, Testing and Measurement
Paper

Teaching Quality in Classrooms of Different Compositions: A Mixed Methods Approach

Trude Nilsen, Bas Senden, Armin Jentsch, Nani Teig, Wangqiong Ye

University of Oslo, Norway

Presenting Author: Nilsen, Trude

Teachers’ instruction is at the heart of education, and previous research has shown that teaching quality is important for students’ learning outcomes (e.g. Charalambous & Praetorius, 2020; Seidel & Shavelson, 2007). However, teaching is a two-way process, and less is known about how the composition of the classroom affects teaching quality (TQ). Do, for instance, high socio-economic status (SES) classrooms receive different TQ than low-SES classrooms? To examine this, one would first need to establish whether a so-called compositional effect exists. A compositional effect refers to the effect of, for instance, the classroom’s SES on student learning outcomes, over and above the effect of students’ individual SES (Van Ewijk & Sleegers, 2010).

Both compositional effects and an unfair distribution of high-quality teachers have been found in previous studies in a number of countries (Gustafsson et al., 2018; Luschei & Jeong, 2018; Van Ewijk & Sleegers, 2010). However, such studies are lacking in Norway, which for a long time was considered an egalitarian society (Buchholtz et al., 2020). At the same time, educational inequality has increased in Norway (Sandsør et al., 2021). Hence, the overarching aim of the present study is to examine whether a compositional effect exists and how the composition of the classroom affects TQ. We further aim to describe in more depth what characterizes TQ in classrooms of different compositions in Oslo, where the gaps between students are larger and there are more minority students than in the rest of Norway (Fløtten et al., 2023).

The following research questions were asked:

1) What is the effect of classroom composition (in terms of SES and minority status) on students’ learning outcomes in science, over and above students’ individual SES and minority status (i.e. the compositional effect)? How does this differ between Oslo and the rest of Norway?

2) What is the effect of the classroom composition on TQ in science, and how does this differ between Oslo and the rest of Norway?

3) What characterizes TQ in science classrooms of different compositions in Oslo?

Theoretical framework for teaching quality.

We chose the Three Basic Dimensions (TBD) framework (Klieme et al., 2009; Praetorius et al., 2018) to conceptualize TQ, as this framework is the most commonly used in Europe and in international large-scale studies (Klieme & Nilsen, 2022). TQ is here defined as the type of instruction that predicts students’ learning outcomes and includes the following three dimensions:

1) Classroom management refers to how teachers manage the classroom environment and includes, for instance, preventing undesirable behaviors and setting clear and consistent rules and expectations for student behavior.

2) Supportive teaching focuses on the teacher’s ability to support students both professionally and socio-emotionally, such as providing clear and comprehensive instruction and seeing and listening to every individual student.

3) Cognitive activation includes instruction that enables students to engage in higher-level cognitive thinking that promotes conceptual understanding. Such instruction is characterized by challenging and interactive learning.

The TBD is a generic framework used across subject domains. To address research question 3 and investigate in more depth the subject-specific aspects of TQ in science, a fourth dimension from the Teacher Education and Development Study–Instruct framework (TEDS-Instruct, e.g. Schlesinger et al., 2018) was included. This framework was adapted to the Norwegian context and to the subject domain of science, and validated. The fourth dimension, called educational structuring, refers to subject-specific aspects of instruction such as inquiry or dealing with students’ misconceptions in science.


Methodology, Methods, Research Instruments or Sources Used
Design and sample.

The project Teachers’ Effect on Student Learning (TESO), funded by the Norwegian Research Council, collected data through an extended version of TIMSS 2019, including a representative sample of fifth graders in Norway, a representative sub-sample of Oslo, and video observations of grade six classrooms in Oslo. The students who participated in the video observations in sixth grade had also participated in TIMSS 2019 as fifth graders. All students answered questionnaires and took the TIMSS mathematics and science tests.

Measures.

To measure generic TQ for the second research question, students’ responses to the questionnaire were used. In the questionnaire, Classroom management was measured by 6 items (e.g. “Students don’t listen to what the teacher says”) and Cognitive activation by 5 items (e.g. “The teacher asks us to contribute in planning experiments”); both were measured on a 4-point frequency scale (from Never to Every or almost every lesson). Teacher support included 6 items on a 4-point Likert scale (from Disagree a lot to Agree a lot), e.g. “My teacher has clear answers to my questions”.
To answer research question 3 and provide more in-depth descriptions of TQ, the more fine-grained TEDS-Instruct observation manual (including 21 items rated from 1 to 4) was used to rate the videos. The manual conceptually measures the same three aspects as TIMSS, in addition to educational structuring.
SES was measured by students’ responses on the number of books at home (the parents’ responses about their education had more than 40% missing data and were hence excluded as an SES indicator).
Minority status was measured by students’ answers to how often they speak Norwegian at home.

Methods of analyses

To answer research questions 1 and 2, we employed multilevel (students and classes) structural equation modelling (SEM) and a multi-group approach to examine differences between Oslo and the rest of Norway. To avoid multicollinearity, each aspect of teaching quality was modelled separately, as a latent variable. Compositional effects were estimated by subtracting the within-level effects from the between-level effects.
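As an illustration of this within/between logic, the sketch below estimates a compositional effect via group-mean centring in a simple mixed model; the study itself used multilevel SEM with latent variables, which this sketch does not reproduce, and all variable names are assumptions.

```python
# Illustrative sketch: compositional (contextual) effect of classroom SES on
# science achievement, estimated as the between-class coefficient minus the
# within-class coefficient. Variable names are assumptions.
import pandas as pd
import statsmodels.formula.api as smf

def compositional_effect(students: pd.DataFrame) -> float:
    students = students.copy()
    # Class-mean SES (between part) and each student's deviation from it (within part).
    students["ses_class_mean"] = students.groupby("class_id")["ses"].transform("mean")
    students["ses_within"] = students["ses"] - students["ses_class_mean"]
    fit = smf.mixedlm("science_score ~ ses_within + ses_class_mean",
                      data=students, groups=students["class_id"]).fit()
    return float(fit.params["ses_class_mean"] - fit.params["ses_within"])
```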
To answer research question 3, the questionnaires, achievement data, and video ratings were linked and merged into one file. Descriptive statistics were used to create profiles of the video-observation ratings, describing the characteristics of TQ in classrooms of different compositions.

Conclusions, Expected Outcomes or Findings
RQ1. Compositional effects
The compositional effects were all significant (p<.05) and positive. The effect of SES was 0.44 for Norway, and the multigroup analyses yielded an effect of 0.57 for Oslo and 0.31 for the rest of Norway. The compositional effects of language were 0.45 for Norway, 0.76 for Oslo and 0.45 for the rest of Norway. In other words, the compositional effects for Oslo were very high, while the compositional effects for Norway overall were in line with those found in other Scandinavian countries (Yang Hansen et al., 2022).

RQ2. Relations between classroom composition and TQ
High-SES, and especially low-minority, classrooms had positive and significant associations with both classroom management and teacher support. These effects were stronger in Oslo than in the rest of Norway, which indicates an unfair distribution of high teaching quality towards advantaged classrooms. For cognitive activation, however, there were no significant results at the class level, but there was a negative association between high-SES, low-minority classroom composition and students’ perceptions of cognitive activation. This indicates that advantaged students perceive less challenging and interactive learning.

RQ3. Characteristics of TQ
Results from the video observations showed that TQ in high-SES classrooms was characterized by better classroom management, teacher support, and educational structuring than in low-SES classrooms, albeit with less cognitive activation. Furthermore, high-SES classrooms were characterized by fewer minority students and higher achievement than low-SES classrooms. These findings are in line with the results from the questionnaires.

Taken together, the findings from our three research questions point to a school system that contributes to increasing the gap between students. Classrooms with high shares of advantaged students have access to better teaching quality than classrooms with many disadvantaged students, thus generating unequal opportunities to learn.

References
Buchholtz, N., Stuart, A., & Frønes, T. S. (2020). Equity, equality and diversity—Putting educational justice in the Nordic model to a test. Equity, equality and diversity in the Nordic model of education, 13-41.
Charalambous, C. Y., & Praetorius, A.-K. (2020). Creating a forum for researching teaching and its quality more synergistically. Studies in Educational Evaluation, 67, 100894.
Fløtten, T., Kavli, H., & Bråten, B. (2023). Oslo er fortsatt en delt by [Oslo is still a divided city]. Aftenposten. Retrieved from https://www.aftenposten.no/meninger/kronikk/i/dw2z8o/oslo-er-fortsatt-en-delt-by
Gustafsson, J.-E., Nilsen, T., & Hansen, K. Y. (2018). School characteristics moderating the relation between student socio-economic status and mathematics achievement in grade 8. Evidence from 50 countries in TIMSS 2011. Studies in Educational Evaluation, 57, 16-30.
Klieme, E., & Nilsen, T. (2022). Teaching Quality and Student Outcomes in TIMSS and PISA. International Handbook of Comparative Large-Scale Studies in Education: Perspectives, Methods and Findings, 1089-1134.
Klieme, E., Pauli, C., & Reusser, K. (2009). The Pythagoras Study: Investigating effects of teaching and learning in Swiss and German mathematics classrooms. The power of video studies in investigating teaching and learning in the classroom, 137-160.
Luschei, T. F., & Jeong, D. W. (2018). Is teacher sorting a global phenomenon? Cross-national evidence on the nature and correlates of teacher quality opportunity gaps. Educational researcher, 47(9), 556-576.
Praetorius, A.-K., Klieme, E., Herbert, B., & Pinger, P. (2018). Generic dimensions of teaching quality: The German framework of three basic dimensions. ZDM, 50(3), 407-426.
Sandsør, A. M. J., Zachrisson, H. D., Karoly, L. A., & Dearing, E. (2021). Achievement Gaps by Parental Income and Education Using Population-Level Data from Norway. https://osf.io/preprints/edarxiv/unvcy
Schlesinger, L., Jentsch, A., Kaiser, G., König, J., & Blömeke, S. (2018). Subject-specific characteristics of instructional quality in mathematics education. ZDM, 50, 475-490.
Seidel, T., & Shavelson, R. J. (2007). Teaching effectiveness research in the past decade: The role of theory and research design in disentangling meta-analysis results. Review of Educational Research, 77(4), 454-499.
Van Ewijk, R., & Sleegers, P. (2010). The effect of peer socioeconomic status on student achievement: A meta-analysis. Educational Research Review, 5(2), 134-150.
Yang Hansen, K., Radišić, J., Ding, Y., & Liu, X. (2022). Contextual effects on students’ achievement and academic self-concept in the Nordic and Chinese educational systems. Large-scale Assessments in Education, 10(1), 16.