Investigating Teachers’ Language Assessment Literacy in the Implementation of the Merdeka Belajar (Freedom of Learning) Curriculum

Teachers' language assessment literacy (LAL) is critical to the success of education, the quality of students' learning, and students' willingness to study. Yet studies on teachers' LAL readiness for the Merdeka Belajar (Freedom of Learning) curriculum are still scarce. Most LAL studies have relied on questionnaires to assess teachers' knowledge; in contrast, this study also used a LAL knowledge test adapted from Al-Bahlani (2019). The study investigates teachers' self-perceived LAL, consisting of competence, frequency of practice, and assessment knowledge, among 73 members of the Association of English Teachers (MGMP) in the Banyumas district region. It draws on two data sources, a questionnaire and a language assessment knowledge test, and applies a quantitative methodology using MANOVA and Pearson's product-moment correlation. The results showed both strengths and weaknesses in the teachers' LAL, as well as matches and mismatches between teachers' self-perceived and demonstrated assessment knowledge. Overall, EFL teachers in Banyumas regency are at a fair level of LAL, and pre-service training in assessment was the variable with the greatest impact on teachers' LAL. Future research should investigate the objectives, actions, and outcomes of assessment training provided by teacher training institutes and professional development programs.

INTRODUCTION
The Merdeka Belajar (Freedom of Learning) curriculum is the newest curriculum being implemented by the Indonesian Ministry of Education, the gatekeeper of education policy. It departs radically from its predecessor. One of the most striking changes is in how teachers assess students following the removal of the National Examination (Aulia, 2021). With the shift from summative assessment (the National Examination) to the formative Minimum Competency Assessment, it has become more important for language teachers to have a sufficient grasp of assessment concepts and procedures (Fitriyah et al., 2022). This assessment model places a larger emphasis on students' learning outcomes in the classroom than on their final test scores.
In all curricula, teachers must be able to assess their students, which makes teachers' language assessment skills essential (Umam & Indah, 2020). The Merdeka Belajar curriculum, however, places a heavy emphasis on teachers' autonomy in designing learning and assessments, which requires teachers to become significantly more engaged and to gain extensive assessment literacy. Teachers will be required to build learning sets and assessments from scratch based on the varying needs of students in each school, as determined by their socioeconomic backgrounds. If teachers possess inadequate language assessment literacy, it is feared that their learning objectives will be unfocused and that they will fail to map their students' abilities, which is necessary for planning future lessons. Greater autonomy means that the responsibility lies solely with teachers, because they can no longer rely on the central government to provide a set of assessment products. A language teacher's level of language assessment literacy (henceforth LAL) can be used to determine how accurately an EFL teacher evaluates students' skills (Fitriyah et al., 2022).
In relation to this concern, teachers' LAL has emerged as a central topic of study in language teaching assessment (Lam, 2019; Bahtiar & Purnawarman, 2017; Fulcher, 2012). Given the growing significance of LAL in recent years, identifying teachers' LAL is crucial to ensure that language teachers' professional development proceeds in the right direction, thereby providing students with accurate evaluations (Sulaiman et al., 2021). Teachers also have numerous assessment responsibilities (Lan & Fan, 2019). They devote up to fifty percent of their professional time to classroom assessment activities in order to monitor their students' progress toward learning outcomes, and the quality of their instruction depends on the quality of the assessments used (Rad, 2019; Giraldo et al., 2018; DeLuca et al., 2015; Howell, 2013). These observations make it evident that language teachers must enhance their LAL (Anam & Putri, 2021; Latif, 2021; Widiastuti, 2021; Tsagari, 2020; Koh et al., 2017; Prasetyo, 2018). The manner in which teachers apply LAL can affect overall learning quality (Smith et al., 2014) and significantly influence students' achievement (Umam & Indah, 2020; Zulaiha et al., 2020), the success of teaching (Gultom, 2016), the quality of student learning (Smith et al., 2014), and students' motivation in learning (Alkharusi, 2013; Earl, 2013).
As the curriculum is relatively recent, empirical evidence on EFL teachers' LAL in the implementation of the Merdeka Belajar curriculum is still limited, and the topic has yet to receive much attention from Indonesian EFL researchers (Auliya, 2022; Zulaiha et al., 2020). Such evidence is crucial for establishing teachers' assessment literacy level: teachers who lack assessment literacy will be less likely to help students reach higher levels of academic accomplishment (Herrera & Macías, 2015). Hence it is essential to understand how language teachers respond to this new demand and whether they possess the competencies expected to face this challenge. In addition, most studies have reported low to moderate levels of teachers' assessment literacy (Kılıç, 2022; Massey et al., 2020; Genç et al., 2020; Umam & Indah, 2020; Al-Bahlani, 2019), attributed underdeveloped assessment literacy to a lack of pre-service language assessment training (Bustamante, 2022; Puspawati, 2022; DeLuca & Johnson, 2017; Lam, 2015), and attributed weak assessment knowledge results to the measures used in assessment literacy studies (Gotch & French, 2014). These studies mostly used surveys to assess competence, practices, and knowledge. Some earlier studies utilized the Teacher Assessment Literacy Questionnaire (TALQ) (Luthfiyyah et al., 2020; Mertler, 2004; Campbell et al., 2002; Plake et al., 1993; Plake & Impara, 1993) or the Assessment Literacy Inventory (ALI) (Mertler, 2009; Mertler & Campbell, 2005). To update this line of research, the current study utilized the LAL knowledge test adapted from Al-Bahlani (2019), based on "Educational Assessment Knowledge and Skills for Teachers" (Brookhart, 2011), to assess teachers' assessment knowledge.
Furthermore, the current study aimed to measure the assessment competence, knowledge, and practices of members of the Association of English Teachers (MGMP) in the Banyumas district region. Accordingly, the study investigates (a) the extent to which Indonesian EFL teachers at the junior school level are assessment literate, (b) how teachers perceive or self-assess their assessment literacy, and how their assessment practices and perceived skills relate to teacher sub-groups, and (c) how teachers perform on a test that measures assessment knowledge, and how that knowledge relates to teacher sub-groups.

Research Context and Participants
This study surveyed 73 participants (56 females, 17 males) from the Association of English Teachers (MGMP) in the Banyumas district region using online and on-site surveys. The majority of participants (n = 50, 68.5%) had taken at least one assessment course before beginning their teaching careers, whereas the remaining 23 (31.5%) had never taken any assessment course.

Instruments

Teachers' Assessment Perceptions Questionnaire
The questionnaire is based on Brookhart's (2011) eleven components, comprising five assessment skills components (C1-C5) and six assessment practice components (C6-C11). These components were developed into a total of 44 items. Responses to each assessment competence item were rated from 1 (not competent) to 5 (very competent), whereas responses to each assessment practice item were rated from 1 (never) to 5 (always).
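To make the scoring concrete, the sketch below sums Likert ratings into component scores. It assumes that the 44 items are spread evenly across the eleven components (four items each, so a component score would run from 4 to 20) and that items are ordered by component. Neither detail is stated in the source, and the function name `component_scores` and the sample ratings are hypothetical.

```python
ITEMS_PER_COMPONENT = 4  # assumption: 44 items / 11 components, not stated in the source


def component_scores(responses):
    """Sum consecutive blocks of 1-5 Likert ratings into one score per component."""
    if len(responses) % ITEMS_PER_COMPONENT != 0:
        raise ValueError("number of responses must be a multiple of the block size")
    if any(r < 1 or r > 5 for r in responses):
        raise ValueError("every rating must lie between 1 and 5")
    return [
        sum(responses[i:i + ITEMS_PER_COMPONENT])
        for i in range(0, len(responses), ITEMS_PER_COMPONENT)
    ]


# Hypothetical teacher: rates the first 40 items as 4, and the last block lower.
teacher = [4] * 40 + [2, 3, 2, 3]
scores = component_scores(teacher)
print(scores)
```

Under this assumption, a component mean such as 15.80 would correspond to an average rating just below 4 on each of the four items.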

Teachers' LAL Knowledge Test
This study used an EFL teacher assessment knowledge test consisting of 15 multiple-choice questions adapted from Al-Bahlani (2019) and based on Brookhart's (2011) eleven components. The Cronbach's alpha coefficient was calculated and found to be 0.80, a value considered ideal for testing.

Procedure of Data Collection
Quantitative data were collected in two distinct phases. The first phase was a survey of teachers' assessment perceptions, while the second phase used the LAL knowledge test. The test was administered one week after the completion of the surveys, online using a Google form and offline by delivering a printed copy. The participants were informed about the test after completing the surveys and were instructed to complete it once they had started.
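The reliability coefficient reported for the knowledge test can be illustrated with a short sketch of the standard Cronbach's alpha formula, alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores). The answer matrix below is toy data invented for illustration (1 = correct, 0 = incorrect), not the study's responses, and `cronbach_alpha` is a hypothetical helper name.

```python
from statistics import variance


def cronbach_alpha(score_matrix):
    """Cronbach's alpha for a matrix with one row per respondent, one column per item."""
    n_items = len(score_matrix[0])
    if n_items < 2 or len(score_matrix) < 2:
        raise ValueError("need at least two items and two respondents")
    item_columns = list(zip(*score_matrix))          # transpose rows into item columns
    item_variance_sum = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in score_matrix])
    return n_items / (n_items - 1) * (1 - item_variance_sum / total_variance)


# Toy data: five respondents answering four dichotomously scored items.
toy_answers = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(toy_answers)
print(round(alpha, 2))
```

For the 15-item test, the same function would take a 73 x 15 matrix of scored answers.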

Data Analysis
Quantitative analyses of the questionnaires and knowledge test were conducted using descriptive and inferential statistics. First, descriptive statistics were used to reveal teachers' assessment competencies, how frequently they practiced them, and their knowledge of language assessment. To evaluate the questionnaire at the LAL level, the researcher employed the formula of Dixon and Massey (1987), which divides the interval level of teachers' LAL into five groups ranging from very poor to very good. Second, MANOVA was used to determine whether teachers' assessment competencies, assessment practices, and assessment literacy knowledge differed significantly across teacher subgroups (gender, age, teaching experience, teachers' roles, and pre-service assessment training). Finally, Pearson's product-moment correlation was run to examine the correlation between teachers' assessment competencies, practices, and literacy knowledge and the teacher subgroups.
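As a rough illustration of the correlation step, the sketch below computes Pearson's product-moment coefficient from its definition and converts a coefficient into the t statistic on n - 2 degrees of freedom that underlies its significance test. The experience and score values are invented for illustration, and the helper names are hypothetical; the study itself presumably used a statistics package.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two sequences of equal length with at least 3 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic (df = n - 2) for testing whether a correlation differs from zero."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)


# Invented example: years of teaching experience and a component score.
experience = [2, 5, 8, 12, 20]
score = [10, 12, 11, 15, 14]
print(round(pearson_r(experience, score), 3))

# A coefficient of .214 from 73 teachers gives a t of roughly 1.85 on 71 df.
print(round(t_statistic(0.214, 73), 3))
```

The second call uses the only coefficient the paper reports with an exact p-value (r = .214, N = 73); a t of that size on 71 degrees of freedom falls just short of the conventional two-tailed .05 threshold.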

RQ1: The EFL Teachers' Assessment Literacy Level
The findings show that members of the Association of English Teachers (MGMP) in the Banyumas district region are at a fair level of LAL (M = 13.29, SD = 2.78). This suggests that teachers need to improve their LAL performance while implementing the Merdeka Belajar curriculum for the sake of two things: the quality of students' learning and students' learning motivation (Alkharusi, 2013).
On average, teachers regard themselves as fairly competent in performing Merdeka Belajar assessment (M = 15.80, SD = 3.08). There is room for improvement here, because the new curriculum demands autonomous assessment designed by teachers rather than reliance on a standardized framework issued by the Ministry of Education. This acknowledges that needs differ from school to school, and it calls for more evidence-based assessment, reflecting the realization that the purpose of assessment is not just to criticize but also to improve outcomes. In addition, Table 2 shows that teachers perceived themselves to be highly competent in performance assessment (C2). Teachers thus strongly believe that they are capable of evaluating students' contributions in class, determining whether or not students have understood a topic through a series of oral questions, designing performance assessment methods with well-defined objectives, establishing a rating scale for performance criteria, using that rating scale or a checklist to evaluate students' performances, and drawing valid conclusions about students' knowledge acquisition from these assessments.
MANOVA was then run to examine the differences between teacher sub-groups in language assessment competence. The results revealed no statistically significant effect on teachers' language assessment competence for gender (F(11, 48) = 1.234, p > .05; Wilks' Λ = .780, partial η² = .220), age (F(44, 185.6) = 1.071, p > .05; Wilks' Λ = .421, partial η² = .195), or pre-service assessment training (F(11, 48) = .796, p > .05; Wilks' Λ = .846, partial η² = .154). This implies that teachers' gender, age, and pre-service assessment courses do not play an important role in reinforcing teacher-perceived assessment competence. The finding replicates Al-Bahlani (2019), in which gender and age had no statistically significant effect on teachers' language assessment competence. It differs from Al-Bahlani (2019), however, with respect to pre-service assessment training, which was statistically significant in that study. Al-Bahlani (2019) stated that "The analysis revealed a statistically significant multivariate effect for pre-service assessment training on teacher language assessment competence; F (3, 94) = 3.059, Wilks' Lambda = .837, p = .009. There were no statistically significant multivariate effects for age, gender, or in-service assessment training on teacher language assessment competence" (pp. 105-106). In addition, this study differs from Alkharusi et al. (2012, 2014), in which gender had a significant effect on language assessment competence. Although there are distinctions between this study's participants and those of Alkharusi et al. (2012, 2014), the studies share one feature: the majority of participants had completed at least one pre-service assessment course, regardless of their degree or number of classes taught. This raises questions about the courses and professional development programs that teachers in both studies participated in, for instance how pertinent the content was to teachers' needs and how effective it was at influencing teachers' perspectives on language assessment.
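The partial η² values reported with the MANOVA results for assessment competence can be cross-checked against the Wilks' Λ values using the conversion commonly applied in statistical packages, partial η² = 1 - Λ^(1/s), where s is the smaller of the number of dependent variables and the effect degrees of freedom. The sketch below is only a consistency check under stated assumptions: eleven dependent variables, one effect degree of freedom for gender and for pre-service training, and four for age (inferred from the numerator degrees of freedom, 44 = 11 x 4, not stated in the source).

```python
def partial_eta_squared(wilks_lambda, n_dependent, df_effect):
    """Multivariate partial eta squared derived from Wilks' lambda."""
    if not 0 < wilks_lambda <= 1:
        raise ValueError("Wilks' lambda must lie in (0, 1]")
    s = min(n_dependent, df_effect)
    return 1 - wilks_lambda ** (1 / s)


# (effect, reported Wilks' lambda, assumed effect df, reported partial eta squared)
reported = [
    ("gender", 0.780, 1, 0.220),
    ("age", 0.421, 4, 0.195),  # df_effect = 4 is an inference, see lead-in
    ("pre-service training", 0.846, 1, 0.154),
]

for effect, lam, df_effect, eta_reported in reported:
    eta = partial_eta_squared(lam, n_dependent=11, df_effect=df_effect)
    print(f"{effect}: computed {eta:.3f}, reported {eta_reported:.3f}")
```

Under these assumptions the computed values agree with the reported ones to within rounding of Λ, which suggests the reported effect sizes and Λ values are internally consistent.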
As demonstrated in Table 4, teachers' teaching experience and age exhibited a very weak positive correlation with grading and with communicating assessment results to others, as in Al-Bahlani (2019). This shows that teachers are weakly competent at setting grades based on students' average performance, knowing what criteria should be used to set grades, knowing which student characteristics should not be used to set grades, and setting grades that correspond with the attainment of learning goals. In addition, teachers are skilled in assessing students' progress using portfolios, offering written feedback, communicating assessment findings to students, providing oral feedback, and communicating assessment results to parents. Other components, in contrast, have very weak negative correlations.
Referring to Table 2, English teachers are at a poor level of assessment practices (M = 11.20, SD = 2.53). This indicates that what teachers do with what they know about Merdeka Belajar assessment is still lacking and leaves room for improvement. As shown in Table 2, the most frequent assessment practice among the teachers is the use of traditional assessment methods (M = 17.44, SD = 3.6). Teachers thus mostly rely on familiar traditional methods, such as true-or-false, multiple-choice, fill-in-the-blank, matching, and short essay questions. The Merdeka Belajar curriculum, however, favors comprehensive authentic assessment, such as observation, essays, and performance tasks, in order to involve students directly and meaningfully in the assessment activity. These assessments require teachers to be keenly aware of each student's capability and development, because teachers need to create varied, personalized assessments that call for higher-order thinking, which in turn requires an extensive understanding of which assessment tools are suitable and appropriate.
This result is in line with Al-Bahlani (2019), in which teachers sometimes used traditional assessment methods. On the other hand, this study found less frequent use of alternative assessment methods (M = 7.21, SD = 2.3). MANOVA shows no statistically significant effect on teachers' language assessment practices for gender (F(5, 54) = 1.707, p > .05; Wilks' Λ = .864, partial η² = .136), age (F(20, 20) = 1.474, p > .05; Wilks' Λ = .605, partial η² = .118), or pre-service assessment training (F(5, 54) = 1.316, p > .05; Wilks' Λ = .891, partial η² = .109). These findings indicate two points. First, teachers' gender and age have no significant effect on their frequency of practice. Second, assessment courses in pre-service teacher training programs have no significant bearing on teachers' frequency of practice. The results for gender and age are similar to those reported by Al-Bahlani (2019). In contrast, the finding on pre-service assessment training contradicts Al-Bahlani (2019), who stated that "The analysis revealed a statistically significant multivariate effect for pre-service assessment training on the frequency of teacher assessment practices; F (3, 90) = 2.741, Wilks' Lambda = .824, p = .013. There were no statistically significant multivariate effects for gender, age, and in-service assessment training." (pp. 109-110)
On the other hand, Pearson's product-moment correlation shows that this study is in line with Alkharusi et al. (2012), who found no relation between pre-service assessment training and teachers' assessment practices. Meanwhile, this study contrasts with Alkharusi et al. (2014), in which there were statistically significant multivariate effects of pre-service assessment training on teachers' assessment practices (partial η² = .010). Since two studies, Alkharusi et al. (2014) and Al-Bahlani (2019), share the same result, this study highlights the importance of pre-service assessment training courses for teachers' assessment practices. Surprisingly, only one weak positive correlation was found, between using alternative assessment methods (C7) and teachers' age (r = .214, p = .069).
The assessment knowledge test showed that the proportion of correct answers per test item ranged from 0.24 (test item 15) to 0.82 (test item 12), with an average of 0.52. This indicates that teachers' language assessment knowledge regarding the Merdeka Belajar curriculum is still very limited, in line with previous studies that consistently found low levels of language assessment knowledge (Xu & Brown, 2016; Alkharusi et al., 2012). Teachers' assessment knowledge is expected to support their instruction and to respond effectively to the needs and expectations of students, parents, and the school community (Herrera & Macías, 2015). Limited assessment knowledge has been attributed to limited teaching experience and to the fact that assessment knowledge is intricately linked to, and dominated by, instructional practices (Louw et al., 2014; Basturkmen, 2012). It should be noted that this study involved 21 novice and 52 experienced teachers. Limited teaching experience is therefore unlikely to explain the teachers' limited LAL knowledge; the more likely explanation is the close link between assessment knowledge and instructional practices.
The findings (see Table 1) show that the highest overall performance was on statement number ten, "Language teachers should be able to assist students in making sound educational decisions based on evaluation data." This signifies that teachers are competent in assisting students to analyze, plan, and track their learning, as well as in comprehending the relationship between assessment and student motivation, feelings of control, and self-regulation. In contrast, the lowest score was on statement number two, indicating that teachers struggle to articulate clear learning intentions. Al-Bahlani (2019), by contrast, found that teachers were most literate in item number two.
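The per-item scores discussed for the knowledge test (a range of 0.24 to 0.82 with an average of 0.52) read as proportions of teachers answering each item correctly. A minimal sketch of that calculation follows; the answer matrix is toy data invented for illustration, and `item_facility` is a hypothetical helper name rather than anything taken from the study.

```python
from statistics import mean


def item_facility(answer_matrix):
    """Proportion of correct answers per item (rows = teachers, columns = items)."""
    n_teachers = len(answer_matrix)
    if n_teachers == 0:
        raise ValueError("answer matrix is empty")
    return [sum(column) / n_teachers for column in zip(*answer_matrix)]


# Toy data: five teachers, four items, 1 = correct and 0 = incorrect.
toy_answers = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
facility = item_facility(toy_answers)
print(facility)                  # one proportion per item
print(round(mean(facility), 2))  # average proportion correct across items
```

Applied to the study's data, the matrix would have 73 rows and 15 columns, and the minimum, maximum, and mean of the resulting list would correspond to the three figures reported.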
The Pearson's product-moment correlations between the test items and teachers' teaching experience and roles at school range from very weak negative to weak positive. Surprisingly, item 10 was the only item showing statistical significance (p ≤ .05), with a weak positive correlation with teachers' roles. Item 10 is associated with item 11, which emphasizes the ability of language teachers in the Banyumas district region to communicate their interpretations of assessment results and their rationale to the community surrounding their students. This is attributed to their teaching experience, how long they have taught, and their roles at school. However, since the correlation is weak, teachers are advised to improve their LAL capabilities for better outcomes in the future.

CONCLUSION
The results of this study highlight important language assessment literacy competencies, practices, and knowledge in the implementation of the Merdeka Belajar curriculum. The implications apply to a variety of education and EFL stakeholders globally, since they pertain to teacher training programs and teaching institutions. The study found that pre-service assessment courses have a significant impact on EFL teachers' judgments of assessment competence and practice. Therefore, it is recommended, first, that EFL preparation programs continue to offer assessment courses to language teachers to enhance their assessment literacy in the implementation of the Merdeka Belajar curriculum. Second, teacher training programs should reassess their current provisions for language assessment. Lastly, there is an urgent need for these programs to incorporate the numerous assessment domains outlined in this study into their theoretical and practical assessment courses. In addition, the study contributes to the field of EFL and educational assessment by addressing the need for new instruments to measure teachers' LAL and a framework to evaluate it. The findings confirm other recent research indicating that, despite improved pre-service assessment training, teachers still exhibit assessment literacy gaps. Future research should investigate the objectives, actions, and outcomes of assessment training provided by teacher preparation institutes and professional development programs.

Table 1. Assessment Knowledge Test Questions, Their Relevance to Brookhart, and Each Component Mean

Table 2. Banyumas EFL Teachers' Language Assessment Literacy Level

RQ2: Teachers' Perceived/Self-Assessed Assessment Literacy
a. EFL Teachers' Self-Perceptions of Assessment Competences

Table 3. Frequencies for teachers' competence in language assessment (N = 73)

Table 4. Pearson Product-Moment Correlation Coefficients of Teachers' Teaching Experience and Age in Teachers' Assessment Competence

Table 5. Frequencies for teachers' practices in language assessment (N = 73)

Table 6. Pearson Product-Moment Correlation Coefficients of Teaching Experience and Age with Teachers' Assessment Practices