How does Rasch modeling reveal the difficulty and suitability level of fraction test questions?

This article explains how to analyze test items on arithmetic operations with fractions to obtain the items' level of difficulty and fitness. Data were collected using multiple-choice questions given to 50 fourth-grade students of an elementary school in Tasikmalaya city. The answers were then analyzed using the Rasch model in the Winsteps 3.75 application, and the items were categorized by combining the standard deviation (SD) and the mean logit value (Mean). The score data of each person and question were used to estimate the true score on the logit scale, indicating the level of difficulty of the test items. The categories were very difficult (logit value > +1 SD); difficult (0.0 logit to +1 SD); easy (-1 SD to 0.0 logit); and very easy (logit value < -1 SD). Three criteria were used to determine the fitness of the questions: the Outfit Z-Standard/ZSTD value; Outfit Mean Square/MNSQ; and Point Measure Correlation. The analysis resulted in a collection of test items suitable for use at four levels of difficulty, namely very difficult, difficult, easy, and very easy, whereas the items had previously been categorized as difficult, medium, and easy. The Rasch model can help categorize questions and students' ability levels.


Introduction
Every learning process has a goal related to some knowledge or skills that students must obtain. In achieving the goal, measurement is used to determine the value, score, or percentage achieved by students in relation to the learning objectives. During or at the end of learning, it is necessary to measure the learning process and its results in the form of numbers that reflect what has been achieved. According to Mardapi, measurement is basically an activity of systematically assigning numbers to an object (Doran, 1980). Measurement is a process or activity to determine the quantity of something. The word "something" could mean students, teachers, school buildings, study tables, whiteboards, and others (Parker, 2000). Measurement is a process that describes the student's performance on a quantitative scale so that qualitative characteristics are expressed in numbers (Alwasilah, 1996). Thus, measurement in education means measuring student attributes or characteristics (Goddard et al., 2000).
Measurement in education is closely related to tests. One way that is often used to measure the results that students have achieved is by testing (Harris & Brown, 2010). In the measurement process, the teacher must use either test or non-test measurement tools (Parker, 2000). According to Zainul (2001), a test is a question or task or a set of tasks used to obtain information about an educational attribute. The test used is adjusted to the subject or field of science used as a test source (Zainul, 2001).
The results obtained from the test are data. This article analyzes the data using the Winsteps 3.75 application through the Rasch rating scale model. The Rasch model was chosen for the assessment because it offers a sound measurement approach and can model the relationship between item difficulty, test-taker ability, and the probability of a given response (Andrich, 1981). In addition, Rasch modeling uses psychometric analysis techniques that can be used to develop test items. Rasch modeling is also an important tool that can provide relevant information regarding student assessment after learning (Sumintono, 2018).
The analysis of test instruments using the Rasch model belongs to item response measurement theory. This measurement describes the interaction between subjects and test items, which makes the measurement result more appropriate and objective (Sumintono & Widhiarso, 2014). Meanwhile, according to Brogden (1977), the Rasch model is usually used to measure both items and persons, and its relationship to the law of comparative judgment and additive conjoint measurement has been examined. In this study, the model is applied to a fraction test.
According to Masters, Rasch modeling can be used for various observation formats, including models for counts, repeated trials, and rating scales (Masters, 1982). In addition, Rasch fit statistics can give a valuable framework for testing the validity of a person's responses, estimating a person's ability, and detecting various disturbances in a person's responses (Smith, 1986). The Rasch model is described as a probability model of an individual's response to an item and is therefore not in itself an explicit response model (Brogden, 1977).
In the 1960s, Georg Rasch developed an analytical model within Item Response Theory (IRT), which was later popularized by Ben Wright. The raw data are collected as dichotomous data (true or false) that indicate students' ability. The Rasch model turns these data into a model that relates students and items (Sumintono & Widhiarso, 2014). Besides dichotomous data, the Rasch model is used to analyze polytomous data, an extension developed by Andrich, based on two fundamental notions: the ability of a person and the difficulty level of an item. The Rasch model assumes that item difficulty is an attribute that influences the responses of respondents, and that person ability is a characteristic that influences the estimation of item difficulty (Linacre, 1999; Nur et al., 2020). Compared with classical test theory, the advantages of the Rasch model are that it can identify incorrect answers, identify inappropriate assessments, and predict missing data based on systematic response patterns (Fahmina et al., 2019; Goodwin & Leech, 2017; Ratna et al., 2017).
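The relationship between person ability and item difficulty described above can be illustrated with the standard dichotomous Rasch model, in which the probability of a correct answer depends only on the difference between the two values in logits. The following is a minimal sketch; the function name and the example values are hypothetical and are not taken from the study data.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a correct answer under the dichotomous Rasch model.

    Both arguments are expressed in logits; only their difference matters.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# Hypothetical values for illustration only.
print(rasch_probability(0.0, 0.0))   # ability equal to difficulty gives 0.5
print(rasch_probability(1.0, -1.0))  # able student, easy item: above 0.5
print(rasch_probability(-1.0, 1.0))  # weaker student, hard item: below 0.5
```

When ability equals difficulty, the student has an even chance of answering correctly, which is why persons and items can be placed on the same logit scale.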
The topic of this research follows previous research in which Rasch modeling was used to measure students' critical thinking skills in STEM learning in elementary schools (Hamdu et al., 2020). STEM provides a learning experience in science, technology, engineering, and mathematics that is suitable for elementary school students. When students are given the task of making media or props for a steam-powered boat, indirect measurement skills are needed, and these require math skills. However, there is an assumption that students have difficulty calculating with fractions.
Therefore, in this study, Rasch modeling is used to evaluate learning on rational numbers and to describe how Rasch modeling can reveal the level of difficulty and fitness of test questions on fractions. Evaluation is an essential part of the learning process (Jackson et al., 2002). Generally, learning evaluation aims to determine the effectiveness and efficiency of the learning system as a whole, including the quality of the questions used for evaluation. Good-quality questions are those that can measure and describe students' thinking abilities, including questions that are appropriate in terms of cognitive level and difficulty.

Methods
To obtain information about the level of difficulty and the suitability of fraction questions for elementary school students, this research used descriptive analysis with a cross-sectional design, in which data were collected at a single point in time. The research subjects were fourth-grade students from five elementary schools in Tasikmalaya City. Data were collected from 25 male students and 25 female students. The data were taken from the answers to 30 valid multiple-choice questions based on the basic competencies in the fractions material. The following are the basic mathematics competencies in fractions for fourth-grade elementary school students.
Table 1. Basic mathematics competencies in fractions in elementary school

No  Basic Competencies
1  Explain equivalent fractions with pictures and concrete models.
2  Describe the various forms of fractions (common, mixed, decimal, and percent) and the relationships between them.
3  Explain and perform estimations of the sum, difference, product, and quotient of two whole numbers as well as fractions and decimals.
4  Identify equivalent fractions with pictures and concrete models.
5  Look for different forms of fractions (ordinary, mixed, decimal, and percent) and the relationships between them.
6  Solve problems estimating the sum, difference, product, and quotient of two whole numbers as well as fractions and decimals.
Furthermore, the quality of the test questions, that is, their level of difficulty and appropriateness, was measured. The data collection and analysis procedure is shown in Figure 1, which was modified from Hamdu et al. (2020).

Figure 1. Data analysis stages
The stages of analysis in Figure 1 begin with preparing the research instrument in the form of written test questions as a tool to obtain research data. Next, the data are taken from the students' answers to the questions. This study focuses on the stages of data analysis marked with a red box in Figure 1. The level of difficulty and suitability of the items was determined using Rasch modeling assisted by the Winsteps 3.75 application. Rasch modeling uses the following categorization standards:
1. Very hard question: the logit value is greater than +1 SD.
2. Hard question: the logit value is between 0.0 logit and +1 SD.
3. Easy question: the logit value is between -1 SD and 0.0 logit.
4. Very easy question: the logit value is less than -1 SD.
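The categorization standards above can be sketched as a small function. This is an illustration only: the item labels and logit measures are hypothetical, the SD is assumed to be the population standard deviation of the item measures, and the treatment of values falling exactly on a boundary is also an assumption, since the text does not specify it.

```python
from statistics import pstdev


def categorize_items(measures: dict) -> dict:
    """Label each item by comparing its logit measure with 0.0 and +/-1 SD."""
    # Assumption: SD is the population standard deviation of the item measures.
    sd = pstdev(list(measures.values()))
    categories = {}
    for item, logit in measures.items():
        if logit > sd:
            categories[item] = "very difficult"
        elif logit >= 0.0:  # assumption: 0.0 itself counts as "difficult"
            categories[item] = "difficult"
        elif logit >= -sd:  # assumption: exactly -1 SD counts as "easy"
            categories[item] = "easy"
        else:
            categories[item] = "very easy"
    return categories


# Hypothetical item measures in logits (not the study's Table 2 values).
example = {"Q29": 2.1, "Q9": 0.6, "Q7": -0.4, "Q1": -2.3}
print(categorize_items(example))
```

With these hypothetical values the SD is about 1.6 logits, so the four items fall into the four categories in order.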

Results
The following are the results of the analysis of test questions through Rasch modelling assisted by the Winsteps 3.75 application.

Analysis of Item Measure Difficulty Levels
The red squares in Table 2 provide information on the level of difficulty of each fraction item as follows.
1. The group of very difficult items, namely questions no. 29 and 30
2. The group of difficult items, including questions no. 9, 4, 22, 16, 13, 23, 18, 19, 20, 25, 15, and 21
3. The group of easy items, namely questions no. 7, 17, 24, 26, 28, 11, and 6
4. The group of very easy items, namely questions no. 2, 10, 12, 3, 14, 8, and 1
Questions no. 29 and 30 are very difficult questions, and question no. 1 is a very easy question. The question is as follows: There are 60 students in the fourth grade of Mekar elementary school. On this day there will be 4 students. Each of them will bring 1/60 of 50 kg of rice to make nasi liwet (liwet rice) and eat it together. Therefore, the fourth-grade students of SD Mekar will cook liwet rice as much as ….. part of 50 kg of rice.
Indicator: Determine the sum, difference, product, and quotient of fractions.
3. Easy (question no. 6): A fraction equivalent to …
The indicators for the form of test questions at each level of difficulty are based on Table 3, namely: 1) show equivalent fractions with pictures and concrete models (very high); 2) determine the sum, difference, product, and quotient of fractions (high); 3) explain equivalent fractions (medium); and 4) pronounce the value of the fraction (low).

Figure 2. Wright map
Based on the Wright map, the very easy question (S1) is still considered very easy by the student with the lowest ability (50PE). Nine students with high abilities (07PB, 29LD, 12LC, 09LB, 16LB, 18LB, 28PD, 27PB, 33PB) can easily answer the very difficult questions.

Analysis of Item Fit Order
The level of fitness of the fraction test questions (Item Fit) can be determined by using three criteria, namely the outfit mean-square value (Outfit MNSQ), the outfit Z-standardized value (Outfit ZSTD), and the point measure correlation (PT-Measure Corr) (Bond & Fox, 2015). The criteria used to check for non-conforming items (outliers or misfits) include the accepted range of the outfit mean-square (Outfit MNSQ): 0.5 < MNSQ < 1.5.
Table 4. Item statistics: Misfit order
The red square in Table 4 shows that items 30, 15, and 1 do not meet the MNSQ value; all items meet the ZSTD value; and items 15, 26, 14, 1, and 8 do not meet the PT-Measure Corr value. However, no item fails all three criteria. All the fraction questions analyzed therefore have an acceptable level of conformity and deserve to be maintained. As an additional explanation, the fact that the ZSTD value of every question is within the allowed limit is the reason all items are worth retaining.
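To make the first criterion concrete, the outfit mean-square of an item can be computed as the average squared standardized residual across persons. The sketch below assumes the dichotomous Rasch model and uses hypothetical responses and ability estimates; it is not a reproduction of how Winsteps produces Table 4. Only the MNSQ range stated in the text is checked, because the accepted ranges for ZSTD and PT-Measure Corr are not given here.

```python
import math


def outfit_mnsq(responses, abilities, difficulty):
    """Outfit mean-square for one item: mean of squared standardized residuals."""
    total = 0.0
    for x, theta in zip(responses, abilities):
        p = 1.0 / (1.0 + math.exp(-(theta - difficulty)))  # expected score
        total += (x - p) ** 2 / (p * (1.0 - p))            # squared z-residual
    return total / len(responses)


def mnsq_acceptable(mnsq, low=0.5, high=1.5):
    """Check the accepted range 0.5 < MNSQ < 1.5 stated in the text."""
    return low < mnsq < high


# Hypothetical data: four persons whose ability equals the item difficulty.
value = outfit_mnsq([1, 0, 1, 0], [0.0, 0.0, 0.0, 0.0], 0.0)
print(value, mnsq_acceptable(value))

# A very able person unexpectedly failing an easy item inflates the outfit.
surprise = outfit_mnsq([0], [3.0], 0.0)
print(surprise, mnsq_acceptable(surprise))
```

Because outfit is an unweighted average, a single highly unexpected response can push an item outside the accepted range, which is why it is described as sensitive to outliers.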

Discussion
Calibrated assessment instruments can verify student response patterns appropriately. In this case, Rasch modeling in principle precisely calibrates the measurement scale, respondents, and items. Rasch modeling works by processing the score data per person and the score data per item (question). The two sets of scores become the basis for estimating the true score on a logit scale, which can indicate the level of difficulty of the item. Raw scores processed through Rasch modeling produce true (logit) scores that are measured on an equal-interval scale.
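The step from a raw score to the logit scale can be illustrated with the log-odds transformation. This is a simplified sketch with hypothetical scores: it shows only the basic transformation, whereas Rasch software estimates person and item measures jointly and iteratively.

```python
import math


def raw_score_to_logit(correct: int, total: int) -> float:
    """Convert a raw score to log-odds: ln(p / (1 - p)), with p = correct / total."""
    if not 0 < correct < total:
        # Zero and perfect scores have no finite logit value.
        raise ValueError("score must be strictly between 0 and total")
    p = correct / total
    return math.log(p / (1.0 - p))


# Hypothetical raw scores on a 30-item test.
for score in (6, 15, 24):
    print(score, round(raw_score_to_logit(score, 30), 2))
```

Equal steps in raw score do not correspond to equal steps in logits, which is the sense in which the logit scale, unlike the raw score, has equal intervals.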
Processing assessment instruments through Rasch modeling allows them to be examined more specifically. Students' competence and lack of competence in a given material can be identified by focusing on the test instrument for that material. In this article, the author analyzes the fraction test instrument based on the basic competencies in the fractions curriculum used in teaching fourth-grade elementary school students.
The results of the Rasch modeling analysis show that students find the competence of explaining equivalent fractions with pictures and concrete models very difficult. In skill competence, students find it difficult to solve problems estimating the sum, difference, product, and quotient of two whole numbers as well as fractions and decimals. However, also in skill competence, students find it easy to identify equivalent fractions with pictures and concrete models. Moreover, in knowledge competence, students consider describing the various forms of fractions (ordinary, mixed, decimal, and percent) and the relationships between them very easy.
The Rasch modeling analysis of all the test questions shows that, based on the level of difficulty, the items are divided into four categories: very difficult, difficult, easy, and very easy. The items' difficulty levels were categorized by combining the standard deviation (SD) value and the mean logit value (Sumintono & Widhiarso, 2015). Based on the explanation and analysis of the item difficulty levels in the results section above, the questions can be used to measure students' mathematical knowledge and skills.
There are three criteria (MNSQ, ZSTD, and PT-Measure Corr) that can indicate whether a question is not good enough and needs to be repaired or replaced (Biggs & Collis, 2014; Jackson et al., 2002). In these findings, no item failed all three criteria. Therefore, all the fraction items analyzed have an acceptable level of conformity and deserve to be maintained. In addition, the measure order in Table 2 shows that questions that were previously written in the difficult, medium, and easy categories fell into the very difficult, difficult, easy, and very easy categories. The Wright map in Figure 2 shows that for nine students (07PB, 29LD, 12LC, 09LB, 16LB, 18LB, 28PD, 27PB, 33PB) the very difficult questions were still below their abilities, meaning that these questions were not very difficult for them. Likewise, the question categorized as very easy (no. 1) was answered very easily by the student with the lowest ability (50PE), meaning that the question was very easy even for that student. However, the distribution of questions found on the Wright map is still proportional to the original difficult, medium, and easy categories. The questions are more widely distributed in the difficult and easy categories, and there are two very difficult questions (no. 29 and no. 30) and one very easy question (no. 1).
The results of the Rasch modeling analysis show that the level of difficulty of the fraction test questions is appropriate for each category, namely very difficult, difficult, easy, and very easy. Various levels of item difficulty are needed to identify students' various abilities. For questions in the very difficult category, the percentage of students with correct answers tends to be the lowest compared to the other categories. However, in the difficult, easy, and very easy categories, the percentage of students who answered correctly tended to vary, with several possible causes, such as students answering questions by guessing or cheating. These possibilities can occur in some questions classified as difficult, easy, and very easy because of the way the questions are presented or because of differing levels of student ability.

Conclusion
Rasch modeling assisted by the Winsteps 3.75 application makes it easy for users to analyze an assessment instrument (test questions): the raw scores are processed, and the application provides explicit analysis of the design of the items used and of whether the score pattern is appropriate. Based on the results, it can be concluded that the test questions showed various levels of difficulty with an acceptable level of fitness. The difficulty levels of the fraction questions were as follows: 1) the majority of hard items were found in the indicator of the ability to show equivalent fractions with pictures and concrete models; 2) the majority of very hard items were found in the indicator of the ability to determine fractions based on pictures, determine the product and quotient of fractions, and explain the form of mixed fractions; 3) the majority of easy questions were found in the indicator of the ability to determine the sum and difference of fractions; 4) the majority of very easy questions were found in the indicator of the ability to pronounce equivalent fractions.
The results of the Rasch modeling analysis provide a comprehensive description of the quality and categories of the questions. They can give an idea of students' mastery of the material that has been studied. Furthermore, they can provide an overview of how mathematics learning is carried out on specific materials and at certain times. Therefore, the results of the Rasch modeling analysis can describe learning conditions and situations, such as those relating to student characteristics and the implementation of mathematics learning in certain classes or schools. Thus, Rasch modeling analysis can help identify the learning process and create comprehensive tests.

Conflicts of Interest
The author declares that there is no conflict of interest regarding the publication of this manuscript. In addition, the ethical issues, including plagiarism, misconduct, data fabrication and/or falsification, double publication and/or submission, and redundancies, have been completely observed by the author.