Instrument

From The Learning Engineer's Knowledgebase

In evaluation and assessment, an instrument is anything that is used to collect data. This can be an assignment, test, video, audio, or automated data collection method for which the collected data is used in an analysis. Instruments are typically planned and designed before an educational product or experience is implemented, so that data can be collected during implementation and the product's performance evaluated during or after it.

Definition

An instrument is any object or method that collects data from people who use an educational product. Data that are collected from instruments are subsequently analyzed to answer research questions during evaluations. These data sources are used to provide evidence for whether a product was used as intended, or whether a person learned from the product, among many other evaluation research questions.

Additional Information

For evaluation purposes, instruments can be anything that collects data about how a participant uses an educational product or about what the participant knows (knowledge), knows how to do (skills), or is feeling and perceiving (psychology and affect).

Data, as collected by instruments, are directly tied to the type of analysis method that is used to answer a research question. When an evaluation is designed, the evaluator must first generate research questions, and then select combinations of instruments and analysis methods to answer those questions. Each analysis method requires specific types of data, and thus requires appropriate instruments to collect those data.

An instrument needs to be carefully thought out and designed so that evaluators can be confident that the data it collects reflect the concepts being studied and the research questions being asked. Specifically, an instrument must generate data (or evidence) that (1) validly capture the concepts being investigated, in that they are a true reflection or representation of those concepts; (2) are collected reliably, in a repeatable, systematic process that other evaluators can understand and follow; and (3) come from a well-documented data collection process, so that people can trust that the data truly reflect the concepts being studied.
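For example, one simple way to document reliability is to have two evaluators independently code the same set of responses and report how often they agree. The sketch below (in Python, with hypothetical rater data) computes simple percent agreement; it is only an illustration of the idea, not a complete reliability analysis.

```python
# Minimal sketch: percent agreement between two independent raters who
# coded the same five responses. The codes below are hypothetical.

def percent_agreement(rater_a, rater_b):
    """Share of items on which two raters assigned the same code."""
    if len(rater_a) != len(rater_b):
        raise ValueError("Raters must code the same number of items")
    matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return matches / len(rater_a)

rater_a = ["theme1", "theme2", "theme1", "theme3", "theme2"]
rater_b = ["theme1", "theme2", "theme3", "theme3", "theme2"]

print(percent_agreement(rater_a, rater_b))  # 0.8
```

In practice, evaluators often go beyond raw agreement to chance-corrected statistics, but even a simple documented agreement check helps others trust the data.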

Data are often transformed from their original form

Instruments also typically require data transformation after the data are collected to make the data usable for analysis. Data transformation is when data are manipulated, operated on (e.g., summed or subjected to other mathematical operations), combined, reduced, or otherwise changed so that they are understandable and usable for the particular analysis method being used. In their raw form, as directly collected from an instrument, data may not be usable for drawing any insights.

Data transformation is particularly common for quantitative analysis, which requires most, if not all, of the data to be reduced to variables that can be analyzed in numeric or categorical form. This also requires measuring the concepts being studied so that each variable can be assigned a value, such as the level of participation or the amount of competency a person showed in answering a question (i.e., whether or not they got the correct answer).

Variable measurement and transformation requires changing the raw data sources into single variables with one value for each participant. For instance, a transformation of data would change a single correct response of a "B" on a test question to a "1". Multiple variables or "items" can additionally be combined or reduced into new variables, such as a variable for "overall test score" or "total self-confidence" score after items have been measured.
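The transformation described above can be sketched in code. The following Python snippet (with hypothetical question IDs and responses) scores raw letter answers against an answer key as 1 or 0, then sums the item scores into a single "total score" variable for each participant.

```python
# Sketch of test-score transformation: raw letter responses are scored
# against an answer key (correct -> 1, otherwise 0), then summed into a
# total score per participant. All IDs and answers are hypothetical.

answer_key = {"q1": "B", "q2": "D", "q3": "A"}

raw_responses = {
    "participant_1": {"q1": "B", "q2": "D", "q3": "C"},
    "participant_2": {"q1": "B", "q2": "B", "q3": "A"},
}

def score_items(responses, key):
    """Transform raw answers into 0/1 item scores."""
    return {q: int(responses.get(q) == correct) for q, correct in key.items()}

scores = {p: score_items(r, answer_key) for p, r in raw_responses.items()}
totals = {p: sum(items.values()) for p, items in scores.items()}

print(totals)  # {'participant_1': 2, 'participant_2': 2}
```

The per-item 0/1 scores are themselves new variables, and the totals are a further reduction, mirroring the two transformation steps described above.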

In some analysis methods, particularly qualitative methods, evaluators actually perform an analysis while the data are being transformed. Themes, patterns, and other insights are identified within complex data sets. The evaluator transforms the data, creating new categories or variables for the themes, patterns, and insights they observe, at the same time as they investigate and analyze the data sets.

For example, when evaluators using qualitative methods analyze dialogue or open-ended responses, they are typically looking for, identifying, and categorizing by marking down next to the data (i.e., "coding") any common themes, descriptions, and patterns that they observe in the data. As a result, the evaluators will often create new data transformations, reductions, and categories that reflect the themes and patterns that they identify. A qualitative evaluator may also examine data sources for the presence or absence of expected items and record where these items exist in the data. This gives the evaluator a checklist or set of variables that represent certain qualities of the data, which is a new set of data that were transformed from the original raw data.
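As a simplified illustration of the presence/absence coding described above, the Python sketch below checks hypothetical open-ended responses against lists of expected theme keywords. Real qualitative coding is interpretive human work; keyword matching here only stands in for that judgment.

```python
# Minimal presence/absence "checklist" coding pass over open-ended
# responses. Themes, keywords, and responses are hypothetical; actual
# qualitative coding relies on human interpretation, not string matching.

expected_themes = {
    "collaboration": ["group", "team", "together"],
    "confidence": ["confident", "sure", "comfortable"],
}

responses = [
    "Working together with my team helped a lot.",
    "I felt more confident after the second activity.",
]

def code_response(text, themes):
    """Record which expected themes appear in one response."""
    lowered = text.lower()
    return {theme: any(k in lowered for k in kws) for theme, kws in themes.items()}

coded = [code_response(r, expected_themes) for r in responses]
print(coded)
```

The result is a new set of categorical variables (one per theme, per response) transformed from the original raw text.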

Common types of instruments and the data they collect

There are multiple types of instruments that are commonly used to collect data for educational evaluation to assess how and why people interact with educational products and whether people learn from an educational experience.

  • Tests and quizzes
    • This is perhaps the most common form of assessment instrument used in education. People are asked specific questions and give specific responses, and each question is answered either correctly or incorrectly. A correct answer thus indicates that a person knows, or demonstrates competency with, that item and its topic. A question without a correct answer is not helpful when testing people's knowledge or skills with a test or quiz instrument. However, partial credit can be awarded on test items if a response demonstrates some, but not all, of the expected correct qualities. Repeated identical testing can reveal patterns of change: a person who scores higher on a second administration than on the first can be said to perform better, or know more, than they did the first time. This pre-post test design is among the most common methods for demonstrating learning, change, and growth in experiments.
    • Correct answer questions are those in which there is only one correct answer. Questions can either ask for declarative knowledge, such as a definition or fact, or require the respondent to perform a task to generate the correct response, such as solving a math problem. Because there is a single correct response, such questions can be automatically scored and the data transformed into scores by a computer (thus saving significant human labor). It is for this reason that correct answer tests and quizzes are popular with teachers who have busy schedules and cannot spend much time on exam grading. Examples of correct answer questions include multiple choice, true/false, item matching, fill in the blank, and short answer. The response format may vary, but all correct answer questions can be assigned a point value (e.g., each correct answer is worth "1") and summed to get a total score.
    • Open answer and essay questions can be scored for correctness, depth of knowledge, presence of expected concepts, length of response, or demonstration or description of skills within the text. The respondent provides a text response, and the evaluator must transform the data into a number or code that indicates the quality of the response.
  • Surveys, interviews, and focus groups
    • Surveys are written instruments that can be completed asynchronously by participants without the involvement of an interviewer or surveyor. A person responds to survey questions about their feelings and emotions, what they perceive, how they act or think, or virtually any aspect of their knowledge, behavior, or attitudes. Surveys are similar to tests in that they ask questions, but survey questions tend to have no single correct answer (else the survey would be a test). Instead, surveys gauge or measure a person's psychological attributes, intentions, behaviors, goals, or other factors relevant to learning that are difficult to observe - basically anything that is going on inside an individual's head. A survey can be printed on paper, or it can be administered and stored digitally in an electronic form.
      • Ranked responses (also called Likert scales) ask people to rate perceptions or feelings on a scale, usually from, say, 1 to 4. Each such question yields a numeric value that can be analyzed mathematically. Surveys are a useful tool for capturing evidence on a person's internal psychological and affective states, what they personally perceive, and their thought processes - all things that happen inside the brain of the individual. Examples include asking people's level of satisfaction with an activity (1 being low and 4 being high), their perception of how present a teacher was during a course, or their personal level of confidence in completing the tasks.
      • Open-ended responses allow respondents to provide open feedback and thoughts. In contrast to ranked responses, open-ended questions provide textual data in which the respondent elaborates on what they are thinking and describes their response in detail. The result is a block of text, which can be scored and analyzed based on the kinds of evidence the evaluation team is looking for.
    • Interviews allow an interviewer to ask questions about people's knowledge, perceptions, attitudes, and behavior, and to receive synchronous responses. The added benefit of an interview instrument is that although questions are often predetermined in advance (i.e., a structured or semi-structured interview), the interviewer can also ask follow-up questions in response to the participant's answers. This follow-up allows the respondent to provide additional details about their perceptions, thoughts, and ideas that would not necessarily be captured in a survey.
    • Focus groups. A focus group is an interview with many people who are encouraged to respond to questions collaboratively, share their thoughts and perspectives, and (through speaking) provide a detailed description in response to the questions. Focus groups can also be asked to collaboratively perform tasks so that the interviewer can observe how and why people interact the way they do and so that the interviewer can ask questions while the group works on the task.
  • Rubrics
    • Rubrics are often used to evaluate the quality of (1) a performance, such as participation or skills, or (2) a work product, such as an assignment or project. Rubrics provide numeric rankings of the quality and quantity of the item being assessed, rather than simply whether it is correct or incorrect. The rubric instrument requires evaluators to define multiple criteria with which to assess the work product, as well as a scale for each criterion indicating the level demonstrated in meeting it. For instance, a rubric for an essay might have six different criteria that evaluate how well a person demonstrated their knowledge of a history topic. For each criterion, a number from, say, 0 to 3 would be given as a score based on how well the person demonstrated that criterion, with 0 being the lowest score (i.e., does not demonstrate any competence) and 3 the highest (i.e., demonstrates competence above expectation).
    • Rubrics can be used to evaluate performance or quality by the learner themselves (self-evaluation), by peers within the activity (peer evaluation), or by a teacher, instructor, or technology (instructor evaluation).
  • Assignments, projects, and work products
    • Any assignment or something that a participant produces during an educational experience is also an instrument itself, as it collects data and records the effort of the person's work (i.e., the work product or project).
    • The work products of a participant, also called "artifacts" in the evaluation field, can be analyzed for how well the work products demonstrate participants' knowledge or skills. Additionally, work products can be evaluated as to whether the assignment, project, or work product meets the expected level of output. The raw data in this case is the actual project, assignment, or work product itself that was generated by the participants. The work product can then be further analyzed to investigate how and why the person's work demonstrates learning objectives. Rubrics are frequently used to evaluate a work product for various qualities.
  • Participation level observation
    • A person should be expected to participate in a learning experience, else the experience will have no effect on them. A person's level of participation can then be evaluated to see in what ways they are participating and to what degree.
    • Rubrics for measuring and categorizing the level or quality of participation can be used to document participation in different tasks and activities.
    • Simply counting the number of times that someone does something is an easy way to measure participation. Documenting the tallies, counts, or frequencies of specific actions is a type of instrument for collecting data on participation (such as with learning analytics in digital technologies, or even just by observing participants in a physical space and making a tally mark on a clipboard whenever someone does something of interest).
  • Logfile and digital click data
    • Every action taken in a digital system, along with a timestamp of when it occurred, can be captured and recorded by software in logfiles. Every click, video watch, and even the amount of time spent on each page can be recorded in a logfile. Analytics is the field of using digital logfile data to analyze the participation and behavior of a person in a learning experience.
    • Digital logs can also be monitored by software to regularly evaluate participation within a learning environment and take actions with the learner to increase participation and support learning achievement. The use of formative assessments is particularly suited for this task.
  • Dialogue and text
    • The dialogue and conversational interactions between participants and teachers can be a useful source of data about a person's participation in a learning environment and evidence of their knowledge, competency, or even learning. Dialogue can be captured in many ways:
      • Recorded dialogue via digital media, such as through video or audio, allows the evaluator to capture discussions that occur in the moment synchronously without bothering the participants. This could also be through the capture of a telephone call, webinar, or video call. The raw data can be transformed into transcripts, which then could be used for text-based analyses to identify the presence of items, themes, or concepts. Additionally, face-to-face discussions and video calls often generate additional non-verbal cues and gestures that could be useful to understanding the participants' communications if further identified, coded, and analyzed.
      • Digital text-based conversations have the benefit of being either synchronous (e.g., instant messaging) or asynchronous (e.g., email, DMs, letter writing). Because they occur in digital spaces, the conversations and their content are automatically captured as data to be stored and delivered to participants in the technology system. This reduces the need for additional data transformation through transcription, but it does require participants to have the literacy and writing skills needed to participate at the expected level.
      • Notes by an observer. Alternatively, an observer or evaluator can document any conversational topics, notes, or live transcription related to dialogue that is occurring in a learning environment. Although there is some data loss in comparison to recording the whole conversation verbatim, sometimes it is the only instrument that is possible, especially if participants do not consent to an audio or video recording.
  • Checklists
    • Checklists are simple instruments that can be used to assess many kinds of aspects or qualities of a person, their participation, or their work. This includes evaluating tasks, behavior, interactions, or work products for whether they have the qualities indicated by the items on the checklist. In its simplest sense, a checklist details the items that are expected to be present in a work product or activity from a participant. The checklist is completed by some observer (the student themselves, a peer, a teacher, an external observer, or a computer system), who indicates whether each item on the checklist is present or not.
  • Reflections
    • Reflections are open-ended and freestyle activities that encourage a participant to reflectively think about their experience and how it applies to other contexts, as well as to generate lessons from the experience. A reflection instrument will typically prompt a person with specific questions to help them (1) think back on their experience; (2) make connections between activities, concepts, and real life; and (3) generate insights, lessons, and specific principles about their experience.
    • Reflections as instruments provide data on people's perceptions, thought processes, feelings, experiences, and perceived value or use of learning experiences. Reflections can also be a source of evidence of how people participated in the learning exercise, including specific actions that the participant remembers.
  • Observations (general)
    • In observations, an evaluator will observe a person or group doing specific tasks or activities. The participants can be given specific instructions to follow, to see how well they can carry out those instructions, or they can be asked to perform a task generally, so the observer can identify how they approach it. The observer will take detailed notes about what they observe over a defined period of time, or use a predefined list of criteria to observe. For educational evaluation, observations often include people's behaviors and actions, the environmental and contextual circumstances of the learning experience, people's interactions with each other, and any interactions with technology, media, or other tools.
  • Task analysis and think aloud
    • Think aloud is an instrument that is used to collect data on how people perform tasks, as well as on what they are thinking while working on the task. Participants are asked to speak out loud descriptions of their thinking processes, what they are attending to or perceiving, their goals, and generally how they are going about completing the task. An observer may ask follow-up questions to further "make the participant's thinking visible" and to capture data about the internal psychological processes and factors involved in completing tasks and learning. Think aloud can either be (1) conducted in real time while a person is performing the task, or (2) the performance can be recorded (usually with video) and the participant can retroactively describe what they were thinking, how they did things, what they noticed, and why they performed the way they did as they watch the video of their performance.
    • Observational task analysis is a common lab method for observing how people behave, interact, and use knowledge and skills over a period of time when given specific instructions.
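Several of the instruments above, particularly logfiles, yield data that can be transformed programmatically. As an illustrative sketch (with hypothetical event names and timestamps), the Python snippet below turns raw timestamped page-view events into time spent per page, a common analytics-style reduction of click data.

```python
# Sketch of a logfile transformation: raw (timestamp, event, page)
# records become seconds spent per page. The log entries and event
# names here are hypothetical illustration data.

from datetime import datetime

log = [
    ("2024-01-01 09:00:00", "page_view", "intro"),
    ("2024-01-01 09:03:30", "page_view", "lesson1"),
    ("2024-01-01 09:10:00", "page_view", "quiz"),
]

def seconds_per_page(events):
    """Time between consecutive page views, credited to the earlier page."""
    times = [(datetime.strptime(t, "%Y-%m-%d %H:%M:%S"), page)
             for t, _, page in events]
    return {
        page: (times[i + 1][0] - stamp).total_seconds()
        for i, (stamp, page) in enumerate(times[:-1])
    }

print(seconds_per_page(log))  # {'intro': 210.0, 'lesson1': 390.0}
```

Note that the last page visited gets no duration, since there is no later event to mark when the participant left it - one of many judgment calls a real analytics pipeline must document.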

Tips and Tricks

  • To answer research questions in an evaluation, there is no exact science to selecting an instrument to match the analysis methods that will be used. Proper selection of instruments and analysis methods relies on a thoughtful evaluation plan that considers all of the contexts of the study, what questions are being asked, and what kinds of evidence need to be shown in the data to make strong claims about how people use an educational product or how well people learned from the experience.
  • Think about what types of knowledge or behaviors you want participants to show, and then consider which instruments from the list above can collect that data. Make sure that you are also aligning the data you collect to the analysis method that you will use.
  • Sometimes, evaluations are limited by the types of data that the environment can reasonably collect. It can be helpful to consider what types of data you have the ability to collect, and not just what you need to collect. Sometimes the available data capture opportunities in a learning environment can help evaluators identify methods that will help them answer useful research questions.
  • Don't go overboard on data collection if you can help it. Many instruments and data collection approaches often require human work to collect the data or to transform and analyze the data. You don't want to create more work than is feasible for yourself when selecting instruments and data collection approaches for your analysis methods!

Related Concepts

Examples

None yet - check back soon!

External Resources

None yet - check back soon!
