Evaluating learning and competency outcomes
From The Learning Engineer's Knowledgebase
The evaluation of learning and competency outcomes is a category of research questions that investigates whether an educational product was effective at helping people learn or demonstrate competency.
Definition
Specifically, the evaluation of learning and competency outcomes is a category of research questions that investigates whether people demonstrated competency, or whether they learned, as a result of participating in an educational product or experience.
Competency is defined in this case as the ability to demonstrate knowledge, skills, or other outcomes at a single point in time.
Learning is defined in this case as a positive observed change in competency over time, typically as the result of participating in an educational product or learning experience. Learning is usually measured with an identical instrument administered at multiple points in time (most often at pre and post timepoints), and data from the later time point are compared to data from the earlier time point(s) to see if there is an improvement.
If there is an increase in competency over time, then it can be said that someone demonstrated a positive change (or that they learned).
Evaluations on learning and competency outcomes should be related to the established learning objectives of an educational product.
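The pre-post definition of learning above can be sketched as simple gain scores. This is a minimal illustration with hypothetical participants and scores, not a prescribed procedure:

```python
# Minimal sketch of learning as a positive change in competency over time.
# All scores are hypothetical results from an identical pre/post instrument.
pre_scores = {"p1": 10, "p2": 14, "p3": 9}
post_scores = {"p1": 15, "p2": 16, "p3": 13}

# Gain = later competency minus earlier competency for each participant.
gains = {pid: post_scores[pid] - pre_scores[pid] for pid in pre_scores}

# A positive gain is taken as evidence that the participant learned.
learned = {pid: gain > 0 for pid, gain in gains.items()}
print(gains, learned)
```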
Additional Information
Competency evaluation is one of the most common forms of evaluation performed on educational products and experiences.
In most learning situations, an individual participant is assessed at the end of a learning experience to see whether they can satisfactorily perform the skills or recite the knowledge that the experience aimed to teach (i.e., the learning objectives). This is readily seen in K-12 and higher education settings, where final exams, projects, and papers all ask students to perform specific tasks and demonstrate their competency with the material covered in the class. A student performs well in this type of evaluation, and demonstrates mastery, if they can adequately produce the types of behavior that are desired, such as answering multiple choice questions correctly or including specific knowledge in a term paper.
Measuring competency can be a challenging task for evaluators. Evidence of a learner's skills and knowledge must be found before any judgement of competency can be made. The instruments used to collect data must require learners to perform the skills and actions, and recite the knowledge, specified in the learning objectives of the educational experience.
To measure learning, or to identify whether it occurred, an evaluator needs to track a participant's performance over time. This can be done through a simple pre-post assessment design, or through a time series or repeated measures design if there are more than two time points. A participant's activity can also be qualitatively analyzed for micro-level changes in participation and competency over time.
Common Research Questions
In this category of research questions, the actual questions typically take one or more of the following general forms:
- Did the product cause people to achieve the learning objectives?
- Did people actually learn from the educational product?
- What was the effect of the product?
- Did people have the expected level of competency at the end of using the product?
- How did people's competency in knowledge and skills change over time while using the product?
- Did the educational experience influence a person's affect, beliefs, or dispositions?
Common Instruments and Data
There are a variety of instruments and data sources that are commonly used for this category of research questions:
- Tests, quizzes, and surveys that require students to identify, recite, and describe their knowledge, as well as use skills to perform tasks or procedures. Performance is scored by how well a student does on test items compared to the expectation (such as finding the correct answer or including all the required information). A total score of correctly answered items can be summed to indicate overall competency. Learning can be inferred if an identical test is administered at multiple time points and the results are compared.
- Projects and documents made by learners. Completing a project requires the learner to use specific skills. Elements of the project that are present or absent can be used as evidence that a learner knows how to do something (or as evidence of learning, if projects are assigned at multiple time points and a positive change is observed).
- Evidence of participation in social settings, which can be used to examine the quality of a learner's interactions, depending on the learning outcomes. An individual's participation may be evidence for achieving some types of learning objectives, such as whether the learner can take on various roles in a social setting, make constructive arguments, or share information productively.
- Analytics and logs, which capture records of how a person uses a digital technology or media. If the learner demonstrates that they can perform certain tasks with the technology or media, such as using it skillfully or following procedures properly and in the correct sequence, then competency might be evidenced in this way (remember, learning is often evaluated as the demonstration of competency measured at multiple time points).
- Rubrics, which can be used to clearly state how competency is measured (or learning, if applied at multiple time points). Each rubric item usually ranges from 0 to a higher number (3 or 4, for instance), with higher scores indicating more demonstrated competence on that item.
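As a hedged illustration of how instruments like those above are typically scored (all item names, answer keys, and values here are hypothetical):

```python
# A test: the sum of correctly answered items indicates overall competency.
answer_key = ["a", "c", "b", "d"]
responses  = ["a", "c", "d", "d"]
test_score = sum(1 for key, resp in zip(answer_key, responses) if resp == key)

# A rubric: each item ranges from 0 to a maximum (here 0-4);
# higher totals indicate more demonstrated competence.
rubric_max = 4
rubric_scores = {"argument": 3, "evidence": 4, "organization": 2}
rubric_total = sum(rubric_scores.values())
rubric_pct = rubric_total / (rubric_max * len(rubric_scores))

print(test_score)               # items answered correctly
print(rubric_total, rubric_pct) # total rubric score and share of maximum
```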
Common Variables and Concepts of Interest
To answer research questions in this category, evaluators are typically interested in measuring and examining one or more of these concepts or variables:
- Competency, which is typically a single measure of whether a person can demonstrate knowledge, skills, or other behaviors at a single point in time. Any competency variables are typically tied to the learning objectives of the educational product.
- Learning, which in quantitative methods is typically measured as a dependent variable of competency (i.e., a post-test) compared to an earlier measured variable of competency (i.e., a pre-test). Learning variables are typically reflective of the learning objectives that the product seeks to achieve or influence. Learning can also be a concept of interest in qualitative studies, where competencies are investigated at a micro level and evaluated for changes as people participate with a product and learning experience. The analysis of learning by default uses the competency variables discussed above: learning is a process of change over time, which requires repeated demonstration of competency at multiple time points.
- Influential factors, which are anything that can be argued might influence, support, or interfere with a person's achievement of the learning objectives (as measured by learning or competency variables - see above). Influential factors are also often called moderating or mediating factors that influence competency and learning outcomes. Examples of influential factors that are commonly examined include age, gender, income level, place of residence, and measures of obstacles, perceptions, interests, and motivations. These influential factors are typically investigated simultaneously with any competency or learning variables mentioned above, such as through the inclusion of influential factors as independent variables in statistical approaches.
- Participation, which is the degree to which a person interacts with and participates in activities, or uses the educational product. Because it exists to help a person learn, an educational product by definition needs to be used for it to have any effect. As such, use of or exposure to the activities, technologies, and interfaces of a product can be measured to identify the degree to which the product was used in relation to achieving the learning outcomes.
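A minimal sketch of examining an influential factor alongside a learning variable, assuming hypothetical data in which participation level is self-reported:

```python
# Hedged sketch: grouping a learning variable (post - pre gain) by an
# influential factor (here, participation level). All data are hypothetical.
from statistics import mean

records = [
    {"pre": 8,  "post": 15, "participation": "high"},
    {"pre": 10, "post": 16, "participation": "high"},
    {"pre": 9,  "post": 11, "participation": "low"},
    {"pre": 12, "post": 12, "participation": "low"},
]

by_group = {}
for r in records:
    by_group.setdefault(r["participation"], []).append(r["post"] - r["pre"])

# Mean gain per participation level: a descriptive first look at whether
# the factor appears to relate to the learning outcome.
mean_gains = {group: mean(gains) for group, gains in by_group.items()}
print(mean_gains)
```

In a full analysis, such factors would typically enter as independent variables in a statistical model rather than as simple group means.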
Common Analysis Methods to Answer the Research Questions
Both quantitative and qualitative analysis methods are commonly used to measure and describe the degree to which educational products achieve the intended goals of competency and learning. Method choices should align with the intents and desires of the people who will be using the evaluation data. Quantitative data on effectiveness is typically preferred in most contexts, but both standalone qualitative methods and mixed methods that combine both are becoming increasingly practiced.
Note: It is beyond the scope of this knowledgebase to expand on each of these methods. It is recommended that researchers and evaluators seek additional training, web resources, or courses on individual methods they would like to use.
Quantitative Methods
- Observational Study, also often called a pre-experimental study. An observational or pre-experimental study is any study that measures and analyzes outcomes after an intervention (i.e., the educational product) is used by participants, but does not systematically compare the outcomes or performance of participants between two or more comparison groups or randomize its participants into groups[1]. The observational study is conventionally seen as weaker at making causal inferences than an experiment or quasi-experiment, but it is far easier and more cost-effective to conduct, and thus is acceptable as a method of evaluating the effectiveness of a product at achieving the desired competency and learning outcomes.
- Common types of observational or pre-experimental studies:
- One-group pre-post study. This is an evaluation where a single group of participants is analyzed for whether their post-scores on an instrument were higher than their pre-scores. Before participants begin an educational experience, they are given evaluation instruments (i.e., assessments), called the pre, that demonstrate their competency with knowledge and skills and their current state in other areas like affect and perception. After they participate with the educational product under study, the participants are given instruments identical to those given at the beginning, called the post. The post is then compared to the pre. Any statistical differences between the post and the pre can be used to infer that there was a change in the participant, that learning did occur, and that the learning outcomes were achieved as intended. The degree of observed change is called the effect size, which is often reported in the results as well. The effect size shows how much a person, on average, improved (or worsened) their score on the post compared to the pre.
- Self-selected or admin-selected multi-group pre-post study. This is similar to the one-group pre-post study, but two or more comparison groups are formed and differences in performance are examined between them. Observational studies may divide participants into groups for comparison, either by the evaluator choosing groups using some criteria (i.e., admin-selected) or by participants choosing their own groups (i.e., self-selected). However, such a design fails to be a true experiment or quasi-experiment if selection into groups is not random. For instance, an evaluation may divide participants into roughly equal groups by their level of participation, such as high, moderate, and low participation. In this case, participants essentially self-selected into a group based on their own chosen level of participation. As such, this example is not a true experiment, because the groups were not randomly assigned and selection was not systematically controlled before the start of the study.
- One-shot study. This is an evaluation where participants are evaluated only at the end of the experience (i.e., no pre-test is given). This type of study measures competency, and performance is typically compared to a set standard or criterion for what determines successful performance. For example, if a skill test is administered at the end of a person's participation with an educational product, a one-shot study would evaluate their level of competency on the skill test and whether it met a pre-determined expected level (as set by the evaluator). Similarly, the evaluator might desire a certain level of satisfaction from the participants, which could indicate whether the product was successful. The criteria or levels set by the evaluator for whether participants were successful can be arbitrary if not well rationalized and justified, as this design does not compare to past performance (i.e., a pre-test) or to an alternative condition.
- If an experimental study is not possible, or if an evaluator is in the early stages of evaluating the product, it is appropriate to conduct some type of observational or pre-experimental study. In fact, observational studies are probably the most common type of study used to demonstrate that a product "worked" as intended or to demonstrate competency or learning from a product. In such studies, the evaluator is typically concerned with understanding whether learning is observed (e.g., via a pre-post assessment design).
- Depending on the structure of the data, observational studies typically use t-tests, ANOVA, and regression statistical methods for ordinal-type and scale-type data, and chi-square and logistic regression methods for categorical/binary variable types.
- Experimental Study. When designed properly, an experiment or quasi-experiment compares a treatment group to an alternative or control group in a way that increases the evaluator's ability to claim that the product is the true cause of observed learning and other changes in participants.[2] This is useful for seeing whether the observed learning or competency occurred as a result of the educational product and not just by chance or other common factors. If all other aspects are the same between the comparison groups, and if assignment into the groups was random (experiment) or systematically controlled though not fully random (quasi-experiment), the educational outcomes of a product can be experimentally compared to an alternative approach (such as normal educational practice). Any observed differences can then be confidently attributed to the product. In other words, an experiment can let an evaluator say with confidence that an educational product performed better than alternative approaches. See the page on the research question category of comparison of educational products for more information on how experiments are a stronger way to demonstrate the effectiveness of a product, but are harder and more costly to conduct.
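The one-group pre-post comparison and effect size described above can be sketched with standard-library Python. The data are hypothetical, and in practice a statistics package (e.g., scipy.stats.ttest_rel) would typically be used:

```python
# Illustrative one-group pre-post analysis: a paired t statistic and a
# standardized effect size computed on the difference scores (sometimes
# called d_z). Scores are hypothetical results from identical instruments.
from math import sqrt
from statistics import mean, stdev

pre  = [10, 12, 9, 14, 11, 13]
post = [14, 15, 11, 17, 12, 16]

# Per-participant change from pre to post.
diffs = [b - a for a, b in zip(pre, post)]
n = len(diffs)

# Paired t statistic (df = n - 1): mean difference over its standard error.
t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))

# Standardized mean gain: mean difference in standard-deviation units.
effect_size = mean(diffs) / stdev(diffs)

print(t_stat, effect_size)
```

The t statistic would then be compared against a t distribution with n - 1 degrees of freedom to judge whether the gain is statistically significant.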
Qualitative Methods
- Basic qualitative analysis. In any basic qualitative analysis[3], the evaluator will sort and examine qualitative and quantitative data to identify common themes that are evident in the data. The identified themes will be sorted into categories, which can then be described in detail by the evaluator.
- Instead of attempting to statistically demonstrate that people learned or reached an adequate level of competence at the end of using a product, an evaluator doing basic qualitative analysis will attempt to identify and describe how a person learned from the experience or how they demonstrated competency. Additionally, by using systematic and well-documented criteria for analysis, the evaluator can descriptively argue that a product achieved its goals and outcomes.
- Qualitative analyses are a useful approach to understanding how and why products differ, and can provide a high degree of interpretability and "real-world" context for readers in comparison to some of the more quantitative methods. Quantitative methods also typically require large datasets of participants to make inferences, and an experimental approach may not always be possible, which can make a qualitative approach more realistic and appealing.
- Some evaluations require comparisons that use quantitative statistical analyses of outcomes, such as those required by funders or purchasers of educational products. However, in circumstances where it is appropriate, high quality qualitative results, with high validity and reliability, that argue for one product having better outcomes than another can also be achieved if the qualitative evidence is collected and analyzed systematically.
- Case Study. A case study is a systematic approach to richly describing an educational product and the factors that influence how people learn from it.[4] A multiple case study is when two or more "cases", or educational products, are evaluated simultaneously and compared from multiple perspectives.
- A case study will investigate and detail the design of a product, how people used the product, and what kinds of outcomes were observed. Although case study is considered a qualitative method, case studies also often use quantitative data to enrich descriptions of how, why, and with what effect a product is used (i.e., what and how much people learned). So, case study methodology is not limited to qualitative or text-based data!
- Case studies are very valuable for understanding how individual participants are influenced while using a product. In a case study that compares two or more products, the evaluator can critically identify comparisons and contrasts between the products throughout writing the case study.
- Through rich descriptions and even storytelling about how products were designed and used, a reader of a case study is ideally informed about how the product brought about changes in participants and how the product influenced their learning outcomes. It can also richly describe the learning outcomes and results from using the product. However, it is difficult to infer in a measurable and generalizable way that the observed effects were commonly seen as a result of using the educational product, or if they instead occurred by chance alone, which is something that quantitative experimental studies can show with greater degrees of certainty.
Mixed Methods
- Both qualitative and quantitative methods may be used in evaluating what was learned and how effective educational products were at achieving their goals of learning and competence. Both may be used simultaneously to confirm and support arguments.
- Mixed methodology is becoming an increasingly common approach toward showing multiple sides of the same type of analysis to answer research questions. Such approaches add cross-validation and triangulation so that multiple perspectives are considered and support each other when making claims about a product.
Related Concepts
Examples
None yet - check back soon!
External Resources
None yet - check back soon!
References
- ↑ Researchconnections.org. (n.d.). Pre-Experimental Designs. https://www.researchconnections.org/research-tools/study-design-and-analysis/pre-experimental-designs
- ↑ Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference. Houghton, Mifflin and Company.
- ↑ Merriam, S. B., & Tisdell, E. J. (2015). Qualitative research: A guide to design and implementation. John Wiley & Sons.
- ↑ Yin, R. K. (2009). Case study research: Design and methods (Vol. 5). Sage.