Activities in training and development represent a crucial function in organizations. They involve exploring the wide range of technical knowledge and resources at the organization's disposal and translating these into technical skills that can increase organizational performance.
Training is one of the many processes influencing organizational success, but a training program and its rating guidelines must be carefully designed, organized, delivered, and properly evaluated. Some form of evaluation is needed to determine whether the training has brought about the desired changes in methods or performance.
Training is also necessary because it inspires organizational change. In a national survey in 1996, approximately 59 million U.S. employees received training and development costing about $200 billion (Phillips, cited in Craig 2002). The number is increasing, and today employees are commonly sent to training or required to complete it online. In 1999, about 79% of U.S. employees had undergone training (The American Society for Training and Development, cited in Craig 2002).
With globalization and new technology continually entering organizational functions, training is essential for today's firms. One training theory asserts that adult learners must have control over their work environment and set the goals and purposes of learning so that they can help trainers attain the desired outcomes (Hubbard, cited in Jackson 2006).
This essay argues that without evaluation, any investment in training cannot be properly managed, and time and money will be wasted on inappropriate courses and interventions. Evaluation is important to guide future training decisions and to convince top management that training can contribute to business success. The essay describes and discusses different ways of evaluating or measuring training effectiveness linked to individual performance and organizational success, and outlines reasons for an increasing focus on organization-wide and longer-term outcomes.
From World War II up to the 1960s, evaluation theory addressed methods for completing the evaluation in the field (Shadish & Luellen, cited in McPherson 2015). This has since evolved, and researchers have expanded the scope to encompass methods, causal processes, and the sociopolitical contexts surrounding evaluations (Shadish & Reichardt, cited in McPherson 2015). Mathison (cited in McPherson 2015, p. 18) indicated that theories of evaluation refer to a credible body of philosophies that elucidate and provide direction to the system of evaluation.
Focus on the evaluation of training effectiveness is high because organizations invest hundreds of billions of dollars annually in training and development. Without a thorough evaluation of training programs, the question of whether investments were well spent remains largely unanswered, and opportunities to improve training programs are missed (Craig 2002). More empirical research is needed to answer the significant problems of current training programs and to give evaluation a significant place in improving training for organizational success.
Whilst there is no exact definition of training evaluation, there are various valid reasons why it should be conducted. Training evaluation aims to determine: how the training supports the organization's objectives; how it relates to the organization's critical success factors, job responsibilities, and performance expectations; how it relates to employees' individual goals; whether it corresponds to company policies and procedures; whether the training is properly executed; and whether supervisors and employees are trained to use and operate within the organization's system (Jamerson 2012).
However, researchers lament the lack of a well-defined formal method of implementing the training-needs assessment. Common training-needs assessment methods rely on evaluations made by managers, with which employees either agree or disagree (O’Brien & Hall, cited in Fraser 2013). Moreover, training evaluation is performed to improve the quality of the training and how it should be delivered (by whom, and under what circumstances); to evaluate the efficiency of the entire training course (the trainer and the method); to rationalize the course (prove that the benefits offset the expenses); and to substantiate the importance of training (Fraser 2013).
A search of the literature using the term program evaluation revealed a broad range of usage for the term evaluation theory. Christie (cited in McPherson 2015) indicated that because of the complicated meaning of evaluation theory in the literature, various understandings exist. She used the term ‘folk theories’ instead of implicit theory and added that she would usually use the two terms interchangeably.
Additionally, evaluation writers have provided their own operational definitions of terms such as theory, model, or framework. Kirkpatrick (cited in Pulichino 2007) and his students used taxonomy and model interchangeably; critics attacked Kirkpatrick’s model, saying that it was a taxonomy and not a model. Shadish et al. (cited in McPherson 2015) added that there is no widely agreed definition of evaluation theory and provided five theoretical bases essential to a good theory of evaluation, namely: ‘practice, knowledge, use, value, and social programming’ (p. 22).
Training and development motivate change, and change is vital in an organization. Three types of change can be identified when measurements are compared: ‘alpha, beta, and gamma’ (Golembiewski, Billingsley, & Yeager, cited in Craig 2002, p. 8). Gamma change is said to have occurred when the subject being addressed has different meanings during two different periods (in Time 2 the construct has a different meaning than it had in Time 1).
For example, in the 1960s African-Americans viewed ‘freedom’ as not being made to ride in the backs of buses, as white Americans were not. This concept changed in the 1970s, when ‘freedom’ pertained to the design of the transit systems. Currently, gamma change can be identified using factor analysis. Researchers argue that factor analysis determines the meanings of constructs by pointing to the items that load on the factors (Lindell & Drexler, cited in Craig 2002).
When gamma change is not found, beta change is assessed next. Here, for example, a training intervention taught employees about a more qualitative type of decision-making that they had not yet known; this improved their knowledge of a higher kind of participative decision-making and thereby recalibrated the scale on which they rate themselves. Alpha change, a genuine change measured on a stable scale, is inferred only when both gamma and beta change have been ruled out. The entire process uses quantitative analysis.
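The idea that factor analysis can flag gamma change can be sketched numerically. The snippet below is an illustration only, not drawn from the cited studies: the item loadings are invented, and Tucker's congruence coefficient is used as one common way to compare a factor's loading pattern at two time points. Values near 1 suggest the construct kept its meaning; low values flag a possible gamma change.

```python
import numpy as np

def tucker_phi(loadings_t1, loadings_t2):
    """Tucker's congruence coefficient between two factor-loading vectors.
    A value near 1 indicates the same loading pattern (construct meaning
    is stable); a low value indicates the items now load differently."""
    numerator = np.dot(loadings_t1, loadings_t2)
    denominator = np.sqrt(np.dot(loadings_t1, loadings_t1) *
                          np.dot(loadings_t2, loadings_t2))
    return float(numerator / denominator)

# Hypothetical loadings of five survey items on one factor.
time1 = np.array([0.82, 0.78, 0.75, 0.10, 0.05])
time2_stable = np.array([0.80, 0.76, 0.72, 0.12, 0.08])  # same pattern
time2_shift = np.array([0.15, 0.10, 0.20, 0.81, 0.79])   # pattern reversed

print(tucker_phi(time1, time2_stable))  # near 1: no gamma change indicated
print(tucker_phi(time1, time2_shift))   # well below 1: possible gamma change
```

In practice the loadings would come from a factor analysis of the same survey administered at Time 1 and Time 2, and a cutoff (often around 0.85 to 0.95) would be chosen before judging stability.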
The commitment of top management to change cannot be over-emphasized. Craig (2002) shows how predisposition at the senior management level influences the HRD role. Whilst training aimed at a low organizational level might be achievable without this support, an implicit role at a strategic level requires a degree of senior management support. Training at the operational level may be competently organized and delivered, but if training is to be considered a corporate strategic function, it needs to embrace all levels within the organization and thus provide a basis for consistency and coherence in training policy at the corporate level.
Training has been at the forefront of criticism because employees often do not know how to use what they learned in the training exercise (Vermeulen, cited in Fraser 2013). Management believed that training was successful and that learning was achieved once training was conducted, but no assessment was made of how training achieved its goals. Fraser (2013) expressed the importance of role-playing in allowing employees to transfer their learned skills and knowledge upon returning to their office, yet role-play has been ignored by HR managers. HR practitioners propose job analysis as an HRM tool for understanding training-needs assessment, which is now commonly termed training evaluation.
Evaluation should be a means to an end. It differs from validation in that it attempts to measure the overall cost-benefit of the course or program and not just the achievement of its stated objectives. The term is also used in the general judgmental sense of the continuous monitoring of a program, or of the training function as a whole (Fraser 2013).
In the training literature, a need is defined as a gap between the present state and a future state where an inconsistency exists between ‘what is’ (the present state) and ‘what should be,’ the desired state (Witkin & Altschuld; Swist, cited in Fraser 2013, p. 70).
Craig (2002) proposed several taxonomies of training criteria to determine the effectiveness of training. The best-known taxonomy of training outcomes is articulated by Kirkpatrick (cited in Craig 2002) in his four levels, which distinguish among participants’ reactions to training (level one), their attainment of new knowledge (level two), changes in their on-job behavior (level three), and organizational results (level four). These levels signify broad increases in the logical distance between the training objective and the outcome being measured, with each increase in distance bringing an associated increase in the difficulty of causal attribution.
An organization, in particular the supervisor, will know what is working well in a course through evaluation. Evaluation helps evaluators gain information for informed judgment about course effectiveness, and thus the value of training. Fraser (2013) emphasized the use of training models to indicate the kind of training employees need.
The value of training becomes obvious only if the outcome of that training is known, and what must be known is change, which is what the training criteria ask about. The correct training criteria must be specified because evaluation depends on them. One recognized set of criteria was developed by Kirkpatrick in the form of a model (cited in Pulichino 2007). This model promotes the idea that the design, delivery, and evaluation of training should be treated as parts of a single, integrated process, partly because it is difficult to evaluate results unless we are clear about what we were attempting to achieve. Recognizing the limitations of narrow assessment-based approaches, Kirkpatrick’s four-level model incorporates such things as trainees’ reactions and the impact of training on organizational performance.
The different approaches
The Kirkpatrick Model, which uses four levels to evaluate training programs, was first used in 1959 and has since become an industry standard. Although it was not Kirkpatrick’s intention to establish an industry standard, or even to provide a model or theory of evaluation, the steps have been used as a model. The four levels evolved from ‘steps’ to ‘levels’ to ‘model’ without Kirkpatrick knowing that they would become an actual standard for training evaluation. The four levels became the ‘Kirkpatrick Model’ (Pulichino 2007, p. 18).
Kirkpatrick conceived the four steps of training evaluation in the 1950s, when the workplace environment was different than it is today. At that time, training programs and courses were taught in a formal classroom setting with an instructor and a subject matter expert (Pulichino 2007). Computers were not yet popular and e-learning did not exist.
Kirkpatrick proposed that evaluating training is a four-step process to be done in succession, each step leading to the next. The four segments are: ‘reaction,’ which asks for the trainees’ reaction to a particular program; ‘learning,’ which measures the knowledge, skills, and attitudes gained during the training; ‘behavior,’ which examines changes in on-job behavior after the training; and ‘results,’ which points to the final organizational outcomes of the training (Jamerson 2012). Kirkpatrick emphasized that by using this training evaluation, one can appreciate the value of training, and by using evaluation theories, the importance of training is enhanced.
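As a rough illustration of the taxonomy (the field names, rating scales, and example values are hypothetical, not from the cited sources), the four levels can be represented as a simple record that tracks which levels a program actually measured:

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class TrainingEvaluation:
    """One training program's outcomes at Kirkpatrick's four levels.
    A None field means that level was not measured."""
    reaction: Optional[float] = None  # e.g. mean satisfaction rating, 1-5
    learning: Optional[float] = None  # e.g. post-test minus pre-test score, %
    behavior: Optional[float] = None  # e.g. supervisor-rated on-job change, 1-5
    results: Optional[float] = None   # e.g. % change in a targeted business metric

    def levels_evaluated(self) -> List[str]:
        """Return the levels actually measured, in Kirkpatrick's order.
        Later levels are costlier to measure, so many programs stop early."""
        order = ("reaction", "learning", "behavior", "results")
        return [name for name in order if getattr(self, name) is not None]

course = TrainingEvaluation(reaction=4.2, learning=15.0)
print(course.levels_evaluated())  # ['reaction', 'learning']
```

A record like this makes the pattern reported later in the essay easy to see: most programs populate only the first field or two.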
Other approaches to evaluation have been proposed, ranging from a narrow focus on the process of training itself to more holistic, organization-wide systems. Each approach may have its strengths, but it is probably misguided to expect any single approach to serve all purposes for all organizations.
The goal-free approach is not confined to the set objectives, so other types of learning and behavior change can be identified. To provide an unbiased evaluation, evaluators are kept unaware of the goals of the program, allowing the program’s achievements to be those identified by the participants. Through this approach, unplanned outcomes, as well as those identified by the training program, can be recognized. This approach supplements the goal-based approach. Trainers often use the goal-based approach because it employs ‘The Planned Training Cycle,’ in which evaluation concentrates on learning objectives, training strategy, the method of training, and the outcome (Jamerson 2012).
The professional review approach is based on syllabi and the extent to which they meet agreed professional standards, a good example being the requirement to pass an examination to become a member of a professional association. The quasi-legal approach is a method in which a panel is set up and witnesses are called to submit evidence, values, and beliefs from program organizers, users, and financial decision-makers.
A distinct and very detailed type of training evaluation is the information evaluation approach, in which an expert trainer continuously observes the environment to draw in people’s insights, suggestions, and sometimes body language. This can be achieved by monitoring trainees’ responses during the training period, whether in formal or informal situations, at business unit meetings, or through occasional comments by senior managers.
Kirkpatrick’s (cited in Pulichino 2007) model has remained popular among evaluators because it systematizes evaluation centered on results. Although it is an industry standard, few organizations use it as a whole, because some critics argue that the levels do not connect: a high measure at one level does not show at other levels (Pulichino 2007). In actual practice, evaluators in many organizations focus on the ‘reactions’ level because it is easier and quicker to analyze than the other three levels.
Kirkpatrick (cited in Pulichino 2007) countered that a positive evaluation at one of the steps does not guarantee, or even imply, a positive evaluation at another step. This suggests there may be no correlation among the results of the four steps, a claim Kirkpatrick asserted without offering a supporting theory or basis. He further stated that a training director may have done a wonderful job assessing the trainees’ reactions, but there is still no assurance that the trainees attained satisfactory learning, no certainty that the training will change the participants’ behavior, and no certainty that results will come out of the training.
Evaluation at steps 3 (behavior) and 4 (results) is more difficult to achieve than at steps 1 (reaction) and 2 (learning), because these steps need ‘a more scientific approach’ and depend on several other factors (Kirkpatrick, cited in Pulichino 2007). Factors cited include motivation to improve, work improvement, and the opportunity to practice the newly acquired knowledge or skills. There is also the problem of separating variables, that is, other issues that might have affected the behavior and results. These confounding variables naturally affect outcomes at levels 3 and 4 but are not necessarily within the range of experience of training evaluators.
Less than a decade after the publication of Kirkpatrick’s articles on the evaluation model, he and one of his graduate students at the University of Wisconsin, Ralph Catalanello, reported the results of a research study that analyzed techniques used by various sectors to evaluate their training programs (Catalanello & Kirkpatrick, cited in Pulichino 2007). They further described the four-step approach, which was not yet popular at industry conferences.
The researchers focused their study on the frequency of usage of each of the four steps, using a sample of 110 business firms. They revealed that 78% of the respondents tried to measure trainee reactions, but half or fewer of the respondents measured learning, behavior, or results. Kirkpatrick and Catalanello (cited in Pulichino 2007) found that 35% of the firms that measured learning tested trainees before and after the training to compare results. No positive correlation was found among the four steps. The researchers concluded that evaluation was still a promising discipline among training supervisors and practitioners, that trainee reaction was the most commonly measured criterion, and that little had been done to address the more difficult steps.
Kirkpatrick (cited in Pulichino 2007) also indicated that the primary stakeholders in the training, such as the trainee, his or her manager, the workgroup, the trainer, and the organization, should be involved in the evaluation process. Their criteria for determining the value of training may be similar in some instances, but there can also be differences. In other words, evaluation is a collaborative process involving the major stakeholders, in which the aim is to develop a results-oriented and value-inspired training program.
Craig (2002) asserts that Kirkpatrick believed there was a general consensus among practitioners that the purpose of evaluation was to determine the effectiveness of training programs and find value in them. But Kirkpatrick also questioned the value of fulfilling this purpose unless and until the evaluator understood the specific terms and limitations by which effectiveness is to be quantified and qualified, i.e. until factor analysis and other scientific methods have been conducted.
Practitioners should understand what they need to evaluate: how the trainees react to the training; what they learn; how they transfer what they learn into on-the-job behavior and performance; and whether the organization realizes a desired change or improvement in its business results. Kirkpatrick (cited in Craig 2002) further contends that these are important factors, and without such understanding, the evaluator’s problem lies in how and where to start.
The purpose of Kirkpatrick’s four-step approach is to get the evaluator started. By focusing on the four steps and breaking the task of evaluation down into a logical process, the task becomes simpler and offers practitioners clear and attainable goals which, when considered appropriately, help them determine the effectiveness of their programs distinctly and sequentially.
Evaluators need to know the desired outcome of the training, as this lets them work from one level to the next. If the evaluator knows the desired results, he or she can identify the behavior needed to produce those results. If the evaluator knows the behavior, he or she can better identify what learning needs to happen to change or improve that behavior.
If we know what needs to be learned, we can plan better so that trainees have a positive reaction to the training. This sequence of design and planning, taking the four levels in reverse order, can be the ‘hidden secret’ or the key to creating a researchable model out of the taxonomy discussed above. Once this is done, the design enables us to evaluate everything from reaction through to results.
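The reverse-order planning idea can be sketched as a small checklist. The design questions below are illustrative paraphrases of the preceding paragraphs, not taken verbatim from the sources:

```python
# Evaluation runs reaction -> learning -> behavior -> results,
# but planning starts from desired results and works back.
EVALUATION_ORDER = ["reaction", "learning", "behavior", "results"]

# Illustrative design questions, one per level (hypothetical wording).
DESIGN_QUESTIONS = {
    "results": "What business outcome should the training produce?",
    "behavior": "What on-job behavior would lead to that outcome?",
    "learning": "What knowledge or skills enable that behavior?",
    "reaction": "What delivery will engage trainees in that learning?",
}

def design_plan():
    """Take the four levels in reverse to build a planning checklist."""
    return [(level, DESIGN_QUESTIONS[level])
            for level in reversed(EVALUATION_ORDER)]

for level, question in design_plan():
    print(f"{level}: {question}")
```

Working through the questions in this order makes each level's evaluation criterion explicit before the training is ever delivered.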
In conclusion, we can determine the value of training through the theories of evaluation and the models discussed in this essay. We should be able to understand the desired results of the training evaluation, both quantitatively and qualitatively, by considering the various variables that affect the outcomes (Craig 2002). Training can affect results, so metrics and measurements must be chosen to find out whether training impacts results, and these must be related to outcomes.
We also have to understand which behaviors could influence the results; at this step, the evaluator must consider the variables that may affect a change or improvement in behavior, since training can affect behavior. The next step is to identify what needs to be learned to change or improve behavior and produce the desired results. A further step is to identify the desired positive reactions of the trainees as related to what needs to be learned, what behavior needs to be changed or improved, and what results need to be achieved.
When training is done, its effectiveness can be assessed using Kirkpatrick’s four steps. These activities, drawn from the literature and Kirkpatrick’s model, would enable evaluators to do a more reliable and rigorous job of designing and delivering appropriate training programs, and would also show whether training programs have been effective at the different levels of evaluation.
Craig, S 2002, Implicit theories and beta change in longitudinal evaluations of training effectiveness: an investigation using item response theory, PhD Thesis, Virginia Polytechnic Institute and State University. Web.
Fraser, J 2013, A gap analysis of employee training needs in supply chain management, PhD Thesis, University of Pretoria. Web.
Jackson, S 2006, Program effectiveness of job readiness training: an analysis and evaluation of selected programs in St. Louis, Missouri, ProQuest Information and Learning Company, Ann Arbor, Michigan.
Jamerson, K 2012, Addressing the unique training needs of post-secondary career and technical school faculty: the development, implementation, and evaluation of a pedagogical training workshop, PhD Thesis, Robert Morris University. Web.
McPherson, C 2015, Implicit and explicit evaluation theories of expert evaluation practitioners, PhD Thesis, University of South Alabama. Web.
Pulichino, J 2007, Usage and value of Kirkpatrick’s four levels of training evaluation, PhD Thesis, Pepperdine University. Web.