
Theory-driven evaluation: Conceptual framework, application and advancement

Huey T. Chen

1 Introduction

There is an impressive amount of literature on theory-driven evaluation published in the past few decades. The literature devoted to this topic includes four volumes of New Directions for Evaluation (Bickman 1987, 1990; Rogers, Hasci, Petrosino, & Huebner 2000; Wholey 1987), several books (Chen 1990, 2005; Chen & Rossi 1992; Connell, Kubisch, Schorr, & Weiss 1995; Fulbright-Anderson, Kubisch, & Connell 1998; Pawson & Tilley 1997), and numerous articles published in various journals (see recent reviews by Coryn, Noakes, Westine, & Schröter 2011; Hansen & Vedung 2010). Furthermore, major evaluation textbooks (Patton 1997; Posavac & Carey 2007; Rossi, Lipsey, & Freeman 2004; Weiss 1998) include one or more chapters introducing the concepts, methodology, and usefulness of theory-driven evaluation. The purpose of this chapter is to discuss the conceptual framework, applications, and new developments of theory-driven evaluation in order to facilitate its further advancement.

2 Conceptual Framework of Program Theory

The tenet of theory-driven evaluation is that the design and application of evaluation need to be guided by a conceptual framework called program theory (Chen 1990, 2005). Program theory is defined as a set of explicit or implicit assumptions by stakeholders about what action is required to solve a social, educational, or health problem and why the problem will respond to this action. The purpose of theory-driven evaluation is not only to assess whether an intervention works or does not work, but also how and why it does so. This information is essential for stakeholders seeking to improve their existing or future programs.
Theory-driven evaluation is sharply different from another type of evaluation, called black-box evaluation. Black-box evaluation mainly assesses whether an intervention has an impact on outcomes; it takes no interest in the transformation process between the intervention and the outcomes. Theory-driven evaluation also differs from method-driven evaluation, which uses a research method as the basis for conducting an evaluation. In method-driven evaluation, the design of an evaluation is mainly guided by the predetermined research steps required by a particular method, whether quantitative, qualitative, or mixed. Because method-driven evaluation views evaluation mainly as an atheoretical, methodological activity, it tends to ignore stakeholders' views and concerns in evaluation.
As a basis for designing theory-driven evaluation, program theory is a systematic configuration of stakeholders' prescriptive and descriptive assumptions underlying a program, whether explicit or implicit (Chen 1990, 2005). Descriptive assumptions, called the change model, deal with what causal processes are expected to happen for the program to attain its goals. Prescriptive assumptions, called the action model, deal with what actions must be taken in a program in order to produce desirable changes. Theory-driven evaluation uses the action model and change model to address contextual factors and planning and implementation issues that are of great interest to stakeholders.
Change Model: A change model describes the causal process generated by the program. A change model consists of the following three elements:
Goals and Outcomes: Goals reflect the desire to fulfill unmet needs, such as poor health, inadequate education, or poverty. Outcomes are the concrete, measurable aspects of these goals.
Determinants: To reach goals, programs require a focus, which clarifies the lines their design should follow. More specifically, each program must identify a leverage point or mechanism upon which it can develop a treatment or intervention to meet a need. That leverage point or mechanism is variously called the determinant or the intervening variable.
Intervention or Treatment: An intervention or treatment comprises any activity or set of activities in a program that aims directly at changing a determinant. It is, in other words, the agent of change within the program.
Action Model: An action model is a systematic plan for arranging staff, resources, settings, and support organizations to reach a target group and deliver intervention services. The action model consists of the following elements.
Implementing Organization: Assess, Enhance, and Ensure Its Capabilities: A program relies on an organization to allocate resources, to coordinate activities, and to recruit, train, and supervise implementers and other staff. How well a program is implemented may be related to how well this organization is structured. Initially, it is important to ensure that the implementing organization has the capacity to implement the program.
Program Implementers: Recruit, Train, and Maintain Both Competency and Commitment: Program implementers are the people responsible for delivering services to clients: counselors, case managers, outreach workers, school teachers, health experts, and social workers. The implementers' qualifications, competency, commitment, enthusiasm, and other attributes can directly affect the quality of service delivery.
Peer Organizations/Community Partners: Establish Collaborations: Programs often may benefit from, or even require, cooperation or collaboration between their implementing organizations and other organizations. If linkage or partnership with these useful groups is not properly established, implementation of such programs may be hindered.
Intervention and Service Delivery Protocols: An intervention protocol is a curriculum or prospectus stating the exact nature, content, and activities of an intervention – in other words, the details of its orienting perspective and its operating procedures. A service delivery protocol, in contrast, refers to the particular steps to be taken to deliver the intervention in the field.
Ecological Context: Seek Its Support: Some programs have a special need for contextual support, meaning the involvement of a supportive environment in the program's work. Both microlevel and macrolevel contextual support can be crucial to a program's success. Microlevel contextual support comprises the social, psychological, and material supports clients need to allow their continued participation in intervention programs. In addition to microlevel contextual support, program designers should consider the macrolevel context of a program, that is, community norms, cultures, and political and economic processes. These, too, have the ability to facilitate a program's success.
Target Population: Identify, Recruit, Screen, Serve: In the target population element, crucial assumptions at work include the presence of validly established eligibility criteria; the feasibility of reaching and effectively serving the target group; and the willingness of potential clients to become committed to, cooperative with, or at least agreeable to joining the program. Relationships among the components are illustrated in Figure 1.

Figure 1: The Conceptual Framework of Program Theory

Figure 1 indicates that the action model must be implemented appropriately to activate the "transformation" process in the change model. For a program to be effective, its action model must be sound, its change model plausible, and its implementation well executed. Figure 1 also illustrates evaluation feedback, represented by dotted arrows. Information from implementation can be used to improve the planning or development of the action model. Similarly, information from the change model can be used to improve the implementation process and the action model. This conceptual framework of program theory should be useful to evaluators charged with designing an evaluation that produces accurate information about the dynamics leading to program success or program failure.
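To make the framework concrete, the following is a minimal sketch – an illustration of the framework described above, not code from the source – that captures the change model and action model components as Python data structures. All class and field names are hypothetical labels chosen for readability.

```python
# Illustrative sketch only: the change model and action model components
# described above, expressed as Python dataclasses. Field names are
# hypothetical labels for the chapter's concepts, not an official schema.
from dataclasses import dataclass
from typing import List

@dataclass
class ChangeModel:
    goals: List[str]          # desires to fulfill unmet needs
    outcomes: List[str]       # concrete, measurable aspects of the goals
    determinants: List[str]   # leverage points/mechanisms the program targets
    interventions: List[str]  # activities aimed directly at changing determinants

@dataclass
class ActionModel:
    implementing_organization: str   # allocates resources, trains and supervises staff
    implementers: List[str]          # counselors, teachers, case managers, ...
    community_partners: List[str]    # collaborating or cooperating organizations
    intervention_protocol: str       # nature, content, and activities of the intervention
    service_delivery_protocol: str   # steps for delivering the intervention in the field
    ecological_context: List[str]    # micro- and macrolevel contextual supports
    target_population: str           # eligibility, reach, and client willingness

@dataclass
class ProgramTheory:
    action_model: ActionModel        # prescriptive assumptions: what must be done
    change_model: ChangeModel        # descriptive assumptions: why it should work
```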

3 Examples of Theory-Driven Evaluation

3.1 Example of Theory-Driven Process Evaluation

Comprehensive theory-driven process evaluation is associated with certain strategies and approaches from the taxonomy. An evaluation is discussed here to show some of the possible functions of this kind of evaluation.

3.1.1 Evaluating an Anti-Drug Abuse Program

One comprehensive theory-driven process evaluation that closely mirrors this chapter's conceptual framework of program theory is an evaluation of a large anti-drug abuse program for middle school students in Taiwan (Chen 1997). The program asked teachers to identify drug-abusing students and provide them with counseling services. A small group of top officials within Taiwan's Ministry of Education had designed the program; under the nation's centralized education system, the Ministry of Education approved appointments and salaries of teachers and administrators. When the program began in January 1991, 3,850 students had been identified as active drug abusers. That number declined sharply, plunging 96 %, to 154 students by June 1991.
The program's huge success led to a theory-driven process evaluation being conducted to examine how the program had been implemented. Hopes were that this program's example could foster the smooth implementation of other programs. The anti-drug abuse program featured a documented program plan, but it was incomplete in comparison to the action model or program plan illustrated in Figure 1. Acting as facilitators, evaluators convened separate focus group meetings with top officials of the education ministry and with teacher representatives to obtain the information needed to complete the program plan. (The separate meetings acknowledged teachers' tendency to be silent in the presence of top officials, who have much more power than teachers do.) Evaluators played the role of facilitators and consultants, helping these key stakeholders develop their program theory. The final version of the program plan ultimately used for evaluation had been agreed to by both groups; the plan is presented on the left side of Table 1.

Table 1: The Spring Sun Program: normative versus actual

Goal/outcome
  Normative: Reduction of student drug use, to be verified through urinalysis
  Actual: Reduction of drug use, but urinalysis collection environment not controlled

Treatment
  Normative: Primary: provide quality counseling to abusers; Secondary: basic drug education
  Actual: Primary: counseling mainly involved use of threats, admonishment, and/or encouragement not to use; Secondary: basic drug education

Implementation Environment

Target group
  Normative: All drug-abusing students
  Actual: Only those drug-abusing students who were easy to reach

Implementers
  Normative: Teachers provided with adequate drug treatment training and information
  Actual: Teachers lacked adequate drug treatment skills and information

Mode of delivery
  Normative: Compulsory individual counseling
  Actual: Compulsory individual counseling, but with problems such as lack of plan, format, and objectives

Implementing organization
  Normative: All schools that can adequately implement the program
  Actual: Smaller schools had difficulties implementing the program

Inter-organizational procedures
  Normative: Effective centralized school system
  Actual: Communication gap and mistrust between the Ministry of Education and the schools

Micro-context
  Normative: Eliminate video game arcades
  Actual: Video game arcades still exist

Macro-context
  Normative: Strong public support
  Actual: Strong public support, but problematic education system (elitism)

The evaluation entailed mixing research methods – both quantitative and qualitative – to collect data. For example, quantitative methods were applied to rate teachers' satisfaction with a workshop on drug counseling skills sponsored by the education ministry, whereas qualitative methods were used to probe contextual issues in the teachers' opinions of the workshop. The right side of Table 1 displays empirical findings for the program's real-world implementation; comparison of the program theory to the implementation reveals large discrepancies. The program had been carried out, but the quality of services and the system of implementation were far from impressive. The discrepancies between plan and implementation resulted from a lack of appropriate counseling training, the overburdening of teachers with counseling work on top of their unchanged teaching responsibilities, and a lack of communication as well as mistrust between an authoritarian ministry and the teachers. The evaluation results created doubt about how a program without strong implementation could have achieved a 96 % decrease in drug abuse in schools.

3.2 Examples of Theory-Driven Outcome Evaluation

Two basic models of intervening mechanism evaluation predominate in the discipline: linear and dynamic.

3.2.1 The Linear Model

The linear model is currently a very popular application of intervening mechanism evaluation. Linear models assume that the causal relationships among interventions, determinants, and outcomes are unidirectional: the intervention affects the determinant, and the determinant then affects the outcome. No reciprocal relationships operate among the variables. In linear models, the number and sequence of the determinants under study determine the model's form. The following causal diagrams illustrate the common linear model forms.
One-Determinant Model. This model, represented by Figure 2, contains a single determinant and is the fundamental model for intervening mechanism evaluation.

Figure 2: An example of a one-determinant model

The one-determinant model is illustrated here by an evaluation of an alcohol and drug abuse prevention program at a college (Miller, Toscova, Miller, & Sanchez 2000). The intervention consisted of multiple components: print media, videotapes, speakers, referral services, and development of self-control. The determinant was perception of risk, and the outcome was a reduction in alcohol and drug use among the students on the campus where the program was established. As predicted, the data showed that after the interventions, there was heightened awareness on campus of the risks of substance abuse, which in turn reduced alcohol and drug use there. The one-determinant model is relatively easy to construct.
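As a hedged illustration of how such a one-determinant (mediation) model might be tested – a sketch on simulated data, not the analysis Miller et al. (2000) actually performed – the two causal paths can be estimated with ordinary regressions:

```python
# Sketch of a one-determinant intervening mechanism test on simulated data.
# Variable names mirror the example: intervention -> risk perception -> use.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
intervention = rng.integers(0, 2, n).astype(float)           # 1 = exposed to program
risk_perception = 0.5 * intervention + rng.normal(size=n)    # determinant
substance_use = -0.4 * risk_perception + rng.normal(size=n)  # outcome

# Path a: does the intervention change the determinant?
a = sm.OLS(risk_perception, sm.add_constant(intervention)).fit()
# Path b: does the determinant change the outcome, holding the intervention fixed?
b = sm.OLS(substance_use,
           sm.add_constant(np.column_stack([risk_perception, intervention]))).fit()

print("a path (intervention -> determinant):", round(a.params[1], 3))
print("b path (determinant -> outcome):", round(b.params[1], 3))
print("indirect effect (a * b):", round(a.params[1] * b.params[1], 3))
```

A nonzero indirect effect (the product of the two paths) is what the model in Figure 2 predicts: the intervention works through the determinant rather than directly.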
Multiple-Determinant Model, No Sequential Order. Another common linear model is the model with two or more determinants, each affected by the intervention or affecting the outcome, but in no particular sequence. A workplace nutrition program provides an example of the multiple-determinant model (Kristal, Glanz, Tilley, & Li 2000). The intervention featured at-work nutrition classes and self-help. The stakeholders and evaluators selected three determinants: predisposing factors (skills, knowledge, belief in the diet-disease relationship), enabling factors (social support, perceived norms, availability of healthful foods), and stage of change (the action and maintenance stages being subsequent to the intervention). The outcome variable was dietary change (eating vegetables and fruits). The model of this program is illustrated in Figure 3.

Figure 3: Workplace nutrition program as a multiple-determinant model with no sequential order

Kristal and colleagues found that the intervention did enhance predisposing factors as well as the likelihood of entering and remaining in the subsequent stages of change. They also found that the intervention did not affect enabling factors. The program fell short because the intervention failed to activate one of the three determinants.
Multiple-Determinant Model With Sequential Order. The model containing two or more determinants aligned in a causal order is a multiple-determinant model with a sequential order. That is, certain determinants affect others in a particular sequence. An example of this kind of linear model is found in an evaluation of a school-based antismoking campaign (Chen, Quane, & Garland 1988). The intervention contained components such as an antismoking comic book, discussions of the health messages the comic book delivered, and parental notification about the intervention program. The determinants of the model, in sequence, were the number of times the comic book was read and knowledge of the comic book's story and characters. The sequential order indicates that repeated reading of the comic book changed the extent of knowledge about the plot and characters. The sequence is illustrated in Figure 4.

Figure 4: Antismoking program as a multiple-determinant model with sequential order

The outcome to be measured was change in attitudes, beliefs, and behaviors related to smoking. The evaluation determined that the distribution of the comic book affected the number of times the comic book was read, which in turn affected knowledge of its content. However, neither of these determinants was shown to affect students' smoking-related attitudes, beliefs, or behaviors.
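A sequential multiple-determinant model implies a chain of regressions, one per link. The following sketch – simulated data and hypothetical variable names, not the analysis Chen, Quane, and Garland (1988) reported – estimates each link, including a final link simulated as null to mirror that evaluation's finding:

```python
# Sketch of a sequential (serial) two-determinant chain on simulated data:
# distribution -> times read -> knowledge -> smoking-related attitudes.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 600
distribution = rng.integers(0, 2, n).astype(float)   # comic book distributed or not
times_read = 1.2 * distribution + rng.normal(size=n) # determinant 1
knowledge = 0.8 * times_read + rng.normal(size=n)    # determinant 2
attitudes = rng.normal(size=n)                       # outcome; final link null here

links = [("distribution -> times_read", times_read, distribution),
         ("times_read -> knowledge", knowledge, times_read),
         ("knowledge -> attitudes", attitudes, knowledge)]
for name, y, x in links:
    model = sm.OLS(y, sm.add_constant(x)).fit()
    print(name, round(model.params[1], 3), "p =", round(model.pvalues[1], 3))
```

Estimating each link separately shows exactly where a hypothesized causal chain breaks down, which is the diagnostic value of this model form.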
The Dynamic Model. The dynamic model of intervening mechanism evaluation assumes that multidirectional, reciprocal causal relationships exist among intervention, determinant, and outcome. The relationship between determinant and outcome, especially, is reciprocal rather than one-way: the determinant affects the outcome, and the outcome also affects the determinant. A hypothetical educational program illustrates the model well. The project's focus was to equip parents with skills and strategies to assist their children with homework; homework had been chosen as a determinant of primary students' school performance. The model made clear, however, that the relationship between parental involvement and student performance need not be linear. Parents becoming more involved in a child's schoolwork might improve the child's performance; then, seeing the improved performance, parents might feel gratified, stimulating their willingness to devote time and effort to remaining involved in the child's education. This form of the dynamic model is represented in Figure 5.

Figure 5: Education program as a dynamic model
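One common way to estimate such a reciprocal determinant-outcome relationship – offered here as an assumed analysis sketch, since the program itself is hypothetical – is a two-wave cross-lagged analysis, regressing each later measure on both earlier measures:

```python
# Sketch of a two-wave cross-lagged analysis for the dynamic model, on
# simulated data: parental involvement and school performance influence
# each other across waves. Names and effect sizes are illustrative.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 400
involvement_t1 = rng.normal(size=n)                        # wave 1 involvement
performance_t1 = 0.3 * involvement_t1 + rng.normal(size=n) # wave 1 performance
involvement_t2 = 0.5 * involvement_t1 + 0.2 * performance_t1 + rng.normal(size=n)
performance_t2 = 0.4 * involvement_t1 + 0.5 * performance_t1 + rng.normal(size=n)

X = sm.add_constant(np.column_stack([involvement_t1, performance_t1]))
forward = sm.OLS(performance_t2, X).fit()   # involvement -> later performance
reverse = sm.OLS(involvement_t2, X).fit()   # performance -> later involvement

print("involvement_t1 -> performance_t2:", round(forward.params[1], 3))
print("performance_t1 -> involvement_t2:", round(reverse.params[2], 3))
```

If both cross-lagged coefficients are nonzero, the data are consistent with the reciprocal loop in Figure 5 rather than a one-way linear path.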

The conceptual framework of program theory is comprehensive enough to support designing and conducting different types of theory-driven evaluation, depending on stakeholders' evaluation interests and needs. Readers are referred to Chen (1990, 2005) for detailed information on the following applications of theory-driven evaluation: helping stakeholders clarify the program theory underlying a program; using program theory to help stakeholders develop a sound intervention program; using an action model to design a systematic process evaluation; using a change model to design a theory-driven outcome evaluation; and integrating an action model and a change model for a comprehensive evaluation.

4 Advantages of Theory-Driven Evaluation

Some advantages of theory-driven evaluation are discussed as follows:
Delineation of a strategy to consider stakeholders' views and interests: An evaluation suffers without adequate input from stakeholders. The challenges, however, are how to understand stakeholders' views of evaluation and how to integrate their interests into the evaluation. The conceptual framework of program theory provides an effective tool for evaluators to communicate major evaluation issues with stakeholders and to design an evaluation that incorporates their interests.
Holistic assessment: The conceptual framework of program theory allows a holistic approach to assessing the merits of a program. Theory-driven evaluation can explain how and why a program achieves particular results by illustrating its means of implementation as well as its underlying mechanisms of influence. The conceptual framework of program theory addresses issues in both the action model and the change model, so it helps evaluators achieve a balanced view from which to assess the worth of a program.
Comprehensiveness of information needed to improve programs: A theory-driven evaluation that examines how a program's structure, implementation procedures, and causal mechanisms actually work in the field will provide useful information to stakeholders for program improvement.
Balance of scientific and practical concerns: Researchers are greatly concerned about the scientific rigor of an evaluation, while stakeholders desire an evaluation that addresses service issues. Since the conceptual framework of program theory uses the action model to address service issues and the change model to tackle issues of rigor, it has the potential to foster greater dialogue and collaboration between academic and practical communities and to narrow the gap between scientific and service communities.
Advancement of evaluation theory and methodology: Theory-driven evaluation has been applied to scientific and service issues for a few decades. Lessons learned from these applications can be used to further advance evaluation theory and methodology. The rest of this chapter will introduce recent developments of theory-driven evaluation in areas such as the integrative validity model and the bottom-up evaluation approach.

5 Integrative Validity Model

Stakeholders are the clients and users of evaluation results, so evaluators must understand and address their views and needs in evaluation. Because it works intensively with stakeholders, theory-driven evaluation recognizes that stakeholders have a great interest in intervention programs that are capable of accomplishing two functions: goal attainment and system integration. Goal attainment means an intervention can activate causal mechanisms for attaining its prescribed goals, as illustrated in the change model. System integration refers to whether an intervention is compatible, or even synergistic, with other components in a system. These components include organizational missions and capacity, service delivery routines, implementers' capability, relationships with partners, clients' acceptance, and community norms, as discussed in the action model. Stakeholders value goal attainment, but they are equally or even more interested in system integration because they are responsible for delivering services in the real world. Note also that although goal attainment and system integration are related outcomes attributable to an intervention, they do not necessarily go hand in hand. That an intervention is efficacious or effective does not mean it is suitable for a community-based organization to implement, and vice versa (Chen 2010; Chen & Garbe 2011).
Stakeholders are keenly interested in evaluative evidence on system integration and goal attainment, but this interest has often not been satisfactorily met in evaluations. Traditionally, evaluators have applied the Campbellian validity typology (Campbell & Stanley 1963; Cook & Campbell 1979; Shadish, Cook, & Campbell 2002) for outcome evaluation. It is essential to note that the Campbellian validity typology was developed for research rather than evaluation purposes (Chen, Donaldson, & Mark 2011). Its primary aim is to help researchers provide credible evidence in examining causal relationships among variables. Evaluators have found the typology very useful for outcome evaluation as well and have intensively applied it in addressing goal attainment issues. The typology has made a great contribution to program evaluation. However, its application as a major framework or standard for outcome evaluation has contributed to evaluators' neglect of system integration issues. Since it is neither within the scope nor the intention of the Campbellian validity typology to support the design of well-balanced evaluations that meet stakeholders' evaluation needs, it is up to evaluators to develop a more comprehensive perspective for systematically addressing both goal attainment and system integration issues. Theory-driven evaluation proposes an integrative validity model (Chen 2010; Chen & Garbe 2011) to take on this challenge. Building on Campbell and Stanley's (1963) distinction between internal and external validity, the integrative validity model proposes three types of validity for evaluation: effectual, viable, and transferable.
Effectual validity is the extent to which an evaluation provides credible evidence that an intervention causally affects specified outcomes. This validity is similar to the concept of internal validity proposed by Campbell and Stanley (1963). According to the Campbellian validity typology, randomized experiments are the strongest design for enhancing effectual validity, followed by quasi-experimental designs. Effectual validity is crucial for addressing goal attainment issues.
The integrative validity model proposes viable validity to address stakeholders' interest in system integration. Viable validity is the extent to which an intervention is successful in the real world. Here, viable validity refers to stakeholders' views and experiences regarding whether an intervention program is practical, affordable, suitable, evaluable, and helpful in the real world. More specifically, viable validity means that ordinary practitioners – rather than research staff – can implement an intervention program adequately, and that the intervention program is suitable for coordination or management by a service delivery organization such as a community clinic or a community-based organization. Additional inquiries are whether decision makers think the intervention program is affordable, whether it can recruit ordinary clients without paying them to participate, whether it has a clear rationale for its structure and for the linkages connecting the intervention to expected outcomes, and whether ordinary clients and other stakeholders regard the intervention as helpful in alleviating clients' problems or in enhancing their well-being in the program's real-world situations. In this context, helpful means that stakeholders can notice or experience progress in alleviating or resolving a problem.
In the real world, stakeholders organize and implement an intervention program. Thus, they have real viability concerns. Viability alone might not guarantee an intervention's efficacy or effectiveness, but in real-world settings, viability is essential for an intervention's overall success. That is, regardless of the intervention's efficacy or effectiveness, unless that intervention is practical, suitable to community organizations' capacity for implementation, and acceptable to clients and implementers, it has little chance of survival in a community.
The integrative validity model also gives rise to viability evaluation – a new evaluation type that assesses the extent to which an intervention program is viable in the real world (Chen 2010). Viability evaluation requires mixed (qualitative and quantitative) methods. On the one hand, the evaluation relies on quantitative methods to collect data with which it can monitor progress on recruitment, retention, and outcomes. On the other hand, the evaluation requires an in-depth understanding of stakeholders' views on, and their experience with, the specific intervention program.
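As a rough illustration of the quantitative side of a viability evaluation – the data layout and column names are assumptions, not prescribed by the model – progress on recruitment, retention, and outcomes can be monitored from a simple client log:

```python
# Sketch of viability monitoring from a hypothetical client log; the column
# names and the example records are illustrative assumptions.
import pandas as pd

clients = pd.DataFrame({
    "enrolled":  [True, True, True, True, True],
    "completed": [True, True, False, True, False],
    "improved":  [True, False, False, True, False],
})

recruitment = int(clients["enrolled"].sum())  # compare against a recruitment target
retention_rate = clients["completed"].mean()  # share of enrollees who stayed
improvement_rate = clients.loc[clients["completed"], "improved"].mean()

print(f"recruited: {recruitment}")
print(f"retention rate: {retention_rate:.0%}")
print(f"improvement among completers: {improvement_rate:.0%}")
```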
The third component of the integrative validity model is transferable validity. The concept is a revision of the Campbellian validity typology's external validity. Since the Campbellian typology was developed for research purposes, external validity is conceptualized as an endless quest for confirmation of an intervention's universal worth – impossible for any evaluation to achieve (Chen 2010). The integrative validity model proposes a re-conceptualization of external validity as transferable validity from a stakeholders' perspective for use in evaluation. Qualitative evaluators (Coxe, West, & Aiken 2009) prefer the term "transferability" to "external validity" to emphasize that generalizability can be enhanced by qualitative methods such as thick description. This chapter uses transferability to represent issues related to generalizability, but stresses that transferability can be enhanced by qualitative and/or quantitative methods. Transferable validity for program evaluation is defined according to such concerns. Thus, the integrative validity model defines transferable validity as the extent to which evaluation findings of viability and effectuality can be transferred from a research setting to a real-world setting or from one real-world setting to another targeted setting. This definition stresses that transferability for program evaluation has a boundary – the real world.
Evaluation approaches with strong effectual validity tend to be low in transferable validity. For example, efficacy evaluation provides the most rigorous evidence on effectual validity, but it maximizes effectual validity at the expense of transferable validity. Efficacy evaluation applies randomized controlled trials (RCTs) that create an ideal, controlled environment in order to rigorously assess an intervention's effect. The manipulation and control used in maximizing effectual validity greatly reduce the transferable validity of evaluation results to the real world. For example, to maximize effectual validity, RCTs usually use highly qualified and enthusiastic counselors as well as homogeneous and motivated clients that hardly resemble real-world operations. Stakeholders may regard evidence provided by efficacy evaluation as irrelevant to what they are doing.
Effectiveness evaluation is superior to efficacy evaluation for addressing transferable validity issues. Effectiveness evaluation estimates intervention effects in ordinary patients in real-world, clinical practice environments. To reflect the real world, recruitment and eligibility criteria are loosely defined to create a heterogeneous and representative sample of the targeted populations. Intervention delivery and patient adherence are less tightly monitored and controlled than in efficacy evaluations. The central idea is that to enhance transferability, effectiveness studies must resemble real-world environments. RCTs that require an intensive manipulation of setting are not suitable for effectiveness evaluation – evaluators often need to resort to non-RCT methods. By sacrificing some level of effectual validity, effectiveness evaluation enhances transferable validity.
Theory-driven evaluation argues that effectiveness evaluation's transferable validity can be further enhanced by incorporating contextual factors and causal mechanisms, as described in the action model-change model framework, into the assessment (Chen 1990, 2005). In addition, theory-driven evaluation proposes the concepts of exhibited and targeted generalization to help evaluators address transferability issues (Chen 2010). With exhibited generalization, an evaluation itself provides sufficient information on the contextual factors an intervention needs to be effective in real-world applications. Potential users can weigh the information on the effectiveness of the intervention together with the contextual factors, assess its generalization potential with regard to their own populations and settings, and decide whether to apply the intervention in their communities. Exhibited generalization can be achieved through the "action model-change model" framework in the theory-driven approach (Chen 1990, 2005), as previously discussed. Stakeholders sometimes have a particular real-world target population or setting to which they want to transfer the evaluation results. This is targeted generalization; that is, the extent to which evaluation results can be transferred to a specific population and real-world setting. Targeted generalization is achieved through methods such as sampling (Shadish et al. 2002), Cronbach's UTOS approach (Cronbach 1982), or the dimension test (Chen 1990). Thus, through exhibited or targeted generalization, transferable validity adds a workable generalization concept to program evaluation.
Furthermore, it is important to stress that transferable validity can mean either transferability of effectuality or transferability of viability. Transferability of effectuality has been the focus of the literature discussing external validity or generalizability. Transferability of viability, however, is an emerging concept that asks the question: "To what extent can evaluation findings of an intervention's viability be transferred from one real-world setting to another targeted setting?" The distinction is important; that an intervention's effectuality might transfer to another setting does not guarantee that its viability will similarly be transferable.

6 Top-Down vs. Bottom-Up Approaches for Advancing Validity

It is desirable for an evaluation to have effectual validity, viable validity, and transferable validity. As discussed previously, these types of validity do not go hand in hand; it is extremely difficult to simultaneously maximize all three types of validity in an evaluation. Two approaches have been proposed to deal with them sequentially: top-down and bottom-up (Chen 2010; Chen & Garbe 2011). The traditional top-down approach is a series of evaluations, beginning with maximizing effectual validity through efficacy evaluations, then moving on to effectiveness evaluations aimed at strengthening transferable validity. This strategy has been used intensively and successfully in biomedical research. Many scientists and evaluators traditionally regard the top-down approach as the gold standard of scientific evaluation. However, the application of this approach to evaluating health promotion and social betterment programs has been found to be not as fruitful as expected. Recently, evaluators and researchers have increasingly recognized that the application of this approach results in a huge gap between intervention research and real-world practice (Glasgow, Lichtenstein, & Marcus 2003).
Theory-driven evaluation proposes the bottom-up approach (Chen 2010; Chen & Garbe 2011) as an alternative way to sequentially address validity issues. Since stakeholders regard viable validity as of prime importance, the bottom-up approach proposes that the evaluation sequence begin with a viability evaluation. If the real-world intervention is in fact viable, a subsequent effectiveness evaluation provides objective evidence of the intervention's effectiveness in the stakeholders' real world. If necessary, the effectiveness evaluation could also address whether such effectiveness is generalizable to other real-world settings. After the intervention is deemed viable, effective, and generalizable in real-world evaluations, an efficacy evaluation using methods such as RCTs can rigorously assess the causal relationship between intervention and outcome. The differences between the top-down approach and the bottom-up approach are illustrated in Figure 6.

Figure 6: Top-Down Approach vs. Bottom-Up Approach
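To summarize the contrast in Figure 6, the two approaches can be read as opposite orderings of the same evaluation types. The sketch below is an illustrative rendering of that sequencing logic, not an implementation from the chapter; the stage names and stopping rule are assumptions:

```python
# Illustrative sketch: the top-down and bottom-up approaches as opposite
# orderings of evaluation stages, with the bottom-up sequence screening out
# non-viable interventions before expensive studies.
from typing import Callable, List

TOP_DOWN = ["efficacy evaluation", "effectiveness evaluation"]
BOTTOM_UP = ["viability evaluation", "effectiveness evaluation", "efficacy evaluation"]

def run_sequence(stages: List[str], passes: Callable[[str], bool]) -> str:
    """Run stages in order; stop at the first stage the intervention fails."""
    for stage in stages:
        if not passes(stage):
            return f"stopped at {stage}"
    return "passed all stages"

# Example: an intervention that would fail in real-world conditions is caught
# at the first (cheap) stage under the bottom-up ordering.
print(run_sequence(BOTTOM_UP, lambda s: s != "viability evaluation"))
```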



The bottom-up approach has a number of advantages over the top-down approach:
Assure the intervention's usefulness to stakeholders and avoid wasting money: The traditional top-down approach usually begins with an expensive and time-consuming efficacy evaluation to assess an innovative intervention. After millions of dollars are spent on an efficacy evaluation, it might be found that the efficacious intervention is very difficult to implement in the real world, of no interest to stakeholders, or not effective in the real world. This kind of approach tends to waste money.
By contrast, the bottom-up approach starts with viability evaluation. This first assesses the viability of an intervention as proposed by researchers or stakeholders. Because interventions with low viability are screened out at the beginning, this approach could save funding agencies considerable money and resources. The bottom-up approach encourages funding agencies to fund many viability evaluations and to select highly viable interventions for further rigorous studies.
Provide an opportunity to revise and improve an intervention in the real world before its finalization: One limitation of the top-down approach is that the intervention protocol or package is finalized before or during the efficacy evaluation – the protocol is not supposed to change after the evaluation. When an intervention protocol is finalized at such an early stage, the intervention cannot benefit from feedback from real-world implementation or from stakeholders' inputs for improvement. This seriously restricts an intervention's generalizability to the real world.
By contrast, the bottom-up approach affords an opportunity to improve an intervention during the viability evaluation. Intervention protocols refined through stakeholder inputs and implementation experience increase their real-world relevance and contribution.
Provide an alternative perspective for funding: In theory, funding agencies are interested in both scientific and viability issues. They want to see their funded projects succeed in communities and have the capability to solve real-world problems. In practice, however, many agencies tend to heavily emphasize scientific factors such as RCTs or other randomized experiments as a qualification criterion for grant applications (Donaldson, Christie, & Mark 2008; Huffman & Lawrenz 2006), while paying insufficient attention to viability issues. As discussed previously, if funding policy excessively stresses internal validity issues, it could waste money on projects that might be rigorous and innovative but have little practical value. The bottom-up approach provides an alternative perspective for funding agencies to address scientific and viability issues in the funding process. This perspective suggests three levels of funding:

Funding for viability evaluation: This funding level provides funds for assessing the viability of existing or innovative interventions. It formally recognizes stakeholders' contributions to developing real-world programs. Researchers can also submit their innovative interventions for viability testing; in doing so, however, they will have to collaborate with stakeholders in addressing practical issues.
Funding for effectiveness evaluation: The second level of funding is effectiveness evaluation for viable and popular interventions. Ideally, these evaluations should address both effectual and transferable validity issues.
Funding for efficacy evaluation: The third level of funding is efficacy evaluation for those interventions proven viable, effective, and transferable in the real world. Efficacy evaluation provides the strongest evidence of an intervention's precise effect, with practical value as an added benefit.
These three levels of funding will promote collaboration between stakeholders and researchers and ensure that evaluation results meet both scientific and practical demands.

7 Concurrent Validity Approaches

Under the conceptual framework of the integrative validity model, concurrent validity approaches attempt to deal with multiple validity issues in a single evaluation. A concurrent approach has important implications for program evaluation. Outcome evaluation is often time-consuming; for example, the turnaround time for an efficacy or effectiveness evaluation of a program could easily be a few years. A long turnaround time plus the related expenses are major reasons why stakeholders ask for only one outcome evaluation of a new or existing program, as opposed to the multiple outcome evaluations discussed under the top-down and bottom-up approaches.
In conducting a concurrent evaluation, evaluators face a challenging question: what type of evaluation is preferable for addressing validity issues? General guidance for concurrent approaches follows.
Maximizing Effectual Validity: When stakeholders need strong, objective proof of a causal relationship between an intervention and its outcomes, when they are willing to provide abundant financial resources to support the evaluation, and when they are willing to accept a relatively long timeline for conducting the evaluation, effectual validity is a priority. Evaluators will use the Campbellian validity typology, and when they do, the RCT is the gold standard.
Maximizing Viable Validity: If stakeholders have a program with multiple components that are challenging to implement in a community, and if they need evaluative information to assure the survival of the program, viable validity should be a priority. If stakeholders need information about whether a program is practical or helpful in the real world, or whether real-world organizations, implementers, and clients favor the program, an appropriate choice is to maximize viable validity. Evaluators could apply a viability evaluation for this purpose. Mixed (qualitative and quantitative) methods (Greene & Caracelli 1997; Tashakkori & Teddlie 2003) are particularly appropriate for viability evaluation.
Optimizing: If stakeholders prefer that an evaluation provide evidence of two or three types of validity (e.g., viable, effectual, and transferable), they must focus on finding an optimal solution for multiple validities in one evaluation (Chen 1988, 1990; Chen & Rossi 1983). A combination of effectiveness evaluation methodology with program theory is particularly useful for optimizing multiple validities (Chen 1990, 2005).

8 Theory-Driven Evaluation as an Alternative to Reductionism and Fluid Complexity

Evaluators have different views on how to conceptualize an intervention program and how to solve a problem. These different views of a program have a profound influence on how to evaluate it. To illustrate the point, I will start with a discussion of two contrasting views of a program: reductionism and fluid complexity. Reductionism postulates that a program is stable and can be analytically reduced to a few core elements; a problem can be solved by using an appropriate intervention. The focus of black-box evaluation, discussed previously, is a good example of reductionism. The main focus of such an evaluation is to assess whether a manipulation of the intervention can produce desirable outcomes. Other elements are subject to control in analysis in order to increase the precision of the assessment. One of the major benefits of reductionism is that it coexists well with statistical models and can provide accurate estimation. Reductionism has made a significant contribution to quantitative evaluation. However, reductionism can oversimplify a program and provide an unsustainable solution.
Fluid complexity provides a contrasting view to reductionism. This perspective argues that a program tends to be made up of diverse and interactive elements responding to turbulence in the environment. As a result, a program is constantly changing, and a problem has to be solved by modifying groups of variables simultaneously and rapidly. The way an expedition team functions provides a good illustration of the fluid complexity view. Christopher Columbus' expedition team not only had to constantly revise its plans and activities in order to alleviate ongoing external threats, but also completely changed its mission. After the original mission of finding a route to India was replaced with the new mission of discovering a new world and tasks were adjusted accordingly, the team and many others judged the expedition an enormous success. Fluid complexity makes an important contribution by bringing evaluators' attention to environmental influences and the dynamics of program processes. This approach may be useful for program planning and management, but in its current form it has limitations in evaluation. Not many existing quantitative methods or statistical models are capable of analyzing such complicated transformation processes and interaction effects, and whether qualitative methods can meet the challenge remains to be seen. Furthermore, if a program is extremely complex and dynamic, it lacks a stable entity for meaningful evaluation. In this case, consultants are more suitable than evaluators for offering opinions on how to address problems or issues generated by the constantly fluid and ever-changing system.
The theory-driven evaluation's view of a program represents a synthesis of reductionism and fluid complexity. Theory-driven evaluation postulates that a program must address both the change and stability forces described by these two contrasting views. On the one hand, a program's political, social, or economic environment can create uncertainties that pressure the program to make changes. On the other hand, a program has to maintain some level of stability in order to provide a venue for transforming an intervention into desirable outcomes. Many programs address these opposing forces by taking proactive measures to reduce or even manage uncertainties. The action model and change model discussed previously provide a conceptual framework for understanding where proactive measures take place. For example, program managers and staff can build partnerships to buffer political pressure, strengthen organizational ties with funding agencies to increase their chances of getting funds, provide implementers with training and incentives to reduce turnover, mobilize community bases to generate community support and reduce criticism, select a robust intervention to reduce potential implementation problems, and so on. A problem can be solved by reducing uncertainties and manipulating components as specified in the action and change models.
By synthesizing reductionism and fluid complexity, theory-driven evaluation may offer the benefits of both worlds. It agrees with fluid complexity on the influence of uncertainties on a program, but argues that uncertainties can be reduced through anticipatory action such as better planning and information feedback. In addressing change and stability forces, the theory-driven evaluation's program view as expressed in program theory is more complicated than reductionism's view of a program, but its scope is manageable and analyzable within the capability of existing quantitative and qualitative methods. There are programs suitable for applying either reductionism or fluid complexity, but the theory-driven evaluation's program view may be applicable to the majority of intervention programs. Theory-driven evaluation provides an alternative for assessing these programs.

9 Discussion

Program evaluation is a young applied science. In its infancy, it heavily borrowed concepts, methods, approaches, and theories from mature sciences. These methodologies and theories have been applied in evaluation and have proven useful; they will continue to contribute to program evaluation in the future. However, since these imported methodologies and theories were not developed for evaluation, I believe there are limits to how far they can help advance evaluation. To further advance program evaluation, we may need more home-grown evaluation theories and methodologies, dedicated mainly to the cause of evaluation, to energize the field. The development of theory-driven evaluation as demonstrated in this chapter represents an endeavour in this direction.

References

Bickman, L. (Ed.). (1987). Using program theory in evaluation. San Francisco: Jossey-Bass.
Bickman, L. (Ed.). (1990). Advances in program theory. San Francisco: Jossey-Bass.
Campbell, D. T./Stanley, J. C. (1963). Experimental and quasi-experimental designs for research. Chicago: Rand McNally.
Chen, H. T./Donaldson, S. I./Mark, M. M. (2011). Validity frameworks for outcome evaluation. In: H. T. Chen, S. I. Donaldson & M. M. Mark (Eds.), Advancing Validity in Outcome Evaluation: Theory and Practice (forthcoming). San Francisco: Jossey-Bass.
Chen, H. T. (1988). Validity in evaluation research: a critical assessment of current issues. Policy and Politics, 16(1), S. 1-16.
Chen, H. T. (1990). Theory-driven evaluations. Thousand Oaks, CA: Sage.
Chen, H. T. (1997). Normative evaluation of an anti-drug abuse program. Evaluation and Program Planning, 20(2), S. 195-204.
Chen, H. T. (2005). Practical program evaluation: Assessing and improving planning, implementation, and effectiveness. Thousand Oaks, CA: Sage.
Chen, H. T. (2010). The bottom-up approach to integrative validity: a new perspective for program evaluation. Evaluation and Program Planning, 33(3), S. 205-214. doi:10.1016/j.evalprogplan.2009.10.002
Chen, H. T./Garbe, P. (2011). Assessing program outcomes from the bottom-up approach: An innovative perspective to outcome evaluation. In: H. T. Chen, S. I. Donaldson & M. M. Mark (Eds.), Advancing Validity in Outcome Evaluation: Theory and Practice (forthcoming). San Francisco: Jossey-Bass.
Chen, H. T./Quane, J./Garland, T. N. (1988). Evaluating an antismoking program. Evaluation and the Health Professions, 11(4), S. 441-464.
Chen, H. T./Rossi, P. H. (1983). The theory-driven approach to validity. Evaluation and Program Planning, 10, S. 95-103.
Connell, J. P./Kubisch, A. C./Schorr, L. B./Weiss, C. H. (1995). New approaches to evaluating community initiatives: Concepts, methods and contexts. Washington, DC: Aspen Institute.
Cook, T. D./Campbell, D. T. (1979). Quasi-experimentation: Design and analysis issues for field settings. Chicago: Rand McNally.
Coryn, C. L. S./Noakes, L. A./Westine, C. D./Schröter, D. C. (2011). A systematic review of theory-driven evaluation practice from 1990 to 2009. American Journal of Evaluation, 32(2), S. 199-226.
Coxe, S./West, S. G./Aiken, L. S. (2009). The analysis of count data: a gentle introduction to Poisson regression and its alternatives. Journal of Personality Assessment, 91(2), S. 121-136. doi:10.1080/00223890802634175
Cronbach, L. J. (1982). Designing evaluations of educational and social programs. San Francisco: Jossey-Bass.
Donaldson, S. I./Christie, C. A./Mark, M. M. (Eds.). (2008). What counts as credible evidence in applied research and evaluation practice? Newbury Park, CA: Sage.
Fulbright-Anderson, K./Kubisch, A. C./Connell, J. P. (Eds.). (1998). New approaches to evaluating community initiatives. Vol. 2: Theory, measurement and analysis. Washington, DC: Aspen Institute.
Glasgow, R. E./Lichtenstein, E./Marcus, A. C. (2003). Why don't we see more translation of health promotion research to practice? Rethinking the efficacy-to-effectiveness transition. American Journal of Public Health, 93(8), S. 1261-1267.
Greene, J./Caracelli, V. J. (Eds.). (1997). Advances in mixed-method evaluation: The challenges and benefits of integrating diverse paradigms (Vol. 74). San Francisco: Jossey-Bass.
Hansen, M. B./Vedung, E. (2010). Theory-based stakeholder evaluation. American Journal of Evaluation, 31(3), S. 295-313. doi:10.1177/1098214010366174
Huffman, D./Lawrenz, F. (Eds.). (2006). Critical issues in STEM evaluation. San Francisco: Jossey-Bass.
Kristal, A. R./Glanz, K./Tilley, B. C./Li, S. (2000). Mediating factors in dietary change: Understanding the impact of a worksite nutrition intervention. Health Education & Behavior, 27(1), S. 112-125.
Miller, W. R./Toscova, R. T./Miller, J. H./Sanchez, V. (2000). A theory-based motivational approach for reducing alcohol/drug problems in college. Health Education & Behavior, 27(6), S. 744-759.
Patton, M. Q. (1997). Utilization-focused evaluation (3rd ed.). Thousand Oaks, CA: Sage.
Pawson, R./Tilley, N. (1997). Realistic evaluation. Thousand Oaks, CA: Sage.
Posavac, E. J./Carey, R. G. (2007). Program evaluation: Methods and case studies. Upper Saddle River, NJ: Pearson Prentice Hall.
Rogers, P. J./Hasci, T. A./Petrosino, A./Huebner, T. A. (Eds.). (2000). Program theory in evaluation: Challenges and opportunities (Vol. 87). San Francisco: Jossey-Bass.
Rossi, P. H./Lipsey, M. W./Freeman, H. E. (2004). Evaluation: A systematic approach. Thousand Oaks, CA: Sage.
Shadish, W. R./Cook, T. D./Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference. Boston: Houghton Mifflin.
Tashakkori, A./Teddlie, C. (Eds.). (2003). Handbook of mixed methods in social and behavioral research. Thousand Oaks, CA: Sage.
Weiss, C. H. (1998). Evaluation (2nd ed.). Englewood Cliffs, NJ: Prentice Hall.
Wholey, J. S. (Ed.). (1987). Using program theory in evaluation (Vol. 33). San Francisco: Jossey-Bass.
