[House Hearing, 110 Congress]
[From the U.S. Government Publishing Office]


 
ESEA REAUTHORIZATION: OPTIONS FOR IMPROVING NCLB'S MEASURES OF PROGRESS 
=======================================================================
                                HEARING

                               before the

                              COMMITTEE ON
                          EDUCATION AND LABOR

                     U.S. House of Representatives

                       ONE HUNDRED TENTH CONGRESS

                             FIRST SESSION

                               __________

             HEARING HELD IN WASHINGTON, DC, MARCH 21, 2007

                               __________

                           Serial No. 110-11

                               __________

      Printed for the use of the Committee on Education and Labor


                       Available on the Internet:
      http://www.gpoaccess.gov/congress/house/education/index.html

                     U.S. GOVERNMENT PRINTING OFFICE

34-025 PDF                 WASHINGTON DC:  2007
---------------------------------------------------------------------
For sale by the Superintendent of Documents, U.S. Government Printing
Office  Internet: bookstore.gpo.gov Phone: toll free (866)512-1800
DC area (202)512-1800  Fax: (202) 512-2250 Mail Stop SSOP, 
Washington, DC 20402-0001





















                    COMMITTEE ON EDUCATION AND LABOR

                  GEORGE MILLER, California, Chairman

Dale E. Kildee, Michigan, Vice       Howard P. ``Buck'' McKeon, 
    Chairman                             California,
Donald M. Payne, New Jersey            Ranking Minority Member
Robert E. Andrews, New Jersey        Thomas E. Petri, Wisconsin
Robert C. ``Bobby'' Scott, Virginia  Peter Hoekstra, Michigan
Lynn C. Woolsey, California          Michael N. Castle, Delaware
Ruben Hinojosa, Texas                Mark E. Souder, Indiana
Carolyn McCarthy, New York           Vernon J. Ehlers, Michigan
John F. Tierney, Massachusetts       Judy Biggert, Illinois
Dennis J. Kucinich, Ohio             Todd Russell Platts, Pennsylvania
David Wu, Oregon                     Ric Keller, Florida
Rush D. Holt, New Jersey             Joe Wilson, South Carolina
Susan A. Davis, California           John Kline, Minnesota
Danny K. Davis, Illinois             Bob Inglis, South Carolina
Raul M. Grijalva, Arizona            Cathy McMorris Rodgers, Washington
Timothy H. Bishop, New York          Kenny Marchant, Texas
Linda T. Sanchez, California         Tom Price, Georgia
John P. Sarbanes, Maryland           Luis G. Fortuno, Puerto Rico
Joe Sestak, Pennsylvania             Charles W. Boustany, Jr., 
David Loebsack, Iowa                     Louisiana
Mazie Hirono, Hawaii                 Virginia Foxx, North Carolina
Jason Altmire, Pennsylvania          John R. ``Randy'' Kuhl, Jr., New 
John A. Yarmuth, Kentucky                York
Phil Hare, Illinois                  Rob Bishop, Utah
Yvette D. Clarke, New York           David Davis, Tennessee
Joe Courtney, Connecticut            Timothy Walberg, Michigan
Carol Shea-Porter, New Hampshire

                     Mark Zuckerman, Staff Director
                   Vic Klatt, Minority Staff Director
















                            C O N T E N T S

                              ----------                              
                                                                   Page

Hearing held on March 21, 2007...................................     1
Statement of Members:
    Altmire, Hon. Jason, a Representative in Congress from the 
      State of Pennsylvania, prepared statement of...............    66
    McKeon, Hon. Howard P. ``Buck,'' Senior Republican Member, 
      Committee on Education and Labor...........................     3
    Miller, Hon. George, Chairman, Committee on Education and 
      Labor......................................................     1

Statement of Witnesses:
    Doran, Harold C., senior research scientist, American 
      Institutes for Research....................................    30
        Prepared statement of....................................    32
    Dougherty, Chrys, Ph.D, director of research, National Center 
      for Educational Accountability.............................    19
        Prepared statement of....................................    21
    McWalters, Peter, Commissioner of Elementary and Secondary 
      Education, State of Rhode Island...........................    26
        Prepared statement of....................................    28
    Olson, Allan, co-founder and chief academic officer, 
      Northwest Evaluation Association...........................     5
        Prepared statement of....................................     7
    Woodruff, Valerie, Secretary of Education, State of Delaware.    16
        Prepared statement of....................................    18

Additional Materials Submitted by Chairman Miller:
    Darling-Hammond, Linda, Charles E. Ducommun professor, 
      Stanford University School of Education, prepared statement 
      of.........................................................    66
    National School Boards Association (NSBA) letter.............    75


ESEA REAUTHORIZATION: OPTIONS FOR IMPROVING NCLB'S MEASURES OF PROGRESS

                              ----------                              


                       Wednesday, March 21, 2007

                     U.S. House of Representatives

                    Committee on Education and Labor

                             Washington, DC

                              ----------                              

    The committee met, pursuant to call, at 10:28 a.m., in room 
2175, Rayburn House Office Building, Hon. George Miller 
[chairman of the committee] presiding.
    Present: Representatives Miller, Kildee, Payne, Hinojosa, 
Tierney, Kucinich, Wu, Holt, Davis of California, Sarbanes, 
Sestak, Loebsack, Hirono, Yarmuth, Hare, Courtney, Shea-Porter, 
McKeon, Petri, Castle, Souder, Ehlers, Platts, Keller, Fortuno, 
Boustany, Kuhl, and Heller.
    Staff present: Aaron Albright, Press Secretary; Tylease 
Alli, Hearing Clerk; Alice Cain, Senior Education Policy 
Advisor (K-12); Fran-Victoria Cox, Documents Clerk; Adrienne 
Dunbar, Legislative Fellow, Education; Amy Elverum, Legislative 
Fellow, Education; Denise Forte, Director of Education Policy; 
Gabriella Gomez, Senior Education Policy Advisor (Higher 
Education); Lloyd Horwich, Policy Advisor for Subcommittee on 
Early Childhood, Elementary and Secretary Education; Lamont 
Ivey, Staff Assistant, Education; Brian Kennedy, General 
Counsel; Ann-Frances Lambert, Administrative Assistant to 
Director of Education Policy; Ricardo Martinez, Policy Advisor 
for Subcommittee on Higher Education, Lifelong Learning and 
Competitiveness; Stephanie Moore, General Counsel; Jill 
Morningstar, Education Policy Advisor; Joe Novotny, Chief 
Clerk; Lisette Partelow, Staff Assistant, Education; Rachel 
Racusen, Deputy Communications Director; Theda Zawaiza, Senior 
Disability Policy Advisor; Mark Zuckerman, Staff Director; 
James Bergeron, Counselor to the Chairman; Robert Borden, 
General Counsel; Kathryn Bruns, Legislative Assistant; Steve 
Forde, Communications Director; Jessica Gross, Deputy Press 
Secretary; Taylor Hansen, Legislative Assistant; Chad Miller, 
Professional Staff; Susan Ross, Director of Education and Human 
Resources Policy; and Linda Stevens, Chief Clerk/Assistant to 
the General Counsel.
    Chairman Miller [presiding]. Good morning. The Committee on 
Education and Labor will come to order.
    Today's hearing will shed light on one of the most 
important decisions we face in reviewing the No Child Left 
Behind law: whether or not to reform the current definition of 
adequate yearly progress. I can think of no question more 
central to the reauthorization and goals of the law.
    As one of the original authors of No Child Left Behind, I 
am often asked how I would like to see the law changed. The 
short answer is that I would like to see us be responsive to 
legitimate concerns while maintaining the core values of the 
law, providing an equal opportunity and an excellent education 
to every child, regardless of their race, their family income 
or disability.
    I recognize that there are some legitimate concerns with 
the current accountability system. And today we have the 
opportunity to focus on two concerns that have been central to 
this discussion on reauthorization: One, will a growth model 
system offer real accountability for student achievement? And, 
two, are there other credible and reliable academic indicators 
in addition to standardized tests that can offer an accurate 
picture of student achievement?
    With the system that we have currently, what is commonly 
known as the status model, we know there are some schools where 
students are making real progress and yet these schools are 
still not making AYP. Under the current system, a gain or loss 
in the percentage of students who are proficient could be a 
result of factors largely outside the school.
    At the joint hearings we held in this room last week with 
members of the House and Senate Education Committees, every 
organization who testified proposed growth models as the 
solution to these challenges. Today we will have the 
opportunity to examine whether growth models are the answer 
schools and states are seeking.
    The second focus of today's hearing has also generated much 
debate. And that is the concern that a single standardized test 
is too blunt an instrument to fairly and effectively measure 
school progress. We have heard from many in the civil rights, 
education and research communities who acknowledge that using 
one standardized test to compare students against a single set 
of high standards is essential to closing the achievement gap.
    They have also expressed valid concerns that that single 
test may not be able to tell us all we need to know about what 
students and schools can do. Having the most accurate 
information on student progress is critical to closing the 
achievement gap. And looking at other evidence in addition to 
state tests may be the way to obtain a more complete view of a 
child's true progress.
    Further, including indicators such as graduation rates and 
advanced course taking may incentivize progress in closing the 
debilitating achievement gaps in those critical areas. Today we 
will hear from leading experts and practitioners on these two 
complex accountability issues: growth models and multiple 
indicators.
    I look forward to their testimony and ask them to keep in 
mind three questions as we look for their help in these areas.
    First, are growth models and multiple indicators of 
performance consistent with No Child Left Behind's goal of 
ensuring that all children can read and do math at grade level 
by 2014?
    Second, do states have the capacity they need to ensure 
that information gathered to determine whether a school or 
district has made adequate progress is both valid and reliable?
    Third, do these approaches appropriately credit improving 
schools, or do they overstate academic progress? In other 
words, are they a step forward in offering a fairer, more 
reliable means of accountability, or are they a step backward, 
simply another loophole that hinders accountability?
    Our collective goal in reauthorizing No Child Left Behind 
should be to look to those changes that improve the integrity 
of the act and move us forward toward the stated goal of the 
act, to provide opportunity and an excellent education to every 
child.
    I want to thank the witnesses in advance for their 
testimony.
    And I would like now to yield to the senior Republican on 
the committee, Mr. McKeon, for his opening statement.
    Mr. McKeon. Thank you, Mr. Chairman, for convening this 
hearing as part of the series of hearings on No Child Left 
Behind we launched a week ago.
    Though last week's discussion provided a broad overview of 
our reauthorization effort and gathered input from both the 
House and the Senate, I believe today's hearing and the others 
to follow will serve an even more important purpose as they 
delve into the real challenges at the heart of NCLB.
    Today we begin with an examination of options for improving 
NCLB's measure of progress. And I thank our panel of witnesses 
for joining us for this examination.
    Adequate yearly progress is a benchmark that makes NCLB 
different from other education laws that came before it. It is 
the measure that tells all of us legislators, parents, 
teachers, administrators, and taxpayers exactly how a school is 
doing in educating students from one grade level to the next. 
And for that reason, it is vital that the concept remains in 
place.
    However, as we approach this year's reauthorization, it is 
important that we are open minded to tweaks in the law that 
could make it more practical while ensuring that the underlying 
principle of accountability remains consistent. And that is 
where growth models enter into this discussion.
    Under current No Child Left Behind guidelines, school 
districts use a status model to compare the performance of 
students in a specific grade against the performance of the 
students of that same grade during the previous year. Some have 
raised concerns about the reliability of the status model. They 
argue that a model which compares the achievement of the same 
students over time within a growth model may be more 
appropriate and act as a more accurate measure of adequate 
yearly progress.
    As we review the Department of Education's growth model 
pilot programs as well as last year's Government Accountability 
Office report on the implementation of growth models to 
determine if schools in certain states were making adequate 
yearly progress under No Child Left Behind, I believe that 
growth models can play an important role in this 
reauthorization. However, these growth models must be well-
designed. They must be rigorous. And they must meet a number of 
criteria that are consistent and central to NCLB.
    For example, they must include the requirements that all 
students reach proficiency, that the gaps between groups of 
students continue to close, and the growth model is tracked as 
part of a state data system and that a state's assessment 
system must produce comparable results from grade to grade and 
year to year.
    With that being said, members of this committee know as 
well as anyone that the reliability and utility of growth 
models is the focus of an ongoing debate. So I think we all can 
comfortably say today that we are not necessarily here to 
wholeheartedly embrace the concept nor dismiss it out of hand. 
Instead, we are simply here to listen and to learn. I am 
looking forward to this hearing and the additional hearings we 
will be having in this series.
    And again, I thank the witnesses and look forward to their 
testimony.
    Chairman Miller. Thank you.
    With that, we will begin with the witnesses.
    Our first witness will be Allan Olson, who is the co-
founder and chief academic officer of the Northwest Evaluation 
Association. Northwest Evaluation Association is a non-profit 
organization that provides research, support and technical 
assistance to 2,400 partnering school districts and education 
agencies throughout the United States. Dr. Olson has led the 
Northwest Evaluation Association in its efforts to build the 
largest nationwide database in longitudinal student test 
results.
    Valerie Woodruff is the secretary of education from the 
Delaware Department--excuse me. I think our colleague wanted to 
introduce----
    Mr. Castle. Thank you, Mr. Chairman.
    It is a great pleasure for me to introduce Delaware's 
secretary of education, Valerie Woodruff. Val has been 
secretary since July of 1999, prior to which she served as the 
associate secretary for curriculum and instructional 
improvement for Delaware. Her career is rooted in education. 
And she has been a teacher, counselor, assistant principal and 
principal in high schools in both Maryland and Delaware.
    As secretary, Val has led the implementation of Delaware's 
accountability system as well as implementation of No Child 
Left Behind. I appreciate Val's commitment to raising student 
achievement, the importance of high-quality teachers and school 
leaders and the belief that all children deserve an excellent 
educational experience.
    Val is the Delaware representative on the Southern Regional 
Education Board, serves on the executive committee of SREB and 
is the first K through 12 educator to serve as vice chair. She 
has also served on the Board of the Council of Chief State 
School Officers and was the president of the Chief State School 
Officers from November of 2005 through November of 2006.
    We are lucky to have her in Delaware. And don't try to take 
her away.
    I yield back.
    Chairman Miller. Thank you.
    Welcome, Secretary Woodruff.
    Dr. Chrys Dougherty is the associate director of Research 
National Center for Educational Accountability. Dr. Dougherty 
is the director of this center and has authored the ``Parents 
Guide to Asking the Right Questions about School'' and has 
written extensively on the value of longitudinal data and the 
10 essential elements of statewide student information systems.
    He has been an elementary school science teacher in 
Oakland, California, is a professor of statistics, 
econometrics--ergonomics is what we fight over in the labor 
side of this committee.
    Mr. Dougherty. Yes, econometrics.
    Chairman Miller. Econometrics. Yes. You are the guys? They 
are always quoting you guys about this and that. Okay--and 
education policy at the LBJ School of Public Affairs.
    Peter McWalters is commissioner of the Rhode Island 
Department of Elementary and Secondary Education. Prior to 
becoming Rhode Island's commissioner, he served over 20 years 
in a variety of educational leadership and teaching positions, 
including the superintendent of schools in the city school 
district of Rochester, New York.
    Dr. Harold Doran is the senior research scientist, American 
Institutes for Research, where he supports the development of 
state testing and accountability systems as an applied 
statistician and psychometrician. And he is currently a member 
of Secretary Spellings' peer review panel for state growth 
models. He has been an elementary school principal and a 
classroom teacher.
    Welcome to all of you, and thank you for your contributions 
this morning.
    Mr. Olson, we are going to begin with you.
    There will be a light in front of you. The green lights 
will go on when you start your testimony. There will be a 
yellow light that suggests you should start wrapping up in the 
next minute or so, and then a red light when your time has run 
out.
    But we will obviously allow you to finish a thought and a 
sentence and maybe even a paragraph. There you go.

  STATEMENT OF ALLAN L. OLSON, CO-FOUNDER AND CHIEF ACADEMIC 
           OFFICER, NORTHWEST EVALUATION ASSOCIATION

    Mr. Olson. If it is brief.
    Chairman Miller, Ranking Minority Member McKeon and members 
of the committee, I appreciate the opportunity to testify 
before you.
    Again, my name is Allan Olson. I am co-founder of an 
organization called the Northwest Evaluation Association. The 
Northwest Evaluation Association is a not-for-profit 
organization. We provide testing services to school districts 
around the nation and also have a very strong research staff. 
So we do research in the field also.
    We are currently providing very accurate measures, 
assessments, and growth measures for approximately 3 million 
children multiple times a year in 49 states. After 30 years of 
experience in research, it is clear that NCLB could be 
strengthened and more effective if states were allowed to and 
encouraged to implement measures of student achievement that 
were accurate enough to actually measure growth, actually 
measure growth of the individual students. Okay?
    So I am talking about student level and actually designed 
specifically for determining change over time at the child 
level. An accurate growth measure provides the best evidence of 
a school's effectiveness. It also improves the assessment data 
in ways that help students, teachers, parents, and others focus 
learning and focus their efforts to improve learning over time.
    In other words, a very good accurate measure and a growth 
measure will inform many people within the education community 
in manners that allow them to change their behaviors to become 
increasingly effective. So a good growth measure is not only 
probably the best accountability measure, it is also the best 
possible way to improve our capacity to improve learning.
    Today's computerized adaptive tests represent the most 
common approach to meet these requirements. However, states 
could develop other methodologies.
    An accurate measure of each student's achievement is 
reported on a cross-grade vertical measurement scale provides 
the school and the state information about whether a student is 
proficient, in other words, meets all the requirements inside 
No Child Left Behind related to status capacity and information 
about how far a student is below or above that standard, not 
just information that the child is below or above, but actually 
how far below or above, which also gives us a chance to 
establish growth targets at the child level, growth targets 
that would lead toward proficiency and/or growth targets that 
would help children or focus on children who are well above the 
standard at the time of the measure.
    Allowing states to accurately measure growth of each child 
strengthens all the foundation pieces of No Child Left Behind 
while providing educators evidence that will inform improvement 
of instruction and learning. So what we would be asking for is 
states be allowed to have a system that increases the quality 
and accuracy of information to inform the process's improvement 
while putting in place an accountability measure as required by 
No Child Left Behind.
    As I mentioned before, growth measure is the best measure 
of whether a school, a program, a district is being effective 
in meeting the needs of its children. A growth measure that is 
accurate enough to measure growth in an individual child also 
helps a district know whether they are being effective with 
children of differing characteristics, whether that is 
ethnicity, whether it is gender, whether it is starting place 
on a scale. It gives the school district information about how 
effective they are with those children.
    Accuracy is the center piece of a good growth measure. The 
test requirements today that are in place, the tests the states 
have in place today are quite accurate for children who happen 
to be near the proficient line or happen to be in the middle of 
a distribution. But by the nature of the design requirements 
for tests, the tests that are in place will not be as accurate 
for low-achieving children or will not be as accurate for high-
achieving children.
    And if your measure isn't as accurate for low-or high-
achieving children, it will not be a best growth measure for 
those children and will not provide the kind of information 
that will lead to constant improvement focussed on those 
children. A measure of that nature also will not be very 
accurate for purposes of diagnostic reporting, which is one of 
the requirements of No Child Left Behind. But states probably 
are falling short of the intent of that particular provision in 
the law. A good, accurate measure, a good, accurate growth 
measure would allow states to respond in that manner.
    I think in order to have a very good measure, it will be 
important to remove the real tight constraints right now that 
are in place, either intentionally or by the nature of the way 
the law is being implemented, remove the constraints for a very 
tight alignment to just grade level content standards with the 
measure. Many children are functioning well below those content 
standards. And we need to measure those children well.
    The law calls for challenging all children. To challenge 
all children, we must have a measure that is accurate for all 
children and be able to set growth targets that are appropriate 
for those children.
    Thank you very much.
    [The statement of Mr. Olson follows:]

   Prepared Statement of Allan Olson, Co-Founder and Chief Academic 
               Officer, Northwest Evaluation Association

    The Northwest Evaluation Association (NWEA) is a not-for-profit 
organization which partners with over 2,500 school districts to promote 
student learning provides, precise and consistent growth assessment 
testing services for over 3 million children in 49 states. For over 30 
years, we have been providing assessments in key subjects in grades 2-
12, as well as detailed reports on student learning, and offering 
training to help educators use data to improve practice. Our tests are 
given multiple times per year in paper-and-pencil and computer-adaptive 
formats and give educators, parents, students, and policymakers a clear 
and comprehensive look at how much academic growth individual students 
are making over time. This kind of data has been of great value to our 
partner districts and has resulted in increases in the number of 
children tested at a rate of over 50 per cent per year. NWEA's 
mission--``partnering to help all kids learn''--also has lead us to 
research educational policy and practice based on the extensive data in 
our database and our experience with thousands of teachers and schools.
    In the course of this research and working with our 2,500 partner 
districts, it has become clear to us that in order to help students 
learn more, we have to provide teachers with the information that they 
need to be able to identify student strengths and deficiencies and to 
better understand how far each child is from achieving proficiency. 
This means that we have to measure accurately each student's current 
achievement level to understand what a student knows and needs to know 
next, and to track each student's growth over time to be sure that 
young people are moving at a rate of growth that will help them become 
proficient. We have to provide this information to the teacher as 
quickly as possible, in a form that enables the teacher to make the 
best instructional decisions for the students.
    The aspect of this approach that is germane today is the 
measurement and use of student growth information. What we mean by 
growth measurement is using assessment to ``follow the child'' in order 
to find the actual achievement level of the child and then to measure 
it over time.
    In this area, our organization has reached three conclusions, as 
follows:
    1. We will gain a much more complete and useful picture of the 
performance of our schools if we include the growth of individual 
students in our accountability systems.
    2. Students must have growth targets that challenge them and that 
lead them to the state's definition of proficiency in a set of skills 
that will make them productive members of society when they graduate 
from high school.
    3. Teachers, principals, students, and parents must all have a 
clear understanding of the amount of achievement growth that the 
student must make each year to enable them to participate in the 
student's growth.
    Why is measuring individual achievement growth important?
    As NCLB has been implemented, it has become increasingly obvious 
that the way student achievement is measured currently does not begin 
to tell us whether the school is doing a good job or a poor job 
teaching the students that come through its doors. While there are many 
reasons for this, the issue can be seen very clearly as follows:
    Schools ``A'' and ``B'' have the same percentage of students 
identified as ``proficient.'' Students in school B grew, on average, 
twice as much as students in school A to achieve their proficiency. 
Which school is doing a better job?
    We believe that the answer is the school that is achieving greater 
rate of progress in moving students towards proficiency. Promoting the 
growth of individual students from one year to the next is the hallmark 
of a successful school. This is especially true for students who are 
below proficiency levels for a given grade and need to grow faster in 
order to catch up. Providing teachers a measure of how much the student 
must grow to get where the students needs to be also gives that teacher 
a useful tool for addressing the learning needs of each individual 
student.
    Students come to school with different preparation, motivation, and 
support resources. It is the job of every school to help every student 
move forward regardless of his or her current achievement level. For 
students with low achievement levels, the school needs to accelerate 
growth, to help these students reach levels that will allow them to 
compete when they graduate from school. For students with high 
achievement levels, the school needs to keep them growing to keep them 
engaged and to allow them to reach their full potential.
    Research (Kingsbury and McCall, 2006) has clearly indicated that 
schools vary greatly in the amount of growth that they cause in student 
achievement. It is equally clear that student growth differs by grade 
and demographic group within a school. Without information about 
student growth, we cannot tell the full story of a school, and we 
shouldn't try to judge whether the school is doing a good job or not.
    Can we measure achievement growth of individual students?
    It is clear that two components are needed to measure the 
achievement growth of individual students. The first requirement is the 
ability to measure students accurately to gain a deep understanding of 
where their learning is. Current tests provide very little information 
about students who are high performers and are well beyond their grade 
level or low performers who are well behind grade level. To be able to 
measure achievement for these students requires a measurement scale 
that goes beyond grade-level testing and identifies what students know 
across the many strands of knowledge that a student needs to know to be 
identified as a proficient.
    Let me illustrate the point. Consider, for example, a twelve year 
old child (grade 6) performing two grade levels below his age level 
(grade 4). If that child achieves a year and a half of growth for each 
of the next two years, he will be in grade 8 and perform much like a 
7th grade student. That is a huge success. However, if we only measure 
the ``status'' of the child as to his age level, and not the growth, we 
will conclude that the child is a failure and the school is failing him 
even though he will have caught up a whole grade level. Further, we 
won't be able to inform the teacher, the parents, or the child where 
the student is truly performing so that they can craft a plan to reach 
proficiency.
    The tools are available to provide this kind of detailed 
information. Growth measures have been in use for several decades. 
Computerized adaptive testing (CAT: Weiss, 1982) was developed by 
researchers with funding from the federal government in order to 
provide a way to measure large, diverse groups of individuals 
efficiently and accurately. An adaptive test allows us to measure the 
performance of high-achieving and low-achieving students as accurately 
as we measure the students in the middle of the distribution. Since its 
development, adaptive testing has been used for a host of high-stakes 
and low-stakes applications, from individuals entering the armed 
services to individuals trying to be certified in high-tech 
specialties. NWEA alone has administered over 60,000,000 adaptive tests 
to students.
    NWEA urges Congress to allow states and school districts to measure 
student growth as part of the accountability requirements under No 
Child Left Behind. We believe the great advantages such an approach 
provides will be sufficient motive to states to adopt this option as 
they consider how best to serve their children.
    It is important to stress that we are not proposing to abandon 
information about whether a child is operating at grade level. Rather, 
we want to allow states to go further. As illustrated in the slides 
that accompany this testimony, we can be far more effective in helping 
children achieve greater growth, so they can move to proficiency and 
beyond, if we more accurately know where they are performing and we can 
measure their performance growth.
What Measuring Growth Can Do
    One of the critical challenges confronting NCLB is ensuring that 
accountability is linked to approaches that actually are useful in 
helping schools and teachers help students reach proficiency.
    If we know where a student stands, and how much they must grow 
before they graduate, we should be able to marshal our resources to 
make sure that the needed growth occurs.
    If we know how much growth is typical for a student who starts the 
year with a certain level of achievement, we should be able to 
immediately set goals for the student that represent good growth, great 
growth, and incredible growth.
    If we know the growth goals for a student, we should be able to 
tell the teacher exactly what the student needs to learn by the end of 
the year to meet the growth goals.
    If we know the growth goals for each student that a teacher is 
working with, we should be able to guide that teacher so that he or she 
can design and redesign the instructional approach she will take with 
her students.
    And if we accomplish these things, the accountability is aligned 
with how students learn and what schools need to do.
    After all, the central issue is how we help the current generation 
of students meet our expectations. Measuring growth of each child gives 
us information that we can use to improve the growth of all of our 
students. At the same time, information about growth at the class and 
school level helps us describe our schools and their efficiency in ways 
that are far more useful to schools, teachers, parents and kids than 
what we learn by confining ourselves to the simple status question of 
current grade level.
    Finally, for our students who aren't growing to meet their growth 
goals, our response needs to be centered on the needs of those 
students. We need to reorganize to help the students.
    In conclusion, our request is a simple one: make it clear in the 
law that states are permitted, or even encouraged, to do more than just 
measure status. They can, and should, also measure growth as part of 
that same process.
    Thank you for the opportunity to share our experience and data with 
you this morning.
Improving NCLB Accountability
    Current Law: NCLB requires states to develop a measure of annual 
yearly progress (AYP) in order to hold districts and schools 
accountable. It stipulates that by the 2005-06 school year the states 
must have in place an assessment system for all students, as well as 
various subgroups, that annually tests student performance in reading/
language arts and mathematics in grades 3-8, and for a single test in 
grades 10-12. By the 2007-08 school year, states are also required to 
assess every student in science, at least once in each of the following 
grade spans: 3-5, 6-9, and 10-12.
    NCLB also allows states and localities to include other measures of 
student academic progress but these measures may not be used in place 
of the assessments described above for purposes of establishing AYP.
    The Problem: Currently under NCLB, schools are evaluated for their 
progress in improving student performance by comparing successive 
groups of students rather than tracking the same group of students over 
time. In other words, to meet AYP, schools must show that each grade 
level (e.g. third graders) has improved over the previous year, not 
that each student or the same group of students (e.g. third graders 
that are now fourth graders) has progressed. Therefore, these yearly 
comparisons do not track the performance of the same students.
    This approach to assessment does not provide the information we 
need to accurately measure what individual students know and what 
educators need to know to address their learning deficiencies and 
support their achievement growth.
    In addition, since the focus of NCLB is on measuring proficiency 
rather than annual learning progress, schools that have improved 
substantially but have not yet reached proficiency targets are rated 
the same way as schools that have no improvement. Achieving learning 
gains provides no credit to these schools.
    The Solution: In addition to the annual testing by grade and by 
subject currently required, states should be allowed to meet their NCLB 
annual yearly progress assessment requirements by measuring the 
performance growth of every student.
    NCLB recognizes the critical role that timely, accessible, and 
accurate information about student academic performance plays in 
informing and motivating educators, policymakers, parents, and the 
public in finding ways to raise student achievement and close the 
achievement gap. Giving states the option of measuring student growth 
to meet AYP assessment requirements would provide a more accurate 
measure of how students are progressing. By measuring growth over the 
course of each grade, it would provide educators a clear roadmap for 
bringing a student to proficiency.
    Currently, schools that improved substantially but did not make AYP 
are viewed the same way as schools that made no improvement. Including 
a growth measure in assessing school improvement would be fairer. 
Schools that have made substantial gains in student academic 
performance would be recognized for those improvements, even if they 
still do not meet proficiency standards. This change also would allow 
states to focus their support on those schools that are really 
struggling.
Questions and Answers
    What are the key attributes of a growth model of assessment?
    Growth measures provide the kind of information about what students 
know and do not know in key strands of knowledge within subject areas 
that helps teachers identify and focus on student strengths and 
deficiencies and determine what needs to be taught next. Using growth 
models, educators and young people can identify desired semester-by-
semester targets for student achievement that, if met, will ensure that 
young people are making progress toward mastering content and attaining 
proficiency. With this information, proficiency targets are not some 
abstract, far-away goal but clear benchmarks for students and teachers 
to reach that help ensure that students achieve proficiency over time.
    Measuring growth requires testing students against a common scale. 
This means that student achievement is measured to determine where a 
student fits across the entire continuum of learning in a particular 
subject area rather than on a grade-specific scale. The growth measure 
is actually a measure of growth toward proficiency, which is not tied 
to grade level but to mastery of content. Tests used by states today 
that measure what a student needs to knows within a particular grade 
level provide very good information about students performing in the 
middle range of performance (where state cut scores for accountability 
are pegged). But these tests do not ask enough questions to paint a 
useful portrait of what is happening with high-achieving and low-
achieving young people who typically perform at the extremes or outside 
their grade levels. For example, state grade-level tests provide little 
information when a sixth-grade student is performing at the fourth-
grade level or about a fourth-grade student who is performing at the 
fifth- or sixth-grade levels.
    If a state chooses to measure student performance growth from year 
to year instead of progress towards meeting fixed performance targets, 
won't the gaps between low- and high-performing students just be 
continued?
    Not necessarily. If states set growth targets on the road to 
proficiency then states, districts, and schools will continue to have 
markers to meet to ensure that all students graduate from high school 
with the knowledge and skills they need for productive and success 
lives.
    Is it realistic to assume that low-performing students can grow at 
a faster rate than higher-performing students to meet those targets?
    Currently, NCLB requires states, districts and schools to meet 
fixed performance targets by grade and by subject for all children. The 
only way to meet the intended purpose of NCLB--to close achievement 
gaps--is to identify those gaps and develop strategies for addressing 
them. By providing schools and teachers information on how a student is 
progressing within the school year and between school years is more 
likely to impact teaching and learning and, therefore, accelerates 
improvements in student achievement.
    Using growth measures also addresses another key problem with the 
current law. Currently, state targets for AYP are set all over the map. 
While a few states have set high performance targets early on, many are 
waiting until several years from now to establish higher targets for 
achievement that are closer to desired proficiencies. This delay means 
that in several years, schools that have been judged as meeting AYP 
will suddenly be far off from state targets. Growth measures provide a 
way of setting steady and achievable targets that are based on what can 
truly be expected of young people.
    How are student growth measures different than the currently used 
value-added testing, also called a ``growth model''?
    The U.S. Department of Education is supporting pilot ``growth 
model'' accountability plans in school districts in Arkansas, Delaware, 
Florida, North Carolina, and Tennessee. It has been mandated for use by 
all school districts in Pennsylvania and Ohio and several hundred 
school districts in 21 states. New legislation in Arkansas and 
Minnesota calls for implementing a form of value-added measurement, and 
the School Boards Associations in Iowa and New York are currently 
piloting a value-added program. Dallas and Seattle are the most 
prominent urban districts that use the value-added approach. In some 
states, such as Tennessee, this value-added model (VAM) is the bedrock 
of the accountability system, and the results are used to judge the 
quality of schools and the effectiveness of individual teachers.
    Value-added models of assessment, however, are an analytic 
methodology applied to NCLB test results. It is a method of statistical 
analysis, rather than a particular test, used to analyze longitudinal 
test data in order to isolate factors affecting a student's growth over 
time. It provides educators general information about which students 
have benefited most and least and about instructional impact--how 
effective it has been in providing students with a year's worth of 
growth from where they began the year. Through this information, 
teachers, principals, district administrators, and school board leaders 
can learn whether high achievers, middle achievers, or low-achievers 
are making the most progress, and what can be done to raise the 
performance of each group. Impact data can determine whether and the 
extent to which schools and classroom teachers are effective in raising 
performance.
    The currently used value-added models, however, do not provide the 
kind of rich multiple-times per year diagnostic information about the 
key strands of knowledge within subject areas that each student needs 
to master to move to the next level of performance. It also does not 
tell why a particular teacher is effective or not effective. And the 
value-added analysis is applied to tests that are not particularly 
accurate for students who are high achievers and low achievers, thus 
blunting its value as even a broad analytic tool.
    Won't a growth model require a sophisticated data system that will 
substantially add to state and district costs?
    States will be given the flexibility to continue with the current 
assessment models or to substitute or add a growth measure of progress 
towards measuring AYP.
    Our experience suggests that using a growth measure of progress 
could cost less, not more, than the current NCLB testing requirement. 
In Idaho, for instance, the cost is $13.00 per students to test 
students in grades 2-10, four times a year, including training and 
reporting costs. This is less than most states are spending on once-a-
year testing under NCLB requirements.
    Isn't testing itself the problem, imposing unnecessary burdens on 
school districts and leading teachers to teach to the test? Shouldn't 
we just eliminate the testing requirements from the law?
    If the nation is serious about accountability in education and 
about making sure that tax dollars invested in education result in a 
student population that is prepared for work and postsecondary 
education, we should not back away from the concept of testing. The 
issue is not whether or not to test but what kind of testing will yield 
the kind of information that actually helps teachers help students. 
Expansion in the use of growth measures rather than one-shot grade-
level tests can help educators, policymakers, and parents determine 
whether schools and students are actually making required progress 
toward proficiency. They also will tell educators, school board 
members, parents, and students what areas of learning they need to be 
working on to make desired growth targets.
    Since more than 2,500 school districts use out of grade-level 
testing currently, why does the law need to be changed? Can't districts 
simply do what you're proposing under current law?
    Yes, any district can use whatever test it wants to measure student 
learning. However, the law makes specific reference to use of grade-
level tests without referring to growth measures to fulfill the 
assessment and accountability requirements of NCLB. The 2,500 school 
districts that use growth measures to determine the performance growth 
of children are pioneers that have demonstrated the value of this kind 
of assessment to provide comprehensive information about individual 
student achievement in key subject areas to help further accelerate 
achievement gains. There are over 12,000 other public school districts 
that might include testing that tracks the performance of students over 
time if the law explicitly recognized this kind of testing as an 
alternative in determining whether schools, districts, and states are 
in compliance with the law.
    Do other companies also offer this sort of testing, or are you 
simply trying to change NCLB to benefit NWEA?
    Many other testing organizations--such as the Educational Testing 
Service and Scantron--already use testing methodologies that can 
pinpoint individual student achievement against a common scale and 
provide immediate feedback. This type of testing was first introduced 
by the U.S. military in the 1970s. The computer-adaptive testing used 
by NWEA, for example, is basically the same methodology ETS uses in its 
Graduate Record Exam and GMAT tests.
    Encouraging states to use computer-adaptive methodologies and 
growth measures that can given by computer or paper-and-pencil tests 
might actually hurt NWEA by providing much larger companies greater 
incentives to develop growth measures and enter this market. But we 
believe that it is the right thing to do and is not simply a matter of 
which companies have the biggest market share, but whether we have the 
kinds of tests that will help more schools bring more students to 
proficiency.
Northwest Evaluation Association
    The Northwest Evaluation Association (NWEA) is a national nonprofit 
organization based in Portland, Oregon, that partners with school 
districts and education agencies nationwide to promote academic student 
growth and school improvement. NWEA provides computer adaptive and 
paper-and-pencil assessments in mathematics, language arts, and science 
in grades 2-12 as well as training and comprehensive reporting tools 
that enable educators to measure and promote individual student and 
school academic growth. Their products and tools are provided at a 
price districts can afford, and any profit is reinvested in product 
development and technical assistance.
    Three decades of experience nationwide. Over the past 30 years, the 
company has tested more than 25 million young people; it currently is 
helping to assess more than 4 million students a year in more than 
2,400 school districts in 49 states. Its presence is particularly 
strong in Illinois, Indiana, Minnesota, New Hampshire, and South 
Carolina, where it tests the vast majority of students in the state.
    Growing demand for student growth data to support NCLB. NWEA has 
grown by 50 percent a year in recent years to meet the demand of school 
districts for formative assessments that track the growth of individual 
students over time and offer immediate feedback to district leaders, 
teachers, students, parents, and school board members.
    An immediate and vital source of information for teachers. The 
value of the assessments to schools is considerable, in part because 
students and teachers receive immediate results which allow them to 
better understand and develop strategies to offset student learning 
deficiencies. The assessments evaluate student achievement across 
content standards, and results help identify problem areas in content 
knowledge, skills, and concepts that need addressing to best maximize 
achievement. Because it is a growth measure, teachers use the data to 
determine if students are making equal to, or normal, growth. The test 
also offers schools valuable information about the most effective 
teachers, student groupings, or the need for alternative ways to focus 
instruction.
    An accountability tool for NCLB. In addition, schools and district 
leaders can compare scores with the growth targets for a particular 
year and see whether students are on target for meeting proficiency 
levels required to achieve the goals of NCLB. Results can be 
disaggregated by NCLB subgroups to give periodic indicators about how 
well a school is doing in serving diverse populations.
    A unique resource for finding proven answers to some of our most 
challenging educational issues. All the student growth data gathered by 
NWEA is aggregated into our Growth Research Database, the largest 
nationwide repository of student test results which is used by states, 
national organizations, and prominent national researchers to assess 
the impacts of policy and practice on student achievement growth.
    For more information, go to http://www.nwea.org.
    
[GRAPHIC(S) NOT AVAILABLE IN TIFF FORMAT]
    
    
                                 ______
                                 

STATEMENT OF VALERIE WOODRUFF, SECRETARY OF EDUCATION, STATE OF 
                            DELAWARE

    Ms. Woodruff. Good morning.
    Chairman Miller. I am going to have to turn my microphone 
on, and you are going to have to pull yours closer to you.
    Ms. Woodruff. Okay. It wasn't on. Okay?
    Chairman Miller, Ranking Member McKeon, and members of the 
committee, thank you for this opportunity to testify today 
about the implementation of growth models and accountability 
systems.
    I am proud to say that Delaware was among several states 
that had implemented a school and district accountability 
system to measure our progress and standards-based reform prior 
to the passage of No Child Left Behind.
    We began assessing English language, arts, and mathematics 
in 1998. And based on early information about the goals of No 
Child Left Behind, we applauded the initial work of Congress 
and believed that we could easily meet the requirements of the 
law.
    Our original accountability system included three measures 
of student performance: status, which is essentially AYP; 
growth; and the improvement of the lowest performing student. 
To our schools and to our communities, these measures made 
sense and had what I refer to as face validity. Simply stated, 
educators and others understood the value of measuring not only 
the performance of one cohort of students to another, but also 
the change in performance of the same cohort of children over 
time.
    And certainly, they saw the value of attending to and 
measuring the improvement of the lowest performing students and 
of closing the achievement gap. Delaware was the tenth state to 
receive approval of our accountability plan in the spring of 
2003. Also, we were among the first states to receive full 
approval of our standards and assessment system.
    Delaware implemented a unique identifier in 1984. And we 
have worked diligently since that time to link student 
demographic data with achievement data. And we have reported 
that for many years.
    Given all of these factors, we were anxious to talk with 
the Department of Education and to convince them that the use 
of growth models was a natural progression in creating a mature 
accountability system. When the department allowed states to 
submit growth models for the 2006 accountability measurement, 
we felt confident that our proposal would be approved. That did 
not occur. And we were perplexed at the feedback we received.
    None of the questions were related to the model itself. 
They had to do with other things. It did not seem that the peer 
reviewers had clear guidance about the criteria, nor did they 
understand the different models that can be used to measure 
growth.
    We made several changes. And we were approved and will be 
using the growth model measurement for the 2007 accountability 
year.
    The model that we chose supports our philosophy of 
continuous improvement for all students. It is easy to 
understand. It is easy to explain. It provides schools with 
information that shows which students are making progress 
toward proficiency, which students are maintaining proficiency, 
and which students are slipping backwards, which is something 
we all want to avoid.
    It is not enough to measure the average performance of even 
a small cohort of students. Systems must focus on the 
performance of individual students and must provide schools 
with the appropriate incentives to address student needs.
    Moving forward, the law should not only encourage the use 
of a variety of accountability models, not only allow it, but 
also encourage it. These models should be focused on individual 
student achievement and build on adequate yearly progress to 
promote more valid, reliable, and educationally meaningful 
determinations. States need to be encouraged to innovate and 
seek new and better ways of continuous student achievement.
    Specifically, the Department of Education must establish 
clear and consistent policies and procedures that enable states 
to use growth models. It should articulate the foundation 
elements that must be in place. For example, the state must 
have a unique student identifier, approved standards and 
assessment system, and a data system that is able to collect 
and track individual performance over time.
    When states have those elements in place, they should not 
then have to guess about how their proposals will be judged. 
Those criteria need to be clear and understandable. They should 
define what must be contained, and they must select and train 
peer reviewers so that states can be guaranteed a fair and 
equitable review of all proposals, regardless of the background 
or philosophical beliefs of the reviewers. The peer review 
process must be transparent and iterative and be focused on 
improving the quality of the accountability system, not 
limiting their scope and use.
    In order for states to pursue stronger, more robust systems 
of accountability, a partnership of support and technical 
assistance must be in place. States need ongoing technical 
assistance in order to build a strong knowledge base about 
accountability models. We need to benefit from research about 
which models are most effective and why. And they need 
continuing support and development and improving of data 
systems.
    For example, as strong as our data system is today in 
Delaware, we can benefit from knowledge and support about 
cutting edge technology. All states are eager to learn more and 
to improve the quality of education for all of our children.
    I appreciate the opportunity. And I will be glad to answer 
questions. Thank you.
    [The statement of Ms. Woodruff follows:]

 Prepared Statement of Valerie Woodruff, Secretary of Education, State 
                              of Delaware

    Chairman Miller, Ranking Member McKeon and members of the 
committee, thank you for this opportunity to testify today about the 
implementation of growth models in accountability systems. My name is 
Valerie Woodruff. I am the Secretary of Education in the state of 
Delaware. I am the immediate Past President of the Council of Chief 
State School Officers.
    I am proud to say that Delaware was among several states that had 
implemented a school and district accountability system to measure our 
progress in standards based reform prior to the passage of No Child 
Left Behind. We began assessing English language arts and mathematics 
in 1998. Based on the early information about the goals of NCLB, we 
applauded the initial work of Congress and believed that we could 
easily meet the requirements of the law. Our original accountability 
system included three measures of student performance: status, growth, 
and improvement of the lowest performing students. To our schools and 
to our community, these measures made sense and had what I refer to as 
``face validity.'' Simply stated, educators and others understood the 
value of measuring not only the change in performance of one cohort of 
students to another but also the change in performance of the same 
cohort of students over time. And certainly, they saw the value of 
attending to and measuring the improvement of our lowest performing 
students and of closing the achievement gap.
    Delaware was the tenth state to receive approval of our 
accountability plan in the spring of 2003. Also, we were among the 
first states to receive full approval of our standards and assessments. 
Delaware implemented a unique student identifier in 1984 and has worked 
diligently and deliberately since that time to link student demographic 
data with achievement data. Given all these factors, we were anxious to 
engage the Department of Education and to convince them that the use of 
growth models was a natural progression in creating a mature 
accountability system.
    When the Department allowed states to submit growth model proposals 
for the 2006 accountability measurement, we felt confident that our 
proposal would be approved. That did not occur, and we were perplexed 
at the feedback we received. It did not seem that the peer reviewers 
had clear guidance about the criteria, nor did they understand the 
different models that can be used to measure growth. We were required 
to make several changes in order to receive approval for the 2007 
accountability year.
    The model that we chose supports our philosophy of continuous 
improvement for all students. It is easy to explain and understand. It 
provides schools with information that shows which students are making 
progress toward proficiency, which students are maintaining 
proficiency, and which students are slipping backwards. It is not 
enough to measure the average performance of even a small cohort of 
students. Systems must focus on the performance of individual students 
and must provide schools with the appropriate incentives to address 
student needs.
    Moving forward, the law should not only allow but also encourage 
the use of a variety of accountability models. These models should be 
focused on individual student achievement and build on adequate yearly 
progress (AYP) to promote more valid, reliable, and educationally 
meaningful accountability determinations. States must be encouraged to 
innovate and to seek new and better ways of supporting continuous 
student achievement.
    Specifically, the Department of Education must establish clear and 
consistent policies and procedures that enable states to use growth 
models for accountability. It should articulate the foundation elements 
that a state needs to have in order to qualify to use a growth model. 
For example, a state must have a unique student identifier; approved 
standards and assessment systems; a data system that is able to collect 
and track individual student performance over time. When states have 
those elements in place, they should not have to guess at how their 
proposals will be judged.
    The Department should clearly define what criteria must be 
contained in a growth model proposal, and they must select and train 
the peer reviewers so that states can be guaranteed fair and equitable 
reviews of all proposals regardless of the background or philosophical 
beliefs of the reviewers. The peer review process must be fully 
transparent and iterative and be focused on improving the quality of 
accountability systems, not limiting their scope and use.
    In order for states to pursue stronger, more robust systems of 
accountability, a partnership of support and technical assistance must 
be in place. States need ongoing technical assistance in order to build 
a strong knowledge base about accountability models. They need to 
benefit from research about which models are most effective and why. 
They need continuing support in development and improvement of data 
systems. For instance, as strong as Delaware's data system is today, we 
can benefit from knowledge of cutting edge technology. All states are 
eager to learn more and to improve the quality of education for our 
children.
    I appreciate the opportunity to address the committee today. Thank 
you for your leadership. I will be glad to respond to your questions.
                                 ______
                                 
    Chairman Miller. Dr. Dougherty?

   STATEMENT OF CHRYS DOUGHERTY, PH.D, DIRECTOR OF RESEARCH, 
         NATIONAL CENTER FOR EDUCATIONAL ACCOUNTABILITY

    Mr. Dougherty. I would like to thank the first two 
presenters for making a lot of my points for me.
    First, I would agree with Dr. Olson that it is very 
important to look at growth across the entire achievement 
spectrum, that it is valuable both for accountability, and it 
is valuable from the point of view of school improvement.
    My organization, the National Center for Educational 
Accountability, identifies and studies consistently higher-
performing schools to see what they do compared to the average 
performing schools. And looking at student growth is a critical 
part of this process.
    And I would like to thank Dr. Woodruff for emphasizing the 
importance of longitudinal student data systems at the state 
level to be able to do these types of models. Our organization 
has been working very closely as lead partner on the data 
quality campaign to essentially encourage all states to develop 
longitudinal student data systems. We have got a packet that 
should be in your hands that describes a lot of the information 
about which states have made progress in that area.
    Twenty-seven states so far, according to our survey, 
actually have the critical, three critical data elements in 
place in order to do, as of next year, a growth model based on 
longitudinal student data. Now, that doesn't mean that they 
have every component in place. Dr. Woodruff mentioned 
assessment system requirements and so forth. But it does mean 
that from the point of view of building a statewide 
longitudinal data system they are definitely on track.
    And I would like to compliment the Congress for essentially 
funding longitudinal data grants, which has helped to 
accelerate this process of states developing longitudinal 
student data systems. If you had done the same list 3 or 4 
years ago, you would have had fewer than 10 states with the 
capability longitudinally of doing any kind of growth model. 
Now it is up to 27. It is very likely it will be over 40 in 
another 3 years. So that has been very helpful.
    I just want to mention that the way growth is handled now 
as part of AYP and these growth models--and this is reiterating 
some of the things that have been said--you have got status, 
which is are enough kids proficient today. You have safe 
harbor. Are you reducing the percent of kids that are not 
proficient? And growth is the third.
    If kids are way below proficient, are you growing them on a 
path or a track to proficiency. And Dr. Doran is expert in a 
lot of the different methods you can use to say what do you 
mean by on track to proficiency, how do you measure that. It 
looks like I have a minute left, so I am going to mention a 
couple of the different ways.
    The system Delaware uses is essentially it takes students 
and it puts them in achievement bands, level one, two, three, 
four, five. Or California would do far below basic, below 
basic, basic, proficient, advanced. And basically you monitor 
the progress of students over the bands. You essentially, as it 
were, deduct points for kids falling back. You give more points 
for kids moving forward.
    And everybody can understand that. It is very simple. That 
is called a value table approach. That is one approach.
    Another approach is just to draw a trajectory or line 
between where the kid is now in proficiency. If he is below 
proficient, it could be a curved line. It could be a straight 
line. And if you next year are on or above the line, then you 
are meeting the growth requirement for being on track to 
proficiency.
    And the third approach, which Dr. Doran's organization 
specializes in, is using statistical models to project or 
predict whether or not a student will be proficient based on 
past patterns of students with a certain score in, let's say, 
3rd grade and a certain score in 4th grade. What were the odds 
that that kid would be proficient in 6th grade?
    So that uses, again, longitudinal data, which states need 
to have in order to be able to develop these models and also in 
order to be able to validate these models to see the extent to 
which students who were predicted to be proficient actually get 
there. And that is very critical, the validation part of these 
growth models.
    I finally want to mention that as we move toward putting 
attention also on kids who are proficient, not only not 
slipping back below proficiency, but also growing to levels 
above proficiency, I don't know if the AYP system is the right 
place to handle that because of the issue of you don't want to 
offset kids not growing at the bottom end with kids growing at 
the top end. You don't want to use one to offset the other.
    But rather, you want to look at both issues separately and 
maybe make the growing of the kids at the top end be part of a 
recognition system. And maybe that is the way to handle it and 
not through the AYP system.
    Thank you very much. I would be happy to answer questions 
afterwards.
    [The statement of Mr. Dougherty follows:]

  Prepared Statement of Chrys Dougherty, Ph.D, Director of Research, 
             National Center for Educational Accountability

    Mr. Chairman and members of the Committee, I thank you for the 
opportunity to testify about the use of good educational data, over 
time, to measure the growth of student achievement. I am Chrys 
Dougherty, Director of Research at the National Center for Educational 
Accountability (NCEA), national sponsor of Just for the Kids.
    The Center is one of 14 national organizations that are managing 
partners of the Data Quality Campaign. This campaign is a national, 
collaborative effort to encourage and support state policymakers to: 1) 
improve the collection, availability, and use of high-quality education 
data, and 2) implement state longitudinal data systems to improve 
student achievement. I will refer in my testimony to the Ten Essential 
Elements of a statewide longitudinal data system identified by NCEA and 
the Data Quality Campaign (attached), and to information from NCEA's 
Survey on State Data Collection which identifies where states are 
currently in implementing high-quality data systems capable of 
answering questions critical to improving schools and school systems (a 
selected list of these questions is also attached).
    I have also been privileged to serve on a panel for the U.S. 
Department of Education's Institute of Education Sciences to review 
state applications for the state longitudinal data system grant program 
authorized under title II of the Education Sciences Reform Act of 2002, 
and currently serve on a panel for the U.S. Department of Education to 
review state applications to implement growth models for NCLB.
An Overview of Growth Models
    ``Growth models'' can be defined as any analysis or measurement of 
the progress of individual students over time. The growth models of 
interest here ask the question: Is the student growing fast enough to 
be ``on track'' to reach the desired goal in the desired length of 
time? For example, is the student progressing well enough to be ready 
to handle rigorous high school coursework by the time he or she enters 
high school?
    Growth models of this type should be distinguished from 
conventional ``value-added'' models, which ask the question, ``Is the 
student growing faster than would be predicted by his or her 
characteristics?'' Typically these characteristics include the 
student's prior test scores. However, students could be growing faster 
than predicted for typical students like themselves, and yet not fast 
enough to reach proficiency in the desired length of time--or ever.
    Annual testing in grades 3-8 has been crucial for the development 
of growth models. These models are based on following students year 
after year and looking at individual growth every year, rather than 
waiting several years to find out whether the student has 
progressed.\1\
---------------------------------------------------------------------------
    \1\ The ability to look at student growth was a major motivator for 
the early adoption of grades 3-8 testing in states such as Tennessee, 
Texas and North Carolina. Annual testing data was critical for Texas's 
Comparable Improvement growth model, North Carolina's growth model, and 
Tennessee's value-added model.
---------------------------------------------------------------------------
    Since the desired goal under the No Child Left Behind Act (NCLB) is 
proficiency, the first question that NCLB growth models address is 
whether non-proficient students are growing fast enough to reach 
proficiency in the near future--usually in the next three years.
    A second question that NCLB growth models sometimes address is 
whether already proficient students are growing fast enough to stay 
proficient.
    A third question that these models should address is whether 
already proficient students are growing to levels higher than 
proficiency. NCLB as currently written does not encourage states and 
school districts to address this question.\2\ This question is 
especially important in states where the proficiency standard is below 
that required to prepare students for college and other postsecondary 
training for skilled careers.
---------------------------------------------------------------------------
    \2\ The exception to this is NCLB's authorization of funding for 
Advanced Placement incentive programs.
---------------------------------------------------------------------------
    We would like to encourage school systems to focus on whether 
students, particularly disadvantaged students, are growing toward 
readiness for college and skilled careers after high school. Goals and 
standards that states set for accountability--ones to which sanctions 
are attached--are likely to be lower than those which school systems 
should adopt for purposes of goal-setting, curriculum design, and long-
term planning.\3\
---------------------------------------------------------------------------
    \3\ For a discussion of why accountability standards are often not 
set high enough to be worthy goals for long-range planning, see 
``Identifying Appropriate College Readiness Standards for All 
Students,'' www.just4kids.org/en/research--policy/college--career--
readiness.
---------------------------------------------------------------------------
    Therefore, an incentive for growth to higher levels is probably 
best accomplished not through the Adequate Yearly Progress (AYP) 
system, but rather by encouraging the creation of voluntary programs 
for identifying and publicly recognizing schools that are successful at 
placing students, particularly disadvantaged students, on a trajectory 
to these higher standards. Identifying these schools and examining 
their best practices should be the topic of ongoing research and 
dissemination.\4\
---------------------------------------------------------------------------
    \4\ See www.just4kids.org for examples of efforts to identify and 
recognize higher performing schools and to research and disseminate 
their practices.
---------------------------------------------------------------------------
Data That Is Necessary to Measure Student Academic Growth
    The ability to follow individual students over time, as necessary 
for growth models, requires a longitudinal data system. Specifically, 
to create growth models, states need at least the following three 
elements from the list of Ten Essential Elements identified by the Data 
Quality Campaign (www.dataqualitycampaign.org):
     Element One: A statewide student identifier making it 
possible to follow the same students over time
     Element Three: The ability to link students' test score 
records over time
     Element Four: Information on untested students and the 
reasons why they were not tested.
The Status of State Data Systems Capable of Measuring Growth
    According to the 2006 NCEA Survey on State Longitudinal Data 
Systems, 27 states will have the capability of doing a growth model as 
of the 2007-08 school year, based on their possession of these three 
elements for at least two years. These states, listed on the Data 
Quality Campaign's website at www.dataqualitycampaign.org/survey--
results/policy.cfm, are:
  
              Alaska            Massachusetts             Rhode Island
                    Colorado        Minnesota                Tennessee
                    Connecticut      Nebraska                    Texas
            Delaware                   Nevada                     Utah
             Florida               New Mexico                  Vermont
              Hawaii                 New York                 Virginia
              Kansas             North Dakota               Washington
            Kentucky                     Ohio            West Virginia
           Louisiana             Pennsylvania                Wisconsin
    The Statewide Longitudinal Data System grants have helped many 
states develop and improve their longitudinal student data systems. 
These competitive grants from the U.S. Department of Education's 
Institute of Education Sciences have not only increased the ability of 
states to do growth models, but also their capacity to provide 
information to teachers and principals on the academic growth of their 
students.
    Better information is a critical tool for school improvement.
    Thank you, Mr. Chairman. I'd be happy to answer any questions you 
may have.
Essential Elements and Fundamentals of a Longitudinal Data System
    While each state's education system is unique, it is clear that 
there is a set of 10 essential elements that are critical to a 
longitudinal data system:
    1. A unique statewide student identifier that connects student data 
across key databases across years
    2. Student-level enrollment, demographic and program participation 
information
    3. The ability to match individual students' test records from year 
to year to measure academic growth
    4. Information on untested students and the reasons they were not 
tested
    5. A teacher identifier system with the ability to match teachers 
to students
    6. Student-level transcript information, including information on 
courses completed and grades earned
    7. Student-level college readiness test scores
    8. Student-level graduation and dropout data
    9. The ability to match student records between the P--12 and 
higher education systems
    10. A state data audit system assessing data quality, validity and 
reliability
    In addition to the 10 essential elements, states need to ensure 
that they take into account the following fundamental concepts in the 
construction of their longitudinal systems.
    Privacy Protection: One of the critical concepts that should 
underscore the development of any longitudinal data system is 
preserving student privacy. An important distinction needs to be made 
between applying a ``unique student identifier'' and making 
``personally identifiable information'' available, for example. It is 
possible to share data that are unique to individual students but that 
do not allow for the identification of that student. It also is 
critical to put in place encryption and data security protocols to 
secure the transmission or transaction of data between and among 
systems. States should ensure that they bring privacy considerations 
into the development of each repository and the exploration of each 
protocol or report.
     Maximizing the Power of Education Data While Ensuring 
Compliance with Federal Student Privacy Laws: A Guide for Policymakers
     State Longitudinal Data Systems and Student Privacy 
Protections Under the Family Educational Rights and Privacy Act
     The Family Educational Rights and Privacy Act (FERPA) and 
State Longitudinal Data Systems
     State Data Systems and Privacy Concerns: Strategies for 
Balancing Public Interest
    Data Architecture: Data architecture defines how data are coded, 
stored, managed and used. Good data architecture is essential for an 
effective data system. Many states are in the process of improving 
their data architecture so that they can clearly communicate with all 
entities with which they share and from which they receive data. 
Districts need to know specifically how data elements are defined 
(e.g., what a ``dropout'' is), how they should be formatted, and how 
and when the data should be transferred to the state education agency. 
Without these standard definitions and dictionaries, state education 
agencies will have an extremely difficult time making sense of the data 
received from their districts. With standards in place that are used by 
everyone, staffing resources and processing or cycle time can be 
greatly reduced, data can be made available to users when they need 
them, and reports can be based on clear and common definitions.
    Data Warehousing: Many states are in the process of designing and 
building or upgrading their data warehouses. Policymakers and educators 
need a data system that not only link student records over time and 
across databases but also make it easy for users to query those 
databases and produce standard or customized reports. A data warehouse 
is, at the least, a repository of data concerning students in the 
public education system; ideally, it also would include information 
about educational facilities and curriculum and staff involved in 
instructional activities, as well as district and school finances. The 
warehouse should ensure student and teacher confidentiality, allow 
longitudinal analyses, and include analytical capabilities for its 
users. Examples of the capabilities that should be available in a data 
warehouse include, but are not limited to, trend analyses; tracking of 
students over time and across campuses and/or districts; queries 
designed and conducted by different users (with different levels of 
access to detailed data, depending on user classification); and 
standard summary reports at the campus, district or state level for 
policymakers and educators. The key to effective data warehousing is 
the timely and efficient use and reporting of data.
    Interoperability: Data interoperability entails the ability of 
different software systems from different vendors to share information 
without the need for customized programming or data manipulation by the 
end user. Interoperability reduces reporting burden, redundancy of data 
collection, and staff time and resources. It allows for better, faster 
and clearer reporting of data. It depends on systems having common data 
standards and definitions. Organizations such as the Schools 
Interoperability Framework Association work to ensure the creation of 
platform-independent, vendor neutral open standards that can be used by 
educators and vendors to design and implement interoperable data 
systems.
    Portability: Data portability is the ability to exchange student 
transcript information electronically across districts and between P-12 
and postsecondary institutions within a state and across states. 
Portability has at least three advantages: it makes valuable diagnostic 
information from the academic records of students who move to a new 
state available to their teachers in a timely manner; it reduces the 
time and cost of transferring students' high school course transcripts; 
and it increases the ability of states to distinguish students who 
transfer to a school in a new state from dropouts. The large interstate 
movement of students in the wake of Hurricane Katrina made the value of 
such a system obvious. Data portability is supported by the 
implementation of interoperable systems, but it requires states that 
use these systems to have a set of common definitions or protocols.
    Professional Development around Data Processes and Use: Building a 
longitudinal data system requires not only the adoption of key elements 
outlined in this paper but also the ongoing professional development of 
the people charged with collecting, storing, analyzing and using the 
data produced through the new data system. The local school person who 
inputs course grades needs to understand fully how his/her work fits 
into the broader data system, the principal needs to understand how 
data can effect daily school management--both facilities and academic 
decisions--and policymakers need to understand how their decisions are 
limited or expanded based on the quality of the data available. For 
these changes in culture and management to occur, states need to make 
it a priority to rethink and possibly reorganize how education data is 
managed throughout the system, increase training and professional 
development for staff--both managers and users--and assist all 
employees and stakeholders of the state education system to be active 
consumers of the longitudinal data system.
    Researcher Access: Research using longitudinal student data can be 
an invaluable guide for improving schools and helping educators learn 
what works. These data are essential to determining the value-added of 
schools, programs and specific interventions. States are developing 
ways to make student-level data available to researchers while 
protecting the privacy of student records under the Family Education 
Rights and Privacy Act. Because state education agencies and local 
school districts usually do not have the resources to conduct this 
research themselves, providing access to the data to outside 
researchers with appropriate privacy protections allows critical 
research to be done at no cost to the state or to school districts.
Policy Implications of State Data Systems in 2006-07
    Does your state collect the most relevant data to inform your 
policy conversations and decisions?
    Policymakers and educators need longitudinal data systems capable 
of providing timely, valid and relevant data. Access to these data 
gives teachers the information they need to tailor instruction to help 
each student improve, gives administrators the resources and 
information to effectively and efficiently manage, and enables 
policymakers to evaluate which policy initiatives show the best 
evidence of increasing student achievement.
    Does your state have the data to answer these timely questions? 
Based on responses to the 2006 NCEA survey, only a few states can 
answer each of these priority questions facing policymakers and 
educators today.
    1. Which schools produce the strongest academic growth for their 
students? (27 states can answer this question; States must have 
Elements 1, 3, 4 to answer this question)
    Alaska, Colorado, Connecticut, Delaware, Florida, Hawaii, Kansas, 
Kentucky, Louisiana, Massachusetts, Minnesota, Nebraska, Nevada, New 
Mexico, New York, North Dakota, Ohio, Pennsylvania, Rhode Island, 
Tennessee, Texas, Utah, Vermont, Virginia, Washington, West Virginia, 
Wisconsin
    2. What achievement levels in middle school indicate that a student 
is on track to succeed in rigorous courses in high school? (5 states 
can answer this question; States must have Elements 1, 3, 6, 7 to 
answer this question)
    Arkansas, Florida, Georgia, Texas, Utah
    3. What is each school's graduation rate, according to the 2005 
National Governors Association graduation compact? (28 states can 
answer this question; States must have Elements 1, 2, 8, 10 to answer 
this question)
    Alabama, Alaska, Arizona, Arkansas, Colorado, Connecticut, 
Delaware, Florida, Iowa, Kansas, Louisiana, Massachusetts, Minnesota, 
Nevada, New Hampshire, New Mexico, North Dakota, Ohio, Oregon, South 
Dakota, Texas, Utah, Vermont, Virginia, Washington, West Virginia, 
Wisconsin, Wyoming
    4. What high school performance indicators (e.g., enrollment in 
rigorous courses or performance on state tests) are the best predictors 
of students' success in college or the workplace? (4 states can answer 
this question; States must have Elements 1, 3, 6, 7, 8, 9 to answer 
this question)
    Arkansas, Florida, Georgia, Texas
    5. What percentage of high school graduates who go on to college 
take remedial courses? (14 states can answer this question; States must 
have Elements 1, 8, 9 to answer this question)
    Alabama, Alaska, Arkansas, Florida, Georgia, Hawaii, Louisiana, 
Massachusetts, North Dakota, Oregon, Texas, Vermont, Washington, 
Wyoming
    6. Which teacher preparation programs produce the graduates whose 
students have the strongest academic growth? (10 states can answer this 
question; States must have Elements 1, 3, 4, 5 to answer this question)
    Delaware, Florida, Hawaii, Kentucky, Louisiana, New Mexico, Ohio, 
Tennessee, Utah, West Virginia

[GRAPHIC(S) NOT AVAILABLE IN TIFF FORMAT]

                                 ______
                                 
    Chairman Miller. Mr. McWalters?

 STATEMENT OF PETER MCWALTERS, COMMISSIONER OF ELEMENTARY AND 
           SECONDARY EDUCATION, STATE OF RHODE ISLAND

    Mr. McWalters. Chairman Miller, Ranking Member McKeon, 
thank you for this opportunity. My name is Peter McWalters. I 
am the commissioner from Rhode Island. I have been there for 15 
years. And before that, I was a superintendent of schools in 
Rochester. I am clearly an urban educator.
    I am pleased to be able to talk to you today as you 
consider reauthorizing No Child Left Behind. I was the 
president of CCSSO in 2000, 2001 when we authorized this. And I 
supported it then. I support it now.
    It represents the very best form of federal intent. It 
essentially is the Civil Rights bill. It is part of a 
children's bill of rights. And it pushed states to focus on 
success for every student.
    The emphasis of standards and assessments and 
accountability on public information was needed then as it is 
now. And it has been beneficial for the nation. Now 5 years 
down the road, I think we can see some areas in which the law 
could and should be modified to help achieve the goals we all 
share.
    As CCSSO has said in its recent recommendations regarding 
No Child Left Behind reauthorization, we are in a new stage of 
standards-based reform. Many of the basic foundational pieces 
are in place. The question now is, how do we build on the use 
of these foundations to improve student achievement and close 
the gaps?
    I would submit to you that this will require innovation, 
change beyond currently understood, capacity building and 
retooling of systems, and quite honestly, judgments that are 
based on leadership, content, capacity, and the context of 
districts and schools. We need a federal law that values these 
things.
    As you prepare to reauthorize No Child Left Behind, I ask 
you to consider three issues: how states determine whether 
schools have met their targets, how we publicly identify 
schools that have missed their targets, and how states can best 
deliver assistance and implement consequences to help districts 
as well as schools meet these goals.
    As you know, schools may be identified for improvement if 
they miss any single one multiple target established in the 
law. And these targets are almost exclusively based in the 
tests that states administer at seven grade levels.
    We are not afraid to use student performance as the 
ultimate measure of school improvement. Our testing system in 
Rhode Island developed with the support of the federal funds is 
a tri-state partnership under which Rhode Island, New 
Hampshire, and Vermont established a common set of grade-level 
expectations and standards and developed an assessment system 
lined up with those standards. This partnership, known as the 
New England Common Assessment Program, is exactly the type of 
initiative that the federal government should continue to 
support.
    In addition to our state assessment system, we believe in 
Rhode Island in a number of means by which we can and do 
measure school performance. We administer attitudinal surveys 
to students, parents, teachers, and administrators. We visit 
schools on a structured visitation. We publish the results of 
both of those--they are all online.
    Parents get access to all this information. We measure 
school climate as in safety. We measure student connectedness. 
Does anybody here know me? Is anybody listening to me?
    We measure instructional leadership. We measure 
instructional practice, teacher competencies, as in, do they 
even know what the standards are. We track all of this stuff as 
well as parent involvement. We conduct peer review visits at 
every school.
    Every school is required by law to write an annual school 
improvement plan and submit district plans to us. And if you 
are in intervention, we get to not only review them, but we 
approve them. We have a very aggressive statute of progressive 
support and intervention.
    Test results should be the initial measure of districts and 
schools. But the law should allow states to employ indicators 
in addition to student performance to determine whether schools 
and districts are making adequate yearly progress.
    These indicators should include measures of capacity such 
as school climate, teacher expectations, leadership, 
instructional leadership, teacher development, program 
implementation fidelity, and parent engagement. These 
indicators should be supplemental to assessment results, but 
they should be allowed to be part of an overall determination 
of school as well district progress.
    As you know, NCLB is quite prescriptive in regards to 
identifying schools and districts that have missed annual 
targets. Under the terms of the law, all schools that missed 
even one target are placed in the same status: identified for 
improvement. This label tells us only that the school has not 
met the target. It does not tell us why.
    I have seen the school fail in 1 year go from high-
performing to insufficient progress because it missed a single 
target. And we find this hard to explain in terms of the public 
policy or what Valerie called face validity.
    I believe that the law should establish a graduated system 
of classifications for schools and districts that have 
identified for improvement. The identification of schools and 
districts should include information as to how many targets 
were missed as well as over how many years. The identification 
of schools and districts should also indicate the capacity of 
the school or district to meet these targets as determined by 
indicators other than test results.
    Finally, I ask you to consider how states develop support 
systems and intervention strategies for schools and districts 
that have been identified for improvement. We don't need an 
intervention system that is based on a score card. We need a 
system that will give us multiple ways to measure all the 
components of the viability of a school in a district and to 
offer scaffolded responses based on the needs of schools and 
districts.
    The system as it stands is not designed to give schools a 
blueprint for success. It is a retributive system. We will not 
shrink from our responsibility of raising achievement and 
closing the gap. But we need the law to value our experience 
and leverage the expertise and give us more options over 
schools that are identified for improvement.
    Not all schools admit their targets are in the same place. 
Some may be truly dysfunctional institutions in need of a great 
deal of help, even restructuring. Others may be on task and the 
path toward success. How do states know if this is the case? 
Only through multiple measures.
    Indicators of measures of leadership, instructional 
leadership capacity, school climate, community involvement, and 
program integrity. Only through this can we determine the 
course, the appropriate course of action to take.
    Now that we are 5 years into the implementation of this 
law, it is obvious that many schools that have missed their 
annual targets are doing all they can within failing systems. 
That is, school improvement is often a matter of district 
capacity. In these instances, the intervention at school level 
will do nothing to solve the underlying systemic problem.
    When a state intervenes in a school that has missed 
targets, the state must have on-hand the complete picture of 
the school and district capacity. The law should not prescribe 
our responses. It should give us the authority to use our best 
professional judgment to build school improvements.
    The Rhode Island approach has been entered into district 
negotiated agreements that we write, negotiate, and finally 
approve on a program, budget, and personnel basis. That is 
pretty powerful. This is part of our process of progressive 
support and intervention.
    We are ready to do the work. To do that, we need an NCLB 
that is more than just a score card based on student 
performance and a list of mandated responses. We need 
indicators to measure all components of the health and capacity 
of the system.
    Chairman Miller. Mr. McWalters, I am going to ask you to 
wrap up.
    Mr. McWalters. Very good.
    The last piece that I would say is when you passed this 
authorization, there was a sense of impatience on your part, 
which was well-deserved at that time. I think 5 years in the 
credibility of individual states' capacity is now known and can 
be reviewed in a peer review system.
    [The statement of Mr. McWalters follows:]

 Prepared Statement of Peter McWalters, Commissioner of Elementary and 
               Secondary Education, State of Rhode Island

    Chairman Miller, Ranking Member McKeon, and members of the 
Committee, thank you for the opportunity to testify today on improving 
the ways we measure student progress. My name is Peter McWalters, and I 
am the Commissioner of Elementary and Secondary Education in the State 
of Rhode Island, where I have served for 15 years. I am also a past-
president of the Council of Chief State School Officers and a former 
Superintendent of Schools in an urban district, Rochester, New York.
    I am pleased to be able to talk with you today as you consider 
reauthorization of the No Child Left Behind Act. I supported the law in 
its passage. It represents the best form of federal intent and has 
pushed the states to focus on success for every student. The emphasis 
on standards and assessments and on public information was needed at 
the time, and it has been beneficial to the nation. But now, five years 
down the road, I think we can see some areas in which the law could and 
should be modified to help us achieve the goals that we all share.
    As CCSSO has said in its recent recommendations regarding NCLB 
reauthorization, we are in a new stage of standards-based reform. Many 
of the basic foundations are in place. The question now is: How do we 
build on and use these foundations to improve student achievement and 
close achievement gaps? I would submit to you that this will require 
innovation, capacity, and judgments that are based on district capacity 
to respond to specific conditions that have led to low student 
achievement. We need a federal law that values those things.
    As you prepare to reauthorize NCLB, I ask you reconsider three 
issues:
     how states determine whether schools have met their 
targets,
     how we publicly identify schools that have missed their 
targets, and
     how states can best deliver assistance and implement 
consequences to help schools meet their goals.
    As you know, schools may be identified for improvement if they miss 
any single one of the multiple targets established in the law. And 
these targets are almost exclusively based on the tests that states 
administer at seven grade levels.
    We are not afraid to use student performance as the ultimate 
measure of school improvement. Our testing system in Rhode Island, 
developed with the support of federal funds, is a tristate partnership, 
under which Rhode Island, New Hampshire, and Vermont established in 
common a set of grade-level standards and expectations and developed an 
assessment system lined up with those standards. This partnership, 
known as the New England Common Assessment Program, is exactly the type 
of initiative that the Federal government should continue to support.
    In addition to our state assessment system, we have in Rhode Island 
a number of means by which we can--and do--measure school performance. 
We administer an annual survey to all students, teachers, and parents, 
and from the results of this SALT Survey we tabulate ``Learning Support 
Indicators'' that measure school climate, instructional practices, and 
parental involvement. We conduct peer-review visits at every school in 
the state every five years. Each school is required by law to write an 
annual School Improvement Plan, and each district writes an annual 
District Strategic Plan, and these plans are at the center of our work 
with all schools and districts.
    Test results should be the initial measure of the school. But the 
law should allow states to employ indicators in addition to student 
performance to determine whether schools and districts are making 
Adequate Yearly Progress. These indicators could include measures of 
capacity such as evaluations of school climate, instructional 
practices, instructional leadership, teacher development, program 
implementation, and parental engagement. These indicators should be 
supplementary to assessment results, but they should be allowed as part 
of the overall determination of school and district progress.
    As you know, the NCLB is quite prescriptive in regard to 
identifying schools and districts that have missed annual targets. 
Under the terms of the law, all schools that miss even one target are 
placed in the same status: Identified for Improvement. This label tells 
us only that the school has failed; it does not tell us why. I have 
seen a school fall in one year from high performing to insufficient 
progress because it missed a single target, and we find this hard to 
explain to the school and to the public at large.
    I believe that the law should establish a graduated system of 
classifications for schools and districts that have been identified for 
improvement. The identification of schools and districts should include 
information as to how many targets were missed as well as for how many 
years. The identification of schools and districts should also indicate 
the capacity of the school or district to meet all targets, as 
determined by indicators other than test results.
    Finally, I ask you to reconsider how states develop support systems 
and intervention strategies for schools and districts that have been 
identified for improvement. We don't need an intervention system that 
is based on a scorecard. We need a system that will give us multiple 
ways to measure all components of the health and the capacity of 
schools and districts and to offer scaffolded responses based on the 
needs of the school or district. The system as it stands is not 
designed to give schools a blueprint for success. It is a retributive 
system.
    We will not shirk our responsibility for raising achievement and 
closing the achievement gap. But we need the law to value our 
experience and expertise and give us more options once schools are 
identified for improvement. Not all schools that miss their targets are 
in the same condition. Some may be truly dysfunctional institutions in 
need of a great deal of help--even restructuring. Others may be on task 
and on the path toward success. How do states know if this is the case? 
Only through multiple measures--indicators to measure leadership, 
instructional capacity, school climate, community involvement--can we 
determine what course to take to help schools meet their goals.
    Now that we are five years into implementation of the law, it is 
obvious that many schools that have missed their annual targets are 
doing all that they can do within a failing system. That is, school 
improvement is often a matter of district capacity. In these cases, 
state intervention at the school level will do nothing to solve the 
underlying systemic problems.
    When a state intervenes in a school that has missed targets, the 
state must have on hand a complete picture of the school and district 
capacities. The law should not prescribe our responses. It should give 
us the authority to use our best professional judgment to build school 
improvement. The Rhode Island approach has been to enter into District 
Negotiated Agreements on program, budget, and personnel with those 
districts that have missed their annual targets. This is part of our 
process of Progressive Support & Intervention, which is based on 
multiple indicators that present information for broader and deeper 
than assessment results.
    We are ready to do the work. To do that, we need from NCLB more 
than just a scorecard based on student performance and a list of 
mandated responses. We need indicators to measure all components of the 
health and capacity of the system. We need intervention strategies that 
help us build the capacity in each identified school and district. And 
we need the freedom and capacity to do our work, while always keeping 
the goals clear and the actions and outcomes transparent so as to 
improve the public-education system.
    I ask, therefore, that you consider revising the prescribed 
sequence of mandated responses to Title I schools that have been 
identified for improvement so that states can develop graduated support 
and intervention strategies that best meet the needs of each identified 
school.
    I have asked you today for a good deal of accountability at the 
state level, for I believe that the states have the ability to take on 
this challenge. When Congress passed and the President authorized the 
NCLB, there was a general sense of impatience with progress that the 
states had made. The law is therefore both comprehensive and 
prescriptive in regard to state responsibilities. The states have taken 
on these responsibilities in a serious and committed manner, and I 
therefore believe we are ready to move to a new level of shared 
understanding. States should be able to submit their annual compliance 
plans, which the Education Department would verify and accept after 
good-faith peer review
    The CCSSO recommendations for NCLB reauthorization include several 
items that support the points I have brought to you today, including 
calling on Congress to allow states to include additional relevant data 
in making judgments about school progress, allowing states to 
differentiate consequences for schools that have missed their annual 
targets, investing more in state capacity to assist and intervene in 
districts and schools that have missed their targets, and creating a 
new process for innovative models and a greatly revised system of peer 
review that would allow states to continuously innovate in 
accountability and other areas--with proper guarantees for results.
    Thank you for your attention and leadership on these important 
issues. I have with me several supportive documents regarding the 
accountability system in Rhode Island that I would like to present to 
you for your records, and I look forward to any questions you may have.
                                 ______
                                 
    Chairman Miller. Thank you.
    Dr. Doran?

   STATEMENT OF HAROLD C. DORAN, SENIOR RESEARCH SCIENTIST, 
                AMERICAN INSTITUTES FOR RESEARCH

    Mr. Doran. Thank you. Chairman Miller, Ranking Member 
McKeon, and honorable members of the committee, thank you for 
this opportunity to share my thoughts on ways to improve the No 
Child Left Behind Act.
    My name is Harold Doran, and I am a senior research 
scientist at the American Institutes for Research in 
Washington, D.C. In this role, I help states and districts 
across the country develop their testing and accountability 
systems. I am also a former classroom teacher and elementary 
school principal in Tucson, Arizona.
    The question I have been asked to respond to today is 
whether the AYP provisions would benefit from having additional 
ways to evaluate schools, what some refer to as multiple 
measures, and whether these measures can be joined to form a 
compensatory accountability system. The term ``compensatory'' 
denotes that not meeting AYP under one measure could be 
compensated for using a secondary measure. I believe the 
provisions could be strengthened if multiple measures were 
added.
    In my discussion today, I would like to explain this 
position and suggest specific measures that I believe would 
strengthen the legislation. I emphatically support the use of 
multiple measures, as do most educational experts. However, 
there are multiple views on what set of measures to be included 
in accountability systems. Even more challenging is how these 
measures can be combined in forming a compensatory 
accountability system.
    To reduce ambiguity, I would offer the following definition 
of multiple measures for today's conversation: an 
accountability system that includes multiple measures uses test 
scores from more than a single test, achievement indicators 
collected by other means, or various statistical methods for 
evaluating the data. By this definition, NCLB already uses 
multiple measures.
    But the law does not permit for one to compensate for 
another measure. I believe the integrity of the law would be 
enhanced if it were modified to accommodate the following: 
multiple measures; and allow states to use those measures to 
create rigorous compensatory systems.
    First, any consideration of new measures, however, must 
first be met with a discussion of criteria to avoid watering 
down any of the current systems. One, including new indicators 
should result only in added rigor to core content areas. Two, 
incorporating multiple measures should not result in systems 
that are too complex so that they are difficult to implement or 
confusing to parents and educators.
    I have four specific recommendations. Two of these 
recommendations would add measures that could serve in a 
compensatory role. One recommendation adds to AYP. And the last 
is a recommendation to ensure system integrity.
    NCLB currently monitors the proficiency rates of high 
school students in language arts, reading and math. When 
students do not reach levels of proficiency on the statewide 
regular tests, their only option is to retake the same test a 
year later.
    However, an alternative that could be used is to provide 
students with an opportunity to enroll in targeted coursework 
that targets their specific area of need and allow for them to 
pass an end-of-course examine that allows for them to 
demonstrate mastery of the content.
    For instance, a student may not reach proficiency on the 
statewide test because it were known that he struggled with 
concepts in geometry. Subsequently, the student could enroll in 
a geometry course, demonstrate proficiency via a new state-
developed end-of-course exam that is equally as rigorous as the 
statewide NCLB test.
    Learning is fundamentally about change. However, the 
methods by which AYP are currently calculated do not follow 
this logic and, in many ways, are actually biased. The current 
reality is that the mathematical model used to measure 
proficiency rates must be improved.
    For example, a school with many students scoring in the 
highest performance category can have a drop in students' 
academic performance that still remains above proficiency and 
still be classified as a school making AYP. In contrast, a 
school with many students beginning well below proficiency and 
learning at remarkable rates, is likely not to be recognized as 
a high-performing school.
    It is my recommendation that AYP calculations include 
results obtained from growth models as another method for 
evaluating schools. NCLB currently requires students to 
participate in science assessments beginning in 2008. However, 
the results of those assessments will not be included currently 
in AYP calculations. It is my recommendation that they should 
be. It is also possible to develop end-of-course exams in 
science, as previously suggested.
    Last, I would like to offer a suggestion on the use of 
NAEP. It cannot be used to measure AYP, but it can be used to 
inform how state performance standards are set and partly used 
to determine overall system integrity. I would like to 
recommend that this committee support a research agenda that 
would investigate and report how best to establish links 
between NAEP and the various state assessment programs.
    In many respects, the variability in standards and 
difficulty of the assessment programs across states is 
important and reflects idiosyncrasies in the educational 
programs. On the other hand, this variability presents a 
significant challenge, given that we live in a highly mobile 
society.
    It is my view that reauthorized versions of NCLB should 
establish national policy using NAEP to illustrate the 
comparability of proficiency levels across the country. This 
information would be extremely valuable as states build or 
refine their standards and assessment programs. It will also 
provide policymakers with a window to assess system integrity.
    Thank you for your time. I hope these suggestions are 
helpful. And I am grateful to answer any questions that you may 
have.
    [The statement of Mr. Doran follows:]

   Prepared Statement of Harold C. Doran, Senior Research Scientist, 
                    American Institutes for Research

    Chairman Miller, ranking member McKeon, and honorable members of 
the committee, thank you for this opportunity to share my thoughts on 
ways to improve the No Child Left Behind Act. My name is Harold Doran, 
and I am a senior research scientist at the American Institutes for 
Research (AIR) in Washington, DC. In this role, I help states and 
districts across the country develop their testing and accountability 
systems.
    The question I have been asked to respond to is whether the 
adequate yearly progress (AYP) provisions in NCLB would benefit from 
having additional ways to evaluate schools, what some refer to as 
multiple measures, and whether these measures can be joined to form a 
compensatory accountability system. The term compensatory denotes that 
not meeting AYP under one measure could be compensated for using a 
secondary measure.
    I believe the AYP provisions could be strengthened if multiple 
measures were added. In my discussion today, I would like to explain 
this position and suggest specific measures that I believe would 
strengthen the legislation.
Why Multiple Measures?
    I emphatically support the use of multiple measures, as do most 
educational experts. However, there are multiple views on what set of 
measures to include in accountability systems. Even more challenging is 
how these measures can be combined in forming a compensatory 
accountability design. To reduce ambiguity, I would offer the following 
definition of multiple measures for today's conversation: An 
accountability system that includes multiple measures uses test scores 
from more than a single test, achievement indicators collected by other 
means, or various statistical methods for evaluating the data.
    By this definition, NCLB already relies on multiple measures. But 
the law does not permit one measure to compensate for another measure. 
I believe the integrity and strength of the law would be enhanced if it 
were modified to accommodate the following:
    1. Permit for multiple measures; and
    2. Allow states to use those measures to create rigorous 
compensatory systems.
    Any consideration of new measures, however, must first be met with 
a discussion of criteria to avoid watering down our current systems:
    1. Increased Rigor. Including new indicators should result only in 
added rigor to core content areas.
    2. Simplicity and Transparency. Incorporating multiple measures 
should not result in complex systems that are difficult to implement or 
that are confusing to parents and educators. The elegance of 
simplicity, combined with a focus on rigor, will guard against over-
engineering accountability designs.
Specific Recommendations for Multiple Measures
    I have four specific recommendations. Two of these recommendations 
would add measures that could serve in a compensatory role, one 
recommendation adds to AYP, and the last is a recommendation to ensure 
system integrity.
            End-of-Course Exams
    NCLB currently monitors the proficiency rates of high-school 
students in language arts/reading and math. When students do not reach 
levels of proficiency on the statewide regular tests, their only option 
in many cases is to retake the same test. However, an alternative that 
could be used is to provide students with an opportunity to enroll in 
coursework that targets their specific areas of need and allow for them 
to pass an end-of-course test that demonstrates mastery of the content.
    For instance, a student may not reach proficiency on the statewide 
NCLB test only because he struggles with concepts in geometry. 
Subsequently, the student could enroll in a geometry course and, at the 
end of this course, demonstrate proficiency via a state-developed end-
of-course exam in geometry that is equally as rigorous as the statewide 
NCLB test.
            Growth Models
    Learning is fundamentally about change. However, the methods by 
which AYP are currently calculated do not follow this logic and are, in 
many ways, biased.
    The current reality is that the mathematical model used to measure 
proficiency rates must be improved. For example, a school with many 
students scoring in the highest performance category can have a drop in 
students' academic performance that still remains above the proficiency 
bar and still be classified as making AYP. In contrast, a school with 
many students beginning well below proficiency, but learning at 
remarkable rates, is likely not to be recognized as a high-performing 
school.
    It is my recommendation that AYP calculations include results 
obtained from growth models as another method for evaluating schools. 
The results from these models can be used in a manner similar to the 
safe-harbor provisions as another way to make AYP. If permitted, the 
models must conform to the same high expectations for proficiency as 
currently required and not simply reward growth.
            Incorporate Science Results into AYP
    The 2001 NCLB requires students to participate in science 
assessments beginning in 2008. However, the results of those science 
assessments are not included in the current AYP calculations.
    Including science in AYP calculations will encourage schools to 
emphasize science as a component of their core curricula. It will also 
be possible to develop end-of-course exams in science as previously 
suggested.
            National Assessment of Education Progress (NAEP) Research 
                    for Comparability
    Last, I would like to offer a suggestion on the use of NAEP--it 
cannot be used to measure AYP, but it can be used to inform how state 
performance standards are set and partly used to determine overall 
system integrity. I would like to recommend that this committee support 
a research agenda that would investigate and report how best to 
establish links between NAEP and the various state assessment programs 
across the country.
    In many respects, the variability in content standards and 
difficulty of the assessments across states is important and reflects 
critical idiosyncrasies in the educational programs. On the other hand, 
this variability presents a significant challenge given that we live in 
a highly mobile society. For example, a student attaining mathematical 
proficiency in Arizona may attend college and/or obtain professional 
work outside of that state.
    Hence, my view is that reauthorized versions of NCLB should 
establish national policy using NAEP to illustrate the comparability of 
proficiency levels across the country. This information would be 
extremely valuable as states build and/or refine their standards and 
assessment programs. It will also provide policymakers with a window to 
assess system integrity.
    Should the committee accept the notion that additional indicators 
are necessary to establish more robust systems, I would then encourage 
the committee to further consider how these multiple indicators can be 
combined to form a judgment about school quality that still aligns with 
the basic tenets of proficiency set forth in the legislation.
    I hope these suggestions are helpful as this committee moves 
forward with deliberations related to NCLB improvements. I am grateful 
for the opportunity to testify today and am happy to answer any 
questions you may have.
                                 ______
                                 
    Chairman Miller. Thank you all very much for your time.
    Dr. Doran, you say on the bottom of the first page of your 
statement that by definition NCLB already relies on multiple 
measures, but the law does not permit one measure to compensate 
for another measure.
    And, Commissioner McWalters, you said in your statement 
that these indicators should be supplementary to assessment 
results, but they should be allowed to be as part of an overall 
determination of school and district progress. Are those two 
things consistent?
    Mr. Doran. Maybe I can clarify exactly what I mean and 
explore for just a moment. Currently there is a limited set of 
multiple measures that are permitted. In reality, a student has 
a single opportunity to demonstrate proficiency on the test. We 
know that tests when designed well can provide very useful and 
good information. But the reality is some kids have mastered 
the content, but for one reason or another, didn't have an 
opportunity to demonstrate their proficiency on the test when 
it was given on that day.
    And I think what I am saying is this. You talk to 
practitioners. You talk to statisticians. You talk to testing 
professionals. And we might say that if I had a different day 
or a different way for a student to demonstrate their 
proficiency on this content, he would have. I know the student 
has mastered the material. But it just didn't work today. So I 
need a different day to test, or I need a different way. The 
goal is still the same: evaluate whether the student has 
mastered the concept. Just provide multiple, parallel tracks to 
identify whether the student has done so.
    Chairman Miller. Commissioner McWalters?
    Mr. McWalters. I would concur with that. There are two 
different multiple measures we are talking about. One is 
actually about student performance. And the other is capacity 
issues. I meant don't let the capacity issue measure. The 
indicators of whether they are on task should not somehow 
compensate for student performance. But I would concur.
    We are a state that very much is trying now to come up with 
embedded assessments that can be audited for reliability and 
use to drive practice. Those kind of measures when done right 
ought to be compensatory as in added to and part of an 
explanation.
    Chairman Miller. Let me follow up on what you just 
mentioned. Because in your testimony, you also stated--and this 
is what concerns me--``Now that we are 5 years in the 
implementation of the law, it is obvious that many schools that 
have missed their annual targets are doing all that they can do 
within a failing system.''
    Mr. McWalters. Right.
    Chairman Miller. And I think all of us here as we have 
visited schools and schools that haven't made AYP and they show 
you what changes they are making, you leave that school and get 
in your car and drive away thinking they don't have a chance.
    Mr. McWalters. Right, right. That is right.
    Chairman Miller. Because you just don't see any change in 
the capacity to do what is necessary. They have moved everybody 
around. They have given people titles.
    Mr. McWalters. Right.
    Chairman Miller. But it is just not going to happen. And it 
hasn't happened for the last 20 years in the same schools.
    Mr. McWalters. Right.
    Chairman Miller. So you start to think, you know--and so, I 
am intrigued with the idea of multiple indicators as also being 
able to give you a handle on what is going on in that school or 
even within that district.
    Mr. McWalters. Yes, right.
    Chairman Miller. But certainly, within that school of 
whether it is time for professional development or teachers to 
work together or to review one another's activity, all of these 
things that we think measure a learning environment. But again, 
you are really talking about two separate purposes.
    Mr. McWalters. That is right, sir.
    Chairman Miller. Is that correct?
    Mr. McWalters. Absolutely. I think most of us--again, when 
I said I was an urban educator, I was in a state that didn't 
have an urban capacity. The state could not intervene at the 
district level because some of our urbans they are bigger 
institutions almost than the State Department.
    So when this law started--and I think it started in the 
right place--all commissioners were into school improvements 
where you can go in and you can possibly restructure a school 
to work for a while. But if you step back and that is in a 
system that is dysfunctional, then that system will eventually 
come back to neutral, if you will.
    So this issue of what other measures of school health, 
district health aligned from state health to school room is 
very complicated business. And it is only actually with the 
emerging information systems that you can start tracking 
expenditures, time on task, teacher development.
    And the most impenetrable ones so far is when you have bad 
teacher practice with kids who have never been given a fair 
chance. And then you find teachers who actually begin to 
understand their own limitations when they see good standards 
and good feedback. And then you start realizing teacher 
retooling is part of an enormous investment strategy.
    I think my only point is if you don't know all of that and 
you just keep using one indicator, I just don't see the 
viability of that changing the improvement structure.
    Chairman Miller. Dr. Dougherty, do you want to comment?
    Mr. Dougherty. I would like to add that I am hearing two 
issues here. One is a more nuanced way of determining whether 
the kid is okay. Is the kid on a trajectory to being 
proficient. And the second is a more nuanced approach to 
whether the school is okay and the school is on a trajectory.
    Chairman Miller. And a district.
    Mr. Dougherty. And a district. And that is a very important 
point because school systems--schools exist within systems. And 
a lot of times the problem is the system is dysfunctional.
    Chairman Miller. Okay.
    Secretary Woodruff, let me ask you this. This 2007 school 
year you are going to be using an approved growth model. Is 
that correct?
    Ms. Woodruff. Yes, sir.
    Chairman Miller. What is the biggest change that you think 
you are going to notice?
    Ms. Woodruff. We don't know. We will be calculating the 
school's rating based on the traditional model. We will also be 
calculating using the growth model. And then we will be able to 
see whether or not there is any difference in the school rating 
between the two. So until we actually implement and evaluate 
that implementation, I really can't give you a clear answer.
    Chairman Miller. Thank you. My time is up.
    Mr. Olson, I am going to have to get you on a second round 
here. But I am quite intrigued with your track record in terms 
of administering these adaptive tests. And I would like to come 
back to that.
    But I would like now to recognize Congressman McKeon.
    Mr. McKeon. Thank you, Mr. Chairman.
    And just following up a little bit on the line of 
questioning that you were doing with Commissioner McWalters and 
Dr. Doran, you are talking about having a dual type different 
modes of testing because it would do a better job.
    One of the things that I have found in talking to people is 
their complaint already of having too many tests. Would this be 
another layer on top of that that they would have to deal with?
    Mr. Doran. A couple of issues.
    One, in thinking about this a bit, I think it is clear that 
we are talking about two buckets, two kinds of things that we 
want to collect indicators on and about schools. School process 
indicators, things that are illustrative of how healthy the 
school is in its instructional leadership and how well students 
are spending time on task and so forth. And those don't 
necessarily become quantifiable in the sense of whether 
students have mastered the core content or not. So those are 
the school process variables bucket. And those are extremely 
important.
    Then there is the other bucket, which are those measures 
that are designed to specifically measure whether students have 
met the outcomes that are expected of them or not. Now, with 
respect to student outcomes, if we had multiple measures--that 
is, other ways that we could evaluate whether the students have 
mastered the contents or not--I wouldn't necessarily suggest 
that students would be tested multiple times, per say.
    I think that students should be given multiple 
opportunities to demonstrate the mastery of the concept. So for 
example, if a student did fine and demonstrated their mastery 
of the concept on the regular statewide assessment, that is 
fine. That is the only assessment maybe that student would need 
to participate in unless the school or the classroom teacher 
for other reasons wanted the student to participate in 
something else.
    However, if the student didn't demonstrate mastery of the 
concepts on that particular test, I think there should be 
multiple avenues from which the school--the state has designed 
a system such that the school can choose an alternative path. 
Now, I don't think that, based on my conversations with 
professionals and state departments of education, my 
experiences as a practitioner, which was 10 years, that people 
would push back and reject an opportunity to allow students to 
have multiple opportunities to demonstrate their mastery of the 
concept.
    I think where people would push back is if students were 
required to participate in repetitive tests that didn't give 
them useful information upon which they could make 
instructional diagnoses from there.
    Mr. McWalters. I would completely concur with that. When I 
talk about multiple assessments, I think one of the things that 
we are still missing is that the test is perceived as a state 
test. And thank God, now I think most of us have at least got 
standards in systems where they are aligned. But teachers don't 
own them yet.
    And until we have worked at the level of teachers 
developing assessments just like the state tests or versions of 
it that I would call embedded, much more performance-based, 
much more on demand and that the state's obligation is to have 
a system that is auditing that so it is either got quality and 
it is reliable.
    But any of you that know anything about the writing process 
strategy statewide or nationally know that it is hard work and 
it is probably extensive to get it embedded. But until teachers 
begin to own the assessment decisions that would add up to 
improving the state test, then you are still doing a dip stick 
strategy and you are not going to change practice 
substantively.
    So the teachers I talk to don't think of my instrumentation 
as additional testing. They think of it as instructional 
assessment.
    Mr. McKeon. But it still takes more time away from 
classroom instruction because they have to do another----
    Mr. McWalters. The ones that I am talking about would be 
done right in an instructional program. It would be part of the 
instructional practice just like a quiz is today.
    Mr. McKeon. Okay.
    Mr. McWalters. If you know what I mean.
    Mr. McKeon. Okay. But a quiz also takes time away from 
instruction. I mean, at some point whenever you are evaluating, 
you are taking time away from instruction. I am just saying 
that was one of the complaints we have is we already have all 
these tests. And I am not saying anything about the validity of 
it, the importance of it.
    Just, I think, when you say we get push back on some 
things, you get push back on just about everything.
    Secretary, one of the questions I had is we both come from 
the largest state. You are one of the smaller states. Do you 
think what you are doing could be replicated with the number of 
districts we have, the number of schools we have within our 
state and then the same thing across the country?
    Ms. Woodruff. Well, actually, the growth model that we have 
in place absolutely could be used in small states, large 
states. It really doesn't matter. We are using the value table. 
It is very simple to understand, as Dr. Dougherty mentioned. 
Students are given points for different progressions toward 
proficiency. If the student slips, then the school gets fewer 
points. So the system itself is one that really can be used in 
a small system or a large one. That is not an issue at all.
    Could I comment on the previous conversation for a moment?
    Mr. McKeon. Go ahead.
    Ms. Woodruff. One of the things that I think that we have 
gotten away from is helping teachers and others understand that 
assessment has been and always will be a part of instruction 
and that those quizzes and end of course assessments and so 
forth are important. In Delaware, we have a student 
accountability system. And at certain levels students who are 
well below our standard must attend summer school.
    We have developed a system by which school districts can 
bring to us what we call other indicators of performance. And 
if students can show proficiency according to those other 
indicators, then they do not have to go to summer school and 
face other consequences like not going to the next grade and so 
forth.
    So I think that what both Dr. Doran and Commissioner 
McWalters are talking about in terms of other kinds of 
assessments really can be done. The system we have now probably 
isn't as sophisticated as it ought to be. But something like 
that makes sense to families and makes sense to students as 
they get older certainly and certainly to teachers.
    Mr. McKeon. Thank you.
    Chairman Miller. Mr. Kildee?
    Mr. Kildee. Thank you, Mr. Chairman.
    Commissioner McWalters, you testified in support of a 
differentiated interventions for schools that do not meet AYP, 
depending on how close they are. Can you describe how you might 
differentiate the consequences for schools that fall short a 
little, fall short a lot?
    Mr. McWalters. Well, right now our practice--we are 
actually in this practice. We have gone out--we have systems, 
and they tend to be embedded in big urban systems. I am going 
to be dramatic.
    You have flat line indicators. I mean, the first indication 
is teach, for God's sake. And that is a pretty heavy 
assessment. But when you go in, usually when you find there is 
a pretty complicated set of dysfunctions from leadership to 
school culture, attitudes. I mean, you just want to shut the 
place down, which that is the one dramatic thing we can do.
    But the truth is you go from there to places that have 
reasonably good cultures, but they just internalize low 
expectations. They love the kids, but they are not working with 
them. So you need to know that when you are going in there. And 
you need to know whether or not it is about alignment, time on 
task, command, control.
    You need to settle either those initiatives at the state 
and district level. And once you have them in your tool kit, 
you need to know whether the district is part of that problem. 
Is it the districts that have the systems of dysfunction? And 
if it does, that changes the trajectory of change.
    When I talk about AYP, I have two images. One is a 
realistic one for a school and a realistic one for a big system 
with a series of alignments that all have to be dealt with. So 
I think my point is I am in a little state that has enough 
information systems on health, time, expense, personnel that 
that is the level of intervention that we are now dealing with.
    And I just see differentiated treatments for different 
schools. There is a phrase in my state now, ``Great schools 
look awfully similar. Terrible ones can look awfully 
different.''
    Mr. Kildee. Well, let us take this. You have a school A and 
school B. One just barely missed AYP. And the other one just 
was way, way down the scale.
    Mr. McWalters. Right. That is right.
    Mr. Kildee. Can't we have effects, penalties, consequences, 
whatever you want to call them?
    Mr. McWalters. That is right.
    Mr. Kildee. Do you apply those effects, consequences, 
penalties differently in those instances?
    Mr. McWalters. Well, I would like to be able to--yes, my 
answer is I think we have to have better degrees of judgments 
made about what the intervention and the penalties are. And I 
think those should be in a proposal that is kind of a change 
theory or status that is reviewed by a peer review structure so 
that it is not hidden, it is not made up on the spot. It is a 
whole program of that is one reviewed.
    Because one of the other issues I think we have to admit is 
we are at a scale of intervention that is still an experiment 
in 50 states. None of us have an answer here. I need both 
assurance and cover that in good faith I am doing public policy 
work that can be tracked over time for its effectiveness. And I 
think that is what the peer review system ought to kind of 
review and sanction.
    Mr. Kildee. Well, for example, at one point you might 
require tutoring for students.
    Mr. McWalters. Right.
    Mr. Kildee. Because perhaps there was a great differential 
between where they should be. One just barely fell short. Is 
there something short of tutoring one could do in that school 
that would help raise that?
    Mr. McWalters. Well, I will give the example of a--in the 
first round, I think the drama was needed because it uncovered 
those places we are hiding behind averages. But once you got at 
that, many of the places actually got on task, identified 
through disaggregation what they had to do, and they went about 
the business of doing it.
    But now that we are into this over time, you have schools 
that kind of drop in and drop out. And to go in there 
effectively, sometimes they see it coming. Sometimes it is as 
simple as a cohort question. You want to be able to go in with 
an instrumentation.
    Sometimes it is instructional practice. Sometimes there was 
a change in leadership. And sometimes it is more time on task 
like tutoring. I am suggesting that all of those are decisions 
that need to be made in the context of a really comprehensive 
assessment of where the school or district is.
    Mr. Kildee. Thank you. Just one more question. Suppose one 
of these groups whom we disaggregate the data for falls short 
and that could bring the whole school out of compliance with 
AYP.
    Mr. McWalters. Right.
    Mr. Kildee. Is there something we can do rather than say 
that school is out of AYP and therefore must suffer the 
consequences, the effect, whatever you want to call it, that we 
do something for that one group to help raise them up? Or do we 
just declare the whole school not achieving AYP?
    Mr. McWalters. I would say you just ask me. That is the 
biggest question in my state now on the periphery. When you 
have a system that is perceived to be a pretty good system, 
good system, good school, in one indicator, usually second 
language or minority or poor kids in a system that they are a 
tiny percentage, in those early days, that was exactly what I 
needed because you could go after people that never talked 
about it.
    But now that everybody knows that is the indicator, once 
you have that, this issue of saying the school is now not in 
AYP and is in need of improvement it is--I don't know if the 
word is redundant or superfluous. Because now you still could 
have a reasonably high-performing place that is not running 
away from the identification of needing to do something about a 
target population. But the rhetoric of the big system--I am 
either in or out--it is not effective.
    Mr. Kildee. Thank you, Commissioner. Thank you very much.
    Thank you, Mr. Chairman.
    Chairman Miller. Thank you.
    Mr. Castle?
    Mr. Castle. Thank you, Mr. Chairman.
    And let me thank the panel of witnesses who were 
exceptional. I started this week giving a speech to our 
district superintendents on the growth model. And then I have 
listened to you. And I have decided now I knew a lot less about 
it than I thought I did going into it. So you have opened up 
the book for study, I think, here.
    Let me ask you, Dr. Doran, a question on something a little 
bit unrelated in your written testimony, which I am looking at 
now. You indicated in the discussion on NAPE, ``It cannot be 
used to measure AYP.'' I agree with that. ``But it could be 
used to inform how state performance standards are set,'' et 
cetera, and, ``recommend that the committee support a research 
agenda that would investigate and report how best to use links 
between NAEP and the various state assessment programs across 
the country.'' And I agree with that, too.
    And I have seen the charts that have shown how states are 
achieving on their own assessments versus how they do on the 
NAEP test, the National Assessment for Education Progress Test. 
And I would assume the state assessment would include 
standards, too. I mean, to me they are perhaps--I am not saying 
anyone is cheating. But obviously, some states are setting a 
lot higher standards than others.
    And that concerns me. I am not sure that is what the 
purpose of all this is. But I just wonder if you wanted to 
expand on that a little bit in terms of your thinking. I 
understand your conclusion is we need to study it further.
    Mr. Doran. I would be happy to. It is true. We know that 
there is a lot of variability in at least the two things that 
you mentioned. We know the difficulty of the assessments vary 
across states. And we also know the difficulty and the breadth 
of the content standards vary across the states.
    And there is some research. It is not comprehensive. But 
there is some research that has done exactly what you 
mentioned. We have seen how state tests can be used to match up 
to NAEP. And we can compare how state performance compares to 
NAEP. I think we need to extend that, and that is why I am 
recommending that. I would like to see that happen a bit more 
comprehensively.
    I think this is important for a number of reasons. One, I 
am not sure that there is a great deal of understanding of 
exactly what is happening in or why there is this great 
variability across states. And I think we need to open the door 
to start having that conversation about if there is 
variability, what is the cause of that variability, and are 
some states, in fact, doing things that other states should be 
doing.
    So I think having a policy that would help illustrate the 
comparability of standards and assessments across states would 
then lead us down the path of a better understanding about what 
some states are doing that may, in fact, should be replicated 
in other locations. Why do I think that is important? Well, we 
know that some students start high school in one state and they 
move into another state. And they may have a difficult time 
catching up. Or they may be advanced, and they may be bored.
    That some students graduate high school in, say, Arizona 
and they may move and attend college in California or obtain 
work in California. But the proficiency definitions in Arizona 
and California may be very disparate.
    So we in many respects don't have a really strong system of 
coherence. And I know why. Because we have--someone mentioned 
50 or 52 different experiments happening with the district in 
Puerto Rico. So I would recommend this because I think, a, we 
need an illustration of what is happening in terms of 
comparability. And, two, I think that would lead us down the 
road of a better understanding of why there are variances.
    Mr. Castle. Thank you.
    Let me jump subjects here and to Secretary Woodruff and Dr. 
Dougherty, getting back to the growth model.
    Secretary Woodruff, you mentioned that Delaware has been 
using longitudinal data systems that track individual student 
progress since 1984. And my impression is from your testimony 
and from what I know that indeed Delaware was more advanced in 
that area than perhaps some other states had been.
    Dr. Dougherty, I think you indicated that 27 states could 
do growth now and 40 in several years. Is that a correct 
statement? Well, let me ask the question. And that is the whole 
growth business is a little more complicated than I had 
thought, I am learning. And my concern is--and I think it is an 
important part of our discussion on the reiteration of No Child 
Left Behind perhaps this year.
    But my concern is the ability of the states to do it. We 
have had a lot of complaints about the cost of No Child Left 
Behind, et cetera. And I don't want to overburden. On the other 
hand, I would like to do something which is positive. I am just 
curious as to where we are vis-a-vis the states and how 
simplistic this would be for them to do or how complicated it 
would be for them to do it. If you all could share your 
thoughts on that.
    Ms. Woodruff. Well, I think Dr. Dougherty certainly is much 
more the expert on the lay of the land, if you will, across 
states and where different states are. But I know that in our 
conversations at CCSSO that as states are putting the data 
systems in place and learning more about assessment systems and 
how growth can work, there is a desire among my colleagues for 
this kind of accountability model to be used because we feel 
that it really can help us, quite frankly, incentivize our 
schools and people within our schools more than the status 
model alone.
    Mr. Dougherty. Yes, I would say that basically the data 
nerds in the state agencies have been wanting longitudinal data 
systems for years. And they never got the leverage until No 
Child Left Behind came along and you started to talk about, 
``Well, you have got to desegregate kids by ethnicity,'' and so 
forth and so on. And then how do you keep track of which kid 
belongs in which group with kids bubbling in every year? That 
is going to create errors and so forth.
    And so, you basically--one of the biggest positive 
consequences of No Child Left Behind is just the better 
development of data systems and the greater use of data for 
school improvement, system evaluation, and so forth. There has 
been tremendous progress. My organization was originally a 
small non-profit called Just for the Kids. And we started out 
in 2000 surveying the states to see who could do longitudinal 
data pictures involving student growth, tracking, who has been 
enrolled in the school for how long.
    And Tom Luce, who founded our organization, said, you know, 
find me 15 states that can do this. Well, we found about five. 
So now it is a lot more than 15, so we are making tremendous 
progress in this area. The recognition that it is valuable, 
that it is not only valuable for accountability, but you can 
then put information in the hands of educators.
    I have not only got my kid, but my kid comes in this fall 
and I have got an academic history on the kid going back. So if 
he doesn't understand multiplication, maybe he didn't learn 
place value last year. Understanding that building these data 
systems is valuable, both for evaluation, accountability, and 
school improvement and the teacher and principal and district 
level.
    Mr. Castle. Thank you.
    Thank you, Mr. Chairman.
    Chairman Miller. Ms. Hirono?
    Ms. Hirono. Thank you, Mr. Chairman.
    I think that NCLB should allow for multiple assessments 
because what we have now is just not fixable enough as a really 
helpful way to measure student progress. And right now the 
Department of Education is approving growth models on a pilot 
basis. And they are limiting this to only 10 states.
    I note, Dr. Dougherty, that in your testimony that 27 
states are pretty much ready to go with a growth model and that 
the NCLB right now does not contemplate that by statute.
    So yes or no would be good, for all of you, if we should 
amend NCLB to allow for more flexibility to allow the states 
right now to propose a growth model as an assessment measure. 
Can we just go down the line?
    Mr. Olson. Yes.
    Ms. Woodruff. Yes.
    Mr. Dougherty. Yes.
    Mr. McWalters. Yes.
    Mr. Doran. Yes.
    Ms. Hirono. Thank you.
    Thank you, Mr. Chairman.
    Chairman Miller. I am impressed. Thank you, Ms. Hirono.
    We will go back to Mr. Boustany.
    Mr. Boustany. Thank you, Mr. Chairman.
    Given that math and science have been--there is a strong 
consensus that these areas of education are critical for our 
national competitiveness vis-a-vis China and other countries in 
a global economy.
    For those of you who have looked at the longitudinal 
tracking, are there clear differences with regard to math 
versus language arts when you look at the tracking system? And 
is it easier to implement longitudinal tracking with math 
education than with language arts?
    Mr. Doran. I have done a bit of research on this actually. 
It is a tough question to answer. It is a good question. And we 
think about this question quite a bit actually.
    In the growth modeling world, and in a slight variation 
from the kinds of growth models that we are talking about 
today, something called value added models, we tend to be able 
to pick up what statisticians call a bit more signal, that is, 
we can minimize statistical noise, with math. We don't know 
exactly why.
    Some hypothesize that math tends to be a little bit more of 
a linear kind of an instructional subject as opposed to 
reading, which one may or may not--and there are arguments on 
the other side of that, that they say math isn't as linear. But 
from a statistical perspective in some of the research that I 
have done with value added models, which are slightly different 
than growth models that we are talking about here today, we are 
able to at least pick up a bit more sensitivity on what is 
happening within the school in the subject area of math.
    We still do very good work with--or we think we can still 
do very good work statistically with reading scores. But the 
sensitivity in terms of how much we can capture for whatever 
reason isn't as good in reading as it is in math. It is still 
good, and I don't want to undermine that it is not. But we can 
pick up better patterns of what is happening in schools and 
minimize statistical noise with math when compared to reading.
    Mr. McWalters. I think the issue of math and reading 
comprehension and communication are the central elements. My 
experience in this is that we have to delve deeper into what 
reading comprehension means. And testing has its limits there.
    But in the industry that I represent, people understand 
teaching reading. And yet they stop teaching it developmentally 
by the 4th grade, which is why you have so many kids who can't 
answer comprehension questions when they get into high school. 
And math is too often defined as operations as opposed to 
problem solving.
    And my experience is that once you are in high school, a 
student who can't solve the math problem probably isn't reading 
and comprehending what you are even asking them to do. If you 
can reduce it to an operation, they tend to be able to do it.
    I have kids who can pass an algebra test if it is done as 
algebra problems. If you take the numbers off the page and put 
it as a problem to be configured and then solved, they can't do 
it.
    Mr. Boustany. So there is a strong linkage between language 
skills and math solving ability.
    Mr. McWalters. At the higher up that you tend to go.
    Mr. Boustany. The higher up you go?
    Mr. McWalters. Absolutely. Problem solving----
    Mr. Boustany. So it is critical that if we are going to use 
longitudinal tracking as a tool, you wouldn't want to separate 
out the two. You would want to track both areas longitudinally?
    Mr. McWalters. Yes, absolutely.
    Mr. Boustany. Yes, yes.
    Also, Dr. Doran, I was very pleased to hear your commentary 
on the variability of NAEP and many of the state assessments. 
And this seems to be something that has been unmasked clearly 
since No Child Left Behind has been in play. And I agree. I 
think it is an area clearly that needs to be researched more 
thoroughly. And so, I thank you for bringing up that point.
    I see my time is running out.
    Mr. Chairman, I will yield back.
    Chairman Miller. Ms. Davis?
    Mrs. Davis of California. Thank you, Mr. Chairman.
    Thank you all for being here. I really appreciate your 
expertise on this.
    Commissioner McWalters, you mentioned one of the problems 
that we have certainly seen in the San Diego area where we had 
a school meeting AYP on one of 30--only missing it on one of 38 
requirements.
    In your research, if we were to address the specific 
shortfalls for a school and just look at that element--and in 
many cases, it is in special education or perhaps even in 
English language learners. Does that actually cover the needs 
for that school? Or how do you think we should best address 
that?
    Mr. McWalters. Now you are into context. And take this as 
an experienced practitioner, but it is not definitive. I can 
imagine a place where you expose the one indicator and the 
people are as upset at the school level as we would be. It is 
almost like when we finally got decent information that has 
surfaced, they are willing to step toward the problem.
    There is another school where that one indicator--those 
kids become the problem. They will do everything they can to 
find a way around the kid. Those are two different contexts. 
One of them you want to hang. And the other one you want to 
work with.
    Now, however we term that, this is that issue of is 
everything too blunt. Assuming that you have taught us the 
lesson that we are accountable and that we have got to be 
transparent, this tension between state, district, and school 
has to come to a new level of maturity where I am holding the 
right issues and people accountable for the right attitudes and 
intervention strategies. That is the best answer I can give 
you.
    Mrs. Davis of California. Anybody else want to address 
that? Okay. It is obviously a difficulty in the community. It 
is a huge difficulty for schools. And I was just curious to see 
how many people have----
    Mr. McWalters. But I want to say again. I have communities 
also that want those kids then to be isolated. That is the good 
part of NCLB is that these are all our kids. And to the extent 
we are on task to solve that problem, I need to be an incenter, 
a rewarder, and a partner. If you are avoiding those kids at 
the community or district level, I need to be the hammer.
    Mrs. Davis of California. Yes. And perhaps this is an 
expansion on that a little bit because we know that there are 
certain sub-groups that are more likely in some school 
districts to not meet the requirements. And there is this 
tension, as you say, with identifying certain sub-groups. Is 
there a growth model, though, for those sub-groups that might 
be more pertinent really within the context?
    For example, in English language learners, you may have a 
classroom where you are moving the kids out of that classroom. 
The fact that that classroom isn't showing improvement isn't 
because the kids aren't improving. It is because the kids who 
did improve moved out of the classroom.
    Mr. McWalters. Yes.
    Mrs. Davis of California. How can we best demonstrate this 
concern? And in many ways, is there just a downside to the 
growth model as well?
    Ms. Woodruff. If I could respond, actually when we designed 
ours--and our growth model will give schools and teachers 
within schools more specific information about individual 
students. And in particular, one of our directors for special 
education in one of our local school districts is really intent 
on this particular model because we will be able to see if a 
student is making that kind of progress and then they can then 
examine what needs to be done for that particular child.
    One of the other things--so I think the growth model does 
really incentivize and provide additional information, more in-
depth information for schools and districts to be able to act. 
And I think that is important.
    The other thing that we are finding as we look at the 
issues around English language learners and special education 
is--and we have done a lot of work to try to build the capacity 
of local districts. That even though you may have schools 
within a district that are kind of going up and down, that it 
is a district level issue that needs to be dealt with. And we 
need to help them intervene across the district, not just in 
individual schools so that you can stop some of that 
fluctuation.
    Mrs. Davis of California. And, Mr. Olson, perhaps if you 
want to come up really quickly. We are running out of time. But 
I just wonder is there good cooperation between states with 
this data sharing and in developing the longitudinal work that 
is being done? Do you see some really good examples that we 
could look at?
    And, Mr. Olson, did you want to comment on the last one 
real quickly? Mr. Olson, I am sorry. Did you want to comment on 
that last comment?
    Mr. Olson. I wanted to comment on your earlier question. 
Given our work, students typically will take a test two or 
three, four times during a year giving accurate information on 
the growth measured. When a student moves from one classroom to 
another, we have the data that follows the child.
    The other interesting thing is that with the kind of 
quality that we bring, the information we bring, we can begin 
studying the effects of moving a child from classroom to 
classroom.
    So it may or may not be--you know, if the student achieves 
somewhat less or less growth, it may not be the teacher. It may 
be the fact that the child was moved from class to class or 
from school to school. But the quality of data allows us to 
begin understanding issues like that within the school system.
    Mrs. Davis of California. Thank you.
    Mr. Dougherty. In answer to your question about cooperation 
across state, states are ravenous for information about how 
other states are doing it. One of our most popular things that 
we have got in the data quality campaign has been to do a 
lessons learned series where we have gone out and done detailed 
site visits in specific states and said, ``How did they go 
through the process?''
    This is stuff that is difficult to record in a survey, so 
it is their nuanced experiences. This has been in very high 
demand in other states.
    Chairman Miller. Thank you.
    Mr. Souder?
    Mr. Souder. I have a couple of questions. But I didn't hear 
a clear answer to Mr. Castle's earlier question.
    And maybe, Mr. Olson, you could take first crack. How much 
roughly does a growth model increase the costs?
    Mr. Olson. We don't have all the data about the costs for 
each school district, I mean, each state. But our measures in 
all likelihood could be put in place for a state at a cost real 
similar to what a state has gained for measure for one time a 
year under the current model.
    Mr. Souder. Thank you.
    We spend--and schools spend even more--millions on IDEA and 
developing individual education plans that supposedly are 
advancing those special needs students at the best rate 
possible.
    Does the growth model accommodate that? Is anybody talking 
about how to integrate what we are spending with the right hand 
into the left-hand measurements?
    Mr. Olson. Well, I would just make one comment. And earlier 
my remarks focused on two things. One is measuring growth. And 
two, measuring individual children accurately enough to measure 
growth. With the computerized adaptive measure, which we use--
and there will be other methodologies.
    But when you are measuring children accurately, we can 
measure academic growth of children about 98 percent of the 
children within the normal population. Which means we are 
measuring accurately academic growth of most of the special 
education population as well as most of our most talented 
children. So a real good accurate measure plus growth is 
applicable to those programs.
    Mr. McWalters. Can I comment on that?
    Mr. Souder. Yes. I would also like to see how that is 
integrated in, then, to the individual education plans and 
whether these two things are actually linked at all in the real 
world.
    Mr. McWalters. I think that is the right question that has 
a complicated answer. One is it is No Child Left Behind that 
finally got on the table that other than for a small number of 
students we should have the same standards for all kids.
    I am the parent of a special needs student who finally 
graduated from college sum cum laude in math who could not 
possibly pass any of these tests as a 4th-, 5th-or 6th-grader. 
So the issues of adaptations are very real. But the issues of 
common standard expectations need to be pounded on. That is the 
right place to be.
    Now, having said that, the instrumentation for changing 
expectations and changing classroom practice is we have so far 
to go that the AYP exercise right now is almost likely to pick 
up all of the common cultural heritage that we didn't expect 
these kids to do anything. So the intervention strategies now 
have to be comprehensive. They have to be intensive.
    But we have to be realistic about where we are starting. 
And I do think working that back into the individual 
improvement plan strategy and logic is a pretty powerful 
institutional problem that we are facing. And that is the only 
way you are going to bring assessment and IEPs into kind of a 
common mission.
    Mr. Souder. Because most of the schools in my district who 
are failing in the standards are either special needs, or the 
second is ESL. Because clearly, you can almost tell uniformly 
ESL mix even in Indiana. It varies even in a district. Some 
buildings will have 80 percent, and others will have a small 
percent.
    Some, very few, that are failing--I mean, a school can 
waiver a certain amount. But most of the schools that are 
having problems are way over the amount that they are allowed 
for a waiver. What we have been talking about today--how does 
that integrate with the English as a second language?
    Mr. Dougherty. I want to mention that an ESL kid is 
particularly likely--one who is just learning English--is 
particularly likely to be very far below proficiency on an 
English language test, since he can't read the test, at the 
beginning and then is likely to make very rapid progress. So 
you should note a rapid growth trajectory for such a student.
    Some states, of course, do have tests that measure the 
kid's progress in learning English. And school systems use 
those tests as part of their diagnostic understanding of why 
the kid isn't proficient on the English language test. It is 
because they are not proficient on the test of English 
proficiency. California is a great example of a state that has 
really been conscientious in developing a test that tracks 
kids' progress in learning English.
    Mr. McWalters. The huge difference is in grade spans. If 
you are somebody coming in here at 2nd grade coming from some 
schooling, first of all, by age you are developmentally more 
likely to respond to whatever the treatment is. If you come 
into the 10th grade with no schooling, that is a different 
treatment.
    I think we shouldn't confuse measuring the measurement of 
capacity or fluidity in a language with the other issues behind 
the individual child. This is much more about program 
treatment, the integrity of good program treatment in the ELL 
world while we are figuring out the different ways to measure 
what it is, language capacity or language fluidity, either 
readiness or in English. I think those are--we have to separate 
those issues and go after the integrity of program treatment 
because there is tremendous variability on these children.
    Mr. Souder. I had a young student from, I believe it was, 
Southside High School in Fort Wayne, Indiana, who had come in 
from Somalia where we have a lot of refugees coming in from 
Eastern Africa. And he said first off, he was given the test 30 
days after he arrived and spoke no English. And then even after 
he learned English, they had never taught math in Somalia. So 
even after he became proficient in English for his grade level, 
he was substantially behind.
    Mr. McWalters. Right, right.
    Mr. Souder. These nuances are just devastating to some of 
the morale to the teachers. I mean, I want accountability. But 
it is devastating to the morale of the school and the teachers 
when they are being measured and told they are failing based on 
those kinds of standards.
    Mr. McWalters. Right. But we have also many students in our 
country that are American, as in born here. And they are 
growing up in second language homes and neighborhoods. And they 
are not doing well in our tests, either. That isn't about 
measurement. That is about program quality. And this is about 
the intervention strategies.
    So I think the whole ESL question is the right question on 
the table. And the issues about language facility in their own 
language and in our language--all of that I think we have 
measures of that. But how you fold that into an accountability 
system and a program intervention question--it is not solved in 
the timelines that are in the NCLB exercise.
    Chairman Miller. Mr. Hinojosa?
    Mr. Hinojosa. Thank you, Mr. Chairman.
    I thank the panelists for coming into visit with us today 
and telling us what your thoughts are on No Child Left Behind.
    My first question I am going to direct to Peter McWalters 
and to Valerie Woodruff. No Child Left Behind already requires 
a growth model in one area. And that is for limited English 
proficient students. States are required to have benchmarks for 
English language proficiency that are aligned to the state's 
academic content standards.
    They are also required to annually measure students' 
progress toward proficiency. Share with us what steps your 
state has taken in implementing these provisions and how your 
experience with LEP students might inform our approach to 
growth models of accountability.
    Mr. McWalters. We are part of a national consortium to try 
to come up with both assessments and treatment. And as I said 
just a minute ago, so far to protect the interests of 
everybody, we have all of them tested in state testing, and we 
report them as disaggregation so that it is still currently 
transparent. It is only through that exercise that I think that 
I have not got the other layers of information, which is I have 
some students where that is a good measure of the system's 
failure to treat them.
    I have other students that shouldn't be taking that test. 
And it is almost a keen sense of the obvious when you see that. 
So I am trying to help people understand we have got to figure 
out the measurement instrument, which is necessary. But I think 
we also have to know that in some cases it isn't about the 
measurement.
    It is about the program that that child is in and either 
the integrity of its delivery or the fact that he shouldn't or 
she shouldn't be in that program. And I am trying to play that 
out right now both ways. But I am using straightforward state 
assessments to do it. And that is why that cohort isn't moving 
because many of them will not show significant enough 
improvement fast enough to get that program--or those kids off 
that list.
    Mr. Hinojosa. Valerie, what is your state doing?
    Ms. Woodruff. We certainly are measuring the students' 
proficiency in English. And I would agree with Peter. There are 
a number of children who, because of their varying 
circumstances, 12 years old, no schooling in their native 
language coming to us and then we are trying to catch them up, 
who really should not be participating in the state 
assessments. They do to the extent that they can. And that 
certainly tells us where they are.
    But we really need to be held accountable, in my mind, for 
particularly those older children and whether or not they are 
meeting proficiency in English first and then become part of 
our state assessment system. So, we actually implemented a test 
of English proficiency before No Child Left Behind and required 
our districts to track them. Also, once those children become 
proficient, we require our districts to continue to monitor how 
those children are doing. And if they begin to falter, then to 
intervene and provide additional support.
    So that has been something that has been kind of on the 
books and in practice in our state for a while. But we continue 
to be concerned with the frustration level of the children who 
are required to take an assessment that they cannot begin to 
understand and much less, be proficient on.
    Mr. Hinojosa. The next question I want to direct to Harold 
Doran and to Allan Olson. No Child Left Behind's accountability 
measures are least effective in high schools and is proven by 
how we are competing internationally. Our high schools are way, 
way down on the list as compared to China and Singapore and all 
those others.
    What are your recommendations for meaningful accountability 
at the high-school level that would include multiple measures, 
readiness for both secondary opportunities, and real progress 
on improving graduation rates?
    Mr. Doran. I have a couple thoughts. And I was wondering 
actually if that question would come up in today's 
conversation. Bill Gates gave testimony here a week or two ago, 
and this issue was highlighted. And there have been some recent 
studies that I think have been illustrative of exactly what you 
are talking about.
    I think there are a couple of things that I have learned by 
looking at the literature recently that have evaluated state 
assessment systems that tell an interesting story. I may get my 
numbers slightly wrong, but I think the number is something 
like this coming from Project Achieve and some studies they 
have recently done.
    I think it is eight states have aligned their graduation 
requirements with expectations for post-secondary education or 
the workforce. Twenty-six states have their assessments, high-
school assessments in place that only measure skills that 
measure 8th-, 9th-and 10th-grade skills. And those don't 
necessarily translate into skills that would guarantee that 
students are successful post-high school.
    There is an interesting model that 11 states have recently 
bought into. And they have formed a consortium around an 
Algebra II test. And the idea here is that when students 
demonstrate competency in Algebra II that that guarantees--or 
at least that gives them a higher probability that they will be 
successful post-high school. And in some of those 11 states 
that test will be a graduation requirement. In some other 
states, it will not.
    But I think one of the things that we can do from a policy 
perspective is ask the following question. What do we want for 
our children, and how do we know we are getting it? And so, one 
of the things that we ought to--that we want for our children 
is success post-high school. We need to operationalize and 
define what that means.
    Eleven states--there are more probably doing it, but I can 
cite the example of 11 states. They have said we value Algebra 
II. How do we know we are getting there? Well, we are going to 
measure their progress on that core content area because 11 
states--we believe that should students demonstrate competency 
in that particular content area, they are likely to be 
successful in high school.
    So I think we can start with something simple. Ask the 
question what do we want for our children. We want success in 
post-secondary education. And what does that mean? And then 
implement systems that measure that.
    Mr. Hinojosa. Thank you.
    Chairman Miller. Thank you.
    The gentleman's time is expired.
    Mr. Heller?
    Mr. Heller. Thank you, Mr. Chairman.
    Just a couple of questions here. And I appreciate the panel 
being here. I really do appreciate your input. You guys are the 
experts. I am not. My wife is a school teacher, so every once 
in a while she does chew on my ears a little bit, especially on 
this particular topic.
    And one of the issues probably reflects what the ranking 
member was saying and her concern about the amount of time you 
spend testing children as opposed to the amount of time you 
actually teach children. And it flows over.
    For example, I represent Northern Nevada. And the 
elementary school that my children go to, because of the amount 
of teaching--excuse me, the amount of testing that goes on, 
they have dropped certain curriculum. For example, they don't 
teach history any more in the elementary school level because 
it is not tested under NCLB.
    They have dropped geography. They have dropped social 
studies. And that doesn't include other curriculum or 
activities like the music programs. They are dropping all these 
programs because they are so concerned about these core issues 
that need to be taught and tested that they don't have time to 
teach others.
    And I was wondering perhaps, Ms. Woodruff, if you could 
comment on that.
    Ms. Woodruff. I would be happy to.
    Mr. Heller. Thank you.
    Ms. Woodruff. I think those schools are wrong in their 
dropping of those other curricular areas. Interestingly enough, 
in Delaware we assess both science and social studies and have 
been doing so at the elementary-, middle-and high-school level 
since 2000. And we will continue to assess social studies as 
well.
    The other piece of that is that I think that when schools 
begin to eliminate the social sciences, when they begin to 
eliminate the arts programs, they are failing to see that there 
is another context within which children can learn reading and 
mathematics.
    Mr. Heller. I agree.
    Ms. Woodruff. When children see the relationship to other 
kinds of--to the rest of their lives and to other kinds of 
learning, they are much more likely to be successful than if 
they are being constantly bombarded with only two or three 
particular subject areas. There are a number of research 
studies about the arts and so forth. I just think that it is 
something that I am very happy to tell you the schools in 
Delaware have not done and that we encourage them to understand 
how those linkages can be made.
    Mr. Heller. And I agree with you because I think that is an 
imperative part of a child's education, are some of these 
social skills that they learn in this process.
    Ms. Woodruff. Right.
    Mr. Heller. And I guess my concern, Mr. Chairman, is that 
we are limiting the curriculum of these children or are careful 
not to limit the curriculum of these students because I think 
music programs do offer value. I think history offers a lot of 
value as does geography and other social studies areas.
    Ms. Woodruff. If I could comment further, one of the things 
that we have deliberately done is we have standards in about 17 
different areas, including career and technical education. And 
we have done crosswalks, if you will, between standards in one 
area and in another so that the people see the relationship.
    We are also in the process of developing a statewide 
recommended curriculum with model units. And many of those 
units are integrated so that the teachers have something that 
they can use. And then they have embedded assessments that are 
directly related to the instruction that then just flow out of 
the whole teaching and learning process and are not seen as 
some stand-alone test that they don't feel has any sense and 
context of the school itself and of their ongoing work. So it 
is really an exciting opportunity for us.
    Mr. Heller. Okay.
    Ms. Woodruff. And our teachers are helping us build it and 
are embracing it.
    Mr. McWalters. I think this is a wonderful opportunity to 
get out of the silos by the cross-mapping. I am assuming most 
people would still want reading and math assessed because they 
are so central. The idea that they are displacing something or 
teaching to the tests as in drill and kill obviously is the 
wrong place to be.
    But when you start helping people map across the subjects, 
then all the activity, the actual hands-on applied learning 
exhibitions become instrumental in improving those two scores. 
That is one of the only ways we are going to change the 
structure of schooling. Otherwise you are going to end up with 
more separation, more discrete testing. And it will still be 
factual recall rather than application.
    Mr. Heller. Thank you, Mr. Chairman.
    Chairman Miller. Thank you very much. I guess I would argue 
that when schools start to implode on narrowing the curriculum 
it may be one of the first indicators of the lack of capacity, 
that you really are now watching an institution that is 
atrophying to such an extent and lost an understanding of what 
a learning environment is.
    I mean, I have been involved with a number of schools all 
across this country that have now taken music and made it an 
absolute gateway to mathematics and the understanding of 
mathematics. And I mean, it is replicated time and again in so 
many areas that that might be a red flag that you would not 
want to ignore in terms of the talent of that group of teachers 
and administrators.
    Next is Mr. Courtney.
    Mr. Courtney. Thank you, Mr. Chairman.
    And I was out of the room for a minute, and you may have 
covered some areas while I was gone. But I am from the state 
that is suing the federal government over No Child Left Behind, 
which is the way I was introducing myself at a lot of workshops 
for freshman members. And to be honest with you, it was 
actually a fairly----
    Chairman Miller. That would be Connecticut, right?
    Mr. Courtney. That is correct. Sorry.
    And, you know, listening to the presentation, which 
obviously all of you put a lot of thought into what is the 
goals--which I think everybody agrees on. But I have to say 
there really--at least in the state that has distinguished 
itself in terms of the hostility and adversarial relationship 
with this program--it is a very popular thing that the attorney 
general is doing.
    He is the kind of guy who sues everybody, pharmaceutical 
companies, banks, insurance companies. He has said that there 
has been no action of his office that has ever garnered the 
kind of public response as his decision to challenge NCLB.
    And, Mr. McWalters, who is a close neighbor of my 
district----
    Mr. McWalters. I am?
    Mr. Courtney. You sort of started to get into whether or 
not there is sort of a redundancy factor about what we are sort 
of learning from tests. And, you know, what I see in 
Connecticut is that when the test results come back in, the 
schools that are not succeeding are Title I schools. And, I 
mean, it doesn't take a rocket scientist to figure out that 
Greenwich High School is not going to have any problems 
succeeding. Whereas New London or Willington or Hartford or 
Bridgeport or New Haven are going to--I mean, and at some point 
people really question about, you know, why is this effort and 
expense worth it?
    Because it is almost common sense that tells you what the 
results are, which is we know where the problems are. It is 
poor school districts who, by the way, are the ones who have 
been getting shortchanged on Title I funding over the last 
couple of years. I mean, it is almost perverse to see the cuts 
that these districts are having to absorb over the last few 
years in terms of resources.
    At the same time, the government is identifying them as not 
succeeding. So, you know, I guess the question is is there a 
way to do this a little more intelligently without sort of, 
again, really damaging the public's belief and credibility in a 
process that they see as the tail wagging the dog.
    You have to go back now to the beginning because the 
issue--some of us experienced this between the law as passed by 
a bipartisan Congress with an executive branch that was drawing 
a new line in the sand for accountability and transparency. 
That is good. Disaggregation, good.
    Many of us were in states that had systems that pre-dated 
that. Valerie spoke to it. I spoke to it. I can clearly 
remember sitting with the department going, ``Wow, what an 
opportunity.''
    If you came in and assessed where each state was in terms 
of its own integrity to do the right thing--as in we had just 
got a law. I was into disaggregation. I was into the beginning 
of intervention. It didn't line up perfectly, but I was there.
    Instead of leveraging me forward, I spent 18 months 
regrouping. That was a mistake. But I write it off because I 
think the impatience of Congress from a nation at risk to goals 
2000 was such that you didn't want to hear it anymore. 
Connecticut was a perfect example, a high-performing state with 
some of the biggest gaps, some of the most urban 
concentrations.
    The way to call that question between the law and the 
department to focus in on what needed to be called apparently 
didn't happen. I was one of those states that said I don't need 
more state testing to know the sick place. But I have learned 
to appreciate grade-level testing as an instrument of 
improvement at the school and district level. I couldn't have 
untangled that 5 years ago.
    But I think we are all saying whatever lessons we needed to 
learn about accountability and capacity and transparency--if it 
hasn't been learned, then you need to authorize your department 
to go after that state. But for states that have stepped toward 
this and they are trying to sort out state needs from district 
needs to school needs to growth, individual, instructional 
needs, we have got to get that sophisticated pretty quickly.
    Ms. Woodruff. Well, our experience has been that many of 
our Title I schools are some of our highest-performing schools. 
And we, for a very small state, are thrilled that we have had a 
number of national blue ribbon schools. High-poverty schools 
with high-risk populations, including English language 
learners, particularly at the elementary level who are doing 
incredibly well.
    I think that where we are seeing No Child Left Behind 
really shining the light on places that may have a somewhat 
homogenous population and smaller numbers of the sub-groups and 
shining the light on those places and saying you are not doing 
what you need to be doing for all children has been helpful in 
many ways. And I think that Title I schools for many years have 
received a great deal of money.
    I am not happy with the way Title I schools have to hold 
back certain amounts of money in case of choice and in case of 
supplemental educational services that should be, in my mind, 
going to programs and to children rather than being held back 
for some of those reasons. But our experience with Title I, 
non-Title I has been a little different than what you 
described.
    Chairman Miller. The gentleman's time is up.
    Mr. Fortuno?
    Mr. Fortuno. Thank you, Mr. Chairman. First of all, I want 
to thank you for holding this hearing today, and the ranking 
member as well.
    And thank you all for being here. I am sorry I had to step 
out for a while. But as I was following everything that you 
have said--and I was here through all of your presentations 
today--it is clear that there are different states at different 
levels of achievement. Some states have really benefited from 
this process. And actually all their resources have been 
focused in trying to do what needs to happen.
    The other states like Connecticut--I would love to share 
some thoughts with you afterwards, if we may--certainly are not 
as happy. In my case in Puerto Rico actually, the latest was 
that the AYP measurements or actually requirements are not 
being met, and Puerto Rico was just fined a couple of weeks ago 
on this.
    And actually, if I may, Mr. Chairman, I would love to 
introduce that letter just to show that indeed there are 
different jurisdictions at different levels of achievement. So 
if you don't have any problems with that, I would love to 
introduce in the record the letter from the Department of 
Education.
    And I am just wondering when you have this disparity--and I 
am asking everyone--how do you handle--from here, how do you 
handle that disparity. We want some levels of measurement. 
There are some states achieving--or actually some districts 
that are at a very high level of achievement. And there are 
other places like my district where that is not happening, 
clearly not happening.
    So I would love to hear your insights as to what you 
recommend we do from this end to try to do something that fits 
everyone. But actually it is impossible to fit everyone.
    Mr. McWalters. I want to step right up on that one. I think 
the law was trying to protect the rights of children to get--to 
access to a quality education. And thank God, it holds states 
accountable for that. That is the right place to be.
    But having said, I am the smallest geographic state, but I 
am the second most densely populated state in the union. We 
have about the same number of population. We were comparing 
demographics earlier.
    Every one of these places has a very different issue. And I 
would still submit that this law does not address what the 
original Title I law was trying to do, which was become an 
issue--I am going to call it the urban agenda--in concentration 
and size. They make a difference.
    If you are dealing with New York City, Chicago, 
Philadelphia or Los Angeles or Providence, which is a small big 
city, the issues you are dealing with to get the individual 
school and student access in quality instruction is complicated 
by distance, size, and density.
    I think the law hints at that, but I don't think there is 
enough understanding that for me to get a child in New York 
City access and performance to standard is to deal with all of 
the issues from state house through district to community. And 
it is somewhere in the differentiated instruction. It is the 
same in Connecticut.
    Connecticut's issue is basically urban. Now, I don't know 
whether the state had an urban agenda. As a superintendent, I 
don't know many states that did, at least not really. Because 
if it is an urban agenda, it is more complicated than simply 
school improvement, as necessary as the school improvement 
infrastructure is.
    Mr. Fortuno. Anybody else have a comment? The weather is 
great in Puerto Rico this time of the year. And if anyone wants 
to come down, I guarantee good weather.
    Ms. Woodruff. I think your point that the law--although the 
goals of the law are certainly well-intended--that just as we 
know that every school has its own unique needs and issues, 
every state is in a different place. And I think that Peter 
mentioned earlier that, you know, if you have a state that has 
put systems in place and is moving forward and getting the 
agenda taken care of, then we ought to be allowed to do that 
and to be given some freedom and flexibility in order to do it.
    And those folks who are seeking to improve, such as Puerto 
Rico, need to be given technical assistance and support. It is 
part of what we keep talking about in terms of a federal, state 
partnership. And a partnership is you shake hands, and you 
figure out where you are going, and you help each other get 
there. It is not a one-size-fits-all and everybody lines up and 
you are either yes or no and put in a box. That is part of what 
has been frustrating.
    I believe that when No Child Left Behind was passed that 
Delaware could have made minor changes in our existing law, and 
we could have been much further along today than we are because 
we had to sit back and regroup. And I was told point blank by 
counsel at the Department of Education that our law was too 
restrictive and it needed to be changed. Our law was changed. 
And we are now in compliance with No Child Left Behind. We 
would be in a very different place today, I believe.
    Mr. Fortuno. Thank you.
    Mr. Olson. I would like to make an observation. The law 
when it was passed was passed with the intent to use an 
accountability process to help schools and states get better. 
Well-intended, well-conceived. But a message within the panel 
today is that there is also a need for Congress to reflect on 
how might that law be more helpful to the processes of 
improving learning and instruction and school organization and 
things like that.
    I think if you reflect on that question, given the 
resources that are being put in and the issues that you have 
within--the consequential issues you have within the law--and I 
am not suggesting--that reflection--I am not suggesting walking 
away from any of the requirements. But how might the law change 
in small ways to make it easier for schools to put the energy 
into constant improvement over time?
    Mr. Sarbanes. Actually I want to pick up on that idea--
excuse me--and talk about and ask you a couple questions about 
this relationship between resources and an accountability 
framework, which is usually put in the context of well, we have 
the accountability and we just need to get more resources in 
there behind what people are doing so that they can actually 
achieve the goal.
    But what I am interested in having you speak to is whether, 
for example, you think a growth model that has been discussed 
in contrast to this status model, whether that can actually 
result in more efficient use of resources.
    I mean, I had the opportunity to be part of rolling No 
Child Left Behind out in the state and the district and the 
district of schools and now within schools. And as you know, 
the current system is such that when you don't meet AYP 
particularly for certain periods of time, it triggers all kinds 
of technical assistance and other resources and requirements on 
the system and then on schools in terms of developing school 
improvement plans and restructuring plans and all this other 
kind of stuff.
    So that is an obvious place where if a growth model brought 
more flexibility into the system and the accountability system 
you might not start a school or a system or a series of schools 
jumping through those hoops that then generate a lot of 
resources as quickly. So you could speak to that.
    But then the other question is just in the delivery of 
resources do you think a growth model is going to encourage the 
resources to be directed better than they are being directed 
now?
    So I would love to have you all react to that question.
    Mr. Olson. I would like to make a brief comment.
    Mr. Sarbanes. Yes.
    Mr. Olson. And I will go back. There have been a number of 
observations that schools, say, were tested too much. What we 
find in our work is that once people are administering tests 
that are useful, helpful and drive improvement of their 
decision making, all of a sudden they think of testing as being 
desirable. So a lot of it has to do with the utility and the 
accuracy and the helpfulness, if you will, of the measure.
    So if you go to a measure that is more accurate than that 
which is commonly used, right away you, if you will, free up 
resources and you change the resource allocation. You change 
the energy. You change the decision making.
    If we have improved information about growth and about 
growth of individual children, we will then also know more 
about the factors of the resources and which are making a 
difference. And so, we can make better decisions which to use, 
which to modify and how to use them. The growth measure, a 
good, accurate growth measure will, in fact, influence resource 
allocation over time.
    Mr. Sarbanes. All right. Thank you.
    Ms. Woodruff. I would agree. We know that with the 
implementation of our growth model--and we have already done a 
few test cases and given information to some of our schools--
that they are able then to hone in on specific children a 
little differently. And to go to the gentleman's question a 
little while ago about the use of IDEA funds and so forth, we 
foresee--and I think this will hold true, continue to hold 
true--that the allocation of resources toward specific needs of 
not only groups of children, but individual children will be 
more targeted.
    I think then that as that happens, we ought to be given 
some flexibility in how to utilize those funds a little 
differently than perhaps we are required to do today. And I 
think that the whole issue of resources needs to be examined in 
terms of the efficiency with which schools and districts are 
using the resources at hand. Not to say that we couldn't use 
additional resources, but the examination of efficiencies is 
always important.
    Mr. Sarbanes. Right.
    Anybody else?
    Mr. Doran. Yes. The interesting thing about the growth 
model is it can tell a very different story about a school. And 
this would very directly interpret or suggest how we would do 
resource allocation. For example, the current system says you 
don't cross the threshold, you might be low-performing. If you 
are above the threshold, you are making AYP or some might 
perceive that as being high-performing. But it may, in fact, be 
the opposite story that we want to be told.
    In fact, we may have students who are very high-performing 
but they are dropping in their performance. The school is not 
actually doing a good job with those kids, but they are staying 
above the proficiency bar. On the other hand, we might have a 
school that is doing a very remarkable job with low-performing 
kids. They are not getting them to cross that proficiency bar 
just yet. They will, but they didn't just yet.
    Now, in fact, it is the school that appears to be high-
performing under the status model that actually needs resources 
targeted to it. And it is the other school that is doing a 
remarkable job with its struggling kids that appears to be 
doing okay. We do the opposite right now, not in all cases, but 
in many cases. And so, that would have a direct relationship.
    You know, we see this happen in other fields. And I talk to 
educators often about how to make good use of data and 
definitely explore different statistical methods. We recently 
saw this happen in a book that illustrated how, when you have 
the autonomy to look at statistics and data and mine your data, 
how you can figure out how to build better teams.
    I think we would see the same kind of thing happen in 
education. The autonomy to use better and newer statistical 
methods will allow for us to figure out how to build better 
schools.
    Mr. Sarbanes. Thank you.
    Chairman Miller. The gentleman's time is expired.
    I would just comment on the gentleman's question because I 
think it is a critical question, one as to flexibility of the 
use of resources and how you use the data. But, you know, in 
every other segment of the economy, people have been plowing 
the resources to developing data so that they can make smarter 
use of human capital or capital budgets and all of the rest of 
this.
    I mean, all across the board that is the competition that 
is taking place within the economy. And this is one of the--
this and health care are sort of the last areas to decide that 
data can really improve the deployment of resources and the 
efficiency of those resources.
    Mr. Keller?
    Mr. Keller. Thank you, Mr. Chairman.
    Mr. Chairman, I think it is critical that we get this bill 
in the strike zone or it is going to be in trouble.
    On the right, conservatives don't like the large role of 
the federal government. On the left, many teachers' unions have 
concerns about the testing components. And so, I think we need 
to make several positive changes. And I see several of those 
being made. I can see us making some improvements in the way we 
measure students with special education needs. I see some 
positive changes in the way we deal with children with limited 
English proficiency. I see the growth model being used at least 
as a supplement, if not more.
    But the biggest remaining complaint that I hear about No 
Child Left Behind in Florida is the inconsistency between the 
state and the federal accountability systems. And I am very 
interested in hearing from you about how the states and the 
feds can better align their dual accountability systems to 
ensure that parents are given clear and consistent information 
about their children's schools.
    Let me just give you an example. In Florida, we use one 
test called the FCAT both for the state's program called the A 
Plus program and for the federal No Child Left Behind program. 
Approximately 90 percent of the schools get a passing score 
under the state plan. And approximately 90 percent of the 
schools fail to meet AYP under the federal plan.
    So a parent moves into a school district and says, ``Is 
this a good school?'' Well, it is failing under the federal 
program, and it is an A school under the state program. And I 
think we have got to bring those in line.
    And so, I want to ask you. Let me start with Mr. McWalters.
    Are you also concerned as we go through reauthorization 
about this all or nothing approach to measuring progress for 
AYP? And if so, do you think we should go with a more graduated 
approach in terms of bringing the states and the feds more in 
line?
    Mr. McWalters. More graduated. However, I want to go on 
record. I think the feds need to stay in this business. We 
wouldn't be having this conversation if states either had the 
capacity or the will five generations ago to get us to where we 
are now.
    Mr. Keller. Nobody is questioning that.
    Mr. McWalters. So having said that, now I am talking about 
the spirit of the law versus the way it is administered.
    Mr. Keller. But I have only got a limited amount of time. I 
just want that--do you think we should go to a more graduated 
approach instead of----
    Mr. McWalters. But it has to have a peer review structure 
that is transparent.
    Mr. Keller. Right.
    Mr. McWalters. Because when my proposal is being reviewed, 
it is being reviewed in a way that appreciates the context from 
which I am coming.
    Mr. Keller. Let me stop you there. I hate to, but I have 
just got a little amount of time.
    Secretary Woodruff, do you believe that we should continue 
with this all or nothing sort of approach with AYP? Or would 
you prefer a more graduated approach?
    Ms. Woodruff. Absolutely a more graduated approach.
    Mr. Keller. And do you have any ideas how states and the 
federal government can bring their dual accountability systems 
more in line?
    Ms. Woodruff. Well, again, I think that, you know, in the 
reauthorization if you set some criteria around which--a 
framework within which we have to work and then allow us to 
bring forward our proposals that are measured then against this 
criteria, it makes sense.
    In Delaware, for example, we use both AYP and a growth 
component for our school rating. And we use growth of all 
children at all levels in reading, mathematics, science, and 
social studies as a part of that because we continue to value 
all four content areas, not just reading and math.
    Mr. Keller. Well, I met with our local bureaucrats at our 
Florida Department of Education. I asked them how could we 
bring them in line. And they did the data analysis for me. And 
if you meet 90 percent of the AYP criteria and call that 
excellent, say, that equals almost identical to schools who get 
an a. If you meet 80 percent of the criteria, we will call that 
good. That meets almost identical the schools that get a B.
    If you meet 70 percent and call it average, that meets 
almost identical the number of schools to get a C. But I am 
told when talking to folks on both sides of the aisle that if 
we did that sort of evaluative process on AYP that that would 
hurt some schools' feelings, that, you know, they are only 
average or good.
    And so, let me ask you, Mr. Olson, do you like that sort of 
graduated approach. Or do you think we should stay with the all 
or nothing approach to AYP?
    Mr. Olson. I would prefer the graduated. I also think that 
it is important to maintain some of the richness that state 
systems have. And I wouldn't be a real strong fan of adding 
many additional measures inside the calculation of AYP. I think 
that the schools should have multiple measures. I think states 
have the position and obligation to put those in place.
    And when you have a richer state system than you would want 
to fund and put in place from a federal system, I think you 
will have disparity from time to time. But I think the states 
have the flexibility also to create some means by which they 
appear more consistent.
    Mr. Keller. My time is expired unfortunately. I yield back.
    Chairman Miller. Mr. Payne?
    Mr. Payne. Thank you very much.
    You know, when this legislation first came--and like Mr. 
Courtney said, I was troubled because I knew that schools that 
had poor fiscal conditions, unqualified teachers, over-crowded 
classes, which are primarily in urban areas like mine in 
Newark, New Jersey and other urban places, I was somewhat 
opposed, disturbed by highstakes testing because I knew that 
they were going to show up at the bottom because of not having 
the opportunity to learn, which was a part of legislation in 
the past.
    But the majority that was in control for the last 12 years 
took out opportunity to learn. So if you were failing, that is 
your problem. It wasn't that you were not provided with the 
opportunity to learn.
    Secondly, I knew that there would be some problems with the 
suburban communities that might send large numbers of children 
to colleges. However, with No Child Left Behind it sort of 
disaggregated.
    And therefore, you could see that there were children being 
left behind because this legislation showed that there were 
minority kids, English proficiency language and special needs 
kids who were being left behind by these school districts that 
sent the majority of their kids off to wherever they would go 
after high school, but there was very little acknowledgement 
for the others. So I was kind of conflicted with knowing the 
testing was going to show negatively, on the other hand, 
knowing that the testing would show that there were almost 
discrimination to other kids.
    The whole question of states' rights--I mean, that is why 
we were so far behind. That is why we had to start with a 
national lunch program because states weren't taking care of 
people when World War II started. Title I, because they didn't 
deal with low-income school districts.
    So the federal government said, well, put this in. And the 
states who still have some of those old trends about not 
wanting government to intervene is because of things like 
public accommodations, the old Jim Crow laws, the old voting 
rights. And they don't people to expose the discrimination that 
still exists.
    Having said that, though, let us get back to the topic on-
hand. Let me ask a quick question. First 3 years of No Child 
Left Behind, growth models were generally not considered to be 
consistent with certain statutory provisions of the law. 
However, as you all know, in 2005, the secretary of education 
reversed course and announced that a pilot project under which 
up to 10 states would be allowed to use growth models to make 
AYP determined for that school year of 2005, 2006.
    Do you feel that the growth models overstate progress or 
appropriate credit improving schools? And you could also, if 
you have any comment or disagreement with my previous 
statement, you may certainly want to run in that all in about 
another 2 minutes.
    Mr. Olson. From what I have seen in the data, it does not 
seem to have any negative effects relative to the requirements 
of the law. There are relatively few schools that are making 
AYP with the growth model that weren't before. So that hasn't 
shifted much. I think it is very important to know that it is 
important to measure growth just because it is the best 
indicator of effectiveness of systems.
    I don't believe states are moving to measuring growth so 
fewer schools would be identified in that category. I haven't 
heard that in any of the conversations in any state. And I 
believe that they are functioning with a great deal of 
integrity. So I think it is all a positive move.
    Ms. Woodruff. What I am going to be interested in seeing is 
that once we put this growth model in place and we have more 
definitive information that schools receive--I want to see then 
what the effect of that is and their ability to intervene and 
do more for the individual children and groups of children so 
that they are moving either out of school improvement or 
continue on a trajectory to continue to meet the target. So I 
think that will not be known until we see this over probably at 
least a 3-year term relative to the examination of the data and 
what happens. But it is not an attempt to duck the system at 
all.
    Mr. Doran. The growth models are entirely consistent with 
the idea of what it means to learn. When a kid is learning, we 
know that the student is growing and changing. And so, growth 
models, when properly developed, reflect that notion.
    Dr. Dougherty and I serve on the secretary's peer review 
panel. And I think that panel worked very well in this last 
round. In fact, there were some growth models that 
statistically may have allowed for some schools to over-express 
growth. And they were met with some concern and comment from 
whether they were defensible or not.
    And I think if these growth models are to be allowed, that 
this peer review process that scrutinized the statistical 
methods that were being used and whether they would do exactly 
what you are asking--would they over-credit schools--needs to 
be emphasized and needs to continue to be in place to guard 
against exactly the point that you are mentioning. I do think 
growth models should be applied because they are the right 
thing to do. But I also think they should be subject to 
statistical scrutiny and whether they fit reasonably within a 
policy context.
    Mr. Dougherty. And I will mention that there was a lot of 
conversation in the panel about over time validating the growth 
models to see how many of the kids who were predicted to be 
proficient are on track to proficient actually end up being 
proficient.
    Mr. McWalters. I obviously support growth models. But don't 
substitute the instrument of measurement for the causation of 
change. Your issues about concentrated student need--the growth 
model is just going to help us see it. It is not going to 
answer how you treat it.
    Chairman Miller. The gentleman's time is expired.
    If I might follow on with a second round of questioning 
here, although I see we--excuse me. Mr. Ehlers? I am sorry.
    Mr. Ehlers. Thank you.
    Chairman Miller. The gentleman is recognized.
    Mr. Ehlers. As a token scientist here, I am used to being 
overlooked. But also as a token scientist, I have to ask a 
question about science education or my colleagues will think I 
have lost my ability.
    At any rate, Dr. Dougherty, I noticed that you taught in 
elementary school, taught science. And you are aware, of 
course, that schools have to begin testing for science in 2007, 
2008. But these tests under current law do not count toward 
AYP. I am proposing that they should. And I would appreciate 
your comment on that and whether you think that is an 
appropriate thing.
    Mr. Dougherty. I think they should. I think that--just 
going back to my experience, back in the day, a lot of times 
districts didn't have science curricula for elementary schools. 
In Texas, teachers, the science teachers actually requested 
that the tests count in the state accountability system because 
otherwise the school systems wouldn't pay enough attention to 
teaching science. So I think making science count is important.
    Mr. Ehlers. I appreciate that. And I, in fact, have 
introduced a bill to add that to No Child Left Behind. I hope 
it is included in the reauthorization.
    Let me go beyond that now. Some of you have made comments 
about the multitude of tests, the variability in the tests. My 
colleague who just left, Mr. Keller, raised the point that it 
was hard to keep track of who was doing well and who didn't 
because of the testing methods.
    I have introduced a bill to provide voluntary educational 
standards, math and science standards. And schools would not be 
required to use them, but obviously we would encourage them to 
use them. And I have a reason for that. You might argue it 
would be better to have national standards in other areas, but 
certainly, in the science and math because it is sequential in 
nature. And because of the variability of textbooks, the 
schedules and coupled with the mobility of families and 
students in today's world, it is very possible for students to 
get messed up.
    For example, if a student is attending a school that 
teaches fractions in the fall, percentages in the spring, and 
in January transfers to a school that teaches percentages in 
the fall, fractions in the spring, they get a double dose of 
fractions and never learn percentages. That is not an uncommon 
problem. I have seen it in a number of schools.
    Do you think it makes sense that we have a system of 
voluntary standards? And particularly, this came about not 
because so much of the sequencing, but when I looked at test 
results and this recent comparison that came out, comparison 
between how students did on the NAEP test compared to how they 
did on the state's tests, my own state got a D-in terms of how 
well the students were performing on the NAEP test compared to 
how they performed on the state test. And Michigan is an 
outstanding state, has a good school system.
    So there is something wrong if we don't have a better 
national standard so that we can compare apples to oranges 
related to AYP in different states. Any comments on that?
    Mr. Olson. I think maybe everyone on the panel will want to 
comment on it. Dr. Doran made a comment earlier that made 
reference to how states establish their benchmark, their 
requirement for proficiency. As far as I know, states have put 
proficiency statements in place that have no relative 
relationship to anything real in the world. NAEP is an example 
of that.
    So if we do move to voluntary standards, which I would be 
in favor of personally, that we do it in such a way that we ask 
the question what is it in the real world that should create 
that anchor of expectation and make that common across the 
states. The NAEP standard probably is not that standard. And 
so, I would suggest some serious thought. And to the extent 
that common standards or voluntary standards spread across 
other academic areas, the same question would be raised.
    Mr. Ehlers. That is a good idea, good comment.
    Others? Yes?
    Ms. Woodruff. I think that it absolutely is time for us to 
have voluntary national standards. And by that, I don't mean 
federal standards. I mean standards that we come together, we 
agree what the standards are. And we have to be thinking more 
clearly about serving the needs of our students, who are a much 
more mobile population today than they have ever been. So the 
conversation around national standards is timely, appropriate, 
and we ought to have it.
    Mr. Dougherty. I think such standards would be tremendously 
influential, which means if they are very good standards, very 
strong standards, they would be very positively influential. So 
it would be very, very important, particularly important, to 
get them right and have them to be strong. I suggest one of the 
anchors should be the aim that students be ready for college, 
and skill careers be a target for those standards.
    Mr. McWalters. I am from Rhode Island. So we have voluntary 
cooperative standards with two other states. I advocate it. I 
think it is got to be voluntarily. I am more interested in the 
measurements, the instrumentations of the standards and how we 
use measurement to actually get students to hit standards that 
are comprehensive. Don't confuse the standards with the need 
for multiple measures of them.
    Mr. Ehlers. Thank you.
    Dr. Doran, any comments?
    Mr. Doran. I do have comments. I mentioned this in my 
testimony and with relationship to the NAEP specifically. I do 
think--and I am a strong supporter--of voluntary national 
standards. I think the question is why do we have so much 
variability in the states' performance levels, and can we do a 
better job in bringing some coherence into our educational 
accountability system because of the reason that you mentioned, 
that we have a very mobile society. So for that reason, I am, 
in fact, very supportive of voluntary national standards.
    I do want to dovetail on what Allan Olson mentioned a 
moment ago. And that is that if voluntary national standards 
are created, especially as we look toward the high school, 
those standards should begin the conversation of connecting 
those standards with skills required to be successful post-high 
school.
    Mr. Ehlers. Thank you very much. I appreciate that.
    Chairman Miller. It probably would be too logical of a 
conclusion. But we will try it.
    Let me just ask a question. I am sorry. We have a vote, and 
I don't want to hold you for that vote.
    But, Mr. Dougherty, you indicated that there are 27 states 
that now have in place a data system that you think is 
acceptable so that they can move to a growth model. Is that a 
fair statement of your testimony?
    Mr. Dougherty. That is a fair statement. We didn't look at 
their assessment system, but we looked at their data system.
    Chairman Miller. So if the decision is made to go out and 
to embrace a growth model--and I assume we are all talking 
about a growth model toward proficiency, that this is a growth 
model to take you somewhere, that that is the kind of model. 
And there is obviously multiple growth models available, as I 
understand it, with integrity and with credibility for the 
results that we sort of have in this common conversation about 
what we want to achieve. So how do we start that transition? 
What do you do with it?
    I notice my state is not on that list of a state that has a 
data systems acceptable. And they just got a report, just a 
huge report that they have waited 3 years for that essentially 
one of the components has told them that their data system is 
in a shambles. They really know very little about their 
customers at all, where they are, what they are doing or how 
they are coming and going.
    What happens to them in this transition period? I mean, do 
we go through the process that we have been going through? You 
are on the secretary's peer review. States continue to make 
applications, and they are deemed adequate. And that is the 
process by which they get through.
    And I don't know, Secretary Woodruff, if you have had your 
experience with that process.
    But if you might, outline that, those who feel confident to 
do so.
    Mr. Dougherty. I would comment that is a very good process. 
It basically causes--it is voluntary. States step up to the 
plate. Everybody pretty much wants to have a growth model. And 
so, it is kind of like do you qualify.
    Chairman Miller. Yes, but a lot of people want a growth 
model because they think it is a silver bullet.
    Mr. Dougherty. Yes.
    Chairman Miller. You can hear it----
    Mr. Dougherty. It is not a silver bullet.
    Chairman Miller. You can hear it in their voice sometimes 
when they talk to you about it.
    Mr. Dougherty. They are going to be surprised how few 
additional schools qualify for AYP, as North Carolina, I think, 
has found, Delaware is likely to find. It is not a silver 
bullet. But from the point of view of improving the evaluation 
of the effectiveness of educational programs, providing 
guidance for school improvement, it is not necessarily a silver 
bullet. But it is definitely something that ought to be in the 
armory.
    Chairman Miller. Thank you.
    Mr. Doran?
    Mr. Doran. I will follow that and say that I didn't serve 
on the first round of peer reviews, but I did serve on the most 
recent rounds. I like the process that is currently in place. 
So to get from A to B if the flexibility were awarded such that 
states could implement the growth model, I think those growth 
models need to be submitted in the application process. This is 
very similar to the way states did this at the very beginning.
    When NCLB was first established, they had to establish an 
accountability workbook, and they had to go through the process 
of how they were going to compute AYP and so forth. I would use 
that same process, that states would have to describe how they 
are going to implement the growth model, how they are going to 
use it within their accountability system. It should then be 
scrutinized, modified, if needed.
    And I would also support the notion that it didn't turn out 
to be a silver bullet in either Tennessee or North Carolina. I 
think Tennessee had seven additional schools that made AYP as a 
result. And North Carolina, I believe, had none, if I remember 
my facts correctly.
    Mr. Dougherty. And in contract with the accountability 
workbook process, which is pretty much mandatory for 50 states, 
I wouldn't make having a growth model be a mandatory--you will 
get more enthusiastic participation if it is voluntary and 
probably more ingenuity of the ones who apply.
    Chairman Miller. Mr. Olson, let me ask you this. In your 
testimony you obviously lay out, you know, a substantial track 
record of looking at these systems and administering these 
longitudinal tests and the results. And you find this all 
compatible with your experience that states would be able to 
adapt to a system that would be able to allow them to mine this 
kind of information from these models that are--I guess I want 
to say--currently under consideration?
    Mr. Olson. Yes, I do. The thing I would come back to is 
that the states are allowed to assess even more accurately the 
wealth and the information and the value of the information 
will become increasingly useful and give us an opportunity to 
target and improve decision makings on many people inside the 
educational system in contrast to, you know, just the district 
level or just the state level.
    Chairman Miller. Let me ask you if you might, just quickly, 
what is the red flag we should be looking for in terms of when 
people describe to us the process they, their state, would like 
to go through to get to the other side. Is there a red flag 
that you have watched in the secretary's process or in 
experience of people who--I always worry that people embrace a 
concept but then their vision of the concept is a little 
skewed.
    Mr. McWalters. I think I can answer that from a state's 
perspective. If I have a growth model or not right now and I 
pass the review of the experts, my gap to 2014 is not going to 
get smaller with a growth model. So the issue of understanding 
how far we are as a nation from wrestling with proficiency at 
real levels without softening the bar--none of us want to 
soften the bar.
    So when you have a growth model or not, the gap is real. 
And the intervention capacity question is still the part that 
is missing for me. I don't want to hide how far I have to go. I 
want to change my capacity to get there.
    Chairman Miller. Anyone else?
    Secretary?
    Ms. Woodruff. As far as the whole growth model issue is 
concerned, I think that it is very important that the whole 
process is clear, understandable, and transparent.
    Chairman Miller. That is the congressional process.
    Ms. Woodruff. So that there is absolutely no question about 
what the criteria are, how they are going to be judged, and 
that the conversation is iterative. And as far as I am 
concerned, if there are 27 states----
    Chairman Miller. You are talking about the approval process 
for that growth model.
    Ms. Woodruff. I am talking about what is it you have to do 
and what are the steps that must be taken and then how are you 
going to be judged. I don't want to know what the test is after 
I have taken the course. I would like to know ahead of time 
what I am going to be judged on. And I think that has been the 
concern that a number of us had.
    If there are 27 states ready now, let them go. And then we 
will help the other states understand what the mechanisms are 
and the hurdles are to get there. I think we are in a state of 
today where nationally we really help each other and step up to 
do that on a regular basis.
    Chairman Miller. Thank you. That may be a good place to 
interrupt this conversation. I hope that we will be able to 
continue it as the committee gets deeper into the 
reauthorization process.
    Thank you so much for your time and your expertise and your 
experiences. I think this was very, very helpful to the members 
of the committee.
    The hearing record will stay open for 14 days. If there are 
others who want to make submissions, we would certainly take 
them under consideration.
    [The prepared statement of Mr. Altmire follows:]

Prepared Statement of Hon. Jason Altmire, a Representative in Congress 
                     From the State of Pennsylvania

    Thank you, Mr. Chairman, for holding this hearing to examine how we 
can improve No Child Left Behind's measures of progress.
    I would like to extend a warm welcome to today's witnesses. I 
appreciate all of you for taking the time to be here and look forward 
to hearing from you.
    Measuring whether or not students are making Adequate Yearly 
Progress is fundamental to how NCLB works. We must have indicators that 
accurately measure student knowledge and track their academic 
achievement to determine which schools are truly in need of 
intervention and to determine exactly what interventions are needed.
    I am particularly interested in hearing our witnesses' comments on 
growth models. Pennsylvania's proposal to institute a growth-based 
accountability model has just begun the peer review process. Assessing 
student achievement in this way may have the potential to improve how 
we measure Adequate Yearly Progress because it allows for the tracking 
of individual students' academic gain on a yearly basis. However, I am 
aware that there are different types of growth models and would be 
interested in hearing about the best practices in this area.
    Thank you again, Mr. Chairman. I yield back the balance of my time.
                                 ______
                                 
    [Additional materials submitted by Chairman Miller follow:]
    [The prepared statement of Prof. Hammond follows:]

   Prepared Statement of Linda Darling-Hammond, Charles E. Ducommun 
           Professor, Stanford University School of Education

    I thank Chairman Miller and the members of the Committee for the 
opportunity to offer testimony on the re-authorization of ESEA, in 
particular the ways in which we measure and encourage school progress 
and improvement. My perspective on these issues is informed by my 
research, my work with states and national organizations on standards 
development, and my work with local schools. I have studied the 
implementation of No Child Left Behind,\1\ as well as testing and 
accountability systems within the United States and abroad.\2\ I have 
also served as past Chair of the New York State Council on Curriculum 
and Assessment and of the Chief State School Officers' INTASC Standards 
Development Committee. I work closely with a number of school districts 
and local schools on education improvement efforts, including several 
new urban high schools that I have helped to launch. Thus, I have 
encountered the issues of school improvement from both a system-wide 
and local school vantage point.
    I am hopeful that this re-authorization can build on the strengths 
and opportunities offered by No Child Left Behind, while addressing 
needs that have emerged during the first years of the law's 
implementation. Among the strengths of the law is its focus on 
improving the academic achievement of all students, which triggers 
attention to school performance and to the needs of students who have 
been underserved, and its insistence that all students are entitled to 
qualified teachers, which has stimulated recruitment efforts in states 
where many disadvantaged students previously lacked this key resource 
for learning.
    The law has succeeded in getting states, districts, and local 
schools to pay attention to achievement. The next important step is to 
ensure that the range of things schools and states pay attention to 
actually helps them improve both the quality of education they offer to 
every student and the quality of the overall schooling enterprise. In 
order to accomplish this, I would ask you to actively encourage states 
to:
     Develop accountability systems that use multiple measures 
of learning and other important aspects of school performance in 
evaluating school progress;
     Differentiate school improvement strategies for schools 
based on a comprehensive analysis of their instructional quality and 
conditions for learning.
Why Use Multiple Measures?
    There are at least three reasons to gauge student and school 
progress based on multiple measures of learning and school performance:
     To direct schools' attention and effort to the range of 
measures that are associated with high-quality education and 
improvement;
     To avoid dysfunctional consequences that can encourage 
schools, districts, or states to emphasize one important outcome at the 
expense of another; for example, focusing on a narrow set of skills at 
the expense of others that are equally critical, or boosting test 
scores by excluding students from school; and
     To capture an adequate and accurate picture of student 
learning and attainment that both measures and promotes the kinds of 
outcomes we need from schools.
Directing Attention to Measures Associated with School Quality
    One of the central concepts of NCLB's approach is that schools and 
systems will organize their efforts around the measures for which they 
are held accountable. Because attending to any one measure can be both 
partial and problematic, the concept of multiple measures is routinely 
used by policymakers to make critical decisions about such matters as 
employment and economic forecasting (for example, the Dow Jones Index 
or the GNP) and admission to college, where grades, essays, activities, 
and accomplishments are considered along with test scores.
    Successful businesses use a ``dashboard'' set of indicators to 
evaluate their health and progress, aware that no single indicator is 
sufficient to understand or guide their operations. This approach is 
designed to focus attention on those aspects of the business that 
describe elements of the business's current health and future 
prospects, and to provide information that employees can act on in 
areas that make a difference for improvement. So, for example, a 
balanced scorecard is likely to include among its financial indicators 
not only a statement of profits, but also cash flow, dividends, costs 
and accounts receivable, assets, inventory, and so on. Business leaders 
understand that efforts to maximize profits alone could lead to 
behaviors that undermine the long-term health of the enterprise.
    Similarly, a single measure approach in education creates some 
unintended negative consequences and fails to focus schools on doing 
those things that can improve their long-term health and the education 
of their students. Although No Child Left Behind calls for multiple 
measures of student performance, the implementation of the law has not 
promoted the use of such measures for evaluating school progress. As I 
describe in the next section, the focus on single, often narrow, test 
scores in many states has created unintended negative consequences for 
the nature of teaching and learning, for access to education for the 
most vulnerable students, and for the appropriate identification of 
schools that are in need of improvement.
    A multiple measures approach that incorporates the right 
``dashboard'' of indicators would support a shift toward ``holding 
states and localities accountable for making the systemic changes that 
improve student achievement'' as has been urged by the Forum on 
Education and Accountability. This group of 116 education and civil 
rights organizations--which include the National Urban League, NAACP, 
League of United Latin American Citizens, Aspira, Children's Defense 
Fund, National Alliance of Black School Educators, and Council for 
Exceptional Children, as well as the National School Boards 
Association, National Education Association, and American Association 
of School Administrators--has offered a set of proposals for NCLB that 
would focus schools, districts, and states on developing better 
teaching, a stronger curriculum, and supports for school improvement.
Avoiding Dysfunctional Consequences
    Another reason to use a multiple measures approach is to avoid the 
negative consequences that occur when one measure is used to drive 
organizational behavior.
    The current accountability provisions of the Act, which are focused 
almost exclusively on school average scores on annual tests, actually 
create large incentives for schools to keep students out and to hold 
back or push out students who are not doing well. A number of studies 
have found that systems that reward or sanction schools based on 
average student scores create incentives for pushing low-scorers into 
special education so that their scores won't count in school 
reports,\3\ retaining students in grade so that their grade-level 
scores will look better,\4\ excluding low-scoring students from 
admissions,\5\ and encouraging such students to leave schools or drop 
out.\6\
    Studies in New York,\7\ Texas,\8\ and Massachusetts,\9\ among 
others, have showed how schools have raised their test scores while 
``losing'' large numbers of low-scoring students. For example, a recent 
study in a large Texas city found that student dropouts and push outs 
accounted for most of the gains in high school student test scores, 
especially for minority students. The introduction of a high-stakes 
test linked to school ratings in the 10th grade led to sharp increases 
in 9th grade student retention and student dropout and disappearance. 
Of the large share of students held back in the 9th grade, most of them 
African American and Latino, only 12% ever took the 10th grade test 
that drove school rewards. Schools that retained more students at grade 
9 and lost more through dropouts and disappearances boosted their 
accountability ratings the most. Overall, fewer than half of all 
students who started 9th grade graduated within 5 years, even as test 
scores soared.\10\
    Paradoxically, NCLB's requirement for disaggregating data and 
tracking progress for each subgroup of students increases the 
incentives for eliminating those at the bottom of each subgroup, 
especially where schools have little capacity to improve the quality of 
services such students receive. Table 1 shows how this can happen. At 
``King Middle School,'' average scores increased from the 70th to the 
72nd percentile between the 2002 and 2003 school year, and the 
proportion of students in attendance who met the proficiency standard 
(a score of 65) increased from 66% to 80%--the kind of performance that 
a test-based accountability system would reward. Looking at subgroup 
performance, the proportion of Latino students meeting the standard 
increased from 33% to 50%, a steep increase.
    However, not a single student at King improved his or her score 
between 2002 and 2003. In fact, the scores of every single student in 
the school went down over the course of the year. How could these steep 
improvements in the school's average scores and proficiency rates have 
occurred? A close look at Table 1 shows that the major change between 
the two years was that the lowest-scoring student, Raul, disappeared. 
As has occurred in many states with high stakes-testing programs, 
students who do poorly on the tests--special needs students, new 
English language learners, those with poor attendance, health, or 
family problems--are increasingly likely to be excluded by being 
counseled out, transferred, expelled, or by dropping out.
                               TABLE 1.--KING MIDDLE SCHOOL: REWARDS OR SANCTIONS?
                      [The Relationship between Test Score Trends and Student Populations]
----------------------------------------------------------------------------------------------------------------
                                                                    2002-03                     2003-04
----------------------------------------------------------------------------------------------------------------
Laura...................................................                        100                          90
James...................................................                         90                          80
Felipe..................................................                         80                          70
Kisha...................................................                         70                          65
Jose....................................................                         60                          55
Raul....................................................                         20   ..........................
                                                                   Ave. Score = 70%            Ave. Score = 72%
                                                             meeting standard = 66%      meeting standard = 80%
----------------------------------------------------------------------------------------------------------------

    This kind of result is not limited to education. When one state 
decided to rank cardiac surgeons based on their mortality rates, a 
follow up investigation found that surgeons' ratings went up as they 
stopped taking on high-risk clients. These patients were referred out 
of state if they were wealthy, or were not served, if they were poor.
    The three national professional organizations of measurement 
experts have called attention to such problems in their joint Standards 
for Educational and Psychological Testing, which note that:
    Beyond any intended policy goals, it is important to consider 
potential unintended effects that may result from large-scale testing 
programs. Concerns have been raised, for instance, about narrowing the 
curriculum to focus only on the objectives tested, restricting the 
range of instructional approaches to correspond to the testing format, 
increasing the number of dropouts among students who do not pass the 
test, and encouraging other instructional or administrative practices 
that may raise test scores without affecting the quality of education. 
It is important for those who mandate tests to consider and monitor 
their consequences and to identify and minimize the potential of 
negative consequences.\11\
    Professional testing standards emphasize that no test is 
sufficiently reliable and valid to be the sole source of important 
decisions about student placements, promotions, or graduation, but that 
such decisions should be made on the basis of several different kinds 
of evidence about student learning and performance in the classroom. 
For example, Standard 13.7 states:
    In educational settings, a decision or characterization that will 
have major impact on a student should not be made on the basis of a 
single test score. Other relevant information should be taken into 
account if it will enhance the overall validity of the decision.\12\
    The Psychological Standards for Testing describe several kinds of 
information that should be considered in making judgments about what a 
student knows and can do, including alternative assessments that 
provide other information about performance and evidence from samples 
of school work and other aspects of the school record, such as grades 
and classroom observations. These are particularly important for 
students for whom traditional assessments are not generally valid, such 
as English language learners and special education students. Similarly, 
when evaluating schools, it is important to include measures of student 
progress through school, coursework and grades, and graduation, as part 
of the record about school accomplishments.
Evaluating Learning Well
    Indicators beyond a single test score are important not only for 
reasons of validity and fairness in making decisions, but also to 
assess important skills that most standardized tests do not measure. 
Current accountability reforms are based on the idea that standards can 
serve as a catalyst for states to be explicit about learning goals, and 
the act of measuring progress toward meeting these standards is an 
important force toward developing high levels of achievement for all 
students. However, an on-demand test taken in a limited period of time 
on a single day cannot measure all that is important for students to 
know and be able to do. A credible accountability system must rest on 
assessments that are balanced and comprehensive with respect to state 
standards. Multiple-choice and short-answer tests that are currently 
used to measure standards in many states do not adequately measure the 
complex thinking, communication, and problem solving skills that are 
represented in national and state content standards.
    Research on high-stakes accountability systems shows that, ``what 
is tested is what is taught,'' and those standards that are not 
represented on the high stakes assessment tend to be given short shrift 
in the curriculum.\13\ Students are less likely to engage in extended 
research, writing, complex problem-solving, and experimentation when 
the accountability system emphasizes short-answer responses to 
formulaic problems. These higher order thinking skills are those very 
skills that often are cited as essential to maintaining America's 
competitive edge and necessary for succeeding on the job, in college, 
and in life. As described by Achieve, a national organization of 
governors, business leaders, and education leaders, the problem with 
measures of traditional on-demand tests is that they cannot measure 
many of the skills that matter most for success in the worlds of work 
and higher education:
    States * * * will need to move beyond large-scale assessments 
because, as critical as they are, they cannot measure everything that 
matters in a young person's education. The ability to make effective 
oral arguments and conduct significant research projects are considered 
essential skills by both employers and postsecondary educators, but 
these skills are very difficult to assess on a paper-and pencil 
test.\14\
    One of the reasons that U.S. students fall further and further 
behind their international counterparts as they go through school is 
because of differences in curriculum and assessment systems. 
International studies have found that the U.S. curriculum focuses more 
on superficial coverage of too many topics, without the kinds of in-
depth study, research, and writing needed to secure deep understanding. 
To focus on understanding, the assessment systems used in most high-
achieving countries around the world emphasize essay questions, 
research projects, scientific experiments, oral exhibitions and 
performances that encourage students to master complex skills as they 
apply them in practice, rather than multiple-choice tests.
    As indicators of the growing distance between what our education 
system emphasizes and what leading countries are accomplishing 
educationally, the U.S. currently ranks 28th of 40 countries in the 
world in math achievement--right above Latvia--and 19th of 40 in 
reading achievement on the international PISA tests that measure 
higher-order thinking skills. And while the top-scoring nations--
including previously low-achievers like Finland and South Korea--now 
graduate more than 95% of their students from high school, the U.S. is 
graduating about 75%, a figure that has been stagnant for a quarter 
century and, according to a recent ETS study, is now declining. The 
U.S. has also dropped from 1st in the world in higher education 
participation to 13th, as other countries invest more resources in 
their children's futures.
    Most high-achieving nations' examination systems include multiple 
samples of student learning at the local level as well as the state or 
national level. Students' scores are a composite of their performance 
on examinations they take in different content areas--featuring 
primarily open-ended items that require written responses and problem 
solutions--plus their work on a set of classroom tasks scored by their 
teachers according to a common set of standards. These tasks require 
them to conduct apply knowledge to a range of tasks that represent what 
they need to be able to do in different fields: find and analyze 
information, solve multi-step real-world problems in mathematics, 
develop computer models, demonstrate practical applications of science 
methods, design and conduct investigations and evaluate their results, 
and present and defend their ideas in a variety of ways. Teaching to 
these assessments prepares students for the real expectations of 
college and of highly skilled work.
    These assessments are not used to rank or punish schools, or to 
deny promotion or diplomas to students. In fact, several countries have 
explicit proscriptions against such practices. They are used to 
evaluate curriculum and guide investments in professional learning--in 
short, to help schools improve. By asking students to show what they 
know through real-world applications of knowledge, these nations' 
assessment systems encourage serious intellectual activities on a 
regular basis. The systems not only measure important learning, they 
help teachers learn how to design curriculum and instruction to 
accomplish this learning.
    It is worth noting that a number of states in the U.S. have 
developed similar systems that combine evidence from state and local 
standards-based assessments to ensure that multiple indicators of 
learning are used to make decisions about individual students and, 
sometimes, schools. These include Connecticut, Kentucky, Maine, 
Nebraska, New Hampshire, Oregon, Rhode Island, Pennsylvania, Vermont, 
and Wyoming, among others. However, many of these elements of state 
systems are not currently allowed to be used to gauge school progress 
under NCLB.
    Encouraging these kinds of practices could help improve learning 
and guide schools toward more productive instruction. Studies have 
found that performance assessments that are administered and scored 
locally help teachers better understand students' strengths, needs, and 
approaches to learning, as well as how to meet state standards.\15\ 
Teachers who have been involved in developing and scoring performance 
assessments with other colleagues have reported that the experience was 
extremely valuable in informing their practice. They report changes in 
both the curriculum and their instruction as a result of thinking 
through with colleagues what good student performance looks like and 
how to better support student learning on specific kinds of tasks.
    These goals are not well served by external testing programs that 
send secret, secured tests into the school and whisk them out again for 
machine scoring that produces numerical quotients many months later. 
Local performance assessments provide teachers with much more useful 
classroom information as they engage teachers in evaluating how and 
what students know and can do in authentic situations. These kinds of 
assessment strategies create the possibility that teachers will not 
only teach more challenging performance skills but that they will also 
be able to use the resulting information about student learning to 
modify their teaching to meet the needs of individual students. Schools 
and districts can use these kinds of assessments to develop shared 
expectations and create an engine for school improvement around student 
work.
    Research on the strong gains in achievement shown in Connecticut, 
Kentucky, and Vermont in the 1990s attributed these gains in 
substantial part to these states' performance-based assessment systems, 
which include such local components, and related investments in 
teaching quality.\16\ Other studies in states like California, Maine, 
Maryland, and Washington,\17\ found that teachers assigned more 
ambitious writing and mathematical problem solving, and student 
performance improved, when assessments included extended writing and 
mathematics portfolios and performance tasks. Encouraging these kinds 
of measures of student performance is critical to getting the kind of 
learning we need in schools.
    Not incidentally, more authentic measures of learning that go 
beyond on-demand standardized tests to look directly at performance are 
especially needed to gain accurate measures of achievement for English 
language learners and special needs students for whom traditional tests 
are least likely to provide valid measures of understanding.\18\
What Indicators Might be Used to Gauge School Progress?
    A key issue is what measures should be used to determine Adequate 
Yearly Progress (AYP) or the alternative tools that are used for 
addressing NCLB's primary goals, e.g. assuring high expectations for 
all students, and helping schools address the needs of all students. 
Current AYP measures are too narrow in several respects: They are based 
exclusively on tests which are often not sufficient measures of our 
educational goals; they ignore other equally important student 
outcomes, including staying in school and engaging in rigorous 
coursework; they ignore the growth made by students who are moving 
toward but not yet at a proficiency benchmark, as well as the gains 
made by students who have already passed the proficiency benchmark; and 
they do not provide information or motivation to help schools, 
districts, and states improve critical learning conditions.
    This analysis suggests that school progress should be evaluated on 
multiple measures of student learning--including local and state 
performance assessments that provide evidence about what students can 
actually do with their knowledge--and on indicators of other student 
outcomes, including such factors as student progress and continuation 
through school, graduation, and success in rigorous courses. The 
importance of these indicators is to encourage schools to keep students 
in school and provide them with high-quality learning opportunities--
elements that will improve educational opportunities and attainment, 
not just average test scores.
    To these two categories of indicators, I would add indicators of 
learning conditions that point attention to both learning opportunities 
available to students (e.g. rigorous courses, well-qualified teachers) 
and to how well the school operates. In the business world, these kinds 
of measures are called leading indicators, which represent those things 
that employees can control and improve upon. These typically include 
evidence of customer satisfaction, such as survey data, complaints and 
repeat orders; as well as of employee satisfaction and productivity, 
such as employee turnover, project delays, evidence of quality and 
efficiency in getting work done; reports of work conditions and 
supports, and evidence of product quality.
    Educational versions of these kinds of indicators are available in 
many state accountability systems. For example, State Superintendent 
Peter McWalters noted in his testimony to this committee that Rhode 
Island uses several means to measure school learning conditions. Among 
them is an annual survey to all students, teachers, and parents that 
provides data on ``Learning Support Indicators'' measuring school 
climate, instructional practices, and parental involvement. In 
addition, Rhode Island, like many other states, conducts visits to 
review every school in the state every five years, not unlike the 
Inspectorate system that is used in many other countries. These kinds 
of reviews can examine teaching practices, the availability and 
equitable allocation of school resources, and the quality of the 
curriculum, as it is enacted.
    Ideally, evaluation of school progress would be based on a 
combination of these three kinds of measures and would emphasize gains 
and improvement over time, both for the individual students in the 
school and for the school as a whole. Along with data about student 
characteristics, an indicator system could include:
     Measures of student learning: both state tests and local 
assessments, including performance measures that assess higher-order 
thinking skills and understanding, including student work samples, 
projects, exhibitions, or portfolios.
     Measures of additional student outcomes: data about 
attendance, student grade-to-grade progress (promotion / retention 
rates) and continuation through school (ongoing enrollment), 
graduation, and course success (e.g. students enrolled in, passing, and 
completing rigorous courses of study).
     Measures of learning conditions, data about school 
capacity, such as teacher and other staff quality, availability of 
learning materials, school climate (gauged by students', parents', and 
teachers' responses to surveys), instructional practices, teacher 
development, and parental engagement.
    These elements should be considered in the context of student data, 
including information about student mobility, health, and welfare 
(poverty, homelessness, foster care, health care), as well as language 
background, race / ethnicity, and special learning needs--not a basis 
for accepting differential effort or outcomes, but as a basis for 
providing information needed to interpret and improve schools' 
operations and outcomes.
How Might Indicators be Used to Determine School Progress and 
        Improvement Strategies?
    The rationale for these multiple indicators is to build a more 
powerful engine for educational improvement by understanding what is 
really going on with students and focusing on the elements of the 
system that need to change if learning is to improve. High-performing 
systems need a regular flow of useful information to evaluate and 
modify what they are doing to produce stronger results. State and local 
officials need a range of data to understand what is happening in 
schools and what they should do to improve outcomes. Many problems in 
local schools are constructed or constrained by district and state 
decisions that need to be highlighted along with school-level concerns. 
Similarly, at the school level, teachers and leaders need information 
about how they are doing and how their students are doing, based in 
part on high-quality local assessments that provide rich, timely 
insights about student performance.
    Some states and districts have successfully put some of these 
indicators in place. The federal government could play a leadership 
role by not only encouraging multiple measures for assessing school 
progress and conditions for learning but by providing supports for 
states to build comprehensive databases to track these indicators over 
time, and to support valid, comprehensive information systems at all 
levels.\19\
    If we think comprehensively about the approach to evaluation that 
would encourage fundamental improvements in schools, several goals 
emerge. First, determinations of school progress should reflect an 
analysis of schools' performance and progress along several key 
dimensions. Student learning should be evaluated using multiple 
measures that provide comprehensive and valid information for all 
subpopulations. Targets should be based on sensible goals for student 
learning, examining growth from where students start, setting growth 
targets in relation to that starting point, and pegging ``proficiency'' 
at a level that represents a challenging but realistic standard, 
perhaps at the median of current state proficiency standards. Targets 
should also ensure appropriate assessment for special education 
students and English language learners and credit for the gains these 
students make over time. And analysis of learning conditions including 
the availability of materials, facilities, curriculum opportunities, 
teaching, and leadership should accompany assessments of student 
learning.
    A number of states already have developed comprehensive indicator 
systems that can be sources of such data, and the federal government 
should encourage states to propose different means for how to aggregate 
and combine these data. In addition, many states' existing assessment 
systems already provide different ways to score and combine state 
reference tests with local testing systems, locally administered 
performance tasks (which are often scored using state standards), and 
portfolios.\20\
    For evaluating annual progress, one likely approach would be to use 
an index of indicators, such as California's Academic Performance 
Index, which can include a weighted combination of data about state and 
local tests and assessments as well as other student outcome indicators 
like attendance, graduation, promotion rates, participation and pass 
rates or grades for academic courses. Assessment data from multiple 
sources and evidence of student progression through / graduation from 
school would be required components. Key conditions of learning, such 
as teacher qualifications, might also be required. Other specific 
indicators might be left to states, along with the decision of how much 
weight to give each component, perhaps within certain parameters (for 
example, that at least 50 percent of a weighted index would reflect the 
results of assessment data).
    Within this index, disaggregated data by race/ethnicity and income 
could be monitored on the index score, or on components of the overall 
index, so that they system pays ongoing attention to progress for 
groups of students. Wherever possible these measures should look at 
progress of a constant cohort of students from year to year, so that 
actual gains are observed, rather than changes in averages due to 
changes in the composition of the student population. Furthermore, 
gains for English language learners and special education students 
should be evaluated on a growth model that ensures appropriate testing 
based on professional standards and measures individual student growth 
in relation to student starting points.
    Non-academic measures such as improved learning climate (as 
measured by standard surveys, for example, to allow trend analysis over 
time), instructional capacity (indicators regarding the quality of 
curriculum, teaching, and leadership), resources, and other 
contributors to learning could be included in a separate index on 
Learning Conditions, on which progress is also evaluated annually as 
part of both school, district, and state assessment.
    Once school progress indicators are available, a judgment must be 
made about whether a school has made adequate progress on the index or 
set of indicators. If the law is to focus on supporting improvement it 
will be important to look at continuous progress for all students in a 
school rather than the ``status model'' that has been used in the past. 
A progress model would recognize the reasonable success of schools that 
deserve it. Rather than identifying a school as requiring intervention 
when a single target is missed (for example, if 94% of economically 
disadvantaged students take the mathematics test one year instead of 
95%), a progress model would gauge whether the overall index score 
increases, with the proviso that the progress of key subgroups 
continues to be examined, with lack of progress a flag for 
intervention.
    The additional use of the indicators schools and districts have 
assembled would be in the determination of what kind of action is 
needed if a school does not make sufficient progress in a year. To use 
resources wisely, the law should establish a graduated system of 
classification for schools and districts based on their rate of 
progress, ranging from state review to corrective actions to eventual 
reconstitution if such efforts fail over a period of time. States 
should identify schools and districts as requiring intervention based 
both on information about the overall extent of progress from the prior 
year(s) and on information about specific measures in the system of 
indicators--for example, how many progress indicators have lagged for 
how long. This additional scrutiny would involve a school review by an 
expert team--much like the inspectorate systems in other countries--
that conducts an inspection of the school or LEA and analyzes a range 
of data, including evidence of individual and collective student growth 
or progress on multiple measures; analysis of student needs, mobility, 
and population changes; and evaluation of school practices and 
conditions. Based on the findings of this review, a determination would 
be made about the nature of the problem and the type of school 
improvement plan needed. The law should include the explicit 
expectation that state and district investments in ensuring adequate 
conditions for learning must be part of this plan.
    The overarching goal of the ESEA should be to improve the quality 
of education students receive, especially those traditionally least 
well served by the current system. To accomplish this, the measures 
used to gauge school progress must motivate continuous improvement and 
attend to the range of school outcomes and conditions that are needed 
to ensure that all students are educated to higher levels.
                                endnotes
    \1\ See, e.g. L. Darling-Hammond, No Child Left Behind and High 
School Reform, Harvard Education Review, 76, 4 (Winter 2006), pp. 642-
667. http://www.edreview.org/harvard06/2006/wi06/w06darli.htm
    L. Darling-Hammond, From `Separate but Equal' to `No Child Left 
Behind': The Collision of New Standards and Old Inequalities. In 
Deborah Meier and George Wood (eds.), Many Children Left Behind, pp. 3-
32. NY: Beacon Press, 2004.
    \2\ Linda Darling-Hammond, Elle Rustique-Forrester, & Raymond 
Pecheone (2005). Multiple measures approaches to high school 
graduation: A review of state student assessment policies. Stanford, 
CA: Stanford University, School Redesign Network.
    \3\ Allington, R. L. & McGill-Franzen, A. (1992). Unintended 
effects of educational reform in New York, Educational Policy, 6 (4): 
397-414; Figlio, D.N. & Getzler, L.S. (2002, April). Accountability, 
ability, and disability: Gaming the system? National Bureau of Economic 
Research.
    \4\ W. Haney (2000). The myth of the Texas miracle in education. 
Education Policy Analysis Archives, 8 (41): Retrieved Jul. 23, 07 from: 
http://epaa.asu.edu/epaa/v8n41/
    \5\ Smith, F., et al. (1986). High school admission and the 
improvement of schooling. NY: New York City Board of Education; 
Darling-Hammond, L. (1991). The Implications of Testing Policy for 
Quality and Equality, Phi Delta Kappan, November 1991: 220-225; Heilig, 
J. V. (2005), An analysis of accountability system outcomes. Stanford 
University.
    \6\ For recent studies examining the increases in dropout rates 
associated with high-stakes testing systems, see Advocates for Children 
(2002). Pushing out at-risk students: An analysis of high school 
discharge figures--a joint report by AFC and the Public Advocate. 
http://www.advocatesforchildren.org/pubs/pushout-11-20-02.html; W. 
Haney (2002). Lake Wobegone guaranteed: Misuse of test scores in 
Massachusetts, Part 1. Education Policy Analysis Archives, 10(24). 
http://epaa.asu.edu/epaa/v10n24/; J. Heubert & R. Hauser (eds.) (1999). 
High stakes: Testing for tracking, promotion, and graduation. A report 
of the National Research Council. Washington, D.C.: National Academy 
Press; B.A. Jacob (2001). Getting tough? The impact of high school 
graduation exams. Education and Evaluation and Policy Analysis 23 (2): 
99-122; D. Lilliard, & P. DeCicca (2001). Higher standards, more 
dropouts? Evidence within and across time. Economics of Education 
Review, 20(5): 459-73;G. Orfield, D. Losen, J. Wald, & C.B. Swanson 
(2004). Losing our future: How minority youth are being left behind by 
the graduation rate crisis. Retrieved July 23, 2007 from: http://
www.urban.org/url.cfm?ID=410936; M. Roderick, A.S. Bryk, B.A. Jacob, 
J.Q. Easton, & E. Allensworth (1999). Ending social promotion: Results 
from the first two years. Chicago: Consortium on Chicago School 
Research; R. Rumberger & K. Larson (1998). Student mobility and the 
increased risk of high school dropout. American Journal of Education, 
107: 1-35; E. Rustique-Forrester (in press). Accountability and the 
pressures to exclude: A cautionary tale from England. Education Policy 
Analysis Archives; A. Wheelock (2003). School awards programs and 
accountability in Massachusetts.
    \7\ Advocates for Children (2002), Pushing out at-risk students; 
Heilig (2005), An analysis of accountability system outcomes; Wheelock 
(2003), School awards programs and accountability.
    \8\ Heilig, 2005.
    \9\ Wheelock, 2003
    \10\ Heilig, 2005.
    \11\ American Educational Research Association, American 
Psychological Association, & National Council on Measurement in 
Education, Standards for Educational and Psychological Testing, 
Washington DC: American Educational Research Association, 1999, p.142.
    \12\ AERA, APA, NCME, Standards for Educational and Psychological 
Testing., p.146.
    \13\ See for example, Haney (2000). The myth of the Texas miracle; 
J.L. Herman & S. Golan (1993). Effects of standardized testing on 
teaching and schools. Educational Measurement: Issues and Practice, 
12(4): 20-25, 41-42; B.D. Jones & R. J. Egley (2004). Voices from the 
frontlines: Teachers' perceptions of high-stakes testing. Education 
Policy Analysis Archives, 12 (39). Retrieved August 10, 2004 from 
http://epaa.asu.edu/epaa/v12n39/; M.G. Jones, B.D. Jones, B. Hardin, L. 
Chapman, & T. Yarbrough (1999). The impact of high-stakes testing on 
teachers and students in North Carolina. Phi Delta Kappan, 81(3): 199-
203; Klein, S.P., Hamilton, L.S., McCaffrey, D.F., & Stetcher, B.M. 
(2000). What do test scores in Texas tell us? Santa Monica: The RAND 
Corporation; D. Koretz & S. I. Barron (1998). The validity of gains on 
the Kentucky Instructional Results Information System (KIRIS). Santa 
Monica, CA: RAND, MR-1014-EDU; D. Koretz, R.L. Linn, S.B. Dunbar, & 
L.A. Shepard (1991, April). The effects of high-stakes testing: 
Preliminary evidence about generalization across tests, in R. L. Linn 
(chair), The Effects of high stakes testing. Symposium presented at the 
annual meeting of the American Educational Research Association and the 
National Council on Measurement in Education, Chicago; R.L. Linn 
(2000). Assessments and accountability. Educational Researcher, 29 (2), 
4-16; R.L. Linn, M.E. Graue, & N.M. Sanders (1990). Comparing state and 
district test results to national norms: The validity of claims that 
``everyone is above average.'' Educational Measurement: Issues and 
Practice, 9, 5-14; W. J. Popham (1999). Why Standardized Test Scores 
Don't Measure Educational Quality. Educational Leadership, 56(6): 8-15; 
M.L. Smith (2001). Put to the test: The effects of external testing on 
teachers. Educational Researcher, 20(5): 8-11.
    \14\ Achieve, Do graduation tests measure up? A closer look at 
state high school exit exams. Executive summary. Washington, DC: 
Achieve, Inc.
    \15\ L. Darling-Hammond & J. Ancess (1994). Authentic assessment 
and school development. NY: National Center for Restructuring 
Education, Schools, and Teaching, Teachers College, Columbia 
University; B. Falk & S. Ort (1998, September). Sitting down to score: 
Teacher learning through assessment. Phi Delta Kappan, 80(1): 59-64. 
G.L. Goldberg & B.S. Rosewell (2000). From perception to practice: The 
impact of teachers' scoring experience on the performance based 
instruction and classroom practice. Educational Assessment, 6: 257-290; 
R. Murnane & F. Levy (1996). Teaching the new basic skills. NY: The 
Free Press.
    \16\ J.B. Baron (1999). Exploring high and improving reading 
achievement in Connecticut. Washington: National Educational Goals 
Panel. Murnane & Levy (1996); B.M. Stecher, S. Barron, T. Kaganoff, & 
J. Goodwin (1998). The effects of standards-based assessment on 
classroom practices: Results of the 1996-97 RAND survey of Kentucky 
teachers of mathematics and writing. CSE Technical Report. Los Angeles: 
UCLA National Center for Research on Evaluation, Standards, and Student 
Testing; S. Wilson, L. Darling-Hammond, & B. Berry (2001). A case of 
successful teaching policy: Connecticut's long-term efforts to improve 
teaching and learning. Seattle: Center for the Study of Teaching and 
Policy, University of Washington.
    \17\ C. Chapman (1991, June). What have we learned from writing 
assessment that can be applied to performance assessment?. Presentation 
at ECS/CDE Alternative Assessment Conference, Breckenbridge, CO; 
J.L.Herman, D.C. Klein, T.M. Heath, S.T. Wakai (1995). A first look: 
Are claims for alternative assessment holding up? CSE Technical Report. 
Los Angeles: UCLA National Center for Research on Evaluation, 
Standards, and Student Testing; D. Koretz, K., J. Mitchell, S.I. 
Barron, & S. Keith (1996). Final Report: Perceived effects of the 
Maryland school performance assessment program CSE Technical Report. 
Los Angeles: UCLA National Center for Research on Evaluation, 
Standards, and Student Testing; W.A. Firestone, D. Mayrowetz, & J. 
Fairman (1998, Summer). Performance-based assessment and instructional 
change: The effects of testing in Maine and Maryland. Educational 
Evaluation and Policy Analysis, 20: 95-113; S. Lane, C.A. Stone, C.S. 
Parke, M.A. Hansen, & T.L. Cerrillo (2000, April). Consequential 
evidence for MSPAP from the teacher, principal and student perspective. 
Paper presented at the annual meeting of the National Council on 
Measurement in Education, New Orleans, LA; B. Stecher, S. Baron, T. 
Chun, T., & K. Ross (2000) The effects of the Washington state 
education reform on schools and classroom. CSE Technical Report. Los 
Angeles: UCLA National Center for Research on Evaluation, Standards, 
and Student Testing.
    \18\ Darling-Hammond, Rustique-Forrester, and Pecheone, Multiple 
Measures.
    \19\ M. Smith paper (2007). Standards-based education reform: What 
we've learned, where we need to go. Consortium for Policy Research in 
Education.
    \20\ At least 27 states consider student academic records, 
coursework, portfolios of student work, and performance assessments, 
like research papers, scientific experiments, essays, and senior 
projects in making the graduation decision. Darling-Hammond, Rustique-
Forrester, and Pecheone, Multiple Measures.
                                 ______
                                 
    [National School Boards Association (NSBA) letter follows:]

                                                    March 20, 2007.
Hon. George Miller, Chair,
Committee on Education and Labor, U.S. House of Representatives, 
        Washington, DC.

Re: Hearing of the House Education and Labor Committee on Adequate 
        Yearly Progress, March 21, 2007; National School Boards 
        Association Statement for the Record.
    Dear Chairman Miller: The National School Boards Association 
(NSBA), representing over 95,000 local school board members across the 
nation, commends you for your strong support to reauthorize the 
Elementary and Secondary Education Act (ESEA)/No Child Left Behind 
(NCLB) Act during the 110th Congress, and for establishing an 
aggressive schedule for congressional hearings over the coming weeks. 
NSBA looks forward to participating in future hearings and very much 
appreciates the opportunity to submit written testimony for the record.
    Local school boards across the nation continue to support the goals 
of NCLB--including increased accountability for student performance. 
However, of utmost concern to local school boards is the belief that 
the current accountability framework does not accurately or fairly 
assess student, school, or school district performance.
    Although the sponsors of the No Child Left Behind Act intended to 
establish a responsive accountability system for the nation's public 
schools, what has evolved in the name of accountability is a 
measurement framework that bases its assessment of school quality on a 
student's performance on a single assessment; and mandates a series of 
overbroad sanctions not always targeted to the students needing the 
services.
    Five years after enactment of the federal law, local school 
districts continue to struggle to comply with the language of the law 
at a time when the unintended consequences of this complex law are 
imposing far more dysfunctional and illogical implementation problems 
than had been anticipated by the sponsors of the legislation. NSBA 
believes that the NCLB law can be amended to improve the accountability 
system in a way that restores public confidence in the law and results 
in significant improvement in the academic achievement of all students.
    In January 2005, NSBA officially unveiled its bill, the No Child 
Left Behind Improvements Act of 2005. The bill contains over 40 
provisions that would improve the implementation of the current federal 
law. In June, 2006, Representative Don Young (R-AK) introduced H.R. 
5709, the No Child Left Behind Improvements Act of 2006, which 
incorporated all of the NSBA recommendations. Co-sponsors of H.R. 5709 
included Representatives Steven R. Rothman (D-NJ-9), Rob Bishop (R-UT-
1), Todd Platts (R-PA-19), and Jo Bonner (R-AL-1). In January 2007, 
Rep. Young re-introduced his bill as the No Child Left Behind Act of 
2007, H.R. 648. The bill's co-sponsors to date include Representatives 
Charlie Melancon (D-LA-3), Steven Rothman (D-NJ-9), Jo Bonner (R-AL-1), 
Thaddeus McCotter (R-MI-11), and Todd
    Platts (R-PA-19), verifying strong bi-partisan support for these 
important improvements to the current law. This comprehensive bill 
addresses the key concerns of local school boards, including those 
provisions related to the accountability and the adequate yearly 
progress (AYP) framework. This bill would:
    Increase the flexibility for states to measure adequate yearly 
progress (AYP), including growth models.
    Grant more flexibility in establishing goals and determining AYP 
targets.
    Create a student testing participation range, providing flexibility 
for uncontrollable variations in student attendance.
    Allow schools to target resources to those student populations who 
need the most attention by applying sanctions only when the same 
student group fails to make adequate yearly progress (AYP) in the same 
subject for two consecutive years.
    Ensure that students are counted properly in AYP reporting systems.
    NSBA encourages you to review the No Child Left Behind Improvements 
Act of 2007, H.R. 648 in its entirety. However, for your convenience we 
have enclosed a copy of our Quick Reference Guide to the bill that 
provides the recommended provisions and a brief rationale.
    NSBA very much appreciates the opportunity to submit a written 
statement for the Record, and we look forward to working closely with 
you and your staffs to complete the reauthorization process during this 
First Session of the 110th Congress. We will also provide you with 
recommended legislative language which should be helpful to your staff 
in drafting the new bill.
    Questions concerning our specific recommendations may be directed 
to Reginald M. Felton, director of federal relations.
            Sincerely,
                                        Michael A. Resnick,
                                      Associate Executive Director.
                                 ______
                                 
    Chairman Miller. And, with that, the committee will stand 
adjourned. And, again, thank you so very much.
    [Whereupon, at 12:51 p.m., the committee was adjourned.]