An eye-opening report, "Issues and Recommendation for Resolution of the General Assembly Regarding Validity and Reliability of the Smarter Balanced Assessments Scheduled for Missouri in Spring 2015," authored by Dr. Mary Byrne of the Missouri Coalition Against Common Core, in consultation with other teachers and a test development expert, shows that the SBAC test Missouri schools are poised to give this spring has no external validity or reliability. In layman's terms, this means that the test developers have no corroborating outside confirmation to prove that their test questions measure what they claim to measure or can produce consistent results in repeated administrations. All they have is their own claim of validity and a plan to develop this external validity some time in the future. This means that no meaningful conclusions can be drawn from student scores on this exam. Despite the fact that SBAC piloted both the test items AND the delivery system simultaneously, making it extremely difficult to tease out why a student may have missed an answer, SBAC went ahead and set cut scores from data collected during the pilot tests given last spring. Further, by design, those cut scores have been set so that 62% of students will score below proficient, according to this EdWeek article.

From the report's Introduction:

“Tests that are valid and reliable are the only legally defensible tests that can be incorporated into any evaluation plan of students, teachers, and districts. Missouri’s State Board of Education and Department of Elementary and Secondary Education (DESE) are responsible for ensuring that statewide assessments administered in Missouri are valid and reliable, yet they committed Missouri as a governing member of the Smarter Balanced Assessment Consortium (SBAC) in April 2010,[1] before the test was developed, and even before the project manager for developing the pioneering assessment system was named.[2] In October 2013, DESE contracted with McGraw-Hill to administer tests aligned to the Common Core State Standards[3] despite restrictions “. . . that no funds shall be used to implement the Common Core Standards” as per HB 002 Section 2.050. McGraw-Hill has also contracted with SBAC to produce the SBAC test items.[4] The SBAC summative assessment dates are Mar. 30-May 22, 2015. According to DESE staff, student assessments will be scored, but not used in teacher evaluation or district accreditation. An SBAC memo dated September 12, 2014[5] indicates evidence of adequate validity and reliability is not available to support administration of the SBAC in spring 2015 and interpretation of scores.”

[1] http://www.moagainstcommoncore.com/documents
[2] http://www.smarterbalanced.org/news/wested-selected-as-project-manager-for-smarter-balanced-assessment-consortium/
[3] http://dese.mo.gov/communications/news-releases/missouri-education-department-chooses-vendor-assessments
[4] http://www.ednetinsight.com/news-alerts/prek-12-market-headlines/sbac-selects-ctb-mcgraw-hill-to-develop-next-generation-of-assessments.html
[5] https://www.sde.idaho.gov/site/commonAssessment/docs/Memo_Validity_Overview_2014-09-11.pdf

It is clear from SBAC documents that they have had concerns about this very issue for a long time. The problem was the artificial timetable set for delivery of the tests, which is what forced the simultaneous testing of infrastructure and test items. That timetable was set by the ARRA money the U.S. Department of Education received back in 2009, which it used to set up the two testing consortia. The $330 million funding both PARCC and SBAC was set to run out in September 2014, which meant they had to have an operational test ready to go for 2015. In the world of test development, this was delivery at Mach 10. They cut a lot of corners to get where we are.

On page two of the memo to the K12 Leads of SBAC, it states:

Test reliability will initially be modeled through simulations using the item pool after item review, which is due to be completed December 31, 2014. Operational test reliability will be reported in the technical manual following the first operational administration in spring 2015. . . .[emphasis added]

. . . Because this type of evidence continues to be gathered through the operational administration of the assessments, this table mostly reflects future plans [emphasis added] for external validity research.

There is no current reliability evidence for the test items; that will only be available some time in the future. The best they can do at this point is use a statistical model to estimate reliability. District AYP for 2015 will be calculated using a predictive model, not actual data. I can’t help but think of another model being used to set public policy that has never been accurate [see Climate Models Fail].
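To make concrete what "modeled through simulations" can mean, here is a minimal sketch in Python of estimating internal-consistency reliability (Cronbach's alpha) from simulated student responses. Everything in it, from the Rasch-style response model to the student and item counts, is a hypothetical illustration, not SBAC's actual simulation design:

```python
# Minimal sketch: estimating test reliability from SIMULATED responses.
# The response model and all counts are hypothetical, for illustration only.
import numpy as np

rng = np.random.default_rng(0)
n_students, n_items = 5000, 40          # assumed sizes, not SBAC's

ability = rng.normal(0, 1, n_students)      # simulated student abilities
difficulty = rng.normal(0, 1, n_items)      # simulated item difficulties

# Rasch-style probability of a correct answer, then simulated 0/1 responses
p_correct = 1 / (1 + np.exp(-(ability[:, None] - difficulty[None, :])))
responses = (rng.random((n_students, n_items)) < p_correct).astype(float)

# Cronbach's alpha: internal-consistency reliability of the simulated test
item_vars = responses.var(axis=0, ddof=1).sum()
total_var = responses.sum(axis=1).var(ddof=1)
alpha = (n_items / (n_items - 1)) * (1 - item_vars / total_var)
print(f"simulated reliability (alpha) = {alpha:.3f}")
```

A number produced this way describes the simulated data, not how the test behaves with real students, which is exactly the distinction the memo concedes.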

Questioning the Assessment’s Alignment with Common Core Standards

In the SBAC General Item Specifications, it states:

“CCSS were not specifically developed for assessment and contain a great deal of rationale and information about instruction. Therefore, following a practice many states have used in the past, Smarter Balanced distilled from the CCSS a set of content specifications expressly created to guide assessment development.” (p. 9)[1]

[1] http://www.smarterbalanced.org/wordpress/wp-content/uploads/2012/05/TaskItemSpecifications/ItemSpecifications/GeneralItemSpecifications.pdf

Dr. Byrne explains, “The statement indicates that problems of establishing validity and reliability may be attributable to the Common Core State Standards themselves, because they were actually insufficient for guiding the development of test items. Development of a comprehensive set of content specifications would not be necessary if the language of the standards was as clear and concise as proponents claim. Rather, the statement suggests that writing test items for SBAC requires rewriting and expanding the standards themselves. This also means that the real standards students will be measured against are not from CCSS, but rather result from closed discussions from the SBAC development team.”

Specific Problems with the Assessments

Reviewers of the 600+ page document submitted by Commissioner Nicastro in September 2014 to the Missouri Senate President Pro Tem and the Speaker of the House, which was supposed to show evidence of validity and reliability, noted several problems, including:

  • grade-level blueprints… clearly show that the mathematics assessment cannot be considered reliable or as having content validity. Having at most 15 items (15 points; some items may be worth more than one point) to address the broad scope of the priority clusters at various grades seems clearly insufficient. There is no reasonable way that 15 items can adequately sample the content of grade-level mathematics with meaningful content validity or sufficient reliability (see the sketch after this list). Further, the mathematics assessment in total has too few items, even including items addressing supporting clusters and process skills.
  • Appendix R, where it is clearly visible that most standards will get a single test item to assess them, if any at all. This is grossly inadequate.
  • No evidence is reported that test results are predictive of anything, including college or workforce readiness. Minimally, test results should relate to current classroom performance or grade point average (GPA), but this important validity evidence is not found in this report.
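The reviewers' point about test length is standard psychometrics: all else being equal, a shorter test is a less reliable one. The Spearman-Brown prophecy formula quantifies the effect; the sketch below assumes a hypothetical 60-item test with a reliability of 0.90 (illustrative numbers only, not SBAC figures) and predicts what happens as the item count shrinks toward 15:

```python
# Spearman-Brown prophecy formula: predicted reliability when a test is
# shortened. The 60-item, 0.90-reliability baseline is hypothetical.
def spearman_brown(rho_full: float, length_ratio: float) -> float:
    """Predicted reliability when test length is scaled by length_ratio."""
    return (length_ratio * rho_full) / (1 + (length_ratio - 1) * rho_full)

full_items, full_reliability = 60, 0.90   # assumed baseline, not an SBAC figure
for n_items in (60, 30, 15):
    ratio = n_items / full_items
    predicted = spearman_brown(full_reliability, ratio)
    print(f"{n_items:2d} items -> predicted reliability {predicted:.2f}")
```

Under these assumptions, 15 items yields a predicted reliability near 0.69, well below the 0.80-0.90 range usually expected of high-stakes tests, which is the substance of the reviewers' complaint.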

Missouri reviewers are not the only ones to voice serious concerns about the SBAC test’s ability to deliver valid and reliable results.

“Mr. Doug McRae,[1] a former McGraw-Hill executive specializing in test development, has persistently questioned Smarter Balanced’s methodology and timetable in testimony before the California State Board of Education and in published articles.[2],[3]

McRae commented in 2013, “. . . developing ‘next generation’ assessments has been slow. The idea that we will have a fully functional college and/or career readiness assessment instrument by 2014-15 is fanciful thinking.”[4] He participated in the online SBAC cut-score-setting activity in October 2014 and was skeptical about its technical adequacy.”

[1] Doug McRae is a retired measurement specialist who worked at CTB/McGraw-Hill (one of the nation’s foremost education test publishers) for 15 years, first as a senior evaluation consultant and later as vice president of publishing. While at CTB/McGraw-Hill, he was in charge of design and development of K-12 tests widely used across the U.S. From 1994 to 2008 he was self-employed as a consultant, advising on the initial design and development of the California STAR Program and advising policymakers on the design and implementation of K-12 standardized tests in California. He has a Ph.D. in Quantitative Psychology from the University of North Carolina, Chapel Hill.

[2] http://toped.svefoundation.org/2011/03/07/common-core-groups-should-be-asked-plenty-of-questions/
[3] http://montereycountyschools.blogspot.com/2013/01/more-on-proposed-revamping-of-testing.html
[4] http://montereycountyschools.blogspot.com/2013/01/more-on-proposed-revamping-of-testing.html

In addition, HB1490 Section 161.096.2 stipulates, “Quantifiable student performance data shall only include performance on locally developed or locally approved assessments, including but not limited to formative assessments developed by classroom teachers.” In an apparent violation of this language, the Smarter Balanced Assessment given this spring will contain performance items developed by SBAC. “The performance tasks [commonly seen as short answer or essay questions] will be administered on a matrix-sampling basis.” Teachers have received notice that at least one of the performance items will be a collaborative response with instructions given by “facilitators,” and for which students will have only a brief window to ask clarifying questions of the facilitators. The student teams will have two days to complete that task. How scorers are going to determine the level of collaboration when all they have is a final product to judge is a great unknown. The language in HB 1490 is meant to protect Missouri’s students from the experimental nature of the SBAC performance test items and the persistent tensions among test developers over their design.
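For readers unfamiliar with the term, matrix sampling means that each student sees only a subset of the task pool, with tasks rotated ("spiraled") through the roster so that the full pool is covered at the group level even though no individual takes everything. A minimal sketch of the idea, with hypothetical task names and class size (not SBAC's actual assignment scheme):

```python
# Minimal sketch of matrix sampling: spiral a pool of performance tasks
# through a class roster. Task names and roster size are hypothetical.
import itertools

task_pool = ["narrative essay", "research task", "data analysis task"]
students = [f"student_{i:03d}" for i in range(12)]

# Each student gets exactly one task; the pool repeats across the roster,
# so every task is covered without any student seeing more than one.
assignments = dict(zip(students, itertools.cycle(task_pool)))

for student, task in assignments.items():
    print(student, "->", task)
```

One consequence follows directly: because no student takes every task, results from matrix-sampled items are most meaningful for groups, not for individual students.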

What To Do?

According to HB1490, which in turn cited previous Missouri statute (so this is not something new to DESE), the Commissioner is supposed to deliver a report on any new or modified statewide assessment’s validity and reliability to the Pro Tem and Speaker. The legislature then has 60 legislative days to veto the implementation of the test. That clock began ticking on January 7th, when the legislature opened the 2015 session. The report written by Dr. Byrne contains a recommended resolution that both chambers can pass to stop the pointless implementation of a test that is illegal under Missouri statute and will provide no meaningful results.

Recommendations

“The following recommendations for a resolution to veto implementation of the SBAC in Missouri and investigate decisions made by the commissioner of education and state board of education to implement the SBAC without assurance of its technical adequacy are made on the grounds that state education leaders did not exercise due diligence in the adoption of membership in SBAC prior to reviewing evidence of assessment validity and reliability, and on their insistence that school districts administer SBAC to evaluate teachers, as well as earn points toward accreditation in MSIP5, without assuring districts of the SBAC assessments’ legal defensibility:

  1. Legislative leadership should, within the number of legislative days allowed by HB 1490, support a resolution in both the House of Representatives and the Senate to stop implementation of the Smarter Balanced Assessments in Missouri and withdraw Missouri from SBAC as per Missouri’s original memorandum of understanding with SBAC;
  2. The resolution should also request, for the purpose of detecting fraud, an independent review of technical adequacy by a testing expert who has experience in the development and evaluation of nationally used standardized tests and is not associated with the SBAC or PARCC consortia, state or federal departments of education, or testing companies that contract with any of the aforementioned entities;
  3. The resolution should request a thorough investigation of appropriate uses of statewide standardized tests in Missouri (for example, whether student growth models relying on standardized test results, as required by the U.S. DoE’s No Child Left Behind Waiver Renewal application, are appropriate) for the purpose of challenging the conditions of the ESEA waiver offered by the U.S. DoE;
  4. The resolution should request an immediate audit of DESE and the State Board of Education to determine why a contract with McGraw-Hill was negotiated while the Attorney General was suing McGraw-Hill.”

Other important issues are covered in the report, including the ongoing lawsuit challenging the legality of the consortium as an interstate compact, DESE’s disregard of the prohibition in HB2 2013 against entering into any contracts to implement Common Core, and the vulnerability of the tests to hacking. The report has been delivered to the Speaker and Pro Tem and has been positively received. Missourians should bring this report to their representatives’ attention and encourage them to support a veto of the SBAC implementation.

Anne Gassel

Anne has been writing on MEW since 2012 and has been a citizen lobbyist on Common Core since 2013. Someday she would like to see a national Hippocratic oath for educators: “I will remember that there is an art to teaching as well as science, and that warmth, sympathy and understanding are sometimes more important than policy or what the data say. My first priority is to do no harm to the children entrusted to my temporary care.”
