The Importance of Having Data; Or What Would Sherlock Holmes Do?

“‘Data! Data! Data!’ he cried impatiently, ‘I cannot make bricks without clay!’”
— “The Adventure of the Copper Beeches” by Sir Arthur Conan Doyle

Sherlock Holmes isn’t the only one relying on data. As anyone in the education world, from researchers and parents to teachers, principals, and students, can tell you, decision-making in education is increasingly based on data that show us what is and isn’t working. So what happens when we don’t have the data we need? Schools that receive federal and state education funds often have specific data reporting requirements, making centralized data collection and analysis relatively convenient. But early childhood education, fragmented across states, localities, programs, and sectors, presents a challenge to the data wonk.

[Image: Sherlock Holmes statue. Caption: Sherlock Holmes muses on some data points.]

Lisa Guernsey and her team at the New America Foundation’s Early Education Initiative have been leading a recent charge to improve data collection. Last month, they released their updated Federal Education Budget Project (FEBP), an impressive database of federal data on education spending and enrollment at the district level. For the first time, FEBP sought to provide pre-K spending data at the state and district level, but noted that many state-funded pre-K programs are not necessarily governed by the same district borders as K-12 schools. The policy brief accompanying FEBP’s release sums it up:

“Pre-K and kindergarten data at the local level are labyrinthine and disorganized, hampering any ability to craft policies for equitable access and funding. States must collect more complete and comparable data from school districts and CBOs if policymakers and the public are to understand the state of education for young children in their communities and states.”

So how do we improve early education data? Elementary, my dear Watson.

Improve Existing Information Collection

We don’t just need more data; we need more of the right data, presented in a clear and timely way. The U.S. Census Bureau asks about pre-K participation in its American Community Survey. However, this question suffers from several methodological shortcomings. It relies on parent reporting of participation rather than data from the schools. It also counts children ages 3 to 4 enrolled in nursery school or preschool during the previous two months, so it may include children who do not remain enrolled for most of the year while excluding children who were enrolled earlier or who enroll later. Ultimately, as acknowledged by Alex Holt at the New America Foundation, this question “is so convoluted that we consider the data from it to be effectively useless. Even at the federal level, the U.S. government has no idea how many children are enrolled in pre-K.” Likewise, the information collected on prekindergarten enrollment by the National Center for Education Statistics through its schools survey includes only those children served in programs operated in public elementary schools, without differentiating between 3- and 4-year-olds. Across all data sets, there is considerable uncertainty about whether we can accurately identify all classroom participation regardless of the name attached (child care, special education, state pre-K, local public school, Head Start, private preschool, etc.), and even more uncertainty about whether we can identify types of programs. Even separating public and private is difficult because of ambiguities. (For example, many state pre-K programs are operated by private providers, and even Head Start providers are mostly private non-profits.) So, information is widely available to researchers, but it may not answer the questions they’re asking.

Develop Comprehensive Data Systems

The Early Childhood Data Collaborative advocates for coordinated longitudinal early childhood data systems, which are state efforts to collect data that track children’s progress from early childhood onward. Its 10 fundamentals of data systems seek to improve data collection and allow stakeholders to link information both longitudinally and to other key programs, while ensuring that the system is well managed, secure, and protective of privacy. Its recent brief on the states that addressed longitudinal data systems in their Race to the Top – Early Learning Challenge (RTT-ELC) applications highlighted important trends, including filling gaps in current data (such as information on the workforce) and collaborating across early childhood education systems and agencies.

The very inclusion of data systems as an optional component of RTT-ELC indicates that the need for data has been elevated to a place of importance within the federal government, and we hope it drives continued collaboration among states to improve their current systems.

Fund Quality Data Collection Efforts

Finally, as a field, we need to continue supporting high-quality research collecting data on policies within early childhood education programs. In a piece at The Huffington Post, Lisa Guernsey writes in support of NIEER’s State Preschool Yearbooks, noting:

“The idea behind the yearbooks, Barnett said, was ‘to create an archived data set that would be consistent across the states.’ By making the information available to all, he explained, reporters and policymakers who wanted data would not have to call all 50 states, ‘and state officials could provide information that was comparable to what was provided by the state next door.’ NIEER … sought to halt the spread of misinformation about which states were offering good pre-K programs and enrolling high numbers of children, and which ones weren’t.”

Since the Pew Charitable Trusts ended their 10-year investment in the Yearbooks, NIEER has been seeking a new funder for what has become one of the most well-respected, well-cited data sources on American early education. Guernsey refers to the times before the Yearbook as “the dark ages,” and it’s hard to imagine going back to a time without it, without media coverage from CBS and NBC and the support of the U.S. Secretary of Education. We’ve seen tremendous growth not only in media attention on pre-K but also in state-funded pre-K itself: by the 2010-2011 year, nine more pre-K programs were available than in the 2001-2002 year, and quality standards have increased overall even as the Great Recession has worn away at program funds.

Our annual Yearbook publication is a true labor of love, one we’re proud to produce each year, and we’re overwhelmed by the positive response of the early childhood community in supporting and sharing our work. Yet, even this work only covers one of the major segments of the field. We need good data to make the right decisions for early education and the future of America’s students. Only by supporting, collecting, analyzing, and sharing this information with the field will we be able to live up to this advice from the esteemed Detective Holmes: “No, no: I never guess. It is a shocking habit, destructive to the logical faculty.”

– Megan Carolan, Policy Research Coordinator, NIEER