Now in its third year, the Data Science for the Public Good Young Scholars Program engages undergraduate and graduate students to work on projects that address local and state government challenges. This summer, 12 students participated in the intensive training program, which was led by Iowa State University and ISU Extension and Outreach.
In September, three ISU graduate students (Harun Çelik, history; Romina Tafazzoli, community and regional planning; and Muskan Tantia, computer science) as well as Nayha Hussain, junior in computer science and economics at Clemson University, will present their team projects at the Learning and Doing Data for Good Conference at the University of Washington.
“We live in a world of data that takes on a variety of forms and often requires a diverse set of tools and technologies to allow us to analyze and visualize the information,” said Christopher J. Seeger, professor of landscape architecture and director of ISU Extension and Outreach’s Indicator Program.
“Just as important, however, is that we have a diverse set of educational backgrounds and experiences to help work through the data. This year, the 12 DSPG students represented nine different majors. The DSPG program brings together a mix of skill sets and disciplines and roots them around core data science principles to help build a future workforce that is prepared to take on new challenges in agriculture, economic development, health, education, community planning and society in general.”
The 2022 students’ final presentations can be viewed on the DSPG website, which also includes projects from previous years.
Students worked on the following projects this summer:
Successful employment for Iowans with disabilities
Iowa’s policymakers, advocates, and grant-seekers often struggle to find and access data on disabilities. This makes it difficult to fully understand the complexities surrounding Iowans with disabilities and their success in employment.
To tackle data accessibility, students worked to discover, profile and present data on successful employment for Iowans with disabilities. The work is intended to serve as a foundation for connecting various employment data sources by assessing the public services available to Iowans with disabilities.
The student team, led by Çelik:
- Identified the counties that were above or below the state average for employment for persons with a disability.
- Analyzed median earnings by disability type at the county level and found a negative relationship between median earnings and the self-care disability type, and a positive relationship between median earnings and the hearing disability type.
- Analyzed 13 years of county-level Iowa Vocational Rehabilitation Services data and found that individuals who gained employment at the end of their case averaged a $9-$10 hourly wage increase, a result that was consistent across the period.
- Created a complete listing of all the data sources that would be useful in evaluating disability in Iowa (focused on demographics and employment) at the county and regional level.
Exploring measures to analyze local housing needs
Local housing decision-makers have three key concerns: availability, affordability and accessibility of housing for diverse demographic groups.
They need reliable and current data to help set housing policy priorities, but this information is often difficult to find in one place. The process can also be complicated by time and budget limits, inflexible tabulation formats, and data sources that lack detail or timeliness.
The student team explored how much and what types of data are available to describe local housing markets and analyzed data sets from a wide variety of housing-related sources. Students developed a dashboard to provide a summary of the findings and visuals of housing-related indicators.
The student team, led by Tafazzoli:
- Found that rural counties in southern Iowa appear to have a higher incidence of homes that have been vacant 36 months or longer and that long-term vacancy may be associated with homes having physical/structural issues that make them less desirable.
- Encountered several instances in which two or more different data sources measuring similar types of housing activity did not correlate as closely as expected, suggesting the importance of seeking multiple sources to characterize local housing markets.
- Found that housing data for small communities and rural areas are both less available and less reliable than data for metropolitan areas.
Wholesale local food price benchmarking
Students developed a data platform that can be used to aggregate localized, up-to-date benchmarks on product pricing in retail and wholesale spaces. The project responded to a need identified by the Iowa State Farm, Food and Enterprise Development team, which is frequently asked for this information. Many specialty crop producers across Iowa operate in direct-to-consumer retail spaces, making these data a necessity.
The student team, led by Tantia:
- Used Google Trends to identify the top five commodities to focus on in this study and determine the temporal interest in these products over the past five years as a basis for comparison in price fluctuations.
- Reviewed data currently available from the AMS and USDA (including the Agricultural Census) and compared USDA price data with price data from local sources, grocery stores and food hubs.
- Developed a demonstration web-scraping scheduler that automates data collection from end to end.
- Forecasted prices using weather and economic factors and identified the variables that most affect prices.
Three more DSPG student teams worked on the following projects:
- Evaluating community transportation data: Students identified best practices to manage and evaluate survey results and transcripts of focus group discussions collected over multiple years from several communities engaged in transportation projects. The project challenged students to consider how changes over time to surveys or public participation may affect how a broad data set can be analyzed, and to recommend changes that will improve future analysis.
- Interactive commodity reports for agricultural marketing: Students created customizable agricultural marketing commodity reports that can be used as preliminary research to determine current production, market analysis, demographic data and price points when applying for grants, financial institution loans and other types of funding. Reports include demographic data (population, race, ethnicity, family structure, income), food data (food deserts, food security, market kinds – schools, restaurants), interactive visualizations and improved geographic filters to offer national- and state-level data that can be downloaded as a PDF.
- Beginning farmer asset mapping: Students created an interactive dashboard for beginning farmers to see updated, localized information on specialty crops, soil information and climate data that can help them make the right choices for their farming practices.