Plant Phenomics First Author Insights (02/04/2020)
Chenyong Miao on "Semantic Segmentation of Sorghum using Hyperspectral Data Identifies Genetic Associations"
In today's Plant Phenomics First Author Insights, we invited Chenyong Miao to share his insights as the first author of "Semantic Segmentation of Sorghum using Hyperspectral Data Identifies Genetic Associations", published in Plant Phenomics.
Tell us a little about yourself: what is your current position, what is your educational background, and how did you get into science?
Currently, I am a graduate student working as a Research Assistant in Dr. Schnable's lab at the University of Nebraska-Lincoln (UNL).
I received my BA in Biotechnology from Henan University of Science & Technology (HUST) and my MA in Bioinformatics from Fujian Agriculture & Forestry University (FAFU), both in China. Then I came to the US to pursue my PhD in Dr. Schnable’s lab at UNL. Although I major in Agronomy here, I consider myself a computational biology guy.
I was born in a very small town in Henan Province, which is one of the most populous provinces and, I think, was also the biggest farming province in China at that time. I grew up seeing different crops, mainly wheat and rice, planted in different seasons every year. I love crops and the endless landscape of farmland in my hometown. I started getting into real scientific research when I was a senior undergraduate at HUST. I joined a research team focused on wheat breeding and was supervised by Dr. Chunping Wang, who is still actively dedicated to wheat breeding programs in China. During that period, I learned how to extract DNA from frozen leaf tissue and also realized how important statistical methods and computational biology tools are to a successful research program.
Outside my research work in the lab, I enjoy physical activities such as hiking, basketball, and soccer. I also like to watch football games, but I have never had a chance to play myself.
What was the significant issue(s) in your paper? Why did you and your team care about it?
A wide range of plant morphological traits, such as stalk length and harvest index (the ratio of grain mass to total plant mass at harvest), are of interest and use to plant breeders and plant biologists.
What was the problem(s) to be solved and your proposed solution?
However, these parameters are currently quantified using low-throughput, labor-intensive methods, which limits the feasibility of constructing models for large numbers of genotypes.
Here we explore the viability of using hyperspectral data to classify images of sorghum plants into separate organs at pixel-level resolution. Using individual pixel labels generated through the crowdsourcing platform Zooniverse, we show that leaves, stalks, and panicles have distinct spectral signatures. These signatures can be used for organ classification with supervised classification algorithms, and most of the algorithms we tested provided high classification accuracy on this problem. Semantic segmentation that distinguishes different plant organs increases the feasibility of computationally estimating many of these morphological traits.
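As a rough illustration of what pixel-level organ classification involves (this is not the pipeline from the paper), the sketch below labels every pixel of a tiny hyperspectral cube. Everything in it is an assumption made for the example: the four-band reflectance values are invented, the function names are hypothetical, and a simple nearest-centroid rule stands in for the supervised classifiers mentioned above.

```python
# Illustrative sketch only: invented data and hypothetical names,
# with a nearest-centroid rule standing in for a real classifier.
from collections import Counter


def train_centroids(spectra, labels):
    """Average the labeled spectra of each organ into one reference spectrum."""
    sums, counts = {}, Counter()
    for spectrum, label in zip(spectra, labels):
        acc = sums.setdefault(label, [0.0] * len(spectrum))
        for i, value in enumerate(spectrum):
            acc[i] += value
        counts[label] += 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def classify_pixel(spectrum, centroids):
    """Return the organ whose reference spectrum is closest to this pixel."""
    def squared_distance(label):
        return sum((a - b) ** 2 for a, b in zip(spectrum, centroids[label]))
    return min(centroids, key=squared_distance)


def segment(cube, centroids):
    """Label every pixel of a rows x cols x bands cube, giving a rows x cols map."""
    return [[classify_pixel(pixel, centroids) for pixel in row] for row in cube]


# Labeled training pixels (made-up reflectance values in four bands).
train_spectra = [
    [0.10, 0.60, 0.20, 0.80],
    [0.12, 0.58, 0.22, 0.78],
    [0.30, 0.40, 0.50, 0.50],
    [0.32, 0.42, 0.48, 0.52],
    [0.70, 0.30, 0.60, 0.20],
    [0.68, 0.28, 0.62, 0.22],
]
train_labels = ["leaf", "leaf", "stalk", "stalk", "panicle", "panicle"]
centroids = train_centroids(train_spectra, train_labels)

# A 2 x 2 pixel "image" with four bands per pixel.
cube = [
    [[0.11, 0.59, 0.21, 0.79], [0.69, 0.29, 0.61, 0.21]],
    [[0.31, 0.41, 0.49, 0.51], [0.10, 0.61, 0.19, 0.81]],
]
organ_map = segment(cube, centroids)

# Per-organ pixel counts: one simple quantity a segmented image makes available.
pixel_counts = Counter(label for row in organ_map for label in row)
print(organ_map)
print(pixel_counts)
```

Once each pixel carries an organ label, quantities such as the number of pixels per organ can be read directly off the label map, which is one way a segmentation could feed into trait estimates.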
What was the contribution(s) of this study and who could benefit from it?
In this study, the organ-level semantic segmentation data for a sorghum association population are used to conduct several genome-wide association studies (GWAS). The analysis recapitulates known genes controlling phenotypic variation for previously measured traits and also identifies trait-associated SNPs for novel traits. Overall, the data, methods, and pipeline introduced in this paper can aid further efforts to identify genes controlling variation in important morphological traits in sorghum and other grain crop species.
Are there any interesting stories behind the paper?
The inspiration for this semantic segmentation approach using hyperspectral data came from an example used in many introductory machine learning tutorials. The problem is to classify Iris flower types based on flower features such as petal and sepal dimensions. Many of you may know this example if you have started digging into machine learning online. When I first got my hands on the hyperspectral data, I was surprised that it has so many bands across a wide range of wavelengths, which means it captures more information about an object than normal RGB images with only three bands (red, green, and blue). Then I asked myself whether I could treat each band as a feature of a pixel, just as petal length is a feature of a flower, and treat the organ type the pixel belongs to as the label. With features and labels available, different models can be tested, such as Random Forest and Support Vector Machine. Surprisingly, it worked, and it finally resulted in this paper.
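The "bands as features, organs as labels" idea can be sketched in a few lines with scikit-learn. This is only an illustration under stated assumptions, not the code or data from the paper: the spectra are synthetic, the band count, noise level, and mean spectra are invented, and the model settings are library defaults rather than the settings used in the study.

```python
# Illustrative sketch only: synthetic spectra, invented parameters.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_bands = 50        # assumed number of spectral bands
n_per_organ = 200   # assumed number of labeled pixels per organ
organs = ["leaf", "stalk", "panicle"]

# Made-up mean reflectance curves, one per organ.
position = np.linspace(0.0, 1.0, n_bands)
mean_spectra = {
    "leaf": 0.5 + 0.3 * np.sin(2 * np.pi * position),
    "stalk": 0.5 + 0.3 * position,
    "panicle": 0.5 - 0.3 * position,
}

# Each row of X is one pixel and each column is one band (one feature),
# just as each row of the Iris table is one flower.
X = np.vstack([
    mean_spectra[organ] + rng.normal(0.0, 0.05, size=(n_per_organ, n_bands))
    for organ in organs
])
y = np.repeat(organs, n_per_organ)  # the organ label of each pixel

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

models = {
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Support Vector Machine": SVC(kernel="rbf"),
}

accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = model.score(X_test, y_test)  # accuracy on held-out pixels
    print(f"{name}: held-out accuracy = {accuracies[name]:.3f}")
```

Because the synthetic organs are deliberately well separated, both models should score highly here. Real hyperspectral pixels are noisier, so accuracy on real data would need to be checked against held-out labeled pixels.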