Kaiyu Guan Charts the Course from Blue Waters to Delta

10/25/2022 NCSA

Guan is a researcher with lofty goals – he hopes to monitor, model and ultimately optimize every farmland.

Written by NCSA

Kaiyu Guan is the Blue Waters Associate Professor in ecohydrology and remote sensing in the Department of Natural Resources and Environmental Sciences (NRES), College of Agricultural, Consumer and Environmental Sciences (ACES), Department of Computer Science, and at the National Center for Supercomputing Applications (NCSA).

Kaiyu Guan is a researcher with lofty goals – he hopes to monitor, model and ultimately optimize every farmland, and he aims to do so in the coming decade or so. His mission is to help create tools that let farmers check on and manage their crops – every single field – in real time to maintain a healthy and productive growth cycle. But simply reaching that goal isn’t enough. Guan also hopes to achieve co-sustainability of environmental quality and food security. It’s quite the task, and he’s been using the supercomputing resources at NCSA to tackle the issues surrounding both aspects of his mission, one piece of research at a time. He’s also in the unique position of being one of the researchers at UIUC who has experience using both NCSA’s retired supercomputer, Blue Waters, and its new cutting-edge GPU-processing resource, Delta.

Blue Waters, the Supercomputer That Could

Blue Waters had a long and storied history at NCSA. In 2013, when Blue Waters came online, it was the fastest supercomputer ever built and it remained a powerhouse in continuous operation until 2021. Scientists and researchers made great use of its processing power, helping capture the first image of a black hole, modeling galactic evolution and counting every tree in the entire West African drylands. In fact, more than 39 billion core hours were used by scientists and researchers around the country.

Researchers on campus also used Blue Waters over the years, discovering what can be achieved with the help of a supercomputer. Guan is just one of these researchers. A Blue Waters specialist from the University of Illinois and NCSA, Guan knows supercomputing resources are essential to the type of research he and his team do at the Department of Natural Resources and Environmental Sciences (NRES).

NCSA Blue Waters Computation Facility

One scientist’s journey with Blue Waters

Guan has been with UIUC since 2016. His specialty in environmental science and the interdisciplinary nexus of plants, water and nutrients made him a perfect fit for NRES. He was also a Blue Waters professor and has worked extensively with that supercomputer.

The research group that Guan leads at UIUC focuses on advancing the science and developing solutions for sensing, modeling, and managing agroecosystem productivity and environmental sustainability. What his group has been able to achieve with Blue Waters shows one of the many ways supercomputers aid in a variety of research. In Guan’s case, his work with sensing and modeling stands out.

Researchers working on sensing are taking vast amounts of data and training the computer to find something specific within that data. It’s like training a computer to examine all the straw in a haystack and find the needles. In the case of counting all the trees in the West African drylands, the researchers needed to train Blue Waters on what a tree looked like on a high-definition satellite image. Training the computer required them to examine satellite images and their spectral signals, and identify the characteristics of a tree – maybe it’s so many pixels wide and has a certain shape and color. After many examples are fed to the computer, it starts to recognize the patterns that repeat in each sample and can start to identify trees on its own within a certain margin of error. The supercomputer can “sense” what a tree is from an image.

Guan’s team has used various sensing technologies to make a number of breakthroughs. One such project was to evaluate the amount of marginal land available for bioenergy crops. Each year, farmers decide what they’ll grow the following year. A number of factors are involved in this process, and it’s a lot more complicated than it sounds. Trending prices, subsidies and weather projections are all considered each year before planting. Marginal land, meaning farmland without the best yields, also comes into consideration. Not all land is created equal, as anyone who has tried to keep a home garden can attest. The back corner of your yard might be perfect for tomatoes, but maybe the area along the garage is too wet and shady. Farmland is a scaled-up version of your home garden. Some land isn’t worth farming unless there’s a strong need, and food-producing plants often require high-quality land to grow the excellent yields needed to meet consumer demand. However, bioenergy crops don’t need the best land to grow on, so knowing how much marginal land is available is important, especially if there’s a concern about the two types of crops competing for space.

Guan’s team worked on a study led by Chongya Jiang, University of Illinois Research Scientist at the Center for Advanced Bioenergy and Bioproducts Innovation (CABBI), published in Environmental Science and Technology. What they discovered is that there is far less marginal land than was previously thought. Guan’s team taught Blue Waters to recognize the identifying characteristics of marginal land, and the supercomputer analyzed satellite data to determine whether the land was actually marginal.

“We turned to satellite data,” said Guan, project lead of the published paper. “Our team looked at which satellite pixels were permanently cropland, which were permanently grasslands, and which were frequently changing from one use to another. The pixels that were frequently changing are more likely candidates for economically marginal land.”

These huge datasets are exactly the kind of data that high-performance computing is good with. “We have processed nine billion pixels for each of the eight years of satellite data. It is a huge amount of work that only a supercomputer – the Blue Waters Supercomputer in our case – can enable us to do,” Jiang said.
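The pixel-by-pixel logic Guan describes can be illustrated with a minimal sketch. This is not the team's actual code: the function name, the label strings and the two-change threshold for "frequently changing" are all assumptions made for illustration, and the real study worked on billions of pixels rather than a handful of lists.

```python
def classify_pixel(yearly_labels, min_changes=2):
    """Classify one pixel from its land-cover label in each year.

    yearly_labels: list such as ["crop", "grass", ...], one entry per year.
    min_changes:   hypothetical threshold; a pixel that switches use at
                   least this many times counts as "frequently changing".
    """
    # Count how many times the label differs from the previous year.
    changes = sum(
        1 for prev, curr in zip(yearly_labels, yearly_labels[1:]) if prev != curr
    )
    if changes == 0:
        # Same use in every year: permanently cropland or grassland.
        if yearly_labels[0] == "crop":
            return "permanent_cropland"
        return "permanent_grassland"
    if changes >= min_changes:
        # Candidate for economically marginal land.
        return "frequently_changing"
    return "single_transition"


# Eight years of made-up labels for three illustrative pixels.
pixels = {
    "a": ["crop"] * 8,
    "b": ["grass"] * 8,
    "c": ["crop", "grass", "crop", "grass", "crop", "crop", "grass", "grass"],
}
summary = {name: classify_pixel(labels) for name, labels in pixels.items()}
```

Because each pixel is classified independently of every other pixel, the same small function can be applied to billions of pixels at once, which is the kind of workload a supercomputer handles well.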

Sensing is only one of Guan’s concentrations. His work also relies on modeling made possible by supercomputers. Guan’s team uses ecosystem models, which are a complex mathematical representation of an ecosystem. Imagine you could take all the variables that affect the growth cycle of plants and activities of soil microbes – water, sunlight, soil nutrients, management practices – and create mathematical formulas to capture all of them within a computer program that will simulate crop growth in any farmland. You could then adjust the variables – more water, different crop rotation, reduced fertilizer – and see how that affects the plant and soil microbes. 
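To make the idea of "adjusting the variables" concrete, here is a deliberately tiny toy model. It is not ecosys or anything Guan's team uses: the class, the function names and every number are hypothetical, and it assumes a simple "most limiting factor" rule in which daily growth is capped by whichever input is scarcest.

```python
from dataclasses import dataclass


@dataclass
class GrowingConditions:
    # Each value is a made-up availability score between 0.0 and 1.0.
    water: float
    sunlight: float
    nutrients: float


def limiting_factor(conditions):
    """Return the scarcest input, which caps growth in this toy model."""
    return min(conditions.water, conditions.sunlight, conditions.nutrients)


def simulate_biomass(conditions, days=100, max_daily_growth=1.0):
    """Accumulate biomass day by day under fixed conditions."""
    biomass = 0.0
    for _ in range(days):
        biomass += max_daily_growth * limiting_factor(conditions)
    return biomass


# Compare a baseline scenario with one that uses less fertilizer.
baseline = simulate_biomass(GrowingConditions(water=0.9, sunlight=0.8, nutrients=0.7))
less_fertilizer = simulate_biomass(GrowingConditions(water=0.9, sunlight=0.8, nutrients=0.4))
```

In this sketch, cutting the nutrient score lowers the simulated biomass because nutrients become the limiting input. Real agroecosystem models track far more processes and how they interact, which is why they need so much computing power.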

When scientists like Guan model, they do so on a much more intricate and accurate scale. Guan’s model aims to capture the aboveground and belowground dynamics from a single plant to millions of farmlands in the U.S. Midwest, and ultimately globally. These kinds of models can’t be done without intensive computer processing to keep track of all the formulas, variables and other data needed to create the model accurately. This is where Blue Waters came into play.

To understand the breadth of how Guan’s work contributes to agricultural sustainability and what some of those variables are in his models, it’s important to know what a plant’s function is in the ecosystem, starting with the basics. Carbon is the main building block for all known life. We have yet to discover any lifeform that doesn’t use carbon. But carbon is also part of carbon dioxide (CO2), which is emitted by using fossil fuels and causing global warming. It’s important to remember that plants photosynthesize to consume carbon dioxide and draw carbon from the atmosphere to the land. Guan’s team works with agriculture, so their work to improve crop growth and increase soil carbon storage also effectively improves the environment by helping remove carbon dioxide from the air.

By measuring carbon, Guan’s team can predict with great accuracy how much a crop will yield. While most carbon is found in soil, the amount can fluctuate, which concerns scientists like Guan. Lower soil organic carbon (SOC) can lead to lower crop yields. Growing crops has its own strain on the environment, so the goal is to grow more in less area. Growing cover crops on the land during the fallow season also draws more CO2 from the atmosphere to the land. Making sure carbon isn’t depleted in soil is one of the ways farming can remain sustainable.

One of Guan's goals is to grow more crops in less area, as growing crops has its own strain on the environment. 

To measure how much SOC is used up by agriculture, Guan’s team leveraged an advanced agroecosystem model called ecosys. This is the most comprehensive model of the agroecosystem to date and can accurately simulate the energy, water, carbon and nutrients flowing in and out of the agroecosystem. Guan’s SMARTFARM Project team was able to constrain the ecosys model even more to get highly precise data about photosynthesis, and they did all their measurements from satellite data.

“What really makes our modeling solution exciting,” Guan said, “is that we use the most advanced observations from satellites to constrain a powerful agroecosystem model, and we demonstrate that this can achieve the highest performance in estimating different carbon components. This is like creating a weather forecasting capability, but for predicting the carbon budget of every farm in the landscape.”

Guan’s team published a paper on the results of their work in Agricultural and Forest Meteorology and has leveraged their research in several more projects and papers.

“This is state of the art for quantifying carbon budget and carbon credit,” Guan said. “We want to show people what is possible and set a high standard going forward. We let rigorous science speak for itself. I believe that’s the most powerful way to say things as scientists.”

Blue Waters was essential for the work Guan was doing. For supercomputers, the nine years Blue Waters was in operation is more than an era. Predicted to be in operation for five years, its usefulness exceeded the planned lifecycle by five years through careful management and support of NCSA and the National Science Foundation (NSF). But advancements in technology push ever onward. Processors are faster than ever before, more efficient at using power, easier to cool and maintain, and there are high-performance computational options that didn’t exist in 2007 when the idea and planning for Blue Waters began. We can’t diminish or take away the results from Blue Waters’ computing resources – they are many and the research using Blue Waters will continue to have a lasting impact long after Blue Waters’ retirement. However, we can look forward and usher in the era of a new computing resource housed at NCSA, with the expectation that Delta will facilitate its own great achievements and discoveries.

Delta’s time to shine

Delta is more than just a new supercomputer; it’s one of the largest NSF-funded supercomputers. While its 132 CPU nodes let Delta perform all of the same functions as most other supercomputers, under the hood Delta differs from most NSF-funded systems in how it balances performance. Delta is a Graphics Processing Unit (GPU) heavy computing resource. With 206 GPU nodes, it has 848 modern NVIDIA A40 and A100 GPUs doing the work.

The Delta system at NCSA’s National Petascale Computing Facility.

Perhaps you’ve heard about GPUs through your gaming friends and their talk of framerates and textures, or even in reference to mining cryptocurrencies. But GPUs offer significant research advantages when certain types of calculations come into play. GPUs happen to be exceptional at performing the same, simple task in parallel over large data sets. If a research project can be broken up into small, repetitive tasks, this is where the GPU is going to shine over the CPU, which can do a lot more complex tasks, but can only do relatively few of them at a time. Because a GPU has an abundance of cores, thousands compared to the CPU’s tens, it can do a lot of simple tasks simultaneously, making it extremely efficient when working on science simulations and machine learning.

Delta is currently the most performant GPU processing resource in NSF’s portfolio, but there was still some work to do before it was announced as the newest resource to join ACCESS (Advanced Cyberinfrastructure Coordination Ecosystem: Services & Support). To prepare, local researchers were invited in to test its capabilities. With his expertise working with Blue Waters, Guan had the knowledge and experience to test the full capabilities of Delta.

He’s been using Delta for several months now, and he’s personally experienced how beneficial it is to keep NCSA’s cyberinfrastructure updated with the latest technologies. We asked him about this experience with NCSA’s newest resource, and how it compared to Blue Waters.

"Delta plays critical roles in my group’s research, as we are primarily computational scientists using remote sensing, process-based modeling with HPC for our work. We have made significant inclusion of GPU-based supercomputing, as we are heavily leveraging Artificial Intelligence and Machine Learning technologies to advance our research," Guan said.

Guan’s team has fully incorporated Delta into their research agenda already. The most recent example of work on Delta builds on Guan’s mission of sustainable farming. The SMARTFARM program Guan is a part of was awarded a $4.5 million grant from the U.S. Department of Energy’s Advanced Research Projects Agency-Energy (ARPA-E), along with a $2.1 million grant from FFAR for a project on innovating airborne hyperspectral sensing to quantify agricultural practices and carbon in crop and soil. The team has been using this funding to find better ways to quantify carbon outcomes, to help farmers engage more fully in the emerging voluntary carbon market and to improve their nutrient management.

“We are developing novel technology to enable a thriving market-based solution to promote agricultural sustainability. Our developed technology will enable a carbon credit market, which will incentivize farmers to adopt management practices that benefit the soil, the environment, and then reward themselves. For example, if a certain practice sequesters more carbon, farmers can reduce their carbon emissions and earn financial rewards. These credits can be collected and then used in the open market to offset others toward meeting lower carbon-emission goals,” said Guan.

The SMARTFARM project Guan’s team is working on is called SYMFONI, which refers to a holistic solution to quantify farm-level carbon credit based on sensing and modeling. The team uses Delta to create rapid analysis of a farm to determine the carbon intensity of each field. They’ve future-proofed their system by using a framework flexible enough to incorporate new sensor technologies as they become available, technologies that Guan’s team is likely to develop utilizing Delta’s capabilities.

"As we are moving our research to integrate more Artificial Intelligence, our work has been evolving from only CPU-heavy to be both GPU- and CPU-heavy. Delta provides the perfect platform to enable this transformation. We are working hard to develop the first generation of a digital twin of agroecosystem in Delta in the coming few years, which is an unprecedented task that we believe would revolutionize how we optimize every farmland on the planet," Guan said.

Guan has found working with Delta to be streamlined, thanks to his extensive prior experience with Blue Waters, but Delta also brings advances of its own. The interface designed for Delta was created with accessibility in mind, making it easier for researchers with a range of supercomputing experience to work with it. NCSA’s hope is that Delta is broadly used, and that scientists who may have never used a supercomputer will find it a friendly experience. “We’ve found Delta to be very useful and easy to work with,” Guan said. “Because of that we are using it for more applications than we did with Blue Waters.”

Guan’s work is evidence that Delta is full of potential. Just as there have been advancements in supercomputing since Blue Waters was built, research has also continued to grow and take advantage of new ways to analyze and process data. More researchers are beginning to incorporate supercomputers into their workflows. Guan’s story is just the first of many Delta will be involved with in the years to come.


Read the original article from NCSA.


This story was published October 25, 2022.