CS-led team pioneers improvement of microservice-based applications

8/12/2024 Bruce Adams

CS professors Brighten Godfrey, Radhika Mittal, and Rayadurgam Srikant are directing an NSF-funded project that has the potential to broadly benefit cloud-hosted applications with significant performance, resource utilization, and cost improvements.

Written by Bruce Adams

When we use a phone or laptop application such as messaging, maps, or video streaming, it is typically hosted in compute clouds. Important parts of its application logic, databases, and storage run on servers in large data centers at multiple sites across the country or the world. Many distinct software components perform dozens or hundreds of small tasks behind the scenes to collectively produce the web pages or streams we want. This method of building applications from separate components, which communicate and coordinate with each other, is known as a “microservice” architecture.

Such applications are complex and unpredictable. There are many moving parts, making microservices hard to optimize and manage. Communication among the distributed parts of an application can hurt performance and incur bandwidth fees. Controlling all the individual microservices dynamically can be error-prone. Consequently, computing resources and the energy associated with them are often reserved in inefficient and wasteful ways; 70% waste, or even more, is common.

A project team at the University of Illinois Urbana-Champaign Grainger College of Engineering received a $1.1 million NSF grant. Computing science professors Brighten Godfrey, Radhika Mittal, and Rayadurgam Srikant are directing the effort to build what they call Service Layer Traffic Engineering (SLATE). This project has the potential to broadly benefit cloud-hosted applications with significant performance, resource utilization, and cost improvements.

Godfrey says, “Modern cloud-based applications have become highly distributed, so when you use one service, it might result in communication among tens or even hundreds of separate components. In a way, there’s not only a network between you and the cloud application – there’s also a network ‘inside’ the application. So, we’re bringing techniques from the field of networking to the world of microservice-based applications to help optimize and manage them reliably.”

SLATE will extend existing open-source service meshes, which provide networking functionality for microservice-based applications. The project will develop techniques to prioritize requests automatically across bottlenecks spanning computing and networks. The project aims to design methods to route requests in real-time among possible servers, intelligently trading off latency, cost, bandwidth, and outlier considerations in multi-cluster environments. The project will develop a theoretically grounded approach to decompose the service layer traffic engineering problem into a decentralized design, with local decisions enabling fast reaction and just enough global coordination to achieve optimality.
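To make the routing trade-off concrete, here is a minimal sketch of choosing among candidate clusters by a weighted score. It is only an illustration and not SLATE's actual method: the `Cluster` fields, the weights, and the overload penalty are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    """Hypothetical view of one candidate cluster for a request."""
    name: str
    latency_ms: float   # estimated latency if the request is served here
    egress_cost: float  # assumed cross-cluster bandwidth fee per request
    load: float         # current utilization, from 0.0 to 1.0


def route_cost(c, latency_weight=1.0, cost_weight=100.0,
               overload_penalty=1000.0, load_limit=0.9):
    """Combine latency, bandwidth fee, and an overload penalty into one score.

    All weights are illustrative assumptions, not values from the project.
    """
    penalty = overload_penalty if c.load > load_limit else 0.0
    return latency_weight * c.latency_ms + cost_weight * c.egress_cost + penalty


def pick_cluster(clusters, **weights):
    """Return the cluster with the lowest combined score."""
    return min(clusters, key=lambda c: route_cost(c, **weights))


clusters = [
    Cluster("local", latency_ms=40.0, egress_cost=0.0, load=0.95),
    Cluster("nearby", latency_ms=55.0, egress_cost=0.02, load=0.50),
    Cluster("remote", latency_ms=120.0, egress_cost=0.05, load=0.20),
]

# The overloaded local cluster is skipped in favor of the nearby one.
choice = pick_cluster(clusters)
print(choice.name)
```

In this toy setup the local cluster is the fastest and cheapest option, but it is over its load limit, so the request goes to the nearby cluster instead. A real system would need to make such decisions continuously and in a decentralized way, which is the harder problem the project describes.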

A graphic with a computer, laptop, and mobile phone communicating with the cloud.
Photo Credit: University of Illinois / Grainger Engineering by Sachin Ashok
Microservice applications are composed of multiple individual components that run as separate processes on a distributed set of machines in compute clouds. They communicate with each other to deliver the desired functionality.

An IBM-Illinois Discovery Accelerator Institute (IIDAI) grant was the precursor to the SLATE project. Godfrey explains, “We were fortunate to be supported by IBM IIDAI at an earlier stage of this work, exploring the intersection of networking and microservices. That project supported our work on prioritizing latency-sensitive requests in microservice systems. It also sparked a project on distributed tracing for microservices with our collaborators at IBM Research, Larisa Shwartz and Srinivasan Parthasarathy. All this put us in a good position to begin exploring the research space and develop a cohesive vision, which led to the NSF grant, which we’re very excited about.”

The research also has implications for the notoriously high electrical power usage of data centers. “Our project will improve resource allocation efficiency, with quick reaction to changes in demand, so that not as much overprovisioning is necessary. In the end, this means running a cloud application uses less compute resources and thus less power.”

“Microservice-based applications are a rapidly growing area,” Godfrey asserts. “While they are convenient for deploying and dynamically scaling applications, we think they need to be dramatically easier to optimize and manage. That vision is what we plan to pioneer and can have a big impact.”

The investigators and their collaborators have presented their work at industry conferences and academic workshops, including two talks at KubeCon in November 2023 and a talk at HotNets in December 2023. “We’re also exploring how to manage microservice application deployments so they are more reliable and secure,” says Godfrey. “Bingzhe Liu, who just completed a PhD, developed a system called Kivi to formally verify microservice clusters managed by the Kubernetes cluster orchestrator to ensure they are controlled correctly.” This work is with PhD student Gangmuk Lim, Godfrey, and collaborator Ryan Beckett of Microsoft Research and was presented at USENIX ATC 2024. Karuna Grewal, a PhD student at Cornell collaborating with the group, presented work on developing expressive safety policies for microservices at ACM HotNets 2023. That work is a collaboration among Grewal, Prof. Justin Hsu, also of Cornell, and Godfrey.

Godfrey adds, “One of the trickiest problems is understanding what happened when things go wrong – why is my application so slow?” He notes that in early August at “ACM SIGCOMM 2024 in Sydney – the top conference in computer networking – PhD student Sachin Ashok presented a system called TraceWeaver that helps trace what happened to each individual request across a distributed set of microservices without having to modify the application itself.” That work is with the Illinois team and collaborators from the IBM IIDAI project: Ashok, Vipul Harsh (PhD ’24), Godfrey, Mittal, Srinivasan Parthasarathy (IBM), and Larisa Shwartz (IBM).

The project will also help broaden participation in computing by mentoring Research Experiences for Undergraduates (REU) students, attracting applicants from underrepresented groups, and participating in the EECS Rising Stars workshops. Godfrey confirms, “We think this is a large space with more directions to explore—several other exciting projects are in the works with students in the group.”


Radhika Mittal and Rayadurgam Srikant are professors in the Department of Electrical and Computer Engineering.



This story was published August 12, 2024.