An IBM-Illinois Discovery Accelerator Institute (IIDAI) grant was the precursor to the SLATE project. Godfrey explains, “We were fortunate to be supported by IBM IIDAI at an earlier stage of this work, exploring the intersection of networking and microservices. That project supported our work on prioritizing latency-sensitive requests in microservice systems. It also sparked a project on distributed tracing for microservices with our collaborators at IBM Research, Larisa Shwartz and Srinivasan Parthasarathy. All this put us in a good position to begin exploring the research space and develop a cohesive vision, which led to the NSF grant, which we’re very excited about.” The project also carries implications for the notoriously high electrical power usage of data centers. “Our project will improve resource allocation efficiency, with quick reaction to changes in demand, so that not as much overprovisioning is necessary. In the end, this means running a cloud application uses less compute resources and thus less power.”
“Microservice-based applications are a rapidly growing area,” Godfrey asserts. “While they are convenient for deploying and dynamically scaling applications, we think they need to be dramatically easier to optimize and manage. That vision is what we plan to pioneer and can have a big impact.”
The investigators and their collaborators have presented their work at industry conferences and academic workshops, including two talks at KubeCon in November 2023 and a talk at HotNets in December 2023. “We’re also exploring how to manage microservice application deployments so they are more reliable and secure,” says Godfrey. “Bingzhe Liu, who just completed a PhD, developed a system called Kivi to formally verify microservice clusters managed by the Kubernetes cluster orchestrator to ensure they are controlled correctly.” This work is with PhD student Gangmuk Lim, Godfrey, and collaborator Ryan Beckett of Microsoft Research and was presented at USENIX ATC 2024. Karuna Grewal, a PhD student at Cornell collaborating with the group, presented work on developing expressive safety policies for microservices at ACM HotNets 2023. That work is a collaboration among Grewal, Prof. Justin Hsu (also of Cornell), and Godfrey.
Godfrey adds, “One of the trickiest problems is understanding what happened when things go wrong – why is my application so slow?” He notes that in early August at “ACM SIGCOMM 2024 in Sydney – the top conference in computer networking – PhD student Sachin Ashok presented a system called TraceWeaver that helps trace what happened to each individual request across a distributed set of microservices without having to modify the application itself.” That work is with the Illinois team and collaborators from the IBM IIDAI project: Ashok, Vipul Harsh (PhD ’24), Godfrey, Mittal, Srinivasan Parthasarathy (IBM), and Larisa Shwartz (IBM).
The project will also help broaden participation in computing by mentoring Research Experiences for Undergraduates (REU) students, attracting applicants from underrepresented groups, and participating in the EECS Rising Stars workshops. Godfrey confirms, “We think this is a large space with more directions to explore—several other exciting projects are in the works with students in the group.”
Radhika Mittal and Rayadurgam Srikant are professors in the Department of Electrical and Computer Engineering