11/16/2016 Laura Schmitt
Passionate about applying research results to the real world, CS @ ILLINOIS faculty member Kevin Chen-Chuan Chang co-founded Cazoodle based on technology developed in his lab nearly 10 years ago. Cazoodle creates new and better search engines—like the online funding search and recommendation service GrantForward used by more than 200 universities and research institutions—by enabling deep data-aware vertical web searching that can crawl and transform unstructured HTML content into structured databases.
According to Chang, the real world is not only a playground to apply research results but also a great source for inspiring more research. Through his entrepreneurial experience, Chang identified several research problems that have led to new research funding. “In one instance, we wanted to see how the world responds to Cazoodle via social media, but it’s just too hard to try to crawl, filter, and process the social media to get the relevant signals,” he explained. “There ought to be an easier way.”
Recently, Chang received a three-year, $500,000 National Science Foundation (NSF) grant to pursue research on making it easy to listen to the social universe. He and his students will develop SocialSense, a social media data platform that may fulfill the promise of gathering deep meaning and actionable information from the vast and noisy social universe. By initially focusing on Twitter users and messages, SocialSense will provide new insights for market research, emergency management, political analysis, finance, and science.
“Billions of messages are generated, but you can only search Tweets by keywords,” said Chang, noting that such a search results in a list of individual Tweets. “In order to exploit the social universe, you really want to see a big picture of what people think about [a topic], and a single Tweet isn’t useful.”
According to Chang, SocialSense will focus on two underexplored research issues: discovery and profiling. Discovery involves figuring out which users and Tweets to listen to out of the 300 million users and the 500 million messages generated each day. Once SocialSense figures out who to listen to, the system will search for commonalities among the message senders. This profiling aspect of the research aims to answer questions like ‘what types of users like iPhone 7’ or ‘who supports the Obamacare policy.’
In many instances, users don’t provide demographic data about themselves, nor do they use Twitter’s GPS tags in their Tweets, so Chang’s system will have to algorithmically find and infer those and other relevant attributes by identifying social motifs, or metagraphs, which form patterns of similarity among users.
In order to test the effectiveness of SocialSense, Chang will work with researchers from the Advanced Digital Science Center in Singapore, who will help implement the software framework with that country’s ambitious Smart Nation project. Together, they’ll be able to address issues like why people in certain districts of the city-state don’t use public transportation.
Chang is also developing algorithms that address content-centric social discovery, that is, identifying only those Tweets that are relevant to a specified topic. His approach, which incorporates Bayesian techniques, will hinge on a content graph that captures how Tweets, queries, and their templates interrelate in an iterative reasoning process.
Chang will test the effectiveness of SocialSense’s content-centric discovery through a collaboration with Professor Shaowen Wang at the Cyber Geographic Information Science and Systems (CyberGIS) at Illinois. The two will use Tweets to create enhanced social maps that not only identify points of interest, but also relay information about the people who live there. For example, by collecting Tweets from Siebel Center, they can discover that there are young people in the building who care about coding.
Another research problem that Chang identified from his business experience revolved around how difficult it was to use a database for Cazoodle’s enterprise management system. Chang improvised an invoice and payment solution for his company by programming a spreadsheet. He then parlayed this experience into a second NSF research grant worth $1.8 million to bring spreadsheets and databases together for interactive big data management.