Why do we share posts on Facebook?
Are we seeking factual information, like the name of the plant taking over the front yard? Are we expressing frustration while seeking sympathy? Is it pure narcissism or narcissism by proxy, via our children? Is it bragging, or bragging’s sneaky cousin, humblebragging?
Or is it something worse?
Content with malicious intent presents a major problem for Facebook, which is searching for a way to rapidly identify and remove harmful posts, such as the livestreaming of the March 15, 2019, mass shooting in New Zealand, amid a volume of content too vast for humans to moderate.
Serge Belongie, professor of computer science at Cornell Tech, is studying what he calls “intentonomy” – the complex psycho-emotional landscape lurking behind Facebook and Instagram posts.
Belongie and his team are working with Facebook to define possible posting intentions – from benign to polarizing to hateful – and populate a dataset with examples. The goal is to create and train a machine learning system that can predict intent and, eventually, alert the social network about problematic posts in real time.
“Human nature and politics and tribal behavior, monetary incentives – there’s just a zillion things playing into this,” said Belongie, who received a $1.77 million, three-year grant from Facebook to work on projects related to identifying content with malicious intent. “The best we can do is provide tools so that if someone comes to the table with good faith, they can separate the information from the misinformation.”
In a separate project, Belongie’s team is working on machine-learning approaches to detecting forgeries. People who buy advertisements on Facebook must validate their accounts using identification; Belongie will use his expertise in computer vision – an area of artificial intelligence focused on teaching machines to see as humans do – to develop methods that could determine whether those IDs are fake.
“Conventional machine-learning approaches require you to have large training sets of real IDs, from every state, every year range, collected by a professional, and then you need a big volume of fake IDs,” Belongie said. “It’s very hard to get that kind of labeled data; there isn’t much of it.”
Instead, his approach will build on his group’s research into using computer vision to recognize fine-grained differences among plants, animals and mushrooms. A similar approach could be useful for finding tiny details revealing forged IDs, such as the wrong kind of comma or apostrophe.
“If somebody just gives me a bucket of data and most of it is correct, most of it is real, how do you find that needle in the haystack?” he said. “Our goal is anomaly detection – to find things that are out of place.”
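The "needle in the haystack" idea can be illustrated with a very small sketch. This is not Belongie's method, which the article does not describe in detail; it is a generic statistical stand-in that flags values lying far from the median of a mostly clean sample. The feature (a single per-ID measurement named `spacing`), the function names and the threshold are all assumptions made for illustration.

```python
from statistics import median


def robust_scores(values):
    """Score each value by its distance from the median, in units of the
    median absolute deviation (MAD). Illustrative only."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # Every typical value is identical; anything different is an outlier.
        return [0.0 if v == med else float("inf") for v in values]
    return [abs(v - med) / mad for v in values]


def find_anomalies(values, threshold=5.0):
    """Return the indices whose robust score exceeds the threshold.
    The threshold of 5.0 is an arbitrary assumption, not a tuned value."""
    scores = robust_scores(values)
    return [i for i, score in enumerate(scores) if score > threshold]


# Hypothetical measurements, one per ID (for example, the spacing around
# a punctuation mark). Most are close to 1.0; one is out of place.
spacing = [1.00, 1.02, 0.98, 1.01, 0.99, 1.60, 1.00, 1.03]
flagged = find_anomalies(spacing)
print(flagged)
```

Because the median and MAD are barely affected by a few extreme values, the sketch needs no labeled fake examples, only a sample in which most items are real. A real system would work on many learned visual features rather than one hand-picked number.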