Describe your journey into computing from your youth up to the present. What foundational lessons did you learn from this journey? Why were you initially attracted to multimedia?
I literally grew up with computers all around me. I was born in a little town raised around the headquarters of Olivetti, one of the biggest tech companies of the last century: becoming a computer geek, in that place, at that time, was easier than usual! I have always been fascinated by the power of visuals and music to convey ideas. I loved to learn about history and the world through songs and movies. How to merge my love for computers with my passion for the audiovisual arts? I enrolled in Media Engineering studies, where, aside from the traditional Computer Engineering knowledge, I had the chance to learn more about media history and design. The main message? Multidisciplinarity is key. We cannot design intelligent multimedia technologies without deeply understanding how a medium is created, perceived and distributed.
Talking about multidisciplinarity, what do you think is the current state of multidisciplinarity in the multimedia community?
My impression is that, due to the inherent multimodality of our research, our community has developed a natural ability to blend techniques and theories from various domains. I believe we can push the boundaries of this multidisciplinarity even further. I am thinking, for example, of the MM subcommunity interested in mining subjective attributes from data, such as mood, sentiment, or beauty. I believe such research works could benefit enormously from a collaboration between MM scientists and domain experts in psychology, cognitive science, visual perception, or visual arts.
Tell us more about your vision and objectives behind your current roles. What do you hope to accomplish and how will you bring this about?
My dream is to make multimedia science even more useful for society and for collective growth. Multimedia data allows people to easily absorb and communicate knowledge, without language barriers. Producing and generating audiovisual content has never been easier: today, the potential of multimedia for learning and sharing human knowledge is unprecedented! Intelligent multimedia systems could be put in place to support editor communities in making free online encyclopedias like Wikipedia or collaborative knowledge bases like Wikidata more “visual” – and therefore less tied to individual languages. By doing so, we could increase the possibility for people around the world to freely access the sum of all knowledge.
I like your approach of making something useful for society. What do you think about the criticism that multimedia research is too applied?
For me, high-quality research means creative research, where ‘creative’ means ‘new and valuable’. The coexistence of breadth and depth in Multimedia makes it possible to create novel and useful applied research works, which makes them, to me, as interesting and inspiring as more theoretical research works.
Can you profile your current research, its challenges, opportunities, and implications?
I work on responsible multimedia algorithms. I love building machines that can classify audiovisual and textual data according to subjective properties – for example, the informativeness of an image with respect to a topic, its epistemic value, the beauty of a photo, the creative degree of a video. Given the inherently subjective nature of these algorithms, one of the main challenges of my research is to make such models responsible, namely:
1) Diversity-Aware, i.e. reflecting the real subjective perception of people with different cultural backgrounds; this is key to empowering specific cultures and to designing AI that grows diversified content and fills the knowledge gaps in online knowledge repositories.
2) Interpretable and Unbiased, namely not only able to classify content, but also able to say why the content was classified in a certain way (so that we can detect algorithmic bias). Such powerful algorithms can be used to study the visual preferences of users of web and social media platforms, and retrieve interesting content accordingly.
Do you think that one day we will have algorithms that truly understand human perception of beauty and art? Or will it always be dependent on the data?
Philosophers have been trying for centuries to understand the true nature of aesthetic perception. In general, I do not believe in absolute truths. And I am not really confident that algorithms will be able to become great philosophers anytime soon.
How would you describe the role of women especially in the field of multimedia?
The role of women in multimedia is the role of any researcher in their scientific community: contribute to scientific development, push the boundaries of what is known, doubt the widely accepted notions, make this world a better place (no pressure!). Maintaining diversity (any kind of diversity – including gender, expertise, race, age) in the scientific discourse is crucial: as opposed to a single mono-culture, a diverse community gathers, elaborates and combines different perspectives, thus forcing a collective creative process of exchange and growth, which is essential to scientific development.
Do you think that female researchers are well represented in the multimedia community? For example, there was no female keynote speaker at ACM MM 2017.
I am not sure about the numbers, so I can’t say for sure what the percentage of women and non-binary persons in the multimedia community is. But I am sure that percentage is greater than 0. When filling positions of high visibility such as keynotes or committee members, we should always keep in mind that one of our tasks is to inspire younger generations. Generations of young, brilliant, beautifully diverse researchers.
How would you describe your top innovative achievements in terms of the problems you were trying to solve, your solutions, and the impact it has today and in the future?
Since my early days in multimedia, when we were retrieving video shots of airplanes, until today, when we classify creative videos or interesting pictures, I would say that the main contribution of my research has been to “break the boundaries”.
We broke the scientific field boundaries. We designed multimedia algorithms inspired by the visual arts and psychology; we collaborated with experts from philosophy, media history, sociology; and we could deliver creative, interdisciplinary research works which would contribute to the advancement of multimedia and all the fields involved.
We broke the social network boundaries, with models able to quantify the intrinsic quality of images in a photo sharing platform. Furthermore, we showed that popularity-driven mechanisms, typical of social networks, fail to promote high-quality content, and that only content-based quality assessment tools could restore meritocracy in online media platforms.
We broke the cultural boundaries: together with an amazing multi-cultural research team, we were able to design computer vision models that can adapt to different cultures and language communities. While the effectiveness of our approaches and the scientific growth is per se a main achievement, the publications resulting from this collaborative effort reached the top-level Computer Vision, Multimedia and Social media conferences (with a best paper award at ICWSM and a multimodal best paper award at ICMR) and our work was featured by a number of tech journals and in a TEDx presentation. Together with other scientists, we also started a number of initiatives to gather people from different communities who are interested in this area: a special session at ICMR 2017, a workshop at MM 2017, one at CVPR 2018, and a special issue of ACM TOMM.
What are in your opinion the future topics in multimedia? Where is the community strong, and where could it improve or increase focus?
My feeling is that we should re-discover and empower the ‘multi-’ness of our research field.
I think the beauty of multimedia research is the ability to tell compelling multimodal stories from signals of very diverse nature, with a focus on the positive experience of the user. We are able to process multiple sources of information and use them, for example, to generate multi-sensorial artistic compositions, expose interesting findings about users and their behavior in multiple modalities, or provide tools to explore and align multimodal information, allowing easier knowledge absorption. We should not forget the diversity of modalities we are able to process (e.g. music or social signals, or traditional image data), the types of attributes we can draw from these modalities (e.g. sentiment or appeal, or more binary semantic labels), and the variety of application scenarios we can imagine for our research works (e.g. arts, photography, cooking, or more consolidated use cases, such as image search or retrieval). And we should encourage emerging topics and applications towards these ‘multi-nesses’.
Beyond multidisciplinarity and multiple modalities, I would also hope to see more multi-cultural research works: given the beautifully diverse world we are part of, I believe multimedia research works and applications should model and take into account the multiple points of view, diverse perceptual responses, as well as the cultural and language differences of users around the world.
Over your distinguished career, what are your top lessons you want to share with the audience?
I am not sure if this is a real lesson, more something I deeply believe in. Stereotypes kill ideas. Stereotyping others (colleagues, friends) might make communication, brainstorming, or collective problem solving much harder, because it somehow influences the importance given to other people’s ideas. Also, stereotyping oneself and one’s limits might constrain the possibilities and narrow one’s view on the shapes of possible future paths.
How was it to have a sister working in the same field of research? Is it motivation or pressure? Did you collaborate on some topics?
In one word: inspiring. We never officially collaborated in any research work. Unofficially, we’ve been ‘collaborating’ for 32 years. (Interview with Judith Redi)