Report from QoE-Management 2019

The 3rd International Workshop on Quality of Experience Management (QoE-Management 2019) was a successful full-day event held on February 18, 2019 in Paris, France, where it was co-located with the 22nd Conference on Innovation in Clouds, Internet and Networks (ICIN). After the success of the previous QoE-Management workshops, the third edition was again endorsed by the QoE and Networking Initiative (http://qoe.community). It was organized by workshop co-chairs Michael Seufert (AIT Austrian Institute of Technology, Austria, now at University of Würzburg, Germany), Lea Skorin-Kapov (University of Zagreb, Croatia), and Luigi Atzori (University of Cagliari, Italy). The workshop attracted 24 full paper and 3 short paper submissions. The Technical Program Committee consisted of 33 experts in the field of QoE management, who provided at least three reviews per submitted paper. Eventually, 12 full papers and 1 short paper were accepted for publication, giving an acceptance rate of 48%.

On the day of the workshop, the co-chairs welcomed 30 participants. The workshop started with a keynote given by Martín Varela (callstats.io, Finland), who elaborated on “Some things we might have missed along the way”. He presented open technical and business-related research challenges for the QoE management community, which he supported with examples from his current research on QoE monitoring of WebRTC video conferencing. Afterwards, the first two technical sessions focused on video streaming. Susanna Schwarzmann (TU Berlin, Germany) presented a discrete-time analysis approach to compute QoE-relevant metrics for adaptive video streaming. Michael Seufert (AIT Austrian Institute of Technology, Austria) reported the results of an empirical comparison, which did not find any differences in QoE between QUIC- and TCP-based video streaming for naïve end users. Anika Schwind (University of Würzburg, Germany) discussed the impact of virtualization on video streaming behavior in measurement studies. Maria Torres Vega (Ghent University, Belgium) presented a probabilistic approach for QoE assessment based on the user’s gaze in 360° video streams viewed with head-mounted displays. Finally, Tatsuya Otoshi (Osaka University, Japan) outlined how recommendation methods based on quantum decision making could be implemented for adaptive video streaming.

The next session was centered around machine learning-based quality prediction. Pedro Casas (AIT Austrian Institute of Technology, Austria) presented a stream-based machine learning approach for detecting stalling in real time from encrypted video traffic. Simone Porcu (University of Cagliari, Italy) reported on the results of a study investigating the potential of predicting QoE from facial expressions and gaze direction for video streaming services. Belmoukadam Othmane (Côte d’Azur University & INRIA Sophia Antipolis, France) introduced ACQUA, a lightweight platform for network monitoring and QoE forecasting from mobile devices. After the lunch break, Dario Rossi (Huawei, France) gave the second keynote, entitled “Human in the QoE loop (aka the Wolf in Sheep’s clothing)”. Using Web browsing as his main leitmotif, he showed relevant practical examples to discuss the challenges of QoE-driven network management and of data-driven QoE models based on machine learning.

The following technical session focused on resource allocation. Tobias Hoßfeld (University of Würzburg, Germany) elaborated on the interplay between QoE, user behavior, and system blocking in QoE management. Lea Skorin-Kapov (University of Zagreb, Croatia) presented studies on QoE-aware resource allocation for multiple cloud gaming users sharing a bottleneck link. Quality monitoring was the topic of the last technical session. Tomas Boros (Slovak University of Technology, Slovakia) reported on how video streaming QoE could be improved by 5G network orchestration. Alessandro Floris (University of Cagliari, Italy) talked about the value of influence factor data for QoE-aware management. Finally, Antoine Saverimoutou (Orange, France) presented WebView, a measurement platform for web browsing QoE. The workshop co-chairs closed the day with a short recap and thanked all speakers and participants for the fruitful discussions. To summarize, the third edition of the QoE Management workshop proved very successful, bringing together researchers from both academia and industry to discuss emerging concepts and challenges related to managing QoE for network services. As the workshop has proven to foster active collaborations in the research community over the past years, a fourth edition is planned for 2020.

We would like to thank all the authors, reviewers, and attendees for their valuable contributions towards the successful organization of the workshop!

Michael Seufert, Lea Skorin-Kapov, Luigi Atzori
QoE-Management 2019 Workshop Co-Chairs

Report from ACM ICMR 2018 – by Cathal Gurrin

 

Multimedia computing, indexing, and retrieval continue to be among the most exciting and fastest-growing research areas in the field of multimedia technology. ACM ICMR is the premier international conference that brings together experts and practitioners in the field for an annual conference. The eighth ACM International Conference on Multimedia Retrieval (ACM ICMR 2018) took place from June 11th to 14th, 2018 in Yokohama, Japan’s second most populous city. ACM ICMR 2018 featured a diverse range of activities, including keynote talks, demonstrations, special sessions and related workshops, a panel, a doctoral symposium, industrial talks, and tutorials, alongside regular conference papers in oral and poster sessions. The full ICMR 2018 schedule can be found on the ICMR 2018 website <http://www.icmr2018.org/>. The organisers of ACM ICMR 2018 placed a large emphasis on generating a high-quality programme; in 2018, ICMR received 179 submissions to the main conference, with 21 accepted for oral presentation and 23 for poster presentation. A number of key themes emerged from the published papers at the conference: deep neural networks for content annotation; multimodal event detection and summarisation; novel multimedia applications; multimodal indexing and retrieval; and video retrieval from regular & social media sources. In addition, a strong emphasis on the user (in terms of end-user applications and user-predictive models) was noticeable throughout the ICMR 2018 programme. Indeed, the user theme was central to many of the components of the conference, from the panel discussion to the keynotes, workshops and special sessions. One of the most memorable elements of ICMR 2018 was a panel discussion on the ‘Top Five Problems in Multimedia Retrieval’ <http://www.icmr2018.org/program_panel.html>.
The panel was composed of leading figures in the multimedia retrieval space: Tat-Seng Chua (National University of Singapore), Michael Houle (National Institute of Informatics), Ramesh Jain (University of California, Irvine), Nicu Sebe (University of Trento) and Rainer Lienhart (University of Augsburg). An engaging panel discussion was facilitated by Chong-Wah Ngo (City University of Hong Kong) and Vincent Oria (New Jersey Institute of Technology). The common theme was that multimedia retrieval is a hard challenge and that there are a number of fundamental topics in which we need to make progress, including bridging the semantic and user gaps, improving approaches to multimodal content fusion, neural network learning, addressing the challenge of processing at scale, and the so-called “curse of dimensionality”. ICMR 2018 included two excellent keynote talks <http://www.icmr2018.org/program_keynote.html>. Firstly, Kohji Mitani, the Deputy Director of the Science & Technology Research Laboratories of NHK (Japan Broadcasting Corporation), explained the ongoing evolution of broadcast technology and the efforts underway to create new (connected) broadcast services that can provide viewing experiences never before imagined and user experiences more attuned to daily life. The second keynote, from Shunji Yamanaka of The University of Tokyo, discussed his experience of prototyping new user technologies and highlighted the importance of prototyping as a process that bridges an ever-increasing gap between advanced technological solutions and societal users. During this entertaining and inspiring talk, many prototypes developed in Yamanaka’s lab were introduced and the related vision explained to an eager audience. Three workshops were accepted for ACM ICMR 2018, covering the fields of lifelogging, art and real-estate technologies.
Interestingly, all three workshops focused on domain-specific applications in three emerging fields for multimedia analytics, all related to users and the user experience. The “LSC2018 – Lifelog Search Challenge” workshop <http://lsc.dcu.ie/2018/> was a novel and highly entertaining workshop modelled on the successful Video Browser Showdown series of participation workshops at the annual MMM conference. LSC was a participation workshop: each participating team wrote a paper describing a prototype interactive retrieval system for multimodal lifelog data, which was then evaluated in a live interactive search challenge during the workshop. Six prototype systems took part in the search challenge in front of an audience that reached fifty conference attendees. This was a popular and exciting workshop and could become a regular feature at future ICMR conferences. The second workshop was the MM-Art & ACM workshop <http://www.attractiveness-computing.org/mmart_acm2018/index.html>, a joint workshop that merged two existing workshops, the International Workshop on Multimedia Artworks Analysis (MMArt) and the International Workshop on Attractiveness Computing in Multimedia (ACM). The aim of the joint workshop was to enlarge the scope of the discussion and inspire more work in related fields. The papers at the workshop focused on the creation, editing and retrieval of art-related multimedia content. The third workshop was RETech 2018 <https://sites.google.com/view/multimedia-for-retech/>, the first international workshop on multimedia for real estate tech. In recent years there has been a huge uptake of multimedia processing and retrieval technologies in this domain, but many challenges remain, such as the quality, cost, sensitivity, diversity, and attractiveness of content to users.
In addition, ICMR 2018 included three tutorials <http://www.icmr2018.org/program_tutorial.html> on topical areas for the multimedia retrieval communities. The first was ‘Objects, Relationships and Context in Visual Data’ by Hanwang Zhang and Qianru Sun. The second was ‘Recommendation Technologies for Multimedia Content’ by Xiangnan He, Hanwang Zhang and Tat-Seng Chua, and the final tutorial was ‘Multimedia Content Understanding by Learning from Very Few Examples’ by Guo-Jun Qi. All tutorials were well received and feedback was very good. Other aspects of note from ICMR 2018 were a doctoral symposium that attracted five authors and a dedicated industrial session with four industrial talks highlighting the multimedia retrieval challenges faced by industry. It was interesting to hear from the industrial talks how the analytics and retrieval technologies developed over the years and presented at venues such as ICMR are actually being deployed in real-world user applications by large organisations such as NEC and Hitachi. It is always a good idea to listen to the real-world applications of the research carried out by our community. The best paper session at ICMR 2018 had four top-ranked works covering multimodal, audio and text retrieval. The best paper award went to ‘Learning Joint Embedding with Multimodal Cues for Cross-Modal Video-Text Retrieval’ by Niluthpol Mithun, Juncheng Li, Florian Metze and Amit Roy-Chowdhury. The Best Multi-Modal Paper Award winner was ‘Cross-Modal Retrieval Using Deep De-correlated Subspace Ranking Hashing’ by Kevin Joslyn, Kai Li and Kien Hua. In addition, there were awards for best poster, ‘PatternNet: Visual Pattern Mining with Deep Neural Network’ by Hongzhi Li, Joseph Ellis, Lei Zhang and Shih-Fu Chang, and best demo, ‘Dynamic construction and manipulation of hierarchical quartic image graphs’ by Nico Hezel and Kai Uwe Barthel.
Finally, although often overlooked, six reviewers were commended for their outstanding reviews: Liqiang Nie, John Kender, Yasushi Makihara, Pascal Mettes, Jianquan Liu, and Yusuke Matsui. As with some other ACM-sponsored conferences, ACM ICMR 2018 included an award for the most active social media commentator, which is how I ended up writing this report. There were a number of active social media commentators at ICMR 2018, each of whom provided a valuable commentary on the proceedings and added to the historical archive.

Of course, the social side of a conference can be as important as the science. ICMR 2018 included two main social events, a welcome reception and the conference banquet. The welcome reception took place at the Fisherman’s Market, an Asian and ethnic dining experience with a wide selection of Japanese food available. The Conference Banquet took place in the Hotel New Grand, which was built in 1927 and has a long history of attracting famous guests. The venue is famed for the quality of the food and the spectacular panoramic views of the port of Yokohama. As with the rest of the conference, the banquet food was top-class with more than one of the attendees commenting that the Japanese beef on offer was the best they had ever tasted.

ICMR 2018 was an exciting and excellently organised conference, and it is important to acknowledge the efforts of the general co-chairs: Kiyoharu Aizawa (The Univ. of Tokyo), Michael Lew (Leiden Univ.) and Shin’ichi Satoh (National Inst. of Informatics). They were ably assisted by the TPC co-chairs, Benoit Huet (Eurecom), Qi Tian (Univ. of Texas at San Antonio) and Keiji Yanai (The Univ. of Electro-Communications), who coordinated the reviews from a 111-person program committee in a double-blind manner, with an average of 3.8 reviews prepared for every paper. ICMR 2019 will take place in Ottawa, Canada in June 2019 and ICMR 2020 will take place in Dublin, Ireland in June 2020. I hope to see you all there, continuing the tradition of excellent ICMR conferences.

The Lifelog Search Challenge Workshop attracted six teams for a real-time public interactive search competition.


Shunji Yamanaka about to begin his keynote talk on Prototyping


Kiyoharu Aizawa and Shin'ichi Satoh, two of the ICMR 2018 General co-Chairs welcoming attendees to the ICMR 2018 Banquet at the historical Hotel New Grand.


SISAP 2018: 11th International Conference on Similarity Search and Applications

The International Conference on Similarity Search and Applications (SISAP) is an annual forum for researchers and application developers in the area of similarity data management. It aims at the technological problems shared by numerous application domains, such as data mining, information retrieval, multimedia, computer vision, pattern recognition, computational biology, geography, biometrics, machine learning, and many others that make use of similarity search as a necessary supporting service.

From its roots as a regional workshop in metric indexing, SISAP has expanded to become the only international conference entirely devoted to the issues surrounding the theory, design, analysis, practice, and application of content-based and feature-based similarity search. The SISAP initiative has also created a repository serving the similarity search community, for the exchange of examples of real-world applications, the source code for similarity indexes, and experimental testbeds and benchmark data sets (http://www.sisap.org). The proceedings of SISAP are published by Springer as a volume in the Lecture Notes in Computer Science (LNCS) series.

The 2018 edition of SISAP was held at the Universidad de Ingeniería y Tecnología (UTEC), in one of the oldest neighborhoods of Lima, in a modern building only recently inaugurated. The conference was held back-to-back (with one shared session) with the International Symposium on String Processing and Information Retrieval (SPIRE), an independent symposium whose scope partly overlaps with SISAP’s. The organization was smooth, and a strong technical program was assembled by the two co-chairs and sixty program committee members. Each paper was reviewed by at least three referees. The program was completed with three invited speakers of high caliber.

During this 11th edition of SISAP, the first invited speaker was Hanan Samet (http://www.cs.umd.edu/~hjs/) from the University of Maryland, a pioneer in the similarity search field with several books published on the subject. Professor Samet presented a state-of-the-art news search system that uses the geographical location of the user to deliver more accurate results. The second invited speaker was Alistair Moffat (https://people.eng.unimelb.edu.au/ammoffat/) from the University of Melbourne, who delivered a talk about a novel technique for building compressed indexes using Asymmetric Numeral Systems (ANS). ANS is a curious case of a scientific breakthrough never published in a peer-reviewed venue: although it is available only as an arXiv technical report, it is widely used in industry, with adoption spanning companies from Google and Facebook to Amazon. The third keynote talk was delivered in the shared session with SPIRE by Moshe Vardi (https://www.cs.rice.edu/~vardi/) of Rice University, a celebrated editor of Communications of the ACM. Professor Vardi’s talk was an eye-opening discussion of jobs conquered by machines and of our prospects for accepting technological change in everyday life. In the same shared session, a SPIRE keynote was given by Nataša Przulj (http://www0.cs.ucl.ac.uk/staff/natasa/) of University College London, concerning molecular networks and the challenges researchers face in developing a better understanding of them. It is worth noting that roughly 10% of the SPIRE participants were inspired to attend the SISAP technical program.

As is usually the case, SISAP 2018 featured a program with papers exploring various similarity-aware data analysis and processing problems from multiple perspectives. The papers presented in 2018 studied the role of similarity processing in the context of metric search, visual search, nearest neighbor queries, clustering, outlier detection, and graph analysis. Some of the papers had a theoretical emphasis, while others took a systems perspective, presenting experimental evaluations against state-of-the-art methods. An interesting event at the 2018 conference, as at the two previous editions, was a poster session that included all accepted papers. This component of the conference generated many lively interactions between presenters and attendees, who could not only learn more about the presented techniques but also identify potential topics for future collaboration.

A shortlist for the Best Paper Award was created from those conference papers nominated by at least one of their three reviewers. An award committee of three researchers ranked the shortlisted papers, and the final ranking was decided using a Borda count. The Best Paper Award was presented during the conference dinner. In a tradition that began with the 2009 conference in Prague, extended versions of the top-ranked papers were invited for a special issue of the Information Systems journal.
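The Borda count used for the final ranking is simple to sketch. The following Python is an illustrative example only; the ballot data and paper names are hypothetical, not the actual committee rankings:

```python
def borda_ranking(ballots):
    """Aggregate ranked ballots with a Borda count.

    Each ballot lists candidates from most to least preferred; a
    candidate in position i on a ballot of n candidates receives
    n - 1 - i points. Candidates are ranked by total points.
    """
    scores = {}
    for ballot in ballots:
        n = len(ballot)
        for position, candidate in enumerate(ballot):
            scores[candidate] = scores.get(candidate, 0) + (n - 1 - position)
    # Highest total score first.
    return sorted(scores, key=lambda c: scores[c], reverse=True)

# Three (hypothetical) committee members each rank three shortlisted papers.
ballots = [
    ["paper_A", "paper_B", "paper_C"],
    ["paper_A", "paper_C", "paper_B"],
    ["paper_B", "paper_A", "paper_C"],
]
# paper_A: 2+2+1 = 5, paper_B: 1+0+2 = 3, paper_C: 0+1+0 = 1
print(borda_ranking(ballots))  # ['paper_A', 'paper_B', 'paper_C']
```

The method rewards papers that are ranked consistently well across all ballots, rather than only those ranked first by a majority.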

The venue and the location of SISAP 2018 deserve a special mention. In addition to the excellent conference facilities at UTEC, we had many student volunteers who were ready to help ensure that the logistical aspects of the conference ran smoothly. Lima was a superb location for the conference. Our conference dinner was held at the Huaca Pucllana Restaurant, located on the site of amazing archaeological remains within the city itself. We also had many opportunities to enjoy excellently-prepared traditional Peruvian food and drink. Before and after the conference, many participants chose to visit Machu Picchu, voted as one of the New Seven Wonders of the World.

SISAP 2018 demonstrated that the SISAP community has a strong, stable kernel of researchers who are active in the field of similarity search and committed to fostering the growth of the community. Organizing SISAP is a smooth experience thanks to the support of the Steering Committee and dedicated participants.

SISAP 2019 will be organized in Newark (NJ, USA) by Professor Vincent Oria (NJIT). This attractive location in the New York City metropolitan area will allow for easy and convenient travel to and from the conference. One of the major challenges of the SISAP conference series is to continue to raise its profile in the landscape of scientific events related to information indexing, database and search systems.

Figure 1. The conference dinner at Pachacamac ruins


Figure 2. After the very interesting technical sessions, we ended the conference with an excursion to downtown Lima


Figure 3. Keynote by Vardi


Interview with Dr. Magda Ek Zarki and Dr. De-Yu Chen: winners of the Best MMsys’18 Workshop paper award

Abstract

The ACM Multimedia Systems conference (MMSys’18) was recently held in Amsterdam from 12-15 June 2018. The conference brings together researchers in multimedia systems. Four workshops were co-located with MMSys, namely PV’18, NOSSDAV’18, MMVE’18, and NetGames’18. In this column we interview Magda El Zarki and De-Yu Chen, the authors of the best workshop paper, entitled “Improving the Quality of 3D Immersive Interactive Cloud-Based Services Over Unreliable Network”, which was presented at MMVE’18.

Introduction

The ACM Multimedia Systems Conference (MMSys) (mmsys2018.org) was held from 12-15 June in Amsterdam, The Netherlands. The MMSys conference provides a forum for researchers to present and share their latest research findings in multimedia systems. MMSys is a venue for researchers who explore complete multimedia systems that provide a new kind of multimedia experience or whose overall performance improves on the state of the art. This touches on aspects of many hot topics, including but not limited to: adaptive streaming, games, virtual reality, augmented reality, mixed reality, 3D video, Ultra-HD, HDR, immersive systems, plenoptics, 360° video, multimedia IoT, multi- and many-core, GPGPUs, mobile multimedia and 5G, wearable multimedia, P2P, cloud-based multimedia, cyber-physical systems, multi-sensory experiences, smart cities, and QoE.

Four workshops were co-located with MMSys in Amsterdam in June 2018. The paper titled “Improving the Quality of 3D Immersive Interactive Cloud-Based Services Over Unreliable Network” by De-Yu Chen and Magda El-Zarki from University of California, Irvine was awarded the Comcast Best Workshop Paper Award for MMSys 2018, chosen from among papers from the following workshops: 

  • MMVE’18 (10th International Workshop on Immersive Mixed and Virtual Environment Systems)
  • NetGames’18 (16th Annual Workshop on Network and Systems Support for Games)
  • NOSSDAV’18 (28th ACM SIGMM Workshop on Network and Operating Systems Support for Digital Audio and Video)
  • PV’18 (23rd Packet Video Workshop)

We approached the authors of the best workshop paper to learn about the research leading up to their paper. 

Could you please give a short summary of the paper that won the MMSys 2018 best workshop paper award?

In this paper we discussed our approach to an adaptive 3D cloud gaming framework. We utilized a collaborative rendering technique to generate part of the content on the client, thereby reducing the network bandwidth required for streaming. We also made use of progressive meshes so that the system can dynamically adapt to changing performance requirements and resource availability, including network bandwidth and computing capacity. We conducted experiments focused on system performance under unreliable network connections, e.g., when packets can be lost. Our experimental results show that the proposed framework is more resilient under such conditions, which indicates that the approach has a potential advantage, especially for mobile applications.
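As a rough, purely illustrative sketch of this kind of adaptation (this is not the authors’ implementation; the detail levels, per-frame sizes, and function name below are all hypothetical), a client could select the highest progressive-mesh detail level whose per-frame payload fits the currently available bandwidth:

```python
def select_lod(levels, bandwidth_bps, frame_rate):
    """Pick the highest detail level whose per-frame payload fits the budget.

    `levels` maps a detail level (higher = finer mesh) to an estimated
    payload size in bytes per frame. All figures are hypothetical and
    only illustrate the bandwidth-adaptation idea.
    """
    budget = bandwidth_bps / 8 / frame_rate  # bytes available per frame
    feasible = [lvl for lvl, size in levels.items() if size <= budget]
    # Fall back to the coarsest level if nothing fits the budget.
    return max(feasible) if feasible else min(levels)

# Hypothetical per-frame sizes (bytes) for four progressive-mesh levels.
levels = {0: 2_000, 1: 8_000, 2: 30_000, 3: 120_000}
print(select_lod(levels, bandwidth_bps=5_000_000, frame_rate=30))
```

A real system would additionally weigh client GPU capacity and smooth level switches over time, but the core trade-off (finer mesh versus bandwidth) is the one the answer above describes.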

Does the work presented in the paper form part of some bigger research question / research project? If so, could you perhaps give some detail about the broader research that is being conducted?

A more complete discussion of the proposed framework can be found in our technical report, “Improving the Quality and Efficiency of 3D Immersive Interactive Cloud Based Services by Providing an Adaptive Application Framework for Better Service Provisioning”, where we discussed the performance trade-offs between video quality, network bandwidth, and local computation on the client. In this report, we also tried to tackle network latency issues by utilizing the 3D image warping technique. In another paper, “Impact of information buffering on a flexible cloud gaming system”, we further explored the potential performance improvement of our latency reduction approach when more information can be cached and processed.

We received many valuable suggestions and identified a few important future directions. Unfortunately, De-Yu graduated and decided to pursue a career in industry, so he is unlikely to be able to continue working on this project in the near future.

Where do you see the impact of your research? What do you hope to accomplish?

Cloud gaming is an up-and-coming area. Major players like Microsoft and NVIDIA have already launched their own projects. However, it seems to me that there is not yet a solution good enough to be widely accepted by users. By providing an alternative approach, we wanted to demonstrate that there are still many unsolved issues and research opportunities, and hopefully to inspire further work in this area.

Describe your journey into the multimedia research. Why were you initially attracted to multimedia?

De-Yu: My research interest in cloud gaming systems dates back to 2013, when I worked as a research assistant at Academia Sinica, Taiwan. When I first joined Dr. Kuan-Ta Chen’s lab, my background was in parallel and distributed computing. I joined the lab for a project aimed at providing a tool that helps developers do load balancing in massively multiplayer online video games. Later on, I had the opportunity to participate in the lab’s other project, GamingAnywhere, which aimed to build the world’s first open-source cloud gaming system. Being an enthusiastic gamer myself, having the opportunity to work on such a project was a really enjoyable and valuable experience, and it became the main reason I continued to work in this area.

Magda El Zarki: I have worked in multimedia research since the 1980s, when my PhD project involved the transmission of data, voice and video over a LAN. The network was named MAGNET and was one of the first integrated LANs developed for multimedia transmission. My work continued in that direction with the transmission of video over IP. In conjunction with several PhD students over the past 20-30 years, I have developed several tools for the study of video transmission over IP (MPEGTool) and hold several patents related to video over wireless networks. All the work focused on improving the quality of the video via pre- and post-processing of the signal.

Can you profile your current research, its challenges, opportunities, and implications?

There are quite a few challenges in our research. First of all, our approach is an intrusive method: we need to modify the source code of the interactive applications, e.g. games, to apply it. We found it very hard to find a suitable open-source game whose source code is neat, clean, and easy to modify, and developing our own fully functioning game is not reasonable given the complexity involved. We ended up building a 3D virtual environment walkthrough application to demonstrate our idea. Most reviewers have expressed concerns about synchronization issues in a real interactive game, where there may be AI-controlled objects, non-deterministic processes, or even objects controlled by other players. We agree with the reviewers that this is a very important issue, but currently it is very hard for us to address it with our limited resources. Most of the other research work in this area faces a similar problem to ours: the lack of a viable open-source game for researchers to modify. As a result, researchers are forced to build their own prototype applications for performance evaluation purposes. This brings about another challenge: it is very hard to fairly compare the performance of different approaches, given that we all use different applications for testing. However, these difficulties can also be seen as opportunities. There are still many unsolved problems. Some of them may require a lot of time, effort, and resources, but even a little progress can mean a lot, since cloud gaming is an area gaining more and more attention from industry as a way to distribute games across many platforms.

“3D immersive and interactive services” seems to encompass both massive multi-user online games as well as augmented and virtual reality. What do you see as important problems for these fields? How can multimedia researchers help to address these problems?

When it comes to gaming or similar interactive applications, it all comes down to the user experience. In the case of cloud gaming, there are many performance metrics that can affect user experience, and identifying what matters most to users is one of the important problems. In my opinion, interaction latency is the most difficult of these to address. There is no trivial way to reduce network latency unless you are willing to pay the cost of large bandwidth pipes. Edge computing may effectively reduce network latency, but it comes with a high deployment cost.

As large companies start developing their own systems, it is getting harder and harder for independent researchers with limited funding and resources to make major contributions in this area. Still, we believe there are a couple of ways independent researchers can make a difference. First, we can limit the scope of the research by simplifying the system, focusing on just one or a few features or components. Unlike corporations, independent researchers usually do not have the resources to build a fully functional system, but we also do not have the obligation to deliver one. That actually enables us to try out interesting but less immediately realistic ideas. Second, be open to collaboration. Unlike corporations, which need to keep their projects confidential, we have more freedom to share what we are doing and can potentially get more feedback from others. To sum up, I believe that in an area that has already attracted a lot of interest from industry, researchers should try to find something that companies cannot or are not willing to do, instead of trying to compete with them.

If you were conducting this interview, what questions would you ask, and then what would be your answers?

The real question is: is cloud gaming viable? It seems to make economic sense, as companies try to reach a broader and more remote audience. However, computing costs are cheaper than bandwidth costs, so maybe throwing computing power at the problem makes more sense: build more powerful end devices that can handle the computing load of a complex game and use the network only for player interactivity.

Biographies of MMSys’18 Best Workshop Paper Authors

Prof Magda El Zarki (Professor, University of California, Irvine):


Prof. El Zarki’s lab focuses on multimedia transmission over the Internet. The work consists of both theoretical studies and practical implementations that test algorithms and new mechanisms to improve quality of service on the user device. Both wireline and wireless networks and all types of video and audio media are considered. Recent work has shifted to networked games and massively multi-user virtual environments (MMUVE). The focus is mostly on studying the quality of experience of players in applications where precision and time constraints are a major concern for game playability. A new effort also focuses on the development of games and virtual experiences in the areas of education and digital heritage.

De-Yu Chen (PhD candidate, University of California, Irvine):


De-Yu Chen is a PhD candidate at UC Irvine. He received his M.S. in Computer Science from National Taiwan University in 2009, and his B.B.A. in Business Administration from National Taiwan University in 2006. His research interests include multimedia systems, computer graphics, big data analytics and visualization, parallel and distributed computing, and cloud computing. His most recent research project focuses on improving the quality and flexibility of cloud gaming systems.

Report from ACM MMSYS 2018 – by Gwendal Simon

While I was attending the MMSys conference last June in Amsterdam, I tweeted about my personal highlights of the conference, in the hope of sharing them with those who did not have the opportunity to attend. Fortunately, I was chosen as “Best Social Media Reporter” of the conference, a new award given by the ACM SIGMM chapter to promote sharing among researchers on social networks. To celebrate this award, here is a more complete report on the conference!

When I first heard that this year’s edition of MMSys would be attended by around 200 people, I was a bit concerned about whether the event would maintain its signature atmosphere. It was not long before I realized that, fortunately, it would. The core group of researchers who were instrumental in the take-off of the conference in the early 2010s is still present, and these scientists continue to be sincerely happy to meet new researchers, to chat about the latest trends in the fast-evolving world of online multimedia, and to make sure everybody feels comfortable talking with each other.


I attended my first MMSys in 2012 in North Carolina. Although I had not even submitted a paper to MMSys’12, I decided to attend because the short welcoming text on the website was astonishingly aligned with my own feelings about the academic research world. I rarely read the usually boring and unpassionate conference welcoming texts, but on that particular day I took the time to read this particular MMSys text, and it changed my research career. Before 2012, I felt like one lost researcher among thousands of others, whose only motivation was to publish more papers, whatever the cost. I used to publish sometimes in networking venues, sometimes in systems venues, sometimes in multimedia venues… My output was quite inconsistent, and my experiences attending conferences were not especially exciting.

The MMSys community matches my expectations for several reasons:

  • The size of a typical MMSys conference is human: when you meet someone on the first day, you will surely meet that person again the next day.
  • Informal chat groups are diverse. I have the feeling that anybody can feel comfortable enough to chat with any other attendee regardless of gender, nationality, or seniority.
  • A responsible vision of what an academic event should be. The community is not into showing off in luxury resorts, but rather promotes affordable conferences in standard venues while maximizing fun and interaction. This sometimes comes at the cost of organizing the conference in university facilities (which necessarily means much more work for organizers and volunteers), but the social events have never been neglected.
  • People share a set of “values” in their research activities.

This last point is of course the most significant aspect of MMSys. The main idea behind this conference is that multimedia services are not only about media but also about networks, systems, and experiences. This commitment to a holistic vision of multimedia systems has at least two consequences. First, the typical contributions discussed at this conference have both theoretical and experimental parts, and, to be accepted, papers have to find the right balance between the two sides of the problem. It is definitely challenging, but it brings passionate researchers to the conference. Second, the line between industry and academia is very porous. As a matter of fact, many core MMSys researchers are either (past or current) employees of corporate research centers or involved in standards groups and industrial forums. The presence of people involved in the design of products nurtures the academic debates.

As MMSys grows significantly year after year, I was curious to see whether these “values” would remain. Fortunately, they do. The growing reputation has not changed the spirit.


The 2018 edition of the MMSys conference was held on the campus of CWI, near downtown Amsterdam. Thanks to the impressive efforts of all the volunteers and local organizers, the event went smoothly in the modern facilities near the University of Amsterdam. As can be expected of a conference in the Netherlands, especially in June, biking to the conference was obviously the best way to commute every morning from anywhere in Amsterdam.

The program contained a fairly high number of inspiring talks, which altogether reflected the “style” of MMSys. We got a mix of entertaining, technological, industry-oriented talks discussing the state of the art and beyond. The two main conference keynotes were given by stellar researchers (who unsurprisingly have had bright careers in both academia and industry) on the two hottest topics of the conference. First, Philip Chou (8i Labs) introduced holograms. Phil kind of lives in the future, somewhere five years ahead of now, and from there he was kind enough to give us a glimpse of the anticipatory technologies that will be developed between our now and his. Undoubtedly everybody will remember his flash-forwarding talk. Then Nuria Oliver (Vodafone) discussed the opportunities of combining IoT and multimedia in a talk that was powerful and energizing. The conference also featured so-called overview talks, in which expert researchers present the state of the art in areas that have been especially in the spotlight in recent months. The topics this year were 360-degree videos, 5G networks, and per-title video encoding, with experts from Tiledmedia, Netflix, Huawei, and the University of Illinois. With such a program, MMSys attendees had the opportunity to catch up on everything they may have missed during the past couple of years.


The MMSys conference also has a long history of commitment to open source and demonstrations. This year’s conference was a peak, with an astonishing 45% of papers awarded a reproducibility badge, which means that the authors of these papers agreed to share their datasets and code and to make sure that their work can be reproduced by other researchers. I am not aware of any other conference reaching such a ratio of reproducible papers. MMSys is all about sharing, and this reproducibility ratio demonstrates that MMSys researchers see their peers as cooperating researchers rather than competitors.

 

My personal highlights go to two papers. The first is a work by researchers from UT Dallas and Mobiweb, which shows a novel, efficient approach to generating human models (skeletal poses) with a regular Kinect. This paper is a sign that Augmented Reality and Virtual Reality will soon be populated by user-generated content: not only synthesized 3D models but also digital captures of real humans. The road toward easy integration of avatars in multimedia scenes is being paved, and this work is a good example of it. The second work I would like to highlight in this column is by researchers from Université Côte d’Azur. The paper deals with head movements in 360-degree videos, but instead of trying to predict movements, the authors propose to edit the content to guide user attention so that head movements are reduced. The approach, which is validated by a real prototype and source code sharing, comes from a multi-disciplinary collaboration with designers, engineers, and human interaction experts. Such multi-disciplinary work is also largely encouraged at MMSys conferences.


Finally, MMSys is also a full event with several associated workshops. This year, Packet Video (PV) was held with MMSys for the very first time, and it was successful in terms of attendance. Fortunately, PV did not interfere with NOSSDAV, which is still the main venue for high-quality, innovative, and provocative studies. In comparison, both MMVE and NetGames were less crowded, but the discussion in these events was intense and lively, as can be expected when so many experts sit in the same room. That is the purpose of workshops, isn’t it?


A very last word on the social events. The social events of the 2018 edition lived up to the reputation of MMSys: original and friendly. But I won’t say more about them: what happens at MMSys social events stays at MMSys.

The 2019 edition of MMSys will be held on the East Coast of the US, hosted by the University of Massachusetts Amherst. The multimedia community is at a very exciting time in its history. The attention of researchers is shifting from video delivery to immersion, experience, and attention. More than ever, multimedia systems should be studied from multiple interplaying perspectives (network, computation, interfaces). MMSys is thus a perfect place to discuss research challenges and to present breakthrough proposals.

[1] This means that I have also had my share of rejected papers at MMSys and affiliated workshops. Reviewer #3, whoever you are, you ruined my life (for a couple of hours).

Report from ACM Multimedia 2017 – by Benoit Huet

 

Best #SIGMM Social Media Reporter Award! Me? Really??

This was my reaction after being informed by the SIGMM Social Media Editors that I was one of the two recipients following ACM Multimedia 2017! #ACMMM What a wonderful idea this is to encourage our community to communicate, both internally and to other related communities, about our events, our key research results, and all the wonderful things the multimedia community stands for! I have always been surprised by how limited social media engagement is within the multimedia community. Your initiative has all my support! Let’s disseminate our research interests and activities on social media! @SIGMM #Motivated


The SIGMM flagship conference took place on October 23-27 at the Computer History Museum in Mountain View, California, USA. For its 25th edition, the organizing committee had prepared an attractive program, cleverly mixing expected classics (the Best Paper session, Grand Challenges, the Open Source Software Competition, etc.) with brand new sessions (such as Fast Forward and Thematic Workshops, Business Idea Venture, and the Novel Topics Track). For this edition, the conference adopted a single paper length, removing the boundary between long and short papers. The TPC Co-Chairs and Area Chairs had the responsibility of directing accepted papers to either an oral session or a thematic workshop.

Thematic workshops took the form of poster presentations. Presenters were asked to provide a short video briefly motivating their work, with the intention of making the videos available online for reference after the conference (possibly with a link to the full paper and the poster!). However, this did not come to pass, as publication permissions were not cleared in time; still, the idea is interesting and should be taken into account for future editions. Fast Forward sessions (or thematic workshop pitches) are short, targeted presentations aimed at attracting the audience to the thematic workshop where the papers are presented (in this case, as posters). While such short presentations allow conference attendees to efficiently identify which posters are relevant to them, it is crucial for presenters to be well prepared and to concentrate on highlighting one key research idea, as time is very limited. It also gives posters more exposure. I would be in favor of keeping such sessions for future ACM Multimedia editions.

The 25th edition of ACM MM wasn’t short of keynotes: no fewer than six industry keynotes punctuated each half day of the conference. The first keynote, by Achin Bhowmik from Starkey, focused on audio as a means of “Enhancing and Augmenting Human Perception with Artificial Intelligence”. Bill Dally from NVidia presented “Efficient Methods and Hardware for Deep Learning”; in short, why we all need GPUs! “Building Multi-Modal Interfaces for Smartphones” was the topic presented by Injong Rhee (Samsung Electronics), while Scott Silver (YouTube) discussed the difficulties of “Bringing a Billion Hours to Life” (referring to the vast quantities of videos uploaded and viewed on the sharing platform, and the long tail). Ed Chang from HTC presented “DeepQ: Advancing Healthcare Through AI and VR” and demonstrated how healthcare is benefiting, and will continue to benefit, from AR, VR, and AI. Danny Lange from Unity Technologies highlighted how important machine learning and deep learning are in the game industry in “Bringing Gaming, VR, and AR to Life with Deep Learning”. Personally, I would have preferred a mix of industry and academic keynotes, as I found some of the keynotes not targeted at an audience of computer scientists.

Arnold W. M. Smeulders received the SIGMM Technical Achievement Award for his outstanding and pioneering contribution to defining and bridging the semantic gap in content-based image retrieval (his lecture is here: https://youtu.be/n8kLxKNjQ0A). His talk was sharp, enlightening, and very well received by the audience.

The @sigmm Rising Star Award went to Dr. Liangliang Cao for his contribution to large-scale multimedia recognition and social media mining.

The conference was noticeably flavored with trendy topics such as AI, human-augmenting technologies, Virtual and Augmented Reality, and Machine (Deep) Learning, as can be seen from the various works rewarded.

The Best Paper award was given to Bokun Wang, Yang Yang, Xing Xu, Alan Hanjalic, Heng Tao Shen for their work on “Adversarial Cross-Modal Retrieval“.

Yuan Tian, Suraj Raghuraman, Thiru Annaswamy, Aleksander Borresen, Klara Nahrstedt, Balakrishnan Prabhakaran received the Best Student Paper award for the paper “H-TIME: Haptic-enabled Tele-Immersive Musculoskeletal Examination“.

The Best demo award went to “NexGenTV: Providing Real-Time Insight during Political Debates in a Second Screen Application” by Olfa Ben Ahmed, Gabriel Sargent, Florian Garnier, Benoit Huet, Vincent Claveau, Laurence Couturier, Raphaël Troncy, Guillaume Gravier, Philémon Bouzy and Fabrice Leménorel.

The Best Open source software award was received by Hao Dong, Akara Supratak, Luo Mai, Fangde Liu, Axel Oehmichen, Simiao Yu, Yike Guo for “TensorLayer: A Versatile Library for Efficient Deep Learning Development“.

The Best Grand Challenge Video Captioning Paper award went to “Knowing Yourself: Improving Video Caption via In-depth Recap“, by Qin Jin, Shizhe Chen, Jia Chen, Alexander Hauptmann.

The Best Grand Challenge Social Media Prediction Paper award went to Chih-Chung Hsu, Ying-Chin Lee, Ping-En Lu, Shian-Shin Lu, Hsiao-Ting Lai, Chihg-Chu Huang, Chun Wang, Yang-Jiun Lin, Weng-Tai Su for “Social Media Prediction Based on Residual Learning and Random Forest“.

Finally, the Best Brave New Idea Paper award was conferred to John R Smith, Dhiraj Joshi, Benoit Huet, Winston Hsu and Zef Cota for the paper “Harnessing A.I. for Augmenting Creativity: Application to Movie Trailer Creation“.

A few years back, the multimedia community was concerned about the lack of truly multimedia publications. In my opinion, those days are behind us. The technical program has evolved into a richer and broader one; let’s keep up the momentum!

The location was a wonderful opportunity for many of the attendees to take a stroll down memory lane and see computers and devices (VT100, PC, etc.) from the past, thanks to the complimentary entrance to the museum exhibitions. The “isolated” location of the conference venue meant that going out for lunch was out of the question given the duration of the breaks, so the organizers catered buffet lunches. This resulted in the majority of the attendees interacting and mixing over lunch, which could be an effective way to better integrate new participants and strengthen the community. Both the welcome reception and the banquet were held successfully within the Computer History Museum, offering yet another opportunity for new connections to be made and for further interaction between attendees. Indeed, the atmosphere on both occasions was relaxed, lively, and joyful.

All in all, ACM MM 2017 was another successful edition of our flagship conference, many thanks to the entire organizing team and see you all in Seoul for ACM MM 2018 http://www.acmmm.org/2018/ and follow @sigmm on Twitter!

Report from ACM Multimedia 2017 – by Conor Keighrey


My name is Conor Keighrey; I’m a PhD candidate at the Athlone Institute of Technology in Athlone, Co. Westmeath, Ireland. The focus of my research is to understand the key influencing factors that affect Quality of Experience (QoE) in emerging immersive multimedia experiences, with a specific focus on applications in the speech and language therapy domain. This research is funded by the Irish Research Council Government of Ireland Postgraduate Scholarship Programme. I’m delighted to have been asked to present this report to the SIGMM community as a result of my social media activity at the ACM Multimedia Conference.

Launched in 1993, the ACM Multimedia (ACMMM) Conference held its 25th anniversary event in Mountain View, California. The conference was located in the heart of Silicon Valley, at the inspirational Computer History Museum.

Under five focal themes, the conference called for papers on topics relating to multimedia: Experience, Systems and Applications, Understanding, Novel Topics, and Engagement.

Keynote addresses were delivered by high-profile, industry-leading experts from the field of multimedia. These talks provided insight into active developments from the following experts:

  • Achin Bhowmik (CTO & EVP, Starkey, USA)
  • Bill Dally (Senior Vice President and Chief Scientist, NVidia, USA)
  • Injong Rhee (CTO & EVP, Samsung Electronics, Korea)
  • Edward Y. Chang (President, HTC, Taiwan)
  • Scott Silver (Vice President, Google, USA)
  • Danny Lange (Vice President, Unity Technologies, USA)

Some keynote highlights include Bill Dally’s talk on “Efficient Methods and Hardware for Deep Learning”. Bill provided insight into the work NVidia is doing with neural networks, the hardware that drives them, and the techniques the company is using to make them more efficient. He also highlighted that AI should not be thought of as a mechanism that replaces humans, but as one that empowers us, allowing us to explore more intellectual activities.

Danny Lange of Unity Technologies discussed the application of the Unity game engine to create scenarios in which machine learning models can be trained. His presentation, entitled “Bringing Gaming, VR, and AR to Life with Deep Learning”, described the capture of data for self-driving cars to prepare for unexpected occurrences in the real world (e.g. pedestrian activity or other cars behaving in unpredicted ways).

A number of the Keynotes were captured by FXPAL (an ACMMM Platinum Sponsor) and are available here.

With an acceptance rate of 27.63% (684 reviewed, 189 accepted), the main track at ACMMM showcased a diverse collection of research from academic institutes around the globe. An abundance of work was presented in the ever-expanding areas of deep/machine learning, virtual/augmented/mixed reality, and the traditional multimedia field.


The importance of gender equality and diversity with respect to advancing the careers of women in STEM has never been greater. Sponsored by SIGMM, the Women/Diversity in MM lunch took place on the first day of ACMMM. Speakers such as Prof. Noel O’Connor discussed the significance of initiatives such as Athena SWAN (Scientific Women’s Academic Network) within Dublin City University (DCU). Katherine Breeden (pictured left), an Assistant Professor in the Department of Computer Science at Harvey Mudd College (HMC), presented a fantastic talk on gender balance at HMC. Katherine’s discussion highlighted the key changes which have occurred, resulting in more women than men graduating with a degree in computer science at the college.

Other highlights from day 1 include a paper presented in the Experience 2 (Perceptual, Affect, and Interaction) session, chaired by Susanne Boll (University of Oldenburg). Researchers from the National University of Singapore presented the results of a multisensory virtual cocktail (Vocktail) experience, which was well received.

 

Through the stimulation of three sensory modalities, Vocktails aim to create virtual flavors and augment taste experiences through a customizable interactive drinking utensil. Controlled by a mobile device, participants of the study experienced augmented taste (electrical stimulation of the tongue), smell (micro air pumps), and visual (RGB light projected onto the liquid) stimuli as they used the system. For more information, check out their paper entitled “Vocktail: A Virtual Cocktail for Pairing Digital Taste, Smell, and Color Sensations” in the ACM Digital Library.

Day 3 of the conference included a session entitled Brave New Ideas. The session presented a fantastic variety of work focused on the use of multimedia technologies to enhance or create intelligent systems. Demonstrating AI as an assistive tool and winning the Best Brave New Idea Paper award, a paper entitled “Harnessing A.I. for Augmenting Creativity: Application to Movie Trailer Creation” (ACM Digital Library) describes the first-ever human–machine collaboration to create a real movie trailer. Through multi-modal semantic extraction, including audio-visual and scene analysis and a statistical approach, key moments which characterize horror films were defined. On this basis, the AI selected 10 scenes from a feature-length film, which were then developed alongside a professional film maker into an exciting movie trailer. Officially released by 20th Century Fox, the complete AI trailer for the horror movie “Morgan” can be viewed here.

A new addition to this year’s ACMMM was the inclusion of thematic workshops. Four individual workshops (as outlined below) provided an opportunity for papers which could not be accommodated within the main track to be presented to the multimedia research community. A total of 495 papers were reviewed, of which 64 were accepted (12.93%). Authors of accepted papers presented their work via on-stage thematic workshop pitches, which were followed by poster presentations on Monday the 23rd and Friday the 27th. The workshop themes were as follows:

  • Experience (Organised by Wanmin Wu)
  • Systems and Applications (Organised by Roger Zimmermann & He Ma)
  • Engagement (Organised by Jianchao Yang)
  • Understanding (Organised by Qi Tian)

Presented as part of the thematic workshop pitches, one of the most fascinating demos at the conference was a body of work carried out by Audrey Ziwei Hu (University of Toronto). Her paper, entitled “Liquid Jets as Logic-Computing Fluid-User-Interfaces”, describes a fluid (water) user interface that is presented as a logic-computing device. Water jets form a medium for tactile interaction and control, creating a musical instrument known as a hydraulophone.

Steve Mann (pictured left) from Stanford University, who is regarded as “The Father of Wearable Computing”, provided a fantastic live demonstration of the device. The full paper can be found in the ACM Digital Library, and a live demo can be seen here.

At large-scale events such as ACMMM, the importance of social media reporting and interaction has never been greater. More than 250 social media interactions (tweets, retweets, and likes) were monitored using the #SIGMM and #ACMMM hashtags, as outlined by the SIGMM Records prior to the event. Descriptive (and multimedia-enhanced) social media reports give those who encounter an unavoidable schedule overlap an opportunity to gather some insight into the alternative works presented at the conference.

From my own perspective (as a PhD student), the most important aspect of social media interaction is that reports often serve as a conversational piece. Developing a social presence throughout the many coffee breaks and social events during the conference is key to building a network of contacts within any community. For a newcomer this can often be a daunting task; recognizing other social media reporters offers the perfect ice-breaker, providing an opportunity to discuss and inform each other of the ongoing work within the multimedia community. As a result of my own online reporting, I was recognized numerous times throughout the conference. Staying active on social media often leads to the development of a research audience and a social media presence among peers. Engaging such an audience is key to the success of those who wish to follow a path in academia or research.

Building on my own personal experience, continued attendance at SIGMM conferences (irrespective of paper submission) has many advantages. While the predominant role of a conference is to disseminate work, the informative aspect of attending such events is often overlooked. The area of multimedia research is moving at a fast pace, and thus having the opportunity to engage directly with researchers in your field of expertise is of the utmost importance. Attendance at ACMMM and other SIGMM conferences, such as ACM Multimedia Systems, has inspired me to explore alternative methodologies within my own research. Without a doubt, continued attendance will inspire my research as I move forward.

ACM Multimedia ’18 (October 22nd-26th): Seoul, South Korea, with its diverse landscape of modern skyscrapers, traditional Buddhist temples, and palaces, will host the 26th annual ACMMM. The 2018 event will without a doubt present a variety of work from the multimedia research community. Regular paper abstracts are due on the 30th of March (full manuscripts are due on the 8th of April). For more information on next year’s ACM Multimedia conference, check out the following link: http://www.acmmm.org/2018

The Deep Learning Indaba Report

Abstract

Africans participate only minimally in data science and artificial intelligence research, and given the current focus on deep learning and machine learning, there is a need to address this low participation. The Deep Learning Indaba was thus born to stimulate the participation of Africans within the research and innovation landscape surrounding deep learning and machine learning. This column reports on the Deep Learning Indaba event, a 5-day series of introductory lectures on deep learning held from 10-15 September 2017, coupled with tutorial sessions in which participants gained practical experience with deep learning software packages. The column also includes interviews with some of the organisers about the origin and future plans of the Deep Learning Indaba.

Introduction

Africans have a low participation in the areas of science called deep learning and machine learning, as shown by the fact that, at the 2016 Neural Information Processing Systems (NIPS’16) conference, not a single accepted paper had an author from a research institution in Africa (http://www.deeplearningindaba.com/blog/missing-continents-a-study-using-accepted-nips-papers).

Given the increasing focus on deep learning, and on the more general area of machine learning, there is a need to address this low participation of Africans in the technology that underlies the recent advances in data science and artificial intelligence, and that is set to transform the way the world works. The Deep Learning Indaba was thus born, aiming to be a series of master classes on deep learning and machine learning for African researchers and technologists, with the purpose of stimulating the participation of Africans within the research and innovation landscape surrounding deep learning and machine learning.

What is an ‘indaba’?

According to the organisers, ‘indaba’ is a Zulu word that simply means gathering or meeting. There are several words for such meetings, which are held throughout southern Africa, including imbizo and intlanganiso (in Xhosa), lekgotla (in Sesotho), baraza (in Kiswahili) in Kenya and Tanzania, and padare (in Shona) in Zimbabwe. Indabas have several functions: to listen to and share news of members of the community, to discuss common interests and issues facing the community, and to give advice and coach others. Using the word ‘indaba’ for the deep learning event connects it to other community gatherings that are similarly held by cultures throughout the world. The Deep Learning Indaba is about the spirit of coming together, of sharing and learning, and this is one of the core values of the event.

The Deep Learning Indaba

After a couple of months of furious activity by the organisers, roughly 300 students, researchers, and machine learning practitioners from all over Africa gathered for the first Deep Learning Indaba, held from 10-15 September 2017 at the University of the Witwatersrand, Johannesburg, South Africa. More than 30 African countries were represented for an intense week of immersion in deep learning.

The Deep Learning Indaba consisted of a 5-day series of introductory lectures on deep learning, coupled with tutorial sessions in which participants gained practical experience with deep learning software packages such as TensorFlow. The format of the Deep Learning Indaba was based on the intense summer school experience of NIPS. Presenters at the Indaba included prominent figures in the machine learning community, such as Nando de Freitas, Ulrich Paquet, and Yann Dauphin. The lecture sessions were all recorded, and the practical tutorials are also available online: Lectures and Tutorials.

After organising the first successful Deep Learning Indaba in Africa (a report on the outcomes of the Deep Learning Indaba can be found online), the organisers have already started planning the next two Deep Learning Indabas, which will take place in 2018 and 2019. More information can be found on the Deep Learning Indaba website: http://www.deeplearningindaba.com.

Having been privileged to attend this first Deep Learning Indaba, I interviewed a number of the organisers to learn more about the origin and future plans of the Deep Learning Indaba. The interviewed organisers were Ulrich Paquet and Stephan Gouws.

Question 1: What was the origin of the Deep Learning Indaba?

Ulrich Paquet: We’d have to dig into history a bit here, as the dream of taking ICML (International Conference on Machine Learning) to South Africa has been around for a while. The topic was again raised at the end of 2016, when Shakir and I sat at NIPS (Conference on Neural Information Processing Systems), and said “let’s find a way to make something happen in 2017.” We were waiting for the right opportunity. Stephan has been thinking along these lines, and so has George Konidaris. I met Benjamin Rosman in January or February over e-mail, and within a day we were already strategizing what to do.

We didn’t want to take a big conference to South Africa, as people parachute in and out, without properly investing in education. How can we make the best possible investment in South African machine learning? We thought a summer school would be the best vehicle, but more than that, we wanted a summer school that would replicate the intense NIPS experience in South Africa: networking, parties, high-octane teaching, poster sessions, debates and workshops…

Shakir asked Demis Hassabis for funding in February this year, and Demis was incredibly supportive. And that got the ball rolling…

Stephan Gouws: It began with the question that was whispered among many South Africans in the machine learning industry: “how can we bring ICML to South Africa?” Early in 2017, Ulrich Paquet and Shakir Mohamed (both from Google DeepMind) began discussing how a summer school-like event could be held in South Africa. A summer school-like event was chosen because it typically has a bigger impact after the event than a typical conference does. Benjamin Rosman (from the South African Council for Scientific and Industrial Research) and Nando de Freitas (also from Google DeepMind) joined the discussion in February. A fantastic group of researchers from South Africa was gathered who shared the vision of making the event a reality. I suggested the name “Deep Learning Indaba”, we registered a domain, and from there we got the ball rolling!

Question 2: What did the organisers want to achieve with the Indaba?

Ulrich Paquet: Strengthening African Machine Learning

“a shared space to learn, to share, and to debate the state-of-the-art in machine learning and artificial intelligence”

  • Teaching and mentoring
  • Building a strong research community
  • Overcoming isolation

We also wanted to work towards inclusion, build a community, build confidence, and influence government policy.

Stephan Gouws: Our vision is to strengthen machine learning in Africa. Machine learning experts, workshops and conferences are mostly concentrated in North America and Western Europe. Africans do not easily get the opportunity to be exposed to such events, as they are far away, expensive to attend, etc. Furthermore, at a conference a group of experts fly in, discuss the state of the art of the field, and then fly away. A conference does not easily allow for a transfer of expertise, and therefore the local community does not gain much from it. With the Indaba, we hoped to facilitate knowledge transfer (for which a summer school-like event is better suited), and also to create networking opportunities for students, industry, academics and the international presenters.

Question 3: Why was the Indaba held in South Africa?

Ulrich Paquet: All of the (original) organisers are South African, and really care about the development of their own country. We want to reach beyond South Africa, though, and tried to include as many institutions as possible (more than 20 African countries were represented).

But, one has to remember that the first Indaba was essentially an experiment. We had to start somewhere! We benefit by having like-minded local organisers :)

Stephan Gouws: All the organisers are originally from South Africa and want to support and strengthen the machine learning field in South Africa (and eventually in the rest of Africa).

Question 4: What were the expectations beforehand for the Indaba? (For example, how many people did the organisers expect would attend?)

Ulrich Paquet: Well, we originally wanted to run a series of master classes for 40 students. We had ABSOLUTELY NO idea how many students would apply, or if any would even apply. We were very surprised when we hit more than 700 applications by our deadline, and by then, the whole game changed. We couldn’t take 40 out of 700, and decided to go for the largest lecture hall we could possibly find (for 300 people).

There are then other logistics of scale that come into play: feeding everyone, transporting everyone, running practical sessions, etc. And it has to be within budget!! The cap at 300 seemed to work well.

Question 5: Are there any plans for the future of the Indaba? Are you planning on making it an annual event?

Ulrich Paquet: Yes, definitely.

Stephan Gouws: Nothing official yet, but the plan from the beginning was to make it an annual event.

[Editor]: The Deep Learning Indaba 2018 has since been announced, and more information can be found at the following link: http://www.deeplearningindaba.com/indaba-2018.html. The organisers have also announced locally organised, one-day IndabaX events, to be held from 26 March to 6 April 2018, with the aim of strengthening the African machine learning community. Details on obtaining support for organising an IndabaX event can be found at the main site: http://www.deeplearningindaba.com/indabax

Question 6: How can students, researchers and people from industry still get and stay involved after the Indaba?

Ulrich Paquet: There are many things that could be changed with enough critical mass. One thing we’re hoping for is to ensure that the climate for research in sub-Saharan Africa is as fertile as possible. This will only happen through lots of collaboration and cross-pollination. Some things stand in the way of this kind of collaboration. One is the government KPIs (key performance indicators) that reward research: for AI, they do not properly reward collaboration, and do not properly reward publications in top-tier platforms, which are all conferences (NIPS, ICML). Therefore, they do not reward playing in and contributing to the most competitive playing field. These are all things that the AI community in SA should seek to creatively address and change.

We have seen organic South African papers published at UAI and ICML for the first time this year, and the next platforms should be JMLR and NIPS, and then Nature. There have never been any organic African AI or machine learning papers in any of the latter venues. Students should be encouraged to collaborate and submit to them! The nature of the game is that the barrier to entry for these venues is so high that one has to collaborate… This of course brings me to my point about why research grants (in SA) should be revisited to reflect these outcomes.

Stephan Gouws: In short, yes. All the practicals, lectures and videos are made publicly available. There are also Facebook and WhatsApp groups, and we hope that the discussion and networking will not stop after the 15th of September. As a side note: I am working on ideas (more aimed at postgraduate students) to eventually put a mentor system in place, as well as other types of support for postgraduate students after the Indaba. But it is still early days, and only time will tell.

Biographies of Interviewed Organisers

Ulrich Paquet (Research Scientist, DeepMind, London):

Ulrich Paquet

Dr. Ulrich Paquet is a Research Scientist at DeepMind, London. He really wanted to be an artist before stumbling onto machine learning in a third-year course at the University of Pretoria (South Africa), where he eventually obtained a Master’s degree in Computer Science. In April 2007 Ulrich obtained his PhD from the University of Cambridge with the dissertation “Bayesian Inference for Latent Variable Models.” After obtaining his PhD he worked with a start-up called Imense, focusing on face recognition and image similarity search. He then joined Microsoft’s FUSE Labs, based at Microsoft Research Cambridge, where he eventually worked on the Xbox One launch as part of the Xbox Recommendations team. In 2015 he joined another Cambridge start-up, VocalIQ, which was acquired by Apple, before joining DeepMind in April 2016.

Stephan Gouws (Research Scientist, Google Brain Team):

Stephan Gouws

Dr. Stephan Gouws is a Research Scientist at Google and part of the Google Brain Team that developed TensorFlow and Google’s Neural Machine Translation system. His undergraduate studies were a double major in Electronic Engineering and Computer Science at Stellenbosch University (South Africa). His postgraduate studies in Electronic Engineering were also completed at the MIH Media Lab at Stellenbosch University. He obtained his Master’s degree cum laude in 2010 and his PhD in 2015 with the dissertation “Training Neural Word Embeddings for Transfer Learning and Translation.” During his PhD he spent one year at the Information Sciences Institute (ISI) at the University of Southern California in Los Angeles, and one year at the Montreal Institute for Learning Algorithms, where he worked closely with Yoshua Bengio. He also worked as a Research Intern at both Microsoft Research and Google Brain during this period.

 
The Deep Learning Indaba Organisers:

Shakir Mohamed (Research Scientist, DeepMind, London)
Nyalleng Moorosi (Researcher, Council for Scientific and Industrial Research, South Africa)
Ulrich Paquet (Research Scientist, DeepMind, London)
Stephan Gouws (Research Scientist, Google, Brain Team, London)
Vukosi Marivate (Researcher, Council for Scientific and Industrial Research, South Africa)
Willie Brink (Senior Lecturer, Stellenbosch University, South Africa)
Benjamin Rosman (Researcher, Council for Scientific and Industrial Research, South Africa)
Richard Klein (Associate Lecturer, University of the Witwatersrand, South Africa)

Advisory Committee:

Nando De Freitas (Research Scientist, DeepMind, London)
Ben Herbst (Professor, Stellenbosch University)
Bonolo Mathibela (Research Scientist, IBM Research South Africa)
George Konidaris (Assistant Professor, Brown University)
Bubacarr Bah (Research Chair, African Institute for Mathematical Sciences, South Africa)

Report from ACM MMSys 2017

–A report from Christian Timmerer, AAU/Bitmovin Austria

The ACM Multimedia Systems Conference (MMSys) provides a forum for researchers to present and share their latest research findings in multimedia systems. It is a unique event targeting “multimedia systems” from various angles and views across all domains, instead of focusing on a specific aspect or data type. ACM MMSys’17 was held in Taipei, Taiwan, from June 20 to 23, 2017.

MMSys is a single-track conference which also hosts a series of workshops, namely NOSSDAV, MMVE, and NetGames. Since 2016, it kicks off with overview talks, and in 2017 we saw the following: “Geometric representations of 3D scenes” by Geraldine Morin; “Towards Understanding Truly Immersive Multimedia Experiences” by Niall Murray; “Rate Control In The Age Of Vision” by Ketan Mayer-Patel; “Humans, computers, delays and the joys of interaction” by Ragnhild Eg; “Context-aware, perception-guided workload characterization and resource scheduling on mobile phones for interactive applications” by Chung-Ta King and Chun-Han Lin.

Additionally, industry talks have been introduced: “Virtual Reality – The New Era of Future World” by WeiGing Ngang; “The innovation and challenge of Interactive streaming technology” by Wesley Kuo; “What challenges are we facing after Netflix revolutionized TV watching?” by Shuen-Huei Guan; “The overview of app streaming technology” by Sam Ding; “Semantic Awareness in 360 Streaming” by Shannon Chen; “On the frontiers of Video SaaS” by Sega Cheng.

An interesting set of keynotes, spanning MMSys and its co-located workshops, presented different aspects of multimedia systems:

  • Henry Fuchs, The AR/VR Renaissance: opportunities, pitfalls, and remaining problems
  • Julien Lai, Towards Large-scale Deployment of Intelligent Video Analytics Systems
  • Dah Ming Chiu, Smart Streaming of Panoramic Video
  • Bo Li, When Computation Meets Communication: The Case for Scheduling Resources in the Cloud
  • Polly Huang, Measuring Subjective QoE for Interactive System Design in the Mobile Era – Lessons Learned Studying Skype Calls

The program included a diverse set of topics such as immersive experiences in AR and VR, network optimization and delivery, multisensory experiences, processing, rendering, interaction, cloud-based multimedia, IoT connectivity, infrastructure, media streaming, and security. A vital aspect of MMSys is its dedicated sessions for showcasing the latest developments in the area of multimedia systems and presenting datasets, which is important for enabling reproducibility and sustainability in multimedia systems research.

The social events were a perfect venue for networking and in-depth discussions on how to advance the state of the art. A welcome reception was held at “LE BLE D’OR (Miramar)”, the conference banquet at the Taipei World Trade Center Club, and finally a tour of the Shilin Night Market was organized.

ACM MMSys 2017 issued the following awards:

  • The Best Paper Award goes to “A Scalable and Privacy-Aware IoT Service for Live Video Analytics” by Junjue Wang (Carnegie Mellon University), Brandon Amos (Carnegie Mellon University), Anupam Das (Carnegie Mellon University), Padmanabhan Pillai (Intel Labs), Norman Sadeh (Carnegie Mellon University), and Mahadev Satyanarayanan (Carnegie Mellon University).
  • The Best Student Paper Award goes to “A Measurement Study of Oculus 360 Degree Video Streaming” by Chao Zhou (SUNY Binghamton), Zhenhua Li (Tsinghua University), and Yao Liu (SUNY Binghamton).
  • The NOSSDAV’17 Best Paper Award goes to “A Comparative Case Study of HTTP Adaptive Streaming Algorithms in Mobile Networks” by Theodoros Karagkioules (Huawei Technologies France/Telecom ParisTech), Cyril Concolato (Telecom ParisTech), Dimitrios Tsilimantos (Huawei Technologies France), Stefan Valentin (Huawei Technologies France).

Excellence in DASH award sponsored by the DASH-IF 

  • 1st place: “SAP: Stall-Aware Pacing for Improved DASH Video Experience in Cellular Networks” by Ahmed Zahran (University College Cork), Jason J. Quinlan (University College Cork), K. K. Ramakrishnan (University of California, Riverside), and Cormac J. Sreenan (University College Cork)
  • 2nd place: “Improving Video Quality in Crowded Networks Using a DANE” by Jan Willem Kleinrouweler, Britta Meixner and Pablo Cesar (Centrum Wiskunde & Informatica)
  • 3rd place: “Towards Bandwidth Efficient Adaptive Streaming of Omnidirectional Video over HTTP” by Mario Graf (Bitmovin Inc.), Christian Timmerer (Alpen-Adria-Universität Klagenfurt / Bitmovin Inc.), and Christopher Mueller (Bitmovin Inc.)

Finally, student travel grants were sponsored by SIGMM. All details, including nice pictures, can be found here.


ACM MMSys 2018 will be held in Amsterdam, The Netherlands, June 12 – 15, 2018 and includes the following tracks:

  • Research track: Submission deadline on November 30, 2017
  • Demo track: Submission deadline on February 25, 2018
  • Open Dataset & Software Track: Submission deadline on February 25, 2018

MMSys’18 co-locates the following workshops (with submission deadline on March 1, 2018):

  • MMVE2018: 10th International Workshop on Immersive Mixed and Virtual Environment Systems,
  • NetGames2018: 16th Annual Workshop on Network and Systems Support for Games,
  • NOSSDAV2018: 28th ACM SIGMM Workshop on Network and Operating Systems Support for Digital Audio and Video,
  • PV2018: 23rd Packet Video Workshop

MMSys’18 includes the following special sessions (submission deadline on December 15, 2017):

Report from ICMR 2017

ACM International Conference on Multimedia Retrieval (ICMR) 2017

ACM ICMR 2017 in “Little Paris”

ACM ICMR is the premier International Conference on Multimedia Retrieval, and since 2011 it has “illuminated the state of the art in multimedia retrieval”. This year, ICMR was held in a wonderful location: Bucharest, Romania, also known as “Little Paris”. Every year at ICMR I learn something new. And here is what I learnt this year.

ICMR2017

Final Conference Shot at UP Bucharest

UNDERSTANDING THE TANGIBLE: objects, scenes, semantic categories – everything we can see.

1) Objects (and YODA) can be easily tracked in videos.

Arnold Smeulders delivered a brilliant keynote on “things” retrieval: given an object in an image, can we find (and retrieve) it in other images, videos, and beyond? He presented a very interesting technique for tracking objects (e.g. Yoda) in videos, based on similarity learnt through Siamese networks.

Tracking Yoda with Siamese Networks

2) Wearables + computer vision help explore cultural heritage sites.

As he showed in his keynote, Alberto del Bimbo and his amazing team at MICC, University of Florence, have designed smart audio guides for indoor and outdoor spaces. The system detects, recognises, and describes landmarks and artworks from wearable camera inputs (and GPS coordinates, in the case of outdoor spaces).

3) We can finally quantify how much images provide complementary semantics compared to text [BEST MULTIMODAL PAPER AWARD].

For ages, the community has asked how relevant different modalities are for multimedia analysis: this paper (http://dl.acm.org/citation.cfm?id=3078991) finally proposes a solution to quantify information gaps between different modalities.

4) Exploring news corpuses is now very easy: news graphs are easy to navigate and aware of the type of relations between articles.

Remi Bois and his colleagues presented this framework (http://dl.acm.org/citation.cfm?id=3079023), made for professional journalists and the general public, for seamlessly browsing through a large-scale news corpus. They built a graph whose nodes are articles in the corpus. The most relevant items for each article are chosen (and linked) based on an adaptive nearest-neighbour technique. Each link is then characterised according to the type of relation between the two linked nodes.

5) Panorama outdoor images are much easier to localise.

In his beautiful work (https://t.co/3PHCZIrA4N), Ahmet Iscen from Inria developed an algorithm for location prediction from StreetView images, outperforming the state of the art thanks to an intelligent stitching pre-processing step: predicting locations from panoramas (stitched individual views) instead of individual street images improves performances dramatically!

UNDERSTANDING THE INTANGIBLE: artistic aspects, beauty, intent: everything we can perceive

1) Image search intent can be predicted by the way we look.

In his best paper candidate research work (http://dl.acm.org/citation.cfm?id=3078995), Mohammad Soleymani showed that image search intent (seeking information, finding content, or re-finding content) can be predicted from physiological responses (eye gaze) and implicit user interaction (mouse movements).

2) Real-time detection of fake tweets is now possible using user and textual cues.

Another best paper candidate (http://dl.acm.org/citation.cfm?id=3078979), this time from CERTH. The team collected a large dataset of fake/real sample tweets spanning 17 events and built an effective model for misleading content detection from tweet content and user characteristics. A live demo is available here: http://reveal-mklab.iti.gr/reveal/fake/

3) Music tracks have different functions in our daily lives.

Researchers from TU Delft have developed an algorithm (http://dl.acm.org/citation.cfm?id=3078997) which classifies music tracks according to their purpose in our daily activities: relaxing, studying and working out.

4) By transferring image style we can make images more memorable!

The team at University of Trento built an automatic framework (https://arxiv.org/abs/1704.01745) to improve image memorability. A selector finds the style seeds (i.e. abstract paintings) which are likely to increase memorability of a given image, and after style transfer, the image will be more memorable!

5) Neural networks can help retrieve and discover child book illustrations.

In this amazing work (https://arxiv.org/pdf/1704.03057.pdf), motivated by real children’s experiences, Pinar and her team from Hacettepe University collected a large dataset of children’s book illustrations and found that neural networks can predict and transfer style, making it possible to produce many other illustrations in the style of “Winnie the Witch”.

Winnie the Witch

6) Locals perceive their neighborhood as less interesting, more dangerous and dirtier compared to non-locals.

In this wonderful work (http://www.idiap.ch/~gatica/publications/SantaniRuizGatica-icmr17.pdf), presented by Darshan Santani from IDIAP, researchers asked locals and crowd-workers to look at pictures from various neighborhoods in Guanajuato and rate them according to interestingness, cleanliness, and safety.

THE FUTURE: What’s Next?

1) We will be able to anonymize images of outdoor spaces thanks to Instagram filters, as proposed by this work (http://dl.acm.org/citation.cfm?id=3080543) in the Brave New Idea session.  When an image of an outdoor space is manipulated with appropriate Instagram filters, the location of the image can be masked from vision-based geolocation classifiers.

2) Soon we will be able to embed watermarks in our Deep Neural Network models in order to protect our intellectual property [BEST PAPER AWARD]. This is a disruptive, novel idea, and that is why this work from KDDI Research and Japan National Institute of Informatics won the best paper award. Congratulations!

3) Given an image view of an object, we will predict the other side of things (from Smeulders’ keynote). In the pic: predicting the other side of chairs. Beautiful.

Predicting the other side of things

THANKS: To the organisers, to the volunteers, and to all the authors for their beautiful work :)

EDITORIAL NOTE: A more extensive report from ICMR 2017 by Miriam is available on Medium