Report from ACM ICMR 2018 – by Cathal Gurrin

 

Multimedia computing, indexing, and retrieval continue to be among the most exciting and fastest-growing research areas in the field of multimedia technology. ACM ICMR is the premier international conference bringing together experts and practitioners in the field each year. The eighth ACM International Conference on Multimedia Retrieval (ACM ICMR 2018) took place from June 11th to 14th, 2018 in Yokohama, Japan’s second most populous city. ACM ICMR 2018 featured a diverse range of activities including keynote talks, demonstrations, special sessions and related workshops, a panel, a doctoral symposium, industrial talks and tutorials, alongside regular conference papers in oral and poster sessions. The full ICMR 2018 schedule can be found on the ICMR 2018 website <http://www.icmr2018.org/>. The organisers of ACM ICMR 2018 placed a large emphasis on generating a high-quality programme; in 2018, ICMR received 179 submissions to the main conference, with 21 accepted for oral presentation and 23 for poster presentation. A number of key themes emerged from the published papers at the conference: deep neural networks for content annotation; multimodal event detection and summarisation; novel multimedia applications; multimodal indexing and retrieval; and video retrieval from regular & social media sources. In addition, a strong emphasis on the user (in terms of end-user applications and user-predictive models) was noticeable throughout the ICMR 2018 programme. Indeed, the user theme was central to many of the components of the conference, from the panel discussion to the keynotes, workshops and special sessions.

One of the most memorable elements of ICMR 2018 was a panel discussion on the ‘Top Five Problems in Multimedia Retrieval’ <http://www.icmr2018.org/program_panel.html>. The panel was composed of leading figures in the multimedia retrieval space: Tat-Seng Chua (National University of Singapore); Michael Houle (National Institute of Informatics); Ramesh Jain (University of California, Irvine); Nicu Sebe (University of Trento) and Rainer Lienhart (University of Augsburg). An engaging panel discussion was facilitated by Chong-Wah Ngo (City University of Hong Kong) and Vincent Oria (New Jersey Institute of Technology). The common theme was that multimedia retrieval is a hard challenge and that there are a number of fundamental topics on which we need to make progress, including bridging the semantic and user gaps, improving approaches to multimodal content fusion and neural network learning, and addressing the challenge of processing at scale and the so-called “curse of dimensionality”.

ICMR 2018 included two excellent keynote talks <http://www.icmr2018.org/program_keynote.html>. Firstly, Kohji Mitani, Deputy Director of the Science & Technology Research Laboratories of NHK (Japan Broadcasting Corporation), described the ongoing evolution of broadcast technology and the efforts underway to create new (connected) broadcast services that can provide viewing experiences never before imagined and user experiences more attuned to daily life. The second keynote, from Shunji Yamanaka of The University of Tokyo, discussed his experience of prototyping new user technologies and highlighted the importance of prototyping as a process that bridges an ever-increasing gap between advanced technological solutions and society. During this entertaining and inspiring talk, many prototypes developed in Yamanaka’s lab were introduced and the related vision explained to an eager audience.
Three workshops were accepted for ACM ICMR 2018, covering the fields of lifelogging, art and real-estate technologies. Interestingly, all three workshops focused on domain-specific applications in three emerging fields for multimedia analytics, all related to users and the user experience. The “LSC2018 – Lifelog Search Challenge” workshop <http://lsc.dcu.ie/2018/> was a novel and highly entertaining workshop modelled on the successful Video Browser Showdown series of participation workshops at the annual MMM conference. LSC was a participation workshop, meaning that each participating team wrote a paper describing a prototype interactive retrieval system for multimodal lifelog data, which was then evaluated in a live interactive search challenge during the workshop. Six prototype systems took part in the search challenge in front of an audience that reached fifty conference attendees. This was a popular and exciting workshop and could become a regular feature at future ICMR conferences. The second workshop was the MM-Art & ACM workshop <http://www.attractiveness-computing.org/mmart_acm2018/index.html>, a joint workshop that merged two existing workshops, the International Workshop on Multimedia Artworks Analysis (MMArt) and the International Workshop on Attractiveness Computing in Multimedia (ACM). The aim of the joint workshop was to enlarge the scope of the issues discussed and to inspire more work in related fields. The papers at the workshop focused on the creation, editing and retrieval of art-related multimedia content. The third workshop was RETech 2018 <https://sites.google.com/view/multimedia-for-retech/>, the first international workshop on multimedia for real estate tech. In recent years there has been a huge uptake of multimedia processing and retrieval technologies in this domain, but many challenges remain, such as the quality, cost, sensitivity, diversity and attractiveness of content to users.

In addition, ICMR 2018 included three tutorials <http://www.icmr2018.org/program_tutorial.html> on topical areas for the multimedia retrieval community. The first was ‘Objects, Relationships and Context in Visual Data’ by Hanwang Zhang and Qianru Sun. The second was ‘Recommendation Technologies for Multimedia Content’ by Xiangnan He, Hanwang Zhang and Tat-Seng Chua, and the final tutorial was ‘Multimedia Content Understanding by Learning from Very Few Examples’ by Guo-Jun Qi. All tutorials were well received and the feedback was very good.

Other aspects of note from ICMR 2018 were a doctoral symposium that attracted five authors and a dedicated industrial session with four industrial talks highlighting the multimedia retrieval challenges faced by industry. It was interesting to hear from the industrial talks how the analytics and retrieval technologies developed over the years and presented at venues such as ICMR are actually being deployed in real-world user applications by large organisations such as NEC and Hitachi. It is always a good idea to listen to the real-world applications of the research carried out by our community.

The best paper session at ICMR 2018 featured four top-ranked works covering multimodal, audio and text retrieval. The best paper award went to ‘Learning Joint Embedding with Multimodal Cues for Cross-Modal Video-Text Retrieval’ by Niluthpol Mithun, Juncheng Li, Florian Metze and Amit Roy-Chowdhury.
The Best Multi-Modal Paper Award went to ‘Cross-Modal Retrieval Using Deep De-correlated Subspace Ranking Hashing’ by Kevin Joslyn, Kai Li and Kien Hua. In addition, there were awards for best poster, ‘PatternNet: Visual Pattern Mining with Deep Neural Network’ by Hongzhi Li, Joseph Ellis, Lei Zhang and Shih-Fu Chang, and best demo, ‘Dynamic construction and manipulation of hierarchical quartic image graphs’ by Nico Hezel and Kai Uwe Barthel. Finally, although often overlooked, six reviewers were commended for their outstanding reviews: Liqiang Nie, John Kender, Yasushi Makihara, Pascal Mettes, Jianquan Liu and Yusuke Matsui. As with some other ACM-sponsored conferences, ACM ICMR 2018 included an award for the most active social media commentator, which is how I ended up writing this report. There were a number of active social media commentators at ICMR 2018, each of whom provided a valuable commentary on the proceedings and added to the historical archive.

Of course, the social side of a conference can be as important as the science. ICMR 2018 included two main social events, a welcome reception and the conference banquet. The welcome reception took place at the Fisherman’s Market, an Asian and ethnic dining experience with a wide selection of Japanese food available. The conference banquet took place in the Hotel New Grand, which was built in 1927 and has a long history of attracting famous guests. The venue is famed for the quality of the food and the spectacular panoramic views of the port of Yokohama. As with the rest of the conference, the banquet food was top-class, with more than one attendee commenting that the Japanese beef on offer was the best they had ever tasted.

ICMR 2018 was an exciting and excellently organised conference, and it is important to acknowledge the efforts of the general co-chairs: Kiyoharu Aizawa (The Univ. of Tokyo), Michael Lew (Leiden Univ.) and Shin’ichi Satoh (National Inst. of Informatics). They were ably assisted by the TPC co-chairs, Benoit Huet (Eurecom), Qi Tian (Univ. of Texas at San Antonio) and Keiji Yanai (The Univ. of Electro-Communications), who coordinated the reviews from a 111-person program committee in a double-blind manner, with an average of 3.8 reviews prepared for every paper. ICMR 2019 will take place in Ottawa, Canada in June 2019 and ICMR 2020 will take place in Dublin, Ireland in June 2020. I hope to see you all there, continuing the tradition of excellent ICMR conferences.

The Lifelog Search Challenge Workshop attracted six teams for a real-time public interactive search competition.


Shunji Yamanaka about to begin his keynote talk on Prototyping

Kiyoharu Aizawa and Shin'ichi Satoh, two of the ICMR 2018 General co-Chairs welcoming attendees to the ICMR 2018 Banquet at the historical Hotel New Grand.

ACM Multimedia 2019 and Reproducibility in Multimedia Research

In the first months of the new calendar year, multimedia researchers are traditionally hard at work on their ACM Multimedia submissions. (This year the submission deadline is 1 April.) Questions of reproducibility, including those of data set availability and release, are at the forefront of everyone’s mind. In this edition of SIGMM Records, the editors of the “Data Sets and Benchmarks” column have teamed up with two intersecting groups, the Reproducibility Chairs and the General Chairs of ACM Multimedia 2019, to bring you a column about reproducibility in multimedia research and the connection between reproducible research and publicly available data sets. The column highlights the activities of SIGMM towards implementing ACM paper badging. ACM MMSys has pushed our community forward on reproducibility and pioneered the use of ACM badging [1]. We are proud that in 2019 the newly established Reproducibility track will introduce badging at ACM Multimedia.

Complete information on Reproducibility at ACM Multimedia is available at: https://project.inria.fr/acmmmreproducibility/

The importance of reproducibility

Researchers intuitively understand the importance of reproducibility. Too often, however, it is explained superficially, with statements such as, “If you don’t pay attention to reproducibility, your paper will be rejected”. The essence of the matter lies deeper: reproducibility is important because of its role in making scientific progress possible.

What is this role exactly? The reason that we do research is to contribute to the totality of knowledge at the disposal of humankind. If we think of this knowledge as a building, i.e. a sort of edifice, the role of reproducibility is to provide the strength and stability that makes it possible to build continually upwards. Without reproducibility, there would simply be no way of creating new knowledge.

ACM provides a helpful characterization of reproducibility: “An experimental result is not fully established unless it can be independently reproduced” [2]. In short, a result that is obtainable only once is not actually a result.

Reproducibility and scientific rigor are often mentioned in the same breath. Rigorous research provides systematic and sufficient evidence for its contributions. For example, in an experimental paper, the experiments must be properly designed and the conclusions of the paper must be directly supported by the experimental findings. Rigor involves careful analysis, interpretation, and reporting of the research results. Attention to reproducibility can be considered a part of rigor.

When we commit ourselves to reproducible research, we also commit ourselves to making sure that the research community has what it needs to reproduce our work. This means releasing the data that we use, and also releasing implementations of our algorithms. Devoting time and effort to reproducible research is an important way in which we support Open Science, the movement to make research resources and research results openly accessible to society.

Repeatability vs. Replicability vs. Reproducibility

We frequently use the word “reproducibility” in an informal way that includes three individual concepts, which actually have distinct formal uses: “repeatability”, “replicability” and “reproducibility”. Again, we can turn to ACM for definitions [2]. All three concepts express the idea that research results must be invariant with respect to changes in the conditions under which they were obtained.

Specifically, “repeatability” means that the same research team can achieve the same result using the same setup and resources. “Replicability” means that the original team can pass the setup and resources to a different research team, and that the second team can also achieve the same result. “Reproducibility” (here, used in the formal sense) means that a different team can achieve the same result using a different setup and different resources. Note the connection to scientific rigor: obtaining the same result multiple times via a process that lacks rigor is meaningless.

When we write a research paper paying attention to reproducibility, it means that we are confident we would obtain the same results again within our own research team, that the paper includes a detailed description of how we achieved the result (and is accompanied by code or other resources), and that we are convinced that other researchers would reach the same conclusions using a comparable, but not identical, setup and resources.

Reproducibility at ACM Multimedia 2019

ACM Multimedia 2019 promotes reproducibility in two ways. First, as usual, reproducibility is one of the review criteria considered by the reviewers (https://www.acmmm.org/2019/reviewer-guidelines/). It is critical that authors describe their approach clearly and completely, and do not omit any details of their implementation or evaluation. Authors should release their data and also provide experimental results on publicly available data. Finally, we increasingly see authors including a link to their code or other resources associated with the paper. Releasing resources should be considered a best practice.

The second way that ACM Multimedia 2019 promotes reproducibility is the new Reproducibility Track. Full information is available on the ACM Multimedia Reproducibility website [3]. The purpose of the track is to ensure that authors receive recognition for the effort they have dedicated to making their research reproducible, and also to assign ACM badges to their papers. Next, we summarize the concept of ACM badges, then we will return to discuss the Reproducibility Track in more detail.

ACM Paper badging

Here, we provide a short summary of the information on badging available on the ACM website at [2]. ACM introduced a system of badges in order to help push forward the processes by which papers are reviewed. The goal is to move the attention given to reproducibility to a new level, beyond the level achieved during traditional reviews. Badges seek to motivate authors to use practices leading to better replicability, with the idea that replicability will in turn lead to reproducibility.

In order to understand the badge system, it is helpful to know that ACM badges are divided into two categories: “Artifacts Evaluated” and “Results Evaluated”. ACM defines artifacts as digital objects that are created for the purpose of, or as a result of, carrying out research. Artifacts include implementation code as well as scripts used to run experiments, analyze results, or generate plots. Critically, they also include the data sets that were used in the experiment. The different “Artifacts Evaluated” badges reflect the level of care that authors put into making the artifacts available, including how far they go beyond the minimal functionality necessary and how well the artifacts are documented.

There are two “Results Evaluated” badges. The “Results Replicated” badge results from a replicability review, and the “Results Reproduced” badge results from a full reproducibility review, in which the referees have succeeded in reproducing the results of the paper with only the descriptions of the authors, and without any of the authors’ artifacts. ACM Multimedia adopts the ACM idea that replicability leads to full reproducibility, and for this reason chooses to focus in its first year on the “Results Replicated” badge. Next we turn to a discussion of the ACM Multimedia 2019 Reproducibility Track and how it implements the “Results Replicated” badge.

Badging ACM MM 2019

Authors of main-conference papers appearing at ACM Multimedia 2018 or 2017 are eligible to make a submission to the Reproducibility Track of ACM Multimedia 2019. The submission has two components: An archive containing the resources needed to replicate the paper, and a short companion paper that contains a description of the experiments that were carried out in the original paper and implemented in the archive. The submissions undergo a formal reproducibility review, and submissions that pass receive a “Results Replicated” badge, which is added to the original paper in the ACM Digital Library. The companion paper appears in the proceedings of ACM Multimedia 2019 (also with a badge) and is presented at the conference as a poster.

ACM defines the badges, but the choice of which badges to award, and how to implement the review process that leads to the badge, is left to the individual conferences. The consequence is that the design and implementation of the ACM Multimedia Reproducibility Track requires a number of important decisions as well as careful implementation.

A key consideration when designing the ACM Multimedia Reproducibility Track was the work of the reproducibility reviewers. These reviewers carry out tasks that go beyond those of main-conference reviewers, since they must use the authors’ artifacts to replicate their results. The track is designed such that the reproducibility reviewers are deeply involved in the process. Because the companion paper is submitted a year after the original paper, reproducibility reviewers have plenty of time to dive into the code and work together with the authors. During this intensive process, the reviewers extend the originally submitted companion paper with a description of the review process and become authors on the final version of the companion paper.

The ACM Multimedia Reproducibility Track is expected to run similarly in years beyond 2019. The experience gained in 2019 will allow future years to tweak the process in small ways if it proves necessary, and also to expand to other ACM badges.

The visibility of badged papers is important for ACM Multimedia. Visibility incentivizes the authors who submit work to the conference to apply best practices in reproducibility. Practically, the visibility of badges also allows researchers to quickly identify work that they can build on. If a paper presenting new research results has a badge, researchers can immediately understand that this paper would be straightforward to use as a baseline, or that they can build confidently on the paper results without encountering ambiguities, technical issues, or other time-consuming frustrations.

The link between reproducibility and multimedia data sets

The link between Reproducibility and Multimedia Data Sets has been pointed out before, for example, in the theme chosen by the ACM Multimedia 2016 MMCommons workshop, “Datasets, Evaluation, and Reproducibility” [4]. One of the goals of this workshop was to discuss how data challenges and benchmarking tasks can catalyze the reproducibility of algorithms and methods.

Researchers who dedicate time and effort to creating and publishing data sets are making a valuable contribution to research. In order to compare the effectiveness of two algorithms, all other aspects of the evaluation must be controlled, including the data set that is used. Making data sets publicly available supports the systematic comparison of algorithms that is necessary to demonstrate that new algorithms are capable of outperforming the state of the art.

Considering the definitions of “replicability” and “reproducibility” introduced above, additional observations can be made about the importance of multimedia data sets. Creating and publishing data sets supports replicability. In order to replicate a research result, the same resources as used in the original experiments, including the data set, must be available to research teams beyond the one who originally carried out the research.

Creating and publishing data sets also supports reproducibility (in the formal sense of the word defined above). In order to reproduce research results, however, it is necessary that there is more than one data set available that is suitable for carrying out evaluation of a particular approach or algorithm. Critically, the definition of reproducibility involves using different resources than were used in the original work. As the multimedia community continues to move from replication to reproduction, it is essential that a large number of data sets are created and published, in order to ensure that multiple data sets are available to assess the reproducibility of research results.

Acknowledgements

Thank you to the people whose hard work is making reproducibility at ACM Multimedia happen: this includes the 2019 TPC Chairs, main-conference ACs and reviewers, as well as the Reproducibility reviewers. If you would like to volunteer to be a reproducibility committee member in this or future years, please contact the Reproducibility Chairs at MM19-Repro@sigmm.org.

[1] Simon, Gwendal. Reproducibility in ACM MMSys Conference. Blogpost, 9 May 2017 http://peerdal.blogspot.com/2017/05/reproducibility-in-acm-mmsys-conference.html Accessed 9 March 2019.

[2] ACM, Artifact Review and Badging, Reviewed April 2018,  https://www.acm.org/publications/policies/artifact-review-badging Accessed 9 March 2019.

[3] ACM MM Reproducibility: Information on Reproducibility at ACM Multimedia https://project.inria.fr/acmmmreproducibility/ Accessed 9 March 2019.

[4] Bart Thomee, Damian Borth, and Julia Bernd. 2016. Multimedia COMMONS Workshop 2016 (MMCommons’16): Datasets, Evaluation, and Reproducibility. In Proceedings of the 24th ACM international conference on Multimedia (MM ’16). ACM, New York, NY, USA, 1485-1486.

JPEG Column: 82nd JPEG Meeting in Lisbon, Portugal

The 82nd JPEG meeting was held in Lisbon, Portugal. Highlights of the meeting are progress on JPEG XL, JPEG XS, HTJ2K, JPEG Pleno, JPEG Systems and JPEG reference software.

JPEG has been the most common representation format of digital images for more than 25 years. Other image representation formats have been standardised by the JPEG committee, like JPEG 2000 or, more recently, JPEG XS. Furthermore, JPEG has been extended with new functionalities like HDR or alpha plane coding with the JPEG XT standard, and more recently with a reference software. Other solutions have also been proposed by different players, with limited success. The JPEG committee decided it is time to create a new work item, named JPEG XL, that aims to develop an image coding standard with increased quality and flexibility combined with better compression efficiency. The evaluation of the call for proposals responses had already confirmed the industry interest, and the development of core experiments has now begun. Several functionalities will be considered, like support for lossless transcoding of images represented with the legacy JPEG standard.

A 2nd workshop on media blockchain technologies was held in Lisbon, collocated with the JPEG meeting. Touradj Ebrahimi and Frederik Temmermans opened the workshop with presentations on relevant JPEG activities such as JPEG Privacy and Security. Thereafter, Zekeriya Erkin made a presentation on blockchain, distributed trust and privacy, and Carlos Serrão presented an overview of the ISO/TC 307 standardization work on blockchain and distributed ledger technologies. The workshop concluded with a panel discussion chaired by Fernando Pereira where the interoperability of blockchain and media technologies was discussed. A 3rd workshop is planned during the 83rd meeting to be held in Geneva, Switzerland on March 20th, 2019.

The 82nd JPEG meeting had the following highlights:

  • The new working item JPEG XL
  • JPEG Pleno
  • JPEG XS
  • HTJ2K
  • JPEG Systems – JUMBF & JPEG 360
  • JPEG reference software

 

The following summarizes various highlights during JPEG’s Lisbon meeting. As always, JPEG welcomes participation from industry and academia in all its standards activities.

JPEG XL

The JPEG Committee launched JPEG XL with the aim of developing a standard for image coding that offers substantially better compression efficiency when compared to existing image formats, along with features desirable for web distribution and efficient compression of high quality images. Subjective tests conducted by two independent research laboratories were presented at the 82nd meeting in Lisbon and indicate promising results that compare favorably with state of the art codecs.

Development software for the JPEG XL verification model is currently being implemented. A series of experiments has also been defined for improving the above model; these experiments address new functionalities such as lossless coding and progressive decoding.

JPEG Pleno

The JPEG Committee has three activities in JPEG Pleno: Light Field, Point Cloud, and Holographic image coding.

At the Lisbon meeting, Part 2 of JPEG Pleno Light Field was refined and a Committee Draft (CD) text was prepared. A new round of core experiments targets improved subaperture image prediction quality and scalability functionality.

JPEG Pleno Holography will be hosting a workshop on March 19th, 2019 during the 83rd JPEG meeting in Geneva. The purpose of this workshop is to provide insights into the status of holographic applications such as holographic microscopy and tomography, displays and printing, and to assess their impact on the planned standardization specification. This workshop invites participation from experts in both industry and academia. Information on the workshop can be found at https://jpeg.org/items/20190228_pleno_holography_workshop_geneva_announcement.html

JPEG XS

The JPEG Committee is pleased to announce a new milestone of the JPEG XS project, with the Profiles and Buffer Models (JPEG XS ISO/IEC 21122 Part 2) submitted to ISO for immediate publication as International Standard.

This project aims at the standardization of a visually lossless, low-latency and lightweight compression scheme that can be used as a mezzanine codec within any AV market. Among the targeted use cases are video transport over professional video links (SDI, IP, Ethernet), real-time video storage, memory buffers, omnidirectional video capture and rendering, and sensor compression (for example in cameras and in the automotive industry). The Core Coding System allows for visually lossless quality at moderate compression rates, scalable end-to-end latency ranging from less than a line to a few lines of the image, and low-complexity real-time implementations in ASIC, FPGA, CPU and GPU. The new part “Profiles and Buffer Models” defines different coding tool subsets addressing specific application fields and use cases. For more information, interested parties are invited to read the JPEG White paper on JPEG XS that has been recently published on the JPEG website (https://jpeg.org).

HTJ2K

The JPEG Committee continues its work on ISO/IEC 15444-15 High-Throughput JPEG 2000 (HTJ2K) with the development of conformance codestreams and reference software, improving interoperability and reducing obstacles to implementation.

The HTJ2K block coding algorithm has demonstrated an average tenfold increase in encoding and decoding throughput compared to the block coding algorithm currently defined by JPEG 2000 Part 1. This increase in throughput results in an average coding efficiency loss of 10% or less in comparison to the most efficient modes of the block coding algorithm in JPEG 2000 Part 1, and enables mathematically lossless transcoding to-and-from JPEG 2000 Part 1 codestreams.

JPEG Systems – JUMBF & JPEG 360

At the 82nd JPEG meeting, the Committee DIS ballots were completed, comments reviewed, and the standard progressed towards FDIS text for upcoming ballots on “JPEG Universal Metadata Box Format (JUMBF)” as ISO/IEC 19566-5, and “JPEG 360” as ISO/IEC 19566-6. Investigations continued to generalize the framework to other applications relying on JPEG (ISO/IEC 10918 | ITU-T.81), and JPEG Pleno Light Field.

JPEG reference software

With the JPEG reference software reaching FDIS stage, the JPEG Committee achieves an important milestone by extending its specifications with a new part containing reference software. With its FDIS release, two implementations will become official references for the most successful standard of the JPEG Committee: the fast and widely deployed libjpeg-turbo code, along with a complete implementation of JPEG coming from the Committee itself that also covers coding modes previously known only to a few experts.

 

Final Quote

“One of the strengths of the JPEG Committee has been its ability to identify important trends in imaging technologies and their impact on products and services. I am delighted to see that this effort still continues and the Committee remains attentive to the future.” said Prof. Touradj Ebrahimi, the Convenor of the JPEG Committee.

About JPEG

The Joint Photographic Experts Group (JPEG) is a Working Group of ISO/IEC, the International Organisation for Standardization / International Electrotechnical Commission, (ISO/IEC JTC 1/SC 29/WG 1) and of the International Telecommunication Union (ITU-T SG16), responsible for the popular JPEG, JPEG 2000, JPEG XR, JPSearch and more recently, the JPEG XT, JPEG XS, JPEG Systems and JPEG Pleno families of imaging standards.

The JPEG Committee nominally meets four times a year, in different world locations. The 82nd JPEG Meeting was held on 19-25 October 2018, in Lisbon, Portugal. The next 83rd JPEG Meeting will be held on 16-22 March 2019, in Geneva, Switzerland.

More information about JPEG and its work is available at www.jpeg.org or by contacting Antonio Pinheiro or Frederik Temmermans (pr@jpeg.org) of the JPEG Communication Subgroup.

If you would like to stay posted on JPEG activities, please subscribe to the jpeg-news mailing list on http://jpeg-news-list.jpeg.org.  

Future JPEG meetings are planned as follows:

  • No 83, Geneva, Switzerland, March 16 to 22, 2019
  • No 84, Brussels, Belgium, July 13 to 19, 2019

 

MPEG Column: 125th MPEG Meeting in Marrakesh, Morocco

The original blog post can be found at the Bitmovin Techblog and has been modified/updated here to focus on and highlight research aspects.

The 125th MPEG meeting concluded on January 18, 2019 in Marrakesh, Morocco with the following topics:

  • Network-Based Media Processing (NBMP) – MPEG promotes NBMP to Committee Draft stage
  • 3DoF+ Visual – MPEG issues Call for Proposals on Immersive 3DoF+ Video Coding Technology
  • MPEG-5 Essential Video Coding (EVC) – MPEG starts work on MPEG-5 Essential Video Coding
  • ISOBMFF – MPEG issues Final Draft International Standard of Conformance and Reference software for formats based on the ISO Base Media File Format (ISOBMFF)
  • MPEG-21 User Description – MPEG finalizes 2nd edition of the MPEG-21 User Description

The corresponding press release of the 125th MPEG meeting can be found here. In this blog post I’d like to focus on those topics potentially relevant for over-the-top (OTT), namely NBMP, EVC, and ISOBMFF.

Network-Based Media Processing (NBMP)

The NBMP standard addresses the increasing complexity and sophistication of media services, specifically as the media processing involved requires offloading complex operations to the cloud/network in order to keep receiver hardware simple and power consumption low. Therefore, the NBMP standard provides a standardized framework that allows content and service providers to describe, deploy, and control media processing for their content in the cloud. It comes with two main functions: (i) an abstraction layer to be deployed on top of existing cloud platforms (plus support for 5G core and edge computing) and (ii) a workflow manager to enable the composition of multiple media processing tasks (i.e., to process incoming media and metadata from a media source and produce processed media streams and metadata that are ready for distribution to a media sink). The NBMP standard has now reached Committee Draft (CD) stage and the final milestone is targeted for early 2020.
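
To make the source–tasks–sink idea more concrete, below is a purely illustrative Python sketch of a workflow description and a toy "workflow manager" that chains the tasks together. The field names, task names and URIs are assumptions for illustration only and do not reflect the actual NBMP workflow description schema.

```python
# Illustrative sketch only: not the NBMP schema, just the conceptual shape of
# a workflow (media source -> chain of cloud processing tasks -> media sink).
workflow = {
    "source": {"protocol": "rtmp", "uri": "rtmp://ingest.example.com/live"},        # hypothetical
    "tasks": [
        {"name": "transcode", "parameters": {"codec": "hevc", "bitrates_kbps": [800, 2400, 6000]}},
        {"name": "thumbnail_extraction", "parameters": {"interval_s": 10}},
        {"name": "packaging", "parameters": {"format": "dash"}},
    ],
    "sink": {"protocol": "https", "uri": "https://cdn.example.com/live/manifest.mpd"},  # hypothetical
}

def run_workflow(wf):
    """Toy workflow manager: hands media from the source through each task to the sink."""
    media = f"stream from {wf['source']['uri']}"
    for task in wf["tasks"]:
        # Each task consumes the media/metadata produced by the previous one.
        media = f"{task['name']}({media})"
    print(f"deliver {media} to {wf['sink']['uri']}")

run_workflow(workflow)
```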

In particular, a standard like NBMP might become handy in the context of 5G in combination with mobile edge computing (MEC), which allows offloading certain tasks to a cloud environment in close proximity to the end user. For OTT, this could enable lower latency and more content being personalized towards the user’s context conditions and needs, hopefully leading to better quality and user experience.

For further research aspects, please see one of my previous posts.

MPEG-5 Essential Video Coding (EVC)

MPEG-5 EVC clearly targets the high demand for efficient and cost-effective video coding technologies. Therefore, MPEG commenced work on such a new video coding standard that should have two profiles: (i) royalty-free baseline profile and (ii) main profile, which adds a small number of additional tools, each of which is capable, on an individual basis, of being either cleanly switched off or else switched over to the corresponding baseline tool. Timely publication of licensing terms (if any) is obviously very important for the success of such a standard.

The target coding efficiency for responses to the call for proposals was to be at least as efficient as HEVC. This target was exceeded by approximately 24% and the development of the MPEG-5 EVC standard is expected to be completed in 2020.

As of today, there’s the need to support AVC, HEVC, VP9, and AV1; soon VVC will become important. In other words, we already have a multi-codec environment to support and one might argue one more codec is probably not a big issue. The main benefit of EVC will be a royalty-free baseline profile but with AV1 there’s already such a codec available and it will be interesting to see how the royalty-free baseline profile of EVC compares to AV1.

For a new video coding format we will witness a plethora of evaluations and comparisons with existing formats (i.e., AVC, HEVC, VP9, AV1, VVC). These evaluations will be mainly based on objective metrics such as PSNR, SSIM, and VMAF. It will also be interesting to see subjective evaluations, specifically targeting OTT use cases (e.g., live and on demand).
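
As an illustration of one of these objective metrics, the sketch below computes PSNR between a reference frame and a distorted frame; the synthetic frames and the noise level are invented stand-ins for decoded video, not results from any actual codec comparison.

```python
import numpy as np

def psnr(reference, distorted, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two frames (higher is better)."""
    mse = np.mean((reference.astype(np.float64) - distorted.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical frames
    return 10.0 * np.log10(max_value ** 2 / mse)

# Toy example with synthetic 8-bit frames standing in for decoded video frames.
rng = np.random.default_rng(0)
ref = rng.integers(0, 256, size=(1080, 1920), dtype=np.uint8)
noise = rng.integers(-3, 4, size=ref.shape)                      # small coding-error stand-in
dist = np.clip(ref.astype(np.int16) + noise, 0, 255).astype(np.uint8)
print(f"PSNR: {psnr(ref, dist):.2f} dB")
```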

ISO Base Media File Format (ISOBMFF)

The ISOBMFF (ISO/IEC 14496-12) is used as basis for many file (e.g., MP4) and streaming formats (e.g., DASH, CMAF) and as such received widespread adoption in both industry and academia. An overview of ISOBMFF is available here. The reference software is now available on GitHub and a plethora of conformance files are available here. In this context, the open source project GPAC is probably the most interesting aspect from a research point of view.

SISAP 2018: 11th International Conference on Similarity Search and Applications

The International Conference on Similarity Search and Applications (SISAP) is an annual forum for researchers and application developers in the area of similarity data management. It aims at the technological problems shared by numerous application domains, such as data mining, information retrieval, multimedia, computer vision, pattern recognition, computational biology, geography, biometrics, machine learning, and many others that make use of similarity search as a necessary supporting service.

From its roots as a regional workshop in metric indexing, SISAP has expanded to become the only international conference entirely devoted to the issues surrounding the theory, design, analysis, practice, and application of content-based and feature-based similarity search. The SISAP initiative has also created a repository serving the similarity search community, for the exchange of examples of real-world applications, the source code for similarity indexes, and experimental testbeds and benchmark data sets (http://www.sisap.org). The proceedings of SISAP are published by Springer as a volume in the Lecture Notes in Computer Science (LNCS) series.

The 2018 edition of SISAP was held at the Universidad de Ingeniería y Tecnología (UTEC) in one of the oldest neighborhoods of Lima, in a modern building just recently inaugurated. The conference was held back-to-back (with a shared session) with the International Symposium on String Processing and Information Retrieval (SPIRE), an independent symposium with some overlap with SISAP. The organization was smooth, with a strong technical program assembled by two co-chairs and sixty program committee members. Each paper was reviewed by at least three referees. The program was completed by three invited speakers of high caliber.

During this 11th edition of SISAP, the first invited speaker was Hanan Samet (http://www.cs.umd.edu/~hjs/) from the University of Maryland, a pioneer in the similarity search field, with several books published on the subject. Professor Samet presented a state-of-the-art system for news search that uses the geographical location of the user to deliver more accurate results. The second invited speaker was Alistair Moffat (https://people.eng.unimelb.edu.au/ammoffat/) from the University of Melbourne, who delivered a talk about a novel technique for building compressed indexes using Asymmetric Numeral Systems (ANS). ANS is a curious case of a scientific breakthrough not published in a peer-reviewed journal: although it is available only as an arXiv technical report, its adoption in industry has been widespread, from Google and Facebook to Amazon. The third keynote talk was delivered in the shared session with SPIRE by Moshe Vardi (https://www.cs.rice.edu/~vardi/) of Rice University, a most celebrated editor of Communications of the ACM. Professor Vardi’s talk was an eye-opening discussion of jobs conquered by machines and the perspectives in accepting technological changes in everyday life. In the same shared session, a keynote presentation of SPIRE was given by Nataša Przulj (http://www0.cs.ucl.ac.uk/staff/natasa/) of University College London, concerning molecular networks and the challenges researchers face in developing a better understanding of them. It is worth noting that roughly 10% of the SPIRE participants were inspired to attend the SISAP technical program.

As is usually the case, SISAP 2018 included a program with papers exploring various similarity-aware data analysis and processing problems from multiple perspectives. The papers presented at the conference in 2018 studied the role of similarity processing in the context of metric search, visual search, nearest neighbor queries, clustering, outlier detection, and graph analysis. Some of the papers had a theoretical emphasis, while others had a systems perspective, presenting experimental evaluations comparing against state-of-the-art methods. An interesting event at the 2018 conference, as at the two previous editions, was a poster session that included all accepted papers. This component of the conference generated many lively interactions between presenters and attendees, allowing them not only to learn more about the presented techniques but also to identify potential topics for future collaboration.

A shortlist for the Best Paper Award was created from those conference papers nominated by at least one of their three reviewers. An award committee of three researchers ranked the shortlisted papers, and the final ranking was decided using a Borda count. The Best Paper Award was presented during the Conference Dinner. In a tradition that began with the 2009 conference in Prague, extended versions of the top-ranked papers were invited for a Special Issue of the Information Systems journal.
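
For readers unfamiliar with the aggregation step, here is a minimal sketch of a Borda count in Python; the paper identifiers and the committee rankings are invented purely for illustration and have no relation to the actual SISAP 2018 shortlist.

```python
from collections import defaultdict

def borda_count(rankings):
    """Aggregate per-member rankings into one order via Borda count.

    rankings: list of lists, each an ordered ranking (best first) over the same
    set of papers. A paper at position i of an n-paper ranking earns n - 1 - i
    points; the papers are returned sorted by total points, highest first.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, paper in enumerate(ranking):
            scores[paper] += n - 1 - position
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical example: three committee members rank four shortlisted papers.
committee_rankings = [
    ["paper_A", "paper_B", "paper_C", "paper_D"],
    ["paper_B", "paper_A", "paper_D", "paper_C"],
    ["paper_A", "paper_C", "paper_B", "paper_D"],
]
print(borda_count(committee_rankings))  # ['paper_A', 'paper_B', 'paper_C', 'paper_D']
```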

The venue and the location of SISAP 2018 deserve a special mention. In addition to the excellent conference facilities at UTEC, we had many student volunteers who were ready to help ensure that the logistical aspects of the conference ran smoothly. Lima was a superb location for the conference. Our conference dinner was held at the Huaca Pucllana Restaurant, located on the site of amazing archaeological remains within the city itself. We also had many opportunities to enjoy excellently-prepared traditional Peruvian food and drink. Before and after the conference, many participants chose to visit Machu Picchu, voted as one of the New Seven Wonders of the World.

SISAP 2018 demonstrated that the SISAP community has a strong, stable kernel of researchers, active in the field of similarity search and committed to fostering the growth of the community. Organizing SISAP is a smooth experience thanks to the support of the Steering Committee and dedicated participants.

SISAP 2019 will be organized in Newark (NJ, USA) by Professor Vincent Oria (NJIT). This attractive location in the New York City metropolitan area will allow for easy and convenient travel to and from the conference. One of the major challenges of the SISAP conference series is to continue to raise its profile in the landscape of scientific events related to information indexing, database and search systems.

Figure 1. The conference dinner at Pachacamac ruins

Figure 2. After the very interesting technical sessions, we ended the conference with an excursion to Lima downtown

Figure 3. Keynote by Vardi

MPEG Column: 124th MPEG Meeting in Macau, China

The original blog post can be found at the Bitmovin Techblog and has been modified/updated here to focus on and highlight research aspects.

The MPEG press release comprises the following aspects:

  • Point Cloud Compression – MPEG promotes a video-based point cloud compression technology to the Committee Draft stage
  • Compressed Representation of Neural Networks – MPEG issues Call for Proposals
  • Low Complexity Video Coding Enhancements – MPEG issues Call for Proposals
  • New Video Coding Standard expected to have licensing terms timely available – MPEG issues Call for Proposals
  • Multi-Image Application Format (MIAF) promoted to Final Draft International Standard
  • 3DoF+ Draft Call for Proposal goes Public

Point Cloud Compression – MPEG promotes a video-based point cloud compression technology to the Committee Draft stage

At its 124th meeting, MPEG promoted its Video-based Point Cloud Compression (V-PCC) standard to Committee Draft (CD) stage. V-PCC addresses lossless and lossy coding of 3D point clouds with associated attributes such as colour. By leveraging existing video ecosystems in general (hardware acceleration, transmission services and infrastructure), as well as current and future video codecs, the V-PCC technology enables new applications. The current V-PCC encoder implementation provides a compression ratio of 125:1, which means that a dynamic point cloud of 1 million points could be encoded at 8 Mbit/s with good perceptual quality.
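
As a rough back-of-the-envelope reading of how the two quoted figures relate, the sketch below derives the implied bit budget per point. The 30 fps frame rate is an assumption for illustration only; the press release quotes just the 125:1 ratio and the 8 Mbit/s figure.

```python
# Sanity check of the quoted V-PCC figures (frame rate is an assumed value).
points_per_frame = 1_000_000
frames_per_second = 30          # assumption, not stated in the press release
compressed_rate = 8e6           # 8 Mbit/s, as quoted
compression_ratio = 125         # 125:1, as quoted

compressed_bits_per_point = compressed_rate / (frames_per_second * points_per_frame)
uncompressed_bits_per_point = compressed_bits_per_point * compression_ratio

print(f"~{compressed_bits_per_point:.2f} bits per point compressed")      # ~0.27
print(f"~{uncompressed_bits_per_point:.1f} bits per point uncompressed")  # ~33.3
```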

A next step is the storage of V-PCC in ISOBMFF for which a working draft has been produced. It is expected that further details will be discussed in upcoming reports.

Research aspects: Video-based Point Cloud Compression (V-PCC) is at CD stage and a first working draft for the storage of V-PCC in ISOBMFF has been provided. Thus, a next consequence is the delivery of V-PCC encapsulated in ISOBMFF over networks utilizing various approaches, protocols, and tools. Additionally, one may think of using also different encapsulation formats if needed.

MPEG issues Call for Proposals on Compressed Representation of Neural Networks

Artificial neural networks have been adopted for a broad range of tasks in multimedia analysis and processing, media coding, data analytics, and many other fields. Their recent success is based on the feasibility of processing much larger and more complex neural networks (deep neural networks, DNNs) than in the past, and on the availability of large-scale training data sets. Some applications require the deployment of a particular trained network instance to a potentially large number of devices and, thus, could benefit from a standard for the compressed representation of neural networks. Therefore, MPEG has issued a Call for Proposals (CfP) for compression technology for neural networks, focusing on the compression of parameters and weights and on four use cases: (i) visual object classification, (ii) audio classification, (iii) visual feature extraction (as used in MPEG CDVA), and (iv) video coding.

Research aspects: As pointed out last time, research here will mainly focus on compression efficiency for both lossy and lossless scenarios. Additionally, communication aspects, such as the transmission of compressed artificial neural networks within lossy, large-scale environments including update mechanisms, may become relevant in the (near) future.

 

MPEG issues Call for Proposals on Low Complexity Video Coding Enhancements

Upon request from the industry, MPEG has identified an area of interest in which video technology deployed in the market (e.g., AVC, HEVC) can be enhanced in terms of video quality without the need to necessarily replace existing hardware. Therefore, MPEG has issued a Call for Proposals (CfP) on Low Complexity Video Coding Enhancements.

The objective is to develop video coding technology with a data stream structure defined by two component streams: a base stream decodable by a hardware decoder and an enhancement stream suitable for software processing implementation. The project is meant to be codec agnostic; in other words, the base encoder and base decoder can be AVC, HEVC, or any other codec in the market.
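
To illustrate the two-stream structure conceptually, here is a toy Python sketch. The functions and "frames" are placeholders, not a real codec API; it merely shows how a legacy base decode could be combined with a software enhancement layer, and how a legacy device can simply ignore the enhancement stream.

```python
# Toy, purely conceptual sketch of a base stream plus enhancement stream decode.
def base_decode(base_au):
    return base_au                      # stands in for an AVC/HEVC (hardware) decode

def enhancement_decode(enh_au):
    return enh_au                       # stands in for the software enhancement decode

def decode(base_au, enh_au=None):
    base_frame = base_decode(base_au)
    if enh_au is None:
        return base_frame               # legacy path: enhancement stream ignored
    residual = enhancement_decode(enh_au)
    # Software-side refinement of the base frame (here: simple residual addition).
    return [b + r for b, r in zip(base_frame, residual)]

print(decode([100, 102, 98]))               # base quality only
print(decode([100, 102, 98], [-1, 2, 0]))   # base + enhancement
```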

Research aspects: The interesting aspect here is that this use case assumes a legacy base decoder – most likely realized in hardware – which is enhanced with software-based implementations to improve coding efficiency and/or quality, without the software-based solution sacrificing the end user’s capabilities in terms of complexity and, thus, energy efficiency.

 

MPEG issues Call for Proposals for a New Video Coding Standard expected to have licensing terms timely available

At its 124th meeting, MPEG issued a Call for Proposals (CfP) for a new video coding standard to address combinations of both technical and application (i.e., business) requirements that may not be adequately met by existing standards. The aim is to provide a standardized video compression solution which combines coding efficiency similar to that of HEVC with a level of complexity suitable for real-time encoding/decoding and the timely availability of licensing terms.

Research aspects: This new work item is more related to business aspects (i.e., licensing terms) than technical aspects of video coding.

 

Multi-Image Application Format (MIAF) promoted to Final Draft International Standard

The Multi-Image Application Format (MIAF) defines interoperability points for creation, reading, parsing, and decoding of images embedded in the High Efficiency Image File (HEIF) format by (i) only defining additional constraints on the HEIF format, (ii) limiting the supported encoding types to a set of specific profiles and levels, (iii) requiring specific metadata formats, and (iv) defining a set of brands for signaling such constraints, including specific depth map and alpha plane formats. For instance, it addresses use cases where a capturing device uses one of the HEIF codecs with a specific HEVC profile and level in the HEIF files it creates, while a playback device is only capable of decoding AVC bitstreams.

Research aspects: MIAF is an application format which is defined as a combination of tools (incl. profiles and levels) of other standards (e.g., audio codecs, video codecs, systems) to address the needs of a specific application. Thus, the research is related to use cases enabled by this application format. 

 

3DoF+ Draft Call for Proposal goes Public

Following investigations on the coding of “three Degrees of Freedom plus” (3DoF+) content in the context of MPEG-I, the MPEG video subgroup has provided evidence demonstrating the capability to encode 3DoF+ content efficiently while maintaining compatibility with legacy HEVC hardware. As a result, MPEG decided to issue a draft Call for Proposals (CfP) to the public containing the information necessary to prepare for the final Call for Proposals, expected to occur at the 125th MPEG meeting (January 2019) with responses due at the 126th MPEG meeting (March 2019).

Research aspects: This work item is about video (coding) and, thus, research is about compression efficiency.

 

What else happened at #MPEG124?

  • MPEG-DASH 3rd edition is still in the final editing phase and not yet available. Last time, I wrote that we expect final publication later this year or early next year and we hope this is still the case. At this meeting, Amendment 5 progressed to DAM and the conformance/reference software for SRD, SAND and Server Push was also promoted to DAM. In other words, DASH is pretty much in maintenance mode.
  • MPEG-I (systems part) is working on immersive media access and delivery and I guess more updates will come on this after the next meeting. OMAF is working on a 2nd edition for which a working draft exists and phase 2 use cases (public document) and draft requirements are discussed.
  • Versatile Video Coding (VVC): working draft 3 (WD3) and test model 3 (VTM3) have been issued at this meeting, including a large number of new tools. Both documents (and the software) will be publicly available after editing periods (Nov. 23 for WD3 and Dec. 14 for VTM3).

 

JPEG Column: 81st JPEG Meeting in Vancouver, Canada

The 81st JPEG meeting was held in Vancouver, British Columbia, Canada, at which significant efforts were put into the analysis of the responses to the call for proposals on the next generation image coding standard, nicknamed JPEG XL, which is expected to provide an image format with improved quality and flexibility, allied with better compression efficiency. The responses to the call confirm the interest of different parties in this activity. Moreover, the initial subjective and objective evaluation of the different proposals confirms a significant evolution in both quality and compression efficiency that will be provided by the future standard.

Apart from the multiple activities related to the development of several standards, a workshop on blockchain technologies was held at the Telus facilities in Vancouver, with several talks on Blockchain and Distributed Ledger Technologies, and a panel where the influence of these technologies on multimedia was analysed and discussed. A new workshop is planned at the 82nd JPEG meeting to be held in Lisbon, Portugal, in January 2019.

The 81st JPEG meeting had the following highlights:

  • JPEG Completes Initial Assessment on Responses for the Next Generation Image Coding Standard (JPEG XL);
  • Workshop on Blockchain technology;
  • JPEG XS Core Coding System submitted to ISO for immediate publication as International Standard;
  • HTJ2K achieves Draft International Standard status;
  • JPEG Pleno defines a generic file format syntax architecture.

The following summarizes various highlights during JPEG’s Vancouver meeting.

JPEG XL completes the initial assessment of responses to the call for proposals

 The JPEG Committee launched the Next Generation Image Coding activity, also referred to as JPEG XL, with the aim of developing a standard for image coding that offers substantially better compression efficiency than existing image formats, along with features desirable for web distribution and efficient compression of high quality images. A Call for Proposals on Next Generation Image Coding was issued at the 79th JPEG meeting.

Seven submissions were received in response to the Call for Proposals. The submissions, along with the anchors, were evaluated in subjective tests by three independent research labs. At the 81st JPEG meeting in Vancouver, Canada, the proposals were evaluated using subjective and objective evaluation metrics, and a verification model (XLM) was agreed upon. Following this selection process, a series of experiments have been designed in order to compare the performance of the current XLM with alternative choices as coding components including those technologies submitted by some of the top performing submissions; these experiments are commonly referred to as core experiments and will serve to further refine and improve the XLM towards the final standard. 

Workshop on Blockchain technology

On October 16th, 2018, JPEG organized its first workshop on Media Blockchain in Vancouver. Touradj Ebrahimi, JPEG Convenor, and Frederik Temmermans, a leading JPEG expert, presented on the background of the JPEG standardization committee and ongoing JPEG activities such as JPEG Privacy and Security. Thereafter, Eric Paquet, Victoria Lemieux and Stephen Swift shared their experiences related to blockchain technology, focusing on standardization challenges and formalization, real-world adoption in media use cases, and the state of the art related to consensus models. The workshop closed with an interactive discussion between the speakers and the audience, moderated by JPEG Requirements Chair Fernando Pereira.

The presentations from the workshop are available for download on the JPEG website. In January 2019, during the 82nd JPEG meeting in Lisbon, Portugal, a 2nd workshop will be organized to continue the discussion and interact with European stakeholders. More information about the program and registration will be made available on jpeg.org.

In addition to the workshop, JPEG issued an updated version of its white paper “JPEG White paper: Towards a Standardized Framework for Media Blockchain and Distributed Ledger Technologies” that elaborates on the blockchain initiative, exploring relevant standardization activities, industrial needs and use cases. The white paper will be further extended in the future with more elaborated use cases and conclusions drawn from the workshops. To keep informed and get involved in the discussion, interested parties are invited to register to the ad hoc group’s mailing list via http://jpeg-blockchain-list.jpeg.org.


Touradj Ebrahimi, convenor of JPEG, giving the introductory talk in the Workshop on Blockchain technology.


JPEG XS

The JPEG committee is pleased to announce a significant milestone of the JPEG XS project, with the Core Coding System (aka JPEG XS Part 1) submitted to ISO for immediate publication as an International Standard. This project aims at the standardization of a near-lossless, low-latency and lightweight compression scheme that can be used as a mezzanine codec within any AV market. Among the targeted use cases are video transport over professional video links (SDI, IP, Ethernet), real-time video storage, memory buffers, omnidirectional video capture and rendering, and sensor compression (for example in cameras and in the automotive industry). The Core Coding System allows for visually transparent quality at moderate compression rates, scalable end-to-end latency ranging from less than a line to a few lines of the image, and low-complexity real-time implementations in ASIC, FPGA, CPU and GPU. Besides the Core Coding System, the Profiles and levels (addressing specific application fields and use cases), together with the transport and container formats (defining different means to store and transport JPEG XS codestreams in files, over IP networks or SDI infrastructures), are also being finalized, and their expected submission for publication as an International Standard is Q1 2019.

HTJ2K

The JPEG Committee has reached a major milestone in the development of an alternative block coding algorithm for the JPEG 2000 family of standards, with ISO/IEC 15444-15 High Throughput JPEG 2000 (HTJ2K) achieving Draft International Standard (DIS) status.

The HTJ2K algorithm has demonstrated an average tenfold increase in encoding and decoding throughput compared to the algorithm currently defined by JPEG 2000 Part 1. This increase in throughput results in an average coding efficiency loss of 10% or less in comparison to the most efficient modes of the block coding algorithm in JPEG 2000 Part 1, and enables mathematically lossless transcoding to and from JPEG 2000 Part 1 codestreams.

The JPEG Committee has begun the development of HTJ2K conformance codestreams and reference software.

JPEG Pleno

The JPEG Committee is currently pursuing three activities in the framework of the JPEG Pleno Standardization: Light Field, Point Cloud and Holographic content coding.

At the Vancouver meeting, a generic file format syntax architecture was outlined that allows for efficient exchange of these modalities by utilizing a box-based file format. This format will enable the carriage of light field, point cloud and holography data, including associated metadata for colour space specification, camera calibration etc. In the particular case of light field data, this will encompass both texture and disparity information.
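
As a rough illustration of what a box-based file format implies for implementers, the sketch below walks a byte stream organised as boxes, assuming the common ISO convention of a 4-byte big-endian box length followed by a 4-character type code. The actual JPEG Pleno box syntax and type codes are defined by the standard itself; the 'meta' box in the example is purely hypothetical.

```python
# Minimal sketch of iterating over a box-structured byte stream, assuming the
# common ISO convention: 4-byte big-endian length followed by a 4-character
# type code. The actual JPEG Pleno box syntax is defined by the standard.
import struct
from typing import Iterator, Tuple

def iter_boxes(data: bytes) -> Iterator[Tuple[str, bytes]]:
    """Yield (box_type, payload) pairs from a box-based byte sequence."""
    offset = 0
    while offset + 8 <= len(data):
        (length,) = struct.unpack_from(">I", data, offset)
        box_type = data[offset + 4:offset + 8].decode("ascii", errors="replace")
        if length < 8 or offset + length > len(data):
            break  # malformed or truncated box
        yield box_type, data[offset + 8:offset + length]
        offset += length

# Hypothetical example: a single metadata box of (invented) type 'meta'.
payload = b"camera calibration placeholder"
sample = struct.pack(">I", 8 + len(payload)) + b"meta" + payload
for box_type, body in iter_boxes(sample):
    print(box_type, len(body), "bytes")
```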

For coding of point clouds and holographic data, activities are still in an exploratory phase, addressing the elaboration of use cases and the refinement of requirements for coding such modalities. In addition, experimental procedures are being designed to facilitate the quality evaluation and testing of technologies that will be submitted in later calls for coding technologies. Interested parties active in point-cloud- and holography-related markets and applications, from both industry and academia, are welcome to participate in this standardization activity.

Final Quote

“The JPEG XL standard will enable higher quality content while improving on compression efficiency and offering new features useful for emerging multimedia applications,” said Prof. Touradj Ebrahimi, the Convenor of the JPEG Committee.

About JPEG

The Joint Photographic Experts Group (JPEG) is a Working Group of ISO/IEC, the International Organisation for Standardization / International Electrotechnical Commission, (ISO/IEC JTC 1/SC 29/WG 1) and of the International Telecommunication Union (ITU-T SG16), responsible for the popular JPEG, JPEG 2000, JPEG XR, JPSearch and more recently, the JPEG XT, JPEG XS, JPEG Systems and JPEG Pleno families of imaging standards.  

The JPEG Committee nominally meets four times a year, in different world locations. The 81st JPEG Meeting was held on 12-19 October 2018, in Vancouver, Canada. The next 82nd JPEG Meeting will be held on 19-25 January 2019, in Lisbon, Portugal.

More information about JPEG and its work is available at www.jpeg.org or by contacting Antonio Pinheiro or Frederik Temmermans (pr@jpeg.org) of the JPEG Communication Subgroup.

If you would like to stay posted on JPEG activities, please subscribe to the jpeg-news mailing list on http://jpeg-news-list.jpeg.org.  

Future JPEG meetings are planned as follows:

  • No 82, Lisbon, Portugal, January 19 to 25, 2019
  • No 83, Geneva, Switzerland, March 16 to 22, 2019
  • No 84, Brussels, Belgium, July 13 to 19, 2019

 

Towards an Integrated View on QoE and UX: Adding the Eudaimonic Dimension

In the past, research on Quality of Experience (QoE) has frequently been limited to networked multimedia applications, such as the transmission of speech, audio and video signals. In parallel, usability and User Experience (UX) research addressed human-machine interaction systems which focus on either the functional (pragmatic) or the aesthetic (hedonic) aspects of the user’s experience. In both the QoE and UX domains, the context of use (mental, social, physical, societal, etc.) has mostly been considered as a control factor, in order to guarantee the functionality of the service or the ecological validity of the evaluation. This situation changes when systems are considered which explicitly integrate the usage environment and context they are used in, such as Cyber-Physical Systems (CPS), used e.g. in smart home or smart workplace scenarios. Such systems are equipped with sensors and actuators which are able to sample and manipulate the environment they are integrated into, and thus the interaction with them is to some extent mediated through the environment; e.g. the environment can react to a user entering a room. In addition, such systems are used for applications which differ from standard multimedia communication in the sense that they are frequently used over a long or repeating period(s) of time, and/or in a professional use scenario. In such application scenarios the motivation for system usage can be divided between the actual system user and a third party (e.g. the employer), resulting in differing factors affecting the related experiences (in comparison to services which are used on the user’s own account). However, the impact of this duality of usage motivation on the resulting QoE or UX has rarely been addressed in existing research in either scientific community.

In the context of QoE research, the European Network on Quality of Experience in Multimedia Systems and Services, Qualinet (COST Action IC 1003), as well as a number of Dagstuhl seminars [see note from the editors], started a scientific discussion about the definition of the term QoE and related concepts around 2011. This discussion resulted in a White Paper which defines QoE as “the degree of delight or annoyance of the user of an application or service. It results from the fulfillment of his or her expectations with respect to the utility and/or enjoyment of the application or service in the light of the user’s personality and current state.” [White Paper 2012]. Besides this definition, the white paper describes a number of factors that influence a user’s QoE perception, e.g. human, system and contextual factors. Although this discussion lists a large set of influencing factors quite thoroughly, it still focuses on rather short-term (or episodic) and media-related hedonic experiences. A first step towards integrating an additional (quality) dimension beyond the hedonic one has been described in [Hammer et al. 2018], where the authors introduced the eudaimonic perspective, understood as the user’s overall well-being as a result of system usage. The term “eudaimonic” stems from Aristotle and is commonly used to designate a deeper degree of well-being resulting from self-fulfillment through developing one’s own strengths.

UX research, on the other hand, has historically evolved from usability research (which for a long time focused on enhancing the efficiency and effectiveness of the system) and was initially concerned with the prevention of negative emotions related to technology use. Pragmatic aspects of the analyzed ICT systems have been identified in usability research as an important contributor to such prevention. However, the shift towards a modern understanding of UX focuses on understanding human-machine interaction as a specific emotional experience (e.g., pleasure) and considers pragmatic aspects only as enablers of positive experiences, not as contributors to them. In line with this understanding, the concept of Positive or Hedonic Psychology, as introduced by [Kahnemann 1999], has been embedded and adopted in HCI and UX research. As a result, the related research community has mainly focused on the hedonic aspects of experiences, as described in [Diefenbach 2014] and as critically outlined by [Mekler 2016], in which the authors argue that this concentration on hedonic aspects has overshadowed the importance of eudaimonic aspects of well-being as described in positive psychology. With respect to the measurement of user experiences, the turn towards hedonic psychology also brings the need to measure emotional responses (or experiential qualities). In contrast to the majority of QoE research, where the focus is on measuring the (single) experienced (media) quality of a multimedia system, the measurement of experiential qualities in UX calls for the measurement of a range of qualities (e.g. [Bargas-Avila 2011] lists affect, emotion, fun, aesthetics, hedonic and flow as qualities that are assessed in the context of UX). Hence, this measurement approach considers a considerably broader range of quantified qualities. However, the development of the UX domain towards design-based UX research, which steers away from quantitatively measurable qualities and focuses more on a qualitative research approach (that does not generate measurable numbers), has marginalized this measurement- or model-based UX research camp in recent UX developments, as noted by [Law 2014].

While existing work in QoE mainly focuses on hedonic aspects (and, in UX, also on pragmatic ones), eudaimonic aspects such as the development of one’s own strengths have not been considered extensively so far in either research area. Especially in the usage context of professional applications, the meaningfulness of system usage (which is strongly related to eudaimonic aspects) and the growth of the user’s capabilities will certainly influence the resulting experiential quality(ies). In particular, professional applications must be designed such that the user continues to use the system in the long run without frustration, i.e. they must achieve long-term acceptance for applications which the user is required to use by the employer. In order to consider these aspects, the so-called “HEP cube” has been introduced in [Hammer et al. 2018]. It opens a 3-dimensional space of hedonic (H), eudaimonic (E) and pragmatic (P) aspects of QoE and UX, which are integrated towards a Quality of User Experience (QUX) concept.

Whereas a concise definition of QUX has not yet been established in this context, a number of QUX-related aspects, e.g. utility (P), joy-of-use (H) and meaningfulness (E), have been integrated into a multidimensional HEP construct, displayed in Figure 1. In addition to the well-known hedonic and pragmatic aspects of UX, it incorporates the eudaimonic dimension. It shows the assumed relationships between the aforementioned aspects of User Experience and QoE, and in addition usefulness and motivation (which is strongly related to the eudaimonic dimension). These aspects are triggered by user needs (first layer) and moderated by the respective dimension aspects: joy-of-use (hedonic), ease-of-use (pragmatic) and purpose-of-use (eudaimonic). The authors expect that a consideration of these additional needs and QUX aspects, and their incorporation into application design, will lead not only to higher acceptance rates but also to deeply grounded well-being of users. Furthermore, incorporating these aspects into QoE and/or QUX modelling will improve the respective prediction performance and ecological validity.


Figure 1: QUX as a multidimensional construct involving HEP attributes, existing QoE/UX, need fulfillment and motivation. Picture taken from Hammer, F., Egger-Lampl, S., Möller, S.: Quality-of-User-Experience: A Position Paper, Quality and User Experience, Springer (2018).
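
As a purely illustrative aside (not part of the position paper), the sketch below shows one way ratings along the three HEP dimensions could be represented and aggregated in code; the weighted mean is only a placeholder, since no QUX aggregation function has been defined by the authors.

```python
# Purely illustrative sketch of representing ratings on the three HEP dimensions.
# The weighted mean below is a placeholder, not the authors' QUX model.
from dataclasses import dataclass

@dataclass
class HEPRatings:
    pragmatic: float   # e.g. utility / ease-of-use, on an assumed 1..5 scale
    hedonic: float     # e.g. joy-of-use
    eudaimonic: float  # e.g. meaningfulness / purpose-of-use

    def summary(self, w_p: float = 1.0, w_h: float = 1.0, w_e: float = 1.0) -> float:
        """Weighted mean as a stand-in for a yet-to-be-defined QUX aggregate."""
        total = w_p + w_h + w_e
        return (w_p * self.pragmatic + w_h * self.hedonic + w_e * self.eudaimonic) / total

print(HEPRatings(pragmatic=4.2, hedonic=3.5, eudaimonic=2.8).summary())
```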

References

  • [White Paper 2012] Qualinet White Paper on Definitions of Quality of Experience (2012). European Network on Quality of Experience in Multimedia Systems and Services (COST Action IC 1003), Patrick Le Callet, Sebastian Möller and Andrew Perkis, eds., Lausanne, Switzerland, Version 1.2, March 2013.
  • [Kahnemann 1999] Kahneman, D.: Well-being: Foundations of Hedonic Psychology, chap. Objective Happiness, pp. 3–25. Russell Sage Foundation Press, New York (1999)
  • [Diefenbach 2014] Diefenbach, S., Kolb, N., Hassenzahl, M.: The ‘hedonic’ in human-computer interaction: History, contributions, and future research directions. In: Proceedings of the 2014 Conference on Designing Interactive Systems, pp. 305–314. ACM (2014)
  • [Mekler 2016] Mekler, E.D., Hornbaek, K.: Momentary pleasure or lasting meaning? Distinguishing eudaimonic and hedonic user experiences. In: Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, pp. 4509–4520. ACM (2016)
  • [Bargas-Avila 2011] Bargas-Avila, J.A., Hornbaek, K.: Old wine in new bottles or novel challenges: A critical analysis of empirical studies of user experience. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 2689–2698. ACM (2011)
  • [Law 2014] Law, E.L.C., van Schaik, P., Roto, V.: Attitudes towards user experience (UX) measurement. International Journal of Human-Computer Studies 72(6), 526–541 (2014)
  • [Hammer et al. 2018] Hammer, F., Egger-Lampl, S., Möller, S.: Quality-of-User-Experience: A Position Paper, Quality and User Experience, Springer (2018).

Note from the editors:

More details on the integrated view of QoE and UX can be found in Hammer, F., Egger-Lampl, S. & Möller, “Quality-of-user-experience: a position paper”. Springer Quality and User Experience (2018) 3: 9. https://doi.org/10.1007/s41233-018-0022-0

The Dagstuhl seminars mentioned by the authors started a scientific discussion about the definition of the term QoE in 2009. Three Dagstuhl Seminars were related to QoE: 09192 “From Quality of Service to Quality of Experience” (2009), 12181 “Quality of Experience: From User Perception to Instrumental Metrics” (2012), and 15022 “Quality of Experience: From Assessment to Application” (2015). A Dagstuhl Perspectives Workshop 16472 “QoE Vadis?” followed in 2016 which set out to jointly and critically reflect on future perspectives and directions of QoE research. During the Dagstuhl Perspectives Workshop, the QoE-UX wedding proposal came up to marry the area of QoE and UX. The reports from the Dagstuhl seminars  as well as the Manifesto from the Perspectives Workshop are available online and listed below.

One step towards an integrated view of QoE and UX is reflected by QoMEX 2019. The 11th International Conference on Quality of Multimedia Experience will be held from June 5th to 7th, 2019, in Berlin, Germany. It will bring together leading experts from academia and industry to present and discuss current and future research on multimedia quality, Quality of Experience (QoE) and User Experience (UX). This way, it will contribute towards an integrated view on QoE and UX, and foster the exchange between the so-far distinct communities. More details: https://www.qomex2019.de/

 

JPEG Column: 80th JPEG Meeting in Berlin, Germany

The 80th JPEG meeting was held in Berlin, Germany, from 7 to 13 July 2018. During this meeting, JPEG issued a record number of ballots and output documents, spread across the multiple activities taking place. These record numbers are very revealing of the level of commitment of the JPEG standardisation committee. A strong effort is being made on the standardisation of new solutions for emerging image technologies, enabling the interoperability of different systems on the growing multimedia market. Moreover, it is intended that these new initiatives should provide royalty-free patent licensing solutions in at least one of the available profiles, which should promote a wider adoption of these future JPEG standards by the consumer market and by application and system developers.

Significant progress in the low-latency and high-throughput standardisation initiatives took place at the Berlin meeting. The new Part 15 of JPEG 2000, known as High Throughput JPEG 2000 (HTJ2K), is finally ready and reached Committee Draft status. Furthermore, JPEG XS profiles and levels were released for their second and final ballot. Hence, these new low-complexity standards are expected to be finalised shortly, providing new solutions for developers and consumers in applications where mobility is important and large bandwidth is available. Virtual and augmented reality, as well as 360º images and video, are among the several applications that might benefit from these new standards.


JPEG meeting plenary in Berlin.

The 80th JPEG meeting had the following highlights:

  • HTJ2K reaches Committee Draft status;
  • JPEG XS profiles and levels are under ballot;
  • JPEG XL publishes additional information to the CfP;
  • JPEG Systems – JUMBF & JPEG 360;
  • JPEG-in-HEIF;
  • JPEG Blockchain white paper;
  • JPEG Pleno Light Field verification model.

The following summarizes the various highlights during JPEG’s Berlin meeting.

HTJ2K

The JPEG committee is pleased to announce a significant milestone, with ISO/IEC 15444-15 High-Throughput JPEG 2000 (HTJ2K) reaching Committee Draft status.

HTJ2K introduces a new FAST block coder to the JPEG 2000 family. The FAST block coder can be used in place of the JPEG 2000 Part 1 arithmetic block coder and, as illustrated in Table 1, offers on average an order of magnitude increase in decoding and encoding throughput, at the expense of slightly reduced coding efficiency and the elimination of quality scalability.

Table 1. Comparison between the FAST block coder and the JPEG 2000 Part 1 arithmetic block coder. Results were generated by optimized implementations evaluated as part of the HTJ2K activity, using professional video test images in the transcoding context specified in the Call for Proposals available at https://jpeg.org. Figures are relative to the JPEG 2000 Part 1 arithmetic block coder (bpp – bits per pixel).

JPEG 2000 Part 1 Block Coder Bitrate       | 0.5 bpp | 1 bpp | 2 bpp | 4 bpp | 6 bpp | lossless
Average FAST Block Coder Speedup Factor    | 17.5x   | 19.5x | 21.1x | 25.5x | 27.4x | 43.7x
Average FAST Block Decoder Speedup Factor  | 10.2x   | 11.4x | 11.9x | 14.1x | 15.1x | 24.0x
Average Increase in Codestream Size        | 8.4%    | 7.3%  | 7.1%  | 6.6%  | 6.5%  | 6.6%

Apart from the block coding algorithm itself, the FAST block coder does not modify the JPEG 2000 codestream, and it allows mathematically lossless transcoding to and from JPEG 2000 codestreams. As a result, the FAST block coding algorithm can be readily integrated into existing JPEG 2000 applications, where it can bring significant increases in processing efficiency.
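
The following back-of-envelope sketch combines the figures reported in Table 1 with an assumed baseline encoder throughput to illustrate the trade-off: the speedup factor and codestream size increase come from the table, while the 50 Msamples/s baseline is an invented, illustrative number.

```python
# Back-of-envelope sketch using the Table 1 figures; the baseline throughput
# (50 Msamples/s) is an assumed, illustrative number, not a measured value.
def fast_block_coder_tradeoff(baseline_msamples_per_s: float,
                              speedup: float,
                              size_increase_pct: float,
                              baseline_bpp: float):
    throughput = baseline_msamples_per_s * speedup
    effective_bpp = baseline_bpp * (1.0 + size_increase_pct / 100.0)
    return throughput, effective_bpp

# 1 bpp operating point of Table 1: 19.5x encode speedup, +7.3% codestream size.
throughput, bpp = fast_block_coder_tradeoff(50.0, 19.5, 7.3, 1.0)
print(f"~{throughput:.0f} Msamples/s at ~{bpp:.3f} bpp")
```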

 

JPEG XS

This project aims at the standardization of a visually lossless, low-latency and lightweight compression scheme that can be used as a mezzanine codec for the broadcast industry, Pro-AV and other markets. Targeted use cases are video transport over professional video links (SDI, IP, Ethernet), real-time video storage, memory buffers, omnidirectional video capture and rendering, and sensor compression (in particular in the automotive industry). The Core Coding System, expected to be published in Q4 2018, allows for visually lossless quality at a 6:1 compression ratio for most content, 32 lines of end-to-end latency, and ultra-low-complexity implementations in ASIC, FPGA, CPU and GPU. Following the 80th JPEG meeting in Berlin, profiles and levels (addressing specific application fields and use cases) are now under final ballot (expected publication in Q1 2019). Different means to store and transport JPEG XS codestreams in files, over IP networks or SDI infrastructures are also defined and now go to a first ballot.
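
A quick sanity check of what a 6:1 mezzanine ratio buys in practice is sketched below: the 6:1 figure is the one cited above, while the UHD 4:2:2 10-bit 60 fps signal and the 10 GbE link are assumed, illustrative choices.

```python
# Illustrative sketch: does 6:1 compression bring an assumed UHD 4:2:2 10-bit
# 60 fps signal within a 10 GbE link budget? (4:2:2 10-bit ~ 20 bits per pixel.)
def compressed_bitrate_gbps(width: int, height: int, fps: float,
                            bits_per_pixel: float, ratio: float) -> float:
    raw_bps = width * height * fps * bits_per_pixel
    return raw_bps / ratio / 1e9

rate = compressed_bitrate_gbps(3840, 2160, 60.0, 20.0, 6.0)
print(f"~{rate:.2f} Gb/s after 6:1 compression; fits 10 GbE: {rate < 10.0}")
```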

 

JPEG XL

The JPEG Committee issued a Call for Proposals (CfP) following its 79th meeting (April 2018), with the objective of seeking technologies that fulfill the objectives and scope of the Next-Generation Image Coding activity. The CfP, with all related information, can be found at https://jpeg.org/downloads/jpegxl/jpegxl-cfp.pdf. The deadline for expression of interest and registration was August 15, 2018, and submissions to the CfP were due on September 1, 2018.

As an outcome of the 80th JPEG meeting in Berlin, a document was produced containing additional information related to the objective and subjective quality assessment methodologies that will be used to evaluate the anchors and the proposals to the CfP, available at https://jpeg.org/downloads/jpegxl/wg1n80024-additional-information-cfp.pdf. Moreover, a detailed workflow is described, together with the software and command lines used to generate the anchors and to compute the objective quality metrics.

To stay posted on the action plan of JPEG XL, please regularly consult our website at jpeg.org and/or subscribe to its e-mail reflector.

 

JPEG Systems – JUMBF & JPEG 360

The JPEG Committee progressed towards a common framework and definition for metadata which will improve the ability to share 360 images. At the 80th meeting, the Committee Draft ballot was completed, the comments were reviewed, and the activity is now progressing towards DIS text for upcoming ballots on “JPEG Universal Metadata Box Format (JUMBF)” as ISO/IEC 19566-5 and “JPEG 360” as ISO/IEC 19566-6. Investigations have started on applying the framework to the structure of JPEG Pleno files.

 

JPEG-in-HEIF

The JPEG Committee made significant progress towards standardizing how JPEG XR, JPEG 2000 and the upcoming JPEG XS will be carried in the ISO/IEC 23008-12 image file container.

 

JPEG Blockchain

Fake news, copyright violation, media forensics, privacy and security are emerging challenges for digital media. JPEG has determined that blockchain technology has great potential as a technology component to address these challenges in transparent and trustable media transactions. However, blockchain needs to be integrated closely with a widely adopted standard to ensure broad interoperability of protected images. JPEG calls for industry participation to help define use cases and requirements that will drive the standardization process. To reach this objective, JPEG issued a white paper entitled “Towards a Standardized Framework for Media Blockchain” that elaborates on the initiative, exploring relevant standardization activities, industrial needs and use cases. In addition, JPEG plans to organise a workshop during its 81st meeting in Vancouver on Tuesday 16th October 2018. More information about the workshop is available on https://www.jpeg.org. To keep informed and get involved, interested parties are invited to register on the ad hoc group’s mailing list at http://jpeg-blockchain-list.jpeg.org.

 

JPEG Pleno

The JPEG Committee is currently pursuing three activities in the framework of the JPEG Pleno Standardization: Light Field, Point Cloud and Holographic content coding.

At its Berlin meeting, a first version of the verification model software for light field coding was produced. This software supports the core functionality intended for the light field coding standard and serves for intensive testing of the standard. JPEG Pleno Light Field Coding supports various sensors, ranging from lenslet cameras to high-density camera arrays, as well as light-field-related content production chains, up to light field displays.

For coding of point clouds and holographic data, activities are still in an exploratory phase, addressing the elaboration of use cases and the refinement of requirements for coding such modalities. In addition, experimental procedures are being designed to facilitate the quality evaluation and testing of technologies that will be submitted in later calls for coding technologies. Interested parties active in point-cloud- and holography-related markets and applications, from both industry and academia, are welcome to participate in this standardization activity.

 

Final Quote 

“After a record number of ballots and output documents generated during its 80th meeting, the JPEG Committee pursues its activity on the specification of effective and reliable solutions for image coding offering needed features in emerging multimedia applications. The new JPEG XS and JPEG 2000 part 15 provide low complexity compression solutions that will benefit many growing markets such as content production, virtual and augmented reality as well as autonomous cars and drones.” said Prof. Touradj Ebrahimi, the Convenor of the JPEG Committee.

 

About JPEG

The Joint Photographic Experts Group (JPEG) is a Working Group of ISO/IEC, the International Organisation for Standardization / International Electrotechnical Commission, (ISO/IEC JTC 1/SC 29/WG 1) and of the International Telecommunication Union (ITU-T SG16), responsible for the popular JBIG, JPEG, JPEG 2000, JPEG XR, JPSearch and more recently, the JPEG XT, JPEG XS, JPEG Systems and JPEG Pleno families of imaging standards.

The JPEG Committee nominally meets four times a year, in different world locations. The 80th JPEG Meeting was held on 7-13 July 2018, in Berlin, Germany. The next 81st JPEG Meeting will be held on 13-19 October 2018, in Vancouver, Canada.

More information about JPEG and its work is available at www.jpeg.org or by contacting Antonio Pinheiro or Frederik Temmermans (pr@jpeg.org) of the JPEG Communication Subgroup.

If you would like to stay posted on JPEG activities, please subscribe to the jpeg-news mailing list on http://jpeg-news-list.jpeg.org.  

 

Future JPEG meetings are planned as follows:

  • No 81, Vancouver, Canada, October 13 to 19, 2018
  • No 82, Lisbon, Portugal, January 19 to 25, 2019
  • No 83, Geneva, Switzerland, March 16 to 22, 2019

MPEG Column: 123rd MPEG Meeting in Ljubljana, Slovenia

The original blog post can be found at the Bitmovin Techblog and has been modified/updated here to focus on and highlight research aspects.

The MPEG press release comprises the following topics:

  • MPEG issues Call for Evidence on Compressed Representation of Neural Networks
  • Network-Based Media Processing – MPEG evaluates responses to call for proposal and kicks off its technical work
  • MPEG finalizes 1st edition of Technical Report on Architectures for Immersive Media
  • MPEG releases software for MPEG-I visual activities
  • MPEG enhances ISO Base Media File Format (ISOBMFF) with new features

MPEG issues Call for Evidence on Compressed Representation of Neural Networks

Artificial neural networks have been adopted for a broad range of tasks in multimedia analysis and processing, media coding, data analytics, translation and many other fields. Their recent success is based on the feasibility of processing much larger and more complex neural networks (deep neural networks, DNNs) than in the past, and on the availability of large-scale training data sets. As a consequence, trained neural networks contain a large number of parameters (weights), resulting in a quite large size (e.g., several hundred MBs). Many applications require the deployment of a particular trained network instance, potentially to a large number of devices, which may have limitations in terms of processing power and memory (e.g., mobile devices or smart cameras). Any use case in which a trained neural network (and its updates) needs to be deployed to a number of devices could thus benefit from a standard for the compressed representation of neural networks.

At its 123rd meeting, MPEG has issued a Call for Evidence (CfE) for compression technology for neural networks. The compression technology will be evaluated in terms of compression efficiency, runtime, and memory consumption and the impact on performance in three use cases: visual object classification, visual feature extraction (as used in MPEG Compact Descriptors for Visual Analysis) and filters for video coding. Responses to the CfE will be analyzed on the weekend prior to and during the 124th MPEG meeting in October 2018 (Macau, CN).

Research aspects: As this is about “compression” of structured data, research aspects will mainly focus on compression efficiency for both lossy and lossless scenarios. Additionally, communication aspects, such as the transmission of compressed artificial neural networks within lossy, large-scale environments including update mechanisms, may become relevant in the (near) future. Furthermore, additional use cases should be communicated to MPEG before the next meeting.
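
To make the notion of lossy weight compression concrete, the sketch below applies plain uniform scalar quantization to a synthetic weight tensor and reports the resulting compression ratio and reconstruction error. It is only meant to illustrate the compression-vs-fidelity trade-off that the CfE asks respondents to evaluate; it does not represent any technology submitted to MPEG.

```python
# Minimal sketch of lossy weight compression via uniform scalar quantization,
# shown only to illustrate the compression-vs-fidelity trade-off of the CfE.
import numpy as np

def quantize_weights(weights: np.ndarray, bits: int = 8):
    """Uniformly quantize float32 weights to `bits` bits; return error stats."""
    w_min, w_max = float(weights.min()), float(weights.max())
    levels = 2 ** bits - 1
    scale = (w_max - w_min) / levels if w_max > w_min else 1.0
    q = np.round((weights - w_min) / scale)
    dequantized = q * scale + w_min
    compression_ratio = 32.0 / bits            # relative to float32 storage
    mse = float(np.mean((weights - dequantized) ** 2))
    return dequantized, compression_ratio, mse

weights = np.random.randn(1_000_000).astype(np.float32)  # stand-in for one layer
_, ratio, mse = quantize_weights(weights, bits=8)
print(f"compression ratio ~{ratio:.0f}x, reconstruction MSE {mse:.2e}")
```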

Network-Based Media Processing – MPEG evaluates responses to call for proposal and kicks off its technical work

Recent developments in multimedia have brought significant innovation and disruption to the way multimedia content is created and consumed. At its 123rd meeting, MPEG analyzed the technologies submitted by eight industry leaders as responses to the Call for Proposals (CfP) for Network-Based Media Processing (NBMP, MPEG-I Part 8). These technologies address advanced media processing use cases such as network stitching for virtual reality (VR) services, super-resolution for enhanced visual quality, transcoding by a mobile edge cloud, or viewport extraction for 360-degree video within the network environment. NBMP allows service providers and end users to describe media processing operations that are to be performed by entities in the network. NBMP will describe the composition of network-based media processing services out of a set of NBMP functions and make these NBMP services accessible through Application Programming Interfaces (APIs).

NBMP will support the existing delivery methods such as streaming, file delivery, push-based progressive download, hybrid delivery, and multipath delivery within heterogeneous network environments. MPEG issued a Call for Proposal (CfP) seeking technologies that allow end-user devices, which are limited in processing capabilities and power consumption, to offload certain kinds of processing to the network.

After a formal evaluation of submissions, MPEG selected three technologies as starting points for the (i) workflow, (ii) metadata, and (iii) interfaces for static and dynamically acquired NBMP. A key conclusion of the evaluation was that NBMP can significantly improve the performance and efficiency of the cloud infrastructure and media processing services.
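
To illustrate the general idea of describing a media processing workflow for the network (without anticipating the actual NBMP syntax), the sketch below expresses a viewport-extraction chain as a simple JSON document. All field names and values are invented for illustration only and are not the NBMP specification.

```python
# Hypothetical illustration of a network-based media processing workflow as a
# chain of functions. Field names are invented and are NOT the NBMP syntax.
import json

workflow = {
    "workflow_name": "360-viewport-extraction",
    "functions": [
        {"id": "ingest", "task": "media-ingest", "output": "full-360-stream"},
        {"id": "extract", "task": "viewport-extraction", "input": "full-360-stream",
         "parameters": {"yaw": 30.0, "pitch": 0.0, "fov": 90.0},
         "output": "viewport-stream"},
        {"id": "deliver", "task": "packaging-and-delivery", "input": "viewport-stream"},
    ],
}

print(json.dumps(workflow, indent=2))
```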

Research aspects: I reported about NBMP in my previous post and basically the same applies here. NBMP will be particularly interesting in the context of new networking approaches including, but not limited to, software-defined networking (SDN), information-centric networking (ICN), mobile edge computing (MEC), fog computing, and related aspects in the context of 5G.

MPEG finalizes 1st edition of Technical Report on Architectures for Immersive Media

At its 123rd meeting, MPEG finalized the first edition of its Technical Report (TR) on Architectures for Immersive Media. This report constitutes the first part of the MPEG-I standard for the coded representation of immersive media and introduces the eight MPEG-I parts currently under specification in MPEG. In particular, it addresses three Degrees of Freedom (3DoF; three rotational and unlimited movements around the X, Y and Z axes (respectively pitch, yaw and roll)), 3DoF+ (3DoF with additional limited translational movements (typically, head movements) along the X, Y and Z axes), and 6DoF (3DoF with full translational movements along the X, Y and Z axes) experiences, but it mostly focuses on 3DoF. Future versions are expected to cover aspects beyond 3DoF. The report documents use cases and defines architectural views on elements that contribute to an overall immersive experience. Finally, the report also includes quality considerations for immersive services and introduces minimum requirements as well as objectives for a high-quality immersive media experience.

Research aspects: ISO/IEC technical reports are typically publicly available and provide informative descriptions of what the standard is about. In MPEG-I, this technical report can be used as a guideline for possible architectures for immersive media. This first edition focuses on three Degrees of Freedom (3DoF; three rotational and unlimited movements around the X, Y and Z axes (respectively pitch, yaw and roll)) and outlines the other degrees of freedom currently foreseen in MPEG-I. It also highlights use cases and quality-related aspects that could be of interest to the research community.

MPEG releases software for MPEG-I visual activities

MPEG-I visual is an activity that addresses the specific requirements of immersive visual media for six-degrees-of-freedom virtual walkthroughs with correct motion parallax within a bounded volume. MPEG-I visual covers application scenarios from 3DoF+, with slight body and head movements in a sitting position, to 6DoF, allowing some walking steps from a central position. At the 123rd MPEG meeting, important progress was achieved in software development. A new Reference View Synthesizer (RVS 2.0) has been released for 3DoF+, allowing the synthesis of virtual viewpoints from an unlimited number of input views. RVS integrates code bases from Universite Libre de Bruxelles and Philips, who acted as software coordinator. A Weighted-to-Spherically-uniform PSNR (WS-PSNR) software utility, essential to the 3DoF+ and 6DoF activities, has been developed by Zhejiang University. WS-PSNR is a full-reference objective quality metric for all flavors of omnidirectional video. RVS and WS-PSNR are essential software tools for the upcoming Call for Proposals on 3DoF+, expected to be released at the 124th MPEG meeting in October 2018 (Macau, CN).

Research aspects: MPEG does not only produce text specifications but also reference software and conformance bitstreams, which are important assets for both research and development. Thus, it is very much appreciated to have the new Reference View Synthesizer (RVS 2.0) and the Weighted-to-Spherically-uniform PSNR (WS-PSNR) software utility available, which enable interoperability and reproducibility of R&D efforts/results in this area.
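
Since WS-PSNR is a concise metric, a small sketch of its core idea for equirectangular (ERP) content is given below: squared errors are weighted by the spherical area each pixel row covers, which for ERP reduces to a cosine weight over image rows. This is only an illustrative re-derivation for single-channel frames; the authoritative implementation is the Zhejiang University utility mentioned above.

```python
# Sketch of WS-PSNR for single-channel equirectangular (ERP) frames: squared
# errors are weighted by a per-row cosine term reflecting spherical pixel area.
import numpy as np

def ws_psnr_erp(reference: np.ndarray, distorted: np.ndarray, peak: float = 255.0) -> float:
    """WS-PSNR between two ERP images of shape (H, W)."""
    h, w = reference.shape
    rows = np.arange(h, dtype=np.float64)
    weights = np.cos((rows + 0.5 - h / 2.0) * np.pi / h)   # latitude-dependent weight
    weight_map = np.repeat(weights[:, None], w, axis=1)
    err2 = (reference.astype(np.float64) - distorted.astype(np.float64)) ** 2
    ws_mse = np.sum(weight_map * err2) / np.sum(weight_map)
    return float("inf") if ws_mse == 0 else 10.0 * np.log10(peak ** 2 / ws_mse)

# Example with synthetic luma planes standing in for reference and decoded frames.
rng = np.random.default_rng(0)
ref = rng.integers(0, 256, size=(960, 1920)).astype(np.float64)
dist = np.clip(ref + rng.normal(0.0, 2.0, size=ref.shape), 0, 255)
print(f"WS-PSNR ~ {ws_psnr_erp(ref, dist):.2f} dB")
```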

MPEG enhances ISO Base Media File Format (ISOBMFF) with new features

At the 123rd MPEG meeting, a couple of new amendments related to ISOBMFF reached their first milestone. Amendment 2 to ISO/IEC 14496-12 6th edition will add the option to have relative addressing as an alternative to offset addressing, which in some environments and workflows can simplify the handling of files, and will allow the creation of derived visual tracks using items and samples in other tracks with some transformation, for example rotation. Another amendment that reached its first milestone is the first amendment to ISO/IEC 23001-7 3rd edition. It will allow the use of multiple keys for a single sample and the scrambling of some parts of AVC or HEVC video bitstreams without breaking conformance to existing decoders. That is, the bitstream will be decodable by existing decoders, but some parts of the video will be scrambled. It is expected that these amendments will reach the final milestone in Q3 2019.

Research aspects: The ISOBMFF reference software is now available on GitHub, which is a valuable service to the community and allows for active participation in the standardisation process even from outside of MPEG. It is recommended that interested parties have a look at it and consider contributing to this project.


What else happened at #MPEG123?

  • The MPEG-DASH 3rd edition is finally available as output document (N17813; only available to MPEG members) combining 2nd edition, four amendments, and 2 corrigenda. We expect final publication later this year or early next year.
  • There is a new DASH amendment and corrigenda items in the pipeline, which should also progress to final stages some time next year. The status of MPEG-DASH (July 2018) can be seen in the figure below.

Figure: Status of MPEG-DASH standardization (July 2018).

  • MPEG received a rather interesting input document related to “streaming first” which resulted in a publicly available output document entitled “thoughts on adaptive delivery and access to immersive media”. The key idea here is to focus on streaming (first) rather than on the file/encapsulation formats typically used for storage (and streaming second). This document should become available here.
  • For the last couple of meetings, MPEG has maintained a standardization roadmap highlighting recent/major MPEG standards and documenting the roadmap for the next five years. It is definitely worth keeping this in mind when defining/updating your own roadmap.
  • JVET/VVC issued Working Draft 2 of Versatile Video Coding (N17732 | JVET-K1001) and Test Model 2 of Versatile Video Coding (VTM 2) (N17733 | JVET-K1002). Please note that N-documents are MPEG internal but JVET-documents are publicly accessible here: http://phenix.it-sudparis.eu/jvet/. An interesting aspect is that VTM2/WD2 should have >20% rate reduction compared to HEVC, all with reasonable complexity and the next benchmark set (BMS) should have close to 30% rate reduction vs. HEVC. Further improvements expected from (a) improved merge, intra prediction, etc., (b) decoder-side estimation with low complexity, (c) multi-hypothesis prediction and OBMC, (d) diagonal and other geometric partitioning, (e) secondary transforms, (f) new approaches of loop filtering, reconstruction and prediction filtering (denoising, non-local, diffusion based, bilateral, etc.), (g) current picture referencing, palette, and (h) neural networks.
  • In addition to VVC (a joint activity with VCEG), MPEG is working on two video-related exploration activities, namely (a) an enhanced quality profile of the AVC standard and (b) a low-complexity enhancement video codec. Both topics will be further discussed within respective Ad-hoc Groups (AhGs), and further details are available here.
  • Finally, MPEG established an Ad-hoc Group (AhG) dedicated to the long-term planning which is also looking into application areas/domains other than media coding/representation.

In this context it is probably worth mentioning the following DASH awards at recent conferences:

Additionally, there have been two tutorials at ICME related to MPEG standards, which you may find interesting: