The original blog post can be found at the Bitmovin Techblog and has been modified/updated here to focus on and highlight research aspects.
The MPEG press release comprises the following aspects:
- Point Cloud Compression – MPEG promotes a video-based point cloud compression technology to the Committee Draft stage
- Compressed Representation of Neural Networks – MPEG issues Call for Proposals
- Low Complexity Video Coding Enhancements – MPEG issues Call for Proposals
- New Video Coding Standard expected to have licensing terms timely available – MPEG issues Call for Proposals
- Multi-Image Application Format (MIAF) promoted to Final Draft International Standard
- 3DoF+ Draft Call for Proposal goes Public
Point Cloud Compression – MPEG promotes a video-based point cloud compression technology to the Committee Draft stage
At its 124th meeting, MPEG promoted its Video-based Point Cloud Compression (V-PCC) standard to Committee Draft (CD) stage. V-PCC addresses lossless and lossy coding of 3D point clouds with associated attributes such as colour. By leveraging existing and video ecosystems in general (hardware acceleration, transmission services and infrastructure), and future video codecs as well, the V-PCC technology enables new applications. The current V-PCC encoder implementation provides a compression of 125:1, which means that a dynamic point cloud of 1 million points could be encoded at 8 Mbit/s with good perceptual quality.
A next step is the storage of V-PCC in ISOBMFF for which a working draft has been produced. It is expected that further details will be discussed in upcoming reports.
Research aspects: Video-based Point Cloud Compression (V-PCC) is at CD stage and a first working draft for the storage of V-PCC in ISOBMFF has been provided. Thus, a next consequence is the delivery of V-PCC encapsulated in ISOBMFF over networks utilizing various approaches, protocols, and tools. Additionally, one may think of using also different encapsulation formats if needed.
MPEG issues Call for Proposals on Compressed Representation of Neural Networks
Artificial neural networks have been adopted for a broad range of tasks in multimedia analysis and processing, media coding, data analytics, and many other fields. Their recent success is based on the feasibility of processing much larger and complex neural networks (deep neural networks, DNNs) than in the past, and the availability of large-scale training data sets. Some applications require the deployment of a particular trained network instance to a potentially large number of devices and, thus, could benefit from a standard for the compressed representation of neural networks. Therefore, MPEG has issued a Call for Proposals (CfP) for compression technology for neural networks, focusing on the compression of parameters and weights, focusing on four use cases: (i) visual object classification, (ii) audio classification, (iii) visual feature extraction (as used in MPEG CDVA), and (iv) video coding.
Research aspects: As point out last time, research here will mainly focus around compression efficiency for both lossy and lossless scenarios. Additionally, communication aspects such as transmission of compressed artificial neural networks within lossy, large-scale environments including update mechanisms may become relevant in the (near) future.
MPEG issues Call for Proposals on Low Complexity Video Coding Enhancements
Upon request from the industry, MPEG has identified an area of interest in which video technology deployed in the market (e.g., AVC, HEVC) can be enhanced in terms of video quality without the need to necessarily replace existing hardware. Therefore, MPEG has issued a Call for Proposals (CfP) on Low Complexity Video Coding Enhancements.
The objective is to develop video coding technology with a data stream structure defined by two component streams: a base stream decodable by a hardware decoder and an enhancement stream suitable for software processing implementation. The project is meant to be codec agnostic; in other words, the base encoder and base decoder can be AVC, HEVC, or any other codec in the market.
Research aspects: The interesting aspect here is that this use case assumes a legacy base decoder – most likely realized in hardware – which is enhanced with software-based implementations to improve coding efficiency or/and quality without sacrificing capabilities of the end user in terms of complexity and, thus, energy efficiency due to the software based solution.
MPEG issues Call for Proposals for a New Video Coding Standard expected to have licensing terms timely available
At its 124th meeting, MPEG issued a Call for Proposals (CfP) for a new video coding standard to address combinations of both technical and application (i.e., business) requirements that may not be adequately met by existing standards. The aim is to provide a standardized video compression solution which combines coding efficiency similar to that of HEVC with a level of complexity suitable for real-time encoding/decoding and the timely availability of licensing terms.
Research aspects: This new work item is more related to business aspects (i.e., licensing terms) than technical aspects of video coding.
Multi-Image Application Format (MIAF) promoted to Final Draft International Standard
The Multi-Image Application Format (MIAF) defines interoperability points for creation, reading, parsing, and decoding of images embedded in High Efficiency Image File (HEIF) format by (i) only defining additional constraints on the HEIF format, (ii) limiting the supported encoding types to a set of specific profiles and levels, (iii) requiring specific metadata formats, and (iv) defining a set of brands for signaling such constraints including specific depth map and alpha plane formats. For instance, it addresses use case like a capturing device may use one of HEIF codecs with a specific HEVC profile and level in its created HEIF files, while a playback device is only capable of decoding the AVC bitstreams.
Research aspects: MIAF is an application format which is defined as a combination of tools (incl. profiles and levels) of other standards (e.g., audio codecs, video codecs, systems) to address the needs of a specific application. Thus, the research is related to use cases enabled by this application format.
3DoF+ Draft Call for Proposal goes Public
Following investigations on the coding of “three Degrees of Freedom plus” (3DoF+) content in the context of MPEG-I, the MPEG video subgroup has provided evidence demonstrating the capability to encode a 3DoF+ content efficiently while maintaining compatibility with legacy HEVC hardware. As a result, MPEG decided to issue a draft Call for Proposal (CfP) to the public containing the information necessary to prepare for the final Call for Proposal expected to occur at the 125th MPEG meeting (January 2019) with responses due at the 126th MPEG meeting (March 2019).
Research aspects: This work item is about video (coding) and, thus, research is about compression efficiency.
What else happened at #MPEG124?
- MPEG-DASH 3rd edition is still in the final editing phase and not yet available. Last time, I wrote that we expect final publication later this year or early next year and we hope this is still the case. At this meeting Amendment.5 is progressed to DAM and conformance/reference software for SRD, SAND and Server Push is also promoted to DAM. In other words, DASH is pretty much in maintenance mode.
- MPEG-I (systems part) is working on immersive media access and delivery and I guess more updates will come on this after the next meeting. OMAF is working on a 2nd edition for which a working draft exists and phase 2 use cases (public document) and draft requirements are discussed.
- Versatile Video Coding (VVC): working draft 3 (WD3) and test model 3 (VTM3) has been issued at this meeting including a large number of new tools. Both documents (and software) will be publicly available after editing periods (Nov. 23 for WD3 and Dec 14 for VTM3).