ACM SIGMM Records

Volume 8, Issue 4, March 2017 (ISSN 1947-4598)

Highlights

Call for Task Proposals: Multimedia Evaluation 2017

MediaEval 2017 Multimedia Evaluation Benchmark Call for Task Proposals Proposal Deadline: 3 December 2016 MediaEval is a benchmarking initiative dedicated to developing and evaluating new algorithms and technologies for multimedia retrieval, access and exploration. It offers tasks to the research community that are related to human and social aspects of …

MPEG Column: 116th MPEG Meeting

MPEG Workshop on 5-Year Roadmap Successfully Held in Chengdu Chengdu, China – The 116th MPEG meeting was held in Chengdu, China, from 17 – 21 October 2016 MPEG Workshop on 5-Year Roadmap Successfully Held in Chengdu At its 116th meeting, MPEG successfully organised a workshop on its 5-year standardisation roadmap. …

MPEG Column: 117th MPEG Meeting

The original blog post can be found at the Bitmovin Techblog and has been updated here to focus on and highlight research aspects. The 117th MPEG meeting was held in Geneva, Switzerland and its press release highlights the following aspects: MPEG issues Committee Draft of the Omnidirectional Media Application Format (OMAF) MPEG-H 3D Audio Verification Test …

JPEG Column: 74th JPEG Meeting

The 74th JPEG meeting was held at ITU Headquarters in Geneva, Switzerland, from 15 to 20 January featuring the following highlights: A Final Call for Proposals on JPEG Pleno was issued focusing on light field coding; Creation of a test model for the upcoming JPEG XS standard; A draft Call …

Amirhossein Habibian

This thesis studies the fundamental question: what vocabulary of concepts are suited for machines to describe video content? The answer to this question involves two annotation steps: First, to specify a list of concepts by which videos are described. Second, to label a set of videos per concept as its …

Chien-nan Chen

3D Tele-immersion (3DTI) technology allows full-body, multimodal interaction among geographically dispersed users, which opens a variety of possibilities in cyber collaborative applications such as art performance, exergaming, and physical rehabilitation. However, with its great potential, the resource and quality demands of 3DTI rise inevitably, especially when some advanced applications target …


Call for Contributions

We welcome your contributions at any time. Please take a closer look at the information required for each type of submission to the newsletter.


Table of Contents

  1. Call for Task Proposals: Multimedia Evaluation 2017
  2. MPEG Column: 116th MPEG Meeting
  3. MPEG Column: 117th MPEG Meeting
  4. JPEG Column: 74th JPEG Meeting
  5. PhD thesis abstracts
    1. Amirhossein Habibian
    2. Chien-nan Chen
    3. Masoud Mazloom
    4. Rufael Mekuria
    5. Svetlana Kordumova
  6. Journal issue TOCs
    1. MTAP Volume 76, Issue 6
    2. MTAP Volume 76, Issue 5
    3. MTAP Volume 76, Issue 4
    4. MTAP Volume 76, Issue 3
    5. MTAP Volume 76, Issue 2
    6. MTAP Volume 76, Issue 1
    7. MTAP Volume 75, Issue 24
    8. MTAP Volume 75, Issue 23
    9. MTAP Volume 75, Issue 22
    10. MTAP Volume 75, Issue 21
    11. MTAP Volume 75, Issue 20
    12. IJMIR Volume 6, Issue 1
    13. IJMIR Volume 5, Issue 4
    14. MMSJ Volume 23, Issue 1
    15. MMSJ Volume 23, Issue 2
    16. TOMM Volume 12, Issue 4s
    17. TOMM Volume 12, Issue 5s
    18. TOMM Volume 13, Issue 1
    19. MMSJ Volume 22, Issue 6
  7. Job opportunities
    1. PhD Openings in ECE
    2. Post-doctoral position in the field of multimodal content understanding at Technicolor
  8. Calls for paper
    1. CFPs for ACM-sponsored events (any SIG)
    2. CFPs for IEEE-sponsored events (any TC)
    3. CFPs for other multimedia-related events

    Call for Task Proposals: Multimedia Evaluation 2017

    MediaEval 2017 Multimedia Evaluation Benchmark

    Call for Task Proposals

    Proposal Deadline: 3 December 2016

    MediaEval is a benchmarking initiative dedicated to developing and evaluating new algorithms and technologies for multimedia retrieval, access and exploration. It offers tasks to the research community that are related to human and social aspects of multimedia. MediaEval emphasizes the ‘multi’ in multimedia and seeks tasks involving multiple modalities, e.g., audio, visual, textual, and/or contextual.

    MediaEval is now calling for proposals for tasks to run in the 2017 benchmarking season. The proposal consists of a description of the motivation for the task and challenges that task participants must address. It provides information on the data and evaluation methodology to be used. The proposal also includes a statement of how the task is related to MediaEval (i.e., its human or social component), and how it extends the state of the art in an area related to multimedia indexing, search or other technologies that support users in accessing multimedia collections.

    For more detailed information about the content of the task proposal, please see:
    http://www.multimediaeval.org/files/mediaeval2017_taskproposals.html

    Task proposal deadline: 3 December 2016

    Task proposals are chosen on the basis of their feasibility, their match with the topical focus of MediaEval, and also according to the outcome of a survey circulated to the wider multimedia research community.

    The MediaEval 2017 Workshop will be held 13-15 September 2017 in Dublin, Ireland, co-located with CLEF 2017 (http://clef2017.clef-initiative.eu).

    For more information about MediaEval see http://multimediaeval.org or contact Martha Larson m.a.larson@tudelft.nl

     


    MPEG Column: 116th MPEG Meeting

    MPEG Workshop on 5-Year Roadmap Successfully Held in Chengdu

    Chengdu, China – The 116th MPEG meeting was held in Chengdu, China, from 17 – 21 October 2016


    At its 116th meeting, MPEG successfully organised a workshop on its 5-year standardisation roadmap. Various industry representatives presented their views and reflected on the need for standards for new services and applications, specifically in the area of immersive media. The results of the workshop (roadmap, presentations) and the planned phases for the standardisation of “immersive media” are available at http://mpeg.chiariglione.org/. A follow-up workshop will be held on 18 January 2017 in Geneva, co-located with the 117th MPEG meeting. The workshop is open to all interested parties and free of charge. Details on the program and registration will be available at http://mpeg.chiariglione.org/.

    Summary of the “Survey on Virtual Reality”

    At its 115th meeting, MPEG established an ad-hoc group on virtual reality which conducted a survey on virtual reality with relevant stakeholders in this domain. The feedback from this survey has been provided as input for the 116th MPEG meeting where the results have been evaluated. Based on these results, MPEG aligned its standardisation timeline with the expected deployment timelines for 360-degree video and virtual reality services. An initial specification for 360-degree video and virtual reality services will be ready by the end of 2017 and is referred to as the Omnidirectional Media Application Format (OMAF; MPEG-A Part 20, ISO/IEC 23000-20). A standard addressing audio and video coding for 6 degrees of freedom where users can freely move around is on MPEG’s 5-year roadmap. The summary of the survey on virtual reality is available at http://mpeg.chiariglione.org/.

    MPEG and ISO/TC 276/WG 5 have collected and evaluated the answers to the Genomic Information Compression and Storage joint Call for Proposals

    At its 115th meeting, MPEG issued a Call for Proposals (CfP) for Genomic Information Compression and Storage in conjunction with the working group for standardisation of data processing and integration of the ISO Technical Committee for biotechnology standards (ISO/TC 276/WG 5). The call sought submissions of technologies that can provide efficient compression of genomic data and metadata for storage and processing applications. During the 116th MPEG meeting, the responses to this CfP, comprising twelve distinct technologies, were collected and evaluated by a joint ad-hoc group of both working groups. An initial assessment of the performance of the best eleven solutions for the different categories reported compression factors ranging from 8 to 58 for the different classes of data.

    The twelve submitted technologies show consistent improvements over the results assessed in response to the Call for Evidence in February 2016. Further improvements of the technologies under consideration are expected from the first phase of core experiments defined at the 116th MPEG meeting. The open core experiment process planned for the next 12 months will address multiple, independent, directly comparable, rigorous experiments performed by independent entities to determine the specific merit of each technology and their mutual integration into a single solution for standardisation. The core experiment process will consider submitted technologies as well as new solutions in the scope of each specific core experiment. The final inclusion of submitted technologies into the standard will be based on the experimental comparison of performance, as well as on the validation of requirements and the inclusion of essential metadata describing the context of the sequence data, and will be reached by consensus within and across both committees.

    Call for Proposals: Internet of Media Things and Wearables (IoMT&W)

    At its 116th meeting, MPEG issued a Call for Proposals (CfP) for Internet of Media Things and Wearables (see http://mpeg.chiariglione.org/), motivated by the understanding that more than half of major new business processes and systems will incorporate some element of the Internet of Things (IoT) by 2020. Therefore, the CfP seeks submissions of protocols and data representation enabling dynamic discovery of media things and media wearables. A standard in this space will facilitate the large-scale deployment of complex media systems that can exchange data in an interoperable way between media things and media wearables.

    MPEG-DASH Amendment with Media Presentation Description Chaining and Pre-Selection of Adaptation Sets

    At its 116th MPEG meeting, a new amendment for MPEG-DASH reached the final stage of Final Draft Amendment (ISO/IEC 23009-1:2014 FDAM 4). This amendment includes several technologies useful for industry practices of adaptive media presentation delivery. For example, the media presentation description (MPD) can be daisy-chained to simplify the implementation of pre-roll ads in cases of targeted dynamic advertising for live linear services. Additionally, this amendment enables pre-selection, in order to signal suitable combinations of audio elements that are offered in different adaptation sets. As several amendments and corrigenda have been produced, this amendment will be published as part of the 3rd edition of ISO/IEC 23009-1, together with the amendments and corrigenda approved after the 2nd edition.

    How to contact MPEG, learn more, and find other MPEG facts

    To learn about MPEG basics, discover how to participate in the committee, or find out more about the array of technologies developed or currently under development by MPEG, visit MPEG’s home page at http://mpeg.chiariglione.org. There you will find information publicly available from MPEG experts past and present including tutorials, white papers, vision documents, and requirements under consideration for new standards efforts. You can also find useful information in many public documents by using the search window.

    Examples of tutorials that can be found on the MPEG homepage include tutorials for High Efficiency Video Coding, Advanced Audio Coding, Unified Speech and Audio Coding, and DASH, to name a few. A rich repository of white papers can also be found and continues to grow. You can find these papers and tutorials for many of MPEG’s standards freely available. Press releases from previous MPEG meetings are also available. Journalists who wish to receive MPEG Press Releases by email should contact Dr. Christian Timmerer at christian.timmerer@itec.uni-klu.ac.at or christian.timmerer@bitmovin.com.

    Further Information

    Future MPEG meetings are planned as follows:
    No. 117, Geneva, CH, 16 – 20 January, 2017
    No. 118, Hobart, AU, 03 – 07 April, 2017
    No. 119, Torino, IT, 17 – 21 July, 2017
    No. 120, Macau, CN, 23 – 27 October 2017

    For further information about MPEG, please contact:
    Dr. Leonardo Chiariglione (Convenor of MPEG, Italy)
    Via Borgionera, 103
    10040 Villar Dora (TO), Italy
    Tel: +39 011 935 04 61
    leonardo@chiariglione.org

    or

    Priv.-Doz. Dr. Christian Timmerer
    Alpen-Adria-Universität Klagenfurt | Bitmovin Inc.
    9020 Klagenfurt am Wörthersee, Austria, Europe
    Tel: +43 463 2700 3621
    Email: christian.timmerer@itec.aau.at | christian.timmerer@bitmovin.com


    ACM TVX — Call for Volunteer Associate Chairs

    CALL FOR VOLUNTEER ASSOCIATE CHAIRS – Applications for Technical Program Committee

    ACM TVX 2017 International Conference on Interactive Experiences for Television and Online Video
    June 14-16, 2017, Hilversum, The Netherlands
    www.tvx2017.com


    We welcome applications to join the TVX 2017 Technical Program Committee (TPC) as Associate Chairs (ACs). This involves playing a key role in the submission and review process, including attendance at the TPC meeting (please note that this is not a call for reviewers, but a call for Associate Chairs). We are opening applications to all members of the community, from both industry and academia, who feel they can contribute to this team.

    • This call is open to new Associate Chairs and to those who have been Associate Chairs in previous years and want to be an Associate Chair again for TVX 2017
    • Application form: https://goo.gl/forms/c9gNPHYZbh2m6VhJ3
    • The application deadline is December 12, 2016

    Following the success of previous years’ invitations for open applications to join our Technical Program Committee, we again invite applications for Associate Chairs. Successful applicants will be responsible for arranging and coordinating reviews for around 3 or 4 submissions in the main Full and Short Papers track of ACM TVX 2017, and for attending the Technical Program Committee meeting in Delft, The Netherlands, in mid-March 2017 (participation in person is strongly recommended). Our aim is to broaden participation, ensuring a diverse Technical Program Committee, and to help widen the ACM TVX community to include a full range of perspectives.

    We welcome applications from academics, industrial practitioners and (where appropriate) senior PhD students, who have expertise in Human Computer Interaction or related fields, and who have an interest in topics related to interactive experiences for television or online video. We would expect all applicants to have ‘top-tier’ publications related to this area. Applicants should have an expertise or interest in at least one or more topics in our call for papers: https://tvx.acm.org/2017/participation/full-and-short-paper-submissions/

    After the application deadline, the volunteers will be considered and selected as ACs, and the TPC Chairs will also be free to invite previous ACs or other researchers from the community to join the team. The ultimate goal is to reach a balanced, diverse and inclusive TPC team in terms of fields of expertise, experience and perspectives, from both academia and industry.

    To submit, just fill in the application form above!

    CONTACT INFORMATION

    For up-to-date information and further details, please visit www.tvx2017.com or get in touch with the Inclusion Chairs:

    Teresa Chambel, University of Lisbon, PT; Rob Koenen, TNO, NL
    at: inclusion@tvx2017.com

    In collaboration with the Program Chairs: Wendy van den Broeck, Vrije Universiteit Brussel, BE; Mike Darnell, Samsung, USA; Roger Zimmermann, NUS, Singapore


    MPEG Column: 117th MPEG Meeting

    The original blog post can be found at the Bitmovin Techblog and has been updated here to focus on and highlight research aspects.

    The 117th MPEG meeting was held in Geneva, Switzerland and its press release highlights the following aspects:

    • MPEG issues Committee Draft of the Omnidirectional Media Application Format (OMAF)
    • MPEG-H 3D Audio Verification Test Report
    • MPEG Workshop on 5-Year Roadmap Successfully Held in Geneva
    • Call for Proposals (CfP) for Point Cloud Compression (PCC)
    • Preliminary Call for Evidence on video compression with capability beyond HEVC
    • MPEG issues Committee Draft of the Media Orchestration (MORE) Standard
    • Technical Report on HDR/WCG Video Coding

    In this article, I’d like to focus on the topics related to multimedia communication starting with OMAF.

    Omnidirectional Media Application Format (OMAF)

    Real-time entertainment services deployed over the open, unmanaged Internet – streaming audio and video – now account for more than 70% of the evening traffic in North American fixed access networks, and it is assumed that this figure will reach 80% by 2020. More and more such bandwidth-hungry applications and services are pushing onto the market, including immersive media services such as virtual reality and, specifically, 360-degree video. However, the lack of appropriate standards and, consequently, reduced interoperability is becoming an issue. Thus, MPEG has started a project referred to as the Omnidirectional Media Application Format (OMAF). The first milestone of this standard has been reached and the committee draft (CD) has been approved at the 117th MPEG meeting. Such application formats “are essentially superformats that combine selected technology components from MPEG (and other) standards to provide greater application interoperability, which helps satisfy users’ growing need for better-integrated multimedia solutions” [MPEG-A]. In the context of OMAF, the following aspects are defined:

    • Equirectangular projection format (note: others might be added in the future)
    • Metadata for interoperable rendering of 360-degree monoscopic and stereoscopic audio-visual data
    • Storage format: ISO base media file format (ISOBMFF)
    • Codecs: High Efficiency Video Coding (HEVC) and MPEG-H 3D audio

    OMAF is the first specification defined as part of a bigger project currently referred to as ISO/IEC 23090 – Immersive Media (Coded Representation of Immersive Media). It currently has the acronym MPEG-I; we had previously used MPEG-VR, which is now replaced by MPEG-I (and that might still change in the future). It is expected that the standard will become a Final Draft International Standard (FDIS) by Q4 of 2017. Interestingly, it does not include AVC and AAC, probably the most obvious candidates for video and audio codecs, which have been massively deployed in the last decade and will probably still be a major dominator (and also denominator) in upcoming years. On the other hand, the equirectangular projection format is currently the only one defined, as it is already broadly used in off-the-shelf hardware/software solutions for the creation of omnidirectional/360-degree videos. Finally, the metadata formats enabling the rendering of 360-degree monoscopic and stereoscopic video are highly appreciated. A solution for MPEG-DASH based on AVC/AAC utilizing the equirectangular projection format for both monoscopic and stereoscopic video is shown as part of Bitmovin’s solution for VR and 360-degree video.
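
    For readers who want a concrete picture of the equirectangular projection mentioned above, here is a minimal Python sketch (purely illustrative, not taken from the OMAF specification) mapping a spherical viewing direction to pixel coordinates of an equirectangular image:

    ```python
    import math

    def equirect_project(yaw, pitch, width, height):
        """Map a direction (yaw in [-pi, pi], pitch in [-pi/2, pi/2]) to pixel
        coordinates on an equirectangular image of size width x height."""
        u = (yaw + math.pi) / (2 * math.pi) * width   # longitude -> horizontal axis
        v = (math.pi / 2 - pitch) / math.pi * height  # latitude  -> vertical axis
        return u, v

    # The center of the panorama corresponds to yaw = 0, pitch = 0.
    print(equirect_project(0.0, 0.0, 3840, 1920))  # -> (1920.0, 960.0)
    ```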

    Research aspects related to OMAF can be summarized as follows:

    • HEVC supports tiles, which allow for efficient streaming of omnidirectional video, but HEVC is not as widely deployed as AVC. Thus, it would be interesting to investigate how to mimic such a tile-based streaming approach using AVC.
    • The question of how to efficiently encode and package HEVC tile-based video is an open issue and calls for a tradeoff between tile flexibility and coding efficiency.
    • When combined with MPEG-DASH (or similar), there is a need to update the adaptation logic, as tiles add yet another dimension that needs to be considered in order to provide a good Quality of Experience (QoE); a toy sketch of such viewport-aware tile adaptation follows this list.
    • QoE is a big issue here and not well covered in the literature. Various aspects are worth investigating, including a comprehensive dataset to enable reproducibility of research results in this domain. Finally, as omnidirectional video allows for interactivity, the user experience also becomes an issue that needs to be covered within the research community.
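
    The following toy sketch illustrates the viewport-aware tile adaptation idea referenced in the list above: give tiles inside the viewport a high-quality representation first and fill the remaining tiles at low quality. The tile names, bitrate levels, and budget are invented for illustration and are not part of any MPEG specification.

    ```python
    def allocate_tile_bitrates(viewport_tiles, all_tiles, budget_kbps,
                               hi_kbps=4000, lo_kbps=500):
        """Toy heuristic: spend the bandwidth budget on viewport tiles first
        (high quality), then fill the remaining tiles at low quality."""
        allocation = {}
        remaining = budget_kbps
        for tile in sorted(all_tiles, key=lambda t: t not in viewport_tiles):
            target = hi_kbps if tile in viewport_tiles else lo_kbps
            rate = min(target, max(remaining, 0))
            allocation[tile] = rate
            remaining -= rate
        return allocation

    # Example: 8 tiles, 2 of them currently in the viewport, 12 Mbit/s budget.
    tiles = ["tile%d" % i for i in range(8)]
    print(allocate_tile_bitrates({"tile0", "tile1"}, tiles, budget_kbps=12000))
    ```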

    A second topic I’d like to highlight in this blog post is related to the preliminary call for evidence on video compression with capability beyond HEVC. 

    Preliminary Call for Evidence on video compression with capability beyond HEVC

    A call for evidence is issued to see whether sufficient technological potential exists to start a more rigorous phase of standardization. Currently, MPEG together with VCEG have developed a Joint Exploration Model (JEM) algorithm that is already known to provide bit rate reductions in the range of 20-30% for relevant test cases, as well as subjective quality benefits. The goal of this new standard – with a preliminary target date for completion around late 2020 – is to develop technology providing better compression capability than the existing standard, not only for conventional video material but also for other domains such as HDR/WCG or VR/360-degree video. An important aspect in this area is certainly over-the-top video delivery (as with MPEG-DASH), which includes features such as scalability and Quality of Experience (QoE). Scalable video coding has been added to video coding standards since MPEG-2 but never reached widespread adoption. That might change if it becomes a prime-time feature of a new video codec, as scalable video coding clearly shows benefits when doing dynamic adaptive streaming over HTTP. QoE has already found its way into video coding, at least when it comes to evaluating the results, where subjective tests are now an integral part of every new video codec developed by MPEG (in addition to the usual PSNR measurements). Therefore, the most interesting research topics from a multimedia communication point of view would be to optimize the DASH-like delivery of such new codecs with respect to scalability and QoE. Note that if you don’t like scalable video coding, feel free to propose something else, as long as it reduces storage and networking costs significantly.
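
    As a reminder of the objective metric mentioned above, the PSNR computation follows its standard definition; the NumPy sketch below is illustrative only, and the frame data it uses is synthetic rather than decoded video.

    ```python
    import numpy as np

    def psnr(reference, reconstruction, max_value=255.0):
        """Peak signal-to-noise ratio in dB between two equally sized frames."""
        diff = reference.astype(np.float64) - reconstruction.astype(np.float64)
        mse = np.mean(diff ** 2)
        if mse == 0:
            return float("inf")  # identical frames
        return 10.0 * np.log10(max_value ** 2 / mse)

    # Synthetic 8-bit frames, only to exercise the function; real evaluations
    # compare original and decoded video frames.
    ref = np.random.randint(0, 256, (1080, 1920), dtype=np.uint8)
    rec = np.clip(ref.astype(np.int16) + np.random.randint(-2, 3, ref.shape), 0, 255).astype(np.uint8)
    print(round(psnr(ref, rec), 2))
    ```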

     

    MPEG Workshop “Global Media Technology Standards for an Immersive Age”

    On January 18, 2017, MPEG successfully held a public workshop on “Global Media Technology Standards for an Immersive Age”, hosting a series of keynotes from Bitmovin, DVB, Orange, Sky Italia, and Technicolor. Stefan Lederer, CEO of Bitmovin, discussed today’s and future challenges with new forms of content like 360°, AR and VR. All slides are available on the MPEG website (http://mpeg.chiariglione.org/) and MPEG took their feedback into consideration in an update of its 5-year standardization roadmap. David Wood (EBU) reported on the DVB VR study mission and Ralf Schaefer (Technicolor) presented a snapshot of VR services. Gilles Teniou (Orange) discussed video formats for VR, pointing out a new opportunity to increase content value but also raising the question of what is missing today. Finally, Massimo Bertolotti (Sky Italia) introduced his view on the immersive media experience age.

    Overall, the workshop was well attended and, as mentioned above, MPEG is currently working on a new standards project related to immersive media. Currently, this project comprises five parts. The first part is a technical report describing the scope (including a kind of system architecture), use cases, and applications. The second part is OMAF (see above), and the third and fourth parts are related to immersive video and audio, respectively. Part five is about point cloud compression.

    For those interested, please check out the slides from the industry representatives in this field and draw your own conclusions about what could be interesting for your own research. I’m happy to see any reactions, hints, etc. in the comments.

    Finally, let’s have a look what happened related to MPEG-DASH, a topic with a long history on this blog.

    MPEG-DASH and CMAF: Friend or Foe?

    For MPEG-DASH and CMAF it was a meeting “in between” official standardization stages. MPEG-DASH experts are still working on the third edition, which will be a consolidated version of the 2nd edition and various amendments and corrigenda. In the meantime, MPEG issued a white paper on the new features of MPEG-DASH, which I would like to highlight here.

    • Spatial Relationship Description (SRD): allows describing tiles and regions of interest for partial delivery of media presentations. This is highly related to OMAF and VR/360-degree video streaming (a small parsing sketch follows this list).
    • External MPD linking: this feature allows describing, within the MPD, the relationship between a single program/channel and a preview mosaic channel showing all channels at once.
    • Period continuity: a simple signaling mechanism to indicate whether one period is a continuation of the previous one, which is relevant for ad insertion or live programs.
    • MPD chaining: allows chaining two or more MPDs to each other, e.g., a pre-roll ad when joining a live program.
    • Flexible segment format for broadcast TV: separates the signaling of switching points and random access points in each stream; thus, the content can be encoded with good compression efficiency, yet allowing a higher number of random access points with a lower frequency of switching points.
    • Server and network-assisted DASH (SAND): enables asynchronous network-to-client and network-to-network communication of quality-related assisting information.
    • DASH with server push and WebSockets: basically addresses issues related to the HTTP/2 push feature and WebSockets.
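
    As a rough illustration of how a client might consume the SRD descriptor mentioned in the list above, the sketch below parses its comma-separated value string and checks whether a tile overlaps the current viewport. The parameter order shown (source_id, object_x, object_y, object_width, object_height, total_width, total_height, spatial_set_id) follows the common description of SRD in the literature; consult ISO/IEC 23009-1 for the normative definition.

    ```python
    from dataclasses import dataclass

    @dataclass
    class SRD:
        source_id: int
        object_x: int
        object_y: int
        object_width: int
        object_height: int
        total_width: int = 0
        total_height: int = 0
        spatial_set_id: int = 0

    def parse_srd(value: str) -> SRD:
        """Parse the comma-separated 'value' attribute of an SRD descriptor."""
        return SRD(*[int(v) for v in value.split(",")])

    def overlaps(srd: SRD, vx, vy, vw, vh):
        """True if the tile rectangle intersects a viewport rectangle given in the
        same reference space (total_width x total_height)."""
        return not (srd.object_x + srd.object_width <= vx or vx + vw <= srd.object_x or
                    srd.object_y + srd.object_height <= vy or vy + vh <= srd.object_y)

    tile = parse_srd("0,1920,0,1920,1080,3840,2160,1")   # right half of a 3840x2160 grid
    print(overlaps(tile, 1000, 0, 1920, 1080))           # viewport straddles both halves -> True
    ```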

    CMAF issued a study document which captures the current progress, and all national bodies are encouraged to take this into account when commenting on the Committee Draft (CD). To answer the question in the headline above, it looks more and more like DASH and CMAF will become friends – let’s hope that the friendship lasts for a long time.

    What else happened at the MPEG meeting?

    • Committee Draft MORE (note: type ‘man more’ on any unix/linux/mac terminal and you’ll get ‘less – opposite of more’;): MORE stands for “Media Orchestration” and provides a specification that enables the automated combination of multiple media sources (cameras, microphones) into a coherent multimedia experience. Additionally, it targets use cases where a multimedia experience is rendered on multiple devices simultaneously, again giving a consistent and coherent experience.
    • Technical Report on HDR/WCG Video Coding: This technical report comprises conversion and coding practices for High Dynamic Range (HDR) and Wide Colour Gamut (WCG) video coding (ISO/IEC 23008-14). The purpose of this document is to provide a set of publicly referenceable recommended guidelines for the operation of AVC or HEVC systems adapted for compressing HDR/WCG video for consumer distribution applications.
    • CfP Point Cloud Compression (PCC): This call solicits technologies for the coding of 3D point clouds with associated attributes such as color and material properties. It will be part of the immersive media project introduced above.
    • MPEG-H 3D Audio verification test report: This report presents results of four subjective listening tests that assessed the performance of the Low Complexity Profile of MPEG-H 3D Audio. The tests covered a range of bit rates and a range of “immersive audio” use cases (i.e., from 22.2 down to 2.0 channel presentations). Seven test sites participated in the tests with a total of 288 listeners.

    The next MPEG meeting will be held in Hobart, April 3-7, 2017. Feel free to contact us for any questions or comments.


    Call for Grand Challenge Problem Proposals

    Original page: http://www.acmmm.org/2017/contribute/call-for-multimedia-grand-challenge-proposals/

     

    The Multimedia Grand Challenge was first presented as part of ACM Multimedia 2009 and has established itself as a prestigious competition in the multimedia community.  The purpose of the Multimedia Grand Challenge is to engage with the multimedia research community by establishing well-defined and objectively judged challenge problems intended to exercise state-of-the-art techniques and methods and inspire future research directions.

    Industry leaders and academic institutions are invited to submit proposals for specific Multimedia Grand Challenges to be included in this year’s program.

    A Grand Challenge proposal should include:

    • A brief description motivating why the challenge problem is important and relevant for the multimedia research community, industry, and/or society today and going forward for the next 3-5 years.
    • A description of a specific set of tasks or goals to be accomplished by challenge problem submissions.
    • Links to relevant datasets to be used for experimentation, training, and evaluation as necessary. Full appropriate documentation on any datasets should be provided or made accessible.
    • A description of rigorously defined objective criteria and/or procedures for how submissions will be judged.
    • Contact information of at least two organizers who will be responsible for accepting and judging submissions as described in the proposal.

    Grand Challenge proposals will be considered until March 1st and will be evaluated on an on-going basis as they are received. Grand Challenge proposals that are accepted to be part of the ACM Multimedia 2017 program will be posted on the conference website and included in subsequent calls for participation. All material, datasets, and procedures for a Grand Challenge problem should be ready for dissemination no later than March 14th.

    While each Grand Challenge is allowed to define an independent timeline for solution evaluation and may allow iterative resubmission and possible feedback (e.g., a publicly posted leaderboard), challenge submissions must be complete and a paper describing the solution and results should be submitted to the conference program committee by July 14, 2017.

    Grand Challenge proposals should be sent via email to the Grand Challenge chair, Ketan Mayer-Patel.

    Those interested in submitting a Grand Challenge proposal are encouraged to review the problem descriptions from ACM Multimedia 2016 as examples. These are available here: http://www.acmmm.org/2016/?page_id=353


    JPEG Column: 74th JPEG Meeting

    The 74th JPEG meeting was held at ITU Headquarters in Geneva, Switzerland, from 15 to 20 January 2017, featuring the following highlights:

    • A Final Call for Proposals on JPEG Pleno was issued focusing on light field coding;
    • Creation of a test model for the upcoming JPEG XS standard;
    • A draft Call for Proposals for JPEG Privacy & Security was issued;
    • JPEG AIC technical report finalized on Guidelines for image coding system evaluation;
    • An AHG was created to investigate the evidence of high throughput JPEG 2000;
    • An AHG on next generation image compression standard was initiated to explore a future image coding format with superior compression efficiency.

     

    JPEG Pleno kicks off its activities towards standardization of light field coding

    At the 74th JPEG meeting in Geneva, Switzerland, the final Call for Proposals (CfP) on JPEG Pleno was issued, particularly focusing on light field coding. The CfP is available on the JPEG website (www.jpeg.org).

    The call encompasses coding technologies for lenslet light field cameras, and content produced by high-density arrays of cameras. In addition, system-level solutions associated with light field coding and processing technologies that have a normative impact are called for. In a later stage, calls for other modalities such as point cloud, holographic and omnidirectional data will be issued, encompassing image representations and new and rich forms of visual data beyond the traditional planar image representations.

    JPEG Pleno intends to provide a standard framework to facilitate capture, representation and exchange of these omnidirectional, depth-enhanced, point cloud, light field, and holographic imaging modalities. It aims to define new tools for improved compression while providing advanced functionalities at the system level. Moreover, it targets to support data and metadata manipulation, editing, random access and interaction, protection of privacy and ownership rights as well as other security mechanisms.

     

    JPEG XS aims at the standardization of a visually lossless, low-latency, lightweight compression scheme that can be used for a wide range of applications, including as a mezzanine codec for the broadcast industry and Pro-AV markets. Targeted use cases are professional video links, IP transport, Ethernet transport, real-time video storage, video memory buffers, and omnidirectional video capture and rendering. After a Call for Proposals issued on March 11th, 2016 and the assessment of the submitted technologies, a test model for the upcoming JPEG XS standard was created during the 73rd JPEG meeting in Chengdu, and the results of a first set of core experiments were reviewed during the 74th JPEG meeting in Geneva. More core experiments are on their way before finalizing the standard: the JPEG committee therefore invites interested parties – in particular coding experts, codec providers, system integrators and potential users of the foreseen solutions – to contribute to the further specification process.

     

    JPEG Privacy & Security aims at developing a standard for realizing secure image information sharing which is capable of ensuring privacy, maintaining data integrity, and protecting intellectual property rights (IPR). JPEG Privacy & Security will explore how to design and implement the necessary features without significantly impacting coding performance, while ensuring scalability, interoperability, and forward and backward compatibility with current JPEG standard frameworks.

    A draft Call for Proposals for JPEG Privacy & Security has been issued, and the JPEG committee invites interested parties to contribute to this standardisation activity in JPEG Systems. The draft CfP is available on the JPEG website (www.jpeg.org).

    The call addresses protection mechanisms and technologies such as handling hierarchical levels of access and multiple protection levels for metadata and image protection, checking integrity of image data and embedded metadata, and supporting backward and forward compatibility with JPEG coding technologies. Interested parties are encouraged to subscribe to the JPEG Privacy & Security email reflector for collecting more information. A final version of the JPEG Privacy & Security Call for Proposals is expected at the 75th JPEG meeting located in Sydney, Australia.

     

    JPEG AIC provides guidance and standard procedures for advanced image coding evaluation. At this meeting JPEG completed a technical report: TR 29170-1, Guidelines for image coding system evaluation. This report is a compendium of JPEG’s best practices in evaluation that draws on several different international standards and international recommendations. The report discusses the use of objective tools, subjective procedures and computational analysis techniques, and when to use the different tools. Some of the techniques are tried-and-true tools familiar to image compression experts and vision scientists. Several tools represent new fields where few tools have been available, such as the evaluation of coding systems for high dynamic range content.

     

    High throughput JPEG 2000

    The JPEG committee started a new activity on high-throughput JPEG 2000, and an AHG was created to investigate the evidence for such a standard. Experts are invited to participate in this group and to join the mailing list.

     

    Final Quote

    “JPEG continues to offer standards that redefine imaging products and services contributing to a better society without borders.” said Prof. Touradj Ebrahimi, the Convener of the JPEG committee.

     

    About JPEG

    The Joint Photographic Experts Group (JPEG) is a Working Group of ISO/IEC, the International Organisation for Standardization / International Electrotechnical Commission (ISO/IEC JTC 1/SC 29/WG 1), and of the International Telecommunication Union (ITU-T SG16), responsible for the popular JBIG, JPEG, JPEG 2000, JPEG XR, JPSearch and, more recently, the JPEG XT, JPEG XS, JPEG Systems and JPEG PLENO families of imaging standards.

    More information about JPEG and its work is available at www.jpeg.org or by contacting Antonio Pinheiro and Tim Bruylants of the JPEG Communication Subgroup at pr@jpeg.org.

    If you would like to stay posted on JPEG activities, please subscribe to the jpeg-news mailing list at https://listserv.uni-stuttgart.de/mailman/listinfo/jpeg-news. Moreover, you can follow the JPEG Twitter account at http://twitter.com/WG1JPEG.

     

    Future JPEG meetings are planned as follows:

    • No. 75, Sydney, AU, 26 – 31 March, 2017
    • No. 76, Torino, IT, 17 – 21 July, 2017
    • No. 77, Macau, CN, 23 – 27 October 2017

     


    Amirhossein Habibian

    Storytelling Machines for Video Search

    Supervisor(s) and Committee member(s): Advisor(s): Arnold W.M. Smeulders (promotor), Cees G.M. Snoek (co-promotor).

    URL: http://dare.uva.nl/record/1/540787

    ISBN: 978-94-6182-715-9

    This thesis studies the fundamental question: what vocabulary of concepts is suited for machines to describe video content? The answer to this question involves two annotation steps: first, specifying a list of concepts by which videos are described; second, labeling a set of videos per concept as its examples or counter-examples. Subsequently, the vocabulary is constructed as a set of video concept detectors learned from the provided annotations by supervised learning.

    Starting from handcrafting the vocabulary by manual annotation, we gradually automate vocabulary construction by concept composition, and by learning from human stories. As a case study, we focus on vocabularies for describing events, such as marriage proposal, graduation ceremony, and changing a vehicle tire, in videos.

    As the first step, we rely on an extensive pool of manually specified concepts to study what the best practices are for handcrafting the vocabulary. From our analysis, we conclude that the vocabulary should encompass thousands of concepts of various types, including object, action, scene, people, animal, and attribute. Moreover, the vocabulary should include detectors for both generic concepts and specific concepts, trained and normalized in an appropriate way.
    We alleviate the manual labor for vocabulary construction by addressing the next research question: can a machine learn novel concepts by composition? We propose an algorithm which learns new concepts by composing the ground concepts with Boolean logic connectives, e.g. “ride-AND-bike”. We demonstrate that concept composition is an effective trick to infer the annotations needed for training new concept detectors, without additional human annotation.
    As a further step towards reducing the manual labor for vocabulary construction, we investigate whether a machine can learn its vocabulary from human stories, i.e. video captions or subtitles. By analyzing the human stories using topic models, we effectively extract the concepts that humans use for describing videos. Moreover, we show that the occurrences of concepts in stories can be effectively used as weak supervision to train concept detectors.
    Finally, we address the question of how to learn the vocabulary from human stories. We learn the vocabulary as an embedding from videos into their stories. We utilize the correlations between the terms to learn the embedding more effectively. More specifically, we learn similar embeddings for terms which highly co-occur in the stories, as these terms are usually synonyms. Furthermore, we extend our embedding to learn the vocabulary from various video modalities, including audio and motion. This enables us to generate more natural descriptions by incorporating concepts from various modalities, e.g. the laughing and singing concepts from audio, and the jumping and dancing concepts from motion.
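
    To make the concept-composition idea above concrete, here is a small, hypothetical Python sketch: given per-video labels for two ground concepts, a Boolean AND produces labels for a new composite concept (“ride-AND-bike”) without any extra annotation. The video IDs and labels are invented for illustration and do not come from the thesis.

    ```python
    # Hypothetical per-video labels for two ground concepts (video IDs invented).
    ride = {"v1": 1, "v2": 1, "v3": 0, "v4": 1}
    bike = {"v1": 1, "v2": 0, "v3": 1, "v4": 1}

    def compose_and(labels_a, labels_b):
        """Infer labels for the composite concept 'A-AND-B': positive only where
        both ground concepts are positive."""
        return {v: int(labels_a[v] and labels_b[v])
                for v in labels_a.keys() & labels_b.keys()}

    ride_and_bike = compose_and(ride, bike)
    print(ride_and_bike)  # v1 and v4 become positives for the new detector, v2 and v3 negatives
    ```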

    Intelligent Sensory Information Systems group

    URL: https://ivi.fnwi.uva.nl/isis/

    The world is full of digital images and videos. In this deluge of visual information, the grand challenge is to unlock its content. This quest is the central research aim of the Intelligent Sensory Information Systems group. We address the complete knowledge chain of image and video retrieval by machine and human. Topics of study are semantic understanding, image and video mining, interactive picture analytics, and scalability. Our research strives for automation that matches human visual cognition, interaction surpassing man and machine intelligence, visualization blending it all in interfaces giving instant insight, and database architectures for extreme sized visual collections. Our research culminates in state-of-the-art image and video search engines which we evaluate in leading benchmarks, often as the best performer, in user studies, and in challenging applications.


    Chien-nan Chen

    Semantic-Aware Content Delivery Framework for 3D Tele-Immersion

    Supervisor(s) and Committee member(s): Klara Nahrstedt (advisor), Roy Campbell (opponent), Indranil Gupta (opponent), Cheng-Hsin Hsu (opponent)

    URL: http://cairo.cs.uiuc.edu/publications/papers/Shannon_Thesis.pdf

    3D Tele-immersion (3DTI) technology allows full-body, multimodal interaction among geographically dispersed users, which opens a variety of possibilities in cyber collaborative applications such as art performance, exergaming, and physical rehabilitation. However, with its great potential, the resource and quality demands of 3DTI rise inevitably, especially when some advanced applications target resource-limited computing environments with stringent scalability demands. Under these circumstances, the tradeoffs between 1) resource requirements, 2) content complexity, and 3) user satisfaction in delivery of 3DTI services are magnified.

    In this dissertation, we argue that these tradeoffs of 3DTI systems are actually avoidable when the underlying delivery framework of 3DTI takes semantic information into consideration. We introduce the concept of semantic information into 3DTI, which encompasses information about three factors: environment, activity, and user role in 3DTI applications. With semantic information, 3DTI systems are able to 1) identify the characteristics of their computing environment to allocate computing power and bandwidth to the delivery of prioritized contents, 2) pinpoint and discard dispensable content in activity capturing according to the properties of the target application, and 3) differentiate contents by their contributions to fulfilling the objectives and expectations of the user’s role in the application, so that the adaptation module can allocate the resource budget accordingly. With these capabilities we can change the tradeoffs into synergy between resource requirements, content complexity, and user satisfaction.
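
    As a loose illustration of the kind of semantics-driven resource allocation described above, the toy sketch below splits a bandwidth budget across streams in proportion to an application-assigned priority. The stream names and priority values are invented for illustration and are not taken from the thesis.

    ```python
    def allocate_bandwidth(priorities, budget_mbps):
        """Split a bandwidth budget across streams in proportion to their semantic priority."""
        total = sum(priorities.values())
        return {stream: budget_mbps * weight / total for stream, weight in priorities.items()}

    # E.g., in a rehabilitation session the patient's body stream matters most.
    priorities = {"patient_body": 5, "therapist_body": 2, "room_background": 1}
    print(allocate_bandwidth(priorities, budget_mbps=40))  # {'patient_body': 25.0, ...}
    ```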

    We implement semantics-aware 3DTI systems to verify the performance gain in the three phases of 3DTI systems’ delivery chain: the capturing phase, the dissemination phase, and the receiving phase. By introducing semantic information into distinct 3DTI systems, the efficiency improvements brought by our semantics-aware content delivery framework are validated under different application requirements, different scalability bottlenecks, and different user and application models.

    To sum up, in this dissertation we aim to change the tradeoff between requirements, complexity, and satisfaction in 3DTI services by exploiting semantic information about the computing environment, the activity, and the user role in the underlying delivery systems of 3DTI. The devised mechanisms will enhance the efficiency of 3DTI systems serving different purposes and 3DTI applications with different computation and scalability requirements.

    MONET

    URL: http://cairo.cs.uiuc.edu/

    The Multimedia Operating Systems and Networking (MONET) Research Group, led by Professor Klara Nahrstedt in the Department of Computer Science at the University of Illinois at Urbana-Champaign, is engaged in research in various areas of distributed multimedia systems.


    Masoud Mazloom

    In Search of Video Event Semantics

    Supervisor(s) and Committee member(s): Advisor(s): Arnold W.M. Smeulders (promotor), Cees G.M. Snoek (co-promotor).

    URL: http://dare.uva.nl/record/1/430219

    ISBN: 978-94-6182-717-3

    In this thesis we aim to represent an event in a video using semantic features. We start from a bank of concept detectors for representing events in video.
    At first we consider the relevance of concepts to the event inside the video representation. We address the problem of video event classification using a bank of concept detectors. Different from existing work, which simply relies on a bank containing all available detectors, we propose an algorithm that learns from examples which concepts in the bank are most informative per event.
    Secondly, we concentrate on the accuracy of concept detectors. Different from existing works, which obtain a semantic representation by training concepts over entire video clips, we propose an algorithm that learns a set of relevant frames as the concept prototypes from web video examples, without the need for frame-level annotations, and uses them for representing an event video.
    Thirdly, we consider the problem of searching video events with concepts. We aim at querying web videos for events using only a handful of video query examples, where the standard approach learns a ranker from hundreds of examples. We consider a semantic representation, consisting of off-the-shelf concept detectors, to capture the variance in semantic appearance of events.
    Finally, we consider the problem of video event search without semantic concepts. The prevailing solutions in the literature rely on a semantic video representation obtained from thousands of pre-trained concept detectors. Different from them, we propose a new semantic video representation that is based only on freely available socially tagged videos, without the need for training any intermediate concept detectors.
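
    A minimal sketch of the “most informative concepts per event” idea from the abstract above: rank concept detectors by a simple relevance score and keep the top k. Scoring by absolute correlation between detector scores and event labels is an assumption chosen here for illustration; the thesis proposes its own selection algorithm, and the data below is synthetic.

    ```python
    import numpy as np

    def select_informative_concepts(scores, labels, k=5):
        """scores: (n_videos, n_concepts) detector outputs; labels: (n_videos,) event labels.
        Rank concepts by absolute correlation with the event and keep the top k."""
        relevance = np.array([abs(np.corrcoef(scores[:, c], labels)[0, 1])
                              for c in range(scores.shape[1])])
        return np.argsort(relevance)[::-1][:k]

    rng = np.random.default_rng(0)
    scores = rng.random((200, 50))                               # synthetic detector scores
    labels = (scores[:, 3] + scores[:, 7] > 1.0).astype(float)   # event driven by concepts 3 and 7
    print(select_informative_concepts(scores, labels, k=2))      # expected to pick concepts 3 and 7
    ```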

    Intelligent Sensory Information Systems group

    URL: https://ivi.fnwi.uva.nl/isis/

    The world is full of digital images and videos. In this deluge of visual information, the grand challenge is to unlock its content. This quest is the central research aim of the Intelligent Sensory Information Systems group. We address the complete knowledge chain of image and video retrieval by machine and human. Topics of study are semantic understanding, image and video mining, interactive picture analytics, and scalability. Our research strives for automation that matches human visual cognition, interaction surpassing man and machine intelligence, visualization blending it all in interfaces giving instant insight, and database architectures for extreme sized visual collections. Our research culminates in state-of-the-art image and video search engines which we evaluate in leading benchmarks, often as the best performer, in user studies, and in challenging applications.


    Rufael Mekuria

    Network Streaming and Compression for Mixed Reality Tele-Immersion

    Supervisor(s) and Committee member(s): prof. Dick Bulterman (promotor), Dr. Pablo Cesar (co-promotor, supervisor), prof. Klara Nahrstedt (Opponent), prof. Fernando M.B. Pereira (Opponent), prof. Maarten van Steen (Opponent), Dr. Thilo Kielmann (Opponent), prof. Rob van der Mei (Opponent)

    URL: http://dare.ubvu.vu.nl/handle/1871/55024

    ISBN: 978-90-9030147-1

    The Internet is used for distributed shared experiences such as video conferencing, voice calls (possibly in a group), chatting, photo sharing, online gaming and virtual reality. These technologies are changing our daily lives and the way we interact with each other. The current rapid advances in 3D depth sensing and 3D cameras are enabling acquisition of highly realistic reconstructed 3D representations. These natural scene representations are often based on 3D point clouds or 3D meshes. Integration of these data in distributed shared experiences can have a large impact on the way we work and interact online. Such shared experiences may enable 3D tele-immersion and mixed reality that combine real and synthetic contents in a virtual world. However, this poses many challenges to the existing Internet infrastructure. A large part of the challenge is due to the sheer volume of reconstructed 3D data. End-to-end Internet connections are band-limited and currently cannot support real-time end-to-end transmission of uncompressed 3D point cloud or mesh scans with hundreds of thousands of points (over 15 Megabytes per frame) captured at a fast rate (over 10 frames per second). Therefore, the volume of the 3D data requires the development of methods for efficient compression and transmission, possibly taking application- and user-specific requirements into account. In addition, sessions often need to be set up between different software and devices such as browsers, desktop applications, mobile applications or server-side applications. For this reason interoperability is required. This introduces the need for standardisation of data formats, compression techniques and signalling (session management). In the case of mixed reality in a social networking context, users may use different types of reconstructed and synthetic 3D content (from simple avatar commands to highly realistic 3D reconstructions based on 3D meshes or point clouds). Therefore such signalling should take into account that different types of user setups exist, from simple to very advanced, that can each join shared sessions and interact.

    This thesis develops strategies for compression and transmission of reconstructed 3D data in Internet infrastructures. It develops three different approaches for the compression of 3D meshes and a codec for time-varying 3D point clouds. Further, it develops an integrated 3D streaming framework that includes session management and signalling, media synchronization and a generic API for sending streams based on UDP/TCP-based protocols. Experiments with these components in a realistic integrated mixed reality system with state-of-the-art rendering and 3D data capture investigate the specific system and user experience issues arising in the integration of these sub-components.

    The first mesh codec takes blocks of the mesh geometry list and applies local per-block differential encoding, while the connectivity is coded based on a repetitive pattern resulting from the reconstruction system. The main advantage of this approach is its simplicity and parallelizability. The codec is integrated in an initial prototype for 3D immersive communication that includes a communication protocol based on rateless coding with LT codes and a light 3D rendering engine that includes an implementation of global illumination.

    The second mesh codec is a connectivity-driven approach. It codes the connectivity in a similar manner as the first codec, but with entropy encoding added based on deflate/inflate (using the popular zlib library). This addition makes the connectivity codec much more generically applicable. Subsequently, it traverses the connectivity to apply differential coding of the geometry. The differences between connected vertices are then quantized using a non-linear quantizer. We call this delayed quantization step late quantization (LQ). This approach results in reduced encoding complexity at only a modest degradation in R-D performance compared to the state of the art in standardized mesh compression in MPEG-4; in practice, the resulting codec encodes over 10 times faster than the latter. The codec is used to achieve real-time communication in a WAN/MAN scenario in a controlled IP network configuration. This includes real-time rendering and rateless packet coding of UDP packet data. The streaming pipeline has been optimized to run in real time with the varying frame rates that often occur in 3D tele-immersion and mixed reality. In addition, it was tested in different network conditions using a LIFO (Last In First Out) approach that optimizes the pipeline, and it has been integrated with highly realistic rendering and 3D capture.
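
    To give an intuition for the differential geometry coding with quantization described above, here is a much simplified Python sketch: vertex positions are traversed in order, and deltas between consecutive vertices are uniformly quantized. The actual codec traverses the mesh connectivity and uses a non-linear quantizer, so this is only an illustrative approximation with invented vertex data.

    ```python
    import numpy as np

    def encode_geometry(vertices, step=0.001):
        """Store the first vertex plus uniformly quantized deltas between consecutive vertices."""
        deltas = np.diff(vertices, axis=0)
        return vertices[0], np.round(deltas / step).astype(np.int32)

    def decode_geometry(first_vertex, qdeltas, step=0.001):
        """Reconstruct vertex positions from the first vertex and the quantized deltas."""
        deltas = qdeltas.astype(np.float64) * step
        return np.vstack([first_vertex, first_vertex + np.cumsum(deltas, axis=0)])

    verts = np.array([[0.000, 0.000, 0.000],
                      [0.012, 0.001, 0.000],
                      [0.013, 0.021, 0.002]])
    first, q = encode_geometry(verts)
    print(decode_geometry(first, q))  # close to the original, up to the quantization error
    ```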

    The third codec is based on a geometry-driven approach. In this codec the geometry is coded first in an octree fashion and then the connectivity representation is converted to a representation that indexes voxels in the octree grid. This representation introduces correlation between the indices, which is exploited using a vector quantization scheme. This codec enables real-time coding at different levels of detail (LoD) and highly adaptive bit-rates. It is useful when the 3D immersive virtual room is deployed over the Internet, where bandwidths may fluctuate heavily and are more restricted compared to the controlled WAN/MAN scenario. In addition, it is suitable for 3D representations that can be rendered at a lower level of detail, such as participants/objects rendered at a distance in the 3D room. Next, the focus shifts towards 3D point clouds instead of 3D meshes. 3D point clouds are a simpler representation of the 3D reconstructions. The thesis develops a codec for time-varying point clouds. It introduces a hybrid architecture that combines an octree-based intra codec with lossy inter-prediction and lossy attribute coding based on mapping attributes to a JPEG image grid. It also introduces temporal inter-prediction. The predictively coded frames are reduced in size by up to 30% and the colours by up to 90% compared to the current state of the art in real-time point cloud compression. Subjective experiments in a realistic mixed reality virtual world framework developed in the Reverie project showed no significant degradation in the resulting perceptual quality.
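
    The octree-based geometry coding used for point clouds can be pictured with the following toy sketch, which voxelizes points onto a uniform grid (one level of an octree hierarchy) and stores only the occupied cells. The cell size and the random points are illustrative assumptions, not data or parameters from the thesis.

    ```python
    import numpy as np

    def voxelize(points, cell_size):
        """Quantize 3D points to occupied voxel indices; duplicate points collapse into one cell."""
        voxels = np.floor(points / cell_size).astype(np.int64)
        return np.unique(voxels, axis=0)

    points = np.random.rand(100000, 3)            # synthetic point cloud inside the unit cube
    occupied = voxelize(points, cell_size=0.05)   # a 20 x 20 x 20 grid at this level
    print(len(points), "points ->", len(occupied), "occupied voxels")
    ```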

    In the last phase of this thesis, the complete 3D tele-immersive streaming platform is further developed. Additions include signalling support for 3D streaming (session management) that supports terminal scalability from light clients (render only) up to very heavy clients. This is done by signalling the local modular configuration in advance via the XMPP protocol. Further, the streaming platform architecture presents an API where different stream types suitable to different 3D capture/reconstruction platforms (i.e. 3D audio, 3D visual, 3D animation) can be created. As the platform includes a distributed virtual clock, mechanisms to perform inter-stream and inter-sender media synchronization can be deployed at the application layer. Therefore, synchronization of compressed 3D audio streams in an audio playout buffer was implemented in a 3D audio rendering module. We also implemented a mesh and point cloud playout buffer in the module for synchronized rendering. This mesh playout buffer enables inter-sender synchronization between different incoming visual streams. In addition, the system includes a simple publish-and-subscribe transmission protocol for small messages based on WebSocket (through a real-time cloud broker), and publish and subscribe based on the XMPP and UDP protocols was implemented as well. These publish-and-subscribe messages are particularly suitable for 3D animation commands and AI data exchange. All components developed throughout this thesis have been integrated with 3D capture/rendering modules and in a social networking context in the larger Reverie 3D tele-immersive framework. Field trials of this system in different scenarios have shown the benefits of highly realistic live-captured 3D data representations, which further highlights the importance of this work. The components developed in this thesis and their integration outline many of the significant challenges encountered in the next generation of 3D tele-presence and mixed reality systems. These insights have contributed to the development of requirements for new international standards in the MPEG (Moving Picture Experts Group) and JPEG (Joint Photographic Experts Group) consortia. In addition, the developed codec and quality metrics for point cloud compression have been accepted as a base reference software model for a novel standard on point cloud compression in MPEG and are available in the MPEG code repository and online on GitHub.

    DIS CWI

    URL: http://www.dis.cwi.nl/

    Centrum Wiskunde Informatica Netherlands focusses on applied and fundamental problems in Mathematics and Computer Science. The Distributed and Interactive Systems group (DIS) focuses on modeling and controlling complex collections of media objects (including real-time media and sensor data) that are interactive and distributed in time and space. The group’s fundamental interest is in understanding how the various notions of ‘time’ influence the creation, distribution and delivery of complex content in a customizable manner. The group is led by Dr. Pablo Cesar


    Svetlana Kordumova

    Learning to Search for Images without Annotations

    Supervisor(s) and Committee member(s): Advisor(s): Arnold W.M. Smeulders (promotor), Cees G.M. Snoek (co-promotor).

    URL: http://dare.uva.nl/record/1/540788

    ISBN: 978-608-4784-15-9

    0This thesis contributes to learning machines what is in an image by avoiding direct manual annotation as training data. We either rely on tagged data from social media platforms to recognize concepts, or on objects semantics and layout to recognize scenes. We focus our effort on image search.
    We firstly demonstrate that concepts detectors can be learned using tagged examples from social media platforms. We show that using tagged images and videos directly as ground truth for learning can be problematic because of the noisy nature of tags. To this end, through extensive experimental analysis, we recommend to calculate the relevance of tags, and select only relevant positive and relevant negative examples for learning. Inclusive, we present four best practices which led to a winning entry on the TRECVID 2013 benchmark for the semantic indexing with no annotations task. Following the findings that important concepts appear rarely as tags in social media platforms, we propose to use semantic knowledge from an ontology to improve calculating tag relevance and to enrich training data for learning concept detectors of rare tags.
    When searching images of a particular scene, instead of using annotated scene images, we show that with object classifiers we can reasonably well recognize scenes. We exploit 15,000 object classifiers trained with a convolutional neural network. Since not all objects can contribute equally in describing a scene, we show that pooling only the 100 most prominent object classifiers per image is good enough to recognize its scene. Furthermore, we go to the extreme of recognizing scenes by removing all object identities. We refer to the most probable positions in images to contain objects as things. We show that the ensemble of things properties, size, position, aspect ratio and prominent color, and those only, can discriminate scenes. The benefit of removing all object identities is that we also eliminate the learning of object classifiers in the process, and thus demonstrate that scenes can be recognized with no learning at all.
    Overall, this thesis presents alternative ways to learn what concept is in an image or what scene it belongs to, without using manually annotated data, for the goal of image search. It investigates new approaches for learning machines to recognize the visually depicted environment captured in images, all the while dismissing the annotation process.

    Intelligent Sensory Information Systems group

    URL: https://ivi.fnwi.uva.nl/isis/

    The world is full of digital images and videos. In this deluge of visual information, the grand challenge is to unlock its content. This quest is the central research aim of the Intelligent Sensory Information Systems group. We address the complete knowledge chain of image and video retrieval by machine and human. Topics of study are semantic understanding, image and video mining, interactive picture analytics, and scalability. Our research strives for automation that matches human visual cognition, interaction surpassing man and machine intelligence, visualization blending it all in interfaces giving instant insight, and database architectures for extreme sized visual collections. Our research culminates in state-of-the-art image and video search engines which we evaluate in leading benchmarks, often as the best performer, in user studies, and in challenging applications.


Call for Task Proposals: Multimedia Evaluation 2017

MediaEval 2017 Multimedia Evaluation Benchmark

Call for Task Proposals

Proposal Deadline: 3 December 2016

MediaEval is a benchmarking initiative dedicated to developing and evaluating new algorithms and technologies for multimedia retrieval, access and exploration. It offers tasks to the research community that are related to human and social aspects of multimedia. MediaEval emphasizes the ‘multi’ in multimedia and seeks tasks involving multiple modalities, e.g., audio, visual, textual, and/or contextual.

MediaEval is now calling for proposals for tasks to run in the 2017 benchmarking season. The proposal consists of a description of the motivation for the task and challenges that task participants must address. It provides information on the data and evaluation methodology to be used. The proposal also includes a statement of how the task is related to MediaEval (i.e., its human or social component), and how it extends the state of the art in an area related to multimedia indexing, search or other technologies that support users in accessing multimedia collections.

For more detailed information about the content of the task proposal, please see:
http://www.multimediaeval.org/files/mediaeval2017_taskproposals.html

Task proposal deadline: 3 December 2016

Task proposals are chosen on the basis of their feasibility, their match with the topical focus of MediaEval, and also according to the outcome of a survey circulated to the wider multimedia research community.

The MediaEval 2017 Workshop will be held 13-15 September 2017 in Dublin, Ireland, co-located with CLEF 2017 (http://clef2017.clef-initiative.eu)

For more information about MediaEval see http://multimediaeval.org or contact Martha Larson at m.a.larson@tudelft.nl.

 

MPEG Column: 116th MPEG Meeting

MPEG Workshop on 5-Year Roadmap Successfully Held in Chengdu

Chengdu, China – The 116th MPEG meeting was held in Chengdu, China, from 17 – 21 October 2016

MPEG Workshop on 5-Year Roadmap Successfully Held in Chengdu

At its 116th meeting, MPEG successfully organised a workshop on its 5-year standardisation roadmap. Various industry representatives presented their views and reflected on the need for standards for new services and applications, specifically in the area of immersive media. The results of the workshop (roadmap, presentations) and the planned phases for the standardisation of “immersive media” are available at http://mpeg.chiariglione.org/. A follow-up workshop will be held on 18 January 2017 in Geneva, co-located with the 117th MPEG meeting. The workshop is open to all interested parties and free of charge. Details on the program and registration will be available at http://mpeg.chiariglione.org/.

Summary of the “Survey on Virtual Reality”

At its 115th meeting, MPEG established an ad-hoc group on virtual reality which conducted a survey on virtual reality with relevant stakeholders in this domain. The feedback from this survey has been provided as input for the 116th MPEG meeting where the results have been evaluated. Based on these results, MPEG aligned its standardisation timeline with the expected deployment timelines for 360-degree video and virtual reality services. An initial specification for 360-degree video and virtual reality services will be ready by the end of 2017 and is referred to as the Omnidirectional Media Application Format (OMAF; MPEG-A Part 20, ISO/IEC 23000-20). A standard addressing audio and video coding for 6 degrees of freedom where users can freely move around is on MPEG’s 5-year roadmap. The summary of the survey on virtual reality is available at http://mpeg.chiariglione.org/.

MPEG and ISO/TC 276/WG 5 have collected and evaluated the answers to the Genomic Information Compression and Storage joint Call for Proposals

At its 115th meeting, MPEG issued a Call for Proposals (CfP) for Genomic Information Compression and Storage in conjunction with the working group for standardisation of data processing and integration of the ISO Technical Committee for biotechnology standards (ISO/TC 276/WG 5). The call sought submissions of technologies that can provide efficient compression of genomic data and metadata for storage and processing applications. During the 116th MPEG meeting, the responses to this CfP, twelve distinct submitted technologies, were collected and evaluated by a joint ad-hoc group of both working groups. An initial assessment of the performance of the best eleven solutions reported compression factors ranging from 8 to 58 for the different classes of data.

The twelve submitted technologies show consistent improvements over the results assessed in response to the Call for Evidence in February 2016. Further improvements of the technologies under consideration are expected from the first phase of core experiments defined at the 116th MPEG meeting. The open core-experiment process planned over the next 12 months will comprise multiple, independent, directly comparable and rigorous experiments performed by independent entities to determine the specific merit of each technology and their mutual integration into a single solution for standardisation. The core experiment process will consider the submitted technologies as well as new solutions in the scope of each specific core experiment. The final inclusion of submitted technologies into the standard will be based on the experimental comparison of performance, as well as on the validation of requirements and the inclusion of essential metadata describing the context of the sequence data, and will be reached by consensus within and across both committees.

Call for Proposals: Internet of Media Things and Wearables (IoMT&W)

At its 116th meeting, MPEG issued a Call for Proposals (CfP) for Internet of Media Things and Wearables (see http://mpeg.chiariglione.org/), motivated by the understanding that more than half of major new business processes and systems will incorporate some element of the Internet of Things (IoT) by 2020. Therefore, the CfP seeks submissions of protocols and data representation enabling dynamic discovery of media things and media wearables. A standard in this space will facilitate the large-scale deployment of complex media systems that can exchange data in an interoperable way between media things and media wearables.

MPEG-DASH Amendment with Media Presentation Description Chaining and Pre-Selection of Adaptation Sets

At its 116th meeting, a new amendment for MPEG-DASH reached the final stage of Final Draft Amendment (ISO/IEC 23009-1:2014 FDAM 4). This amendment includes several technologies useful for industry practices of adaptive media presentation delivery. For example, the media presentation description (MPD) can be daisy-chained to simplify the implementation of pre-roll ads in cases of targeted dynamic advertising for live linear services. Additionally, the amendment enables pre-selection in order to signal suitable combinations of audio elements that are offered in different adaptation sets. As several amendments and corrigenda have been produced, this amendment will be published as part of the 3rd edition of ISO/IEC 23009-1 together with the amendments and corrigenda approved after the 2nd edition.

How to contact MPEG, learn more, and find other MPEG facts

To learn about MPEG basics, discover how to participate in the committee, or find out more about the array of technologies developed or currently under development by MPEG, visit MPEG’s home page at http://mpeg.chiariglione.org. There you will find information publicly available from MPEG experts past and present including tutorials, white papers, vision documents, and requirements under consideration for new standards efforts. You can also find useful information in many public documents by using the search window.

Examples of tutorials that can be found on the MPEG homepage include tutorials for: High Efficiency Video Coding, Advanced Audio Coding, Universal Speech and Audio Coding, and DASH to name a few. A rich repository of white papers can also be found and continues to grow. You can find these papers and tutorials for many of MPEG’s standards freely available. Press releases from previous MPEG meetings are also available. Journalists that wish to receive MPEG Press Releases by email should contact Dr. Christian Timmerer at christian.timmerer@itec.uni-klu.ac.at or christian.timmerer@bitmovin.com.

Further Information

Future MPEG meetings are planned as follows:
No. 117, Geneva, CH, 16 – 20 January, 2017
No. 118, Hobart, AU, 03 – 07 April, 2017
No. 119, Torino, IT, 17 – 21 July, 2017
No. 120, Macau, CN, 23 – 27 October 2017

For further information about MPEG, please contact:
Dr. Leonardo Chiariglione (Convenor of MPEG, Italy)
Via Borgionera, 103
10040 Villar Dora (TO), Italy
Tel: +39 011 935 04 61
leonardo@chiariglione.org

or

Priv.-Doz. Dr. Christian Timmerer
Alpen-Adria-Universität Klagenfurt | Bitmovin Inc.
9020 Klagenfurt am Wörthersee, Austria, Europe
Tel: +43 463 2700 3621
Email: christian.timmerer@itec.aau.at | christian.timmerer@bitmovin.com

ACM TVX — Call for Volunteer Associate Chairs

CALL FOR VOLUNTEER ASSOCIATE CHAIRS – Applications for Technical Program Committee

ACM TVX 2017 International Conference on Interactive Experiences for Television and Online Video June 14-16, 2017, Hilversum, The Netherlands www.tvx2017.com


We are welcoming applications to become part of the TVX 2017 Technical Program Committee (TPC), as Associate Chair (AC). This involves playing a key role in the submission and review process, including attendance at the TPC meeting (please note that this is not a call for reviewers, but a call for Associate Chairs). We are opening applications to all members of the community, from both industry and academia, who feel they can contribute to this team.

Following the success of previous years’ invitations for open applications to join our Technical Program Committee, we again invite applications for Associate Chairs. Successful applicants will be responsible for arranging and coordinating reviews for around 3 or 4 submissions in the main Full and Short Papers track of ACM TVX 2017, and for attending the Technical Program Committee meeting in Delft, The Netherlands, in mid-March 2017 (participation in person is strongly recommended). Our aim is to broaden participation, ensuring a diverse Technical Program Committee, and to help widen the ACM TVX community to include a full range of perspectives.

We welcome applications from academics, industrial practitioners and (where appropriate) senior PhD students, who have expertise in Human Computer Interaction or related fields, and who have an interest in topics related to interactive experiences for television or online video. We would expect all applicants to have ‘top-tier’ publications related to this area. Applicants should have an expertise or interest in at least one or more topics in our call for papers: https://tvx.acm.org/2017/participation/full-and-short-paper-submissions/

After the application deadline, the volunteers will be considered and selected as ACs, and the TPC Chairs will also be free to invite previous ACs or other researchers from the community to complete the team. The ultimate goal is to reach a balanced, diverse and inclusive TPC in terms of fields of expertise, experience and perspectives, from both academia and industry.

To submit, just fill in the application form above!

CONTACT INFORMATION

For up to date information and further details please visit: www.tvx2017.com or get in touch with the Inclusion Chairs:

Teresa Chambel, University of Lisbon, PT; Rob Koenen, TNO, NL
at: inclusion@tvx2017.com

In collaboration with the Program Chairs: Wendy van den Broeck, Vrije Universiteit Brussel, BE; Mike Darnell, Samsung, USA; Roger Zimmermann, NUS, Singapore

MPEG Column: 117th MPEG Meeting

The original blog post can be found at the Bitmovin Techblog and has been updated here to focus on and highlight research aspects.

The 117th MPEG meeting was held in Geneva, Switzerland and its press release highlights the following aspects:

In this article, I’d like to focus on the topics related to multimedia communication starting with OMAF.

Omnidirectional Media Application Format (OMAF)

Real-time entertainment services deployed over the open, unmanaged Internet – streaming audio and video – now account for more than 70% of the evening traffic in North American fixed access networks, and it is assumed that this figure will reach 80% by 2020. More and more such bandwidth-hungry applications and services are pushing onto the market, including immersive media services such as virtual reality and, specifically, 360-degree video. However, the lack of appropriate standards and, consequently, reduced interoperability is becoming an issue. Thus, MPEG has started a project referred to as the Omnidirectional Media Application Format (OMAF). The first milestone of this standard has been reached and the committee draft (CD) has been approved at the 117th MPEG meeting. Such application formats “are essentially superformats that combine selected technology components from MPEG (and other) standards to provide greater application interoperability, which helps satisfy users’ growing need for better-integrated multimedia solutions” [MPEG-A]. In the context of OMAF, the following aspects are defined:

OMAF is the first specification defined as part of a bigger project currently referred to as ISO/IEC 23090 — Immersive Media (Coded Representation of Immersive Media). It currently has the acronym MPEG-I; we previously used MPEG-VR, which is now replaced by MPEG-I (and that still might change in the future). It is expected that the standard will become Final Draft International Standard (FDIS) by Q4 of 2017. Interestingly, it does not include AVC and AAC, probably the most obvious candidates for video and audio codecs, which have been massively deployed in the last decade and will probably still be a major dominator (and also denominator) in upcoming years. On the other hand, the equirectangular projection format is currently the only one defined, as it is already broadly used in off-the-shelf hardware/software solutions for the creation of omnidirectional/360-degree videos. Finally, the metadata formats enabling the rendering of 360-degree monoscopic and stereoscopic video are highly appreciated. A solution for MPEG-DASH based on AVC/AAC utilizing the equirectangular projection format for both monoscopic and stereoscopic video is shown as part of Bitmovin’s solution for VR and 360-degree video.
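
As an aside for readers new to omnidirectional video: the equirectangular projection simply maps image columns to longitude and image rows to latitude. The following minimal Python sketch (illustrative only, not the normative OMAF equations; axis conventions vary between implementations) converts a pixel position of an equirectangular frame into a unit viewing direction on the sphere.

    import math

    def equirect_to_direction(u, v, width, height):
        # Pixel column -> longitude in [-pi, pi], pixel row -> latitude in [pi/2, -pi/2].
        lon = ((u + 0.5) / width) * 2.0 * math.pi - math.pi
        lat = math.pi / 2.0 - ((v + 0.5) / height) * math.pi
        # Unit vector on the sphere (y up, z towards the image centre by this convention).
        x = math.cos(lat) * math.sin(lon)
        y = math.sin(lat)
        z = math.cos(lat) * math.cos(lon)
        return (x, y, z)

    # Example: the centre pixel of a 3840x1920 frame looks (almost exactly) straight ahead along +z.
    print(equirect_to_direction(1920, 960, 3840, 1920))
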

Research aspects related to OMAF can be summarized as follows:

A second topic I’d like to highlight in this blog post is related to the preliminary call for evidence on video compression with capability beyond HEVC. 

Preliminary Call for Evidence on video compression with capability beyond HEVC

A call for evidence is issued to see whether sufficient technological potential exists to start a more formal phase of standardization. Currently, MPEG together with VCEG have developed a Joint Exploration Model (JEM) algorithm that is already known to provide bit rate reductions in the range of 20-30% for relevant test cases, as well as subjective quality benefits. The goal of this new standard — with a preliminary target date for completion around late 2020 — is to develop technology providing better compression capability than the existing standard, not only for conventional video material but also for other domains such as HDR/WCG or VR/360-degree video. An important aspect in this area is certainly over-the-top video delivery (as with MPEG-DASH), which includes features such as scalability and Quality of Experience (QoE). Scalable video coding has been part of video coding standards since MPEG-2 but never reached widespread adoption. That might change if it becomes a prime-time feature of a new video codec, as scalable video coding clearly shows benefits when doing dynamic adaptive streaming over HTTP. QoE has already found its way into video coding, at least when it comes to evaluating the results, where subjective tests are now an integral part of every new video codec developed by MPEG (in addition to the usual PSNR measurements). Therefore, the most interesting research topics from a multimedia communication point of view would be to optimize the DASH-like delivery of such new codecs with respect to scalability and QoE. Note that if you don’t like scalable video coding, feel free to propose something else as long as it reduces storage and networking costs significantly.
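
For readers who wonder how figures such as “bit rate reductions in the range of 20-30%” are typically derived, the sketch below implements the standard Bjøntegaard delta-rate (BD-rate) computation between two rate-distortion curves. It is a generic illustration, not the JEM evaluation scripts, and the sample numbers are made up.

    import numpy as np

    def bd_rate(rate_ref, psnr_ref, rate_test, psnr_test):
        # Fit log-rate as a cubic polynomial of PSNR for each rate-distortion curve,
        # integrate both fits over the overlapping quality range, and report the
        # average bitrate difference of the test codec relative to the reference (%).
        lr_ref, lr_test = np.log10(rate_ref), np.log10(rate_test)
        p_ref = np.polyfit(psnr_ref, lr_ref, 3)
        p_test = np.polyfit(psnr_test, lr_test, 3)
        lo, hi = max(min(psnr_ref), min(psnr_test)), min(max(psnr_ref), max(psnr_test))
        int_ref = np.polyval(np.polyint(p_ref), hi) - np.polyval(np.polyint(p_ref), lo)
        int_test = np.polyval(np.polyint(p_test), hi) - np.polyval(np.polyint(p_test), lo)
        avg_log_diff = (int_test - int_ref) / (hi - lo)
        return (10 ** avg_log_diff - 1) * 100

    # Made-up example: the test codec needs roughly 22% less rate at equal PSNR.
    print(bd_rate([1000, 2000, 4000, 8000], [32.0, 35.0, 38.0, 41.0],
                  [780, 1560, 3100, 6200], [32.0, 35.0, 38.0, 41.0]))
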

 

MPEG Workshop “Global Media Technology Standards for an Immersive Age”

On January 18, 2017, MPEG successfully held a public workshop on “Global Media Technology Standards for an Immersive Age”, hosting a series of keynotes from Bitmovin, DVB, Orange, Sky Italia, and Technicolor. Stefan Lederer, CEO of Bitmovin, discussed today’s and future challenges with new forms of content like 360°, AR and VR. All slides are available here, and MPEG took the feedback into consideration in an update of its 5-year standardization roadmap. David Wood (EBU) reported on the DVB VR study mission and Ralf Schaefer (Technicolor) presented a snapshot of VR services. Gilles Teniou (Orange) discussed video formats for VR, pointing out a new opportunity to increase the content value but also raising the question of what is missing today. Finally, Massimo Bertolotti (Sky Italia) introduced his view on the immersive media experience age.

Overall, the workshop was well attended and, as mentioned above, MPEG is currently working on a new standards project related to immersive media. Currently, this project comprises five parts. The first part is a technical report describing the scope (including the kind of system architecture), use cases, and applications. The second part is OMAF (see above), and the third and fourth parts are related to immersive video and audio, respectively. Part five is about point cloud compression.

For those interested, please check out the slides from industry representatives in this field and draw your own conclusions about what could be interesting for your own research. I’m happy to see any reactions, hints, etc. in the comments.

Finally, let’s have a look what happened related to MPEG-DASH, a topic with a long history on this blog.

MPEG-DASH and CMAF: Friend or Foe?

For MPEG-DASH and CMAF it was a meeting “in between” official standardization stages. MPEG-DASH experts are still working on the third edition, which will be a consolidated version of the 2nd edition and various amendments and corrigenda. In the meantime, MPEG issued a white paper on the new features of MPEG-DASH, which I would like to highlight here.

CMAF issued a study document which captures the current progress, and all national bodies are encouraged to take this into account when commenting on the Committee Draft (CD). To answer the question in the headline above, it looks more and more like DASH and CMAF will become friends — let’s hope that the friendship lasts for a long time.

What else happened at the MPEG meeting?

The next MPEG meeting will be held in Hobart, April 3-7, 2017. Feel free to contact us for any questions or comments.

Call for Grand Challenge Problem Proposals

Original page: http://www.acmmm.org/2017/contribute/call-for-multimedia-grand-challenge-proposals/

 

The Multimedia Grand Challenge was first presented as part of ACM Multimedia 2009 and has established itself as a prestigious competition in the multimedia community.  The purpose of the Multimedia Grand Challenge is to engage with the multimedia research community by establishing well-defined and objectively judged challenge problems intended to exercise state-of-the-art techniques and methods and inspire future research directions.

Industry leaders and academic institutions are invited to submit proposals for specific Multimedia Grand Challenges to be included in this year’s program.

A Grand Challenge proposal should include:

Grand Challenge proposals will be considered until March 1st and will be evaluated on an on-going basis as they are received. Grand Challenge proposals that are accepted to be part of the ACM Multimedia 2017 program will be posted on the conference website and included in subsequent calls for participation. All material, datasets, and procedures for a Grand Challenge problem should be ready for dissemination no later than March 14th.

While each Grand Challenge is allowed to define an independent timeline for solution evaluation and may allow iterative resubmission and possible feedback (e.g., a publicly posted leaderboard), challenge submissions must be complete and a paper describing the solution and results should be submitted to the conference program committee by July 14, 2017.

Grand Challenge proposals should be sent via email to the Grand Challenge chair, Ketan Mayer-Patel.

Those interested in submitting a Grand Challenge proposal are encouraged to review the problem descriptions from ACM Multimedia 2016 as examples. These are available here: http://www.acmmm.org/2016/?page_id=353

JPEG Column: 74th JPEG Meeting

The 74th JPEG meeting was held at ITU Headquarters in Geneva, Switzerland, from 15 to 20 January featuring the following highlights:

 

JPEG Pleno kicks off its activities towards standardization of light field coding

At the 74th JPEG meeting in Geneva, Switzerland, the final Call for Proposals (CfP) on JPEG Pleno was issued, focusing particularly on light field coding. The CfP is available here.

The call encompasses coding technologies for lenslet light field cameras, and content produced by high-density arrays of cameras. In addition, system-level solutions associated with light field coding and processing technologies that have a normative impact are called for. In a later stage, calls for other modalities such as point cloud, holographic and omnidirectional data will be issued, encompassing image representations and new and rich forms of visual data beyond the traditional planar image representations.

JPEG Pleno intends to provide a standard framework to facilitate capture, representation and exchange of these omnidirectional, depth-enhanced, point cloud, light field, and holographic imaging modalities. It aims to define new tools for improved compression while providing advanced functionalities at the system level. Moreover, it targets to support data and metadata manipulation, editing, random access and interaction, protection of privacy and ownership rights as well as other security mechanisms.

 

JPEG XS aims at the standardization of a visually lossless, low-latency, lightweight compression scheme that can be used for a wide range of applications, including mezzanine codecs for the broadcast industry and Pro-AV markets. Targeted use cases are professional video links, IP transport, Ethernet transport, real-time video storage, video memory buffers, and omnidirectional video capture and rendering. After a Call for Proposals issued on March 11th, 2016 and the assessment of the submitted technologies, a test model for the upcoming JPEG XS standard was created during the 73rd JPEG meeting in Chengdu, and the results of a first set of core experiments have been reviewed during the 74th JPEG meeting in Geneva. More core experiments are on their way before finalizing the standard: the JPEG committee therefore invites interested parties – in particular coding experts, codec providers, system integrators and potential users of the foreseen solutions – to contribute to the further specification process.

 

JPEG Privacy & Security aims at developing a standard for realizing secure image information sharing which is capable of ensuring privacy, maintaining data integrity, and protecting intellectual property rights (IPR). JPEG Privacy & Security will explore ways on how to design and implement the necessary features without significantly impacting coding performance while ensuring scalability, interoperability, and forward and backward compatibility with current JPEG standard frameworks.

A draft Call for Proposals for JPEG Privacy & Security has been issued and the JPEG committee invites interested parties to contribute to this standardisation activity in JPEG Systems. The draft of CfP is available here.

The call addresses protection mechanisms and technologies such as handling hierarchical levels of access and multiple protection levels for metadata and image protection, checking integrity of image data and embedded metadata, and supporting backward and forward compatibility with JPEG coding technologies. Interested parties are encouraged to subscribe to the JPEG Privacy & Security email reflector for collecting more information. A final version of the JPEG Privacy & Security Call for Proposals is expected at the 75th JPEG meeting located in Sydney, Australia.

 

JPEG AIC provides guidance and standard procedures for advanced image coding evaluation. At this meeting JPEG completed a technical report: TR 29170-1, Guidelines for image coding system evaluation. This report is a compendium of JPEG’s best practices in evaluation that draws on several different international standards and international recommendations. The report discusses the use of objective tools, subjective procedures and computational analysis techniques, and when to use the different tools. Some of the techniques are tried-and-true tools familiar to image compression experts and vision scientists. Several tools represent new fields where few tools have been available, such as the evaluation of coding systems for high dynamic range content.

 

High throughput JPEG 2000

The JPEG committee started a new activity on high-throughput JPEG 2000, and an AHG was created to investigate the evidence for such a standard. Experts are invited to participate in this expert group and to join the mailing list.

 

Final Quote

“JPEG continues to offer standards that redefine imaging products and services contributing to a better society without borders.” said Prof. Touradj Ebrahimi, the Convener of the JPEG committee.

 

About JPEG

The Joint Photographic Experts Group (JPEG) is a Working Group of ISO/IEC, the International Organisation for Standardization / International Electrotechnical Commission (ISO/IEC JTC 1/SC 29/WG 1), and of the International Telecommunication Union (ITU-T SG16), responsible for the popular JBIG, JPEG, JPEG 2000, JPEG XR and JPSearch standards and, more recently, the JPEG XT, JPEG XS, JPEG Systems and JPEG Pleno families of imaging standards.

More information about JPEG and its work is available at www.jpeg.org or by contacting Antonio Pinheiro and Tim Bruylants of the JPEG Communication Subgroup at pr@jpeg.org.

If you would like to stay posted on JPEG activities, please subscribe to the jpeg-news mailing list at https://listserv.uni-stuttgart.de/mailman/listinfo/jpeg-news. Moreover, you can follow the JPEG Twitter account at http://twitter.com/WG1JPEG.

 

Future JPEG meetings are planned as follows:

 

PhD Thesis Summaries

Amirhossein Habibian

Storytelling Machines for Video Search

Supervisor(s) and Committee member(s): Arnold W.M. Smeulders (promotor), Cees G.M. Snoek (co-promotor).

URL: http://dare.uva.nl/record/1/540787

ISBN: 978-94-6182-715-9

This thesis studies the fundamental question: what vocabulary of concepts is suited for machines to describe video content? The answer to this question involves two annotation steps: first, to specify a list of concepts by which videos are described; second, to label a set of videos per concept as its examples or counterexamples. Subsequently, the vocabulary is constructed as a set of video concept detectors learned from the provided annotations by supervised learning.

Starting from handcrafting the vocabulary by manual annotation, we gradually automate vocabulary construction by concept composition, and by learning from human stories. As a case study, we focus on vocabularies for describing events, such as marriage proposal, graduation ceremony, and changing a vehicle tire, in videos.

As the first step, we rely on an extensive pool of manually specified concepts to study the best practices for handcrafting the vocabulary. From our analysis, we conclude that the vocabulary should encompass thousands of concepts of various types, including object, action, scene, people, animal, and attribute. Moreover, the vocabulary should include detectors for both generic concepts and specific concepts, trained and normalized in an appropriate way.
We alleviate the manual labor of vocabulary construction by addressing the next research question: can a machine learn novel concepts by composition? We propose an algorithm which learns new concepts by composing the ground concepts with Boolean logic connectives, e.g. “ride-AND-bike”. We demonstrate that concept composition is an effective way to infer the annotations needed for training new concept detectors, without additional human annotation.
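
As a toy illustration of the composition idea (not the thesis’s actual algorithm), annotations for a composed concept such as “ride-AND-bike” can be inferred by intersecting the example sets of the ground concepts, so no additional manual labels are needed; names and data below are hypothetical.

    def compose_and(annotations, concept_a, concept_b):
        # annotations maps a concept name to the set of video ids labelled with it;
        # the conjunction is simply the intersection of the two example sets.
        return annotations[concept_a] & annotations[concept_b]

    # Hypothetical example data.
    annotations = {"ride": {1, 2, 5, 9}, "bike": {2, 3, 5, 7}}
    print(compose_and(annotations, "ride", "bike"))   # {2, 5}
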
As a further step towards reducing the manual labor of vocabulary construction, we investigate whether a machine can learn its vocabulary from human stories, i.e. video captions or subtitles. By analyzing the human stories using topic models, we effectively extract the concepts that humans use for describing videos. Moreover, we show that the occurrences of concepts in stories can be effectively used as weak supervision to train concept detectors.
Finally, we address the question of how to learn the vocabulary from human stories. We learn the vocabulary as an embedding from videos into their stories. We utilize the correlations between the terms to learn the embedding more effectively. More specifically, we learn similar embeddings for terms which highly co-occur in the stories, as these terms are usually synonyms. Furthermore, we extend our embedding to learn the vocabulary from various video modalities, including audio and motion. This enables us to generate more natural descriptions by incorporating concepts from various modalities, e.g. the laughing and singing concepts from audio, and the jumping and dancing concepts from motion.

Intelligent Sensory Information Systems group

URL: https://ivi.fnwi.uva.nl/isis/

The world is full of digital images and videos. In this deluge of visual information, the grand challenge is to unlock its content. This quest is the central research aim of the Intelligent Sensory Information Systems group. We address the complete knowledge chain of image and video retrieval by machine and human. Topics of study are semantic understanding, image and video mining, interactive picture analytics, and scalability. Our research strives for automation that matches human visual cognition, interaction surpassing man and machine intelligence, visualization blending it all in interfaces giving instant insight, and database architectures for extreme sized visual collections. Our research culminates in state-of-the-art image and video search engines which we evaluate in leading benchmarks, often as the best performer, in user studies, and in challenging applications.

Chien-nan Chen

Semantic-Aware Content Delivery Framework for 3D Tele-Immersion

Supervisor(s) and Committee member(s): Klara Nahrstedt (advisor), Roy Campbell (opponent), Indranil Gupta (opponent), Cheng-Hsin Hsu (opponent)

URL: http://cairo.cs.uiuc.edu/publications/papers/Shannon_Thesis.pdf

3D Tele-immersion (3DTI) technology allows full-body, multimodal interaction among geographically dispersed users, which opens a variety of possibilities in cyber collaborative applications such as art performance, exergaming, and physical rehabilitation. However, along with its great potential, the resource and quality demands of 3DTI rise inevitably, especially when advanced applications target resource-limited computing environments with stringent scalability demands. Under these circumstances, the tradeoffs between 1) resource requirements, 2) content complexity, and 3) user satisfaction in the delivery of 3DTI services are magnified.

In this dissertation, we argue that these tradeoffs of 3DTI systems are actually avoidable when the underlying delivery framework of 3DTI takes semantic information into consideration. We introduce the concept of semantic information into 3DTI, encompassing information about three factors: the environment, the activity, and the user role in 3DTI applications. With semantic information, 3DTI systems are able to 1) identify the characteristics of their computing environment to allocate computing power and bandwidth to the delivery of prioritized contents, 2) pinpoint and discard dispensable content in activity capturing according to the properties of the target application, and 3) differentiate contents by their contribution to fulfilling the objectives and expectations of the user’s role in the application, so that the adaptation module can allocate the resource budget accordingly. With these capabilities we can change the tradeoffs into synergy between resource requirements, content complexity, and user satisfaction.

We implement semantics-aware 3DTI systems to verify the performance gain in the three phases of the 3DTI delivery chain: the capturing phase, the dissemination phase, and the receiving phase. By introducing semantic information into distinct 3DTI systems, the efficiency improvements brought by our semantics-aware content delivery framework are validated under different application requirements, different scalability bottlenecks, and different user and application models.

To sum up, in this dissertation we aim to change the tradeoff between requirements, complexity, and satisfaction in 3DTI services by exploiting semantic information about the computing environment, the activity, and the user role in the underlying delivery systems of 3DTI. The devised mechanisms enhance the efficiency of 3DTI systems targeting different purposes and 3DTI applications with different computation and scalability requirements.

MONET

URL: http://cairo.cs.uiuc.edu/

The Multimedia Operating Systems and Networking (MONET) Research Group, led by Professor Klara Nahrstedt in the Department of Computer Science at the University of Illinois at Urbana-Champaign, is engaged in research in various areas of distributed multimedia systems.

Masoud Mazloom

In Search of Video Event Semantics

Supervisor(s) and Committee member(s): Arnold W.M. Smeulders (promotor), Cees G.M. Snoek (co-promotor).

URL: http://dare.uva.nl/record/1/430219

ISBN: 978-94-6182-717-3

In this thesis we aim to represent an event in a video using semantic features. We start from a bank of concept detectors for representing events in video.
First, we consider the relevance of concepts to the event inside the video representation. We address the problem of video event classification using a bank of concept detectors. Different from existing work, which simply relies on a bank containing all available detectors, we propose an algorithm that learns from examples which concepts in the bank are most informative per event.
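
A generic way to picture such a selection step (not necessarily the criterion used in the thesis) is to rank the detectors in the bank by how strongly their responses correlate with the event labels and keep only the top ones; the function and parameter names below are hypothetical.

    import numpy as np

    def informative_concepts(detector_scores, event_labels, k=50):
        # detector_scores: videos x concepts matrix of detector responses,
        # event_labels: 1 for videos of the event, 0 otherwise.
        X = np.asarray(detector_scores, dtype=np.float64)
        y = np.asarray(event_labels, dtype=np.float64)
        Xc, yc = X - X.mean(axis=0), y - y.mean()
        corr = np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc) + 1e-12)
        # Indices of the k concepts whose responses correlate most with the event.
        return np.argsort(corr)[::-1][:k]
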
Second, we concentrate on the accuracy of concept detectors. Different from existing work, which obtains a semantic representation by training concepts over entire video clips, we propose an algorithm that learns a set of relevant frames as the concept prototypes from web video examples, without the need for frame-level annotations, and uses them for representing an event video.
Third, we consider the problem of searching video events with concepts. We aim at querying web videos for events using only a handful of video query examples, where the standard approach learns a ranker from hundreds of examples. We consider a semantic representation, consisting of off-the-shelf concept detectors, to capture the variance in semantic appearance of events.
Finally, we consider the problem of video event search without semantic concepts. The prevailing solutions in the literature rely on a semantic video representation obtained from thousands of pre-trained concept detectors. Different from them, we propose a new semantic video representation that is based on freely available socially tagged videos only, without the need for training any intermediate concept detectors.

Intelligent Sensory Information Systems group

URL: https://ivi.fnwi.uva.nl/isis/

The world is full of digital images and videos. In this deluge of visual information, the grand challenge is to unlock its content. This quest is the central research aim of the Intelligent Sensory Information Systems group. We address the complete knowledge chain of image and video retrieval by machine and human. Topics of study are semantic understanding, image and video mining, interactive picture analytics, and scalability. Our research strives for automation that matches human visual cognition, interaction surpassing man and machine intelligence, visualization blending it all in interfaces giving instant insight, and database architectures for extreme sized visual collections. Our research culminates in state-of-the-art image and video search engines which we evaluate in leading benchmarks, often as the best performer, in user studies, and in challenging applications.

Rufael Mekuria

Network Streaming and Compression for Mixed Reality Tele-Immersion

Supervisor(s) and Committee member(s): prof. Dick Bulterman (promotor), Dr. Pablo Cesar (co-promotor, supervisor), prof. Klara Nahrstedt (Opponent), prof. Fernando M.B. Pereira (Opponent), prof. Maarten van Steen (Opponent), Dr. Thilo Kielmann (Opponent), prof. Rob van der Mei (Opponent)

URL: http://dare.ubvu.vu.nl/handle/1871/55024

ISBN: 978-90-9030147-1

The Internet is used for distributed shared experiences such as video conferencing, voice calls (possibly in a group), chatting, photo sharing, online gaming and virtual reality. These technologies are changing our daily lives and the way we interact with each other. The current rapid advances in 3D depth sensing and 3D cameras are enabling the acquisition of highly realistic reconstructed 3D representations. These natural scene representations are often based on 3D point clouds or 3D meshes. Integration of these data into distributed shared experiences can have a large impact on the way we work and interact online. Such shared experiences may enable 3D Tele-immersion and mixed reality that combine real and synthetic contents in a virtual world. However, this poses many challenges to the existing Internet infrastructure. A large part of the challenge is due to the sheer volume of reconstructed 3D data. End-to-end Internet connections are band-limited and currently cannot support real-time end-to-end transmission of uncompressed 3D point cloud or mesh scans with hundreds of thousands of points (over 15 Megabytes per frame) captured at a fast rate (over 10 frames per second), which already amounts to well over 1 Gbit/s of raw data. Therefore the volume of the 3D data requires the development of methods for efficient compression and transmission, possibly taking application- and user-specific requirements into account. In addition, sessions often need to be set up between different software and devices such as browsers, desktop applications, mobile applications or server-side applications. For this reason interoperability is required. This introduces the need for standardisation of data formats, compression techniques and signalling (session management). In the case of mixed reality in a social networking context, users may use different types of reconstructed and synthetic 3D content (from simple avatar commands to highly realistic 3D reconstructions based on 3D meshes or point clouds). Therefore such signalling should take into account that different types of user setups exist, from simple to very advanced, that can each join shared sessions and interact.

This thesis develops strategies for the compression and transmission of reconstructed 3D data in Internet infrastructures. It develops three different approaches for the compression of 3D meshes and a codec for time-varying 3D point clouds. Further, it develops an integrated 3D streaming framework that includes session management and signalling, media synchronization and a generic API for sending streams over UDP/TCP-based protocols. Experiments with these components in a realistic integrated mixed reality system with state-of-the-art rendering and 3D data capture investigate the specific system and user experience issues arising in the integration of these sub-components.

The first mesh codec takes blocks of the mesh geometry list, applies local per-block differential encoding, and codes the connectivity by exploiting a repetitive pattern resulting from the reconstruction system. The main advantages of this approach are its simplicity and parallelizability. The codec is integrated into an initial prototype for 3D immersive communication that includes a communication protocol based on rateless LT codes and a light 3D rendering engine with an implementation of global illumination.

The second mesh codec is a connectivity-driven approach. It codes the connectivity in a similar manner to the first codec, but with entropy coding added based on deflate/inflate (the popular zlib library). This addition makes the connectivity codec much more generically applicable. Subsequently, it traverses the connectivity to apply differential coding of the geometry. The differences between connected vertices are then quantized using a non-linear quantizer. We call this delayed quantization step late quantization (LQ). This approach resulted in reduced encoding complexity at only a modest degradation in R-D performance compared to the state of the art in standardized mesh compression in MPEG-4; in practice, the resulting codec encodes over 10 times faster than the latter. The codec is used to achieve real-time communication in a WAN/MAN scenario in a controlled IP network configuration. This includes real-time rendering and rateless packet coding of UDP packet data. The streaming pipeline has been optimized to run in real time at the varying frame rates that often occur in 3D Tele-immersion and mixed reality. It was also tested in different network conditions using a LIFO (Last In, First Out) approach that optimizes the pipeline, and has been integrated with highly realistic rendering and 3D capture.
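
The “late quantization” idea can be pictured with a small sketch: differences between successive vertices along a traversal are computed on the original floating-point coordinates and only then quantized (here with a uniform step for simplicity, where the thesis uses a non-linear quantizer); the encoder predicts from reconstructed vertices so the decoder stays in sync. This is an illustration of the general principle, not the thesis codec.

    import numpy as np

    def encode_lq(vertices, step):
        # Differential coding of vertex positions with the quantizer applied to the
        # residuals ("late quantization"); prediction uses the reconstructed vertex.
        v = np.asarray(vertices, dtype=np.float64)
        symbols = np.zeros(v.shape, dtype=np.int64)
        prev = np.zeros(v.shape[1])
        for i, vert in enumerate(v):
            symbols[i] = np.round((vert - prev) / step)
            prev = prev + symbols[i] * step          # reconstructed vertex
        return symbols

    def decode_lq(symbols, step):
        # Accumulating the de-quantized residuals reproduces the encoder's reconstruction.
        return np.cumsum(np.asarray(symbols, dtype=np.float64) * step, axis=0)
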

The third codec is based on a geometry-driven approach. In this codec the geometry is coded first in an octree fashion, and the connectivity representation is then converted to a representation that indexes voxels in the octree grid. This representation introduces correlation between the indices that is exploited using a vector quantization scheme. The codec enables real-time coding at different levels of detail (LoD) and highly adaptive bit-rates. It is useful when the 3D immersive virtual room is deployed over the open Internet, where bandwidth may fluctuate heavily and is more restricted compared to the controlled WAN/MAN scenario. In addition, it is suitable for 3D representations that can be rendered at a lower level of detail, such as participants/objects rendered at a distance in the 3D room. Next, the focus shifts towards 3D point clouds instead of 3D meshes. 3D point clouds are a simpler representation of the 3D reconstructions. The thesis develops a codec for time-varying point clouds. It introduces a hybrid architecture that combines an octree-based intra codec with lossy inter-prediction and lossy attribute coding based on mapping attributes to a JPEG image grid. It also introduces temporal inter-prediction. Predictive frames are up to 30% smaller, and the colours up to 90% smaller, compared to the current state of the art in real-time point cloud compression. Subjective experiments in a realistic mixed reality virtual world framework developed in the Reverie project showed no significant degradation in the resulting perceptual quality.
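
To make the octree idea concrete, the sketch below serializes a point cloud into per-level occupancy bytes: each occupied node emits one byte whose bits mark which of its eight children contain points, and stopping at a shallower depth yields a coarser level of detail. This is a minimal illustration of octree intra coding, not the thesis codec or the MPEG reference software.

    import numpy as np

    def octree_occupancy_bytes(points, depth):
        pts = np.asarray(points, dtype=np.float64)
        lo = pts.min(axis=0)
        size = (pts.max(axis=0) - lo).max() or 1.0
        grid = 1 << depth
        # Quantize points to integer voxel coordinates in a grid of 2^depth per axis.
        vox = np.unique(np.clip(((pts - lo) / size * grid).astype(np.int64), 0, grid - 1), axis=0)
        stream = []
        nodes = [((0, 0, 0), vox)]                      # (node origin, voxels inside it)
        for level in range(depth):
            half = 1 << (depth - level - 1)             # child edge length at this level
            next_nodes = []
            for (ox, oy, oz), v in nodes:
                byte = 0
                for i in range(8):                      # visit the eight children
                    cx = ox + (i & 1) * half
                    cy = oy + ((i >> 1) & 1) * half
                    cz = oz + ((i >> 2) & 1) * half
                    m = ((v[:, 0] >= cx) & (v[:, 0] < cx + half) &
                         (v[:, 1] >= cy) & (v[:, 1] < cy + half) &
                         (v[:, 2] >= cz) & (v[:, 2] < cz + half))
                    if m.any():
                        byte |= 1 << i                  # mark child i as occupied
                        next_nodes.append(((cx, cy, cz), v[m]))
                stream.append(byte)
            nodes = next_nodes
        return bytes(stream)
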

In the last phase of this thesis, the complete 3D tele-immersive streaming platform is further developed. Additions include signalling support for 3D streaming (session management) that supports terminal scalability from light clients (render only) up to very heavy clients. This is done by signalling the local modular configuration in advance via the XMPP protocol. Further, the streaming platform architecture presents an API where different stream types suited to different 3D capture/reconstruction platforms (e.g. 3D audio, 3D visual, 3D animation) can be created. As the platform includes a distributed virtual clock, mechanisms to perform inter-stream and inter-sender media synchronization can be deployed at the application layer. Synchronization of compressed 3D audio streams in an audio playout buffer was therefore implemented in a 3D audio rendering module. We also implemented a mesh and point cloud playout buffer in the module for synchronized rendering. This mesh playout buffer enables inter-sender synchronization between different incoming visual streams. In addition, the system includes a simple publish/subscribe transmission protocol for small messages based on WebSocket (through a real-time cloud broker); publish/subscribe based on the XMPP and UDP protocols was also implemented. These publish/subscribe messages are particularly suitable for 3D animation commands and AI data exchange. All components developed throughout this thesis have been integrated with 3D capture/rendering modules and in a social networking context in the larger Reverie 3D Tele-immersive framework. Field trials of this system in different scenarios have shown the benefits of highly realistic, live-captured 3D data representations, which further highlights the importance of this work. The components developed in this thesis and their integration outline many of the significant challenges encountered in the next generation of 3D tele-presence and mixed reality systems. These insights have contributed to the development of requirements for new international standards in MPEG (Moving Picture Experts Group) and JPEG (Joint Photographic Experts Group). In addition, the developed codec and quality metrics for point cloud compression have been accepted as the base reference software model for a novel standard on point cloud compression in MPEG, and are available in the MPEG code repository and online on GitHub.
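
As a toy sketch of the synchronization mechanism (not the Reverie implementation), a playout buffer can hold frames from several senders ordered by their capture timestamps on the shared virtual clock and release them only once a fixed playout delay has elapsed, so streams from different senders are rendered in step; all names below are hypothetical.

    import heapq
    import itertools

    class PlayoutBuffer:
        def __init__(self, playout_delay):
            self.playout_delay = playout_delay
            self._counter = itertools.count()      # tie-breaker so frames are never compared
            self.queue = []                        # (capture_timestamp, seq, sender_id, frame)

        def push(self, timestamp, sender_id, frame):
            heapq.heappush(self.queue, (timestamp, next(self._counter), sender_id, frame))

        def pop_ready(self, clock_now):
            # Release, in timestamp order, every frame whose playout time has passed.
            ready = []
            while self.queue and self.queue[0][0] + self.playout_delay <= clock_now:
                ts, _, sender, frame = heapq.heappop(self.queue)
                ready.append((ts, sender, frame))
            return ready
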

DIS CWI

URL: http://www.dis.cwi.nl/

Centrum Wiskunde & Informatica (CWI) in the Netherlands focuses on applied and fundamental problems in mathematics and computer science. The Distributed and Interactive Systems group (DIS) focuses on modeling and controlling complex collections of media objects (including real-time media and sensor data) that are interactive and distributed in time and space. The group’s fundamental interest is in understanding how the various notions of ‘time’ influence the creation, distribution and delivery of complex content in a customizable manner. The group is led by Dr. Pablo Cesar.

Svetlana Kordumova

Learning to Search for Images without Annotations

Supervisor(s) and Committee member(s): Arnold W.M. Smeulders (promotor), Cees G.M. Snoek (co-promotor).

URL: http://dare.uva.nl/record/1/540788

ISBN: 978-608-4784-15-9

This thesis contributes to teaching machines what is in an image while avoiding direct manual annotation as training data. We either rely on tagged data from social media platforms to recognize concepts, or on object semantics and layout to recognize scenes. We focus our effort on image search.
We first demonstrate that concept detectors can be learned using tagged examples from social media platforms. We show that using tagged images and videos directly as ground truth for learning can be problematic because of the noisy nature of tags. To this end, through extensive experimental analysis, we recommend calculating the relevance of tags and selecting only relevant positive and relevant negative examples for learning. In addition, we present four best practices which led to a winning entry in the TRECVID 2013 benchmark for the semantic indexing with no annotations task. Following the finding that important concepts appear rarely as tags on social media platforms, we propose to use semantic knowledge from an ontology to improve the calculation of tag relevance and to enrich training data for learning concept detectors of rare tags.
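
One widely used estimator of tag relevance, shown below as an illustration rather than the thesis’s exact method, is neighbour voting: a tag counts as relevant for an image when it also appears on many of the image’s visual nearest neighbours, beyond what its overall frequency in the collection would predict. The function and parameter names are hypothetical.

    import numpy as np

    def tag_relevance(tag, features, tag_sets, k=50):
        # features: images x dims visual descriptors; tag_sets: one set of tags per image.
        X = np.asarray(features, dtype=np.float64)
        prior = np.mean([tag in t for t in tag_sets])          # collection-wide tag frequency
        relevance = []
        for i in range(len(X)):
            dist = np.linalg.norm(X - X[i], axis=1)
            neighbours = np.argsort(dist)[1:k + 1]             # skip the image itself
            votes = np.mean([tag in tag_sets[j] for j in neighbours])
            relevance.append(votes - prior)                    # above-chance neighbour votes
        return np.array(relevance)
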
When searching for images of a particular scene, instead of using annotated scene images, we show that object classifiers can recognize scenes reasonably well. We exploit 15,000 object classifiers trained with a convolutional neural network. Since not all objects contribute equally to describing a scene, we show that pooling only the 100 most prominent object classifiers per image is good enough to recognize its scene. Furthermore, we go to the extreme of recognizing scenes by removing all object identities. We refer to the image positions most likely to contain objects as things. We show that the ensemble of thing properties (size, position, aspect ratio and prominent color), and those only, can discriminate scenes. The benefit of removing all object identities is that we also eliminate the learning of object classifiers in the process, and thus demonstrate that scenes can be recognized with no learning at all.
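
The pooling step can be sketched in a few lines (illustrative only; function and parameter names are hypothetical): keep the strongest object-classifier responses per image, zero out the rest, and use the resulting sparse vector as the scene descriptor, for example as input to a linear scene classifier.

    import numpy as np

    def top_k_object_descriptor(object_scores, k=100):
        # Keep only the k most prominent object responses; zero the remaining ones.
        scores = np.asarray(object_scores, dtype=np.float64)
        descriptor = np.zeros_like(scores)
        top = np.argsort(scores)[-k:]
        descriptor[top] = scores[top]
        return descriptor
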
Overall, this thesis presents alternative ways to learn what concept is in an image or what scene it belongs to, without using manually annotated data, for the goal of image search. It investigates new approaches for teaching machines to recognize the visually depicted environment captured in images, all while dispensing with the annotation process.

Intelligent Sensory Information Systems group

URL: https://ivi.fnwi.uva.nl/isis/

The world is full of digital images and videos. In this deluge of visual information, the grand challenge is to unlock its content. This quest is the central research aim of the Intelligent Sensory Information Systems group. We address the complete knowledge chain of image and video retrieval by machine and human. Topics of study are semantic understanding, image and video mining, interactive picture analytics, and scalability. Our research strives for automation that matches human visual cognition, interaction surpassing man and machine intelligence, visualization blending it all in interfaces giving instant insight, and database architectures for extreme sized visual collections. Our research culminates in state-of-the-art image and video search engines which we evaluate in leading benchmarks, often as the best performer, in user studies, and in challenging applications.

Recently published

TOMM Volume 12, Issue 4s

Editor-in-Chief: Alberto Del Bimbo

URL: http://dl.acm.org/citation.cfm?id=2997658&picked=prox&CFID=919371797&CFTOKEN=25100630

Published: November 2016

TOMM Volume 12, Issue 5s

Editor-in-Chief: Alberto Del Bimbo

URL: http://dl.acm.org/citation.cfm?id=3001754&picked=prox&CFID=919371797&CFTOKEN=25100630

Published: December 2016

TOMM Volume 13, Issue 1

Editor-in-Chief: Alberto Del Bimbo

URL: http://dl.acm.org/citation.cfm?id=3012406&picked=prox&CFID=919371797&CFTOKEN=25100630

Published: January 2017

MMSJ Volume 23, Issue 1

Editor-in-Chief: Thomas Plagemann

URL: http://link.springer.com/journal/530/23/1/page/1

Published: February 2017

MMSJ Volume 23, Issue 2

Editor-in-Chief: Thomas Plagemann

URL: http://link.springer.com/journal/530/23/2/page/1

Published: March 2017

MMSJ Volume 22, Issue 6

hack begin box

Editor-in-Chief: Thomas Plagemann

URL: http://link.springer.com/journal/530/22/6/page/1

Published: November 2016

hack end box

MTAP Volume 76, Issue 6

hack begin box

Editor-in-Chief: Borko Furht

URL: http://link.springer.com/journal/11042/76/6/page/1

Published: March 2017

hack end box

MTAP Volume 76, Issue 5

hack begin box

Editor-in-Chief: Borko Furht

URL: http://link.springer.com/journal/11042/76/5/page/1

Published: March 2017

hack end box

MTAP Volume 76, Issue 4

hack begin box

Editor-in-Chief: Borko Furht

URL: http://link.springer.com/journal/11042/76/4/page/1

Published: February 2017

hack end box

MTAP Volume 76, Issue 3

hack begin box

Editor-in-Chief: Borko Furht

URL: http://link.springer.com/journal/11042/76/3/page/1

Published: February 2017

hack end box

MTAP Volume 76, Issue 2

hack begin box

Editor-in-Chief: Borko Furht

URL: http://link.springer.com/journal/11042/76/2/page/1

Published: January 2017

hack end box

MTAP Volume 76, Issue 1

hack begin box

Editor-in-Chief: Borko Furht

URL: http://link.springer.com/journal/11042/76/1/page/1

Published: January 2017

hack end box

MTAP Volume 75, Issue 24

hack begin box

Editor-in-Chief: Borko Furht

URL: http://link.springer.com/journal/11042/75/24/page/1

Published: December 2016

hack end box

MTAP Volume 75, Issue 23

hack begin box

Editor-in-Chief: Borko Furht

URL: http://link.springer.com/journal/11042/75/23/page/1

Published: December 2016

hack end box

MTAP Volume 75, Issue 22

hack begin box

Editor-in-Chief: Borko Furht

URL: http://link.springer.com/journal/11042/75/22/page/1

Published: November 2016

hack end box

MTAP Volume 75, Issue 21

hack begin box

Editor-in-Chief: Borko Furht

URL: http://link.springer.com/journal/11042/75/21/page/1

Published: November 2016

hack end box

MTAP Volume 75, Issue 20

hack begin box

Editor-in-Chief: Borko Furht

URL: http://link.springer.com/journal/11042/75/20/page/1

Published: October 2016

hack end box

IJMIR Volume 6, Issue 1

hack begin box

URL: http://link.springer.com/journal/13735/6/1/page/1

Published: March 2017

hack end box

IJMIR Volume 5, Issue 4

hack begin box

URL: http://link.springer.com/journal/13735/5/4/page/1

Published: November 2016

hack end box

Job Opportunities

PhD Openings in ECE

PhD candidates are solicited in the Department of Electrical and Computer Engineering (ECE) at the University of Alabama (UA). The candidates will join the Laboratory for Immersive Communication [lion.ua.edu], headed by Prof. Jacob Chakareski. The lab features state-of-the-art equipment: head/wall-mounted immersion displays, high-definition visual/range sensors, augmented/virtual reality (AR/VR) goggles, UAVs (drones), and 5G/LTE-A MIMO SDR boards. The positions are fully funded (stipend and tuition for up to four years) and are available immediately, once suitable candidates have satisfied the application requirements.

Students at the B.S. or M.Sc. level with background in Electrical and Computer Engineering, Computer Science, or Applied Mathematics are encouraged to apply. The accepted candidates will work on exciting research problems at the intersection of immersive communication, future Internet architectures, and virtual/augmented reality. We also investigate airborne networks and UAV swarms for remote sensing applications, and the integration of multi-view imaging into cyber-physical health care devices.

A solid mathematical background and knowledge of programming languages and software tools (e.g., Matlab, NS-2/3) are required. Above all, applicants must be self-motivated to learn quickly and to work effectively on challenging research problems. For a description of recent research activities carried out by Prof. Chakareski, please visit www.jakov.org.

Application process: Please send your CV as an attachment to jacob@ua.edu and specify in the subject line “Application for ECE PhD positions at UA”. Please briefly describe your background and research interests in the e-mail. Include a one-page research statement describing your qualifications and how you can contribute to our studies (summarized on the two web sites referenced earlier).

GRE and IELTS/TOEFL (for international applicants) exam scores are required for the university application. Applicants are encouraged to include coursework transcripts in the application and to arrange for three reference letters to be sent separately to Prof. Chakareski. PDF copies of research articles authored by the applicant may also be included.

About: The University of Alabama (www.ua.edu) is a major, comprehensive, student-centered research university founded in 1831 as the first public college in Alabama. The University of Alabama consistently ranks among the top 50 public national universities according to the U.S. News and World Report annual rankings.

hack begin box

Employer: Electrical and Computer Engineering (ECE), the University of Alabama (UA).

Expiration date: Friday, June 30, 2017

More information: http://www.jakov.org

hack end box

Internship positions at Nokia Bell Labs Cambridge

The Social Dynamics team at Cambridge (UK) is looking for summer interns to work on the following internship projects (also available on http://researchswinger.org/hiring.html):

1. Urban Emotions and Touchy Maps – Enrolled in a PhD in Computer Vision/Multimedia; experience with deep learning for image analysis and with large-scale multimedia retrieval and processing

2. Visual Psychology – BSc or above in Computer Science, Interaction Design, or related fields; experience in designing Facebook apps and in client-side (HTML5, Javascript) and server-side development

3. Familiar Strangers – Enrolled in an MSc/PhD in computer vision; experience with face recognition and deep learning

4. Social Vitality of Cities – BSc or above in Computer Science or related fields; experience with web data collection from APIs and scraping, GIS tools, and Python development; basic network analysis knowledge

5. The Nature of Social Ties – Enrolled in an MSc/PhD in Computational Social Science or Social Psychology; experience in designing Facebook apps and in running crowdsourcing experiments

To learn more about our research, please check our sites – researchswinger.org, lajello.com, visionresearchwitch.com, and goodcitylife.org

Please send your CV and application letter with the subject line “Internship 2017” to daniele.quercia@nokia-bell-labs.com by the 6th of March 2017. For informal inquiries, please contact daniele.quercia@nokia-bell-labs.com, luca.aiello@nokia-bell-labs.com or miriam.redi@nokia.com.

Looking forward to receiving your application!

hack begin box

Employer: Bell Labs Cambridge, UK

Expiration date: Monday, March 6, 2017

More information: http://researchswinger.org/internship.pdf

hack end box

Associate Professor position in Network Science (incl. Content Delivery)

*******************************************************************************

Institution: IMT Atlantique
Location: Rennes, Brittany, France
Team: IRISA ADOPNET team
Department: Network Systems, Cybersecurity and Digital Law (SRCD)
Type: Full Time Permanent Position (Associate-Professor)

*******************************************************************************

IMT Atlantique is part of the Institut Mines-Telecom (IMT), a group of
Grandes Ecoles (selective French higher education institutions) in the
field of engineering and digital technologies, under the aegis of the
French Ministry of Industry and Electronic Communications. IMT
Atlantique has three campuses in North-Western France (Brest, Nantes,
Rennes), with a total of 2300 students (including 300 Ph.D. students),
290 professors and researchers, publishing 1000 papers each year.

With 23 full-time professors and 39 Ph.D. students, the Network Systems,
Cybersecurity and Digital Law (SRCD) department, located in Rennes, has
research and teaching activities in the fields of computer networks
(Cellular Networks, Internet of Things, content delivery), cybersecurity
as well as legal (including regulatory) and economic aspects of
communication networks.

The successful candidate will join the ADOPNET team
(http://www-adopnet.irisa.fr/) and will be expected to lead research
activities, participating in collaborative projects with industrial and
academic partners. The goal of the ADOPNET team is to build networks
that are flexible, adaptive, energy-efficient, secure, and able to
deliver content at a large scale to various types of terminals. ADOPNET,
in particular, addresses the convergence of access networks, the
combination of radio and optical technologies, and adaptive
software-based content delivery networks. The successful candidate is
expected to strengthen the team with a strong expertise of theoretical
tools such as prediction algorithms, operations research, and data
analysis.

The successful candidate is expected to teach graduate level classes on
computer networks and computer science, including but not limited to:
cloud computing, virtualization technologies (Hypervisors, Software
Defined Networks, Network Function Virtualization), mobile networks, and
performance evaluation. The successful candidate is expected to
supervise Ph.D. and master students as well as employing and developing
new pedagogical tools and techniques (e.g., MOOCs, inverted classroom).

Qualifications:
• PhD in computer science, electrical engineering or applied
mathematics in the area of networking
• Proven track record of applying theoretical tools to computer networks
• Strong publication record in selective conferences and journals
• Teaching experience
• English (fluent), French (basic)

IMT Atlantique offers an attractive salary and a pleasant environment in
which to develop research activities. Rennes is the tenth-largest city in
France, with a metropolitan area of 400,000 inhabitants. With more than
63,000 students, it is also the eighth-largest university center in France.
In 2017, the Express magazine selected Rennes as one of the three most
pleasant cities in France to work and live in.

*******************************************************************************

Application Information
Please submit a single pdf file containing a cover letter, curriculum
vitae (including the list of publications), two reference letters,
research and teaching projects before 15 March 2017 to:
recrut17-mc-spereseaux-srcd@imt-atlantique.fr

Additional information about IMT Atlantique can be found on our website:
http://www.imt-atlantique.fr

For more information about the position, please contact the head of the
ADOPNET team, Pr. Xavier Lagrange at:
xavier[dot]lagrange[at]imt-atlantique[dot]fr

hack begin box

Employer: IMT-Atlantique (Formerly Telecom Bretagne)

Expiration date: Friday, March 17, 2017

More information: http://www.imt-atlantique.fr/sites/default/files/document/ecole/recrutement/FDP_MC_specialite_reseaux_SRCD_EN_v2.pdf

hack end box

Post-doc opportunity @ Irisa/Inria Rennes, France – Multimodal Audiovisual Content Analysis

LINKMEDIA is a research team of IRISA and Inria Rennes, France, working on the development of future technology enabling content-based description of and access to multimedia content, combining computer vision and image processing, speech and audio processing, natural language processing, information retrieval and media mining. LINKMEDIA participates in the NexGenTV project, an industry-academia joint venture on the analysis and enrichment of TV content. Television is undergoing a revolution, moving from the TV screen to multiple screens. Today’s user watches TV while exploring the web, searching for complementary information and commenting on social networks. Facing this situation, NexGenTV was conceived to offer new solutions for the creation of rich multiscreen content and applications.

In this context, we are recruiting a post-doctoral researcher specializing in audiovisual content analysis to develop, study and evaluate novel approaches for multimodal person recognition, clustering and linking in TV content. Research activities will take place at IRISA/Inria Rennes, France, within the LINKMEDIA team, in close collaboration with the partners of NexGenTV. Particular interaction with EURECOM is foreseen.

Prospective candidates should hold a PhD degree in a domain close to the research topic, preferably in one of the following areas: multimodal modeling, speech and audio processing, speaker recognition, computer vision.

hack begin box

Employer: CNRS, Irisa, Rennes, France

Expiration date: Saturday, April 1, 2017

More information: https://www-linkmedia.irisa.fr/files/2011/07/Offre-Emploi-Linkmedia-NexGenTV-En.pdf

hack end box

PostDoc (AreaHead) Position in Ubiquitous Computing

The Telecooperation Lab at TU Darmstadt is looking for a new postdoctoral researcher (area head)
for the Smart Proactive Assistance area. The area covers several research fields, ranging from
mobile sensing via machine learning to human-computer interaction and persuasive computing.

To apply you must hold a PhD (or be close to its completion) in the areas of Computer Science,
Data Science, or related disciplines. Also, you should have demonstrated your research competence
through high-quality and high-impact publications in top conferences or journals in one (or more)
of the following areas: ubiquitous computing, data science, and/or machine learning.

More information about the post can be found at: https://www.tk.informatik.tu-darmstadt.de/index.php?id=3046

Contact for clarification and informal inquiries:
Christian Meurisch (meurisch@tk.tu-darmstadt.de)

hack begin box

Employer: TU Darmstadt (Telecooperation Lab), Germany

Expiration date: Wednesday, March 1, 2017

More information: https://www.tk.informatik.tu-darmstadt.de/index.php?id=3046

hack end box

PhD Position in 3D Computer Vision at University of Amsterdam

The Informatics Institute at the University of Amsterdam invites applications for a PhD position for four years, on the topic of 3D Computer Vision. The candidate will be supervised by Thomas Mensink and Arnold Smeulders.

The ultimate goal of this position is to enable 3D reasoning based on a single 2D photo. We aim to estimate the rough 3D geometry by separating the layout of objects in the scene from the global scene layout. While objects have an almost infinite number of possible configurations, the global scene layout is relatively more stable and can be cast in about 20 scene geometry types. The first research question is to define these different types and infer them from a single image alone using deep learning. Next, we focus on the local ordering of objects, to infer out-of-context objects and to describe an image based on this 3D ordering.

Context

The research position is part of a collaboration between SRI Stanford (USA), IDIAP (Martigny, Switzerland) and the University of Amsterdam to automatically infer inconsistencies among the different modalities of a video. To this end the 3D geometry delivers an important cue to match the visual and audio channels. Within the collaboration, the University of Amsterdam focuses on the visual scene analysis.

Informal inquiries may be directed to: Thomas Mensink (thomas-dot-mensink-at-uva.nl)
Additional information and application procedure:
http://www.uva.nl/en/about-the-uva/working-at-the-uva/vacancies/item/17-046-phd-candidate-in-3d-computer-vision.html?n

Kind regards,
Thomas Mensink

hack begin box

Employer: University of Amsterdam

Expiration date: Wednesday, March 15, 2017

More information: http://www.uva.nl/en/about-the-uva/working-at-the-uva/vacancies/item/17-046-phd-candidate-in-3d-computer-vision.html

hack end box

Research Fellows (Postdocs) in Creative Technologies /Visual Computing

Join our new team of 20+ researchers (half postdocs half PhDs) in Visual Computing at the intersection of Computer Vision, Computer Graphics and Media Signal Processing. We are building a dynamic environment where enthusiastic young scientists with different backgrounds get together to shape the future in fundamental as well as applied research projects. Possible directions include but are not limited to:
• augmented reality (AR),
• virtual reality (VR),
• free viewpoint video (FVV),
• 3D video,
• 360/omni-directional video,
• high dynamic range (HDR),
• wide colour gamut (WCG),
• light-field technologies,
• segmentation/matting,
• 3D reconstruction, etc.

Individual research plans will be designed between PI, successful candidates and team, considering individual background, expertise, skills and interests, matching the overall strategy, and exploiting opportunities and inspirations.

The research project “V-Sense – Extending Visual Sensation through Image-Based Visual Computing” is funded by SFI over five years with a substantial budget to cover over 20 researchers. This is part of a strategic investment in Creative Technologies by SFI and Trinity College, which is defined as one of the strategic research themes of the College. V-Sense intends to become an incubator in this context, to stimulate further integration and growth and to impact Creative Industries in Ireland as a whole.

Standard duties and Responsibilities of the Post

• Fundamental and/or applied research in Visual Computing at the intersection of Computer Vision, Computer Graphics and Media Signal Processing
• Scientific publications
• Contribution to prototype and demonstrator development
• Overall contribution to V-SENSE and teamwork
• Supervision of PhD and other students
• Outreach & dissemination

Funding Information

The position is funded through the Science Foundation Ireland V-SENSE project.

Salary
Appointment will be made on the SFI Team member Budget Postdoctorate Research Fellow Level 2A Salary scale at a point in line with Government Pay Policy

Post Status
Specific Purpose contract approximately 4 years – Full-time
The successful candidate will be expected to take up the post as soon as possible, preferably in March/April 2017.

Person Specification

Qualifications
• A Ph.D. in Computer Science, Engineering, or a related field in the area of ICT.

Knowledge & Experience
• An established track record of publication in leading journals and/or conferences, in one or more sub-areas of Visual Computing.
• Excellent knowledge of and integration in the related scientific communities.
• The ability to work well in a group, and the ability to mentor junior researchers, such as Ph.D. students.
• Affinity for creative dimensions of visual computing

Skills & Competencies
• Good written and oral proficiency in English (essential).
• Good communication and interpersonal skills both written and verbal.
• Proven aptitude for Programming, System Analysis and Design.
• Proven ability to prioritise workload and work to exacting deadlines.
• Proven track record of publication in high-quality venues.
• Flexible and adaptable in responding to stakeholder needs.
• Strong team player who is able to take responsibility to contribute to the overall success of the team.
• Enthusiastic and structured approach to research and development.
• Excellent problem-solving abilities.
• Desire to learn about new products and technologies and to keep abreast of new product, technical and research developments.

Contacts and application
Candidates should submit a cover letter together with a full curriculum vitae to include the names and contact details of 2 referees (email addresses if possible) to:
Name: Orla Fox
Title: Research Project Administrator
Email Address: Orla.Fox@SCSS.TCD.ie
Contact Telephone Number: 018968176
Please include the reference code: VS-RF on all correspondence.

hack begin box

Employer: V-SENSE Project, School of Computer Science and Statistics, Trinity College Dublin, the University of Dublin

Expiration date: Tuesday, January 31, 2017

More information: https://www.scss.tcd.ie/vacancies/index.php?id=189

hack end box

PhD Studentship (4 positions) in Creative Technologies /Visual Computing

Join our new team of 20+ researchers (half postdocs half PhDs) in Visual Computing at the intersection of Computer Vision, Computer Graphics and Media Signal Processing. We are building a dynamic environment where enthusiastic young scientists with different backgrounds get together to shape the future in fundamental as well as applied research projects. Possible directions include but are not limited to:
• augmented reality (AR),
• virtual reality (VR),
• free viewpoint video (FVV),
• 3D video,
• 360/omni-directional video,
• high dynamic range (HDR),
• wide colour gamut (WCG),
• light-field technologies,
• segmentation/matting,
• 3D reconstruction, etc.

Individual research plans will be designed between PI, successful candidates and team, considering individual background, expertise, skills and interests, matching the overall strategy, and exploiting opportunities and inspirations.

The research project “V-Sense – Extending Visual Sensation through Image-Based Visual Computing” is funded by SFI over five years with a substantial budget to cover over 20 researchers. This is part of a strategic investment in Creative Technologies by SFI and Trinity College, which is defined as one of the strategic research themes of the College. V-Sense intends to become an incubator in this context, to stimulate further integration and growth and to impact Creative Industries in Ireland as a whole.
The successful candidate will be expected to take up the post as soon as possible, preferably in March 2017.

Standard duties and Responsibilities of the Post

• Fundamental and/or applied research in Visual Computing at the intersection of Computer Vision, Computer Graphics and Media Signal Processing
• Scientific publications
• Contribution to prototype and demonstrator development
• Overall contribution to V-SENSE and teamwork

Funding Information

The position is funded through the Science Foundation Ireland V-SENSE project.
Payment of tax-free stipend 18k per annum. In addition, payment of EU academic fees.
Applicants must have been resident in an EU member state for 3 out of the last 5 years to be eligible for EU fees

Qualifications
The researcher will be expected to have a good primary degree (preferably MSc) in Computer Science, ICT, Electronic Engineering, Mathematics, Statistics, or a related discipline. Good programming skills are essential.
The successful candidate must meet Trinity College Dublin entry requirements for Postgraduate Research Degrees, and also have excellent communication skills.

https://www.tcd.ie/courses/postgraduate/how-to-apply/requirements/index.php

Knowledge & Experience
• Enthusiasm for scientific research
• Strong ambition to learn and to master skills and knowledge to a world leading level
• Background in a sub-area of Visual Computing such as Computer Vision, Computer Graphics, or Media Signal Processing
• Programming experience in bigger projects, e.g. in C++, OpenCV, OpenGL, Matlab, etc.
• Affinity for creative dimensions of visual computing

Skills & Competencies
• Good written and oral proficiency in English (essential).
• Good communication and interpersonal skills both written and verbal.
• Proven aptitude for Programming, System Analysis and Design.
• Proven ability to prioritise workload and work to exacting deadlines.
• Strong team player who is able to take responsibility to contribute to the overall success of the team.
• Enthusiastic and structured approach to research and development.
• Excellent problem-solving abilities.
• Desire to learn about new products and technologies and to keep abreast of new product, technical and research developments.

Contacts and application
Please apply via email to Orla.Fox@SCSS.TCD.ie and include:
- a targeted cover letter (600-1000 words) explaining your suitability for the position
- a complete CV
Please include the reference code: VS-PhD on all correspondence.
There will be an interview process, and the successful candidate will be invited to apply via the TCD graduate studies admission system.

General enquires concerning this post can be addressed to Orla.Fox@scss.tcd.ie

hack begin box

Employer: V-SENSE Project, School of Computer Science and Statistics, Trinity College Dublin, the University of Dublin

Expiration date: Tuesday, January 31, 2017

More information: https://www.scss.tcd.ie/vacancies/index.php?id=188

hack end box

Experienced Research Fellow (Postdoc 4+ years) in Creative Technologies /Visual Computing

Join our new team of 20+ researchers (half postdocs half PhDs) in Visual Computing at the intersection of Computer Vision, Computer Graphics and Media Signal Processing. We are building a dynamic environment where enthusiastic young scientists with different backgrounds get together to shape the future in fundamental as well as applied research projects. Possible directions include but are not limited to:
• augmented reality (AR),
• virtual reality (VR),
• free viewpoint video (FVV),
• 3D video,
• 360/omni-directional video,
• high dynamic range (HDR),
• wide colour gamut (WCG),
• light-field technologies,
• segmentation/matting,
• 3D reconstruction, etc.
Individual research plans will be designed between PI, successful candidates and team, considering individual background, expertise, skills and interests, matching the overall strategy, and exploiting opportunities and inspirations.

The Experienced Research Fellow will provide leadership in the team, e.g. as supervisor and/or project leader. Academic career development will be encouraged and supported.

The research project “V-Sense – Extending Visual Sensation through Image-Based Visual Computing” is funded by SFI over five years with a substantial budget to cover over 20 researchers. This is part of a strategic investment in Creative Technologies by SFI and Trinity College, which is defined as one of the strategic research themes of the College. V-Sense intends to become an incubator in this context, to stimulate further integration and growth and to impact Creative Industries in Ireland as a whole.

Standard duties and Responsibilities of the Post
• Fundamental and/or applied research in Visual Computing at the intersection of Computer Vision, Computer Graphics and Media Signal Processing
• Scientific publications
• Contribution to prototype and demonstrator development
• Overall contribution to V-SENSE and teamwork
• Supervision of PhD and other students
• Outreach & dissemination
• Leadership in the team

Funding Information
The position is funded through the Science Foundation Ireland V-SENSE project.

Salary
Appointment will be made on the SFI Team member Budget Experienced Postdoctorate Research Fellow Level 2B Salary scale at a point in line with Government Pay Policy.

Post Status
Specific Purpose contract approximately 4 years – Full-time
The successful candidate will be expected to take up the post as soon as possible, preferably in March/April 2017.

Person Specification
Qualifications
• A Ph.D. in Computer Science, Engineering, or a related field in the area of ICT.
• A minimum of 4 years of postdoctoral experience.

Knowledge & Experience (Essential & Desirable)
Required
• An established track record of publication in leading journals and/or conferences, in one or more sub-areas of Visual Computing.
• Excellent knowledge of and integration in the related scientific communities.
• The ability to work well in a group, and the ability to mentor junior researchers, such as Ph.D. students.
Desired
• Affinity for creative dimensions of visual computing
• Experience in supervision and project leadership.
• Experience in academic services, e.g. peer reviewing, workshop/conference committee, etc.
• Experience with funding acquisition
• Teaching experience
• Academic track record, e.g. talks, tutorials
• Industry experience or engagement
• Prototype development
• Exhibitions, demos
• Standardization

Skills & Competencies
• Good written and oral proficiency in English (essential).
• Good communication and interpersonal skills both written and verbal.
• Proven aptitude for Programming, System Analysis and Design.
• Proven ability to prioritise workload and work to exacting deadlines.
• Proven track record of publication in high-quality venues.
• Flexible and adaptable in responding to stakeholder needs.
• Strong team player who is able to take responsibility to contribute to the overall success of the team.
• Enthusiastic and structured approach to research and development.
• Excellent problem-solving abilities.
• Desire to learn about new products and technologies and to keep abreast of new product, technical and research developments.

Contacts and application
Candidates should submit a cover letter together with a full curriculum vitae to include the names and contact details of 2 referees (email addresses if possible) to:
Name: Orla Fox
Title: Research Project Administrator
Email Address: Orla.Fox@SCSS.TCD.ie
Contact Telephone Number: 018968176
Please include the reference code: VS-ERF on all correspondence.

hack begin box

Employer: V-SENSE Project, School of Computer Science and Statistics, Trinity College Dublin, the University of Dublin

Expiration date: Tuesday, January 31, 2017

More information: https://www.scss.tcd.ie/vacancies/index.php?id=184

hack end box

PhD Studentship (4 positions) in Creative Technologies /Visual Computing

Post Title: PhD Studentship

Research Project: V-SENSE Project, School of Computer Science and Statistics, Trinity College Dublin, the University of Dublin

Post Status: Specific Purpose contract - up to 4 years - PhD Studentship in Creative Technologies/Visual Computing

Post Summary:
Join our new team of 20+ researchers (half postdocs half PhDs) in Visual Computing at the intersection of Computer Vision, Computer Graphics and Media Signal Processing. We are building a dynamic environment where enthusiastic young scientists with different backgrounds get together to shape the future in fundamental as well as applied research projects. Possible directions include but are not limited to:
• augmented reality (AR),
• virtual reality (VR),
• free viewpoint video (FVV),
• 3D video,
• 360/omni-directional video,
• high dynamic range (HDR),
• wide colour gamut (WCG),
• light-field technologies,
• segmentation/matting,
• 3D reconstruction, etc.

Individual research plans will be designed between PI, successful candidates and team, considering individual background, expertise, skills and interests, matching the overall strategy, and exploiting opportunities and inspirations.

The research project “V-Sense – Extending Visual Sensation through Image-Based Visual Computing” is funded by SFI over five years with a substantial budget to cover over 20 researchers. This is part of a strategic investment in Creative Technologies by SFI and Trinity College, which is defined as one of the strategic research themes of the College. V-Sense intends to become an incubator in this context, to stimulate further integration and growth and to impact Creative Industries in Ireland as a whole.

Standard duties and Responsibilities of the Post:
• Fundamental and/or applied research in Visual Computing at the intersection of Computer Vision, Computer Graphics and Media Signal Processing
• Scientific publications
• Contribution to prototype and demonstrator development
• Overall contribution to V-SENSE and teamwork

Qualifications:
The researcher will be expected to have a good primary degree (preferably MSc) in Computer Science, ICT, Electronic Engineering, Mathematics, Statistics, or a related discipline. Good programming skills are essential.
The successful candidate must meet Trinity College Dublin entry requirements for Postgraduate Research Degrees, and also have excellent communication skills.

https://www.tcd.ie/courses/postgraduate/how-to-apply/requirements/index.php

Knowledge & Experience:
• Enthusiasm for scientific research
• Strong ambition to learn and to master skills and knowledge to a world leading level
• Background in a sub-area of Visual Computing such as Computer Vision, Computer Graphics, or Media Signal Processing
• Programming experience in bigger projects, e.g. in C++, OpenCV, OpenGL, Matlab, etc.
• Affinity for creative dimensions of visual computing

Skills & Competencies:
• Good written and oral proficiency in English (essential).
• Good communication and interpersonal skills both written and verbal.
• Proven aptitude for Programming, System Analysis and Design.
• Proven ability to prioritise workload and work to exacting deadlines.
• Strong team player who is able to take responsibility to contribute to the overall success of the team.
• Enthusiastic and structured approach to research and development.
• Excellent problem-solving abilities.
• Desire to learn about new products and technologies and to keep abreast of new product, technical and research developments.

Funding Information: The position is funded through the Science Foundation Ireland V-SENSE project.

Benefits: Payment of tax-free stipend 18k per annum. In addition, payment of EU academic fees.
NOTE: Applicants must have been resident in an EU member state for 3 out of the last 5 years to be eligible for EU fees

Closing Date: Open until filled
The successful candidate will be expected to take up the post as soon as possible, preferably in March 2017.

Application Procedure:
Please apply via email to Orla.Fox@SCSS.TCD.ie and include:
• a targeted cover letter (600-1000 words) explaining your suitability for the position
• a complete CV
Please include the reference code: VS-PhD on all correspondence.
There will be an interview process, and the successful candidate will be invited to apply via the TCD graduate studies admission system.
General enquires concerning this post can be addressed to Orla.Fox@scss.tcd.ie

Trinity College is an equal opportunities employer

hack begin box

Employer: Trinity College Dublin

Expiration date: Tuesday, January 31, 2017

More information: https://www.scss.tcd.ie/vacancies/index.php?id=188

hack end box

Postdocs positions at Intelligent Information Media Laboratory

Postdocs positions at Intelligent Information Media Laboratory (established in October, 2016), Toyota Technological Institute (TTI-Japan)

Positions: Post-doctoral Research Fellow

Number of Positions Available: Three (3)

Research Field
Research on various kinds of human sensing and modeling with multimedia data such as images and videos (e.g., human motion sensing, human pose estimation, human action recognition, facial expression recognition, mental state estimation, surveillance) and basic techniques for these fields (e.g., computer vision, pattern recognition, machine learning such as deep neural networks and unsupervised learning).

Qualifications
PhD in a related field.

Starting Date
At the earliest possible date, after the employment contract is completed.

Terms of Employment
Yearly contract: renewable up to 3 years if positively evaluated.

Salary
JPY 320,000/month, plus commuting expenses and partial support for housing.

Documents to submit
(1) CV, including the applicant’s photograph, e-mail contact address, and the possible starting date of employment
(2) List of publications
(3) Summary of research achievements (about one page)
(4) Future plan of research and other activities (about one page)
(5) Name, affiliation, and e-mail address of two contact references
– Candidacy will not be considered unless the full documents are submitted.
– Interested applicants can submit the documents by either email with the subject line “Postdoc for intelligent information media” (PDF format is preferred) or postal mail with “Postdoc for intelligent information media” on the envelope; see the following address.
– The application documents will not be returned.

Deadline for submission
March 31, 2017. Applications will close once the positions are filled, regardless of the deadline.

Inquiry
Professor Norimichi Ukita
Toyota Technological Institute
2-12-1 Hisakata, Tempaku-ku, Nagoya 468-8511, JAPAN
Phone: +81-52-809-1832
e-mail: ukita@toyota-ti.ac.jp
http://www.toyota-ti.ac.jp/Lab/Denshi/iim/ukita/

Toyota Technological Institute is an Equal Opportunity/Affirmative Action Employer.

hack begin box

Employer: Toyota Technological Institute (TTI-Japan)

Expiration date: Friday, March 31, 2017

More information: http://www.toyota-ti.ac.jp/english/employment/2016/10/000321.html

hack end box

Post-doc offer: Social Media Analytics

CSIRO offers PhD graduates an opportunity to launch their scientific careers through our Postdoctoral Fellowships. These fellowships provide experience that will enhance career prospects and facilitate the development of potential leaders for CSIRO.

In this role you will find an attractive balance between theoretical research in social media analytics and its application to the mining sector.  The role also provides a unique opportunity to work in the emerging research area of Social Media analytics, looking at a number of aspects, such as Trust in the Social Web.

You will be part of a supportive, vibrant, multidisciplinary team of world-leading researchers, and contribute your expertise in modelling information flow, attitudes and influence in social media to develop a theoretical model for information flow and influence in social media such as:

Through this Postdoctoral Fellowship you will gain expertise in the real-time assessment of trust depicted in social media, with strong skills in big data analytics, data modelling, and visualisation.

hack begin box

Employer: CSIRO (Commonwealth Scientific and Industrial Research Organisation), Australia

Expiration date: Monday, October 31, 2016

More information: https://jobs.csiro.au/job/Sydney%2C-NSW-CSIRO-Postdoctoral-Fellowship-Social-Media-Analytics/365517500/

hack end box

Calls for Contribution

CFPs: Sponsored by ACM SIGMM

ACM MM 2017

ACM International Conference on Multimedia

hack begin box

Submission deadline: 07. April 2017

Location: Mountain View, CA, USA
Dates: 23. October 2017 -27. October 2017

More information: http://www.acmmm.org/2017/

Sponsored by ACM SIGMM

hack end box

Call for Regular Papers ACM Multimedia is the premier conference in multimedia, a research field that discusses emerging computing methods from a perspective in which each medium — e.g. images, text, audio — is a strong component of the complete, integrated exchange of information. The multimedia community has a tradition … Read more

ACM MoVid 2017

ACM Workshop on Mobile Video 2017

hack begin box

Submission deadline: 10. March 2017

Location: Taipei, Taiwan
Dates: 20. June 2017 -23. June 2017

More information: http://mmsys17.iis.sinica.edu.tw/movid/

Sponsored by ACM SIGMM

hack end box

ACM MoVid 2017 solicits original and unpublished research achievements in various aspects of mobile video services. The focus of this workshop is to present and discuss recent advances in the broad area of mobile video services. Specifically, the workshop intends to address the following topics: a) Novel mobile video applications … Read more

Demo Track @ ACM MMSys 2017

ACM Multimedia Systems 2017

hack begin box

Submission deadline: 10. February 2017

Location: Taipei, Taiwan
Dates: 20. June 2017 -23. June 2017

More information: http://mmsys17.iis.sinica.edu.tw/index.php/demo-track/

Sponsored by ACM SIGMM

hack end box

As in previous years, the demo sessions will promote applied research, application prototypes and systems alongside the scientific program. The sessions will not only showcase the applicability of recent results to real-world problems but also trigger exchanges of ideas between theory and practice and collaborations between MMSys attendees. Submissions from … Read more

Open Dataset and Software Track @ ACM MMSys 2017

ACM Multimedia Systems 2017

hack begin box

Submission deadline: 10. February 2017

Location: Taipei, Taiwan
Dates: 20. June 2017 -23. June 2017

More information: http://mmsys17.iis.sinica.edu.tw/index.php/dataset-track/

Sponsored by ACM SIGMM

hack end box

The ACM Multimedia Systems Conference (MMSys) provides a forum for researchers to present and share their latest research findings in multimedia systems. While research on specific aspects of multimedia systems is regularly published in the various proceedings and transactions of the networking, operating systems, real-time systems, and database communities, MMSys … Read more

MMVE @ ACM MMSys 2017

International Workshop on Massively Multiuser Virtual Environments

hack begin box

Submission deadline: 10. March 2017

Location: Taipeh, Taiwan
Dates: 20. June 2017 -23. June 2017

More information: http://mmsys17.iis.sinica.edu.tw/mmve/

Sponsored by ACM SIGMM

hack end box

Virtual Environment systems are spatial simulations that provide real-time human interactions with other users or a simulated virtual world. Virtual environments have experienced phenomenal growth in recent years in the form of massively multiplayer online games (MMOGs) such as World of Warcraft and Lineage, and social communities such as Second … Read more

ACM MMSys 2017

ACM Multimedia Systems Conference 2017

hack begin box

Submission deadline: 10. January 2017

Location: Taipei, Taiwan
Dates: 20. June 2017 -23. June 2017

More information: http://mmsys17.iis.sinica.edu.tw

Sponsored by ACM SIGMM

hack end box

The ACM Multimedia Systems Conference (MMSys) provides a forum for researchers to present and share their latest research findings in multimedia systems. While research on specific aspects of multimedia systems is regularly published in the various proceedings and transactions of the networking, operating systems, real-time systems, and database communities, MMSys … Read more

ACM NOSSDAV 2017

The 27th ACM Workshop on Network and Operating Systems Support for Digital Audio and Video

hack begin box

Submission deadline: 10. March 2017

Location: Taipei, Taiwan
Dates: 20. June 2017 -23. June 2017

More information: http://mmsys17.iis.sinica.edu.tw/nossdav/

Sponsored by ACM SIGMM

hack end box

NOSSDAV 2017, the 27th ACM SIGMM Workshop on Network and Operating Systems Support for Digital Audio and Video, will be co-located with MMSys 2017 in Taipei, Taiwan on June 20—23, 2017. As in previous years, the workshop will continue to focus on both established and emerging research topics, high-risk high-return … Read more

CFPs: Sponsored by ACM (any SIG)

TOMM

ACM Transactions on Multimedia Computing, Communication and Applications
Delay-Sensitive Video Computing in the Cloud

hack begin box

Submission deadline: 20. August 2017

Special issue

More information: http://tomm.acm.org/ACM-TOMM-SI-Delay-Sensitive-Video-Computing-in-Cloud.pdf

Sponsored by ACM

hack end box

Video applications are now among the most widely used and a daily fact of life for the great majority of Internet users. While presentational video services such as those provided by YouTube and NetFlix dominate video data, conversational video services such as video conferencing, multiplayer video gaming, telepresence, tele-learning, collaborative … Read more

ACM TVX 2017 WoP

ACM International Conference on Interactive Experiences for Television and Online Video

hack begin box

Submission deadline: 10. March 2017

Location: Hilversum, the Netherlands
Dates: 14. June 2017 -16. June 2017

More information: https://tvx.acm.org/2017/participation/

Sponsored by ACM

hack end box

ACM TVX, the leading international conference for research into online video, TV interaction and user experience, is now calling for Work-in-Progress, Doctoral Consortium, Demo and TVX-in-Industry submissions.

ACM TVX 2017

ACM International Conference on Interactive Experiences for Television and Online Video

hack begin box

Submission deadline: 20. January 2017

Location: Hilversum, The Netherlands
Dates: 14. June 2017 -16. June 2017

More information: http://www.tvx2017.com

Sponsored by ACM

hack end box

ACM TVX is the leading international conference for research into online video, TV interaction and user experience. It is a multi-disciplinary conference and we welcome submissions in a broad range of topics. Our aim is to foster discussions and innovative experiences amongst the academic research community and industry. In particular, … Read more

ACM ICMR 2017

ACM International Conference on Multimedia Retrieval

hack begin box

Submission deadline: 27. January 2017

Location: Bucharest, Romania
Dates: 06. June 2017 -09. June 2017

More information: http://www.icmr2017.ro/

Sponsored by ACM

hack end box

The Annual ACM International Conference on Multimedia Retrieval (ICMR) offers a great opportunity for exchanging leading-edge multimedia retrieval ideas among researchers, practitioners and other potential users of multimedia retrieval systems. This annual conference, which puts together the long-lasting experiences of the former ACM CIVR (International Conference on Image and Video … Read more

CFPs: Sponsored by IEEE (any TC)

DMIAF 2017

2nd Digital Media Industry & Academic Forum

hack begin box

Submission deadline: 31. March 2017

Location: Athens, Greece
Dates: 06. September 2017 -08. September 2017

More information: https://conferences.ece.ubc.ca/dmiaf2017/

Sponsored by IEEE

hack end box

ISM 2017

The 19th IEEE International Symposium on Multimedia (ISM 2017)

hack begin box

Submission deadline: 15. July 2017

Location: Taichung, Taiwan
Dates: 11. December 2017 -13. December 2017

More information: http://ism2017.asia.edu.tw/

Sponsored by IEEE

hack end box

DCER&HPE 2017

Joint Challenge and Workshop on Dominant and Complementary Emotion Recognition Using Micro Emotion Features and Head-Pose Estimation

hack begin box

Submission deadline: 24. March 2017

Location: Washington, DC., USA
Dates: 31. May 2017 -31. May 2017

More information: http://icv.tuit.ut.ee/fc2017

Sponsored by IEEE

hack end box

SmartMM2017

The 2017 International Workshop on Smart Multimedia (SmartMM2017)

hack begin box

Submission deadline: 10. March 2017

Location: Hong kong
Dates: 29. May 2017 -31. May 2017

More information: http://smartmm.org/

Sponsored by IEEE

hack end box

ACII 2017

Affective Computing and Intelligent Interaction

hack begin box

Submission deadline: 02. May 2017

Location: San Antonio, Texas
Dates: 23. October 2017 -26. October 2017

More information: http://www.acii2017.org

Sponsored by IEEE

hack end box

CFPs: Not ACM-/IEEE-sponsored

DMSVLSS 2017

The 23rd International Conference on Distributed Multimedia Systems, Visual Languages and Sentient Systems

hack begin box

Submission deadline: 22. March 2017

Location: Wyndham Pittsburgh University Center, Pittsburgh, USA
Dates: 07. July 2017 -08. July 2017

More information: http://ksiresearchorg.ipage.com/seke/dmsvlss17.html

hack end box

DMCIT 2017

International Conference on Data Mining, Communications and Information Technology

hack begin box

Submission deadline: 20. February 2017

Location: Phuket, Thailand
Dates: 25. May 2017 -27. May 2017

More information: http://www.dmcit.net/

In cooperation with ACM

hack end box

DIPEWC 2017

The Second International Conference on Digital Information Processing, Electronics, and Wireless Communications (DIPEWC2017)

hack begin box

Submission deadline: 15. August 2017

Location: ISGA (Higher Institute of Engineering and Business - Marrakesh), Marrakesh, Kingdom of Morocco
Dates: 28. September 2017 -30. September 2017

More information: http://sdiwc.net/conferences/2nd-international-conference-on-digital-information-processing-electronics-and-wireless-communications/

hack end box

ISDF 2017

The Third International Conference on Information Security and Digital Forensics (ISDF2017)

hack begin box

Submission deadline: 07. November 2017

Location: Metropolitan College, Thessaloniki, Greece
Dates: 08. December 2017 -10. December 2017

More information: http://sdiwc.net/conferences/3rd-international-information-security-digital-forensics/

hack end box

ICDIPC 2017

The Seventh International Conference on Digital Information Processing and Communications (ICDIPC2017)

hack begin box

Submission deadline: 11. June 2017

Location: Asia Pacific University of Technology and Innovation (APU), Kuala Lumpur, Malaysia
Dates: 11. July 2017 -13. July 2017

More information: http://sdiwc.net/conferences/7th-international-conference-digital-information-processing-communications/

hack end box

ICIAP 2017

19th International Conference on Image Analysis and Processing (ICIAP)

hack begin box

Submission deadline: 31. March 2017

Location: Catania, Italy
Dates: 11. September 2017 -15. September 2017

More information: http://www.iciap2017.com

hack end box

CEA2017 @ IJCAI2017

The 9th International Workshop on Multimedia for Cooking and Eating Activities

hack begin box

Submission deadline: 01. May 2017

Location: Melbourne, Australia
Dates: 19. August 2017 -21. August 2017

More information: http://www.mm.media.kyoto-u.ac.jp/CEA2017/

hack end box

MUST-EH 2017 @ IEEE ICME 2017

7th IEEE ICME International Workshop on Multimedia Services and Technologies for E-health (MUST-EH 2017)

hack begin box

Submission deadline: 27. February 2017

Location: Hong Kong
Dates: 10. July 2017 -14. July 2017

More information: http://www.mcrlab.net/must-eh-workshop/

hack end box

SparDa @ CBMI 2017

hack begin box

Submission deadline: 28. February 2017

Location: Firenze, Italy
Dates: 19. June 2017 -21. June 2017

More information: http://www.micc.unifi.it/cbmi2017/

In cooperation with ACM SIGMM

hack end box

DigitalSec2017

The Fourth International Conference on Digital Security and Forensics (DigitalSec2017)

hack begin box

Submission deadline: 11. June 2017

Location: Kuala Lumpur, Malaysia
Dates: 11. July 2017 -13. July 2017

More information: http://sdiwc.net/conferences/4th-conference-digital-security-forensics/

hack end box

ICESS2017

The Third International Conference on Electronics and Software Science

hack begin box

Submission deadline: 30. June 2017

Location: Takamatsu Sunport Hall Building, Takamatsu, Japan
Dates: 31. July 2017 -02. August 2017

More information: http://sdiwc.net/conferences/3rd-international-electronics-software-science/

hack end box

CBMI2017

International Workshop on Content-Based Multimedia Indexing

hack begin box

Submission deadline: 28. February 2017

Location: Firenze, Italy
Dates: 19. June 2017 -21. June 2017

More information: https://www.micc.unifi.it/cbmi2017/

hack end box

NetGames 2017

The 15th Annual Workshop on Network and Systems Support for Games

hack begin box

Submission deadline: 10. February 2017

Location: Taipei, Taiwan
Dates: 20. June 2017 -23. June 2017

More information: http://netgames2017.web.nitech.ac.jp/

In cooperation with ACM SIGMM

hack end box

INFOSEC 2017

The Third International Conference on Information Security and Cyber Forensics (INFOSEC2017)

hack begin box

Submission deadline: 29. May 2017

Location: Comenius University in Bratislava, Slovakia
Dates: 29. June 2017 -01. July 2017

More information: http://sdiwc.net/conferences/3rd-international-conference-information-security-cyber-forensics/

hack end box

MVA 2017

The 15th IAPR Conference on Machine Vision Applications

hack begin box

Submission deadline: 05. December 2016

Location: Nagoya, Japan
Dates: 08. May 2017 -12. May 2017

More information: http://www.mva-org.jp/mva2017/

hack end box

ICETC 2017

The Fourth International Conference on Education, Technologies and Computers

hack begin box

Submission deadline: 01. April 2017

Location: St. Mary's University
Dates: 22. April 2017 -24. April 2017

More information: http://sdiwc.net/conferences/4th-international-education-technologies-computers/

hack end box

CSCESM2017

The Fourth International Conference on Computer Science, Computer Engineering and Social Media (CSCESM2017)

hack begin box

Submission deadline: 16. April 2017

Location: Jadara University
Dates: 16. May 2017 -18. May 2017

More information: http://sdiwc.net/conferences/4th-international-conference-computer-science-computer-engineering-social-media/

hack end box

Back Matter

Notice to Contributing Authors to SIG Newsletters

By submitting your article for distribution in this Special Interest Group publication, you hereby grant to ACM the following non-exclusive, perpetual, worldwide rights:

However, as a contributing author, you retain copyright to your article and ACM will refer requests for republication directly to you.

Impressum

Editor-in-Chief

Carsten Griwodz, Simula Research Laboratory

Editors
