Honglin Yu

Understanding the Popularity Evolution of Online Media: A Case Study on YouTube Videos

Supervisor(s) and Committee member(s): Lexing Xie (chair of panel), Scott Sanner (supervisor), Henry Gardner (advisor)

Understanding the popularity evolution of online media has become an important research topic. It raises several key questions of both scientific significance and wide practical relevance: What are the statistical characteristics of online user behavior? What are the main factors that affect online collective attention? How can one predict the popularity of online content? Recently, researchers have tried to understand how popularity evolves from both theoretical and empirical perspectives. A number of important insights have been gained: most videos obtain the majority of their views in the early stage after uploading; among videos with identical content there is a strong “first-mover” advantage, so that early uploads receive the most views; and YouTube video viewcount dynamics correlate strongly with video quality. Building upon these insights, this thesis makes the following contributions. First, we propose two new representations of viewcount dynamics. One is the popularity scale, which represents each video’s popularity by its relative viewcount rank in a large-scale dataset. The other is the popularity phase, which models the rise and fall of a video’s daily viewcount over time. Second, we propose four computational tools. The first is an efficient viewcount phase detection algorithm that not only automatically determines the number of phases each video has, but also finds the phase parameters and boundaries. The second is a phase-aware viewcount prediction method that uses phase information to significantly improve on the existing state-of-the-art method. The third is a phase-aware viewcount clustering method that better captures “pulse patterns” in viewcount data. The fourth is a novel method of predicting viewcounts using external information from the Twitter network. Finally, this thesis presents results from a large-scale, longitudinal measurement study of YouTube video viewcount histories. For example, we find that videos of different popularity and categories have distinctive phase histories, and we observe a non-trivial number of concave phases. Such dynamics cannot be explained by existing models, and the terminology and tools introduced here have the potential to spark fresh analysis efforts and further research. In all, the methods and insights developed in the thesis improve our understanding of online collective attention. They also have considerable potential for use in online marketing, recommendation, and information dissemination, e.g. during emergencies and natural disasters.
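
As a rough illustration of the popularity scale idea described in the abstract (representing a video by its relative viewcount rank within a dataset), the following minimal Python sketch computes a percentile for each video. It is not the thesis's implementation: the function name popularity_scale, the toy data, and the tie-handling rule (the fraction of videos with strictly fewer views) are all assumptions made for illustration.

```python
from bisect import bisect_left


def popularity_scale(viewcounts):
    """Map each video ID to a popularity percentile in [0, 1).

    Assumption for this sketch: a video's percentile is the fraction of
    videos in the dataset with a strictly smaller viewcount, so tied
    videos share the same value.
    """
    ordered = sorted(viewcounts.values())
    n = len(ordered)
    # bisect_left returns how many sorted values are strictly below `count`.
    return {vid: bisect_left(ordered, count) / n
            for vid, count in viewcounts.items()}


# Hypothetical toy data: video ID -> total viewcount.
views = {"a": 10, "b": 1000, "c": 50, "d": 50}
scale = popularity_scale(views)
```

Because the representation depends only on rank, it is insensitive to the heavy-tailed spread of raw viewcounts, which is one plausible reason to prefer it when comparing videos across a large dataset.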

The Computational Social Science (CSS@CS) Lab of ANU

URL: http://css.cecs.anu.edu.au/

The Computational Social Science (CSS@CS) Lab is located within the Research School of Computer Science at the Australian National University. We collaborate closely with the Machine Learning and Optimization research groups in NICTA, and a growing set of ANU researchers in the social sciences.

We focus on laying computational foundations for working with large amounts of social and behavioral data. Our work places a strong emphasis on actionable insights and, in the long term, on influencing policy.
