Modeling and understanding communities in online social media using probabilistic methods
Supervisor(s) and Committee member(s): Daniel Gatica-Perez (supervisor), Susanne Boll (jury member), Jose del R. Millan (jury member), Roelof van Zwol (jury member)
The goal of this thesis was to model and understand emerging online communities that revolve around multimedia content, more specifically photos, by using large-scale data and probabilistic models in a quantitative approach.
The dissertation makes four contributions. First, using data from two online photo management systems, this thesis examined different aspects of the behavior of users of these systems pertaining to the uploading and sharing of photos with other users and online groups.
Second, probabilistic topic models were used to model online entities, such as users and groups of users, and the newly proposed representations were shown to be useful for further understanding such entities, as well as to have practical applications in search and recommendation scenarios. Third, by jointly modeling users from two different social photo systems, it was shown that differences exist at the level of vocabulary, and that different sharing behaviors can be observed.
Finally, by modeling online user groups as entities in a topic-based model, hyper-communities were discovered in an automatic fashion based on various topic-based representations.
These hyper-communities were shown, through both an objective and a subjective evaluation with a number of users, to be generally homogeneous, and therefore likely to constitute a viable exploration technique for online communities.
Social computing group
Our recent work has investigated methods to analyze small groups at work in multisensor spaces, populations of mobile phone users in urban environments, and online communities in social media.