Mobile Systems Design Lab
Principal Investigator: Professor Sujit Dey

University of California, San Diego


Efficiently Addressing Annotation Sparsity in Online Content



Several expert systems have been proposed to address the sparsity of tags associated with online content such as images and videos. However, most such systems require extracting domain-specific features, rely solely on tag semantics, or need significant space to store corpus-based tag statistics. To address these shortcomings, we show how ontological tag trees can encode, in a space-efficient manner, the information a given corpus contains about interactions between tags. An ontological tag tree is defined as an undirected, weighted tree on the set of tags, where each possible tag is treated as a node. We formulate tag tree construction as an optimization problem over the space of trees on the set of tags and propose a novel local-search-based approach that uses the co-occurrence statistics of the tags in the corpus. To make the optimization more efficient, we initialize it using the semantic relationships between the tags. We use the approach to construct tag trees for two large image corpora, one from Flickr and one from a set of stock images. Extensive data-driven evaluations demonstrate that the constructed tag trees can outperform previous approaches both in the accuracy of predicting unseen tags from a partially observed set of tags and in the efficiency of predicting all applicable tags for a resource.
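To make the idea of a local search over trees concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the objective (co-occurrence-weighted average hop count, suggested by the title of the WWW 2014 poster), the edge-swap move, the function names, and the toy corpus are all assumptions, and edge weights are ignored. The initial tree is passed in by the caller, standing in for the semantic initialization described above.

```python
from collections import Counter, deque
from itertools import combinations


def cooccurrence(corpus):
    """Count how often each unordered tag pair appears on the same resource."""
    counts = Counter()
    for tags in corpus:
        for a, b in combinations(sorted(set(tags)), 2):
            counts[(a, b)] += 1
    return counts


def hop_distances(edges, tags, source):
    """Breadth-first search: hop count from `source` to every reachable tag."""
    adj = {t: set() for t in tags}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def weighted_average_hops(edges, tags, counts):
    """Assumed objective: co-occurrence-weighted mean hop distance in the tree."""
    total = sum(counts.values())
    cost = sum(c * hop_distances(edges, tags, a)[b] for (a, b), c in counts.items())
    return cost / total


def improving_swap(edges, tags, counts, best):
    """Return the first edge swap that strictly lowers the objective, or None."""
    for old in sorted(edges, key=sorted):
        rest = edges - {old}
        # Removing one edge splits the tree into two components.
        side = set(hop_distances(rest, tags, min(old)))
        other = set(tags) - side
        for a in sorted(side):
            for b in sorted(other):
                # Reconnecting the two components yields another spanning tree.
                candidate = rest | {frozenset((a, b))}
                cost = weighted_average_hops(candidate, tags, counts)
                if cost < best - 1e-12:
                    return candidate, cost
    return None


def local_search(tags, initial_edges, counts):
    """Apply improving swaps until none remains (a local optimum)."""
    edges = {frozenset(e) for e in initial_edges}
    best = weighted_average_hops(edges, tags, counts)
    while True:
        step = improving_swap(edges, tags, counts, best)
        if step is None:
            return edges, best
        edges, best = step


# Hypothetical toy corpus: each inner list is the tag set of one image.
tags = ["balloon", "birthday", "cake", "party"]
corpus = [
    ["party", "birthday"],
    ["party", "cake"],
    ["party", "balloon"],
    ["party", "birthday", "cake"],
]
counts = cooccurrence(corpus)
# Hypothetical initial spanning tree (a chain), standing in for a semantic initialization.
initial = [("birthday", "cake"), ("cake", "balloon"), ("balloon", "party")]
initial_cost = weighted_average_hops({frozenset(e) for e in initial}, tags, counts)
final_edges, final_cost = local_search(tags, initial, counts)
print(initial_cost, final_cost, sorted(sorted(e) for e in final_edges))
```

The sketch stores only the tree's edges once the search finishes, which is the space-saving point made above: a tree over n tags has n - 1 edges, whereas the full pairwise co-occurrence table can grow quadratically in n. Recomputing the objective from scratch for every candidate swap is deliberately naive and would be too slow for large tag sets.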

Figure 1: Comparison with WordNet

Figure 1 shows two examples of subgraphs built using the proposed data-driven approach (left) and the corresponding subgraphs obtained from WordNet (right). In example (a), 'holiday' and 'travel' are directly connected using our approach but are separated by multiple hops in the WordNet hierarchy. In example (b), the proposed approach identifies 'party' as the central node that connects several other party-related tags.

Below are the publications based on the above work:

  1. C. Verma, V. Mahadevan, N. Rasiwasia, G. Aggarwal, R. Kant, A. Jaimes and S. Dey, "Construction and Evaluation of Ontological Tag Trees", Expert Systems with Applications (Elsevier), vol. 42, no. 24, Dec. 2015.

  2. C. Verma, V. Mahadevan, N. Rasiwasia, G. Aggarwal, R. Kant, A. Jaimes and S. Dey, "Construction of Tag Ontological Graphs by Locally Minimizing Weighted Average Hops", in ACM World Wide Web (WWW poster), Apr. 2014.


Chetan Verma
Graduate student

Sujit Dey

Vijay Mahadevan (Yahoo Labs)

Nikhil Rasiwasia (Yahoo Labs)

Gaurav Aggarwal (Yahoo Labs)

Ravi Kant (Yahoo Labs)

Alejandro Jaimes (Yahoo Labs)



Yahoo Labs,
Center for Wireless Communications (CWC), UCSD