Position Bias in Recommender Systems for Digital Libraries
Andrew Collins1, Dominika Tkaczyk1, Akiko Aizawa2, Joeran Beel1
1Trinity College Dublin, Ireland; 2National Institute of Informatics, Tokyo, Japan
“Position bias” describes the tendency of users to interact with items at the top of a list with higher probability than with items at a lower position in the list, regardless of the items’ actual relevance. In the domain of recommender systems, particularly recommender systems in digital libraries, position bias has received little attention. We conduct a study in a real-world recommender system that delivered ten million related-article recommendations to the users of the digital library Sowiport and the reference manager JabRef. Recommendations were randomly chosen to be shuffled or non-shuffled, and we compared the click-through rate (CTR) for each rank of the recommendations. According to our analysis, the CTR for the highest rank in the case of Sowiport is 53% higher than expected in a hypothetical non-biased situation (0.189% vs. 0.123%). Similarly, in the case of JabRef the highest rank received a CTR of 1.276%, which is 87% higher than expected (0.683%). A chi-squared test confirms the strong relationship between the rank of the recommendation shown to the user and whether the user decided to click it (p < 0.01 for both JabRef and Sowiport). Our study confirms the findings from other domains that recommendations in the top positions are clicked more often, regardless of their actual relevance.
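The two calculations named in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function names `relative_uplift` and `chi_squared_statistic` are hypothetical, and only the four CTR values are taken from the abstract. The chi-squared function computes the standard Pearson statistic for a contingency table whose rows are ranks and whose columns are clicked and not-clicked counts; the per-rank counts from the study are not given in the abstract, so none are supplied here.

```python
def relative_uplift(observed_ctr, expected_ctr):
    """Relative difference between an observed CTR and the expected (unbiased) CTR."""
    return (observed_ctr - expected_ctr) / expected_ctr


def chi_squared_statistic(table):
    """Pearson chi-squared statistic for an r x c contingency table.

    Assumed layout (not specified in the abstract): one row per rank,
    columns = [clicked, not clicked]. Every row and column total must be > 0.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rank and clicking.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# CTR values (in percent) for the highest rank, as reported in the abstract.
sowiport_uplift = relative_uplift(0.189, 0.123)
jabref_uplift = relative_uplift(1.276, 0.683)
print(f"Sowiport: {sowiport_uplift:.1%}, JabRef: {jabref_uplift:.1%}")
```

With the rounded CTRs from the abstract, the uplifts come out close to the reported 53% and 87%; small deviations are presumably due to rounding in the published values. To obtain a p-value, the statistic would be compared against a chi-squared distribution with (rows - 1) × (columns - 1) degrees of freedom.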
Unsupervised Citation Sentence Identification based on Similarity Measurement
Shiyan Ou, Hyonil Kim
Nanjing University, People's Republic of China
Citation context analysis has attracted the interest of many researchers in the field of bibliometrics. Its first step is to extract the context of each citation from a citing paper. In this paper, we propose a novel unsupervised approach for identifying implicit citation sentences, that is, sentences that belong to a citation context but carry no citation tag. Our approach selects the neighboring sentences around an explicit citation sentence as candidate sentences, calculates the similarity between each candidate sentence and both the cited and the citing paper, and deems those candidates that are more similar to the cited paper to be implicit citation sentences. To calculate text similarity, we propose four methods based on the Doc2vec model, the Vector Space Model (VSM) and the LDA model. The experimental results show that the hybrid method combining the probabilistic TF-IDF weighted VSM with the TF-IDF weighted Doc2vec obtained the best performance. Unlike supervised methods, our approach does not need any annotated training corpus and should therefore, in theory, be easy to apply to other domains.
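The decision rule described in the abstract above (keep a candidate sentence if it is more similar to the cited paper than to the citing paper) can be sketched as follows. This is an illustration under simplifying assumptions, not the authors' method: it uses only a plain TF-IDF weighted vector space model with a smoothed IDF and cosine similarity, whereas the best-performing method in the abstract is a hybrid with Doc2vec. All function names and the toy texts are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict) per document, using a smoothed IDF."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in doc_freq.items()}
    return [{t: tf * idf[t] for t, tf in Counter(toks).items()} for toks in tokenized]


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def implicit_citation_sentences(candidates, cited_text, citing_text):
    """Keep candidates that are more similar to the cited paper than to the citing paper."""
    vectors = tfidf_vectors(candidates + [cited_text, citing_text])
    cited_vec, citing_vec = vectors[-2], vectors[-1]
    return [
        sentence
        for sentence, vec in zip(candidates, vectors)
        if cosine(vec, cited_vec) > cosine(vec, citing_vec)
    ]


# Toy texts invented for illustration only.
candidates = [
    "The model uses topic distributions over words.",
    "Our experiments use a new corpus of reviews.",
]
cited = "We introduce a topic model with distributions over words and documents."
citing = "We run experiments on a corpus of product reviews and report accuracy."
selected = implicit_citation_sentences(candidates, cited, citing)
print(selected)
```

In this toy example only the first candidate shares vocabulary mainly with the cited text, so it is the only one selected. In practice the cited and citing texts would be full papers, and the candidates would be the sentences neighboring an explicit citation sentence.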
“What was this movie about this chick?”: A Comparative Study of Relevance Aspects in Book and Movie Discovery
Toine Bogers1, Maria Gäde2, Marijn Koolen3, Vivien Petras2, Mette Skov4
1Aalborg University Copenhagen, Denmark; 2Humboldt Universität zu Berlin, Germany; 3Huygens Institute, the Netherlands; 4Aalborg University, Denmark
In recent decades, information retrieval research has slowly expanded its focus to address the wealth of complex search requests present in our work and leisure environments. A better understanding of such complex needs could aid in the design of more effective, domain-specific search engines. In this paper we take a first step towards such domain-specific understanding. We present an analysis of a random sample of 1000+ complex book and movie search requests posted in the LibraryThing and Internet Movie Database forums. A coding scheme was developed that captures the 29 different relevance aspects expressed in these requests. We find that while the identified relevance aspects are remarkably similar for complex book and movie requests, their relative occurrence does vary considerably from domain to domain.