Rules for Inducing Hierarchies from Social Tagging Data
Hang Dong1,2, Wei Wang2, Frans Coenen1
1University of Liverpool, United Kingdom; 2Xi'an Jiaotong-Liverpool University, People's Republic of China
Automatic generation of hierarchies from social tags is a challenging task. We identified three rules from the literature (set inclusion, graph centrality, and an information-theoretic condition) and proposed two new rules, fuzzy set inclusion and probabilistic association, to induce hierarchical relations. We proposed a hierarchy generation algorithm that can incorporate each rule with different data representations, i.e., resource-based and Probabilistic Topic Model-based representations. The learned hierarchies were compared to several widely used reference concept hierarchies. We found that the rules based on probabilistic association and set inclusion helped produce better-quality hierarchies according to the evaluation metrics.
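As an illustration of the set inclusion rule with a resource-based representation, the following is a minimal sketch, not the authors' algorithm. It assumes each tag is represented by the set of resources annotated with it, and that a tag becomes a child of a larger tag when most of its resources are shared. The function names, the threshold of 0.8, and the toy data are all hypothetical.

```python
def inclusion_degree(child_resources, parent_resources):
    """Fraction of the child's resources that also carry the parent tag."""
    if not child_resources:
        return 0.0
    return len(child_resources & parent_resources) / len(child_resources)


def induce_relations(tag_resources, threshold=0.8):
    """Return sorted (parent, child) pairs under a simple set inclusion rule.

    A pair is kept when the parent annotates strictly more resources than
    the child and the inclusion degree reaches the (assumed) threshold.
    """
    relations = []
    for parent, p_res in tag_resources.items():
        for child, c_res in tag_resources.items():
            if parent == child or len(p_res) <= len(c_res):
                continue
            if inclusion_degree(c_res, p_res) >= threshold:
                relations.append((parent, child))
    return sorted(relations)


# Toy data: tag -> set of resource ids (hypothetical).
tag_resources = {
    "programming": {1, 2, 3, 4, 5},
    "python": {1, 2, 3},
    "cooking": {6, 7},
}
relations = induce_relations(tag_resources)
print(relations)
```

A fuzzy variant could replace the hard threshold with a graded membership score, but the abstract does not specify how the authors define it.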
A new method of keyword selection and semantic measurement in co-word analysis
Liqin Zhou1, Zhichao Ba1, Hao Fan1, Bin Zhang2
1Center for the Studies of Information Resources of Wuhan University, Wuhan, 430072; 2Center of Traditional Chinese Cultural Studies, Wuhan University, Wuhan, 430072
To address the “same amount with different qualities” phenomenon and the lack of semantics in co-occurring terms, this paper proposed a new keyword selection and semantic measurement method for co-word analysis. The method first assigned different weights to document units based on Pointwise Mutual Information (PMI) and incorporated them into the generation process of the Latent Dirichlet Allocation (LDA) model to extract core keywords. The word2vec model was then used to transform the Top-N keywords into low-dimensional vectors, and the semantic correlation among keywords was calculated based on the window length. Finally, data from the domain of “deep learning” was used to verify the scientific validity and effectiveness of the method. Comparing the results of general co-word analysis with those of our proposed method in terms of clustering analysis, network parameters, distribution structures, and other aspects, we find that our method is sound and effective in accounting for the different feasibilities of terms and their semantic correlations.
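For readers unfamiliar with PMI, the sketch below computes the standard document-level score, PMI(a, b) = log2(p(a, b) / (p(a) p(b))), for co-occurring keywords. It illustrates the general formula only. How the paper turns PMI into weights for document units and feeds them into LDA is not described in the abstract, so that step is not modeled here. The function name and the toy documents are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_scores(documents):
    """Compute PMI for every keyword pair that co-occurs in a document.

    Probabilities are estimated as document frequencies divided by the
    number of documents, so PMI = log2(n_ab * n / (n_a * n_b)).
    """
    n = len(documents)
    term_counts = Counter()
    pair_counts = Counter()
    for doc in documents:
        terms = set(doc)  # count each keyword once per document
        term_counts.update(terms)
        pair_counts.update(combinations(sorted(terms), 2))
    return {
        (a, b): math.log2(n_ab * n / (term_counts[a] * term_counts[b]))
        for (a, b), n_ab in pair_counts.items()
    }


# Toy keyword lists (hypothetical), one list per document.
documents = [
    ["deep learning", "neural network"],
    ["deep learning", "neural network"],
    ["deep learning", "svm"],
    ["svm", "kernel"],
]
scores = pmi_scores(documents)
for pair, score in sorted(scores.items()):
    print(pair, round(score, 3))
```

Pairs that co-occur more often than their individual frequencies would predict receive a positive score, and pairs that co-occur less often receive a negative one.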
Giveme5W: Main Event Retrieval from News Articles by Extraction of the Five Journalistic W Questions
Felix Hamborg, Sören Lachnit, Moritz Schubotz, Thomas Hepp, Bela Gipp
University of Konstanz, Germany
Extraction of event descriptors from news articles is a commonly required step for various tasks, such as clustering related articles, summarization, and news aggregation. Due to the lack of generally usable and publicly available methods optimized for news, many researchers must redundantly implement such methods for their projects. Answers to the five journalistic W questions (5Ws) describe the main event of a news article, i.e., who did what, when, where, and why. The main contribution of this paper is Giveme5W, the first open-source, syntax-based 5W extraction system for news articles. The system retrieves an article’s main event by extracting phrases that answer the journalistic 5Ws. In an evaluation with three assessors and 60 articles, we find that the extraction precision of 5W phrases is p = 0.7.
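The reported precision can be read as the share of extracted phrases that assessors judged correct. The sketch below shows that calculation for a single article under a simplifying assumption of binary judgments. The abstract does not state the authors' exact relevance scale or aggregation across assessors, and the judgment values here are hypothetical toy data, not results from the paper.

```python
def precision(judgments):
    """Share of extracted phrases judged correct (True)."""
    if not judgments:
        raise ValueError("no judgments given")
    return sum(judgments) / len(judgments)


# Toy assessor judgments for one article, keyed by 5W question (hypothetical).
judged = {"who": True, "what": True, "when": False, "where": True, "why": False}
p = precision(list(judged.values()))
print(p)
```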