Completed Papers 5: Data Science and Analytics
Let's Workout! Exploring Social Exercise in an Online Fitness Community
1 University of Washington, United States of America; 2 University of Minnesota, United States of America
Increasing attention has been paid to promoting healthy habits through social interaction in online fitness communities. At the intersection of social media and activity tracking applications, these platforms capture information on physical activities and peer-to-peer interactions. Importantly, they also offer researchers a novel opportunity to understand health behaviors by utilizing the large-scale behavioral trace data they archive. In this study, we explore the characteristics and dynamics of social exercise (i.e., exercise activities with at least one peer physically co-present) using data collected from an online fitness community popular with cyclists and runners. In particular, we ask whether factors such as temporal seasonality, activity performance, and social feedback vary with the number of people participating in an activity, and we compare these associations for men and women. Our results indicate that when peers are physically present for fitness activities (i.e., group workouts), exercise tends to be more intense and to receive more feedback from other users, for both men and women. These results have important implications for health and wellness interventions.
Access to Billions of Pages for Large-Scale Text Analysis
University of Illinois at Urbana-Champaign, United States of America
Consortial collections have led to unprecedented scales of digitized corpora, but the insights that they enable are hampered by the complexities of access, particularly to in-copyright or orphan works. Pursuing a principle of non-consumptive access, we developed the Extracted Features (EF) dataset, which provides quantitative counts for every page of nearly 5 million scanned books. The EF dataset includes unigram counts, part-of-speech tagging, header and footer extraction, counts of characters at both sides of the page, and more. Distributing book data with features already extracted saves the resource costs associated with large-scale text use, improves the reproducibility of research done on the dataset, and opens the door to datasets on copyrighted books. We describe the coverage of the dataset and demonstrate its usefulness through alignment of duplicate books and identification of their cleanest scans, topic modeling, word list expansion, and multifaceted visualization.
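As a rough illustration of what page-level feature extraction can look like, the sketch below reduces one page of text to counts only, so that the running text itself is not redistributed. The function name `extract_page_features`, the field names, and the tokenization rule are all hypothetical and do not reflect the actual EF schema; reading "characters at both sides of the page" as the first and last characters of each line is likewise an assumption.

```python
import re
from collections import Counter


def extract_page_features(page_text):
    """Reduce one page to counts (hypothetical schema, not the EF format)."""
    # Keep only non-blank lines so empty lines do not affect the counts.
    lines = [ln.strip() for ln in page_text.splitlines() if ln.strip()]
    # Simplistic tokenizer: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", page_text.lower())
    return {
        "token_count": len(tokens),
        "line_count": len(lines),
        "unigram_counts": Counter(tokens),
        # Characters at the left and right edges of each line.
        "begin_line_chars": Counter(ln[0] for ln in lines),
        "end_line_chars": Counter(ln[-1] for ln in lines),
    }


page = "The whale rose.\nThe sea was calm,\nand the ship sailed on."
features = extract_page_features(page)
```

Counts like these are enough for tasks such as topic modeling or word list expansion, which do not need word order within a page.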
Writing to Persuade: Analysis and Detection of Persuasive Discourse
1 Syracuse University, United States of America; 2 University of Western Ontario, Canada
The relation between the dialogue behavior of participants in communicative settings and whether they are perceived as persuasive by other participants has long been established in the literature. In this study, we focus on the linguistic facets of written messages and aim to gain insight into the dimensions of language that can lead to persuasion. Through the analysis of various linguistic dimensions, we select a set of features to be used in a supervised manner to identify persuasive text. The selected features are independent of semantics and are mainly surface-based attributes related to the structure and organization of the text. The use of certain language elements, such as pronouns and articles, is also taken into account. The evaluation results of supervised machine learning algorithms are promising, which suggests that surface-based linguistic attributes can contribute greatly to the persuasiveness of text, regardless of the underlying claims and arguments.
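To make the idea of surface-based attributes concrete, the following sketch computes a few such features for a single message. The feature set, the word lists `PRONOUNS` and `ARTICLES`, and the function name `surface_features` are illustrative assumptions, not the features selected in the study; the resulting values could be passed to any supervised classifier.

```python
import re

# Small illustrative word lists (assumptions, not the study's lexicons).
PRONOUNS = {"i", "you", "he", "she", "it", "we", "they",
            "me", "him", "her", "us", "them"}
ARTICLES = {"a", "an", "the"}


def surface_features(message):
    """Compute simple structure-based features that ignore meaning."""
    words = re.findall(r"[a-z']+", message.lower())
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", message) if s.strip()]
    n_words = len(words) or 1          # guard against empty messages
    n_sentences = len(sentences) or 1
    return {
        "word_count": len(words),
        "sentence_count": len(sentences),
        "avg_sentence_length": len(words) / n_sentences,
        "pronoun_ratio": sum(w in PRONOUNS for w in words) / n_words,
        "article_ratio": sum(w in ARTICLES for w in words) / n_words,
    }


feats = surface_features("You should try the plan. It helps us all!")
```

None of these features depends on what the message argues, only on how it is written.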