Papers 2: Methodological Concerns in (Big) Data Research
10:30am - 12:00pm
Session Chair: Mohammad Hossein Jarrahi, University of North Carolina at Chapel Hill
Methodological Transparency and Big Data: A Critical Comparative Analysis of Institutionalization
M. R. Sanfilippo1, C. McCoy2
1Princeton University, United States of America; 2Indiana University, United States of America
Big data is increasingly employed in predictive social analyses, yet there are many visible instances of unreliable models or failure, raising questions about methodological validity in data driven approaches. From meta-analysis of methodological institutionalization across three scholarly disciplines, there is evidence that traditional statistical quantitative methods, which are more institutionalized and consistent, are important to develop, structure, and institutionalize data scientific approaches for new and large n quantitative methods, indicating that data driven research approaches may be limited in reliability, validity, generalizability, and interpretability. Results also indicate that interdisciplinary collaborations describe methods in significantly greater detail on projects employing big data, with the effect that institutionalization makes data science approaches more transparent.
Spanning the Boundaries of Data Visualization Work: An Exploration of Functional Affordances and Disciplinary Values
J. Snyder1, K. Shilton2
1University of Washington, Seattle, WA; 2University of Maryland, College Park, MD
Creating data visualizations requires diverse skills including computer programming, statistics, and graphic design. Visualization practitioners, often formally trained in one but not all of these areas, increasingly face the challenge of reconciling, integrating and prioritizing competing disciplinary values, norms and priorities. To inform multidisciplinary visualization pedagogy, we analyze the negotiation of values in the rhetoric and affordances of two common tools for creating visual representations of data: R and Adobe Illustrator. Features of, and discourse around, these standard visualization tools illustrate both a convergence of values and priorities (clear, attractive, and communicative data-driven graphics) side-by-side with a retention of rhetorical divisions between disciplinary communities (statistical analysis in contrast to creative expression). We discuss implications for data-driven work and data science curricula within the current environment where data visualization practice is converging while values in rhetoric remain divided.
Modeling the process of information encountering based on the analysis of secondary data
T. Jiang1,2, S. Fu1, Q. Guo1, E. Song1
1School of Information Management, Wuhan University, People's Republic of China; 2Center for Studies of Information Resources, Wuhan University, Wuhan, Hubei, People's Republic of China
The critical incident technique (CIT) has been applied extensively in research on information encountering (IE), and abundant IE incident descriptions have accumulated in the literature. This study used these descriptions as secondary data for the purpose of creating a general model of the IE process. The grounded theory approach was employed to systematically analyze the 279 IE incident descriptions extracted from 14 IE studies published since 1995. In total, 230 conceptual labels, 33 subcategories, and 9 categories were created during the data analysis process, which led to one core category, i.e., “IE process”. A general IE process model was established as a result to demonstrate the relationships among the major components, including environments, foreground activities, stimuli, reactions, examination of information content, interaction with encountered information, valuable outcomes, and emotional states before/after encountering. This study not only enriches the understanding of IE as a universal information phenomenon, but also shows methodological significance by making use of secondary data to lower cost, enlarge sample size, and diversify data sources.