Conference Agenda
Overview and details of the sessions of this conference.
Session Overview
WE 02: Metaparameter-Sensitivity, Heuristics and Data Integration in Machine Learning
Presentations
Beyond Instinct: Exploring Revenue Forecasting with Heuristics and Machine Learning
1: Berlin International University of Applied Sciences, Germany; 2: Amazon Web Services; 3: Northeastern University

People’s predictions based on first impressions, also referred to as predictions from thin slices of data, can be surprisingly accurate. Investigating revenue predictions from thin slices, we analyze a simple expert algorithm, the “multiplier heuristic,” whereby revenues in the first t days are multiplied by a constant. Is there necessarily a trade-off between simplicity and predictive accuracy? Building on the bias–variance decomposition, we develop three conditions under which such a simple heuristic can match or even outperform more complex algorithmic prediction models. On 20 data sets, including 5 from the tech industry where the multiplier heuristic was originally applied, we show that a small sample size and a long observation duration provide performance advantages for the heuristic. Yet, given unpredictable changes over time, which can be characteristic of the tech industry, even a large sample size and a short observation duration may not yield any performance advantage for more complex algorithmic prediction models.

Integrating Large Citation Data Sets for Measuring Article’s Scientific Prestige
Zuse Institute Berlin, Germany

Evaluating scientific impact necessitates precise measurement of individual articles' impact, which is commonly assessed through metrics reliant on citation counts. However, these metrics are subject to limitations, notably susceptibility to manipulation within the scholarly community. Recently, there has been a shift towards utilizing knowledge distilled from citation graphs rather than relying solely on citation counts. This shift mandates access to a comprehensive citation graph for more reliable measurement.
In this study, we focus on methods for merging citation data sets, incorporating big data to construct a comprehensive citation graph. We present our implementation results for merging two extensive citation databases, containing more than 63 million and 98 million article records, respectively, alongside more than 953 million and 1.3 billion citations. During our implementation, handling big data presented significant challenges, including quality issues that stemmed from semi-structured data lacking universal identification numbers. Through meticulous deduplication efforts, we streamlined the merged database into a single consistent dataset. Our work led to a citation graph that portrays inter-article associations more accurately than graphs derived from single databases. The presentation outlines our approach to managing big data for constructing the merged citation graph, emphasizing the challenges and our remedies for dealing with them.

Sensitivity of Artificial Neural Networks for Metaparameters - an Empirical Evaluation on Sparse Data
iqast & Lancaster University, United Kingdom

The success of deep neural networks for pattern recognition in speech, image and text data promises preeminent accuracy in time series patterns for forecasting. However, a recent survey and consultancy reports indicate that over 50% of all AI forecasting projects with deep networks in industry fail. In an attempt to reconcile this discrepancy, we run a large-scale empirical study to assess the empirical accuracy of different deep and shallow neural network architectures on a real-world industry dataset from an FMCG manufacturer, using reliable error metrics, fixed multi-step horizons and multiple rolling time origins.
Our experiments indicate that the standard implementations of both deep and shallow neural network architectures, as well as recent data science methods including Facebook’s Prophet, Google’s bsts and machine learning methods such as XGBoost and random forests, fail to outperform established statistical benchmarks of Exponential Smoothing and ARIMA on monthly industry data. A careful review of the methods' setup in standard packages in R suggests the use of inferior meta-parameters and flawed methodologies, which limit the accuracy of these "vanilla" implementations. To remedy their shortcomings, we tune the meta-parameters of selected algorithms and assess the sensitivity of these advanced methods to poor initial standard meta-parameters in R and Python. As a result, we showcase how improved meta-parameters, as well as carefully engineered feature creation, feature transformation and feature selection, can significantly increase the accuracy and/or speed of both shallow and deep neural network architectures in industry practice, leading to successful implementations.
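The "multiplier heuristic" from the first abstract (predict total revenue by scaling the revenue of the first t days by a constant) can be sketched in a few lines. This is an illustrative reconstruction, not the authors' implementation: the toy data and the fitting rule (median of training ratios) are assumptions.

```python
# Hypothetical sketch of the "multiplier heuristic": total revenue is
# forecast as (revenue in the first t days) * constant. The fitting rule
# (median ratio over training histories) is an illustrative assumption.

def fit_multiplier(histories, t):
    """Estimate the constant as the median ratio of total revenue
    to revenue accumulated in the first t days."""
    ratios = sorted(sum(h) / sum(h[:t]) for h in histories if sum(h[:t]) > 0)
    return ratios[len(ratios) // 2]  # median (odd-length case)

def predict_total(first_t_revenue, multiplier):
    return first_t_revenue * multiplier

# toy training data: three complete daily-revenue histories
train = [
    [10, 12, 11, 9, 8, 7, 6],
    [5, 6, 7, 6, 5, 4, 4],
    [20, 18, 15, 12, 10, 8, 6],
]
m = fit_multiplier(train, t=2)
forecast = predict_total(8 + 9, m)  # new product earned 8 and 9 in days 1-2
```

A single fitted constant is what keeps the estimator's variance low, which is the lever behind the bias–variance argument in the abstract.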
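The citation-merging abstract describes deduplicating records that lack universal identification numbers. A minimal sketch of that idea, matching on a normalized (title, year) key, might look as follows; the matching rule and sample records are simplified assumptions, and the authors' actual deduplication pipeline is far more elaborate.

```python
# Illustrative sketch of merging two citation databases without shared
# identifiers: records are deduplicated on a normalized (title, year) key.
import re

def normalize(title):
    """Lowercase, strip punctuation, collapse whitespace."""
    return re.sub(r"\s+", " ", re.sub(r"[^\w\s]", "", title.lower())).strip()

def merge_records(db_a, db_b):
    merged = {}
    for record in db_a + db_b:
        key = (normalize(record["title"]), record["year"])
        merged.setdefault(key, record)  # keep the first occurrence
    return list(merged.values())

# toy records; the duplicate differs only in case, punctuation and spacing
db_a = [{"title": "Merging Citation Graphs.", "year": 2020},
        {"title": "A Survey of Metrics", "year": 2019}]
db_b = [{"title": "merging  citation graphs", "year": 2020},
        {"title": "Scholarly Impact Measures", "year": 2021}]
merged = merge_records(db_a, db_b)
```

At the scale quoted in the abstract (tens of millions of records, over a billion citations) such keys would be built and joined in a distributed setting, but the matching logic is the same.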
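The evaluation protocol named in the third abstract (fixed multi-step horizons over multiple rolling time origins) can be sketched generically. The seasonal naive benchmark and the MAE metric below are illustrative stand-ins, not the study's actual methods or data.

```python
# Sketch of rolling-origin evaluation with a fixed multi-step horizon:
# the forecast origin advances one step at a time, and errors over the
# full horizon are pooled across origins.

def rolling_origin_mae(series, forecaster, horizon, n_origins):
    """Mean absolute error of `forecaster(history) -> list[horizon]`
    pooled over n_origins successive forecast origins."""
    errors = []
    for i in range(n_origins):
        cut = len(series) - horizon - (n_origins - 1 - i)
        history, actual = series[:cut], series[cut:cut + horizon]
        forecast = forecaster(history)
        errors.extend(abs(f - a) for f, a in zip(forecast, actual))
    return sum(errors) / len(errors)

def seasonal_naive(history, horizon=3, season=12):
    """Benchmark: repeat the value observed one season earlier."""
    return [history[len(history) - season + h] for h in range(horizon)]

monthly = [100 + 10 * (t % 12) for t in range(48)]  # toy seasonal series
mae = rolling_origin_mae(monthly, lambda h: seasonal_naive(h, 3), 3, 4)
```

Averaging over several origins rather than a single train/test split is what makes the accuracy comparison in the study reliable on short monthly series.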
Conference: OR 2024
Conference Software: ConfTool Pro 2.6.151+TC © 2001–2024 by Dr. H. Weinreich, Hamburg, Germany