Abstract: Share valuations are known to adjust to new information entering the market, such as regulatory disclosures. We study whether the language of such news items can improve short-term and especially long-term (24 months) forecasts of stock indices. For this purpose, this work utilizes predictive models suited to high-dimensional data and specifically compares techniques for data-driven and knowledge-driven dimensionality reduction in order to avoid overfitting. Our experiments, based on 75,927 ad hoc announcements from 1996–2016, reveal the following result: in the long run, text-based models succeed in reducing forecast errors below baseline predictions from historic lags at a statistically significant level. Our research has implications for business applications of decision support in financial markets, especially given the growing prevalence of index ETFs (exchange-traded funds).
Semantic filters:
data dredging, random forest classification
Topics:
decision support, electronic mail, decision making, decision support system, electronic trading
Methods:
experiment, machine learning, dimensionality reduction, data transformation, time series analysis
A Tree-Based Approach for Addressing Self-Selection in Impact Studies with Big Data
2016 | Management Information Systems Quarterly | Citations: 0
Abstract: In this paper, we introduce a tree-based approach adjusting for observable self-selection bias in intervention studies in management research. In contrast to traditional propensity score (PS) matching methods, including those using classification trees as a subcomponent, our tree-based approach provides a standalone, automated, data-driven methodology that allows for (1) the examination of nascent interventions whose selection is difficult and costly to theoretically specify a priori, (2) detection of heterogeneous intervention effects for different pre-intervention profiles, (3) identification of pre-intervention variables that correlate with the self-selected intervention, and (4) visual presentation of intervention effects that is easy to discern and understand. As such, the tree-based approach is a useful tool for analyzing observational impact studies as well as for post-analysis of experimental data. The tree-based approach is particularly advantageous in the analyses of big data, or data with large sample sizes and a large number of variables. It outperforms PS in terms of computational time, data loss, and automatic capture of nonlinear relationships and heterogeneous interventions. It also requires less user specification and fewer choices than PS, reducing potential data dredging. We discuss the performance of our method in the context of such big data and present results for very large simulated samples with many variables. We illustrate the method and the insights it yields in the context of three impact studies with different study designs: reanalysis of a field study on the effect of training on earnings, analysis of the impact of an electronic governance service in India based on a quasi-experiment, and performance comparison of contract pricing mechanisms and durations in IT outsourcing using observational data.
Semantic filters:
data dredging, random forest classification
Topics:
outsourcing, price management, big data, logistics management, business process outsourcing
Methods:
propensity score method, experimental group, logistic regression, simulation, survey