In business research, firm size is both ubiquitous and readily measured. Complexity, another firm-related construct, is also relevant but is difficult to measure and not well defined. As a result, complexity is less frequently incorporated in empirical designs. Firm segment counts or the readability of a firm’s financial filings are often used as proxies for some aspect of complexity. We argue that most extant measures of complexity are one-dimensional, have limited availability, and/or are frequently misspecified. Using both machine learning and an application-specific lexicon, we develop a text solution that is based on widely available data and that provides an omnibus measure of complexity. Three dependent variables are used that allow us to compare our measure with popular alternatives and to separate out the potential empirical overlap of size and complexity. Our proposed measure, used in tandem with 10-K file size, provides a useful proxy that dominates traditional measures.
D82 D83 G14 G18 G30 M40 M41
Firm complexity; textual analysis; Form 10-K; machine learning; lasso regression.