Measuring Firms’ Remote-Workforce Abilities

Blog post

Howard Zhang, Daniel R. Barrera, Manuel Rueda

July 14, 2020

The COVID-19 pandemic disrupted the operating models of many businesses and forced a shift to remote working, digitization and low-contact transactions and services, which we term "remote-operation capability" (ROC).
Using machine learning and natural language processing, we built a ROC factor. Companies with high exposure to our ROC factor outperformed the MSCI USA IMI by around 15 percentage points year to date through June 30.
Our hypothetical "combined" ROC portfolio, built from our three other ROC portfolios, had high exposure to the beta, growth and profitability factors and low exposure to dividend yield, value and long-term reversal.

The challenges that COVID-19 posed to corporations showed that some companies were better positioned than others to take advantage of a remote, automated and digitized operating environment. We used techniques from machine learning (ML) and natural language processing (NLP) to build a potential "remote-operation capability" (ROC) factor that seeks to estimate how likely a company was to thrive in this environment.

Constructing the ROC Factor

Our first step was employing the "topic modeling" approach to make the theme concrete. This is a research technology used in MSCI thematic indexes¹ and other products that leverage ML and NLP. We started with a set of "seed" words and phrases with strong, intuitive relations to the ROC theme (e.g., home working, remote work and telecommuting). We then used "word embedding" models² to expand the seed-word list to a larger "dictionary" of about 50 keywords and phrases. The following is the word cloud version of the keyword dictionary for ROC.
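The seed-expansion step can be pictured with a minimal sketch. It assumes each candidate term already has an embedding vector and ranks non-seed terms by cosine similarity to the nearest seed. The function names, the toy three-dimensional vectors and the candidate terms are all invented for illustration; they are not the models, vocabulary or similarity rule used in the research.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def expand_seeds(seeds, embeddings, top_k=2):
    """Return the seeds plus the top_k non-seed terms closest to any seed."""
    scores = {}
    for term, vec in embeddings.items():
        if term in seeds:
            continue
        # Score each candidate by its best similarity to any seed term.
        scores[term] = max(cosine(vec, embeddings[s]) for s in seeds)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return list(seeds) + ranked[:top_k]

# Toy vectors, invented for illustration only (hypothetical values).
TOY_EMBEDDINGS = {
    "remote work": [0.9, 0.1, 0.0],
    "telecommuting": [0.8, 0.2, 0.1],
    "video conferencing": [0.7, 0.3, 0.0],
    "cloud collaboration": [0.6, 0.4, 0.1],
    "oil drilling": [0.0, 0.1, 0.9],
}

dictionary = expand_seeds(["remote work", "telecommuting"], TOY_EMBEDDINGS, top_k=2)
print(dictionary)
```

In practice the vectors would come from a trained embedding model, and the expanded list would likely be reviewed by hand before use.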

Next, this dictionary became an input to the three approaches to factor construction that we tested:

Word count: We counted ROC keyword matches in the business section of a company's 10-K filing.
Semantic search: We identified a company's products and services from its 10-K filing's business section using semantic-role-labeling³ techniques and then counted ROC keyword matches for those products and services.
Concept exposure: We used a knowledge graph dataset⁴ that quantifies a company's exposures to high-level concepts, or themes, based on the co-occurrences of companies and those concepts; the centrality of the connections; and the links to similar concepts. Then we aggregated each company's exposure to concepts that contained any of our ROC keywords.
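Of the three approaches, the word count is the simplest to illustrate. The sketch below counts case-insensitive, whole-phrase keyword matches in a block of text and sums them into a raw exposure. The sample text, the three-keyword list and the function name are hypothetical; real input would be the business section of a 10-K filing and the full keyword dictionary, and the matching rules used in the research may differ.

```python
import re

def count_keyword_matches(text, keywords):
    """Count case-insensitive, whole-phrase matches of each keyword in text."""
    counts = {}
    for kw in keywords:
        # Word boundaries prevent matches inside longer words.
        pattern = r"\b" + re.escape(kw) + r"\b"
        counts[kw] = len(re.findall(pattern, text, flags=re.IGNORECASE))
    return counts

# Invented sample text standing in for a 10-K business section.
SAMPLE_TEXT = (
    "Our platform supports remote work for enterprise customers. "
    "Demand for telecommuting tools grew as Remote Work became common."
)
KEYWORDS = ["remote work", "telecommuting", "home working"]

matches = count_keyword_matches(SAMPLE_TEXT, KEYWORDS)
raw_exposure = sum(matches.values())  # total matches = raw word-count exposure
print(matches, raw_exposure)
```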

In the first two approaches, the count of keyword matches became our raw ROC-factor exposure. For the concept-exposure method, the aggregated exposure to concepts that contained any of our ROC keywords became our raw ROC-factor exposure. We normalized each raw ROC-factor exposure to avoid any outlier influence.⁵ Finally, we constructed hypothetical portfolios of the 250 stocks with the highest exposure to each of our ROC factors. We weighted the 250 stocks by the product of their normalized ROC exposure and market cap.⁶ The weights were then normalized to 100% and each issuer capped at 5% to reduce concentration. We also took a simple average of the normalized exposures from the three methods and constructed a fourth, "combined" portfolio in the same way as with each individual ROC-factor portfolio.
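The normalization and weighting steps can be sketched as follows. Two parts are assumptions rather than details from the post: the trimming is read as clipping each raw exposure at 95% of the maximum and rescaling by that ceiling, and the cap is applied by pinning over-cap names at the cap and spreading the excess pro rata across the rest. The function names and the five-stock toy data are hypothetical, and the toy example uses a 40% cap on four holdings because a 5% cap needs at least 20 names.

```python
def normalize(raw, trim=0.95):
    """Clip each raw exposure at trim * max, then rescale into [0, 1]."""
    ceiling = trim * max(raw.values())
    if ceiling <= 0:
        return {k: 0.0 for k in raw}
    return {k: min(v, ceiling) / ceiling for k, v in raw.items()}

def capped_weights(exposure, market_cap, top_n=250, cap=0.05):
    """Weight the top_n names by exposure * market cap, then cap each weight.

    Assumes the selected names have positive scores.
    """
    top = sorted(exposure, key=exposure.get, reverse=True)[:top_n]
    if cap * len(top) < 1.0:
        raise ValueError("cap is too tight for the number of holdings")
    score = {k: exposure[k] * market_cap[k] for k in top}
    total = sum(score.values())
    weights = {k: s / total for k, s in score.items()}
    capped = set()
    while True:
        over = {k for k, w in weights.items() if w > cap + 1e-12}
        if not over:
            return weights
        # Pin over-cap names at the cap and spread the excess pro rata.
        capped |= over
        free_total = sum(w for k, w in weights.items() if k not in capped)
        remaining = 1.0 - cap * len(capped)
        for k in weights:
            if k in capped:
                weights[k] = cap
            else:
                weights[k] = weights[k] * remaining / free_total

# Toy data, invented for illustration only.
raw = {"A": 40, "B": 10, "C": 5, "D": 5, "E": 2}
caps = {"A": 100.0, "B": 100.0, "C": 100.0, "D": 100.0, "E": 100.0}

normalized = normalize(raw)
weights = capped_weights(normalized, caps, top_n=4, cap=0.4)
print(weights)
```

Here stock A starts at roughly 66% of the portfolio, is pinned at the 40% cap, and the remaining 60% is split among B, C and D in proportion to their uncapped weights.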

Combined ROC Portfolio Outperformed the Individual Ones

We evaluated the performance of the four ROC portfolios over the year-to-date period through June 30. As we see in the exhibit below, the ROC portfolios performed similarly to one another and, in all cases, outperformed the MSCI USA Investable Market Index (IMI) benchmark portfolio. We also note that the combined ROC-factor portfolio had the largest outperformance.

Breaking Down the Combined ROC Portfolio's Outperformance

When we examined the combined ROC portfolio's active sector weights using the MSCI USA IMI as the benchmark, we found it overweighted the information-technology (IT) sector. This is not surprising, since many IT companies offer solutions that enable remote operations, and it makes sense that they mention related terms more often than companies in other sectors.

Data as of Dec. 31, 2019

We also examined the combined ROC portfolio's active exposures to style factors in MSCI's Barra US Total Market Equity Model for Long-Term Investors (USSLOW). We found it had high exposures to the beta, growth and profitability factors while it had low exposures to dividend yield, value and long-term reversal.

Data as of Dec. 31, 2019

While we can't know how companies with high exposure to our ROC factor will perform in the future, we believe that the techniques and data sources we described can be used to capture emerging or long-term themes as the world adjusts to new realities created by the COVID-19 pandemic.

The authors thank George Bonne, Stuart Doole, Neeraj Kumar and Gaurav Trivedi for their contributions to this blog post.


¹ Kumar, N., Doole, S., Garg, K., Bhalodia, V., and Ghate, D. 2019. “Indexing Change: Understanding MSCI Thematic Indexes.” MSCI Research Insight.

² Word embeddings are language-modeling techniques in NLP where words or phrases from a text “corpus” (group of documents) are mapped to numerical vectors representing related and co-occurring words, or to vectors of linguistic contexts in which the words occur. We used word2vec, sense2vec and BERT embeddings.

³ Semantic role labeling is a technique in NLP that detects the predicate-argument structure of sentences by analyzing the semantic role of words. For a detailed description, see: “Semantic role labeling.” Wikipedia.

⁴ We used concept-exposure data from Yewno, which leverages the Yewno Knowledge Graph to extract information from various content sources, including news, company filings, conference-call transcripts and patent filings, to provide scores that quantify directional exposures from entities to concepts.

⁵ To avoid outlier influence, we trimmed each raw ROC-factor exposure to 95% of the maximum value of that raw exposure. We then divided by the maximum value of each raw exposure so that the normalized exposures resided within [0, 1]. For each method, we used data available through the end of 2019.

⁶ We also evaluated other weighting schemes (equal weight, market-cap weight and exposure weight) and obtained similar results.

The content of this page is for informational purposes only and is intended for institutional professionals with the analytical resources and tools necessary to interpret any performance information. Nothing herein is intended to recommend any product, tool or service. For all references to laws, rules or regulations, please note that the information is provided “as is” and does not constitute legal advice or any binding interpretation. Any approach to comply with regulatory or policy initiatives should be discussed with your own legal counsel and/or the relevant competent authority, as needed.
