- We examined three approaches to creating a combined ESG score: equal weighting; optimization using historical data; and industry-specific weights, represented by MSCI ESG Ratings.
- In the short term, we found that both equal-weighted and optimized approaches performed better because they had higher exposures to Governance Key Issues.
- Over our 13-year study period, however, an industry-specific weighted approach that changed weightings over time showed the strongest financial performance.
Our recent research suggests that environmental (E) and social (S) issues were more industry-specific and tended to show up in financial measures over a longer time frame than governance (G) issues. What were the implications for investors in combining E, S and G issues into an aggregate ESG score or rating?
In this blog post, we investigated three approaches to creating a combined ESG score or rating: equal weighting; an optimized approach that sets weights based on historical data; and industry-specific weights as represented by MSCI ESG Ratings. Our results highlight a trade-off for investors in creating an aggregate ESG score: The weighting scheme that achieved the strongest significance in the short term (one-year correlation to key financial variables) showed the worst stock-price performance over the long term (cumulative stock-price returns over 13 years).
We used the scores that underlie MSCI’s ESG Ratings from December 2006 to December 2019 to construct our test of these alternative ESG scores. Specifically, we used different methodologies for weighting the Key Issue scores that are categorized under the Environmental (E pillar score), Social (S pillar score) and Governance (G pillar score).
Approach #1: Equal Weights
Equal weighting has the benefit of being simple, transparent and more comparable across industries. If an investor does not have specific views about the relative importance of environmental, social or governance issues (either in a static or dynamic approach), then this “naïve” method could be appropriate.
For equal weighting, we computed a monthly aggregate ESG score for each company between December 2006 and December 2019, made up of one-third E Key Issue scores (E pillar score), one-third S Key Issue scores (S pillar score) and one-third G Key Issue scores (G pillar score).
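The equal-weighted aggregation can be sketched in a few lines of Python. This is only an illustration: the function name `combine_pillars` and the sample pillar scores are hypothetical and are not taken from the MSCI methodology.

```python
def combine_pillars(e, s, g, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted average of the E, S and G pillar scores.

    The default weights give the equal-weighted ("naive") score.
    """
    w_e, w_s, w_g = weights
    if abs(w_e + w_s + w_g - 1.0) > 1e-9:
        raise ValueError("pillar weights must sum to 1")
    return w_e * e + w_s * s + w_g * g


# Hypothetical pillar scores for one company in one month.
equal_weighted_score = combine_pillars(e=6.0, s=4.5, g=7.5)
print(round(equal_weighted_score, 2))
```

Because the weights are passed as a parameter, the same helper can be reused for any other fixed set of pillar weights.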
Approach #2: Backtested Weights
Similarly, an optimized weighting based on historical data may help investors who do not have a specific view to instead “let the data speak” in choosing the optimal E, S and G weights, based on their historical significance.
Creating an optimized ESG score was more complicated than the simple equal-weighting approach. The first step in the optimization was to determine which target financial variable best represented investor objectives. For example, choosing historical stock-price performance as the target could result in the best possible historical (in-sample) stock-price performance. However, as ESG ratings are designed to reflect the financial resilience of companies to long-term environmental, social and governance risks, optimizing the rating against short-term historical stock-price movements could limit its value.
Therefore, we chose company fundamental data to represent investor objectives. Specifically, we looked for a combination of pillar scores that maximized the economic effect via the three transmission channels that we identified in previous research:
- The cash-flow channel, whereby companies better at managing intangible capital (such as employees) may have been more competitive and hence more profitable over time.
- We selected gross profitability as the target financial variable.1
- Idiosyncratic risk, whereby companies with stronger risk-management practices may have experienced fewer incidents, such as accidents, that triggered unanticipated costs.
- We selected residual volatility2 as the target financial variable.
- Systematic risk, whereby companies that used resources more efficiently may have been less susceptible to market shocks such as fluctuations in energy prices.
- We selected systematic volatility as the target financial variable.
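To make the three target variables comparable, they can be put on a common z-score scale and averaged. The sketch below is a minimal illustration, not the procedure used in the study: the function names, the sample values and the choice to negate the two volatility measures (so that a higher value is "better" for all three channels) are assumptions.

```python
from statistics import fmean, pstdev


def zscores(values):
    """Cross-sectional z-scores: subtract the mean, divide by the population std dev."""
    mu, sigma = fmean(values), pstdev(values)
    if sigma == 0:
        raise ValueError("z-score is undefined when all values are equal")
    return [(v - mu) / sigma for v in values]


def composite_target(profitability, residual_vol, systematic_vol):
    """Average the three channels after z-scoring each one.

    Assumption: both volatilities are negated so that lower risk raises the target.
    """
    z_p = zscores(profitability)
    z_r = zscores(residual_vol)
    z_s = zscores(systematic_vol)
    return [(p - r - s) / 3 for p, r, s in zip(z_p, z_r, z_s)]


# Hypothetical values for three companies.
composite = composite_target(
    profitability=[0.1, 0.2, 0.3],
    residual_vol=[0.3, 0.2, 0.1],
    systematic_vol=[0.3, 0.2, 0.1],
)
print([round(c, 3) for c in composite])
```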
To reduce the risk of overfitting a model to a specific data sample, we limited the number of parameters and chose constant weights for the E, S and G pillars throughout the study period and across industries. Therefore, this approach optimized only two pillar weights (the third weight is given by the constraint that the weights have to add up to 100%).3 The contour lines in the exhibit below illustrate the average difference in fundamental variables between companies with the best and worst combined scores as a function of E, S and G pillar weights.
Backtesting ESG Pillar Weight Combinations
Source: MSCI ESG Research LLC. The charts display the average Q5 (top quintile) - Q1 (bottom quintile) differences for profitability, residual volatility and systematic volatility as a function of the pillar weights. Data are from December 2006 to December 2019 for the MSCI World Index.
Our results show that putting the most weight on the Governance pillar and the least weight on the Social pillar resulted in the greatest improvement in exposure to financial variables in the top quintile (Q5) over the bottom quintile (Q1). To arrive at final weights, we constructed a target variable that is the average of the three financial variables. The optimization maximized the Q5 - Q1 difference in this three-channel average score, yielding weights of 25% E pillar, 5% S pillar and 70% G pillar.
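A search of this kind can be sketched as a simple grid over E and S weights, with the G weight fixed by the constraint that the three sum to 100%. The code below is a simplified illustration under stated assumptions: it works on a single cross-section rather than averaging over months, the 5% step size is arbitrary, the composite target is assumed to be precomputed, and the names `q5_minus_q1` and `best_weights` are hypothetical.

```python
from itertools import product
from statistics import fmean


def q5_minus_q1(scores, target):
    """Mean target of the top score quintile minus that of the bottom quintile."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n = len(order) // 5
    if n == 0:
        raise ValueError("need at least five companies")
    bottom, top = order[:n], order[-n:]
    return fmean(target[i] for i in top) - fmean(target[i] for i in bottom)


def best_weights(pillars, target, step=5):
    """Grid-search (E, S, G) weights in `step`-percent increments.

    pillars: list of (e, s, g) pillar scores, one tuple per company.
    target:  list of composite target values, one per company.
    Returns (largest Q5 - Q1 spread, weights that produced it).
    """
    best = None
    for e_pct, s_pct in product(range(0, 101, step), repeat=2):
        g_pct = 100 - e_pct - s_pct  # third weight follows from the constraint
        if g_pct < 0:
            continue
        w = (e_pct / 100, s_pct / 100, g_pct / 100)
        scores = [w[0] * e + w[1] * s + w[2] * g for e, s, g in pillars]
        spread = q5_minus_q1(scores, target)
        if best is None or spread > best[0]:
            best = (spread, w)
    return best


# Toy data: the target is driven entirely by the G pillar.
toy_pillars = [(9 - i, 5, i) for i in range(10)]
toy_target = [float(i) for i in range(10)]
toy_spread, toy_weights = best_weights(toy_pillars, toy_target)
print(toy_spread, toy_weights)
```

With only two free parameters, an exhaustive grid is cheap and avoids the need for a numerical optimizer.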
Approach #3: Industry-Specific Weights
The third approach of selecting and weighting E, S and G issues for each industry (the approach used in creating MSCI ESG Ratings) more precisely reflects industry exposures to E, S and G risks. However, it has the drawback of introducing complexity and less comparability across industries.
On average, each of the 158 Global Industry Classification Standard (GICS®)4 sub-industries uses six ESG Key Issues in assigning weights in the MSCI ESG Ratings. The selection of Key Issues and their respective weights are readjusted on an annual basis, through a process that combines quantitative assessment of industry exposures to emerging issues and wide consultation with investment practitioners.5
Using this process, weights have varied over time across sectors. During our 13-year study period, there were over 2,000 permutations of E, S and G weights. As of the end of 2019, the weight of the E pillar ranged from 5.8% for the communication services sector to 62.1% for utilities; the weight of the S pillar ranged from 16.3% for energy to 59.8% for the financials sector.
Over the 13-year period, the pillar weights averaged 30% for Environmental Key Issues, 39% for Social Key Issues and 31% for Governance Key Issues. These weights showed significant variation over time. The average G pillar weight increased from 19% in the first half of the sample period (2007-2012) to 25% in the second half (2013-2019), highlighting the increasing importance of governance issues over time.
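The mechanics of an industry-specific, time-varying scheme can be pictured as a lookup table keyed by sub-industry and year. The sketch below is purely illustrative: the sub-industry label, Key Issue names and weights are invented, and the table layout is an assumption rather than a description of how MSCI ESG Ratings are stored or computed.

```python
from collections import defaultdict

# Hypothetical table: (sub_industry, year) -> list of (key_issue, pillar, weight).
# All names and numbers are invented for illustration only.
KEY_ISSUE_WEIGHTS = {
    ("Sub-industry A", 2018): [
        ("E issue 1", "E", 0.30), ("E issue 2", "E", 0.20),
        ("S issue 1", "S", 0.15), ("S issue 2", "S", 0.10),
        ("G issue 1", "G", 0.15), ("G issue 2", "G", 0.10),
    ],
    ("Sub-industry A", 2019): [
        ("E issue 1", "E", 0.25), ("E issue 2", "E", 0.15),
        ("S issue 1", "S", 0.20), ("S issue 2", "S", 0.10),
        ("G issue 1", "G", 0.20), ("G issue 2", "G", 0.10),
    ],
}


def pillar_weights(sub_industry, year):
    """Sum Key Issue weights by pillar for one sub-industry in one year."""
    totals = defaultdict(float)
    for _, pillar, weight in KEY_ISSUE_WEIGHTS[(sub_industry, year)]:
        totals[pillar] += weight
    return dict(totals)


def industry_score(sub_industry, year, issue_scores):
    """Weighted average of Key Issue scores using that year's weights."""
    rows = KEY_ISSUE_WEIGHTS[(sub_industry, year)]
    return sum(weight * issue_scores[name] for name, _, weight in rows)


weights_2018 = pillar_weights("Sub-industry A", 2018)
weights_2019 = pillar_weights("Sub-industry A", 2019)
print(weights_2018, weights_2019)
```

A full table with one entry per sub-industry per year would hold 158 × 13 = 2,054 weight sets, in line with the "over 2,000 permutations" noted above.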
Comparison of Three Weighting Schemes
We compared the three approaches using the financial variables representing our three economic-transmission channels.
First, we took the difference between the top- and bottom-scoring companies (Q5 - Q1 difference) for each scoring approach and compared the significance of their average monthly correlation to the key financial variables (profitability, residual volatility and systematic volatility).
| | Equal Weight | Backtested Weights | Industry-Specific Weights (MSCI ESG Ratings) |
|---|---|---|---|
| Average weights (2006-2019) | E = 33.3%; S = 33.3%; G = 33.3% | E = 25%; S = 5%; G = 70% | E = 30%; S = 39%; G = 31% |
| Number of weighting schemes in the study period | 1 | 1 | Over 2,000 (158 industry-specific weights × 13 years) |
| Significance of monthly correlation to target financial variables (t-statistics) | Profitability (1.74); Residual vol (3.74); Systematic vol (3.33) | Profitability (…); Residual vol (3.92); Systematic vol (4.08) | Profitability (…); Residual vol (3.01); Systematic vol (2.67) |
Source: MSCI ESG Research LLC. Data from December 2006 to December 2019 for the MSCI World Index. Industry-specific weights range from 5.8% - 62.1% for the E pillar, 16.3% - 59.8% for the S pillar and 21.2% - 43.6% for the G pillar.
The backtested weighting approach showed the strongest significance, which was in line with our expectations as the weighting scheme was optimized against the target variables. The equal-weighted approach also showed slightly stronger results than the industry-specific approach of the MSCI ESG Rating.
None of this is surprising. As shown in previous research, the economic-transmission channels analysis uses a one-year period in evaluating exposure to profitability and risk. This short time span gave higher weights to Governance Key Issues, which reflected greater “event” risk. Both the optimized and equal-weighted approaches had greater Governance weights than the industry-specific approach.
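For readers who want to reproduce this kind of significance test, one common convention is a one-sample t-statistic on the series of monthly Q5 - Q1 differences. The sketch below assumes that convention; the exact test used in the study is not specified here, and the function name and sample values are hypothetical.

```python
from math import sqrt
from statistics import fmean, stdev


def t_statistic(monthly_spreads):
    """One-sample t-statistic: mean divided by its standard error."""
    n = len(monthly_spreads)
    if n < 2:
        raise ValueError("need at least two monthly observations")
    return fmean(monthly_spreads) / (stdev(monthly_spreads) / sqrt(n))


# Hypothetical monthly Q5 - Q1 differences in a z-scored target variable.
sample_spreads = [1.0, 2.0, 3.0]
sample_t = t_statistic(sample_spreads)
print(round(sample_t, 3))
```

This simple version treats the monthly observations as independent; serially correlated data would call for an adjusted standard error.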
Long-term Financial Significance
But what about over a longer time frame? When we compared the long-term stock-price performance of these three approaches, the “horse race” flipped. Over the 13-year study period, the industry-specific approach represented by the MSCI ESG Rating outperformed the equal-weighted ESG score and the backtest-weighted ESG score by 7.4% and 11.1%, respectively (see exhibit below).
The exhibit below shows the stock-price performance difference between the top-quintile companies and the bottom-quintile companies for each of the three approaches. We found that the industry-specific weighted approach represented by the overall MSCI ESG scores correlated with better stock performance during the 13-year study period and showed a lower level of cyclicality.
Cumulative Performance of Q5 - Q1 Quintile Portfolios (in Local Currency)
Source: MSCI ESG Research LLC. Data from December 2006 to December 2019 for the MSCI World Index. Comparison of MSCI ESG Industry-Adjusted scores, equal-pillar-weighted scores and optimized ESG scores.
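A cumulative Q5 - Q1 performance line of this kind can be built by compounding the monthly return difference between the two quintile portfolios. The sketch below assumes that construction (the exhibit's exact calculation is not described here) and uses hypothetical returns.

```python
def cumulative_long_short(top_returns, bottom_returns):
    """Cumulative performance of a Q5 - Q1 portfolio, compounded monthly.

    Assumption: each month's return is the top-quintile return minus the
    bottom-quintile return, and the differences are compounded over time.
    """
    if len(top_returns) != len(bottom_returns):
        raise ValueError("return series must have the same length")
    level = 1.0
    path = []
    for r_top, r_bottom in zip(top_returns, bottom_returns):
        level *= 1.0 + (r_top - r_bottom)
        path.append(level - 1.0)
    return path


# Hypothetical monthly returns for the top and bottom quintiles.
path = cumulative_long_short([0.02, 0.01, -0.01], [0.01, 0.01, -0.02])
print([round(p, 4) for p in path])
```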
When looking at long-term financial significance, we found that Social and Environmental Key Issues became more important, as they have tended to unfold more slowly over time. Our recent research suggests that ESG issues may reflect two types of risk: event risk, which can precipitate short-term falls in stock price, and erosion risk to companies’ long-term competitiveness, which can gradually depress performance over time.
However, taking a long-term view may not reveal the full story. After all, the equal-weighted ESG score had nearly the same average weight distribution to E, S and G as the MSCI ESG score.
A key difference is the industry specificity of the MSCI ESG score. Under the hood, both the selection of ESG issues and the setting of their weights for each of the 158 GICS sub-industries were adjusted annually. The shifting balance between E, S and G Key Issues might help explain the superior long-term financial performance of this dynamic approach, compared to static weighting schemes.
Investors aiming to integrate ESG factors to achieve better long-term financial results have often overlooked how critical the combination of individual ESG indicators has been to their usefulness.6
In the short term, we found that both equal-weighted and optimized approaches more heavily weighted governance issues, but that short-term correlations did not mean long-term financial significance. The reverse was true for an approach that adjusted the weights of E, S and G Key Issues dynamically by industry; this approach displayed strong financial performance over the long term at the expense of short-term correlations to key financial variables.
An optimization-based approach using historical data and a static target function was too simplistic and too backward-looking, as the key risks are anything but static. What is clear from this simple study is that weighting schemes can play an important role in fine-tuning ESG-rating methodologies, enhancing their forward-looking assessment of ESG risks and how such risks may be reflected in the rating model.
1As we used a z-score format (which creates a standard unit of measurement), we were able to average these three quintile differences in one aggregated target function.
2Based on residual returns from the Capital Asset Pricing Model (CAPM).
3In our research paper, we also performed a more complex optimization with sector-specific weights for E, S and G. However, each sector contained roughly 10% of the total stocks, which widened the confidence bounds of the optimized weights roughly threefold. Therefore, we decided to use industry-agnostic optimization for our comparison.
4The Global Industry Classification Standard (GICS®) was jointly developed by MSCI and Standard & Poor’s.
5ESG Ratings Methodology. See also “2020 ESG Ratings Model Consultation.” (Client access only.) MSCI ESG Research’s annual consultation solicits feedback from its institutional-investor clients on proposals to enhance the ratings methodology and recalibrate industry-specific inputs.
6This report may contain analysis of historical data, which may include hypothetical, backtested or simulated performance results. There are frequently material differences between backtested or simulated performance results and actual results subsequently achieved by any investment strategy. The analysis and observations in this report are limited solely to the period of the relevant historical data, backtest or simulation. Past performance — whether actual, backtested or simulated — is no indication or guarantee of future performance. None of the information or analysis herein is intended to constitute investment advice or a recommendation to make (or refrain from making) any kind of investment decision or asset allocation and should not be relied on as such.