Leveraging Language Models to Capture Investment Strategies

October 29, 2025
Key findings
  • As equity investors adapt to shifting market and economic conditions, they increasingly need to evaluate company and portfolio exposures to new and emerging sources of risk and return.
  • Large language models offer quick and easy assessments, but in practice, they can suffer from inconsistency, inaccuracy and hallucinations, posing challenges for robust investment analysis. 
  • Our approach applies reasoning through “opposing agents” to generate more accurate, transparent and resilient exposure estimates. Applied to public data (news or filings), this approach makes strategy development faster, more intuitive and empirically grounded.

As AI euphoria and macro turmoil shape global equity markets in 2025, many investors are turning to thematic investment strategies that cut across traditional country and sector boundaries. They seek focused portfolio exposure to the influential trends they expect to determine market risk and return. Since 2018, MSCI has modeled such trends with a keyword-based matching approach that draws on a pool of synonyms, generated by natural language processing (NLP), for the products and services associated with each trend.

However, the growing capabilities of generative AI and large language models (LLMs) promise a much more immediate and holistic assessment of the economic linkage between a company and an investment strategy. In this blog post, we describe an “opposing AI agent” framework for using an LLM to estimate this exposure. The approach is robust and resilient and, by design, provides transparency on the rationale behind any association.

From facts to feelings? 

Popular AI tools like ChatGPT make it easy to gather handy facts and plausible statements about a company. However, basing a comprehensive company-level scoring system on this kind of “fact gathering” is fraught with difficulty. Such systems typically rely on extended prompts that include examples of the proposed scheme as “context,” and they are prone to common errors and hallucination, as well as inconsistency across equity universes and over time. Instead, we convert the fact-finding problem into a sentiment-tagging classification problem. This is a task LLMs excel at, and it allows us to decouple the question of whether a company aligns with a particular investment idea from the mechanics of scoring.

Opposing agents power a sentiment-based classification of stocks

Each tick or cross shows whether the segment description passed the LLM evaluation for the corresponding investment idea. The Overall column shows that a segment must pass both tests in sequence.

In the figure above, we demonstrate this idea in practice. In the top-left purple box, we have (an excerpt of) the description of the Finished Good division of Rainbow Robotics. Suppose we want to build a strategy around Humanoid Robots. To make an investable strategy, we describe in plain language what is in-scope and what is out-of-scope for Humanoid Robot manufacturing. We then use our “opposing agent” approach. We first build a prompt, rich with context, for the Positive Agent that asks, “Is this business division working on in-scope activities?” The company needs to pass this test. We then build another prompt to oppose this view for the Skeptic Agent. It asks, “Is this business division more aligned with in-scope or out-of-scope activities?” The Skeptic Agent is a key counterbalance that helps deliver the robustness required for an investment process. This two-agent approach also generates transparency by design.1
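The two-step test described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not MSCI's implementation: the `LLM` callable, the `Verdict` class, the prompt wording and the YES/NO answer format are all hypothetical, and `stub_llm` is a keyword rule that stands in for a real model call so the sketch runs on its own. The segment texts are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical interface: any function that maps a prompt to the model's text reply.
LLM = Callable[[str], str]


@dataclass
class Verdict:
    positive_pass: bool
    skeptic_pass: bool
    rationale: List[str] = field(default_factory=list)

    @property
    def overall(self) -> bool:
        # A segment is linked only if it passes both tests.
        return self.positive_pass and self.skeptic_pass


def assess_segment(segment: str, in_scope: str, out_of_scope: str, llm: LLM) -> Verdict:
    # Step 1: the Positive Agent checks for in-scope activity.
    positive_reply = llm(
        f"In scope:\n{in_scope}\n\n"
        f"Business division:\n{segment}\n\n"
        "Is this business division working on in-scope activities? "
        "Answer YES or NO, then explain briefly."
    ).strip()
    rationale = [f"Positive Agent: {positive_reply}"]
    if not positive_reply.upper().startswith("YES"):
        return Verdict(False, False, rationale)

    # Step 2: the Skeptic Agent weighs in-scope against out-of-scope.
    skeptic_reply = llm(
        f"In scope:\n{in_scope}\n\n"
        f"Out of scope:\n{out_of_scope}\n\n"
        f"Business division:\n{segment}\n\n"
        "Is this business division more aligned with in-scope or "
        "out-of-scope activities? Answer IN-SCOPE or OUT-OF-SCOPE, "
        "then explain briefly."
    ).strip()
    rationale.append(f"Skeptic Agent: {skeptic_reply}")
    return Verdict(True, skeptic_reply.upper().startswith("IN-SCOPE"), rationale)


def stub_llm(prompt: str) -> str:
    # Stand-in for a real model call: a keyword rule, only to make the sketch runnable.
    segment = prompt.split("Business division:\n")[1].split("\n\n")[0].lower()
    if "more aligned" in prompt:
        return "IN-SCOPE: humanoid focus." if "humanoid" in segment else "OUT-OF-SCOPE: no humanoid focus."
    return "YES: robotics activity." if "robot" in segment else "NO: no robotics activity."


IN_SCOPE = "Design and manufacture of humanoid robots."
OUT_OF_SCOPE = "Industrial robot arms and general factory automation."

humanoid = assess_segment("Designs and sells bipedal humanoid robot platforms.", IN_SCOPE, OUT_OF_SCOPE, stub_llm)
arms = assess_segment("Sells collaborative robot arms for factories.", IN_SCOPE, OUT_OF_SCOPE, stub_llm)
pumps = assess_segment("Sells industrial vacuum pumps.", IN_SCOPE, OUT_OF_SCOPE, stub_llm)
print(humanoid.overall, arms.overall, pumps.overall)
```

Keeping each agent's reply in `rationale` is one simple way to preserve the transparency the approach aims for: every verdict carries the text that produced it.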

Scaling the assessment across large equity universes 

With structured prompts, we can derive a stand-alone description for each of the business lines of every stock in the MSCI ACWI Investable Market Index by splitting text from a company’s filings (or other business-information services). This augments the company’s GICS®-labeled segment breakdown, which also carries each segment’s percentage share of revenue.2

In practice, investment strategies are built around a set of related components that reflect a value chain or ecosystem. For example, we can build an investable “Advanced Robotics” strategy with components like Humanoids, Advanced Legged Robotics, Robotics Manipulation Platforms and Robotics Intelligence. For each component, we write in plain language what is in scope and what is out of scope, and thus, we generate a big-picture description of the investment concept.

For speed and compute/cost-efficiency, the overall description is used to screen the wider equity universe for companies that could plausibly have a positive exposure to the main investment idea.3 Rainbow Robotics would pass this test and join the screened subset that enters the LLM-powered workflow. For each company in the subset, we apply the “opposing agents” approach to link business lines with one or more components of the overall strategy. We can then immediately calculate the company’s exposure score by summing the percentage of revenue (or earnings or assets, as appropriate) from all such linked business lines. In this way, the LLM’s role is limited to identifying, by sentiment, the association of the company with the strategy idea; it does not perform any scoring. Separating these steps improves the accuracy and transparency of the approach as well as of the final scores.4
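As a rough sketch of the screening and scoring steps, the Python below filters a toy universe by cosine similarity and then sums the revenue shares of linked segments. Everything here is illustrative: the function names, the two-dimensional vectors, the 0.5 threshold and the segment percentages are assumptions made for the example, and a real workflow would use embeddings from a suitable model.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def screen_universe(
    strategy_vec: Sequence[float],
    company_vecs: Dict[str, Sequence[float]],
    threshold: float,
) -> List[str]:
    # Keep only companies whose description is close enough to the strategy description.
    return [
        name
        for name, vec in company_vecs.items()
        if cosine_similarity(strategy_vec, vec) >= threshold
    ]


def exposure_score(segments: List[Tuple[float, bool]]) -> float:
    # Each segment is (revenue percentage, linked?), where "linked" is the
    # two-agent verdict. The score is plain arithmetic; no LLM is involved.
    return sum(pct for pct, linked in segments if linked)


# Toy embeddings (hypothetical values, two dimensions for readability).
strategy = [1.0, 0.0]
universe = {"Company A": [0.9, 0.1], "Company B": [0.0, 1.0]}
shortlist = screen_universe(strategy, universe, threshold=0.5)

# Toy segment breakdown for a shortlisted company (hypothetical percentages).
score = exposure_score([(60.0, True), (25.0, False), (15.0, True)])
print(shortlist, score)
```

Because scoring is a simple sum over verdicts, any exposure figure can be traced back to the specific business lines that were linked.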

Assessing model accuracy 

Any assessment based on an LLM needs careful testing to protect the process from going awry. For our investment-strategy explorer, we modeled a range of themes for which we could obtain alternative, competing assessments of company-theme linkage for comparison, for example from external thematic experts or the MSCI Sustainability and Climate team.

Opposing-agent approach has high accuracy and power when assessed versus human expertise 

Each test set included stocks linked with the theme as well as unlinked names.
  • EV & Batteries: ACWI IMI Future Mobility Index constituents as of Nov. 30, 2024. Alternative assessment based on an external expert’s review of constituents.
  • Alternative Energy: ACWI IMI constituents from Renewable Electricity, Electric Utilities, Construction & Engineering and Heavy Electrical Equipment as of Nov. 30, 2024. Alternative assessment based on Alternative Energy Theme revenue > 1% from MSCI Solutions LLC.
  • Digitalization of Education: Top 500 most semantically similar stocks. Alternative assessment based on both a keyword-based approach and human review.
No. of Stocks = ACWI IMI stocks in the test set. % True Positives = % of the test set that both assessments classified as linked to the theme. % True Negatives = % of the test set that both assessments classified as not linked. % Assessment Differ = % where the human-in-the-loop (HITL) and model approaches differ.

The figure above assesses the accuracy of the “opposing agents” approach by comparing its output against an alternative assessment over hundreds of securities (the theme test set) and across three quite different themes. For each theme, the “opposing agents” approach and the alternative assessment agreed (or the stock was correctly classified by the “opposing agents” approach) on more than 95% of the stocks, and agreement was as high as 99% for a focused theme like Digitalization of Education.
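The agreement statistics reported in the figure can be computed from two lists of labels, one from the model and one from the alternative assessment. The sketch below assumes boolean labels (linked or not linked) and uses a hypothetical function name and toy data; the five labels are invented and are not results from the study.

```python
from typing import Dict, Sequence


def agreement_summary(model: Sequence[bool], reference: Sequence[bool]) -> Dict[str, float]:
    if not model or len(model) != len(reference):
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(model)
    # Both assessments say "linked".
    true_pos = sum(1 for m, r in zip(model, reference) if m and r)
    # Both assessments say "not linked".
    true_neg = sum(1 for m, r in zip(model, reference) if not m and not r)
    # Everything else is a disagreement.
    differ = n - true_pos - true_neg
    return {
        "n_stocks": n,
        "pct_true_positives": 100 * true_pos / n,
        "pct_true_negatives": 100 * true_neg / n,
        "pct_assessment_differ": 100 * differ / n,
    }


# Toy labels for five stocks (hypothetical, for illustration only).
model_labels = [True, True, False, False, True]
reference_labels = [True, True, False, False, False]
summary = agreement_summary(model_labels, reference_labels)
print(summary)
```

The three percentages always sum to 100, so the agreement rate is simply 100 minus the share of stocks where the two assessments differ.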

Sentiment matters 

Our analysis has outlined a new approach that can much more effectively and holistically assess the linkage of companies to a wide range of investment-strategy ideas and estimate other nontraditional exposures. The robust two-agent approach was used to establish company exposure based on filings-style information. Additionally, by replacing the series of segment descriptions with associated summarized news stories, we can equally create a grounded news-based attention score for an investment idea using the very same “alignment” engine.5 This sort of flexible modeling and exposure calculation is well aligned with investors’ demand for greater agility in their investment-strategy development in current markets. 



1 The use of multiple LLM agents “debating” the correctness of an assessment to reduce the risk of hallucination and error has become a popular topic in the last two years, but the complexity of such schemes can make them unscalable in practice. See, for example:

Y. Du, et al., “Improving Factuality and Reasoning in Language Models through Multiagent Debate,” Proceedings of the 41st International Conference on Machine Learning, 2024.

Y. Liu, et al., “GroupDebate: Enhancing the Efficiency of Multi-Agent Debate Using Group Discussion.”

2 Earnings and asset breakdowns are also accessible. GICS is the Global Industry Classification Standard, which was jointly developed by MSCI and S&P Dow Jones Indices.

3 We use, for example, a cosine similarity score with a suitable embedding to narrow the universe. We used this approach in the “semantic filter” in our recent blog on measuring material company exposure to tariffs.

4 We can repeat the two-agent assessment as a prudent guardrail. For different strategies, we have found that 95%-99% of segment assessments for selected securities are unanimous (positive or negative) and there are no “reversals” of the direction of the initial verdicts by each agent. 

5 We used this approach to news attention to measure the level of market and industry stress from material tariff risks in a recent blog post.

The content of this page is for informational purposes only and is intended for institutional professionals with the analytical resources and tools necessary to interpret any performance information. Nothing herein is intended to recommend any product, tool or service. For all references to laws, rules or regulations, please note that the information is provided “as is” and does not constitute legal advice or any binding interpretation. Any approach to comply with regulatory or policy initiatives should be discussed with your own legal counsel and/or the relevant competent authority, as needed.