Leveraging Large Language Models for Earnings Report Analysis
In the rapidly evolving landscape of financial technology, artificial intelligence (AI) is revolutionizing the way investors analyze and interpret market data. One particularly promising application is the use of Large Language Models (LLMs) to analyze earnings reports, a critical component of investment decision-making. This article explores how LLMs can be leveraged to structure unstructured data from earnings reports, potentially leading to more accurate investment predictions.

As discussed in our previous article on AI and investment partnerships, one of the key advantages of applying LLMs to investment applications is their ability to structure unstructured data. Here, we'll delve into a case study of how this approach was used to analyze earnings reports for a small hedge fund, achieving a prediction accuracy of approximately 65%.


The Challenge of Timely Earnings Analysis

The earnings reporting process typically follows a specific sequence:
  1. Press release (unstructured data)
  2. Earnings call with management (unstructured data)
  3. Official SEC report (structured data)
While the final SEC report has a relatively standardized structure, the initial press release and earnings call transcript are often in a less structured format. Given that most price movements occur before the SEC report is published, it's crucial for investors to react quickly to the initial information.

The hedge fund aimed to achieve three main objectives:
  • Speed
    Rapidly generate signals for significant deviations from expectations
  • Accuracy
    Improve the accuracy of the report analysis
  • Coverage
    Automate the analysis of reports for companies not covered by analysts


Overcoming Initial Hurdles

The first step in the process is obtaining the press release as soon as it's published. Companies are required to distribute these through news wire services such as GlobeNewswire and PRNewswire.
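As an illustration, a minimal polling loop might look like the sketch below. It assumes the feedparser library and a placeholder feed URL; the actual endpoints, authentication, and ticker filtering depend on the newswire subscription being used.

```python
# Minimal sketch of a press-release poller. The feed URL is a placeholder;
# real GlobeNewswire/PRNewswire endpoints depend on your subscription and
# the companies you track.
import time
import feedparser

FEED_URL = "https://example.com/newswire/earnings.rss"  # placeholder
seen_ids = set()

def poll_feed():
    """Return press releases that have appeared since the last poll."""
    feed = feedparser.parse(FEED_URL)
    new_entries = []
    for entry in feed.entries:
        entry_id = entry.get("id", entry.get("link"))
        if entry_id not in seen_ids:
            seen_ids.add(entry_id)
            new_entries.append(entry)
    return new_entries

while True:
    for release in poll_feed():
        # Hand off to the analysis pipeline as soon as the release appears
        print(release.get("title"), release.get("link"))
    time.sleep(5)  # poll interval; latency matters here
```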

A naive approach would be to feed the entire press release into an LLM and expect it to output key metrics (EPS, revenue, guidance, etc.) and provide an assessment of whether the report would be perceived positively or negatively by investors. However, this method is bound to fail, as the model lacks the context of analyst expectations to determine whether the reported figures are better or worse than anticipated.
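For concreteness, the naive version might look like the following sketch, using the OpenAI chat API purely as an example; the model name and prompt wording are placeholders, not the fund's actual setup. Note that nothing in the prompt tells the model what analysts expected.

```python
# Naive approach: hand the raw press release to the model and ask for a verdict.
# Model name and prompt wording are illustrative placeholders.
from openai import OpenAI

client = OpenAI()

def naive_analysis(press_release_text: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder model
        messages=[
            {"role": "system",
             "content": "Extract EPS, revenue and guidance from the earnings "
                        "press release and say whether investors are likely to "
                        "react positively or negatively."},
            {"role": "user", "content": press_release_text},
        ],
    )
    # Without analyst expectations in the context, the verdict is guesswork.
    return response.choices[0].message.content
```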


Enhancing Context for Improved Results

By incorporating analyst expectations for metrics like revenue and EPS into the model's context, the results become significantly more accurate. However, this approach revealed another challenge: even the most advanced LLMs (such as GPT-4 and Claude 3.5) struggle with basic arithmetic comparisons between reported figures and analyst expectations.

Despite this limitation, the results were still impressive, as arithmetic errors occurred relatively infrequently.
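A sketch of the enriched call is shown below, again with a placeholder model and assumed field names: the consensus figures are injected into the prompt and the model is asked to return structured JSON. This is an illustrative assumption about how the context can be packaged, not the exact production prompt.

```python
# Same idea, but with consensus expectations placed in the context.
# Field names and the consensus source are assumptions for illustration.
import json
from openai import OpenAI

client = OpenAI()

def analysis_with_expectations(press_release_text: str, consensus: dict) -> dict:
    """consensus: e.g. {"eps": 1.32, "revenue": 2.41e9, "guidance": "..."}"""
    prompt = (
        "Analyst consensus expectations:\n"
        f"{json.dumps(consensus, indent=2)}\n\n"
        "Earnings press release:\n"
        f"{press_release_text}\n\n"
        "Return JSON with the reported eps, revenue, guidance, and an "
        "assessment of whether the results beat or missed expectations."
    )
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder model
        response_format={"type": "json_object"},
        messages=[{"role": "user", "content": prompt}],
    )
    return json.loads(response.choices[0].message.content)
```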


Further Refinements

To enhance the model's performance, several additional steps can be taken:
Enrich the context with:
  • Information from previous earnings reports
  • Recent news analysis
  • Earnings results from competitors and related industries
Implement an iterative process:
  • Extract the reported figures from the press release using the LLM
  • Compare the extracted figures with analyst expectations
  • Feed back context describing the relationship between the published figures and expectations (e.g., "higher than expected," "significantly lower"); a sketch of this step follows below
  This gives the LLM natural-language information, which it can process more effectively than raw numbers.
Train a separate model:
  Train a dedicated model on the numerical data, using the LLM's analysis as an additional input.
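One possible shape of the iterative step is sketched below: the LLM is only asked to extract raw figures, the arithmetic comparison is done in plain Python, and the result is converted back into natural-language labels that can be appended to a second prompt or used as features for a downstream model. The thresholds and labels are illustrative assumptions.

```python
# Sketch of the iterative refinement: do the arithmetic outside the LLM,
# then feed natural-language comparisons back in. Thresholds are illustrative.
def describe_surprise(reported: float, expected: float,
                      big_move: float = 0.05) -> str:
    """Turn a reported-vs-consensus pair into a natural-language label."""
    if expected == 0:
        return "no consensus available"
    surprise = (reported - expected) / abs(expected)
    if surprise > big_move:
        return f"significantly higher than expected ({surprise:+.1%})"
    if surprise > 0:
        return f"slightly higher than expected ({surprise:+.1%})"
    if surprise < -big_move:
        return f"significantly lower than expected ({surprise:+.1%})"
    if surprise < 0:
        return f"slightly lower than expected ({surprise:+.1%})"
    return "in line with expectations"

def build_comparison_context(extracted: dict, consensus: dict) -> str:
    """extracted/consensus: e.g. {"eps": 1.41, "revenue": 2.5e9}"""
    lines = [
        f"{metric}: {describe_surprise(extracted[metric], consensus[metric])}"
        for metric in consensus
        if metric in extracted
    ]
    return "\n".join(lines)

# The resulting text ("eps: significantly higher than expected (+6.8%)", ...)
# goes into a second LLM pass, or serves as input to a separate model.
```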


Achievable Results and Monetization Strategies

Using this approach, we achieved a prediction accuracy of 65%. There's potential for further improvement, particularly by training a multi-modal transformer, although budget constraints limited our ability to explore this option in the current project.

It's important to note that directly profiting from this analysis is challenging due to processing time. Even with local inference, the LLM analysis takes several seconds, which is too slow to compete with high-frequency trading (HFT) firms that can react to significant events much faster.

However, the insights gained from this analysis can be valuable for identifying post-earnings drift, increased volatility, or excessive price movements, which can present profitable trading opportunities.


Conclusion

The application of Large Language Models to earnings report analysis represents a significant advancement in AI-driven investment strategies. While challenges remain, particularly in processing speed and arithmetic capabilities, the potential for improved accuracy and automation in financial analysis is substantial.

As AI technology continues to evolve, we can expect even more sophisticated applications in the investment world, potentially leveling the playing field between large institutions and smaller investors.