Data Sources: This study analyses earnings call transcripts from S&P Global Market Intelligence's Machine Readable Transcripts database, supplemented with financial data from S&P Capital IQ, stock returns from S&P Market Data, and analyst information from S&P Capital IQ Estimates. These proprietary datasets are subject to commercial licensing restrictions and cannot be publicly shared by the authors. Researchers can obtain access through S&P Global Market Intelligence (https://www.marketplace.spglobal.com) or through institutional subscriptions where available.

Methodology Replication: Implementation code demonstrating our textual analysis approach is publicly available at: https://github.com/Snowflake-Labs/sfguide-s-and-p-market-intelligence-analyze-earnings-transcripts-in-cortex-ai/blob/main/0_start_here.ipynb

The repository demonstrates the complete feature creation process for both proactiveness and on-topic alignment measures, including:
- LLM-based benchmark answer generation for proactiveness measurement
- Vector embedding procedures using the snowflake-arctic-embed-m model
- Cosine similarity calculations for both proactiveness (question vs. LLM answer) and on-topic alignment (question vs. executive answer)
- RAG implementation with 60% context retrieval optimization

Factor Construction Details: The complete factor construction procedure is detailed in Sections 3.2.1-3.2.2:
- Question-level cosine similarities are averaged within each earnings call to produce call-level scores
- Call-level scores are resampled to monthly frequency using 4-month rolling windows to ensure sufficient observations per firm-month
- For firms with multiple calls in the formation window, scores are averaged
- The resulting firm-month panel contains one observation per firm per month

Portfolio and Statistical Analysis: Portfolio construction and statistical testing follow standard empirical asset pricing methodologies detailed in Section 3.3, including sector-neutral quintile formation, monthly equal-weighted rebalancing, factor-adjusted active returns, and Fama-MacBeth cross-sectional regressions. These procedures are widely documented in the finance literature and can be implemented by researchers with standard econometric tools, given textual scores at the question-answer pair level.

Reproducibility: Researchers with access to S&P Global data can:
- Implement the question-answer similarity calculations using the approach demonstrated in our public repository
- Aggregate to call and monthly levels following the specifications in Sections 3.2.1-3.2.2
- Apply the portfolio methodologies specified in Section 3.3
- Replicate our empirical findings

Disclosure: The lead author is employed by S&P Global Market Intelligence. The public repository demonstrates our core methodological innovation (semantic similarity measurement using LLMs), while complete end-to-end implementation details are provided in the manuscript methodology sections. Due to commercial licensing agreements, the underlying data cannot be shared directly by the authors but is accessible through the channels described above.
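
Illustrative sketch: The aggregation steps listed under Factor Construction Details, together with the sector-neutral quintile formation mentioned under Portfolio and Statistical Analysis, can be sketched in a few lines of standard-library Python. This is not the authors' implementation. All function names are hypothetical, embeddings are represented as plain lists of floats rather than snowflake-arctic-embed-m output, the 4-month window is assumed to be trailing and to include the current month, and the rank-based bin rule is one common convention rather than the one specified in Section 3.3.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def call_level_score(pairs):
    """Average question-level similarities within one earnings call.

    `pairs` is a list of (question_embedding, answer_embedding) tuples.
    """
    return mean(cosine_similarity(q, a) for q, a in pairs)


def firm_month_scores(calls, window=4):
    """Build a firm-month panel from call-level scores.

    `calls` is an iterable of (firm, year, month, call_score).
    ASSUMPTION: the window is trailing and includes the current month.
    Every call by the firm inside the window is averaged, so each
    (firm, year, month) key carries exactly one score.
    """
    by_firm = defaultdict(list)
    for firm, year, month, score in calls:
        by_firm[firm].append((year * 12 + month - 1, score))

    panel = {}
    for firm, obs in by_firm.items():
        first = min(m for m, _ in obs)
        last = max(m for m, _ in obs) + window - 1
        for m in range(first, last + 1):
            in_window = [s for cm, s in obs if m - window < cm <= m]
            if in_window:  # skip months with no call in the window
                panel[(firm, m // 12, m % 12 + 1)] = mean(in_window)
    return panel


def sector_neutral_bins(scores, sectors, n_bins=5):
    """Rank firms within each sector for one month and assign bins.

    `scores` maps firm -> score, `sectors` maps firm -> sector label.
    Bin 1 holds the lowest scores in a sector, bin `n_bins` the highest.
    ASSUMPTION: bins are assigned by within-sector rank.
    """
    by_sector = defaultdict(list)
    for firm, score in scores.items():
        by_sector[sectors[firm]].append((score, firm))

    bins = {}
    for members in by_sector.values():
        members.sort()
        n = len(members)
        for rank, (_, firm) in enumerate(members):
            bins[firm] = rank * n_bins // n + 1
    return bins


# Toy usage with made-up numbers (not data from the study).
example_calls = [("A", 2023, 1, 0.25), ("A", 2023, 4, 0.75)]
example_panel = firm_month_scores(example_calls)
print(sorted(example_panel.items()))
```

Ranking within each sector before forming portfolios keeps every quintile's sector mix close to that of the universe, so differences in portfolio returns are less likely to reflect sector tilts in the textual scores.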