Comparative Analysis of Pre-trained Language Models for ESG Financial News Sentiment Classification and Market Signal Correlation
DOI: https://doi.org/10.66372/JGER.v3i1.5
Keywords: ESG sentiment analysis, pre-trained language models, financial NLP, market signal prediction
Abstract
Environmental, Social, and Governance (ESG) investing has become a central framework in modern capital markets, and the volume of ESG-related news is growing rapidly. Accurate sentiment extraction from ESG financial news underpins investment decision support, portfolio risk monitoring, and regulatory compliance analytics. This paper presents a systematic comparative study of pre-trained language models (PLMs) applied to ESG financial news sentiment classification, together with an empirical analysis of how well the resulting sentiment signals correlate with market returns. Experiments are conducted on three publicly available benchmarks: FinancialPhraseBank (4,840 sentences; 75%-agreement subset: 3,453), FiQA Task 1 (1,174 entries), and the Twitter Financial News Sentiment dataset (11,932 entries). The market correlation analysis draws on daily price data for S&P 500 constituents from January 2016 to July 2023, sourced from Yahoo Finance. The comparative evaluation covers lexicon-based methods (VADER, TextBlob, Loughran-McDonald) and transformer models, both general-purpose (BERT-base, RoBERTa-base) and domain-adapted (FinBERT, FinBERT-Tone). Domain-adapted PLMs consistently outperform lexicon-based baselines by 12–19 weighted F1 percentage points on FinancialPhraseBank. Sector-level market correlation analysis reveals significant heterogeneity: the Energy and Utilities sectors exhibit the strongest ESG sentiment–return linkages (Pearson r > 0.25), while Technology shows the weakest (r < 0.16). These results provide sector-specific guidance for ESG-integrated quantitative investment strategies.
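The two headline metrics in the abstract, weighted F1 and Pearson r, can be illustrated with a minimal Python sketch. This is not the authors' evaluation code. The function names `weighted_f1` and `pearson_r` and the toy inputs are hypothetical, and the sketch assumes that sentiment scores and returns have already been aligned by date, which the abstract does not specify.

```python
from collections import Counter
from math import sqrt


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 scores, weighted by each class's share of y_true."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1
    return score


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return float("nan")  # undefined when either series is constant
    return cov / (sd_x * sd_y)


# Hypothetical toy data, for illustration only (not taken from the paper).
gold = ["pos", "neg", "neu", "pos"]
pred = ["pos", "neg", "pos", "pos"]
print(weighted_f1(gold, pred))

daily_sentiment = [0.1, -0.3, 0.4, 0.0, 0.2]
daily_return = [0.002, -0.004, 0.005, -0.001, 0.001]
print(pearson_r(daily_sentiment, daily_return))
```

Weighting by class support matters for datasets such as FinancialPhraseBank, where a single dominant class would otherwise be under-represented in an unweighted (macro) average.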

