Development of an Indonesian NLP-Based ESG Media Intelligence System Using TF-IDF and IndoBERT
DOI:
https://doi.org/10.36085/jsai.v9i2.10586
Abstract
Monitoring Environmental, Social, and Governance (ESG) issues in Indonesia’s nickel mining industry has become increasingly important as demands for transparency and sustainability grow. However, automated ESG media analysis for Indonesian-language news remains limited. This study develops an ESG media intelligence system based on Natural Language Processing (NLP) to analyze media perception of PT Indonesia Weda Bay Industrial Park (IWIP) and PT Weda Bay Nickel (WBN). The proposed system uses an eight-stage pipeline that includes automated news collection, Indonesian text preprocessing, ontology-based ESG labeling, text classification with TF-IDF + LinearSVC and with IndoBERT, and sentiment and ESG risk trend analysis. A total of 1,693 news articles published between January 2020 and May 2026 were collected, of which 1,320 were labeled using an ontology-based weak supervision approach. Experimental results show that the best TF-IDF configuration achieved a Macro-F1 score of 0.7693, while IndoBERT achieved 0.7698, a difference of only 0.0005. The findings indicate that TF-IDF remains competitive with transformer-based models on limited Indonesian ESG datasets. Media analysis showed that coverage of IWIP was predominantly negative on environmental and social issues, while coverage of WBN was relatively more positive on governance-related issues. This research contributes to the development of Indonesian-language ESG media intelligence for the mining industry.
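The ontology-based weak labeling and the Macro-F1 metric mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the keyword ontology, the function names `weak_label` and `macro_f1`, the whitespace tokenization, and the abstain-on-tie rule are all assumptions made for illustration. Abstaining on articles without a clear keyword match is one plausible reason why only part of a corpus ends up labeled.

```python
from collections import Counter

# Hypothetical mini-ontology (illustrative only, not taken from the paper):
# each ESG pillar maps to a few Indonesian keywords.
ESG_ONTOLOGY = {
    "environmental": {"limbah", "pencemaran", "deforestasi", "emisi"},
    "social": {"pekerja", "buruh", "masyarakat", "kecelakaan"},
    "governance": {"izin", "korupsi", "regulasi", "audit"},
}


def weak_label(text, min_hits=1):
    """Return the ESG pillar with the most keyword hits, or None to abstain."""
    # Naive tokenization: lowercase and split on whitespace (an assumption).
    tokens = Counter(text.lower().split())
    hits = {
        pillar: sum(tokens[keyword] for keyword in keywords)
        for pillar, keywords in ESG_ONTOLOGY.items()
    }
    best = max(hits, key=hits.get)
    top = hits[best]
    # Abstain when there are too few hits or when two pillars tie.
    if top < min_hits or list(hits.values()).count(top) > 1:
        return None
    return best


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    print(weak_label("Pencemaran limbah di sungai"))
    print(weak_label("Cuaca cerah hari ini"))
    print(macro_f1(["a", "a", "b", "b"], ["a", "b", "b", "b"]))
```

In a setup like this, the weak labels would serve as training targets for a classifier such as TF-IDF + LinearSVC or IndoBERT, and Macro-F1 would weight each ESG class equally regardless of how many articles it contains.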
License
Copyright (c) 2026 Lukman Hakim Moeslich, Cahyono Budy Santoso

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.




