Evaluating the Effectiveness of the Lexrank and LSA Algorithm in Automatic Text Summarization for Indonesian Language

Automatic text summarization, Latent semantic analysis, Lexrank, Bahasa Indonesia

Authors

  • Galih Wiratmoko
    L280220006@student.ums.ac.id
    Universitas Muhammadiyah Surakarta, Indonesia
February 20, 2025

This study evaluates how effectively the Lexrank and Latent Semantic Analysis (LSA) algorithms perform automatic text summarization for the Indonesian language. The research falls within natural language processing and concerns the handling of large volumes of data. Both algorithms were applied to generate summaries on the INDOSUM dataset, which contains about 20,000 Indonesian news articles paired with manually written summaries. Performance was assessed with the ROUGE metric, reported as precision, recall, and F1 score. LSA outperformed Lexrank on all three measures: LSA achieved a precision of 0.57, a recall of 0.67, and an F1 score of 0.59, whereas Lexrank achieved a precision of 0.46, a recall of 0.52, and an F1 score of 0.48. These results indicate that LSA is better than Lexrank at capturing the important information in the original text.
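
To make the reported precision, recall, and F1 figures concrete, the following is a minimal sketch of how a ROUGE-style score can be computed for one generated summary against one manual summary. It assumes the unigram variant (ROUGE-1) and plain whitespace tokenization; the abstract does not state which ROUGE variant or tokenizer was used, so both are assumptions. The function name `rouge_1` and the two Indonesian sentences are hypothetical and serve only as an illustration, not as data from the study.

```python
from collections import Counter


def rouge_1(candidate: str, reference: str) -> dict:
    """Unigram-overlap precision, recall, and F1 (ROUGE-1 style).

    Assumption: lowercased whitespace tokenization, no stemming.
    """
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()

    # Clipped overlap: each word counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())

    precision = overlap / len(cand_tokens) if cand_tokens else 0.0
    recall = overlap / len(ref_tokens) if ref_tokens else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical example: a system summary vs. a manual summary.
scores = rouge_1(
    "pemerintah menaikkan harga bahan bakar",
    "pemerintah resmi menaikkan harga bahan bakar minyak",
)
print(scores)
```

In this example all five candidate words appear in the seven-word reference, so precision is 1.0 and recall is 5/7. When such scores are averaged over many articles, the mean F1 does not have to equal the harmonic mean of the mean precision and mean recall.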