Romanian Journal of Information Science and Technology (ROMJIST)

An open-access publication


ROMJIST is a publication of the Romanian Academy,
Section for Information Science and Technology

Editor-in-Chief:
Radu-Emil Precup

Honorary Co-Editors-in-Chief:
Horia-Nicolai Teodorescu
Gheorghe Stefan

Secretariat (office):
Adriana Apostol
Address for correspondence: romjist@nano-link.net (after 1st of January, 2019)

Founding Editor-in-Chief (until 10th of February, 2021):
Dan Dascalu

Editing of the printed version: Mihaela Marian (Publishing House of the Romanian Academy, Bucharest)

Technical editor of the on-line version:
Lucian Milea (University POLITEHNICA of Bucharest)

Sponsor:
• National Institute for R&D in Microtechnologies (IMT Bucharest), www.imt.ro

ROMJIST Volume 25, No. 2, 2022, pp. 117-132
 

Salih BAL, Efnan SORA GUNAL
The Impact of Features and Preprocessing on Automatic Text Summarization

ABSTRACT: Automatic text summarization obtains a shortened and informative version of a given text without manual intervention based on specific features, preprocessing methods, and decision mechanisms. This paper aims to thoroughly analyze the impact of common features and preprocessing techniques on the performance of automatic text summarization, particularly in the Turkish language. Also, a new distinctive feature based on latent semantic analysis is proposed as another contribution. Two datasets consisting of a total of 120 documents and 1,466 sentences were used for the analysis. Two different success metrics were utilized to assess the performance of automatic text summarization. A set of comprehensive experimental studies revealed the optimal feature subset and the most useful preprocessing methods that can improve the summarization performance. Moreover, it has been verified that the proposed feature further improves the performance.

KEYWORDS: Computational linguistics, automatic text summarization, feature extraction, feature selection, machine learning, preprocessing
