Romanian Journal of Information Science and Technology (ROMJIST)

An open-access publication

  |  HOME  |   GENERAL INFORMATION  |   ROMJIST ON-LINE  |  KEY INFORMATION FOR AUTHORS  |   COMMITTEES  |  

ROMJIST is a publication of the Romanian Academy,
Section for Information Science and Technology

Editor-in-Chief:
Radu-Emil Precup

Honorary Co-Editors-in-Chief:
Horia-Nicolai Teodorescu
Gheorghe Stefan

Secretariat (office):
Adriana Apostol
Address for correspondence: romjist@nano-link.net (after 1st of January, 2019)

Founding Editor-in-Chief
(until 10th of February, 2021):
Dan Dascalu

Editing of the printed version: Mihaela Marian (Publishing House of the Romanian Academy, Bucharest)

Technical editor
of the on-line version:
Lucian Milea (University POLITEHNICA of Bucharest)

Sponsor:
• National Institute for R & D
in Microtechnologies
(IMT Bucharest), www.imt.ro

ROMJIST Volume 26, No. 1, 2023, pp. 3-20, DOI: 10.59277/ROMJIST.2023.1.01
 

Bob CHEN, Weiming PENG, Jihua SONG
A Frequent Construction Mining Scheme Based on Syntax Tree

ABSTRACT: Natural language processing (NLP) is one of the main research directions in artificial intelligence, and one of its goals is to identify the semantic information in text. Mainstream semantic recognition tasks currently focus on using the semantic information of each word to analyze the meaning of the entire sentence. Research on semantics in cognitive linguistics, however, indicates that meaning is determined both by the words in a sentence and by their arrangement. Linguists refer to permutations and combinations of words that carry certain semantic information as constructions. Because constructions play an essential role in conveying semantic information, identifying the constructions in a text is a crucial part of semantic recognition. Against this background, the paper makes two main contributions: 1) it proposes a definition and a program representation of constructions, together with the corresponding constraints, for NLP tasks; 2) it proposes a frequent construction mining algorithm that extracts frequent structures meeting the construction requirements from the grammar structure tree. Together, these allow a construction database to be extracted from a specified natural language corpus, which supports more effective semantic analysis of text.

KEYWORDS: Construction; data mining; semantic recognition; sequential pattern mining
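
As a rough illustration of what mining frequent structures from syntax trees can look like, the following minimal Python sketch counts depth-one patterns (a parent label with its ordered child labels) and keeps those that occur in at least a given number of trees. It is not the algorithm proposed in the paper: the nested-tuple tree encoding, the function names productions and frequent_patterns, the min_support parameter, and the toy corpus are all assumptions made for this example.

```python
from collections import Counter

# Assumed encoding: a tree is (label, [children]); a leaf is (label, []).

def productions(tree):
    """Yield (parent_label, child_labels) for every internal node of a tree."""
    label, children = tree
    if children:
        yield (label, tuple(child[0] for child in children))
        for child in children:
            yield from productions(child)

def frequent_patterns(corpus, min_support):
    """Return the patterns that occur in at least min_support distinct trees."""
    support = Counter()
    for tree in corpus:
        # set() ensures each tree contributes at most once per pattern
        support.update(set(productions(tree)))
    return {p: n for p, n in support.items() if n >= min_support}

# Hypothetical toy corpus of two parse trees (labels only, no words).
corpus = [
    ("S", [("NP", [("PRP", [])]),
           ("VP", [("VBD", []), ("NP", [("DT", []), ("NN", [])])])]),
    ("S", [("NP", [("DT", []), ("NN", [])]),
           ("VP", [("VBD", [])])]),
]

result = frequent_patterns(corpus, min_support=2)
print(result)
```

With this toy corpus, only the patterns S → NP VP and NP → DT NN appear in both trees, so they are the only ones retained. A realistic miner would also need to handle larger subtrees and the construction constraints that the abstract mentions.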






