Construction and Optimization of Co-occurrence-attribute-interaction Model for Column Semantic Recognition

doi:10.21655/ijsi.1673-7288.00294

Abstract:

Government data governance is in transition from "physical data aggregation" to "logical semantic unification". The long-term "autonomy" of government information silos has led to a wide range of metadata curation issues, such as attributes with the same name but different meanings, or attributes with different names but the same meaning. Instead of rebuilding or modifying legacy information systems, or physically aggregating data from the silos, logical semantic unification addresses this problem by unifying the semantic expression of silo metadata, thereby achieving standardized metadata governance. This paper focuses on logical semantic unification that semantically aligns the metadata in each government information silo with existing standard metadata. Specifically, the names of the standard metadata are abstracted as semantic labels, and semantic recognition is performed on the column projections of silo relational data, so that column names are aligned with the standard metadata and silo metadata is ultimately governed in a standardized way. Existing semantic recognition techniques based on column projection fail to capture two kinds of features: the column-order-independent nature of relational data, and the correlations among attributes and semantic labels. To address these problems, we propose a two-phase model consisting of a prediction phase and a correction phase. In the prediction phase, a Co-occurrence-Attribute-Interaction (CAI) model guarantees column-order independence by employing a parallelized self-attention mechanism; in the correction phase, a correction mechanism optimizes the predictions of the CAI model by exploiting the co-occurrence of semantic labels. Experiments are conducted on a government benchmark dataset and several public English datasets, such as Magellan. The results show that the two-phase model with the correction mechanism outperforms the current best model by up to 20.03% in macro-average and 13.36% in weighted average.
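
The column-order-independence property attributed to self-attention can be illustrated with a minimal sketch. This is not the CAI model: it assumes plain scaled dot-product self-attention with identity projections and no positional encoding, and the column embeddings (`cols`) are hypothetical values chosen only for illustration. Under those assumptions, permuting the input columns simply permutes the outputs in the same way, so each column's representation does not depend on where it sits in the table.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(columns):
    """Scaled dot-product self-attention over column vectors.

    Assumption: queries, keys and values are the raw inputs (identity
    projections) and no positional encoding is added.
    """
    d = len(columns[0])
    outputs = []
    for query in columns:
        # Similarity of this column to every column in the table.
        scores = [
            sum(a * b for a, b in zip(query, key)) / math.sqrt(d)
            for key in columns
        ]
        weights = softmax(scores)
        # Weighted sum of all column vectors.
        outputs.append(
            [sum(w * v[i] for w, v in zip(weights, columns)) for i in range(d)]
        )
    return outputs


# Hypothetical embeddings for three table columns.
cols = [[1.0, 0.0, 0.5], [0.2, 0.9, 0.1], [0.4, 0.4, 0.8]]

# Reorder the columns: new position j holds original column perm[j].
perm = [2, 0, 1]
shuffled = [cols[i] for i in perm]

original_out = self_attention(cols)
shuffled_out = self_attention(shuffled)

# Each column receives the same representation regardless of its position.
for j, i in enumerate(perm):
    assert all(
        math.isclose(a, b, abs_tol=1e-9)
        for a, b in zip(shuffled_out[j], original_out[i])
    )
```

A model that added positional encodings to the column embeddings would, in general, break this property, which is why the sketch omits them.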

Citation:

Shan Gao, Wanzhu Yuan, Wei Lu, Lan Wang, Jing Zhang, Xiaoyong Du. Construction and Optimization of Co-occurrence-attribute-interaction Model for Column Semantic Recognition. International Journal of Software and Informatics, 2023, 13(1): 5-26.

History

Received: May 15, 2022
Revised: July 29, 2022
Accepted: September 23, 2022
Online: March 30, 2023