A New Approach for Text Style Transfer Based on Deletion and Generation

Cited by: 0

Authors:

Liu, Guoqiang [1]

Xiao, Ruochen [1]

Li, Wenkai [1]

Zhang, Youcheng [1]

Affiliation:

[1] Blended Learning MIT, Apple Sin Project, Cambridge, MA 02139 USA

Source:

2022 Euro-Asia Conference on Frontiers of Computer Science and Information Technology (FCSIT) | 2022

Keywords:

Masked language modeling; SHAP algorithm; GloVe; spaCy; Sequence classification modeling

DOI:

10.1109/FCSIT57414.2022.00013

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

We propose combining sequence classification modeling, the SHAP algorithm, and masked language modeling (MLM) for the task of text style transfer. To handle cases where no parallel source-target pairs are available, we train a BERT-based sequence classification model on the SST-2 task of GLUE for both the source and target domains. We then use SHAP values computed from this classifier to detect and delete the words associated with the original attribute. The deleted tokens are replaced by an MLM trained on the target domain, which retrieves new phrases associated with the target attribute. On top of this, we detect the part of speech (POS) of each word in the sentence so that replacements are made only at suitable positions, without much impact on the semantics. Additionally, we use GloVe to measure the semantic similarity between the word generated by the MLM and the original word, so that we can trade off content against attribute, using grid search to determine their relative weights. The experiments show that our method improves the style conversion rate by 9.7% and achieves a semantic similarity to the original content that is on average 28.2% higher than that of the best previous system.
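The delete-and-generate step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the attribution scores, candidate probabilities, and two-dimensional word vectors below are hypothetical toy values standing in for SHAP values, MLM predictions, and GloVe embeddings, and the names `transfer`, `attribute_weight`, and `threshold` are assumptions introduced only for illustration. The POS filter and the grid search over the weight are omitted.

```python
import math

# Hypothetical per-token attribution scores, standing in for SHAP values
# from a style classifier (higher = more tied to the source style).
ATTRIBUTION = {"the": 0.01, "food": 0.02, "was": 0.00, "terrible": 0.85}

# Hypothetical target-style candidates with probabilities, standing in
# for the predictions of a masked language model at the deleted position.
MLM_CANDIDATES = {"terrible": {"great": 0.6, "delicious": 0.3, "table": 0.1}}

# Toy 2-d vectors standing in for GloVe embeddings.
EMBED = {
    "terrible": (1.0, 0.0),
    "great": (0.6, 0.8),
    "delicious": (0.8, 0.6),
    "table": (0.0, 1.0),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def transfer(tokens, attribute_weight, threshold=0.5):
    """Replace style-bearing tokens with the best-scoring candidate.

    A candidate's score is a weighted sum of its (toy) MLM probability
    and its (toy) embedding similarity to the original token.
    """
    output = []
    for token in tokens:
        candidates = MLM_CANDIDATES.get(token, {})
        # Keep tokens that are not style-bearing or have no candidates.
        if ATTRIBUTION.get(token, 0.0) < threshold or not candidates:
            output.append(token)
            continue

        def score(candidate):
            similarity = cosine(EMBED[token], EMBED[candidate])
            return (attribute_weight * candidates[candidate]
                    + (1.0 - attribute_weight) * similarity)

        output.append(max(candidates, key=score))
    return output


sentence = "the food was terrible".split()
favor_attribute = transfer(sentence, attribute_weight=0.5)
favor_content = transfer(sentence, attribute_weight=0.2)
print(" ".join(favor_attribute))
print(" ".join(favor_content))
```

With these toy values, a larger `attribute_weight` favors the candidate the stand-in MLM rates highest, while a smaller one favors the candidate closest to the original word in the toy embedding space. In the setting the abstract describes, that weight would be chosen by grid search rather than fixed by hand.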


Pages: 6-11

Page count: 6
