A New Approach for Text Style Transfer Based on Deletion and Generation

Cited by: 0
Authors
Liu, Guoqiang [1]
Xiao, Ruochen [1]
Li, Wenkai [1]
Zhang, Youcheng [1]
Affiliation
[1] Blended Learning MIT, Apple Sin Project, Cambridge, MA 02139 USA
Source
2022 EURO-ASIA CONFERENCE ON FRONTIERS OF COMPUTER SCIENCE AND INFORMATION TECHNOLOGY, FCSIT | 2022
Keywords
Masked language modeling; SHAP algorithm; GloVe; spaCy; Sequence classification modeling
DOI
10.1109/FCSIT57414.2022.00013
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We propose combining sequence classification modeling, the SHAP algorithm, and masked language modeling (MLM) for the task of text style transfer. To handle cases where no parallel source-target pairs are available, we train a BERT-based sequence classification model on the SST-2 task of GLUE for both the source and target domains. We then use SHAP values computed from this classifier to detect and delete the words associated with the original attribute. The deleted tokens are replaced by an MLM trained on the target domain, which retrieves new phrases associated with the target attribute. In addition, we detect the part of speech (POS) of each word in the sentence so that only suitable positions are replaced, limiting the impact on semantics. We also use GloVe to measure the semantic similarity between each word generated by the MLM and the original word, and we use grid search to set the weights that trade off content preservation against attribute transfer. Experiments show that our method improves the style conversion rate by 9.7% and achieves a semantic similarity to the original content that is on average 28.2% higher than that of the best previous system.
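The delete-then-generate step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the attribution scores stand in for SHAP values, the candidate probabilities stand in for MLM output, the 2-dimensional vectors stand in for GloVe embeddings, and every name (`transfer`, `pick_replacement`, `weight`, `threshold`) is hypothetical. The POS filter is omitted, and the linear scoring rule is an assumption about how the content-versus-attribute trade-off might be weighted.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def pick_replacement(original, candidates, vectors, weight):
    """Choose the candidate maximizing an assumed linear trade-off:
    weight * (toy MLM probability) + (1 - weight) * (similarity to original)."""
    def score(word):
        similarity = cosine(vectors[original], vectors[word])
        return weight * candidates[word] + (1.0 - weight) * similarity
    return max(candidates, key=score)

def transfer(tokens, attributions, candidates_for, vectors, weight, threshold=0.5):
    """Replace tokens whose (toy) attribution to the source style exceeds threshold."""
    output = []
    for token in tokens:
        if attributions.get(token, 0.0) > threshold:
            output.append(pick_replacement(token, candidates_for[token], vectors, weight))
        else:
            output.append(token)
    return output

# Toy inputs, all invented for illustration only.
tokens = ["the", "food", "was", "terrible"]
attributions = {"terrible": 0.9, "food": 0.05}              # stand-in for SHAP values
candidates_for = {"terrible": {"great": 0.6, "fine": 0.4}}  # stand-in for MLM output
vectors = {                                                 # stand-in for GloVe vectors
    "terrible": [0.9, 0.1],
    "great": [0.2, 0.9],
    "fine": [0.7, 0.6],
}

# A grid over the trade-off weight; a real grid search would score each
# setting on held-out data rather than just printing the result.
for weight in (0.0, 0.5, 1.0):
    print(weight, transfer(tokens, attributions, candidates_for, vectors, weight))
```

With `weight` at 1.0 only the toy MLM probability matters, so the more strongly target-styled candidate wins; with `weight` at 0.0 only similarity to the deleted word matters, so the closest candidate wins. Values in between are what a grid search would tune.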
Pages: 6-11
Number of pages: 6