Phrase-aware Unsupervised Constituency Parsing

Cited by: 0
Authors
Gu, Xiaotao [1]
Shen, Yikang [3, 4]
Shen, Jiaming [5]
Shang, Jingbo [2]
Han, Jiawei [1]
Affiliations
[1] Univ Illinois, Champaign, IL 61820 USA
[2] Univ Calif San Diego, San Diego, CA 92103 USA
[3] Univ Montreal, Mila, Montreal, PQ, Canada
[4] Tencent Inc, WeChat AI, Pattern Recognit Ctr, Beijing, Peoples R China
[5] Google Res, Mountain View, CA USA
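Source
PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS) | 2022
Funding
U.S. National Science Foundation
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract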
Recent studies have achieved encouraging results in unsupervised grammar induction using masked language modeling (MLM) as the proxy task. Despite their high accuracy in identifying low-level structures, prior approaches tend to struggle to capture high-level structures such as clauses, since the MLM task usually requires information only from the local context. In this work, we revisit LM-based constituency parsing from a phrase-centered perspective. Inspired by the natural reading process of human readers, we propose to regularize the parser with phrases extracted by an unsupervised phrase tagger, helping the LM quickly manage low-level structures. For a better understanding of high-level structures, we propose a phrase-guided masking strategy for the LM that places more emphasis on reconstructing non-phrase words. We show that the initial phrase regularization serves as an effective bootstrap, and that phrase-guided masking improves the identification of high-level structures. Experiments on the public benchmark with two different backbone models demonstrate the effectiveness and generality of our method.
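The phrase-guided masking idea in the abstract can be illustrated with a minimal sketch. The abstract gives no masking rates, span format, or implementation details, so everything here is an assumption made for illustration and not the authors' method: the function name `phrase_guided_mask`, the half-open `(start, end)` span convention, the `[MASK]` symbol, and the two probabilities `p_phrase` and `p_other`. The only point it shows is that tokens outside tagged phrases are masked more often than tokens inside them.

```python
import random
from typing import List, Optional, Sequence, Tuple

MASK = "[MASK]"  # placeholder mask symbol (assumption, not from the paper)


def phrase_guided_mask(
    tokens: Sequence[str],
    phrase_spans: Sequence[Tuple[int, int]],
    p_phrase: float = 0.05,   # hypothetical masking rate inside phrases
    p_other: float = 0.30,    # hypothetical masking rate outside phrases
    rng: Optional[random.Random] = None,
) -> Tuple[List[str], List[int]]:
    """Mask non-phrase tokens more often than phrase tokens.

    phrase_spans are half-open (start, end) token index ranges, assumed
    to come from some unsupervised phrase tagger.
    Returns the masked token list and the indices that were masked.
    """
    rng = rng or random.Random()

    # Mark which token positions fall inside any tagged phrase.
    in_phrase = [False] * len(tokens)
    for start, end in phrase_spans:
        for i in range(start, end):
            in_phrase[i] = True

    masked: List[str] = []
    positions: List[int] = []
    for i, tok in enumerate(tokens):
        p = p_phrase if in_phrase[i] else p_other
        if rng.random() < p:  # rng.random() is in [0.0, 1.0)
            masked.append(MASK)
            positions.append(i)
        else:
            masked.append(tok)
    return masked, positions


if __name__ == "__main__":
    sentence = "the quick brown fox jumped over the lazy dog".split()
    spans = [(0, 4), (6, 9)]  # "the quick brown fox", "the lazy dog"
    print(phrase_guided_mask(sentence, spans, rng=random.Random(0)))
```

Setting `p_phrase=0.0` and `p_other=1.0` gives the extreme case in which only non-phrase words are masked, which makes the bias easy to inspect.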
Pages: 6406-6415
Page count: 10