Phrase-aware Unsupervised Constituency Parsing

Citations: 0
Authors
Gu, Xiaotao [1]
Shen, Yikang [3, 4]
Shen, Jiaming [5]
Shang, Jingbo [2]
Han, Jiawei [1]
Affiliations
[1] Univ Illinois, Champaign, IL 61820 USA
[2] Univ Calif San Diego, San Diego, CA 92103 USA
[3] Univ Montreal, Mila, Montreal, PQ, Canada
[4] Tencent Inc, WeChat AI, Pattern Recognit Ctr, Beijing, Peoples R China
[5] Google Res, Mountain View, CA USA
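Source
PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS) | 2022
Funding
U.S. National Science Foundation;
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract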
Recent studies have achieved inspiring success in unsupervised grammar induction using masked language modeling (MLM) as the proxy task. Despite their high accuracy in identifying low-level structures, prior approaches tend to struggle to capture high-level structures such as clauses, since the MLM task usually requires information only from the local context. In this work, we revisit LM-based constituency parsing from a phrase-centered perspective. Inspired by the natural reading process of human readers, we propose to regularize the parser with phrases extracted by an unsupervised phrase tagger to help the LM quickly manage low-level structures. For a better understanding of high-level structures, we propose a phrase-guided masking strategy for the LM that places more emphasis on reconstructing non-phrase words. We show that the initial phrase regularization serves as an effective bootstrap, and that phrase-guided masking improves the identification of high-level structures. Experiments on the public benchmark with two different backbone models demonstrate the effectiveness and generality of our method.
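The abstract does not say how phrase-guided masking is implemented, so the following is only a minimal sketch of the general idea, assuming that masking positions are sampled with a higher weight for words outside tagged phrases. The function name `phrase_guided_mask`, its parameters (`mask_rate`, `phrase_weight`, `non_phrase_weight`), and the span format are all hypothetical and are not taken from the paper.

```python
import random
from typing import List, Optional, Sequence, Tuple


def phrase_guided_mask(
    tokens: Sequence[str],
    phrase_spans: Sequence[Tuple[int, int]],  # assumed format: [start, end) token indices
    mask_rate: float = 0.15,                  # assumed fraction of tokens to mask
    phrase_weight: float = 1.0,               # assumed sampling weight inside phrases
    non_phrase_weight: float = 3.0,           # assumed (larger) weight outside phrases
    mask_token: str = "[MASK]",
    rng: Optional[random.Random] = None,
) -> Tuple[List[str], List[int]]:
    """Mask tokens, preferring positions that are not covered by any phrase span."""
    rng = rng or random.Random()

    # Mark every position covered by a tagged phrase.
    in_phrase = [False] * len(tokens)
    for start, end in phrase_spans:
        for i in range(start, end):
            in_phrase[i] = True

    weights = [phrase_weight if p else non_phrase_weight for p in in_phrase]
    n_mask = min(len(tokens), max(1, round(mask_rate * len(tokens))))

    # Weighted sampling without replacement: draw one position at a time.
    candidates = list(range(len(tokens)))
    chosen: List[int] = []
    for _ in range(n_mask):
        cand_weights = [weights[i] for i in candidates]
        if sum(cand_weights) <= 0:
            break  # nothing left that is allowed to be masked
        pick = rng.choices(candidates, weights=cand_weights, k=1)[0]
        chosen.append(pick)
        candidates.remove(pick)

    masked = list(tokens)
    for i in chosen:
        masked[i] = mask_token
    return masked, sorted(chosen)


if __name__ == "__main__":
    sentence = "the quick brown fox jumps over the lazy dog".split()
    spans = [(0, 4), (6, 9)]  # hypothetical tagger output: two noun phrases
    print(phrase_guided_mask(sentence, spans, mask_rate=0.2, rng=random.Random(0)))
```

Setting `phrase_weight` to 0 restricts masking to non-phrase words entirely, while equal weights recover uniform random masking; the ratio between the two weights is the knob this sketch exposes.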
Pages: 6406-6415
Page count: 10
Related papers
50 in total
[21]   The Limitations of Limited Context for Constituency Parsing [J].
Li, Yuchen ;
Risteski, Andrej .
59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (ACL-IJCNLP 2021), VOL 1, 2021, :2675-2687
[22]   Language model as an Annotator: Unsupervised context-aware quality phrase generation [J].
Zhang, Zhihao ;
Zuo, Yuan ;
Lin, Chenghua ;
Wu, Junjie .
KNOWLEDGE-BASED SYSTEMS, 2024, 283
[23]   Phrase2Vec: Phrase embedding based on parsing [J].
Wu, Yongliang ;
Zhao, Shuliang ;
Li, Wenbin .
INFORMATION SCIENCES, 2020, 517 (517) :100-127
[24]   Constituency Parsing of Complex Noun Sequences in Hindi [J].
Batra, Arpita ;
Paul, Soma ;
Kulkarni, Amba .
COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING, CICLING 2014, PT I, 2014, 8403 :285-296
[25]   Order-sensitive Neural Constituency Parsing [J].
Wang, Zhicheng ;
Shi, Tianyu ;
Xiao, Liyin ;
Liu, Cong .
2022 IEEE 34TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, ICTAI, 2022, :282-287
[26]   Constituency Parsing by Cross-Lingual Delexicalization [J].
Kaing, Hour ;
Ding, Chenchen ;
Utiyama, Masao ;
Sumita, Eiichiro ;
Sudoh, Katsuhito ;
Nakamura, Satoshi .
IEEE ACCESS, 2021, 9 :141571-141578
[27]   Challenges to Open-Domain Constituency Parsing [J].
Yang, Sen ;
Cui, Leyang ;
Ning, Ruoxi ;
Wu, Di ;
Zhang, Yue .
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), 2022, :112-127
[28]   Fast and Accurate Neural CRF Constituency Parsing [J].
Zhang, Yu ;
Zhou, Houquan ;
Li, Zhenghua .
PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, :4046-4053
[29]   Constituency Parsing with a Self-Attentive Encoder [J].
Kitaev, Nikita ;
Klein, Dan .
PROCEEDINGS OF THE 56TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL), VOL 1, 2018, :2676-2686
[30]   Improving Sequence-to-Sequence Constituency Parsing [J].
Liu, Lemao ;
Zhu, Muhua ;
Shi, Shuming .
THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, :4873-4880