Tree of Attacks: Jailbreaking Black-Box LLMs Automatically

Cited by: 0
Authors
Mehrotra, Anay [1]
Zampetakis, Manolis [2]
Kassianik, Paul [3]
Nelson, Blaine [3]
Anderson, Hyrum [3]
Singer, Yaron [3]
Karbasi, Amin [4]
Affiliations
[1] Yale University, Robust Intelligence, United States
[2] Yale University, United States
[3] Robust Intelligence, United States
[4] Yale University, Google Research, United States
Source
arXiv, 2023
Keywords
Compendex
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
Iterative methods