Computer science
Artificial intelligence
Deep learning
Generative grammar
Task (project management)
Natural language processing
Machine learning
Domain (mathematical analysis)
Transformer
Generative model
Legal aspects of computing
Language model
Sentiment analysis
Focus (optics)
Task analysis
Computational linguistics
Feature engineering
Natural language
Multi-task learning
Legal document
Authors
Eoin O’Connell,William Duffy,Niall McCarroll,Katie Sloan,Kevin Curran,Eugene McNamee,Angela Clist,Andrew Brammer
Identifiers
DOI:10.1007/s10506-025-09484-4
Abstract
Recent advances in Generative Language Models (GLMs) have renewed focus on promising results in zero-shot text classification. However, their off-the-shelf performance on unfamiliar and domain-specific tasks remains uncertain. In this legal clause classification task we evaluate a plug-and-play zero-shot prompting strategy for OpenAI's GPT-4 GLM on a contract clause dataset. We introduce the new CUAD-SL dataset, refactored as a single-label classification problem, as a fairer and more robust legal classification benchmark. In a comparative study, we show that fine-tuning on legal domain data adapts smaller, less complex models to the task at hand, with a significant classification accuracy improvement of up to 20.6% and a best overall performance of 87.8% for the DeBERTa Transformer model, compared to GPT-4's 67.2%. This study also takes the novel approach of assessing the business feasibility of deploying each of these machine learning models through a detailed cost–benefit analysis that measures the trade-off between performance metrics and low and high usage running costs.
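The plug-and-play zero-shot strategy described in the abstract amounts to prompting a GLM with the clause text and a fixed label set, then mapping the free-text reply back onto a single label. A minimal sketch of that prompt construction and response parsing is shown below; the label names and helper functions are illustrative assumptions, not taken from the paper or the CUAD-SL dataset.

```python
# Hypothetical sketch of zero-shot single-label clause classification prompting.
# Labels below are example clause categories, not the paper's actual label set.

LABELS = ["Governing Law", "Termination", "Confidentiality", "Indemnification"]


def build_prompt(clause, labels=LABELS):
    """Assemble a zero-shot prompt asking a GLM to pick exactly one category."""
    options = "\n".join(f"- {name}" for name in labels)
    return (
        "Classify the following contract clause into exactly one category.\n"
        f"Categories:\n{options}\n\n"
        f"Clause: {clause}\n"
        "Answer with the category name only."
    )


def parse_label(response, labels=LABELS):
    """Map a free-text model response back onto a known label, or None."""
    cleaned = response.strip().lower()
    for name in labels:
        if name.lower() in cleaned:
            return name
    return None  # unrecognised answer; caller may retry or count it as an error
```

The prompt string would be sent to the model (e.g. via the OpenAI chat API) and `parse_label` applied to the reply; treating unparseable replies as errors is one reason off-the-shelf zero-shot accuracy can lag a fine-tuned classifier with a fixed output head.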