数学优化
启发式
计算机科学
随机森林
集合(抽象数据类型)
树(集合论)
采样(信号处理)
灵敏度(控制系统)
整数(计算机科学)
最优化问题
数学
人工智能
工程类
程序设计语言
数学分析
滤波器(信号处理)
计算机视觉
电子工程
作者
Max Biggs,Rim Hariss,Georgia Perakis
摘要
Abstract In this paper, we examine a data‐driven optimization approach to making optimal decisions as evaluated by a trained random forest, where these decisions can be constrained by an arbitrary polyhedral set. We model this optimization problem as a mixed‐integer linear program. We show this model can be solved to optimality efficiently using pareto‐optimal Benders cuts for ensembles containing a modest number of trees. We consider a random forest approximation that consists of sampling a subset of trees and establish that this gives rise to near‐optimal solutions by proving analytical guarantees. In particular, for axis‐aligned trees, we show that the number of trees we need to sample is sublinear in the size of the forest being approximated. Motivated by this result, we propose heuristics inspired by cross‐validation that optimize over smaller forests rather than one large forest and assess their performance on synthetic datasets. We present two case studies on a property investment problem and a jury selection problem. We show this approach performs well against other benchmarks while providing insights into the sensitivity of the algorithm's performance for different parameters of the random forest.
科研通智能强力驱动
Strongly Powered by AbleSci AI