Resource Allocation for Sequential Decision Making Under Uncertainty: Studies in Vehicular Traffic Control, Service Systems, Sensor Networks and Mechanism Design

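Computer science, Computer networks, Resource allocation, Quality of service, Distributed computing, Control (management), Real-time computing, Service (business)
Author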
L A Prashanth
Abstract

A fundamental question in a sequential decision making setting under uncertainty is “how to allocate resources amongst competing entities so as to maximize the rewards accumulated in the long run?” The resources allocated may be either abstract quantities such as time or concrete quantities such as manpower. The sequential decision making setting involves one or more agents interacting with an environment to procure rewards at every time instant, and the goal is to find an optimal policy for choosing actions. Most of these problems involve multiple (infinite) stages, and the objective function is usually a long-run performance objective. The problem is further complicated by the uncertainties in the system, for instance, stochastic noise and partial observability in a single-agent setting, or private information held by the agents in a multi-agent setting. The dimensionality of the problem also plays an important role in the solution methodology adopted. Most real-world problems involve high-dimensional state and action spaces, and an important design aspect of the solution is the choice of knowledge representation. The aim of this thesis is to answer important resource allocation questions in different real-world application contexts and, in the process, to contribute novel algorithms to the theory as well. The resource allocation algorithms considered include those from stochastic optimization, stochastic control and reinforcement learning. A number of new algorithms are developed as well. The application contexts selected encompass both single-agent and multi-agent systems, involve both abstract and concrete resources, and contain high-dimensional state and control spaces. The empirical results from the various studies performed indicate that the algorithms presented here perform significantly better than those previously proposed in the literature. Further, the algorithms presented here are also shown theoretically to converge, hence guaranteeing optimal performance.
We now briefly describe the various studies conducted here to investigate problems of resource allocation under uncertainties of different kinds:

Vehicular Traffic Control

The aim here is to optimize the ‘green time’ resource of the individual lanes in road networks so as to maximize a certain long-term performance objective. We develop several reinforcement learning based algorithms for solving this problem. In the infinite horizon discounted Markov decision process setting, we propose a Q-learning based traffic light control (TLC) algorithm that incorporates feature based representations and function approximation to handle large road networks; see Prashanth and Bhatnagar [2011b]. This TLC algorithm works with coarse information about the congestion level on the lanes of the road network, obtained via graded thresholds. However, the graded threshold values used in this Q-learning based TLC algorithm, as well as in several other graded threshold-based TLC algorithms that we propose, may not be optimal for all traffic conditions. We therefore also develop a…
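
To make the ingredients named in the vehicular traffic control paragraph concrete, here is a minimal sketch of Q-learning with linear function approximation over graded-threshold features. It is an illustration under stated assumptions, not the algorithm of Prashanth and Bhatnagar [2011b]: the threshold values, the one-hot feature layout, the reward (negative total queue length), the step size and the discount factor are all hypothetical choices made for this example.

```python
# Hypothetical graded thresholds (vehicles per lane); the abstract gives no values.
LOW_THRESHOLD = 6
HIGH_THRESHOLD = 14


def congestion_level(queue_length, low=LOW_THRESHOLD, high=HIGH_THRESHOLD):
    """Map a raw queue length to a coarse level: 0 (low), 1 (medium), 2 (high)."""
    if queue_length < low:
        return 0
    if queue_length < high:
        return 1
    return 2


def features(queue_lengths, action):
    """One-hot features per lane over (congestion level, lane is green or not).

    `action` is the index of the lane that receives green time (an assumption).
    """
    phi = [0.0] * (len(queue_lengths) * 3 * 2)
    for lane, queue in enumerate(queue_lengths):
        level = congestion_level(queue)
        green = 1 if lane == action else 0
        phi[(lane * 3 + level) * 2 + green] = 1.0
    return phi


def q_value(theta, queue_lengths, action):
    """Approximate Q(s, a) as the inner product of theta and the features."""
    return sum(t * f for t, f in zip(theta, features(queue_lengths, action)))


def greedy_action(theta, queue_lengths):
    """Pick the lane whose green phase has the highest approximate Q-value."""
    n_actions = len(queue_lengths)
    return max(range(n_actions), key=lambda a: q_value(theta, queue_lengths, a))


def q_learning_step(theta, state, action, reward, next_state,
                    alpha=0.05, gamma=0.9):
    """One discounted Q-learning update of the weight vector theta."""
    best_next = max(q_value(theta, next_state, a)
                    for a in range(len(next_state)))
    td_error = reward + gamma * best_next - q_value(theta, state, action)
    phi = features(state, action)
    return [t + alpha * td_error * f for t, f in zip(theta, phi)]


# Usage: two lanes, lane 0 is green, reward is the negative total queue length.
theta = [0.0] * 12
theta = q_learning_step(theta, state=[1, 10], action=0,
                        reward=-5.0, next_state=[0, 5])
```

Because the features only see the coarse congestion level of each lane, the weight vector grows with the number of lanes rather than with the number of possible queue-length combinations, which is the sense in which a feature based representation helps with large road networks.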
