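Computer science
Cloud computing
Imputation (statistics)
Missing data
Reliability (semiconductor)
Data quality
Data mining
Focus (optics)
Database
Service (business)
Machine learning
Operating system
Power (physics)
Physics
Economics
Quantum mechanics
Optics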
            Authors
            
                Fangkai Yang,Wenjie Yin,Saravan Rajmohan,Tianci Li,Pu Zhao,Bo Liu,Paul Wang,Bo Qiao,Yudong Liu,Mårten Björkman,Saravan Rajmohan,Qingwei Lin,Dongmei Zhang            
         
                    
            Source
            
                Journal: Cornell University - arXiv
                Date: 2023-08-03
                                                                
         
        
    
            
            Identifier
            
                DOI: 10.48550/arxiv.2309.02564
                                    
                                
                                 
         
        
                
            Abstract
            
            Reliability is critical for large-scale cloud systems such as Microsoft 365. Cloud failures, such as disk and node failures, threaten service reliability and lead to online service interruptions and economic loss. Existing work focuses on predicting cloud failures and taking proactive action before they occur. However, these approaches suffer from poor data quality, such as missing data during model training and prediction, which limits their performance. In this paper, we focus on improving data quality through data imputation with the proposed Diffusion+, a sample-efficient diffusion model that imputes missing data efficiently based on the observed data. Our experiments and application practice show that our model helps improve the performance of the downstream failure prediction task.
         
            
 
                 
                
                    