Transformers have excelled in image restoration due to their strong representational abilities. However, their reliance on a fixed local window for attention often undermines translation invariance and the preservation of local relationships. This limitation can reduce network stability, especially when the degradation undergoes positional shifts. In this work, we present a new Bayesian Window Transformer, which places a probability distribution over window shifts, overcoming the limitations of the fixed window configurations used in conventional transformers. This approach allows the attention window to flexibly cover regions beyond a predetermined area. At inference time, we further develop two approximate inference algorithms, Layer Expectation Propagation and Monte Carlo Average. Both algorithms compute expectations under the introduced distribution to approximate the marginalization over the probabilistic window-shift variables. Hence, our Bayesian Window Transformer not only inherits the powerful representational ability of transformers but also maintains essential properties, such as translation invariance and local relationship preservation, for image restoration. We also provide a theoretical guarantee showing that our method matches the classic sliding-window technique in receptive field size and sliding behavior. Comprehensive experiments validate the effectiveness of our Bayesian Window Transformer across multiple image restoration tasks, including image deraining, denoising, and deblurring.
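To make the idea concrete, the following is a minimal PyTorch sketch of window attention with a probabilistic shift and a Monte Carlo average at inference; it is not the authors' implementation, and names such as `window_size`, `num_samples`, and the uniform shift distribution are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Sketch: window attention whose shift is sampled from a distribution
# rather than fixed, with a Monte Carlo average over sampled shifts at
# inference time (assumes H and W are divisible by window_size).
class ProbabilisticShiftWindowAttention(nn.Module):
    def __init__(self, dim, num_heads, window_size=8, num_samples=4):
        super().__init__()
        self.window_size = window_size
        self.num_samples = num_samples
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def _window_attention(self, x, shift):
        # x: (B, H, W, C); roll the feature map by the sampled shift,
        # then run attention inside non-overlapping windows.
        B, H, W, C = x.shape
        ws = self.window_size
        x = torch.roll(x, shifts=(-shift[0], -shift[1]), dims=(1, 2))
        x = x.reshape(B, H // ws, ws, W // ws, ws, C)
        x = x.permute(0, 1, 3, 2, 4, 5).reshape(-1, ws * ws, C)
        x, _ = self.attn(x, x, x)
        x = x.reshape(B, H // ws, W // ws, ws, ws, C)
        x = x.permute(0, 1, 3, 2, 4, 5).reshape(B, H, W, C)
        # Undo the roll so the output stays aligned with the input.
        return torch.roll(x, shifts=(shift[0], shift[1]), dims=(1, 2))

    def _sample_shift(self):
        # Assumed uniform distribution over shifts within one window.
        return torch.randint(0, self.window_size, (2,)).tolist()

    def forward(self, x):
        if self.training:
            # Training: a single shift sampled per forward pass.
            return self._window_attention(x, self._sample_shift())
        # Inference (Monte Carlo Average): average over several sampled
        # shifts to approximate the expectation over the shift variable.
        outs = [self._window_attention(x, self._sample_shift())
                for _ in range(self.num_samples)]
        return torch.stack(outs).mean(dim=0)
```

As a usage example, `ProbabilisticShiftWindowAttention(dim=64, num_heads=4)` applied to a tensor of shape `(1, 64, 64, 64)` returns an output of the same shape; averaging over sampled shifts is what restores translation-invariant behavior that a single fixed window lacks.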