Glass surfaces, such as windows, are ubiquitous in daily life. However, conventional object detection methods in computer vision detect them poorly because glass is transparent and varies widely in shape. In this paper, we propose a glass surface detection network named CGSDNet, which employs a ConvNeXt backbone and a novel Cascade Atrous Pooling (CAP) module to extract multi-scale contextual features (e.g., the difference between objects located inside and outside the glass, and the reflection, texture, and occlusion of the glass). Additionally, we present a lightweight Holistic Boundary Detection (HBD) module that captures boundary features from the contextual features. Finally, we propose a Cascaded Architecture that fuses the contextual features with the boundary features, generating dense, large-field contextual features with enhanced boundaries, which are used for robust glass surface detection. Extensive experiments on benchmark datasets demonstrate that the proposed method outperforms state-of-the-art (SOTA) methods from related fields.
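To illustrate why cascading atrous (dilated) convolutions can yield dense, large-field context, the following minimal sketch compares a cascade of dilated convolutions with parallel dilated branches in one dimension. It is not the CAP module itself: the kernel size, the dilation rates (1, 2, 4), and all function names are assumptions chosen for illustration only.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field along one axis of stride-1 dilated convolutions
    applied one after another (a cascade)."""
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded, same-length 1D dilated convolution (cross-correlation)."""
    k = len(kernel)
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + (j - k // 2) * dilation
            if 0 <= idx < len(signal):
                acc += w * signal[idx]
        out.append(acc)
    return out


def cascade(signal, kernel, dilations):
    """Feed the output of each dilated convolution into the next one."""
    for d in dilations:
        signal = dilated_conv1d(signal, kernel, d)
    return signal


def parallel_sum(signal, kernel, dilations):
    """Apply each dilated convolution to the same input and sum the branches."""
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    return [sum(vals) for vals in zip(*branches)]


def support(signal):
    """Indices at which the signal is non-zero."""
    return [i for i, v in enumerate(signal) if v != 0]
```

Feeding a unit impulse through both variants with an all-ones kernel of size 3 and the assumed rates (1, 2, 4) shows the difference: the cascade responds at 15 contiguous positions, whereas the summed parallel branches respond at only 7 positions spread over a span of 9. The cascade therefore covers a larger field without the gaps that a single large dilation leaves.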