This study introduces a novel model based on self-attention mechanisms to generate out-of-sample forecasts of cross-sectional stock returns. The model is designed to capture three properties inherent in cross-sectional pricing problems: non-linearity, heterogeneity, and interaction between stocks. In empirical tests on the Chinese stock market, the model surpasses the benchmark models in terms of out-of-sample R². It also demonstrates practical applicability and robustness. On the theoretical side, these results provide evidence that the three properties are present in cross-sectional pricing problems; on the practical side, the model offers a powerful tool for implementing profitable long-short strategies.
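To make the two evaluation notions mentioned above concrete, the following is a minimal sketch of how an out-of-sample R² and a long-short portfolio return could be computed from a set of forecasts. It is an illustration under stated assumptions, not the study's procedure: the function names `oos_r2` and `long_short_return` are hypothetical, the R² is assumed to be measured against a zero-forecast benchmark (one common convention in return prediction), and the portfolio is assumed to be equal-weighted over the top and bottom fraction of stocks ranked by predicted return.

```python
from typing import Sequence


def oos_r2(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Out-of-sample R^2 against a zero-forecast benchmark (assumed convention).

    R^2 = 1 - sum((r - r_hat)^2) / sum(r^2), pooled over all observations.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    sse = sum((r - r_hat) ** 2 for r, r_hat in zip(actual, predicted))
    sst = sum(r ** 2 for r in actual)
    if sst == 0.0:
        raise ValueError("realized returns are all zero; R^2 is undefined")
    return 1.0 - sse / sst


def long_short_return(
    predicted: Sequence[float],
    realized: Sequence[float],
    fraction: float = 0.1,
) -> float:
    """Equal-weighted long-short return for one cross-section (hypothetical helper).

    Buys the top `fraction` of stocks by predicted return and shorts the
    bottom `fraction`, then reports the realized return spread.
    """
    if len(predicted) != len(realized):
        raise ValueError("predicted and realized must have the same length")
    n = max(1, int(len(predicted) * fraction))
    # Stock indices sorted from lowest to highest predicted return.
    order = sorted(range(len(predicted)), key=lambda i: predicted[i])
    short_leg = [realized[i] for i in order[:n]]
    long_leg = [realized[i] for i in order[-n:]]
    return sum(long_leg) / n - sum(short_leg) / n


if __name__ == "__main__":
    # Toy numbers for illustration only; they are not results from the study.
    realized = [0.05, 0.01, -0.03, 0.00]
    forecast = [0.03, 0.01, -0.02, 0.00]
    print(oos_r2(realized, forecast))
    print(long_short_return(forecast, realized, fraction=0.25))
```

Under this convention, a forecast of zero for every stock yields an R² of exactly zero, so a positive value indicates that the forecasts improve on a naive no-predictability benchmark.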