Object detection is a pivotal task in autonomous driving, where reliance on single-modality information often proves inadequate for high-precision detection. Current point-cloud-based detection networks identify objects with dense point clouds effectively, but they struggle when the point clouds of the objects to be detected are sparse. To address this issue, this paper proposes a 3D object detection network based on point cloud and image fusion. The network employs a fusion module named PRF (Point-Region Fusion), which uses the K-Nearest Neighbors (KNN) algorithm to find the K points nearest to each point-cloud feature, gathers the corresponding regional features from the image feature maps, aggregates them, and fuses the result with the point-cloud features. The designed Image Feature Fusion module (IF-Fusion) fuses image feature maps of different sizes in a pairwise manner; this fusion preserves the features of small objects and further enriches the point-cloud features. In evaluations on the KITTI benchmark, the proposed method surpasses prior fusion networks in detection accuracy, achieving detection accuracies of 91.95%, 81.10%, and 78.08%.
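To make the PRF idea concrete, the sketch below illustrates one plausible reading of the abstract: for each point, its K nearest neighbors are located in 3D, image features are gathered at the neighbors' projected pixel positions, aggregated, and concatenated with the point feature before a learned fusion layer. Class and argument names, tensor shapes, and the max-pooling aggregation are assumptions, not the authors' implementation.

import torch
import torch.nn as nn


class PointRegionFusion(nn.Module):
    """Hypothetical sketch of the PRF (Point-Region Fusion) module described
    in the abstract: KNN grouping in 3D, gathering of regional image features
    at the projected neighbor locations, aggregation, and fusion with the
    point-cloud features. Names and shapes are assumptions."""

    def __init__(self, point_dim, image_dim, k=16):
        super().__init__()
        self.k = k
        self.fuse = nn.Linear(point_dim + image_dim, point_dim)

    def forward(self, points_xyz, point_feats, image_feats, proj_uv):
        # points_xyz:  (N, 3)     3D coordinates of the points
        # point_feats: (N, Cp)    per-point features from the point-cloud branch
        # image_feats: (H, W, Ci) image feature map from the image branch
        # proj_uv:     (N, 2)     pixel coordinates of each point's projection
        # Find the K nearest neighbors of every point in 3D space.
        dists = torch.cdist(points_xyz, points_xyz)            # (N, N)
        knn_idx = dists.topk(self.k, largest=False).indices    # (N, K)

        # Gather image features at the neighbors' projected pixel locations.
        uv = proj_uv[knn_idx].long()                            # (N, K, 2)
        region_feats = image_feats[uv[..., 1], uv[..., 0]]      # (N, K, Ci)

        # Aggregate the regional image features over the K neighbors
        # (max-pooling here is an assumed choice) and fuse with point features.
        aggregated = region_feats.max(dim=1).values              # (N, Ci)
        return self.fuse(torch.cat([point_feats, aggregated], dim=-1))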
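Similarly, the pairwise fusion of differently sized image feature maps attributed to IF-Fusion could be realized as below: each coarser map is upsampled to the resolution of the next finer map and combined with it, so the fine-scale detail that matters for small objects is retained. The concat-then-1x1-conv combination and the fine-to-coarse ordering are assumptions for illustration only.

import torch
import torch.nn as nn
import torch.nn.functional as F


class PairwiseFeatureFusion(nn.Module):
    """Hypothetical sketch of pairwise multi-scale image feature fusion in the
    spirit of IF-Fusion. The exact fusion operation is not specified in the
    abstract; concatenation followed by a 1x1 convolution is assumed."""

    def __init__(self, channels, num_levels):
        super().__init__()
        # One 1x1 conv per adjacent pair to mix the concatenated channels.
        self.mix = nn.ModuleList(
            [nn.Conv2d(2 * channels, channels, kernel_size=1)
             for _ in range(num_levels - 1)]
        )

    def forward(self, feature_maps):
        # feature_maps: list of (B, C, H_i, W_i) tensors ordered fine -> coarse.
        fused = feature_maps[-1]
        outputs = [fused]
        # Walk from the coarsest map toward the finest, fusing adjacent pairs.
        for i in range(len(feature_maps) - 2, -1, -1):
            finer = feature_maps[i]
            up = F.interpolate(fused, size=finer.shape[-2:],
                               mode="bilinear", align_corners=False)
            fused = self.mix[i](torch.cat([finer, up], dim=1))
            outputs.insert(0, fused)
        return outputs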