ABSTRACT Point‐of‐interest (POI) extraction aims to extract textual POIs from real‐world data. Existing POI extraction methods, such as user‐generated content from social media and web crawling, either require massive human resources or cannot guarantee integrity and reliability. Therefore, this paper proposes an end‐to‐end POI extraction framework based on a digital city, which is built from digital models, textures, tiles and other digital assets collected by aircraft. The POI extraction process consists of four sequential stages: collection, segmentation, recognition and cleaning. Each stage is enhanced either by fine‐tuning on a proposed specialised digital scene dataset or by developing tailored algorithms. In the last stage, the application of large language models (LLMs) to POI data cleaning is explored. By testing several LLMs of different scales with diverse chain‐of‐thought (CoT) strategies, the best‐performing prompt scheme for each LLM is identified with respect to noise handling, formatted output and overall cleaning capability. Ultimately, POIs extracted with the proposed methodology exhibit superior quality and accuracy and are more comprehensive than existing public commercial POI datasets, with the F1‐score increased by 19.6%, 21.1% and 23.8% on the Amap, Baidu and Google POI datasets, respectively.
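As a reading aid for the F1‐score figures reported above, the following minimal sketch shows one way an F1‐score between a set of extracted POI names and a reference set could be computed. It assumes exact string matching after whitespace and case normalisation; the abstract does not state the matching criterion actually used, and the function name `poi_f1` and the sample names are hypothetical.

```python
def poi_f1(extracted, reference):
    """F1-score between extracted and reference POI names.

    Assumption: two POIs match when their names are identical after
    stripping whitespace and lower-casing. The paper's real matching
    rule may differ.
    """
    def normalise(name):
        return name.strip().lower()

    extracted_set = {normalise(n) for n in extracted}
    reference_set = {normalise(n) for n in reference}

    true_positives = len(extracted_set & reference_set)
    if true_positives == 0:
        return 0.0  # also covers empty inputs, avoiding division by zero

    precision = true_positives / len(extracted_set)
    recall = true_positives / len(reference_set)
    return 2 * precision * recall / (precision + recall)


# Hypothetical sample data, for illustration only.
extracted_pois = ["Cafe A", "Bank B", "Shop C"]
reference_pois = ["cafe a", "bank b", "Hotel D", "Park E"]
score = poi_f1(extracted_pois, reference_pois)
```

Here two of the three extracted names appear in the four‐entry reference set, so precision is 2/3, recall is 1/2 and the F1‐score is 4/7.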