Abstract

Surgical scene understanding (SSU) uses artificial intelligence (AI) to interpret visual data from surgeries, such as laparoscopic videos. Despite promising foundational research on instrument and anatomy recognition, clinical adoption remains minimal. This systematic review and meta-analysis (PROSPERO: CRD420251005301) evaluates current SSU research in minimally invasive abdominal surgery, focusing on data curation, model design, validation, reporting standards, and clinical relevance. A total of 188 studies were reviewed. Most relied on small, single-center datasets (70.7%), primarily of laparoscopic cholecystectomies (59.0%), reflecting a narrow topical breadth. Validation practices were often weak, rarely involving external datasets (10.1%) or clinical experts. Few studies addressed clinical translation (5.9%), estimated the variability of model performance (38.3%), or made their code available (29.8%). Overall, progress toward clinical integration over the past decade has been limited. Our findings highlight the need for diverse, multi-institutional datasets, robust validation practices, and clinically driven development to unlock the full potential of SSU in surgical practice.