Abstract Multimodal optical and ultrasound imaging (USI) provides complementary diagnostic insights. However, because conventional USI uses opaque ultrasound (US) transducers, integrating these two modalities results in a bulky and complicated handheld probe in which neither modality performs efficiently. Transparent ultrasound transducers (TUTs) solve these issues by acting as optical windows, enabling the seamless combination of light and US beams, but single-element TUTs are not common in clinical environments. Here, we demonstrate a clinical triple-modal US, photoacoustic, and fluorescence imaging system, seamlessly integrated via a linear TUT-array. This system, with 64 channels and a 7-MHz center frequency, achieves 72.7% optical transparency in the near-infrared region. The system’s handheld opto-US probe coaxially integrates the TUT-array with a miniaturized camera and an optical fiber in a small form factor. The triple-modal imaging system effectively visualizes tissue structures, vasculature, and lymphatics in real time in live animals, healthy volunteers, and lymphedema patients. By accurately mapping superficial tissues, blood vessels, and lymphatic vessels, the prototype system successfully guides lymphovenous anastomosis microsurgery. These preclinical and clinical demonstrations illustrate the potential use of our system in various clinical procedures requiring microsurgical guidance, paving the way for future advances in multimodal imaging.