AI applications are deployed in diverse scenarios, including data centers, personal computers, and smart cars. Their privacy is threatened by complex software stacks and by potentially malicious system maintainers. Trusted Execution Environments (TEEs) have become a popular means of safeguarding applications from untrusted system software. However, AI applications are typically accelerated by heterogeneous accelerators such as GPUs, which requires the TEE to be heterogeneous as well. A heterogeneous TEE should satisfy three requirements: 1) a joint heterogeneous abstraction that covers the CPU and GPUs and minimizes the cooperation overhead among the enclaves on them; 2) high performance, so that it can support high-speed GPUs while introducing only limited overhead; and 3) compatibility with existing CPUs and GPUs, so that existing machines can benefit from it directly. To meet these requirements, this paper introduces XpuTEE, a practical and high-performance heterogeneous TEE system. XpuTEE provides a new abstraction called XpuEnclave, which comprises a CEnclave that protects CPU-side logic and multiple XEnclaves that guard GPU tasks. XpuEnclave is a joint TEE spanning the CPU and its connected GPUs; it removes all cryptographic operations and extra memory copies from CPU-GPU communication, which allows XpuTEE to achieve high performance. The results show that XpuTEE incurs an average performance overhead of 2.48% for common AI applications.