Distributed lock services are widely used in distributed systems to serialize concurrent accesses to shared resources. The need for fast and scalable lock services has grown as task execution times shrink and datasets expand. However, traditional lock managers, which rely on server CPUs to handle lock requests, suffer significant queuing delays that inflate lock grant latency. Advanced network hardware (e.g., programmable switches) offers a way to manage locks without queuing delays, thanks to its high packet processing power. Nevertheless, its constrained memory capacity restricts the number of locks it can manage, limiting its efficiency and efficacy in large-scale workloads with millions of locks. This paper introduces lock fission, which enables efficient management of million-scale locks by exploiting both programmable switches and servers. Lock fission decouples lock management into a memory-efficient grant decision process and a latency-insensitive participant maintenance process. This allows the programmable switch to make grant decisions efficiently for numerous locks, while servers asynchronously maintain participants (i.e., holders and waiters). Furthermore, by using the programmable switch for routing, lock fission supports on-demand, fine-grained lock migration, reducing network traffic and lock release delays. Building on this idea, we present FissLock, a fast and scalable in-network lock service for two representative lock management settings: lock managers on dedicated servers, and lock managers colocated with applications. Evaluation using various benchmarks and a real-world application demonstrates FissLock's efficiency and efficacy. Compared to the state-of-the-art in-network lock manager, FissLock reduces median lock grant time by up to 82.9% (from 44.3%) in the microbenchmark and improves transaction throughput for TATP and TPC-C by 2.26× and 2.46×, respectively.
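To make the decoupling concrete, the following is a minimal single-process sketch of the idea, not FissLock's actual design. It assumes exclusive locks only, a switch-side table that keeps nothing but a "held" flag per lock, and a server that learns about holders and waiters through an inbox it drains lazily. All class and method names (`SwitchGrantTable`, `ParticipantServer`, `ToyLockService`, `try_grant`, `drain`) are hypothetical and do not come from the paper.

```python
from collections import defaultdict, deque


class SwitchGrantTable:
    """Hypothetical switch-side state: only a 'held' flag per lock."""

    def __init__(self):
        self.held = set()

    def try_grant(self, lock_id):
        # Grant decision needs no holder or waiter information.
        if lock_id in self.held:
            return False
        self.held.add(lock_id)
        return True

    def clear(self, lock_id):
        self.held.discard(lock_id)


class ParticipantServer:
    """Hypothetical server-side state: holder and FIFO waiters per lock."""

    def __init__(self):
        self.holder = {}
        self.waiters = defaultdict(deque)
        self.inbox = deque()  # events from the switch, applied lazily

    def drain(self):
        # Apply queued events; this stands in for asynchronous maintenance.
        while self.inbox:
            kind, lock_id, client = self.inbox.popleft()
            if kind == "granted":
                self.holder[lock_id] = client
            else:
                self.waiters[lock_id].append(client)


class ToyLockService:
    """Assumed glue code tying the two halves together."""

    def __init__(self):
        self.switch = SwitchGrantTable()
        self.server = ParticipantServer()

    def acquire(self, lock_id, client):
        granted = self.switch.try_grant(lock_id)
        kind = "granted" if granted else "wait"
        self.server.inbox.append((kind, lock_id, client))
        return granted

    def release(self, lock_id, client):
        self.server.drain()
        if self.server.holder.get(lock_id) != client:
            raise ValueError("client does not hold this lock")
        queue = self.server.waiters[lock_id]
        if queue:
            # Hand the lock to the next waiter; the switch flag stays set.
            next_client = queue.popleft()
            self.server.holder[lock_id] = next_client
            return next_client
        del self.server.holder[lock_id]
        self.switch.clear(lock_id)
        return None
```

In this toy model, an uncontended `acquire` is decided from the flag alone, while the holder and waiter bookkeeping happens later, when the server drains its inbox. A real deployment would also have to handle shared locks, message loss, and lock migration, none of which this sketch attempts.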