Abstract: DNN inference is now widely deployed, and many inference services are spawned from scratch across instances in scenarios such as spot serving, serverless scaling, and edge computing, ...