The official implementation of Mantis: A Versatile Vision-Language-Action Model with Disentangled Visual Foresight