Humans tend to mine objects by learning from a group of images or several frames video since we live in dynamic world. In the computer vision area, many researchers focus on co-segmentation (CoS), co-saliency detection (CoSD) and salient object (VSOD) discover co-occurrent objects. However, previous approaches design different networks for these similar tasks separately, they are difficult appl...