Parallel text acquisition from the Web is an attractive way for augmenting statistical models (e.g., machine translation, crosslingual document retrieval) with domain representative data. The basis for obtaining such data is a collection of pairs of bilingual Web sites or pages. In this work, we propose a crawling strategy that locates bilingual Web sites by constraining the visitation policy o...