An effective and efficient Web content extractor for optimizing the crawling process. (27th March 2013)