A tree-based learning approach for document structure analysis and its application to web search. (August 2015)