Short version: How can I convert an XMLElement that represents part of an HTML document to plain text?

Long version: The more general problem is extracting information from webpages. We can import the webpage as an XMLObject and extract the relevant part. But this may still be a complex expression of many nested XMLElements (several paragraphs, links, emphasis, etc.), while I'm typically only interested in the text.

Let's take a random example: extracting the text from this article. Using the developer tools of any modern browser, it's easy to find out that the relevant part is in a div with id="article-body-blocks". So we do

    page = Import[
       "http://www.guardian.co.uk/science/blog/2012/nov/13/science-enforced-humility",
       "XMLObject"];
    body = Cases[page,
       XMLElement["div", {"id" -> "article-body-blocks"}, ___], Infinity];

The body is still a compound expression. Is there a built-in, direct way to extr...
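To make the shape of the problem concrete, here is a minimal hand-rolled sketch of the kind of conversion being asked about. It is not a built-in: `plainText` is a hypothetical helper name, and `sample` is a small made-up expression standing in for the imported `body`. It assumes only the standard three-argument form `XMLElement[tag, attributes, children]`, and it concatenates string children while ignoring tag names and attribute values.

```wolfram
(* Hypothetical helper: recursively join the text children of an XMLElement. *)
(* Tag names and attribute values are skipped; only child strings are kept. *)
plainText[XMLElement[_, _, children_List]] := StringJoin[plainText /@ children];
plainText[s_String] := s;
plainText[_] := "";  (* anything else (comments, unexpected heads) contributes nothing *)

(* Made-up stand-in for the imported div, for illustration only *)
sample = XMLElement["div", {"id" -> "article-body-blocks"},
   {XMLElement["p", {},
     {"Science ", XMLElement["em", {}, {"enforces"}], " humility."}]}];

plainText[sample]
```

Because `Cases` returns a list of matches, the same helper would be mapped over it, as in `plainText /@ body`. This sketch inserts no separators between paragraphs, so consecutive block elements run together unless the rule for those tags is extended.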