BNC Baby

  1. a 4-million-word subcorpus of the 100-million-word British National Corpus (BNC)

  2. British English at the end of the 20th century

  3. four distinct registers (academic texts, newspapers, fiction, conversations)

  4. Part-of-Speech (POS) tagging

  5. XML format
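
Because the corpus combines POS tagging with XML, tagged words can be read with any standard XML parser. The following is a minimal sketch, not an official reader. It assumes that each word is a `w` element carrying its tag in a `c5` attribute, as in the BNC XML edition; check this against the corpus documentation. The sample sentence and the function name `tagged_words` are invented for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Invented fragment modelled on the assumed BNC XML layout:
# <w> = word, <c> = punctuation, c5 = POS tag, hw = headword.
SAMPLE = (
    '<s n="1">'
    '<w c5="AT0" hw="the" pos="ART">The </w>'
    '<w c5="NN1" hw="cat" pos="SUBST">cat </w>'
    '<w c5="VVD" hw="sit" pos="VERB">sat</w>'
    '<c c5="PUN">.</c>'
    '</s>'
)


def tagged_words(xml_text):
    """Return (word, POS tag) pairs for every <w> element in xml_text."""
    root = ET.fromstring(xml_text)
    # Word text may carry trailing whitespace, so strip it.
    return [((w.text or "").strip(), w.get("c5")) for w in root.iter("w")]


pairs = tagged_words(SAMPLE)
# Count how often each POS tag occurs (punctuation is skipped,
# since only <w> elements are collected).
counts = Counter(tag for _, tag in pairs)

for word, tag in pairs:
    print(f"{word}\t{tag}")
```

For a real corpus file, `ET.parse(path).getroot()` could replace `ET.fromstring`, with the path pointing at wherever the corpus is installed locally.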

For more details, refer to the BNC website.