Monthly Archives: June 2017

Machine Learning that Matters

A summary of the paper Machine Learning that Matters by Kiri L. Wagstaff (2012). The paper expresses concern that ML research has become detached from the larger world of science and society. Most ML researchers work on synthetic or benchmark data, use some statistical methods to evaluate the results, and wind up the research by just […]

Perwad English Corpus

PEC currently contains thirteen million words from two sets of text: top books from Project Gutenberg and Forbes’ billionaire profiles. Here are the statistics.

Vocabulary size | Share in written English, by Oxford | Share in written English, by Perwad
10 | 25% | 26.1%
100 | 50% | 54.8%
1000 | 74.1% | 79.2%
2000 | 81.3% | 85.4%
3000 | 85.2% | 88.7%
4000 | 87.6% | 90.7%
5000 | 89.4% | […]
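For readers who want to reproduce this kind of figure, here is a minimal sketch of how a vocabulary-coverage share can be computed: the fraction of all word occurrences in a text that are accounted for by its N most frequent words. This is an illustration only, not the method actually used to build PEC. The function name `coverage_by_vocabulary`, the regex tokenizer, and the sample sentence are all assumptions made for the example.

```python
import re
from collections import Counter


def coverage_by_vocabulary(text, sizes):
    """Return {n: share of all tokens covered by the n most frequent words}.

    Hypothetical helper for illustration; tokenization here is a crude
    lowercase regex split, which is an assumption, not PEC's tokenizer.
    """
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = sum(counts.values())
    # Frequencies sorted from most to least common.
    ranked = [count for _, count in counts.most_common()]
    shares = {}
    for n in sizes:
        shares[n] = sum(ranked[:n]) / total if total else 0.0
    return shares


# Tiny made-up sample; a real corpus would be read from files.
sample = "the cat sat on the mat and the dog sat on the log"
shares = coverage_by_vocabulary(sample, [1, 2, 3])
for n, share in shares.items():
    print(f"top {n}: {share:.1%}")
```

On a real corpus the same function would be called with sizes such as 10, 100, and 1000 to obtain one row per vocabulary size.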