"They discovered that paid posters tend to post more new comments than replies to other comments.
They also post more often, with 50 per cent of them posting every 2.5 minutes on average. They also move on from a discussion more quickly than legitimate users, discarding their IDs and never using them again. What's more, the content they post is measurably different. These workers are paid by volume and so often take shortcuts, cutting and pasting the same content many times.
This would normally invalidate their posts, but only if it is spotted by the quality control team. (soooo who's on our QC team here at AnandTech?) So Cheng and co built some software to look for repetitions and similarities in messages as well as the other behaviors they'd identified. (http://lmgtfy.com/http://www.lmfgtfu.com ????) They then tested it on the dataset they'd downloaded from Sina and Sohu and found it to be remarkably good, with an accuracy of 88 per cent in spotting paid posters."
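
For anyone curious what "look for repetitions and similarities" might look like in practice, here is a rough Python sketch. To be clear, this is not the software Cheng and co built; the article doesn't describe it in that much detail. The function names, the 0.9 similarity threshold, and the "is_reply" field are all made up for illustration. It just measures two of the signals mentioned in the quote: how much of a user's output is near-duplicate copy-paste, and how many of their posts are new comments rather than replies.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio, ignoring case and extra whitespace."""
    norm_a = " ".join(a.lower().split())
    norm_b = " ".join(b.lower().split())
    return SequenceMatcher(None, norm_a, norm_b).ratio()


def repeated_fraction(messages: list[str], threshold: float = 0.9) -> float:
    """Fraction of messages that closely match at least one other message.

    The threshold is an arbitrary assumption, not a value from the study.
    """
    if len(messages) < 2:
        return 0.0
    flagged = set()
    # Compare every pair of messages; fine for one user's posts,
    # too slow (quadratic) for a whole site.
    for (i, a), (j, b) in combinations(enumerate(messages), 2):
        if similarity(a, b) >= threshold:
            flagged.update((i, j))
    return len(flagged) / len(messages)


def new_comment_ratio(posts: list[dict]) -> float:
    """Share of posts that are new comments rather than replies.

    Each post is a dict with a hypothetical boolean "is_reply" key.
    """
    if not posts:
        return 0.0
    new = sum(1 for p in posts if not p["is_reply"])
    return new / len(posts)
```

A user whose repeated_fraction and new_comment_ratio are both high would be worth a second look, but on their own these numbers prove nothing; the 88 per cent figure in the article comes from their classifier, not from anything like this toy.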