Ok, so this isn’t mind-blowing, since people have been doing frequency analysis of letters and words for years, for code breaking, stenography, and general nosey stuff. But with the advent of really fast computers and the internet, we have lots of things happening at once.
GenderGenie is a rather neat little script that uses the words in your blog to work out if the author is male or female. According to the website, the accuracy rate is about 60%, though the authors claim it should be 80%. I tried it, and it worked. Taking the front page of this site, it came back as male (no surprise, anyone could work that out!) and so did a few of the comments I tested.
The Belle du Jour “blog” gave some interesting results – the text on the page, which is supposedly written by a London call-girl (No chance!), came back as written by a man when the option of “blog” was chosen. However, if the same text is submitted as “fiction”, it returns a rating of “Female”! Either way, a telling result!
Of course, because the page shows the words that are used, and the weights given to each, it is rather fun to work out ways to subvert it…
“Now this is as good as something good” is a sentence almost sure to get your work returned as Male. Getting a good Female sentence is actually quite a lot harder, perhaps ‘cos I’m a guy?
“so since everything actually, like, happened to him too because everything actually, like, actually more happened to him since.” was about the ‘best’ I could manage…
You can play for yourself at the link above. My efforts got, firstly, 258 male, and secondly 715 female. Added to my page, it swings the gender a bit, but it still won’t rescue the front page (I just type “The” far too often!)
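For the curious, the words-and-weights idea can be sketched in a few lines of Python. This is only my guess at the general shape, not GenderGenie's actual code: the word lists and every weight below are made-up placeholders (the real ones are shown on the site's results page), and the names `score` and `guess` are just mine.

```python
import re

# Hypothetical weights for illustration only. These are NOT the values
# GenderGenie uses; the real table is shown on its results page.
MALE_WEIGHTS = {
    "the": 10, "this": 40, "as": 20, "is": 10,
    "now": 30, "good": 30, "something": 30,
}
FEMALE_WEIGHTS = {
    "so": 60, "since": 25, "everything": 45, "actually": 50,
    "like": 40, "him": 70, "too": 40, "because": 55, "more": 40,
}


def score(text):
    """Return (male_total, female_total) for the given text."""
    # Lower-case the text and pull out runs of letters as words.
    words = re.findall(r"[a-z']+", text.lower())
    # Every occurrence of a listed word adds its weight to that side.
    male = sum(MALE_WEIGHTS.get(word, 0) for word in words)
    female = sum(FEMALE_WEIGHTS.get(word, 0) for word in words)
    return male, female


def guess(text):
    """Pick whichever side has the larger total."""
    male, female = score(text)
    if male == female:
        return "Unknown"
    return "Male" if male > female else "Female"


if __name__ == "__main__":
    sentence = "Now this is as good as something good"
    print(score(sentence), guess(sentence))
```

Because every repeat of a listed word counts again, stuffing a sentence with a handful of heavily weighted words is enough to push the total one way, which is pretty much the trick I was playing above.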