Saturday, October 31, 2009

Spammers are exploiting UTF-8 homoglyphs to hide spam keywords

Don't know if that is the correct name, but Twitter spammers are currently using the UTF-8 encodings of characters that look like normal Latin letters to hide their spam keywords, e.g. EFBD89 -> i, EFBD8F -> o, etc.

The byte sequence EFBD89 decodes to U+FF49 in Unicode, which is a second i character in the Unicode table (the fullwidth Latin small letter i; I wonder what the point of this is) that displays as a widely spaced character.
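To check the mapping yourself, here is a small Python sketch that decodes the two byte sequences mentioned above and prints their code points and official Unicode names. The helper name `describe` is just something I made up for illustration.

```python
import unicodedata

def describe(hex_bytes):
    """Decode a hex-encoded UTF-8 sequence and return (code point, Unicode name)."""
    ch = bytes.fromhex(hex_bytes).decode("utf-8")
    return f"U+{ord(ch):04X}", unicodedata.name(ch)

for seq in ("EFBD89", "EFBD8F"):
    print(seq, "->", *describe(seq))
```

Both characters come from the "Halfwidth and Fullwidth Forms" block, which is why they render wider than the normal ASCII letters.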

This can be fixed easily, but right now I think any keyword filter that compares the raw text will fail on this.
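One possible fix, as a rough sketch: run the text through Unicode compatibility normalization (NFKC) before matching, which folds the fullwidth letters back to plain ASCII. The keyword list and the function names below are made up for the example and are not taken from any real filter.

```python
import unicodedata

# Hypothetical keyword list, for illustration only.
SPAM_KEYWORDS = {"casino", "pills"}

def normalize(text):
    """Fold compatibility characters (e.g. fullwidth letters) to their plain forms."""
    return unicodedata.normalize("NFKC", text).casefold()

def contains_spam(text):
    """Return True if any keyword appears in the normalized text."""
    folded = normalize(text)
    return any(keyword in folded for keyword in SPAM_KEYWORDS)

# Build a fullwidth version of "casino": fullwidth ASCII variants sit at
# a fixed offset of 0xFEE0 from their ASCII counterparts.
disguised = "".join(chr(ord(c) + 0xFEE0) for c in "casino")
print(contains_spam("win big at the " + disguised))
```

Note that NFKC only handles compatibility characters like the fullwidth forms. Lookalike letters from other scripts (Cyrillic or Greek, for example) are not folded by it and would need a separate mapping table.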


Thursday, October 15, 2009

Idle ...

My sister complained that no new posts are showing up here, so here are a few links:

Twitter.com

post.lehmann.cx (Posterous)

Facebook

www.lehmann.cx

Friendfeed

Aardvark

Redux