[ Mælström ]

This came about in 2006, after I attended a talk on bioinformatics. I had the idea of making an email client that would apply the methods of bioinformatics to spam detection.

FindPatterns searches through its input and outputs the sequences that are repeated. Because it is intended for text files, control characters are ignored.
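
As a rough illustration of the idea, repeated sequences can be found by counting every fixed-length window of the input, much like k-mer counting in bioinformatics. This is only a sketch and not necessarily how FindPatterns works internally. The name find_repeats and its parameters are hypothetical, "ignored" is assumed to mean that control characters are dropped before matching, and the default length of 3 mirrors the documented default minimum match length.

```python
from collections import Counter


def find_repeats(text, min_len=3):
    """Return {sequence: count} for each min_len-character window seen more than once.

    Hypothetical helper: a k-mer style count, not FindPatterns' own method.
    """
    # Assumption: control characters (code points below 32, plus DEL) are dropped.
    cleaned = "".join(ch for ch in text if ord(ch) >= 32 and ord(ch) != 127)
    # Slide a window of min_len characters across the cleaned input.
    windows = (cleaned[i:i + min_len] for i in range(len(cleaned) - min_len + 1))
    counts = Counter(windows)
    # Keep only the windows that occur at least twice.
    return {seq: n for seq, n in counts.items() if n > 1}


if __name__ == "__main__":
    for seq, n in sorted(find_repeats("buy now!\nbuy now!\n").items()):
        print(repr(seq), n)
```

A fixed window keeps the sketch short; a fuller version would extend each repeated window to the longest sequence that still repeats.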

FindPatterns [filename] [-b] [-e] [-i] [-o] [-v] [-m<n>] [-l<n>] [-g<n>] [-?|h]

Read input from this file if given; otherwise use stdin.
Keep a buffer to count repeated matches (!o -> b.)
Echo input.
Case-insensitive (not implemented.)
Don't display matches at the end.
Output matches immediately as they are found.
Silent mode - plain output with no extra characters.
Verbose comments while outputting.
Set memory buffer granularity to the closest power of two lower than <n> bytes (default 1024.)
Set match limit to <n> matches (default 4096; 0 -> no limit.)
Set minimum match length to <n> symbols (default 3).
Display this help screen and exit.
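
The granularity rounding mentioned in the list above can be sketched as follows. The function name is hypothetical, and the sketch assumes "lower than" means "not greater than", so that the default of 1024 maps to itself; the program may treat exact powers of two differently.

```python
def round_down_to_power_of_two(n):
    """Largest power of two that does not exceed n (n must be at least 1)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # bit_length() gives the position of the highest set bit, so shifting 1
    # by one less than that keeps only the highest bit.
    return 1 << (n.bit_length() - 1)


if __name__ == "__main__":
    for n in (1000, 1024, 1500):
        print(n, round_down_to_power_of_two(n))
```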

Adding -<s>- will turn off switch <s>.
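
A minimal sketch of how switches in this style could be parsed, including the trailing "-" that turns a switch off. The function parse_switches is hypothetical and does not reflect the actual FindPatterns source; it only follows the shapes shown in the usage line (a bare filename, -<s>, -<s>-, and -<s><n>).

```python
def parse_switches(args):
    """Parse usage-line style arguments into (filename, switches).

    Hypothetical sketch: boolean switches map to True/False, numeric ones to int.
    """
    filename = None
    switches = {}
    for arg in args:
        if len(arg) < 2 or arg[0] != "-":
            filename = arg              # anything that is not a switch is the input file
            continue
        name, rest = arg[1], arg[2:]
        if rest == "":
            switches[name] = True       # plain "-<s>" turns the switch on
        elif rest == "-":
            switches[name] = False      # "-<s>-" turns switch <s> off
        elif rest.isdigit():
            switches[name] = int(rest)  # "-<s><n>" carries a numeric value
        else:
            raise ValueError(f"unrecognised argument: {arg}")
    return filename, switches


if __name__ == "__main__":
    print(parse_switches(["mail.txt", "-o", "-b-", "-l5"]))
```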

Also included is KillSpam, a simple email client that takes the patterns generated by FindPatterns and eliminates every email that matches any of them.
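
The filtering step can be pictured with the short sketch below. How KillSpam actually matches patterns is not described here, so the function kill_spam is hypothetical and treats each pattern as a plain substring; the real client may use a richer pattern syntax.

```python
def kill_spam(emails, patterns):
    """Return the emails that contain none of the given patterns.

    Hypothetical sketch: patterns are treated as plain substrings.
    """
    # An empty pattern would match every email, so skip those.
    patterns = [p for p in patterns if p]
    return [body for body in emails if not any(p in body for p in patterns)]


if __name__ == "__main__":
    inbox = ["Buy cheap pills now", "Lunch on Friday?"]
    print(kill_spam(inbox, ["cheap pills"]))
```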

[ File ]
Source code for FindPatterns and KillSpam.
[ File ]
killspam.exe (62 KB)
The mail client for Windows console (included with the source code.)
[ File ]
spamregx (1 KB)
Sample regX.
-- π

From http://neil.chaosnet.org/code/FindPatterns/.