PureMessage identifies spam by analyzing messages according to a set of anti-spam rules. Each rule has a test and a corresponding "weight". For each rule that matches the message, the weight is added to the message's total spam score. After all rules are applied, the spam score is converted to a percentage. The PureMessage policy performs actions (such as quarantining a message) based on the percentage that expresses the message's total spam score.
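The scoring process described above can be illustrated with a minimal sketch. This is not PureMessage code: the rule names, weights, the logistic score-to-percentage mapping, and the 60% threshold are all assumptions chosen for illustration, since the actual conversion formula is not stated here.

```python
import math
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Rule:
    name: str                    # hypothetical rule name
    test: Callable[[str], bool]  # returns True when the rule matches
    weight: float                # added to the score on a match


def total_score(message: str, rules: List[Rule]) -> float:
    """Sum the weights of every rule whose test matches the message."""
    return sum(rule.weight for rule in rules if rule.test(message))


def score_to_percent(score: float) -> float:
    """Map a raw score onto 0-100.

    ASSUMPTION: a logistic curve is used here only as a stand-in;
    the real conversion is not documented in this section.
    """
    return 100.0 / (1.0 + math.exp(-score))


# Hypothetical rules; real rules ship with the product.
rules = [
    Rule("MENTIONS_VIAGRA", lambda m: "viagra" in m.lower(), 2.0),
    Rule("MANY_EXCLAMATIONS", lambda m: m.count("!") >= 3, 1.0),
    Rule("HAS_GREETING", lambda m: "hello" in m.lower(), -0.5),
]


def action_for(message: str, threshold: float = 60.0) -> str:
    """Choose a policy action from the percentage (threshold is assumed)."""
    percent = score_to_percent(total_score(message, rules))
    return "quarantine" if percent >= threshold else "deliver"
```

In this sketch a message that matches the first two rules accumulates a score of 3.0 and is quarantined, while a message that matches only the negative-weight rule is delivered.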
The PureMessage applications and configuration files used to configure spam detection from the command line are:
Anti-Spam Policy Related Configuration Files
PureMessage spam detection uses a number of 'feature groups'. Each feature group implements a different method of message analysis. One or more feature groups can be enabled at the same time. Feature groups are enabled via the configuration files stored in the /opt/pmx/etc/spam.d/compile.d directory.
The spam.conf configuration file sets general message-scanning parameters for all feature groups. These general configuration options are combined with the feature-group-specific options in the other configuration files.
After altering the anti-spam configuration, enabling or disabling a feature group, or adding or modifying rules, you must restart the PureMessage milter (using the command pmx-milter restart) for the changes to take effect.
PureMessage is distributed with a set of pre-configured anti-spam rules. These rules are regularly updated as part of the PureMessage Anti-Spam heuristic update. Only the weight and probability delta can be altered for default rules; these alterations are done using the pmx-spam program (see the pmx-spam man page for more information).
Custom rules are stored in the re.rules file, located in the etc/spam.d directory beneath the default PureMessage installation directory. Custom rule files are never updated as part of the PureMessage Anti-Spam heuristic update.
When rules are applied to messages, both default and custom rules are used.
Rule status (enabled or disabled), weights and probabilities are stored in a database, rather than in the rule definition files. To adjust rule weights, use the pmx-spam program.
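The separation between rule definitions and rule state can be pictured with a small sketch. It is a conceptual model only: the class names, field names, and the in-memory dictionary standing in for the database are hypothetical, and the real storage format is not described here.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class RuleDefinition:
    """What a rule file would hold (hypothetical fields)."""
    name: str
    pattern: str
    default_weight: float


@dataclass
class RuleState:
    """What the database would hold for a rule (hypothetical fields)."""
    enabled: bool = True
    weight: Optional[float] = None  # None means "use the default weight"


def effective_weights(
    definitions: List[RuleDefinition],
    states: Dict[str, RuleState],
) -> Dict[str, float]:
    """Combine definitions with stored state, skipping disabled rules."""
    result: Dict[str, float] = {}
    for definition in definitions:
        state = states.get(definition.name, RuleState())
        if not state.enabled:
            continue
        if state.weight is not None:
            result[definition.name] = state.weight
        else:
            result[definition.name] = definition.default_weight
    return result


# Default and custom rules are applied together.
default_rules = [RuleDefinition("DEFAULT_A", r"viagra", 2.0)]
custom_rules = [
    RuleDefinition("CUSTOM_B", r"free money", 1.5),
    RuleDefinition("CUSTOM_C", r"act now", 1.0),
]
states = {
    "DEFAULT_A": RuleState(weight=3.0),      # weight adjusted
    "CUSTOM_B": RuleState(enabled=False),    # rule disabled
}
active = effective_weights(default_rules + custom_rules, states)
```

Because the state lives apart from the definitions in this model, replacing the default definitions during an update would leave the adjusted weight for DEFAULT_A in place.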
In addition to the pre-configured anti-spam rules distributed with PureMessage, a secondary set of rules can be generated from your own message set. These rules are generated using "adaptive classification", which is a training system derived from a branch of statistics called Bayesian analysis. An adaptive classifier is a classification algorithm combined with a dynamic set of features. Features are the words (for example, "viagra") that define spam characteristics. The adaptive classifier's function is described as "training" because the rule set is generated by "learning" about the characteristics of messages contained in the training database.
If adaptive classification is enabled, the adaptive classification anti-spam rules are used together with the standard anti-spam rules to analyze messages for spam. Enabling adaptive classification rules has no effect on the standard anti-spam rule set; custom modifications to the standard rule set are preserved, and updates to the standard rule set can continue to be applied.
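The idea of training a classifier on your own message set can be sketched with a small naive Bayes model. This is a generic textbook technique, offered as an assumption about the general approach and not as the algorithm PureMessage actually implements; the class name, the whitespace tokenizer, and the "spam"/"ham" labels are all hypothetical.

```python
import math
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Split a message into lowercase word features (simplistic)."""
    return text.lower().split()


class AdaptiveClassifier:
    """Naive Bayes sketch: learns word features from labeled messages."""

    def __init__(self) -> None:
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text: str, label: str) -> None:
        """Add one labeled message ("spam" or "ham") to the training data."""
        self.word_counts[label].update(tokenize(text))
        self.message_counts[label] += 1

    def spam_probability(self, text: str) -> float:
        """Estimate P(spam | text) using Laplace-smoothed word frequencies."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_scores = {}
        for label in ("spam", "ham"):
            # Smoothed prior, so an untrained classifier returns 0.5.
            log_p = math.log(
                (self.message_counts[label] + 1) / (total_messages + 2)
            )
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                count = self.word_counts[label][word]
                # The extra +1 reserves probability mass for unseen words.
                log_p += math.log((count + 1) / (total_words + len(vocab) + 1))
            log_scores[label] = log_p
        # Convert the log-odds to a probability; clamp to avoid overflow.
        diff = log_scores["ham"] - log_scores["spam"]
        diff = max(min(diff, 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(diff))


classifier = AdaptiveClassifier()
classifier.train("cheap viagra offer", "spam")
classifier.train("viagra discount now", "spam")
classifier.train("meeting agenda for monday", "ham")
classifier.train("lunch on monday", "ham")
```

After training on these four sample messages, the sketch rates a message containing "viagra" as more likely spam than not, and a message about a Monday meeting as more likely legitimate. The feature set grows as more messages are added to the training data, which is what makes the classifier "adaptive".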