All Things Email

Bayesian Noise Reduction: Contextual Symmetry Logic Utilizing Pattern Consistency Analysis

by Jonathan A. Zdziarski

2004-01-14
Language: English

Abstract

Modern-day language classification requires machine learning, which relies heavily on the learning input it is given. Most of today's algorithms (Bayes, Chi-Square, etc.) are inherently sound and accurate; regardless of which algorithm is used, however, much of its accuracy depends directly on the quality of the data provided: the Garbage In, Garbage Out rule. Bayesian Noise Reduction (BNR) is a statistical approach to evaluating coherence by instantiating a series of machine-generated contexts to serve as a means of contrast. This makes it possible to identify text that is out of context using a form of pattern consistency checking. BNR attempts to solve the problem commonly referred to as "Bayesian Noise", which, in its simplest definition, refers to irrelevant data present in a message being classified. BNR dubs out irrelevant text in order to provide cleaner classification and is implemented as a pre-filter to existing language classification functions.
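The abstract does not spell out the mechanics, so the following is only a minimal sketch of what a pattern-consistency pre-filter of this kind might look like. It assumes each token already has a probability from the downstream classifier, that contexts are sliding windows of banded probabilities, and that a learned value per context is available from earlier training. The names `band`, `contexts`, `find_noise`, `prefilter`, and `context_value`, as well as every constant, are hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: all names and constants are assumptions,
# not the method as specified in the paper.

WINDOW = 3           # tokens per machine-generated context (assumed)
BAND = 0.05          # width of one probability band (assumed)
MIN_STRENGTH = 0.25  # a context is trusted only if its value is this far from 0.5 (assumed)
MAX_GAP = 0.25       # largest allowed gap between a token and its context (assumed)


def band(p):
    """Map a probability to an integer band index, e.g. 0.98 -> 20."""
    return round(p / BAND)


def contexts(probs):
    """Yield (start_index, pattern) for every sliding window of banded values."""
    bands = [band(p) for p in probs]
    for start in range(len(bands) - WINDOW + 1):
        yield start, tuple(bands[start:start + WINDOW])


def find_noise(probs, context_value):
    """Return sorted indices of tokens that disagree with a trusted context.

    context_value maps a pattern (tuple of band indices) to a learned
    probability; it is a hypothetical table trained elsewhere.
    """
    noisy = set()
    for start, pattern in contexts(probs):
        value = context_value.get(pattern)
        # Skip contexts that are unknown or too close to neutral to trust.
        if value is None or abs(value - 0.5) < MIN_STRENGTH:
            continue
        for i in range(start, start + WINDOW):
            if abs(probs[i] - value) > MAX_GAP:
                noisy.add(i)
    return sorted(noisy)


def prefilter(tokens, probs, context_value):
    """Drop tokens flagged as noise before handing the rest to a classifier."""
    noisy = set(find_noise(probs, context_value))
    return [t for i, t in enumerate(tokens) if i not in noisy]
```

In this sketch, a low-probability token sitting inside a context whose learned value is strongly high is treated as out of context and removed, and the remaining tokens are passed unchanged to whatever classifier follows. The windowing, banding, and thresholds are design choices made for illustration and would need tuning against real data.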

Creative Commons. Some Rights Reserved.
Copyright © 2004 Jochen Topf
Unless otherwise noted the contents on this site are licensed under the
Creative Commons Attribution-ShareAlike License.