Baraka Network
Mastering Networks made easy

 

"I needed help urgently with my office network.
I called up Baraka and they solved the issue in no time at all. "
            -Mike Kendrick.


Easy Solutions at Baraka

 

Contact Details:
Head office
6-353 Broadway
Shawinigan Que.
Canada G9N-1M2
(819) 531-2340

 


How ReSieve does its email classification 

ReSieve uses a technique called Naive Bayes to estimate, from the words in an email, the probability that the email belongs in a specific bucket.

A bucket is represented by a collection of words and their frequencies. The set of buckets is called the corpus. The corpus determines three things: the different buckets that an email can be placed in, the probability of an individual word appearing in an email for a specific bucket, and the probability of an email being in a bucket to start with.
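As an illustration of the bucket and corpus structure described above, here is a minimal Python sketch. It is not ReSieve's actual code: the function name `build_corpus`, the bucket names, the sample emails and the whitespace tokenization are all assumptions made for the example.

```python
from collections import Counter

def build_corpus(training_emails):
    """Map each bucket name to a Counter of word frequencies.

    training_emails is an iterable of (bucket_name, email_text) pairs.
    """
    corpus = {}
    for bucket, text in training_emails:
        # Assumed tokenization: lowercase, then split on whitespace.
        words = text.lower().split()
        corpus.setdefault(bucket, Counter()).update(words)
    return corpus

# Hypothetical training data, for illustration only.
corpus = build_corpus([
    ("spam", "cheap pills cheap offer"),
    ("work", "meeting agenda offer"),
])
```

Each bucket ends up as a word-to-count table, and the dictionary of all buckets plays the role of the corpus.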

Suppose there are n buckets B1 to Bn and there are m words in total W1 to Wm. We want to know for a specific email E which bucket it is most likely to belong to.

We want to calculate P(Bi|E) for each bucket Bi. That calculation can be performed using Bayes' rule as follows:

P(Bi|E) = ( P(E|Bi) x P(Bi) ) / P(E)

Here P(Bi|E) is the probability that email E is in bucket Bi; that is, the probability that the set of words E belongs in bucket Bi.

P(E|Bi) is the probability that for a given bucket Bi the words in E appear in that bucket.

P(Bi) is the probability of a given bucket; that is the probability of any email being in bucket Bi.

P(E) is the probability of that specific email occurring.

To decide which bucket E should go in, we need to calculate P(Bi|E) for each of the buckets and find the largest. Since P(E) is the same in every one of those calculations, it does not affect which bucket comes out largest, so we ignore it and pretend that we only need to calculate

P(Bi|E) = P(E|Bi) x P(Bi)
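To see why dropping P(E) is harmless, here is a small Python sketch with made-up numbers. The likelihoods and priors are invented for illustration, and P(E) is recovered as the sum of P(E|Bi) x P(Bi) over all buckets.

```python
# Made-up values for two hypothetical buckets.
likelihood = {"spam": 0.02, "work": 0.005}  # P(E|Bi)
prior = {"spam": 0.4, "work": 0.6}          # P(Bi)

# Simplified score: P(E|Bi) x P(Bi), with P(E) left out.
scores = {b: likelihood[b] * prior[b] for b in prior}

# P(E) is the sum of the scores over all buckets.
p_e = sum(scores.values())

# Full Bayes rule: divide each score by the same P(E).
posterior = {b: s / p_e for b, s in scores.items()}

best_by_score = max(scores, key=scores.get)
best_by_posterior = max(posterior, key=posterior.get)
```

Dividing every score by the same positive number rescales them so they sum to 1 but does not change their order, so both ways of choosing pick the same bucket.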

First E is split into the words it contains; call them E1 through Eo. To calculate P(E|Bi) we multiply together the probabilities for each word, that is, the likelihood that each word appears in Bi. Here's the "naive" step: we assume that words appear independently of one another, which is clearly not true for most languages!

P(E|Bi) = P(E1|Bi) x P(E2|Bi) x ... x P(Eo|Bi)

For any bucket Bi, P(Ej|Bi) is calculated as the number of times Ej appears in Bi divided by the total number of words in Bi.

P(Bi) is calculated as the total number of words in Bi divided by the total number of words in all the buckets put together.
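The two estimates above can be sketched in Python as follows. The function names `word_probability` and `bucket_probability` and the sample counts are hypothetical, chosen only to make the arithmetic easy to follow.

```python
from collections import Counter

# Hypothetical corpus: bucket name -> word frequencies.
corpus = {
    "spam": Counter({"cheap": 2, "pills": 1, "offer": 1}),
    "work": Counter({"meeting": 3, "agenda": 2, "offer": 1}),
}

def word_probability(corpus, bucket, word):
    """P(Ej|Bi): occurrences of the word in the bucket / words in the bucket."""
    bucket_total = sum(corpus[bucket].values())
    return corpus[bucket][word] / bucket_total

def bucket_probability(corpus, bucket):
    """P(Bi): words in the bucket / words in all buckets put together."""
    all_words = sum(sum(counts.values()) for counts in corpus.values())
    return sum(corpus[bucket].values()) / all_words

p_cheap_spam = word_probability(corpus, "spam", "cheap")  # 2 / 4
p_spam = bucket_probability(corpus, "spam")               # 4 / 10
```

Note that a word which never appears in a bucket gets probability 0 under this estimate, which would zero out the whole product for that bucket. The text above does not say how ReSieve deals with that case.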

Finally we calculate P(Bi|E) as

P(Bi|E) = P(E1|Bi) x P(E2|Bi) x ... x P(Eo|Bi) x P(Bi)

for each bucket and pick the largest.
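Putting the steps together, a minimal end-to-end sketch in Python might look like this. It is an illustration under stated assumptions, not ReSieve's implementation: the `classify` function, the whitespace tokenization and the sample corpus are invented, and the `unseen` floor for words missing from a bucket is an assumption added so that a single unknown word does not force the product to zero.

```python
from collections import Counter
from math import prod

def classify(corpus, email_text, unseen=1e-6):
    """Return (best_bucket, scores), where each score is
    P(E1|Bi) x ... x P(Eo|Bi) x P(Bi)."""
    words = email_text.lower().split()  # assumed tokenization
    all_words = sum(sum(counts.values()) for counts in corpus.values())
    scores = {}
    for bucket, counts in corpus.items():
        bucket_total = sum(counts.values())
        prior = bucket_total / all_words  # P(Bi)
        # Product of per-word probabilities; `unseen` is an assumed
        # floor for words that never occur in this bucket.
        likelihood = prod(
            counts[w] / bucket_total if counts[w] else unseen
            for w in words
        )
        scores[bucket] = likelihood * prior
    return max(scores, key=scores.get), scores

# Hypothetical corpus for illustration.
corpus = {
    "spam": Counter({"cheap": 2, "pills": 1, "offer": 1}),
    "work": Counter({"meeting": 3, "agenda": 2, "offer": 1}),
}

bucket, scores = classify(corpus, "cheap offer")
```

For long emails, multiplying many small probabilities can underflow to zero in floating point. Summing logarithms of the probabilities instead is a common workaround, though the text above does not say whether ReSieve does this.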

 
