Fuzzy text matching for Email Loop Protection
Email Loop Protection is a feature of Email to Case Premium that helps identify and stop email loops in their tracks. Each time a case is created via Email to Case Premium, a check is performed to see whether the Subject and Sender of the new case match those of an earlier case created within a defined number of minutes. Email Loop Protection applies fuzzy text matching to the case subject field, and the aggressiveness of this matching can be configured as described below.
Email to Case Premium (E2CP) first searches for cases created within the time frame specified on Inbound Configuration whose supplied email and subject exactly match the sender and subject of the incoming email. If a match is found, the new case is marked as a duplicate by prefixing the subject with [Email Loop Protection], and inbound message processing continues executing.
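The exact-match stage can be pictured with a minimal Python sketch. It is an illustration only, not E2CP's actual code: the Case class, find_exact_match, mark_duplicate, and the window_minutes parameter are hypothetical names, and the single space placed between the prefix and the subject is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, Optional

LOOP_PREFIX = "[Email Loop Protection]"


@dataclass
class Case:
    """Hypothetical stand-in for a case record."""
    supplied_email: str
    subject: str
    created: datetime


def find_exact_match(cases: Iterable[Case], sender: str, subject: str,
                     received: datetime, window_minutes: int) -> Optional[Case]:
    """Return the first case inside the time frame whose sender and subject match exactly."""
    cutoff = received - timedelta(minutes=window_minutes)
    for case in cases:
        in_window = cutoff <= case.created <= received
        if in_window and case.supplied_email == sender and case.subject == subject:
            return case
    return None


def mark_duplicate(subject: str) -> str:
    """Prefix the subject; the separating space is an assumption."""
    return f"{LOOP_PREFIX} {subject}"
```

In this sketch, a case created 3 minutes before the incoming email matches when the window is 10 minutes, while an otherwise identical case created 90 minutes earlier does not.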
If no exact match is found, E2CP queries cases with the same supplied email and compares each case subject with the incoming email subject, provided the subjects are longer than 5 characters. The comparison has two parts:
1) If the case subject and the email subject are both longer than 20 characters, each is split in half, and E2CP checks whether each half of one subject is contained in the full text of the other. (For shorter subjects, each check uses a portion larger than half, to reduce false positives.) For example, suppose a case has the subject "Ticket 10000: The front office computer is broken" and an email is received with the subject "Ticket 10001: The front office computer is broken". E2CP will check whether "Ticket 10000: The front " and "office computer is broken" are contained within "Ticket 10001: The front office computer is broken", and vice versa. Each of the four checks is worth 2.5 points. If the total is greater than 5 points, the new case is marked as a duplicate by prefixing the subject with [Email Loop Protection], and inbound message processing continues executing. In this example only the two second-half checks succeed, so the total is 5 points, which is not greater than 5, and the comparison moves on to part 2.
2) If the score from part 1 is not greater than 5 points, an edit difference comparison is performed. This produces a value between 0 and 10 that corresponds, broadly, to how many characters the case and email subjects have in common in nearby positions. A value of 10 indicates identical subjects, 5 means that about 50% of the characters in one subject would need to be edited to transform it into the other, and 0 means that every character would need to be changed (no similarity at all). If this value is greater than the minimum required similarity (default: 5), the new case is marked as a duplicate by prefixing the subject with [Email Loop Protection], and inbound message processing continues executing.
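The two parts above can be sketched in Python as follows. This is a rough illustration under stated assumptions, not E2CP's implementation: all function names are hypothetical, the comparison is assumed to be case-sensitive, the portion used for short subjects (75% here) is a guess because only "greater than half" is documented, and Levenshtein distance stands in for the unspecified edit difference algorithm.

```python
import math


def split_portions(subject: str, use_halves: bool, short_fraction: float = 0.75):
    """Return the two portions of a subject that are searched for in the other subject."""
    if use_halves:
        mid = len(subject) // 2
        return subject[:mid], subject[mid:]
    # Assumption: short subjects use overlapping leading/trailing portions of 75%.
    size = max(1, math.ceil(len(subject) * short_fraction))
    return subject[:size], subject[-size:]


def half_containment_score(case_subject: str, email_subject: str) -> float:
    """Part 1: four containment checks, 2.5 points each."""
    use_halves = len(case_subject) > 20 and len(email_subject) > 20
    score = 0.0
    for source, target in ((case_subject, email_subject), (email_subject, case_subject)):
        for portion in split_portions(source, use_halves):
            if portion in target:
                score += 2.5
    return score


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (assumed algorithm), computed row by row."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity_score(a: str, b: str) -> float:
    """Part 2: 10 for identical subjects, 0 when every character must change."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 10.0
    return 10.0 * (1 - edit_distance(a, b) / longest)


def is_probable_duplicate(case_subject: str, email_subject: str,
                          minimum_dupe_score: float = 5.0) -> bool:
    """Apply part 1, then part 2, to subjects longer than 5 characters."""
    if len(case_subject) <= 5 or len(email_subject) <= 5:
        return False
    if half_containment_score(case_subject, email_subject) > 5:
        return True
    return similarity_score(case_subject, email_subject) > minimum_dupe_score
```

For the two ticket subjects used as the example, this sketch scores 5.0 in part 1 and roughly 9.8 in part 2 (one edited character out of 49), so the email would be flagged as a duplicate at the default minimum of 5.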
Two preferences control the fuzzy text matching. To make the matching more or less aggressive:
Navigate to Setup | Custom Code | Custom Settings, click Manage next to "casepref", and set either of the following preferences:
MINIMUM_DUPE_SCORE: [0-10, default: 5], the minimum similarity score required for a new case to be marked as a duplicate. Higher values make the matching less aggressive.
SHOW_DUPE_SCORE: [true/false, default: false], instructs E2CP to include the similarity score in the duplicate case's subject.