Summary:

This KBA illustrates a technique for identifying the cause of duplicate companies being returned when doing a Company search. This issue may also manifest as views being missing when creating a group on the Company entity.

Symptoms:

An number of existing scripts for the identification fo duplicates are available, posted under article 636-16827. The purpose if this article is to explain how to identify duplicates where an existing cause has not been determined.

Here's how I usually do this:

Since the Company search is carried out using vSearchListCompany, the duplicates must be caused by one of the tables that's part of that view. It usually isn't a problem with the Company table, more often something on the phone or email links. The trick is to search vSearchListCompany for a known duplicate company.

Next, you take the view SQL from vSearchListCompany and use it to search for a known duplicate, commenting out one of the linked tables. Here's an example where I commented out the Person email and phone information:

SELECT RTRIM(ISNULL(Pers_FirstName, '')) + ' ' + RTRIM(ISNULL(Pers_LastName, '')) AS Pers_FullName,
-- epd_pers.epd_EmailAddress as Pers_EmailAddress, epd_pers.epd_PhoneCountryCode as Pers_PhoneCountryCode,
-- epd_pers.epd_PhoneAreaCode as Pers_PhoneAreaCode,
-- epd_pers.epd_PhoneNumber as Pers_PhoneNumber, epd_pers.epd_PhoneFullNumber AS Pers_PhoneFullNumber,
-- epd_pers.epd_FaxCountryCode as Pers_FaxCountryCode, epd_pers.epd_FaxAreaCode as Pers_FaxAreaCode,
-- epd_pers.epd_FaxNumber as Pers_FaxNumber, epd_pers.epd_FaxFullNumber AS Pers_FaxFullNumber,
epd_comp.epd_EmailAddress as Comp_EmailAddress, epd_comp.epd_PhoneCountryCode as Comp_PhoneCountryCode,
epd_comp.epd_PhoneAreaCode as Comp_PhoneAreaCode, epd_comp.epd_PhoneNumber as Comp_PhoneNumber,
epd_comp.epd_PhoneFullNumber AS Comp_PhoneFullNumber, epd_comp.epd_FaxCountryCode as Comp_FaxCountryCode,
epd_comp.epd_FaxAreaCode as Comp_FaxAreaCode, epd_comp.epd_FaxNumber as Comp_FaxNumber,
Company.*,
Person.*,
Account.*,
Address.*,
(SELECT CASE WHEN CompanyCount = 0 or AccountCount = 0 THEN NULL ELSE 'Y' END
FROM vSharedAddress
WHERE vSharedAddress.AdLi_AddressId = Addr_AddressId) as Addr_IsSharedAddress
FROM Company
LEFT JOIN CRMEmailPhoneData epd_comp ON epd_comp.epd_EntityID = 5 AND epd_comp.epd_RecordID = Comp_CompanyID
LEFT JOIN Person ON Comp_PrimaryPersonId = Pers_PersonId
-- LEFT JOIN CRMEmailPhoneData epd_pers ON epd_pers.epd_EntityID = 13 AND epd_pers.epd_RecordID = Pers_PersonID
LEFT JOIN Address ON Comp_PrimaryAddressId = Addr_AddressId
LEFT JOIN Account ON Comp_PrimaryAccountId = Acc_AccountId
WHERE Comp_Deleted IS NULL and Acc_Deleted IS NULL
and Comp_Name like N'Duplicate company name%'

If you run the commented-out statement and only one record is returned, then you know where the duplicates are. If you still get duplicates, try commenting out another table (e.g. epd_comp).

Once you've spotted where the duplicates are, you can take a look at a sample of them to decide whether the duplicates are identical (e.g. the same emailaddresses), or whether there's something weirder going on (e.g. different email addresses, but multiple Business email addresses for a each company).

What you'll decide to do with them really depends on what's wrong with the data. It it's a problem with identical emails, you might just delete the duplicates. If you've multiple records of the same types, you might want to do something clever to assign the additional records to new email or phone types.

More info:

There are multiple case references and KBAs available with SQL snippets that have been used to identify the source of duplicates, and how to remove them. The fixes for these cases range from the trivial to very involved, depending on the root cause of the issue.

Executing untested SQL snippets from cases that look superficially similar is not recommended - it's very important in these cases to be sure that you understand why the duplicates are being returned, and what exactly needs to be done to resolve the issue before any action is attempted.

Once the cause of the issue has been identified, a backup should be taken, and full testing should be planned prior to running any fix.