Find Duplicate Foreign Keys

Posted by & filed under , , , , , .

A few weeks back I was working on a process that would utilize foreign keys to crawl from table to table to export data from a production database to an archive database.  More on some of what went into that process probably at some point in the future.

When the project was released to production a strange issue occurred.  I was getting duplicates in a few of the tables that previously didn’t cause duplicates to occur.  I went through the code and was able to determine that the issue was tied to duplicated INSERT statements that derived from foreign keys.

To solve the issue I ended up creating the query below that reviewed the foreign keys on the tables to determine if there were duplicates in the database.


;WITH ForeignKeys1 AS
(
SELECT fk.object_id
,OBJECT_NAME(fk.parent_object_id) as table_name
,fk.name as foreign_key_name
,fk.create_date
,(SELECT CAST(parent_object_id as varchar(50))
+ SPACE(1) + CAST(parent_column_id as varchar(50))
+ SPACE(1) + CAST(referenced_object_id as varchar(50))
+ SPACE(1) + CAST(referenced_column_id as varchar(50)) as [data()]
FROM sys.foreign_key_columns fkc
WHERE fk.object_id = fkc.constraint_object_id
ORDER BY constraint_column_id
FOR XML PATH('')) foreign_key
FROM sys.foreign_keys fk
)
,ForeignKeys2 AS
(
SELECT object_id, table_name, foreign_key_name, foreign_key
,ROW_NUMBER() OVER (PARTITION BY foreign_key ORDER BY create_date) as ForeignKeyRank
FROM ForeignKeys1
)
SELECT x.table_name as [Table Name]
,x.foreign_key_name as [Foreign Key]
,y.foreign_key_name as [Foreign Key Duplicate]
FROM ForeignKeys2 x
INNER JOIN ForeignKeys2 y ON x.foreign_key = y.foreign_key AND x.ForeignKeyRank > y.ForeignKeyRank

Hope someone else finds some use out of this.

  • Aaron

    Why wouldn’t you just do something like:

    SELECT OBJECT_NAME(parent_object_id) AS ‘Parent’,
    OBJECT_NAME(referenced_object_id) AS ‘References’,
    COUNT(referenced_object_id) AS ‘Count”’
    FROM sys.foreign_keys
    GROUP BY parent_object_id, referenced_object_id
    HAVING COUNT(referenced_object_id) > 1
    ORDER BY OBJECT_NAME(parent_object_id)
    GO

    • http://feeds2.feedburner.com/StrateSql jstrate

      That wouldn’t work in situations where the same parent table was connected to multiple columns in the same table. For instance, this would happen fairly often in data warehouses where fact tables were related across multiple columns to the date dimension.

    • StrateSQL

      Somehow I missed this comment, but the reason that script doesn’t work is it doesn’t take into account the tables between the two tables. For instance, a date dimension would likely have multiple foreign key relationships between itself and the fact table.

  • Pingback: Memorias de un DBA

  • Bibeesh

    Great script..

    • StrateSQL

      Thanks.

  • SQL Server Monkey

    Hit this issue the other week on a new system that we are building, something went awry during a build. Done a bit more digging and SQL Server lets you create multiple foreign keys on a column, they can even reference different tables which I find a bit strange.

    • StrateSQL

      That’s an interesting way to find this issue. I guess if the database is leveraging sub-types and super-types in the table design, then you’d see foreign key relationships like this. I’d imagine this would kill performance on deletes.