Project Description
Duplicate Finder is a simple utility that looks through files and finds identical lines, which can indicate duplicate code or cut-and-paste coding. It is written in C# using some .NET 2.0 features.

It is aimed at detecting duplicate statements in source code, but uses simple text comparison techniques that should work on other kinds of text files.
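To make the idea of simple text comparison concrete, here is a minimal sketch in Python. It is not the code in DuplicateFinderLib (which is written in C#), and the function name find_duplicate_runs is hypothetical. It assumes lines are compared after trimming whitespace, and it only compares lines in different files to keep the example short.

```python
from collections import defaultdict
from itertools import combinations


def find_duplicate_runs(files, threshold):
    """Return runs of at least `threshold` identical consecutive lines.

    `files` maps a file name to its list of lines. Each result is
    (length, (name_a, first_line_a), (name_b, first_line_b)) with
    1-based line numbers. Illustrative only, not the tool's algorithm.
    """
    # Assumption: lines are compared after trimming whitespace.
    norm = {name: [line.strip() for line in lines]
            for name, lines in files.items()}

    # Index every line by its text so matching lines are found quickly.
    positions = defaultdict(list)
    for name, lines in norm.items():
        for index, text in enumerate(lines):
            positions[text].append((name, index))

    runs = []
    for places in positions.values():
        for (fa, ia), (fb, ib) in combinations(places, 2):
            if fa == fb:
                continue  # simplification: only compare different files
            a, b = norm[fa], norm[fb]
            # Skip pairs that sit in the middle of a longer run.
            if ia > 0 and ib > 0 and a[ia - 1] == b[ib - 1]:
                continue
            # Extend the match forward for as long as the lines agree.
            length = 0
            while (ia + length < len(a) and ib + length < len(b)
                   and a[ia + length] == b[ib + length]):
                length += 1
            if length >= threshold:
                runs.append((length, (fa, ia + 1), (fb, ib + 1)))
    return sorted(runs)
```

Because the comparison is purely textual, the same approach works on any line-oriented text file, not only on source code.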

I wrote this program for people like myself: C# programmers with an interest in code quality and in automated tools that find code quality problems. However, you may find other uses for it.

The source code consists of:
DuplicateFinderLib - the engine for finding duplicates
DupFinder.exe - the command-line tool that uses the engine
DuplicateFinder.Tasks - the MSBuild task for the engine. This allows integration with C# automated builds.
DuplicateFinder.TestLibrary - the unit test cases, using NUnit

DuplicateFinder 1.5 is out.

An example of using the command-line tool:
>DupFinder.exe -t4 test5*.txt
Processing in C:\Code\DuplicateFinder\TestData
2 files read
Duplicate of length 5 at:
 Line 2-6 in C:\Code\DuplicateFinder\TestData\Test5Lines1.txt
 Line 2-6 in C:\Code\DuplicateFinder\TestData\Test5Lines2.txt
1 duplicate found

A more realistic example for C# code, looking through all files in the source tree, for duplicates of 9 lines or more, excluding the generated files called AssemblyInfo.cs:
C:\Code\DuplicateFinder>DupFinder.exe -t9 -r -eAssemblyInfo.cs *.cs

Processing in C:\Code\DuplicateFinder
11 files read
Duplicate of length 11 at:
 Line 1-11 in C:\Code\DuplicateFinder\TestLibrary\TestAllFiles.cs
 Line 1-11 in C:\Code\DuplicateFinder\TestLibrary\TestFiles.cs
 Line 1-11 in C:\Code\DuplicateFinder\TestLibrary\TestProgramFile.cs
1 duplicate found

The duplicate finder has found that the three test files share the same using and namespace lines at the top.
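The file selection in the example above (a wildcard pattern, a recursive search and an excluded file name) can be sketched as follows. This is a Python illustration under stated assumptions, not the tool's C# code: select_files and its parameters are hypothetical names, and it assumes the exclusion is matched against the file name only.

```python
import fnmatch
import os


def select_files(root, pattern, recursive=False, exclude=()):
    """Return paths under `root` whose file name matches `pattern`.

    `exclude` holds file-name patterns to skip (assumed to match the
    file name only). Illustrative sketch with hypothetical names.
    """
    selected = []
    for folder, subfolders, names in os.walk(root):
        for name in names:
            if not fnmatch.fnmatch(name, pattern):
                continue  # does not match the wildcard
            if any(fnmatch.fnmatch(name, ex) for ex in exclude):
                continue  # explicitly excluded, e.g. generated files
            selected.append(os.path.join(folder, name))
        if not recursive:
            subfolders.clear()  # stop os.walk from descending further
    return sorted(selected)
```

A call such as select_files(".", "*.cs", recursive=True, exclude=["AssemblyInfo.cs"]) corresponds roughly to the arguments in the example. Note that fnmatch is case-insensitive on Windows but case-sensitive on most other platforms.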

Duplicate Finder Commandline arguments explained
The MSBuild Task

Last edited Aug 12, 2009 at 10:05 PM by AnthonySteele, version 23