Basic Excel Help!
Author
Discussion

samrr

Original Poster:

2,379 posts

252 months

Friday 12th March 2010
quotequote all
Hi - can someone help me out here?

I need to dedupe a few lists and need to go by the phone number.

The formula I have found is
=IF(COUNTIF(A$1:A$7,A1)>1,"Duplicated","")

Now it throws up loads of dupes which is good however when I sort the column by the phone number it has 'duplicated' next to all the dupes but also has a few more in there?

As it's sorted by accending order the dupes should be next to each other which most are however it has 'duplicated' next to some numbers which I can't find another number the same?

Also - (once I have sorted the above) how do I delete one of the dupe records in one go?

Thanks in advance!

FunkyGibbon

3,847 posts

288 months

Friday 12th March 2010
quotequote all
I take it that formula correctly adds duplicate to the row if a duplicate number is found?

If so, you can use the Filter function (top menu -> data -> autofilter) this will allow you to filter the data on rows that only contain the word duplicate. If you are unfamiliar with Filters the help examples are quite good.

You can then highlight the filtered rows and delete.

samrr

Original Poster:

2,379 posts

252 months

Friday 12th March 2010
quotequote all
FunkyGibbon said:
I take it that formula correctly adds duplicate to the row if a duplicate number is found?
It does however it also adds 'duplicate' next to some numbers that dont seem to have another number the same? I'm not sure if the formula is playing up of something is wrong in it?

Will keep playing around with it....

Once sorted i'll do what you say to delete! Thanks!

ewenm

28,506 posts

269 months

Friday 12th March 2010
quotequote all
If they are sorted by the field you want to check for duplicates, you can use the formula =IF(B1=A1,"Dupe","") and just copy it down - no need to check every number every time, only the one immediately before.

samrr

Original Poster:

2,379 posts

252 months

Friday 12th March 2010
quotequote all
That is ok now - not sure why it was throwing in the extras.

I have filtered by the dupe column and now I have a long list of dupe numbers which is good however I want to only delete one of the duplications - is that possible without doing it one by one?

sgrimshaw

7,574 posts

274 months

Friday 12th March 2010
quotequote all
ewenm said:
If they are sorted by the field you want to check for duplicates, you can use the formula =IF(B1=A1,"Dupe","") and just copy it down - no need to check every number every time, only the one immediately before.
I do the same, but then select all the range that has the formula above in it, copy then "paste special / values" back op top of itself.

Then sort the whole database by the column that has DUPE in it.

This puts all the DUPE's together and you can easily select all the DUPE rows in one go and delete them.

ewenm

28,506 posts

269 months

Friday 12th March 2010
quotequote all
samrr said:
That is ok now - not sure why it was throwing in the extras.

I have filtered by the dupe column and now I have a long list of dupe numbers which is good however I want to only delete one of the duplications - is that possible without doing it one by one?
The advantage of doing it the way I suggested is that one of the duplicates (the first) is always left un-marked and you can delete all the others. I too use the copy/paste-special method as I don't trust the filtering hehe

mrmr96

13,736 posts

228 months

Friday 12th March 2010
quotequote all
ewenm said:
If they are sorted by the field you want to check for duplicates, you can use the formula =IF(B1=A1,"Dupe","") and just copy it down - no need to check every number every time, only the one immediately before.
This is my preferred method as it's the fastest. However, as others have said you can de-dup the entire database in one go as follows:

1 - Highlight the column with the phone numbers in it.
2 - (In excel 2007) click on the "Data" tab of the ribbon, then in the "Sort and Filter" section click on "Advanced".
3 - Leave "Filter the list in-place" checked, make sure the "list range" and "criteria range" refer only to the phone numbers column (click on the little funny box to activate the select tool and highlight the phone numbers if this isn't already the case. Do this for each of the two boxes: list and criteria). Finally make sure "Unique records only" is checked, then click "Ok".
4 - The duplicated rows are now hidden, so you can select the whole table and copy it somewhere else and it will arrive without the duplicate rows.

Hope that helps.

mrmr96

13,736 posts

228 months

Friday 12th March 2010
quotequote all
ewenm said:
If they are sorted by the field you want to check for duplicates, you can use the formula =IF(B1=A1,"Dupe","") and just copy it down - no need to check every number every time, only the one immediately before.
Presumably you meant to put:

=IF(A2=A1,"DUPE","")

since we are going down, not across.

ewenm

28,506 posts

269 months

Friday 12th March 2010
quotequote all
mrmr96 said:
ewenm said:
If they are sorted by the field you want to check for duplicates, you can use the formula =IF(B1=A1,"Dupe","") and just copy it down - no need to check every number every time, only the one immediately before.
Presumably you meant to put:

=IF(A2=A1,"DUPE","")

since we are going down, not across.
Well spotted (deliberate mistake to test the students, honest hehe) paperbag

samrr

Original Poster:

2,379 posts

252 months

Friday 12th March 2010
quotequote all
ewenm said:
samrr said:
That is ok now - not sure why it was throwing in the extras.

I have filtered by the dupe column and now I have a long list of dupe numbers which is good however I want to only delete one of the duplications - is that possible without doing it one by one?
The advantage of doing it the way I suggested is that one of the duplicates (the first) is always left un-marked and you can delete all the others. I too use the copy/paste-special method as I don't trust the filtering hehe
Oh yeah hehe sorted!!!!

Now I'll just delete all the DUPE's.

Thanks!

ETA Mrmr96 I did try your way but the above seemed easier for me to understand!


Edited by samrr on Friday 12th March 10:37

mrmr96

13,736 posts

228 months

Friday 12th March 2010
quotequote all
samrr said:
ETA Mrmr96 I did try your way but the above seemed easier for me to understand!
No probs. My way will work no matter what the size of the list and preseves the original order. The other way is a bit more manual (as you have to manually select all the rows with Dupe on them, even if they are grouped.) The "best" answer is whatever you understand and works for you!