
Access/SQL Noob needs some advice

Netopia

Diamond Member
I regularly have to deal with data sets that have multiple records for a single person or entity. When there are multiple records, each one holds a different value for something that unique person/entity (I'll just say entity from here forward) has done.

The values could be a date, or a product purchased, or a code of any sort. I'm tasked with taking these files and flattening them so that there is only one record per entity, but with that entity having as many fields as necessary to accommodate all of the values from all of the records for that entity. I currently do this manually in Excel, but I'm hoping that there would be some quicker way I could do it using Access or SQL Server (I know very little about either) where I won't have to constantly reinvent the wheel. I'll give a description of what I currently do with some screenshots as examples.
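The transformation described above (many records per entity in, one wide record per entity out) can be sketched in a few lines of plain Python. This is just an illustration of the idea, not the poster's actual workflow; the function name and the sample data are made up, and it assumes each input row is simply (entity, value):

```python
def flatten(rows, key_index=0, value_index=1):
    """Collect every value seen for an entity into a single wide row.
    rows: list of [entity, value] pairs; returns one row per entity."""
    grouped = {}
    order = []  # remember first-seen order of entities
    for row in rows:
        key = row[key_index]
        if key not in grouped:
            grouped[key] = []
            order.append(key)
        grouped[key].append(row[value_index])
    # One output row per entity: the entity followed by all of its values.
    return [[key] + grouped[key] for key in order]

# Made-up sample: company123 appears twice, so it gets two value columns.
rows = [
    ["company123", "2021-03-01"],
    ["companyabc", "2021-04-15"],
    ["company123", "2021-06-20"],
]
print(flatten(rows))
# → [['company123', '2021-03-01', '2021-06-20'], ['companyabc', '2021-04-15']]
```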

Thanks to everyone who has even read to this point. I appreciate anything those of you with knowledge can give to this beggar of your skills! 😕

==================================
After I get the files, I open them in a piece of software that works with de-duplication. I just mark the recurrences of the records so that I can group them and see how many iterations there are of each.

I then export that file and bring it into Excel. I sort by the entity, or whatever the common value is, and then by dupe # (it MUST be in dupe order after the common value). In this case, the common value is company.

begin_excel.jpg
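The mark-and-sort step above (assign each record a running "Dupe #" within its group, then order by the common value and that number) can also be done in code. A minimal sketch in Python, with made-up sample data; the function name is my own:

```python
from itertools import groupby
from operator import itemgetter

def add_dupe_numbers(rows, key_index=0):
    """Sort by the common value, then number each record within its
    group (1, 2, 3, ...), the same 'Dupe #' the de-dupe software marks."""
    rows = sorted(rows, key=itemgetter(key_index))
    numbered = []
    for _, group in groupby(rows, key=itemgetter(key_index)):
        for n, row in enumerate(group, start=1):
            numbered.append(list(row) + [n])  # Dupe # as a new last column
    return numbered

print(add_dupe_numbers([
    ["companyabc", "widget"],
    ["company123", "gadget"],
    ["company123", "gizmo"],
]))
# → [['company123', 'gadget', 1], ['company123', 'gizmo', 2],
#    ['companyabc', 'widget', 1]]
```

This is the same idea as SQL's `ROW_NUMBER() OVER (PARTITION BY company ...)`, which comes up later in the thread.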


Then, in the header row, I build logic that pulls the other values into a single record. I repeat that across as many additional columns as the max dupe count minus 1 (e.g. if the highest dupe number is 5, I'll build the expression in 4 additional columns).

Formulas displayed to make it easier to understand:
excel_formulas.jpg


Once those formulas are created, they are dragged down to fill all the rows needed:
flattened_full_data.jpg


And then I filter on Dupe #1 which gives me one record for each entity, with all the values for that entity on a single line.
final_flattened.jpg


After that, I just copy/paste the filtered cells into a new Excel file, save that second file, and I've got the data in the format I need. But it's a PITA. I know there must be a simpler way, but I don't have the knowledge to get there.
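If the data can be exported to CSV, the whole manual sequence above (sort, pull values sideways, keep one row per entity, save a new file) could be collapsed into one small script. This is a sketch under assumptions, not the poster's actual layout: it assumes a header row, the company in a column named "Company", and the remaining columns holding the per-record values; all names here are made up:

```python
import csv

def flatten_rows(rows, key_index=0):
    """One wide row per key, values padded with '' to equal width."""
    grouped = {}
    order = []
    for row in rows:
        key = row[key_index]
        if key not in grouped:
            grouped[key] = []
            order.append(key)
        # keep every column except the key itself
        grouped[key].extend(row[:key_index] + row[key_index + 1:])
    width = max((len(v) for v in grouped.values()), default=0)
    return [[k] + grouped[k] + [""] * (width - len(grouped[k])) for k in order]

def flatten_csv(src, dst, key_col="Company"):
    """Read src, flatten to one record per key_col, write dst."""
    with open(src, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        out = flatten_rows(list(reader), header.index(key_col))
    with open(dst, "w", newline="") as f:
        writer = csv.writer(f)
        width = len(out[0]) - 1 if out else 0
        writer.writerow([key_col] + [f"Value{i}" for i in range(1, width + 1)])
        writer.writerows(out)
```

Once it works for one file, re-running it on the next file is the "not reinventing the wheel" part: no formulas to rebuild.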

If you've actually read ALL THE ABOVE, you're a saint and I truly am grateful for your time and effort!

Joe
 
Access has (or had) a normalization function.
It will look through the database and identify areas where a key is duplicated.

If you have the raw file in a CSV-type format, email it to me and I will try to look at it over the next week with respect to normalization.
 
Forgot the address. I added another query that gets me all the records for dupe 1. I then joined it to the crosstab query to add the address. See screenshots below.

POIAKgl.jpg

JiInOfS.jpg

EanoRuI.jpg
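For readers on SQL Server rather than Access: there is no crosstab query there, but a window function plus conditional aggregation gives the same one-row-per-company result, without marking dupes by hand first. A sketch using Python's bundled sqlite3 module (table and column names are made up; assumes SQLite 3.25+ for window function support):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE purchases (company TEXT, value TEXT)")
con.executemany("INSERT INTO purchases VALUES (?, ?)", [
    ("company123", "2021-03-01"),
    ("companyabc", "2021-04-15"),
    ("company123", "2021-06-20"),
])

# ROW_NUMBER() plays the role of the manually marked Dupe #; the
# MAX(CASE ...) columns pull each numbered value sideways into its
# own column, like the crosstab query does.
rows = con.execute("""
    WITH numbered AS (
        SELECT company, value,
               ROW_NUMBER() OVER (PARTITION BY company ORDER BY value) AS dupe
        FROM purchases
    )
    SELECT company,
           MAX(CASE WHEN dupe = 1 THEN value END) AS value1,
           MAX(CASE WHEN dupe = 2 THEN value END) AS value2
    FROM numbered
    GROUP BY company
    ORDER BY company
""").fetchall()
print(rows)
# → [('company123', '2021-03-01', '2021-06-20'),
#    ('companyabc', '2021-04-15', None)]
```

One caveat: the `value1`, `value2`, ... column list has to be written out to the maximum dupe count, so you still need to know (or over-estimate) that number.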
 
And then he can sort by the Dupe number and he should be done.

I will try this on his data this weekend.
 
WOW. I didn't have time to check the forums over the weekend! Thanks guys.

So KLin, would there be a way to do this without even marking dupes ahead of time?

Thanks again guys.
 
You can normalize by any field: make a separate table for that field, then a field in the overall record that links to that table's index.

In your example above, you have company123 and companyabc.
You would create a table called Company with fields for a unique index, name of company, location, contact number, etc.

Then in the overall data record, instead of the company as text, you have a field that contains the index of the company in the Company table.

For users, create a table of users, then a new table that contains user index and company index; this becomes your cross-reference between multiple users for a company and multiple companies for a user.

This would be the base for the data; getting it displayed as shown in the last spreadsheet becomes a pain.
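The schema sketched above (a Company table, a User table, and a cross-reference table linking them) might look like this in SQL. All table and column names are my own illustration, shown here via Python's bundled sqlite3:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE company (
        company_id INTEGER PRIMARY KEY,  -- the unique index
        name       TEXT UNIQUE NOT NULL,
        location   TEXT,
        contact    TEXT
    );
    CREATE TABLE user (
        user_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL
    );
    -- Cross-reference table: allows many users per company AND
    -- many companies per user (a many-to-many relationship).
    CREATE TABLE user_company (
        user_id    INTEGER REFERENCES user(user_id),
        company_id INTEGER REFERENCES company(company_id),
        PRIMARY KEY (user_id, company_id)
    );
""")
```

With this in place, "all values for an entity" is a join against the cross-reference table rather than a set of extra columns, which is why redisplaying it as one wide spreadsheet row takes extra work (a crosstab or pivot).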
 
Can you post a screenshot of what the data looks like with duplicates?
 