I regularly have to deal with data sets that have multiple records for a single person or entity. When there are multiples of a record, it means that whatever that unique person/entity (I'll just say entity from here forward) has done, there are several values for it.
The values could be a date, or a product purchased, or a code of any sort. I'm tasked with taking these files and flattening them so that there is only one record per entity, but with that entity having as many fields as necessary to accommodate all of the values from all of the records for that entity. I currently do this manually in Excel, but I'm hoping that there would be some quicker way I could do it using Access or SQL Server (I know very little about either) where I won't have to constantly reinvent the wheel. I'll give a description of what I currently do with some screenshots as examples.
Thanks to everyone who has even read to this point. I appreciate anything you with knowledge give to this beggar of your skills! 😕
==================================
After I get the files, I open them in a piece of software that works with de-duplication. I just mark the recurrences of the records so that I can group them and see how many iterations there are of each.
I then export that file and bring it into Excel. I and sort by the entity, or whatever the common value is, and then by dupe #. (MUST BE IN DUPE ORDER after common value). In this case, the common value is company.
Then, in the header row, I build logic that will pull the other values into a single record. I do that for each column for as many columns as the max dupe count -1 (e.g. if the highest dupe number is 5, then I'll do this expression for 4 additional columns)
Formulas displayed to make it easier to understand:
Once those formulas are created, they are dragged down to fill all the rows needed:
And then I filter on Dupe #1 which gives me one record for each entity, with all the values for that entity on a single line.
After that, I just copy/paste the filtered cells into a new Excel file, save that second file, and I've got the data in the format I need. But it's a PITA. I know there must be a simpler way, but I don't have the knowledge to get there.
If you've actually read ALL THE ABOVE, you're a saint and I truly am grateful for your time and effort!
Joe
The values could be a date, or a product purchased, or a code of any sort. I'm tasked with taking these files and flattening them so that there is only one record per entity, but with that entity having as many fields as necessary to accommodate all of the values from all of the records for that entity. I currently do this manually in Excel, but I'm hoping that there would be some quicker way I could do it using Access or SQL Server (I know very little about either) where I won't have to constantly reinvent the wheel. I'll give a description of what I currently do with some screenshots as examples.
Thanks to everyone who has even read to this point. I appreciate anything you with knowledge give to this beggar of your skills! 😕
==================================
After I get the files, I open them in a piece of software that works with de-duplication. I just mark the recurrences of the records so that I can group them and see how many iterations there are of each.
I then export that file and bring it into Excel. I and sort by the entity, or whatever the common value is, and then by dupe #. (MUST BE IN DUPE ORDER after common value). In this case, the common value is company.
Then, in the header row, I build logic that will pull the other values into a single record. I do that for each column for as many columns as the max dupe count -1 (e.g. if the highest dupe number is 5, then I'll do this expression for 4 additional columns)
Formulas displayed to make it easier to understand:
Once those formulas are created, they are dragged down to fill all the rows needed:
And then I filter on Dupe #1 which gives me one record for each entity, with all the values for that entity on a single line.
After that, I just copy/paste the filtered cells into a new Excel file, save that second file, and I've got the data in the format I need. But it's a PITA. I know there must be a simpler way, but I don't have the knowledge to get there.
If you've actually read ALL THE ABOVE, you're a saint and I truly am grateful for your time and effort!
Joe