How to normalize inconsistent spelling in Excel file with +1000 rows?

Joined
Jan 19, 2024
Messages
2
Reaction score
0
Hi folks,
I have an Excel file on which I want to run some analysis (sum if, count if, etc.). However the problem is my data set is inconsistent with abbreviations, spelling mistakes, etc. So for example (Germany, Deutschland, GER, germany, gerany,...)
How can I align and harmonize this?
Thanks
 
Joined
Jan 31, 2024
Messages
1
Reaction score
0
Harmonize - lets call it "normalize".

You have at least two options: Macros and/or formulas.

Between these, there's no "easier" way - the macro method is not better than the formula method. A macro will take more setup work, but it's more-easily portable between workbooks (in other words, more future-proof). But formulas will be easier to set up; however, unless they're set up the right way (and you know a few other tips) they're limited to the workbook you're in (in other words, not as future-proof).

Let's go the formula route for this just so you can get it up and going:

You want to normalize data. You'll need a helper column and a helper sheet (another blank sheet in your workbook) that will convert the gobbledygook into acceptable data ("normalize").

The helper sheet will store the equivalency of words; that is, GER, Germany, germany, gerany, etc. are all be equal to the same thing (all these will be equal to "GER", for example). That'll be your helper sheet: Two columns: One of all the gobbledygook, and across from each cell of gobbledygook will be the converted "normalized" equivalency. You'll have to teach this to Excel instance by instance, row by row. So, over time this reference sheet will get better (which is why a macro would be cool; portability).

The helper column will be off to the right of your main page (where the gobbledygook is). Here's you'll run an XLOOKUP formula on each row like so:

=XLOOKUP(cell with gobbledygook, helper sheet column of gobbledygook, helper sheet column of equivalent data). The result will be "GER" in each row where your examples show up. In other words "look up this gobbledygook in the Gobbledygook column of the helper sheet, and give me the normalized data".

After this you can run SUMIFS and whatnot against this helper column.

If you need more information let me know.
 
Joined
Jan 19, 2024
Messages
2
Reaction score
0
Thank you, I get the approach. But I have like 10,000 rows, so building the sheet will take ages. Is there a way to lets say create an automated helper column?
 
Top