Extract list of unique words used

A

AndyDT

I have an excel sheet and one of the colums contains a description usin
keywords. A cell in this column may contain several keywords separate
by whitespace (i.e. one or more spaces).

I want to create a new column listing all uniqe words used in tha
column

So e.g. given the starting list
A1=green round
A2=yellow
A3=yellow round
A4=Yellow square
A5=Square

I want to create a new list containing (order is not significant)
green
round
yellow
Yellow
square
Squar
 
A

AlfD

Hi!

The following should get you there:

1. Use Data > Text to Columns with <space> as delimiter.
2. I don't know how many columns this will generate (being the numbe
of words in your wordiest cell...).
3. Put all of these columns into a single column (same or differen
sheet).
4. Now it would be nice just to use the Advanced Filter to extract you
unique list. However, I see from your sample that Square is not th
same as square. The filter doesn't allow us that luxury, AFAIK. So w
improvise with a helper column next to your composite data column.
5. If your word list is A2:A100 (with a heading in A1) then enter

=if(CODE(A1)>96,1,0) in cell B2 and copy it down to B100.

If all you have in the column are words begining with letters of th
alphabet, this puts a 1 opposite words beginning with lower case
otherwise 0. If you have other characters it might need tweaking. Com
back if that's an issue

6. Use Data > Filter >Advanced Filter on the two columns you now hav
in order to extract the unique records. If you are not familiar wit
it, post back.

7. Dump the helper column.

Al
 
M

Max

Maybe tinker with these steps ..

Assume the keywords are col A, in A1:A1000

Put in B1: =SUBSTITUTE(TRIM(A1)," ",CHAR(10))
Copy down to B1000

Select B1:B1000 > Copy

Open Word
----------
In Word
----------

Click Edit > Paste special > Unformatted text > OK

Press Ctrl + A
(this selects all the pasted text)

Click Edit > Replace > Replace tab

Enter in the box for Find what: " [i.e. enter a double quote]
Leave the box for Replace with: blank

Click 'Replace All'

Answer 'No' to exit the prompt to search the remainder of the doc
Click 'Close' to exit the Find and Replace dialog

Right-click on the entire selection of text > Copy
---------
Toggle back to Excel
----------
In a new sheet, right-click on A2
Choose Paste special > Text > OK
If the 1000 rows contained between them a total of say 5,000 words,
these words will be pasted in 5,000 separate cells in col A

Type a col label in A1
Select col A
Click Data > Filter > Advanced Filter
[Click OK to the Excel prompt to use the first row as labels]

In the Advanced Filter dialog box:
----------------------------------------
Check "Copy to another location"
Ensure that the selected range correctly appears in the "List range:" box
Put for "Copy to:" : B1
Check "Unique records only"
Click OK

The unique list of keywords in col A
will be extracted into col B

Caveat: Case will not be distinguished in extracting the uniques,
e.g.: "Yellow" and "yellow" are the same
 
Top