Find hashtags within a column

I

Iain Broughton

Hi,

I've a rather large spreadsheet, and I want to extract the hastags from it (Literally twitter like so my column may contain 'My query is about #excel #spreadsheets today')

I've tried search, find, vlookup but can't find anything that's going to work without causing errors, or without requiring a predifined list of tags to lookup.

I want the result to extract the actual hash tag(s) so I can report on how many times it has been used, so if anyone can help with either a formula or a pivot table with this it'd be greatly appreciated.

Just to clarify, there is no set list of pre-defined hash tags, and they can appear in any location within the column

Many thanks in advance!
Iain
 
R

Ron Rosenfeld

Hi,

I've a rather large spreadsheet, and I want to extract the hastags from it (Literally twitter like so my column may contain 'My query is about #excel #spreadsheets today')

I've tried search, find, vlookup but can't find anything that's going to work without causing errors, or without requiring a predifined list of tags to lookup.

I want the result to extract the actual hash tag(s) so I can report on how many times it has been used, so if anyone can help with either a formula or a pivot table with this it'd be greatly appreciated.

Just to clarify, there is no set list of pre-defined hash tags, and they can appear in any location within the column

Many thanks in advance!
Iain

This can be done with a VBA Macro.
I would first generate a list of unique hash tags
Then count each one.
Finally, write the results back to the worksheet.

The following assumes your data source is in column A; and that you want the results in Columns D:E (both of those assumptions are easily changed)
In addition, the results are unsorted (essentially in the order of the first appearance of the tags in your list), but a sort routine could be added to sort either alphabetically, or by count, if that is preferable.

To enter this Macro (Sub), <alt-F11> opens the Visual Basic Editor.
Ensure your project is highlighted in the Project Explorer window.
Then, from the top menu, select Insert/Module and
paste the code below into the window that opens.

To use this Macro (Sub), <alt-F8> opens the macro dialog box. Select the macro by name, and <RUN>.

===========================================
Option Explicit
Sub HashTagCount()
Dim colHashTags As Collection
Dim vSrc As Variant, rSrc As Range
Dim rDest As Range
Dim vRes() As Variant
Dim L As Long
Dim v As Variant
Dim c As Range, sFirstAddress As String
Dim re As Object, mc As Object, m As Object

Set rSrc = Range("A1", Cells(Rows.Count, "A").End(xlUp))
vSrc = rSrc

Set rDest = Range("D1")

'generate list of unique hash tags
Set colHashTags = New Collection
Set re = CreateObject("vbscript.regexp")
With re
.Global = True
.Pattern = "#\S+"
End With

For Each v In vSrc
If re.test(v) = True Then
Set mc = re.Execute(v)
On Error Resume Next
With WorksheetFunction
For Each m In mc
colHashTags.Add Item:=.Proper(m), Key:=.Proper(m)
Next m
End With
On Error GoTo 0
End If
Next v

'get count of each hash tag
ReDim vRes(1 To colHashTags.Count, 0 To 1)
For L = 1 To colHashTags.Count
vRes(L, 0) = colHashTags(L)
With re
.Pattern = colHashTags(L)
.ignorecase = True
End With
With rSrc
Set c = .Find(what:=colHashTags(L), _
LookIn:=xlValues, _
lookat:=xlPart, _
MatchCase:=False)
If Not c Is Nothing Then
sFirstAddress = c.Address
Do
Set mc = re.Execute(c.Text)
vRes(L, 1) = vRes(L, 1) + mc.Count
Set c = .FindNext(after:=c)
If c Is Nothing Then Exit Do
Loop While c.Address <> sFirstAddress
End If
End With
Next L

Set rDest = rDest.Resize(rowsize:=UBound(vRes, 1), columnsize:=2)
rDest.EntireColumn.ClearContents
rDest = vRes
rDest.EntireColumn.AutoFit
End Sub
====================================
 
I

Iain Broughton

This looks very promising, however I get runtime error 9 on the line


ReDim vRes(1 To colHashTags.Count, 0 To 1)

Many thanks again for your swift reponse!

Iain
 
R

Ron Rosenfeld

This looks very promising, however I get runtime error 9 on the line


ReDim vRes(1 To colHashTags.Count, 0 To 1)

Many thanks again for your swift reponse!

Iain

That will occur there were no hash tags added to the collection.

That can happen if the data is not exactly as you have indicated in your one line example (e.g: words that start with a "#" and contain no spaces until the end of the word )
It can also happen if the data being processed does not contain any hash tags. Perhaps all of the data is not in column A?
Furthermore, it can happen if your data is the result of formulas, and not text strings.

If those hints don't give enough information for you to puzzle out the problem, post a copy of your worksheet on some accessible website (e.g. Skydrive) and post a link to it here.
 
R

Ron Rosenfeld

That will occur there were no hash tags added to the collection.

That can happen if the data is not exactly as you have indicated in your one line example (e.g: words that start with a "#" and contain no spaces until the end of the word )
It can also happen if the data being processed does not contain any hash tags. Perhaps all of the data is not in column A?
Furthermore, it can happen if your data is the result of formulas, and not text strings.

If those hints don't give enough information for you to puzzle out the problem, post a copy of your worksheet on some accessible website (e.g. Skydrive) and post a link to it here.

Actually, formula vs text string probably would not have any effect.
 
I

Iain Broughton

Please ignore me, the script works a charm! I must have had a blind spot about the data source being column A!

Many many thanks!
Iain
 
R

Ron Rosenfeld

Please ignore me, the script works a charm! I must have had a blind spot about the data source being column A!

Many many thanks!
Iain

Glad to help. Thanks for the feedback. As I wrote, if you need to sort it, that can be easily added.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top