Extract name, address and ph#

Discussion in 'Word' started by don linccoln, Nov 11, 2017 at 7:14 PM.

  1. don linccoln

    don linccoln

    Joined:
    Saturday
    Messages:
    5
    Likes Received:
    0
    Hello

    I’m looking for a macro that will extract

    Name, address and phone numbers

    from a RAW DATA file.

    Attached is the RAW DATA file.

    Sample RAW Data block ( in the data file ) where text can be search and extracted.
    ----------------------------------------------------------------------------------------------------
    Melanie Pursglove </a>
    </h3>
    <div class="c-people-result__address">5720 Premier Park DR, West Palm Beach, FL 33407</div>
    <div class="c-people-result__phone">(412) 264-6619</div>


    The best way to handle this I think is to search for phone number in word as (^?^?^?) ^?^?^?-^?^?^?^? format and once that format is found then go up to the name area and extract the whole block of text in a new document and then once all of them are extracted then with the SEARCH and REPLACE cleanup the extra raw data.

    I was able to write a macro doing above, but the whole macro takes up about ½ hr and that’s too much.

    A friend wrote this macro USING RANGES for extracting data from a financial document and it works REALLY FAST.

    I was wondering if anyone has any knowledge about RANGES and modify the following macro to extract the

    Name, address and phone numbers

    From the attached RAW DATA file.

    Thanks.


    Macro that was written. USING RANGES
    ---------------------------------------------------
    Option Explicit

    Sub Macro1()
    Dim oDoc As Document
    Dim oNewDoc As Document
    Dim oRng As Range, oRng2 As Range, oFound As Range
    Dim vFind As Variant
    Dim fso As Object
    Dim strPath As String
    Const strFind As String = "Add to watchlist|TOTAL REVENUE"

    strPath = Environ("USERPROFILE") & "\Desktop\DataExtract.doc" 'The name of the document to save the extract
    Set fso = CreateObject("Scripting.FileSystemObject")
    vFind = Split(strFind, "|")
    Set oDoc = ActiveDocument
    If fso.FileExists(strPath) Then
    Set oNewDoc = Documents.Open(FileName:=strPath, AddToRecentFiles:=False)
    Else
    Set oNewDoc = Documents.Add
    oNewDoc.SaveAs FileName:=strPath
    End If
    Set oRng = oDoc.Range
    With oRng.Find
    Do While .Execute(FindText:=vFind(0))
    oRng.MoveStart wdParagraph, -2
    oNewDoc.Range.InsertAfter _
    Left(oRng.Paragraphs(1).Range.Text, _
    Len(oRng.Paragraphs(1).Range.Text) - 1)
    Set oFound = oRng
    oFound.End = oDoc.Range.End
    With oFound.Find
    Do While .Execute(FindText:=vFind(1))
    oFound.End = oFound.Paragraphs(1).Range.End - 1
    Set oRng2 = oNewDoc.Range
    oRng2.End = oRng2.End - 1
    oRng2.Collapse 0
    oRng2.Text = vbTab & oFound.Text & vbCr
    oRng.Collapse 0
    Exit Do
    Loop
    End With
    oRng.Collapse 0
    Loop
    End With
    With oNewDoc.Range
    .ParagraphFormat.TabStops.ClearAll
    .ParagraphFormat.TabStops.Add CentimetersToPoints(6.5)
    .ParagraphFormat.SpaceAfter = 0
    .Font.Name = "Arial"
    .Font.Size = 8
    End With

    'oNewDoc.Close wdSaveChanges 'Optional
    lbl_Exit:
    Set fso = Nothing
    Set oDoc = Nothing
    Set oNewDoc = Nothing
    Set oRng = Nothing
    Set oRng2 = Nothing
    Set oFound = Nothing
    Exit Sub
    End Sub
     

    Attached Files:

    don linccoln, Nov 11, 2017 at 7:14 PM
    #1
    1. Advertisements

  2. don linccoln

    macropod Microsoft MVP

    Joined:
    Mar 2, 2012
    Messages:
    287
    Likes Received:
    24
    Try:
    Code:
    Sub Demo()
    Application.ScreenUpdating = False
    Dim StrOut As String, wdDoc As Document
    With ActiveDocument.Range
      With .Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Text = "name^34\>[!\<]@(\</a\>\</li\>)*address^34\>[!\<]@\1*phone^34\>[!\<]@\1"
        .Replacement.Text = ""
        .Forward = True
        .Wrap = wdFindStop
        .Format = False
        .MatchWildcards = True
        .Execute
      End With
      Do While .Find.Found
        StrOut = StrOut & Split(Split(.Text, "</a></li>")(0), ">")(1) & vbTab
        StrOut = StrOut & Split(Split(.Text, "</a></li>")(1), ">")(2) & vbTab
        StrOut = StrOut & Split(Split(.Text, "</a></li>")(2), ">")(2) & vbCr
        .Collapse wdCollapseEnd
        .Find.Execute
      Loop
    End With
    Set wdDoc = Documents.Add
    wdDoc.Range.Text = StrOut
    Application.ScreenUpdating = True
    End Sub
    
     
    Last edited: Nov 11, 2017 at 8:20 PM
    macropod, Nov 11, 2017 at 8:13 PM
    #2
    1. Advertisements

  3. don linccoln

    don linccoln

    Joined:
    Saturday
    Messages:
    5
    Likes Received:
    0
    Hello

    When i ran the macro, this is what the output that i got.
    -------------------------------------------------------------------------

    By Name By Address By Phone Number
    By Name By Address By Phone Number
    By Name By Address By Phone Number


    Attached are both the

    1) Raw file that contains the Raw Data, and
    2) The output file

    This is the output i'm looking for.

    Melanie Pursglove 5720 Premier Park DR, West Palm Beach, FL 33407 (412) 264-6619
    Cheryl L Martiner 5720 Premier Park DR, West Palm Beach, FL 33407 (941) 926-2727
    Steve E Arons 5720 Premier Park DR, West Palm Beach, FL 33407 (707) 864-0878
    Gail M Perez 5720 Premier Park DR, West Palm Beach, FL 33407 (717) 796-6578

    Thanks.
     
    don linccoln, Nov 12, 2017 at 1:59 AM
    #3
  4. don linccoln

    macropod Microsoft MVP

    Joined:
    Mar 2, 2012
    Messages:
    287
    Likes Received:
    24
    Evidently there are two forms of the strings in your file. Try:
    Code:
    Sub Demo()
    Application.ScreenUpdating = False
    Dim StrOut As String, wdDoc As Document
    With ActiveDocument.Range
      With .Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Text = "^34\>[!\<]@\</a\>^13[ ]@\</h3\>*address^34\>[!\<]@\</div\>*phone^34\>[!\<]@\</div\>"
        .Replacement.Text = ""
        .Forward = True
        .Wrap = wdFindStop
        .Format = False
        .MatchWildcards = True
        .Execute
      End With
      Do While .Find.Found
        StrOut = StrOut & Trim(Split(Split(.Text, "</a>")(0), vbCr)(1)) & vbTab
        StrOut = StrOut & Split(Split(Split(.Text, "</a>")(1), "</div>")(0), ">")(2) & vbTab
        StrOut = StrOut & Split(Split(Split(.Text, "</a>")(1), "</div>")(1), ">")(1) & vbCr
        .Collapse wdCollapseEnd
        .Find.Execute
      Loop
    End With
    Set wdDoc = Documents.Add
    wdDoc.Range.Text = StrOut
    Application.ScreenUpdating = True
    End Sub
     
    macropod, Nov 12, 2017 at 6:59 AM
    #4
  5. don linccoln

    don linccoln

    Joined:
    Saturday
    Messages:
    5
    Likes Received:
    0
    Hello

    It worked kind of good, except its extracting some more RAW Data,

    attached is the 1) RAW file and the 2) Output file.

    Here is the output.
    -------------------------
    Gail M Perez 5720 Premier Park DR, West Palm Beach, FL 33407 (717) 796-6578
    Curtis L Sober 5720 Premier Park DR, West Palm Beach, FL 33407

    <a href="/whitepages/show?fp=eyJwcm9maWxlaWQiOiIiLCJmaXJzdCI6IkN1cnRpcyIsImxhc3QiOiJTb2JlciIsInN0cmVldCI6IjU3MjAgUHJlbWllciBQYXJrIERSIiwiY2l0eSI6Ildlc3QgUGFsbSBCZWFjaCIsInN0YXRlIjoiRkwiLCJwaG9uZSI6Ii0tIiwiemlwIjoiMzM0MDcifQ%3D%3D&form=sba"

    William D Rodriguez 5720 Premier Park DR, West Palm Beach, FL 33407 (305) 207-1641
    Richard E Troy 5720 Premier Park DR, West Palm Beach, FL 33407

    <a href="/whitepages/show?fp=eyJwcm9maWxlaWQiOiIiLCJmaXJzdCI6IlJpY2hhcmQiLCJsYXN0IjoiVHJveSIsInN0cmVldCI6IjU3MjAgUHJlbWllciBQYXJrIERSIiwiY2l0eSI6Ildlc3QgUGFsbSBCZWFjaCIsInN0YXRlIjoiRkwiLCJwaG9uZSI6Ii0tIiwiemlwIjoiMzM0MDcifQ%3D%3D&form=sba"

    Thanks.
     

    Attached Files:

    don linccoln, Nov 12, 2017 at 5:30 PM
    #5
  6. don linccoln

    macropod Microsoft MVP

    Joined:
    Mar 2, 2012
    Messages:
    287
    Likes Received:
    24
    That's because, unlike your original data, some of the records in your latest set don't include phone #s. Try replacing:
    Code:
       StrOut = StrOut & Split(Split(Split(.Text, "</a>")(1), "</div>")(1), ">")(1) & vbCr
    with:
    Code:
        If InStr(.Text, "<span>") = 0 Then
          StrOut = StrOut & Split(Split(Split(.Text, "</a>")(1), "</div>")(1), ">")(1)
        End If
        StrOut = StrOut & vbCr
     
    macropod, Nov 12, 2017 at 10:15 PM
    #6
  7. don linccoln

    don linccoln

    Joined:
    Saturday
    Messages:
    5
    Likes Received:
    0
    That worked. Great.

    Thanks a lot :)
     
    don linccoln, Nov 13, 2017 at 2:47 AM
    #7
  8. don linccoln

    macropod Microsoft MVP

    Joined:
    Mar 2, 2012
    Messages:
    287
    Likes Received:
    24
    Refinement -
    Change:
    .Collapse wdCollapseEnd
    to:
    .MoveStart wdCharacter, InStr(.Text, Split(Split(Split(.Text, "</a>")(1), "</div>")(0), ">")(2))
    .Collapse wdCollapseStart
    Without this, the record following the one lacking a phone # will be skipped.
     
    macropod, Nov 13, 2017 at 4:02 AM
    #8
  9. don linccoln

    don linccoln

    Joined:
    Saturday
    Messages:
    5
    Likes Received:
    0
    Great, Got it, Thanks. :)
     
    don linccoln, Nov 14, 2017 at 12:43 AM
    #9
    1. Advertisements

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments (here). After that, you can post your question and our members will help you out.
Similar Threads
  1. jsduiahd
    Replies:
    0
    Views:
    655
    jsduiahd
    Jul 2, 2011
  2. kittykatoo
    Replies:
    0
    Views:
    716
    kittykatoo
    Sep 20, 2012
  3. rosscortb
    Replies:
    0
    Views:
    515
    rosscortb
    May 13, 2014
  4. fiesta
    Replies:
    1
    Views:
    447
  5. user03
    Replies:
    0
    Views:
    289
    user03
    Jan 27, 2016
  6. Jerry 00769
    Replies:
    0
    Views:
    513
    Jerry 00769
    Jan 30, 2016
  7. ZaidaBa
    Replies:
    1
    Views:
    201
    macropod
    May 8, 2017
  8. Robin Keeter
    Replies:
    3
    Views:
    830
    macropod
    Jul 20, 2017
Loading...