Cleaning data - extracting Post Codes

  • Thread starter SimonG via AccessMonster.com
  • Start date
J

John Spencer

The wildcard * represents any number of characters (zero is a number).

John Spencer
Access MVP 2002-2005, 2007-2008
The Hilltop Institute
University of Maryland Baltimore County
 
R

RD

I have a set of data which I am trying to tidy. The address table has just
been filled in from the top line down, so the address structure is not common
between records. As part of tiding this I would like to able to extract Post
Codes to their own field.

Currently Post Codes are typically found within AddressLine4 or AddressLine5,
but may also share the field with town or county,
eg
AB12 34CD
London EF56 7GH
West Midlands J9 12KL

Has any one achieved anything similar, or any ideas how this could be done?

Many thanks,
Simon

Use a regular expression.

Here is a function from one of my apps that extract a US post code:

'---------------------------------------------------------------------------------------
' Procedure : fFindZip
' DateTime : 4/10/2007 11:24
' Author : RD
' Purpose : Extracts valid zip code from a string
'---------------------------------------------------------------------------------------
'
Function fFindZip(sInput As Variant) As String
Dim oRegEx As New RegExp, colMatch As MatchCollection, oMatch As Match
oRegEx.Pattern = "([0-9]{5}(-[0-9]{4})?)"
oRegEx.Global = True
Set colMatch = oRegEx.Execute(sInput)
For Each oMatch In colMatch
' Debug.Print oMatch.Value
fFindZip = oMatch.Value
Next

Set colMatch = Nothing
Set oRegEx = Nothing

End Function


Here are some expressions from Wikipedia
(http://en.wikipedia.org/wiki/UK_postcodes).

Just swap out a UK expression for my US one. Line wrap made that look
ugly. You might want to look up the Wiki article yourself.

From Mikipedia:
A regular expression to implement the BS 7666 rules in a basic fashion
is provided in the BS7666 schema:[20]

[A-Z]{1,2}[0-9R][0-9A-Z]? [0-9][A-Z-[CIKMOV]]{2}
A more complex regular expression is also given in the comments of the
schema, which implements full checking of all the stated BS 7666
postcode format rules. That regular expression can be restated as a
"traditional" regular expression:

(GIR
0AA|[A-PR-UWYZ]([0-9]{1,2}|([A-HK-Y][0-9]|[A-HK-Y][0-9]([0-9]|[ABEHMNPRV-Y]))|[0-9][A-HJKS-UW])
[0-9][ABD-HJLNP-UW-Z]{2})
The BS 7666 rules do not match British Forces Post Office postcodes,
which have the format "BFPO NNN" or "BFPO c/o NNN", where NNN is 1 to
4 numerical digits.
 
Top