parsing HTML table into excel. How?

D

Dan

Hi there,

I do have (many) links with HTML text formated in tables.
I would like to convert those links (pages) regularily to an
excel file just containing parts of the original HTMP table.

What is the best way to achieve this? Cut and paste is out
of the question. There are too many links.

Any pointer is welcome.

Thanks a lot

Dan
 
B

Bob Flanagan

Dan, when you say "page", do you mean worksheet? Or are you talking
sections of a worksheet as a page?

Bob Flanagan
Macro Systems
Delaware, U.S. 302-234-9857
http://www.add-ins.com
Productivity add-ins and downloadable books on VB macros for Excel
 
T

Tom Ogilvy

Possibly look at Tools=>Get External Data => Create a web query.

You can turn on the macro recorder while you do it manually. then modify
the code to loop through your list of links and extract the URL, sending it
to the web query (with a new destination).
 
R

Roland Hall

in message : I do have (many) links with HTML text formated in tables.
: I would like to convert those links (pages) regularily to an
: excel file just containing parts of the original HTMP table.
:
: What is the best way to achieve this? Cut and paste is out
: of the question. There are too many links.

Dan...

Until you provide some specific examples of what you want, you're not going
to get specific answers. Your request is blurred on this end.

--
Roland Hall
/* This information is distributed in the hope that it will be useful, but
without any warranty; without even the implied warranty of merchantability
or fitness for a particular purpose. */
Technet Script Center - http://www.microsoft.com/technet/scriptcenter/
WSH 5.6 Documentation - http://msdn.microsoft.com/downloads/list/webdev.asp
MSDN Library - http://msdn.microsoft.com/library/default.asp
 
D

Dan

You are right, Roland.

So here is an example:
http://www.motorradland.ch/default.cfm?id=paessedetails&bgcolor=001E3C&PaesseID=71

This opens a page that contains some (german, sorry) data. The structure of
the page is always the same.

The "PaesseID" range is 1 to 321.

I am interested in the following fields (in brackets the value of the above
link):
Name (Furkapass)
Type (Typ: Pass)
Rating (Rating: 43.1)
GPS coordinates (46° 34' 21.09" N 8° 24' 54.28" E)
Anfahrt (West: Gletsch, Ost: Realp)
Streckenlänge (Total: 23.7 km (von Gletsch: 11.1 km, nach Realp: 12.6
km))
Anschluss (N: Sustenpass - NE: Klausenpass - E: Oberalppass - SE: St.
Gotthardpass - S: Nufenenpass - SW: Simplonpass - )

The result should be one row per PaesseID entry, with the above cells filled
in.
String values are ok. I just need to convert the GPS coordinates into
decimal Latitudes/Longitudes for Microsoft Autoroute.

I have tried the Web Query, but it does not work that way. It just copies
what is on the web page.
I am not a (good) programmer. If you had some ideas, I would really
appreciate it.

Regards
Dan
 
D

Dick Kusleika

Dan

Here's a macro to get you started. It's a little funky, so I think I should
explain part of it. First, it took 6 minutes to run for me (cable modem),
so beware of that. The inner Do Loop is just bad programming. I was
getting an error and I'm not sure why. I suspect the page was loaded (thus
ReadyState was 4) but the frame wasn't finished yet. So I ended up just
trying to get the information until there was no error - dangerous. There's
most certainly a better way to do this (hopefully Jake Marx is reading this
thread) but I don't know what it is - yet.

The array of numbers was created with trial and error. I looped through
every item in the page until I found the correct information, then just
recorded which item it was.

Finally, the output is nowhere near what you will eventually want. The row
heights are all screwed up and there's nonprintable characters in there.
Also, Rating and Type are in the same Item, so they end up in the same
column with some superfluous info. All this can be handled in the code, I
just want to make sure we're not barking up the wrong tree. If it seems
close to what you want, we can clean it up. Here's the code

Sub GetWebInfo()

Dim appIE As Object
Dim sUrl As String
Dim lPaesseID As Long
Dim i As Long
Dim arrItem As Variant
Dim lTry As Long

Debug.Print Now

Set appIE = CreateObject("InternetExplorer.Application")
sUrl = "http://www.motorradland.ch/default.cfm?"
sUrl = sUrl & "id=paessedetails&bgcolor=001E3C&PaesseID="
arrItem = Array(57, 72, 88, 92, 133, 153)

For lPaesseID = 1 To 321
appIE.navigate sUrl & lPaesseID
Do
Loop Until appIE.readystate = 4
For i = LBound(arrItem) To UBound(arrItem)
Do
On Error Resume Next
Sheet1.Cells(lPaesseID, i + 1).Value = _
appIE.document.all.Item(arrItem(i)).innertext
lTry = Err.Number
On Error GoTo 0
Loop Until lTry = 0
Next i
Next lPaesseID

appIE.Quit

Debug.Print Now

End Sub
 
D

Dan

thanks dick,

cool.. did not know it was possible to approach thi that way.
I will try and give feedback. will be a good learning exercise
for me

stay tuned

dan
 
Top