Excel VB website query

MrScott81

Golden Member
Aug 31, 2001
1,891
0
76
I have some cells in excel which will contain a product id for a monoprice.com item. I want to be able to read a corresponding URL (where ${product_id} is the value of the cell), and get a description for the item.

Example:
http://www.monoprice.com/products/product.asp?p_id=${product_id}

Can anyone point me in the right direction?

I know I need to read that webpage and find a specific tag (possibly the title tag, the meta description tag, or something else).

Just have no idea how to do this in excel.

Any help would be appreciated.
 

Tweak155

Lifer
Sep 23, 2003
11,449
264
126
You have code in VBA already, correct?

There are probably a few ways to do this, but first that comes to mind is MSXML2.XMLHTTP. You can also create an InternetExplorer object and use it as though it is the web browser. You can tell it to load web pages and it is actually stored in the object.

If you can give an example of an item page and the text you want to grab, it might be easier to help.
 

KLin

Lifer
Feb 29, 2000
30,426
744
126
1. Create a new module in excel vba project and save it.

PHP:
Public Function GetDescription(ByVal productid As String) As String
Dim wb As Object
Dim doc As Object
Dim sURL As String
Dim metaname As Variant
Set wb = CreateObject("InternetExplorer.Application")

sURL = "http://www.monoprice.com/products/product.asp?p_id=" & productid

wb.Navigate sURL

While wb.Busy
    DoEvents
Wend

''HTML Document
Set doc = wb.document

''Title
For Each elem In doc.all
        If InStr(1, elem.tagName, "META") > 0 Then
            metaname = elem.GetAttribute("name")
            If metaname = "description" Then
                GetDescription = elem.GetAttribute("content")
                Exit For
            Else
                GetDescription = ""
            End If
        End If
    Next

wb.Quit
Set wb = Nothing
End Function

paste code into newly created module.

reference GetDescription(A1) in a cell where A1 is the value that represents the productid.

pm me if you want the excel sheet I created.
 
Last edited:

KLin

Lifer
Feb 29, 2000
30,426
744
126
Added wb.Quit line to close out of IE. setting the object to nothing does not close IE.
 

MrScott81

Golden Member
Aug 31, 2001
1,891
0
76
Thanks for the help, it is working for the most part (although slowly). Not sure if the xmlhttp way might be better
 

KLin

Lifer
Feb 29, 2000
30,426
744
126
Pricing info is harder to retrieve as it's stored in an html table.
 

MrScott81

Golden Member
Aug 31, 2001
1,891
0
76
There could be like 20-30 on a single worksheet. I actually found that this is pretty fast, but as you mention, parsing the pricing is much harder beause their html is not id's in any way.

PHP:
Public Function GetData(ByVal prodID As String) As String
    Dim htm As Object
    Dim theURL, desc As String
    
    theURL = "http://www.monoprice.com/products/product.asp?p_id=" + prodID

    Set htm = CreateObject("htmlFile")
    With CreateObject("msxml2.xmlhttp")
        .Open "GET", theURL, False
        .send
        htm.body.innerhtml = .responseText
    End With
    
    desc = htm.getelementbyid("product-name").innerhtml
    GetData = StripHTML(desc)

End Function
 

Tweak155

Lifer
Sep 23, 2003
11,449
264
126
Try the following line to strip what you want out of the HTML...

curHTML is the current HTML of the page - replace "id=...." with your starting point. It reads to the next closing tag.

Code:
Trim(Split(Split(curHTML, "id=""tab-view-description"" href=""", , vbTextCompare)(1), "<")(0))

EDIT:

Try:

Code:
Trim(Split(Split(curHTML, "class=""red12px"" style=""text-align:center;"">", , vbTextCompare)(1), "<")(0))

This gives me the price of one cable off your example page.
 
Last edited:

MrScott81

Golden Member
Aug 31, 2001
1,891
0
76
Finally got everything working correctly, figured I would paste the code in case anyone happens to stumble across this thread....some of the values are specif

PHP:
Public Function GetData(ByVal prodID As String) As String
    Dim htm As Object
    Dim theURL, desc As String
    
    theURL = "http://www.monoprice.com/products/product.asp?p_id=" + prodID

    Set htm = CreateObject("htmlFile")
    With CreateObject("msxml2.xmlhttp")
        .Open "GET", theURL, False
        .send
        htm.body.innerHTML = .responseText
    End With
    
    desc = htm.getelementbyid("product-name").innerHTML
    GetData = StripHTML(desc)
End Function

Public Function GetPrice(ByVal prodID As String) As String
    Dim htm
    Dim hElem, divElem As MSHTML.HTMLGenericElement
    Dim theURL, desc As String
    Dim row As Range
    Dim index As Integer
    Application.Volatile
    
    theURL = "http://www.monoprice.com/products/product.asp?p_id=" + prodID

    Set htm = CreateObject("htmlFile")
    With CreateObject("msxml2.xmlhttp")
        .Open "GET", theURL, False
        .send
        htm.body.innerHTML = .responseText
    End With

    'Loop through all the rows and sum items with the same PID
    'row.Column(2) = PID
    'row.Column(3) = Quantity purchased
    prodCount = 0
    For Each row In ActiveSheet.UsedRange.Rows
        If row.Columns(2) = prodID And row.Columns(2) <> "PID" Then
            prodCount = prodCount + row.Columns(3).Value
        End If
    Next row

    'loop through table tags
    For Each hElem In htm.getElementsByTagName("table")
        If hElem.getAttribute("width") = "260" Then
            Set divHTML = CreateObject("htmlFile")
            divHTML.body.innerHTML = hElem.innerHTML
            index = 0
            For Each divElem In divHTML.getElementsByTagName("div")
                index = index + 1
                If prodCount = 1 And index = 4 Then
                    GetPrice = divElem.innerHTML
                    Exit For
                ElseIf prodCount >= 2 And prodCount <= 9 And index = 6 Then
                    GetPrice = divElem.innerHTML
                    Exit For
                ElseIf prodCount >= 10 And prodCount <= 19 And index = 8 Then
                    GetPrice = divElem.innerHTML
                    Exit For
                ElseIf prodCount >= 20 And prodCount <= 49 And index = 10 Then
                    GetPrice = divElem.innerHTML
                    Exit For
                ElseIf prodCount >= 50 And nerdCount = 12 Then
                    GetPrice = divElem.innerHTML
                    Exit For
                End If            
            Next divElem
        End If
    Next hElem
End Function

Function StripHTML(sInput As String) As String
 Dim RegEx As Object
 Set RegEx = CreateObject("vbscript.regexp")

 Dim sOut As String

 With RegEx
    .Global = True
    .IgnoreCase = True
    .MultiLine = True
    .Pattern = "<[^>]+>|\?" 'Regular Expression for HTML Tags.
    sOut = RegEx.Replace(sInput, "")
    .Pattern = "[^A-Za-z0-9 \|\/\.\-]" ' & Chr(39) & "]"
    sOut = RegEx.Replace(sOut, "")
 End With 
 StripHTML = sOut
 Set RegEx = Nothing
End Function