    How to make a program to validate web links

    Author: 2008-09-13 10:09:51 From:

    You can be pretty sure that www.microsoft.com will always be a valid link, but many other web links tend to come and go as their authors move on, get bored, or whatever. If you maintain a web page with lots of external links, it is in your interest to ensure that the links are functioning. If lots of the links on your page are dead, your visitors will not be impressed! Many web page design tools, such as FrontPage, have a "validate links" command but I have found these to be fairly useless. They tell you which links are OK and which are not, but they do not permit you to do anything about the bad links (other than manually removing them from the pages). I needed a better solution.

    I keep my links database in an Excel worksheet - one column for URL, one for Title, etc. I have written an Excel Basic macro that reads through the table and writes the HTML code for each of my links pages. I decided to write a Visual Basic program that would:

    1. Use Automation (the code below drives Excel through an Excel.Application object) to read a URL from the links worksheet.
    2. Attempt to connect to the URL and note whether the URL is valid or not.
    3. Use Automation to put a result label in the worksheet: "OK" if the URL was good, or a short failure reason if it was not.
    4. Continue for all the links in the worksheet.

    Then, I would re-write the Excel Basic macro so that a link will be written to the HTML file only if the URL is marked as "OK." This approach has worked very well for me.

    Here's the code. Create a form with a Text Box (Text1), an Internet Transfer control (Inet1), four Label controls (Label1 through Label4), and a control array of two Command Buttons (Command1(0) for Start and Command1(1) for Quit). In the General Declarations section put the following code:

    Option Explicit
    Const FILENAME = "c:\documents\stamps\links.xls"
    Dim XLObj As Excel.Application

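    Change the FILENAME constant to point to your worksheet file. Because XLObj is declared as Excel.Application, the project also needs a reference to the Microsoft Excel Object Library (Project > References). Here is the code for the Form_Load and Command1_Click event procedures: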

    Private Sub Command1_Click(Index As Integer)

        Select Case Index
            Case 0 ' Start
                Command1(0).Enabled = False
                Call CheckLinks
                Command1(0).Enabled = True
            Case 1 ' Quit
                End
        End Select

    End Sub

    Private Sub Form_Load()

        ' Create the Excel object and open the worksheet.
        Set XLObj = CreateObject("Excel.Application")
        XLObj.Workbooks.Open FILENAME
        Inet1.Protocol = icHTTP

    End Sub
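
An unreachable server can hold up the run until the request times out, so the Internet Transfer control's RequestTimeout property (in seconds) has a large effect on total run time. A possible variant of Form_Load is sketched below; the 20-second value is only an illustrative assumption and does not come from the original program.

```vb
Private Sub Form_Load()

    ' Create the Excel object and open the worksheet.
    Set XLObj = CreateObject("Excel.Application")
    XLObj.Workbooks.Open FILENAME
    Inet1.Protocol = icHTTP
    ' Assumption: give up on a request after 20 seconds.
    ' Pick a value that suits your connection.
    Inet1.RequestTimeout = 20

End Sub
```

A shorter timeout speeds up the run but may mark slow servers as "Timed out", so it is a trade-off rather than a fix.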

    The bulk of the action goes on in the CheckLinks() procedure.

    Public Sub CheckLinks()

        Dim row As Integer, url As String
        Dim buf As String, msg As String, fnf As Integer
        Dim snf As Integer, tout As Integer, ok As Integer

        ' Errors are not trapped; Err is inspected by hand
        ' after each request.
        On Error Resume Next
        ' Make row equal to the worksheet row where
        ' your data starts.
        row = 4
        tout = 0
        fnf = 0
        ok = 0
        snf = 0

        ' Minimize the form.
        Form1.WindowState = vbMinimized
        Do
            ' I keep URLs in column 3 (C) of the worksheet.
            url = XLObj.Cells(row, 3).Value
            ' If it's empty we are done.
            If url = "" Then Exit Do
            ' Clear any earlier error and the previous page text,
            ' so a failed request cannot inherit the last result.
            Err.Clear
            Text1.Text = ""
            ' Try to open the URL.
            Text1.Text = Inet1.OpenURL(url)
            DoEvents
            ' Put the first 50 characters of the returned text
            ' in a buffer (Left returns the whole string if it
            ' is shorter). Error messages will be found here.
            buf = Left(Text1.Text, 50)
            ' Catch a time out error.
            If Err.Number = 35761 Then
                msg = "Timed out"
                tout = tout + 1
                Err.Clear
            ' If nothing is returned it usually means
            ' that the server was not found.
            ElseIf Text1.Text = "" Then
                msg = "Server not found"
                snf = snf + 1
            ' If error 404 is returned from the URL
            ' it means the server was found but
            ' the requested file was not present.
            ElseIf InStr(1, buf, "404") > 0 Then
                msg = "File not found"
                fnf = fnf + 1
            ' Otherwise the link is OK.
            Else
                msg = "OK"
                ok = ok + 1
            End If
            ' Put the result in column 5 of the worksheet.
            XLObj.Cells(row, 5).Value = msg
            ' Move to the next row.
            row = row + 1
            ' Display current status on form.
            Form1.Caption = CStr(ok + fnf + snf + tout)
            Label1.Caption = "OK: " & ok
            Label2.Caption = "File not found: " & fnf
            Label3.Caption = "Server not found: " & snf
            Label4.Caption = "Timed out: " & tout
        Loop

        ' When all links checked, restore the form.
        Form1.WindowState = vbNormal
        ' Save the results so they are not lost, then close
        ' the worksheet and shut down the hidden Excel instance.
        XLObj.ActiveWorkbook.Save
        XLObj.Workbooks.Close
        XLObj.Quit
        ' Delete the object.
        Set XLObj = Nothing
        ' Display a summary of results.
        buf = "OK: " & ok & vbCrLf
        buf = buf & "Server not found: " & snf & vbCrLf
        buf = buf & "File not found: " & fnf & vbCrLf
        buf = buf & "Timed out: " & tout
        MsgBox buf

    End Sub
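
The decision logic inside CheckLinks can also be pulled out into a function that touches neither the form nor Excel, which makes it easier to try out on its own. The sketch below is one way to do that. ClassifyLink and its parameter names are hypothetical and not part of the original program; the rules and the 35761 error code are the ones CheckLinks already uses.

```vb
Option Explicit

' Hypothetical helper (not in the original program).
' errNumber: the value of Err.Number after Inet1.OpenURL
' body:      the text returned by OpenURL ("" if nothing came back)
Public Function ClassifyLink(ByVal errNumber As Long, _
                             ByVal body As String) As String

    ' Same timeout error code that CheckLinks tests for.
    Const TIMEOUT_ERROR As Long = 35761

    If errNumber = TIMEOUT_ERROR Then
        ClassifyLink = "Timed out"
    ElseIf Len(body) = 0 Then
        ' Nothing returned usually means the server was not found.
        ClassifyLink = "Server not found"
    ElseIf InStr(1, Left$(body, 50), "404") > 0 Then
        ' "404" near the start of the page: file not present.
        ClassifyLink = "File not found"
    Else
        ClassifyLink = "OK"
    End If

End Function
```

Inside the loop it could be called as msg = ClassifyLink(Err.Number, Text1.Text). Like the original test, it only looks for the text "404" in the first 50 characters, so a page that happens to start with that text would be misreported.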

    The program takes a while to run - a couple of hours to check a few hundred links - but because it is running in the background you can continue to use your system for other tasks.
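
The other half of the approach is the Excel macro that writes a link only when its row is marked "OK". The article does not show that macro, so the following is only a minimal sketch of the idea. The macro name, the title column (B) and the output path are assumptions; the start row, URL column and result column match CheckLinks.

```vb
' Hypothetical Excel macro (not the author's): writes an HTML list
' containing only the links that CheckLinks marked "OK".
Sub WriteOkLinks()

    Const FIRST_ROW As Long = 4     ' same start row as CheckLinks
    Const TITLE_COL As Long = 2     ' assumption: titles in column B
    Const URL_COL As Long = 3       ' column C, as in CheckLinks
    Const STATUS_COL As Long = 5    ' column E, as in CheckLinks
    Const OUT_FILE As String = "c:\temp\links.html"   ' assumption

    Dim ws As Worksheet
    Dim r As Long, f As Integer
    Dim url As String, title As String

    Set ws = ActiveSheet
    f = FreeFile
    Open OUT_FILE For Output As #f
    Print #f, "<ul>"

    r = FIRST_ROW
    ' Stop at the first empty URL cell, as CheckLinks does.
    Do While ws.Cells(r, URL_COL).Value <> ""
        If ws.Cells(r, STATUS_COL).Value = "OK" Then
            url = ws.Cells(r, URL_COL).Value
            title = ws.Cells(r, TITLE_COL).Value
            Print #f, "<li><a href=""" & url & """>" & title & "</a></li>"
        End If
        r = r + 1
    Loop

    Print #f, "</ul>"
    Close #f

End Sub
```

Rows marked "Timed out" are skipped along with the dead ones here; you may prefer to re-check those before dropping them.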
