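HttpWatch is a well-known and powerful tool for analyzing web page traffic. With HttpWatch we can easily capture HTTP requests and collect figures such as page load time.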
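Inspecting these results by hand is time-consuming and tedious, so how can we collect and analyze the information automatically? According to the HttpWatch help documentation, it exposes an automation interface that can be called from C#, JavaScript and Ruby, and examples are provided for those languages. After some investigation I found that the same interface can also be called from Python, and below I describe how to do it.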
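It is actually quite simple. First install pywin32, which provides the win32com module used to reach the interface. Then consult the help documentation (it is in the installation directory and lists every interface). The sample program below was written from that documentation and tested with Python 2.7 and HttpWatch 6.0: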
# -*- coding: cp936 -*-
import win32com.client

# Create a new instance of HttpWatch in IE
control = win32com.client.Dispatch('HttpWatch.Controller')

# Open the IE browser
plugin = control.IE.New()

# Start recording HTTP traffic
plugin.Log.EnableFilter(False)
plugin.Record()

# Go to the URL and wait for the page to be loaded
plugin.GotoURL("http://www.baidu.com")

# Wait() waits for a page to be fully loaded in the IE instance that
# contains the specified plugin; it is normally used after GotoURL()
control.Wait(plugin, -1)

# Stop recording HTTP traffic
plugin.Stop()

# Only read the log if at least one page was recorded
if plugin.Log.Pages.Count != 0:
    page = plugin.Log.Pages.Item(0)
    entries = page.Entries
    summary = entries.Summary

    # Get the response headers of the first request and print them
    headers = entries.Item(0).Response.Headers
    print("Response Header: ")
    for i in range(headers.Count):
        header = headers.Item(i)
        print("%s: %s" % (header.Name, header.Value))

    # Get the performance summary and print it
    print("Page Title: %s" % page.Title)
    print("DNS Lookups (ms): %s" % summary.DNSLookUps)
    print("Total time to load page (secs): %s" % summary.Time)
    print("Downloaded Data: %s" % summary.BytesReceived)
    print("HTTP compression savings (bytes): %s" % summary.CompressionSavedBytes)
    print("Number of round trips: %s" % summary.RoundTrips)
    print("Number of errors: %s" % summary.Errors.Count)
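
For automatic analysis it is usually more convenient to copy the values out of the COM objects into ordinary Python data instead of printing them. The following is only a minimal sketch of that idea. It assumes nothing beyond what the sample above already relies on, namely that a collection exposes Count and Item(i) and that a header exposes Name and Value. The names iter_collection, headers_to_dict, FakeCollection and FakeHeader are made up for this illustration and are not part of HttpWatch or pywin32; the two Fake classes exist only so the helpers can be tried on a machine without HttpWatch.

```python
def iter_collection(collection):
    """Yield every item of a collection that exposes Count and Item(i)."""
    for i in range(collection.Count):
        yield collection.Item(i)


def headers_to_dict(headers):
    """Map header names to values.

    If a header name occurs more than once, the last value wins.
    """
    result = {}
    for header in iter_collection(headers):
        result[header.Name] = header.Value
    return result


class FakeHeader(object):
    """Hypothetical stand-in for one header (Name and Value only)."""

    def __init__(self, name, value):
        self.Name = name
        self.Value = value


class FakeCollection(object):
    """Hypothetical stand-in for a COM collection (Count and Item only)."""

    def __init__(self, items):
        self._items = list(items)
        self.Count = len(self._items)

    def Item(self, index):
        return self._items[index]


# Try the helpers on the stand-in objects
demo_headers = FakeCollection([
    FakeHeader("Content-Type", "text/html"),
    FakeHeader("Server", "demo"),
])
demo_result = headers_to_dict(demo_headers)
```

With a real recording, the same call would be headers_to_dict(entries.Item(0).Response.Headers), and the resulting dict can then be compared, stored or checked against thresholds like any other Python value.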