Monitoring web page changes

From Useful Data
Revision as of 20:09, 15 June 2017 by Simon (talk | contribs) (Recreated page)

There is a web page that changes very infrequently, but I need to know when it does change so I can act on the new information. I don't want to keep checking it myself; I want a computer to do that for me. But how best to make that happen?

Options

There seem to be these high-level options:

  • make use of any RSS feed the page provides;
  • a software program one installs and runs;
  • an online service that monitors the pages you are interested in;
  • a browser add-on that monitors the pages you are interested in;
  • writing a script to do it using, say, curl or wget.

Make use of any RSS feed the page provides

I wouldn't be writing this if the page had one of those!
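
For the record, one rough way to check whether a page advertises a feed is to look for the usual autodiscovery `<link rel="alternate">` tags in its HTML. The sketch below, in Python, assumes the page follows that convention; `FeedLinkFinder` and `find_feeds` are names made up for this example.

```python
from html.parser import HTMLParser

# MIME types conventionally used to advertise RSS and Atom feeds.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkFinder(HTMLParser):
    """Collect href values of <link rel="alternate"> tags that advertise a feed."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # Attribute values can be None when the attribute has no value.
        rel = (attributes.get("rel") or "").lower().split()
        mime = (attributes.get("type") or "").lower()
        href = attributes.get("href")
        if "alternate" in rel and mime in FEED_TYPES and href:
            self.feeds.append(href)


def find_feeds(html):
    """Return the feed URLs (possibly relative) advertised in the given HTML."""
    finder = FeedLinkFinder()
    finder.feed(html)
    finder.close()
    return finder.feeds
```

An empty list only means no feed is advertised this way; a site could still have one that it does not link from the page.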

A software program one installs and runs

The only one I found for Linux, samidare (written in Ruby), is command-line based, and I can't work out how to use it.

An online service that monitors the pages you are interested in

Some are free, some cost money, and some offer both free and paid plans. These include Versionista, Visualping, FollowThatPage, Femtoo, ChangeDetection, WatchThatPage and many others. I have not reviewed any of these yet. They tend to let you know by email if a page has changed.

A browser add-on that monitors the pages you are interested in

I'm only interested in Firefox add-ons at this time. These include Alertbox, Check4Change, Distill Web Monitor, Follow That Page, Fox Notifier, SiteDelta, UpdateScanner and Wachete.

Things to think about when comparing these add-ons:

  • Do you need to keep the page open?
  • How does it do notification?
  • What if you close and re-open Firefox, and the page has changed during that time?
  • Does it deserve a ?, a ✔ or a ✘ ?
  • Is it free?

✘ Alertbox

Link. Now called Distill Web Monitor.

✘ Check4Change ★★★★☆ from 142 user reviews. 26,919 users

Link. Says: "C4C currently only works with open tabs. It does not…remember running jobs between FireFox restarts", which is a show-stopper for me.

✘ Distill Web Monitor ★★★★☆ from 65 reviews. 15,770 users

Link. It looked promising, so I tried it. If you want to monitor web sites from a variety of devices and are willing to pay for the privilege, then the cloud version of this looks ideal. The browser-only version has the benefit of letting you select regions of a web page to monitor, but I could not get it to work. The interface is overkill compared to UpdateScanner.

✘ Follow That Page ★★★☆☆ from 9 user reviews. 417 users

Link. Not updated since 2008. Comments suggest it is pants.

✘ Fox Notifier ★★★★☆ from 11 user reviews. 16 users

"Not available for Firefox 44.0". Last updated in 2010.

✘ SiteDelta ★★★★☆ from 62 user reviews. 1,638 users

Link. A confusing set of options, spread across multiple places. It is supposed to have a delta icon, but none appears. I can't get it to notify me of changes.

✔ UpdateScanner ★★★★☆ from 205 user reviews. 30,262 users

Link. It looked promising, so I tried it. I currently have it monitoring the BBC home page, the Google News site and the BBC Radio home page. It provides an icon in the toolbar that can open a sidebar listing the pages being monitored, or give you options to monitor the current web page. Hey, this works! A dialog box pops up saying how many web pages have changed, and when you click on it, it opens them in tabs.

✘ Wachete ----- from 0 user reviews. 7 users

Link. A paid-for web site monitoring service that emails you, disguised as a free add-on.


Writing a script to do it using, say, curl or wget

I thought this might turn out to be the most successful and easiest option, but I was wrong. The idea was to use cron to schedule a shell script that uses curl or wget to fetch the page(s) of interest, saves the output to a file, compares that file to the previously downloaded version to see if it has changed, and then notifies me. But many parts of web pages change that are not of interest, including, in the page that prompted this, a 'session number' in which I have no interest. The add-ons for browsers already have the functionality to manage that sort of thing, so there's not much point me re-coding the wheel.

Mind you, there's always Python. But, again, I'd only be repeating what the add-ons do.
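
For what it's worth, the fetch, compare and notify idea can be sketched in a few lines of Python. This is only a rough sketch under assumptions: the `SESSION_RE` pattern is a guess at what a 'session number' might look like (the real page will need its own pattern), and `fingerprint`, `has_changed`, `check`, the URL and the state-file path are all names invented for illustration. It masks the volatile parts, hashes what is left, and compares the hash with the one saved on the previous run.

```python
import hashlib
import re
import urllib.request
from pathlib import Path

# Hypothetical pattern: assumes the page embeds something like "session=123456".
# The real page's volatile markup will differ and needs its own pattern.
SESSION_RE = re.compile(r"session=\d+")


def fingerprint(html):
    """Hash the page text after masking parts that change on every fetch."""
    stable = SESSION_RE.sub("session=X", html)
    return hashlib.sha256(stable.encode("utf-8")).hexdigest()


def has_changed(html, state_file):
    """Compare with the fingerprint saved on the last run, then save the new one."""
    new = fingerprint(html)
    old = state_file.read_text().strip() if state_file.exists() else None
    state_file.write_text(new)
    # The first run has nothing to compare against, so it reports no change.
    return old is not None and old != new


def fetch(url):
    """Download the page; undecodable bytes are replaced rather than raising."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def check(url, state_file):
    """One monitoring pass: fetch the page and print a message if it changed."""
    if has_changed(fetch(url), Path(state_file)):
        print(f"Changed: {url}")
```

A cron job could then call something like `check("https://example.com/", "page.sha256")` on a schedule and rely on cron mailing any output. Printing is just a stand-in for whatever notification is wanted.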