JHarvester

[Work in progress]¬†A handy little app I’m making for myself that makes it easy to track textual information on any given webpage. It uses the page’s URL and the text element’s XPath to perform a harvest using the external htmlunit library. Lets me keep track of things like the price of an item I’m interested in, or a sports score, all without needing to go to the page.

Can handle pages that have AJAX; however, does not support pages that require authentication or pagination.

The code was developed in a test-driven way and employs the MVC design pattern.

Uses the following external libraries:

  • htmlunit by Gargoyle Software, Inc. licensed under Apache License, Version 2.0.
  • GSON by Google, Inc. licensed under Apache License, Version 2.0.

Full code can be found GitHub.

Update: 24 April, 2017

Added a background thread that periodically updates all the entries at their required update frequency.