- Harvest-NG
Harvest-NG is a collection of Perl modules and scripts that provide a web crawling and summarising agent. It aims to be an open source, standards-compliant tool for fetching content from a wide variety of information sources, summarising it into a set of resource descriptions, and storing these in an easily accessible database from which search services can be built and statistical information compiled.
Price: Free
Version: 1.0.2
http://webharvest.sourceforge.net/ng/
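The fetch, summarise, and store pipeline described for Harvest-NG can be illustrated with a minimal sketch. This is not Harvest-NG code (which is Perl) and does not reflect its actual record format or database. The `summarise` and `store` functions, the `resources` table, and its columns are all hypothetical names chosen for illustration, and fetching is left out so the example runs offline.

```python
import sqlite3
from html.parser import HTMLParser


class SummaryParser(HTMLParser):
    """Pulls the title and two common meta fields out of an HTML page."""

    def __init__(self):
        super().__init__()
        self.summary = {"title": "", "description": "", "keywords": ""}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            name = (attrs.get("name") or "").lower()
            if name in ("description", "keywords"):
                self.summary[name] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.summary["title"] += data.strip()


def summarise(url, html):
    """Reduce a fetched page to a small resource description (hypothetical schema)."""
    parser = SummaryParser()
    parser.feed(html)
    parser.close()
    return {"url": url, **parser.summary}


def store(conn, record):
    """Save one resource description in an SQLite table (hypothetical layout)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS resources "
        "(url TEXT PRIMARY KEY, title TEXT, description TEXT, keywords TEXT)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO resources "
        "VALUES (:url, :title, :description, :keywords)",
        record,
    )
    conn.commit()
```

A search service could then query the `resources` table directly, which is the sense in which the stored descriptions are "easily accessible" in this sketch.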
- Web Secretary
Web Secretary is a web page monitoring tool that goes beyond the usual features of such software. It detects changes by analysing page content rather than relying on date/time stamps or simple textual comparison, and it emails the changed page to you with the new content highlighted. Web Secretary is written in Perl and should run on any Unix system with the Perl interpreter and the LWP module installed.
Price: Free
Version: 1.3.4
Platform: Unix
http://homemade.hypermart.net/websec/
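To make the idea of content-based change detection with highlighting concrete, here is a minimal sketch. It is not Web Secretary's algorithm (which is implemented in Perl and not described in detail here). It assumes a simple approach: strip markup, compare the visible words of two page versions with Python's `difflib`, and wrap new or changed words in a marker. The names `page_words` and `highlight_new` and the `**` marker are hypothetical.

```python
import difflib
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible words, skipping script and style bodies."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.words.extend(data.split())


def page_words(html):
    """Return the visible words of an HTML page as a list."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.words


def highlight_new(old_html, new_html, mark="**"):
    """Return the new page's text with inserted or replaced words marked."""
    old, new = page_words(old_html), page_words(new_html)
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    out = []
    for op, _i1, _i2, j1, j2 in matcher.get_opcodes():
        chunk = " ".join(new[j1:j2])
        if not chunk:
            continue  # pure deletions leave nothing to show in the new page
        out.append(chunk if op == "equal" else f"{mark}{chunk}{mark}")
    return " ".join(out)
```

Because the comparison runs on extracted words rather than raw bytes, a change that only touches markup or whitespace produces no highlights, which is the practical difference from a timestamp check or a plain textual diff.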
- WebAwk
WebAwk is a proof of concept of a tool for automating web browsing and data collection. It works like AWK, except that it operates on HTML pages and hyperlinks instead of files and lines. It is meant to be run as a command-line script and includes the following: base_url, the URL the script was initially invoked on; base_path, the root of the saved data tree; url, the current URL being processed; linked_from, the parent of the current URL; and content, the actual data corresponding to the current URL.
Price: Free
Version: 0.1
http://lakas.itgo.com/
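The AWK analogy can be sketched as a loop that applies pattern/action rules to each page instead of each line. The sketch below is not WebAwk's syntax or implementation; it only reuses the five names from the description (`base_url`, `base_path`, `url`, `linked_from`, `content`). The `crawl` function, the `fetch` callable, the `rules` list, and the `max_pages` limit are assumptions made for illustration. Passing `fetch` in as a parameter keeps the example runnable without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkParser(HTMLParser):
    """Collects the href value of every anchor tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def crawl(base_url, base_path, fetch, rules, max_pages=50):
    """Visit pages breadth-first under base_url, running each rule per page.

    fetch(url) returns the page text or None; rules is a list of
    (pattern, action) pairs, both taking the per-page record.
    """
    queue = deque([(base_url, None)])
    seen = {base_url}
    visited = 0
    while queue and visited < max_pages:
        url, linked_from = queue.popleft()
        content = fetch(url)
        if content is None:
            continue
        visited += 1
        record = {
            "base_url": base_url,
            "base_path": base_path,
            "url": url,
            "linked_from": linked_from,
            "content": content,
        }
        for pattern, action in rules:  # the AWK-style "pattern { action }" step
            if pattern(record):
                action(record)
        parser = LinkParser()
        parser.feed(content)
        for href in parser.hrefs:
            target = urldefrag(urljoin(url, href)).url
            if target.startswith(base_url) and target not in seen:
                seen.add(target)
                queue.append((target, url))
```

A rule such as `(lambda r: "report" in r["content"], lambda r: print(r["url"], "from", r["linked_from"]))` plays the role of an AWK `/report/ { print }` clause, with the page record standing in for the current line.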