HTM2TXT v 1.0


HTM2TXT v 1.0, May 8, 1997, by Otto Räder


Description:


   HTM2TXT.CMD is a REXX script which removes HTML tags from

   .HTML files used on the World Wide Web and stores

   the remaining text in an ASCII file.
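
   As an illustration of the idea only (this is not the REXX code in
   HTM2TXT.CMD), tag removal can be sketched in Python as follows. The
   names TextOnly and strip_tags are hypothetical, and the sketch does
   no line splitting, table handling or link following.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collects the text between tags; the tags themselves are dropped."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called for every run of text found outside of tags.
        self.parts.append(data)


def strip_tags(html):
    """Return the text content of an HTML string (hypothetical helper)."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)
```

   For example, strip_tags("<p>Some <b>bold</b> text.</p>") keeps only
   the words and discards the <p> and <b> tags.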


Group:


   HTM2TXT belongs to group: ..pub/os2/apps/internet/www/util/


Freeware:


   HTM2TXT may be distributed freely under the following conditions.

   Copyright notices must NOT be removed, all files contained in the file

   inventory below must be distributed together (you may not remove any

   files), and you may not charge for the program.


   If you find the program useful then send a post-card (picture of

   the location where you live) to:


       Otto Räder

       Hauptstrasse 61B/13

       A3001 Mauerbach

       ---------------

       Austria


Prerequisites:


   HTM2TXT requires OS/2 and REXX.


   It has been developed and tested under OS/2 Warp;

   there is no intention to port it to other platforms.


Distribution:


   The following files are contained in HTM2TXT1.ZIP:


    HTM2TXT.CMD       the REXX command-file 1997-05-08

    HTM2TXT.ICO       an icon file contributed by Gerard Pinkas, pinkas@en.com

    MAKEOBJ.CMD       a command to create a desktop program object

    README.TXT        documentation, this file

    FILE_ID.DIZ       Id-file


Installation:


   To install HTM2TXT just UNZIP the HTM2TXT1.ZIP file and place the

   command into a directory contained in your CONFIG.SYS PATH= statement.


   You may use MAKEOBJ.CMD to create a desktop object for HTM2TXT.CMD.

   You should run MAKEOBJ.CMD from the directory where HTM2TXT.CMD and

   HTM2TXT.ICO are installed.


Usage:


     From an OS/2 command line start HTM2TXT:


       htm2txt filename.htm


     Make sure filename.htm is in the current directory.


     The filename may contain the wildcard character '*'.


   or


     Drag and drop a .HTML object to the HTM2TXT object if you

     have created one using makeobj.cmd.


   HTM2TXT will create an output file 'filename.txt' and it will

   start an editor to view this file.


   Note: HTM2TXT follows <a href="..."> tags and tries

         to resolve the given link address. If the target can be

         accessed, it is included in the .txt file.
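
   The first half of that step, finding the link addresses, might look
   like the Python sketch below. It is an illustration only, not the
   REXX code in HTM2TXT.CMD; HrefCollector and collect_hrefs are
   hypothetical names, and how each address is then resolved and
   included is deliberately left out because this file does not
   describe it.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Records the href value of every <a> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before this call.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def collect_hrefs(html):
    """Return the list of link addresses found in an HTML string."""
    parser = HrefCollector()
    parser.feed(html)
    parser.close()
    return parser.hrefs
```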


   The following statements in HTM2TXT.CMD may be changed to customize operation:


     line 11: linemax=72    maximum line length in output file.

                            Any longer text is split into

                            output lines no longer than 'linemax'.


     line 12: pixlbyt=6     when <td width="nnnPIX">

                            then the column width

                            in tables is determined

                            by: chars = nnn/pixlbyt.


     line 13: editor='e'    the name of an ASCII editor to display the

                            result file. It may be changed to the

                            installation's favoured editor.


                            editor='' causes no editor to be called.



     line 14: chain='Y'     tells HTM2TXT to follow href-chains.

                            Any other setting inhibits chaining.


     line 15: showu='N'     tells HTM2TXT not to show href-chain addresses

                            in the output text. If set to 'Y', chain addresses

                            are shown in the output text.


     line 16: nocmt='N'     tells HTM2TXT not to suppress HTML comments.

                            Any other value suppresses HTML comments

                            in the output file.


     line 17: ofile='.TXT'  tells HTM2TXT to derive the output file name

                            from the input file name, i.e.

                            filename.TXT.

                            Any other value may specify a valid

                            path\filename or a symbolic device

                            like STDOUT.


     line 70: consts=       this is a table of variables to substitute

                            special characters. This table has been

                            contributed by tremro@digicom.qc.ca
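
   To make three of these settings concrete, here is a small Python
   sketch. It is an illustration, not the REXX code in HTM2TXT.CMD.
   The function names are hypothetical, and three details are
   assumptions: that lines are split at word boundaries, that the
   division nnn/pixlbyt is rounded down, and that only the exact
   value '.TXT' triggers name derivation.

```python
import textwrap
from pathlib import PureWindowsPath

LINEMAX = 72     # documented default: maximum output line length
PIXLBYT = 6      # documented default: pixels per character
OFILE = ".TXT"   # documented default: derive output name from input name


def split_text(text, linemax=LINEMAX):
    """Split text into lines no longer than linemax.

    Assumption: breaks fall at word boundaries.
    """
    return textwrap.wrap(text, width=linemax)


def column_chars(width_pixels, pixlbyt=PIXLBYT):
    """Column width in characters: chars = nnn/pixlbyt.

    Assumption: the result is rounded down to a whole character.
    """
    return width_pixels // pixlbyt


def output_name(input_name, ofile=OFILE):
    """Derive filename.TXT from the input name, or use ofile as given."""
    if ofile == ".TXT":
        return str(PureWindowsPath(input_name).with_suffix(".TXT"))
    return ofile
```

   With the defaults, a cell declared 120 pixels wide gets
   120/6 = 20 characters, and filename.htm is written to filename.TXT.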


   You may temporarily override these parameters by adding options

   when starting htm2txt from an OS/2 command line:


     htm2txt filename.html l 80      to set linemax to 80 characters

     htm2txt filename.html p  8      to set pixlbyt to  8 pixels/char

     htm2txt filename.html e tedit   to set editor  to tinyedit

     htm2txt filename.html o finame  to define an output file name

     htm2txt filename.html f n       to suppress chaining

     htm2txt filename.html u         to include url-references in .txt

     htm2txt filename.html c         to suppress comments


   These options may appear in any order after the filename:


     htm2txt filename.html e te p 8 l 80 u f n o stdout c
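
   The way such order-independent options map onto the settings can be
   sketched in Python as follows. This is an illustration, not the
   REXX code in HTM2TXT.CMD. The names DEFAULTS and parse_args are
   hypothetical, and two details are assumptions: that option letters
   are case-insensitive, and that 'u' and 'c' set showu and nocmt
   to 'Y'.

```python
# Defaults as documented for lines 11 to 17 of the script.
DEFAULTS = {"linemax": 72, "pixlbyt": 6, "editor": "e", "chain": "Y",
            "showu": "N", "nocmt": "N", "ofile": ".TXT"}

# Options followed by a value, and options that stand alone.
TAKES_VALUE = {"l": "linemax", "p": "pixlbyt", "e": "editor",
               "o": "ofile", "f": "chain"}
FLAGS = {"u": "showu", "c": "nocmt"}


def parse_args(argv):
    """Parse the tokens after the program name into a settings dict."""
    if not argv:
        raise ValueError("an input file name is required")
    settings = dict(DEFAULTS, input=argv[0])
    i = 1
    while i < len(argv):
        key = argv[i].lower()  # assumption: letters are case-insensitive
        if key in TAKES_VALUE:
            if i + 1 >= len(argv):
                raise ValueError(f"option '{key}' needs a value")
            name, value = TAKES_VALUE[key], argv[i + 1]
            if name in ("linemax", "pixlbyt"):
                value = int(value)
            elif name == "chain":
                value = value.upper()  # 'n' becomes 'N': chaining off
            settings[name] = value
            i += 2
        elif key in FLAGS:
            settings[FLAGS[key]] = "Y"
            i += 1
        else:
            raise ValueError(f"unknown option: {argv[i]}")
    return settings
```

   Feeding it the tokens of the command line above yields editor 'te',
   pixlbyt 8, linemax 80, showu 'Y', chain 'N', ofile 'stdout' and
   nocmt 'Y'.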


Warranty:


   The program is distributed on an as-is basis.

   It tries to extract as much text as possible;

   however, I am sure there are some special forms

   of tags which I missed.

   Normally such tags are simply ignored.


   There is no guarantee of particular results,

   nor any guarantee against damage to existing files.


   Note: In the current directory the program will

         overwrite any file that has the file name of the

         input file and the extension .TXT, e.g. filename.TXT!


Comments:


   Please send comments and recommendations to:


         oraeder@ibm.net
