com.mindprod.csv
Class TableToCSV

java.lang.Object
  extended by com.mindprod.csv.TableToCSV

public final class TableToCSV
extends java.lang.Object

Extracts the rows of HTML tables to CSV form. Data from every table in the input is extracted and written to xxx.csv.

Use: java.exe com.mindprod.csv.TableToCSV xxx.html
It also strips tags and converts entities back to the UTF-8 characters they represent.
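
To make the description above concrete, here is a minimal, self-contained sketch of the kind of transformation involved: find each table row, pull out its cells, strip any tags inside them and decode a few common entities. It is not the com.mindprod.csv implementation. The class name HtmlTableSketch, its method names and the regex-based approach are all assumptions made for illustration, and a real converter would need a proper HTML parser and a full entity table.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative sketch only; hypothetical class, not part of com.mindprod.csv. */
final class HtmlTableSketch {
    // Matches one table row and captures its inner HTML.
    private static final Pattern ROW = Pattern.compile(
            "<tr\\b[^>]*>(.*?)</tr>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Matches one header or data cell and captures its inner HTML.
    private static final Pattern CELL = Pattern.compile(
            "<t[dh]\\b[^>]*>(.*?)</t[dh]>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Strips tags first, then decodes a handful of common entities. */
    static String cleanCell(String html) {
        String text = html.replaceAll("<[^>]*>", "");
        text = text.replace("&lt;", "<")
                   .replace("&gt;", ">")
                   .replace("&quot;", "\"")
                   .replace("&amp;", "&"); // decoded last so "&amp;lt;" becomes "&lt;"
        return text.trim();
    }

    /** Returns one list of cleaned cell values per table row found in the input. */
    static List<List<String>> extractRows(String html) {
        List<List<String>> rows = new ArrayList<>();
        Matcher row = ROW.matcher(html);
        while (row.find()) {
            List<String> cells = new ArrayList<>();
            Matcher cell = CELL.matcher(row.group(1));
            while (cell.find()) {
                cells.add(cleanCell(cell.group(1)));
            }
            rows.add(cells);
        }
        return rows;
    }
}
```

Because the sketch matches rows wherever they occur, it collects rows from every table in the input, which mirrors the "all tables" behaviour described above. Nested tables and unclosed tags are not handled.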

Since:
2011-01-23
Version:
1.1 2011-01-25 allow you to specify encoding
Author:
Roedy Green, Canadian Mind Products

Constructor Summary
TableToCSV(java.io.File file, char separatorChar, char quoteChar, char commentChar, java.lang.String encoding)
          Constructor to convert an HTML table to CSV.
 
Method Summary
static void main(java.lang.String[] args)
          Simple command line interface to TableToCSV.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

TableToCSV

public TableToCSV(java.io.File file,
                  char separatorChar,
                  char quoteChar,
                  char commentChar,
                  java.lang.String encoding)
           throws java.io.IOException
Constructor to convert an HTML table to CSV. Strips tags and converts entities to the characters they represent.

Parameters:
file - HTML file containing the tables to be converted to CSV.
separatorChar - field separator character for the output file, usually ',' in North America, ';' in Europe and sometimes '\t' for tab. Note this is a char, not a String.
quoteChar - character used to quote fields containing awkward chars.
commentChar - character that introduces a comment.
encoding - encoding of the input and output file.
Throws:
java.io.IOException - if problems reading/writing file
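
As a rough illustration of how separatorChar and quoteChar shape the output, the sketch below joins one row of fields into a CSV line and quotes only the fields that contain awkward characters. It is an assumption about typical CSV writing, not the code TableToCSV actually uses. The class CsvLineSketch and its methods are hypothetical, the choice of what counts as "awkward" (separator, quote, line breaks, leading or trailing space) is a guess, and commentChar is not modelled at all.

```java
import java.util.List;

/** Illustrative sketch only; hypothetical class, not part of com.mindprod.csv. */
final class CsvLineSketch {
    /** Wraps the field in quoteChar when it contains characters that would confuse a CSV reader. */
    static String quoteIfNeeded(String field, char separatorChar, char quoteChar) {
        boolean awkward = field.indexOf(separatorChar) >= 0
                || field.indexOf(quoteChar) >= 0
                || field.indexOf('\n') >= 0
                || field.indexOf('\r') >= 0
                || !field.equals(field.trim()); // leading or trailing whitespace
        if (!awkward) {
            return field;
        }
        String q = String.valueOf(quoteChar);
        // Embedded quote characters are doubled, the usual CSV convention.
        return q + field.replace(q, q + q) + q;
    }

    /** Joins the fields of one row with separatorChar, quoting where needed. */
    static String joinLine(List<String> fields, char separatorChar, char quoteChar) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.size(); i++) {
            if (i > 0) {
                sb.append(separatorChar);
            }
            sb.append(quoteIfNeeded(fields.get(i), separatorChar, quoteChar));
        }
        return sb.toString();
    }
}
```

Both arguments are char values, so a caller would write ',' or ';' in single quotes rather than "," or ";". With ';' as the separator, a field such as "a, b" needs no quoting, while "c;d" does.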
Method Detail

main

public static void main(java.lang.String[] args)
Simple command line interface to TableToCSV. Converts one HTML file to a CSV file, extracting all tables, with tags stripped and entities converted to characters. The input file must have the extension .html.
Use: java com.mindprod.csv.TableToCSV somefile.html
You can also use the TableToCSV constructor in your own programs.

Parameters:
args - name of the HTML file to convert to CSV.