Java ETL: hard to find a suitable one
I'm looking for an embeddable Java ETL, i.e., an Extract Transform Load engine that can be called from Java code.
I'm finding it surprisingly hard to find a suitable one.
I'm mainly looking at loading delimited text files into database tables, with some minor transforms along the way.
I'd like the following features:
- the ability to specify the simple mappings externally, e.g., text column 5 to database column foo, specified in some XML mapping file
- the ability to give the database node a javax.sql.DataSource
CloverETL allows mappings to be specified in XML, but database connections must be either JNDI names or a properties file specifying driverClass, url, dbusername, password, etc. Since I already have javax.sql.DataSource instances set up by my dependency injection framework, properties files seem painful and non-robust, especially if I want this to work in several environments (dev, test, prod).
KETL tells me that "We are currently in the process of completely overhauling our documentation for KETL™. Because of this, only the installation guide has been updated." Honest, but not helpful.
Octopus is now "http://www.together.at/prod/database/tdt", which is "under construction".
Pentaho seems to use the same "specify driverClass" style that CloverETL does, rather than using a DataSource, and Pentaho's documentation for calling their engine from Java code is difficult to find.
Basically I'd really like to be able to do this pseudo-code:
extractTransformLoad(
    getInputFile( "input.csv" ),
    getXMLMapping( "myMappingFile.xml" ),
    new DatabaseWriter( getDatasource() ) );
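To make the shape of what I'm after concrete, here is a minimal hand-rolled sketch in plain JDBC. It is not taken from any of the ETL tools above: the class name `DelimitedLoader` and its methods are hypothetical, and I'm assuming the mapping file uses the standard `java.util.Properties` XML format, with 1-based text column numbers as keys and database column names as values. It does no transforms beyond trimming and ignores quoting rules for delimited files.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Collections;
import java.util.Map;
import java.util.Properties;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.regex.Pattern;
import javax.sql.DataSource;

/** Hypothetical helper for illustration; not part of any ETL library. */
final class DelimitedLoader {

    /**
     * Reads a java.util.Properties XML document (assumed mapping format):
     * key = 1-based text column number, value = database column name.
     */
    static SortedMap<Integer, String> readMapping(InputStream xml) throws IOException {
        Properties props = new Properties();
        props.loadFromXML(xml);
        SortedMap<Integer, String> mapping = new TreeMap<>();
        for (String key : props.stringPropertyNames()) {
            mapping.put(Integer.valueOf(key.trim()), props.getProperty(key).trim());
        }
        return mapping;
    }

    /** Builds a parameterized INSERT; table and column names must come from trusted config. */
    static String buildInsertSql(String table, Map<Integer, String> mapping) {
        String columns = String.join(", ", mapping.values());
        String marks = String.join(", ", Collections.nCopies(mapping.size(), "?"));
        return "INSERT INTO " + table + " (" + columns + ") VALUES (" + marks + ")";
    }

    /** Loads every non-empty line of the file as one row; returns the number of rows batched. */
    static int load(Path input, String delimiter, String table,
                    SortedMap<Integer, String> mapping, DataSource dataSource)
            throws IOException, SQLException {
        String sql = buildInsertSql(table, mapping);
        int rows = 0;
        try (BufferedReader reader = Files.newBufferedReader(input, StandardCharsets.UTF_8);
             Connection connection = dataSource.getConnection();
             PreparedStatement statement = connection.prepareStatement(sql)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.isEmpty()) {
                    continue;
                }
                // Pattern.quote so delimiters such as "|" are treated literally.
                String[] fields = line.split(Pattern.quote(delimiter), -1);
                int parameterIndex = 1;
                for (int textColumn : mapping.keySet()) {
                    statement.setString(parameterIndex++, fields[textColumn - 1].trim());
                }
                statement.addBatch();
                rows++;
            }
            statement.executeBatch();
        }
        return rows;
    }
}
```

The point is the last parameter: the loader takes whatever `DataSource` the dependency injection framework already provides, so no driverClass/url/password properties file is needed per environment. I'd rather not maintain this myself, which is why I'm asking for an existing engine.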
Any suggestions?