Monday, December 27, 2010

Generating db patches with DiffKit

In the 0.8.7 release, DiffKit gained the ability to generate database "patches". These patches are analogous to the patch files produced by traditional *nix diff tools-- they can be read by a patching tool in order to edit the RHS so that it is identical to the LHS.
Table 1. DiffKit versus Diffutils
diff tool patch format patch tool
DiffKit
diffkit-app
sql DML
any DML applicator
Diffutils
diff
unidiff
patch
The "patch" files produced by DiffKit contain only INSERT, DELETE, and UPDATE statements. After the user applies those DML statements to the RHS table, using whichever tools and techniques they prefer, the RHS table will have identical contents to the LHS table. The DiffKit application will never directly modify your tables-- DiffKit is strictly a read-only application from the perspective of your table data.
DB patches are created by using a new Sink implementation: the SqlPatchSink. test26.plan.xml, in the eg/ (examples) folder, dmonstrates this:
test26.plan.xml
   ...
      <property name="sqlPatchFilePath" value="./test26.sink.patch" />
   ...
invoked this way:
java -jar ../diffkit-app.jar -planfiles test26.plan.xml,dbConnectionInfo.xml
produces this output in the patch file:
test26.sink.patch
DELETE FROM PUBLIC.TEST26_RHS_TABLE
WHERE (COLUMN1='1' );

INSERT INTO PUBLIC.TEST26_RHS_TABLE (COLUMN1, COLUMN2, COLUMN3, COLUMN4)
VALUES ('2', 'xxxx', 2, 'zz2zz');

UPDATE PUBLIC.TEST26_RHS_TABLE
SET COLUMN3=3
WHERE (COLUMN1='3' );

UPDATE PUBLIC.TEST26_RHS_TABLE
SET COLUMN2='5555', COLUMN3=4, COLUMN4='zz4zz'
WHERE (COLUMN1='4' );

INSERT INTO PUBLIC.TEST26_RHS_TABLE (COLUMN1, COLUMN2, COLUMN3, COLUMN4)
VALUES ('5', 'xxxx', 5, 'zz5zz');

DELETE FROM PUBLIC.TEST26_RHS_TABLE
WHERE (COLUMN1='6' );

Wednesday, December 15, 2010

Embedding the DiffKit framework in your application

Here is some helpful information for using the DiffKit framework within your Java application.
  • everything you need is inside the binary distribution: diffkit-<release>.zip. In fact, everything you need is within the standalone application: diffkit-app.jar. You do not need the source distribution.
  • unjar diffkit-app.jar. All of the DiffKit api is then in the diffkit- <release>.jar file. That file is (as of this writing) 315KB and you can embed it in your Java application in the same way that you would embed any other jar.
  • all of the diffkit-<release>.jar dependencies are in the lib/ directory that resulted from unjarring diffkit-app.jar. Not all of those jars are hard dependencies-- many of them will only be loaded if you are touching certain functionality. In particular, if you are embedding DiffKit in your application and only programming against the core apis, you do not need to include these jars in your application:
    • groovy-all-<release>.jar — only needed to run the embedded TestCaseRunner.
    • h2-<release>.jar — only needed if you want to use a DKDBSource or DKDBSink that is configured for the H2 database.
    • db2jcc.jar,db2jcc_license_cu.jar — only needed if you want to use a DKDBSource or DKDBSink that is configured for the IBM DB2 database.
    • ojdbc14.jar — only needed if you want to use a DKDBSource or DKDBSink that is configured for the Oracle database.
    • mysql-connector-java-5.1.13-bin.jar — only needed if you want to use a DKDBSource or DKDBSink that is configured for the MySQL database.
    • jtds-1.2.5.jar — only needed if you want to use a DKDBSource or DKDBSink that is configured for the SQL Server database.
    • postgresql-9.0-801.jdbc4.jar — only needed if you want to use a DKDBSource or DKDBSink that is configured for the PostgreSQL database.
    • hsqldb.jar — only needed if you want to use a DKDBSource or DKDBSink that is configured for the HyperSQL database.
    • org.springframework.*.jar — only needed if you want to configure your use of DiffKit via the Spring framework. If all of your DiffKit configuration is programmatic, then you don’t need the Spring jars.
  • The combination of diffkit-<release>.jar + it’s core dependencies (excludes Groovy, Spring and all of the JDBC drivers) is 2.6MB.

Thursday, December 2, 2010

DiffKit 0.8.5 released -- fixes several minor bugs

Changelog: 


release 0.8.5 (12/2/2010)

fixes Issue 52: DiffKit 0.8.4- does not work with Java 1.5 and
flavor=SQLSERVER

fixes Issue 53: mysql unrecognized dbType _MYSQL_INT_UNSIGNED

fixes Issue 54: DiffKit 0.8.4- does not work with SQLServer 2005

Wednesday, December 1, 2010

Activate tracing debugging in DiffKit

By default DiffKit emits a small amount of helpful information on stdout. But DiffKit is capable of producing very fine-grained tracing and debugging information that cover all aspect of operation. You might find this helpful for diagnosing some type of problem with DiffKit. Or you might simply want a better understanding of how DiffKit works. If you are a Java programmer and interested in learning about the internals of DiffKit, reading the "logs", configured for the appropriate logging level, is a very productive technique. The DiffKit logs are designed to be highly readable and they try to tell a story.
DiffKit uses the Open Source Logback Java framework to configure and control all logging. Logback is the official successor project to the ubiquitous log4j. In DiffKit, logback is configured by editing a file named "logback.xml", which is found in the "conf/" directory under the DiffKit home (DiffKit home is usually the directory where you unzipped the distribution zip file; it usually contains the diffkit-app.jar executable).
logback.xml
...
   <logger name="org.diffkit">
      <level value="warn" />
   </logger>

   <logger name="user" additivity="false">
      <level value="info" />
      <appender-ref ref="USER" />
   </logger>
...
DiffKit uses two tiers of logging information: "user" and "system". "User" log messages are intended for regular DiffKit users and represent typical operational information that is presented on your console (standard out, or stdout). The standard output you see when you invoke DiffKit comes from the "user" tier. The "system" tier of messages is targeted at engineers who need to trace or debug DiffKit internals in order to better understand what is going on. Each of these tiers has it’s own "domain" (entry) in the configuration file. The "system" tier is represented by this entry:
logback.xml
...
   <logger name="org.diffkit">
      <level value="warn" />
   </logger>
...
DiffKit adheres to the standard logging "level" conventions used by logback and log4j:
  • trace: a crazy level of detail, only suitable for the deepest debugging.
  • debug: high level of detail, useful for debugging, not suitable for day-to-day operations.
  • info: a normal, conversational, level of information. Includes routine operational messages that help orient and inform the user about normal operating parameters and outcomes.
  • warn: something that needs to be looked into. Represents an abnormal operating condition, but not necessarily fatal.
  • error: something is totally broken, and things are probably not working, but there is some slim chance the program can still stagger on.
  • fatal: game over.
As you can see from the logback.xml configuration file, DiffKit normally logs info level messages in the "user" tier, but only warn (or worse) level messages in the system tier. If you want to see what’s going on at the system level, you can do this:
logback.xml
...
   <logger name="org.diffkit">
      <level value="info" />
   </logger>
...
If you want a lot more information, you can do this:
logback.xml
...
   <logger name="org.diffkit">
      <level value="debug" />
   </logger>
...