Saturday, January 15, 2011

Comparing file tables with DiffKit

In order for DiffKit to diff two Tables (a Table is simply a set of rows), it must know how to align the rows from the left side Table with those on the right. It does that using a key: one or more columns. Before the row sets can be diff’d, they must be sorted by that key. If you are using a DB Source, DiffKit will sort the Tables for you and you don’t need to do anything. If you are using a File Source, DiffKit will not sort the files for you. In the future, I plan to modify DiffKit to do the sorting for you, but in the meantime you must sort the files yourself.
When you perform the sort, you need to ensure that you are sorting using the same comparison function that DiffKit will use internally to compare rows. That’s because DiffKit figures out ROW_DIFFs by comparing rows internally; if the file was sorted under a different ordering, rows that exist on both sides may fail to line up.
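To see why the orderings have to agree, here is a minimal Python sketch of a merge-style alignment over two sorted key lists. This is not DiffKit's actual code; the function name `align` and the sample keys are made up for illustration, and the only assumption is that the aligner walks both sides in order and compares keys as strings.

```python
def align(left_keys, right_keys):
    """Walk two key lists in order, comparing keys as strings.

    Assumes both lists are sorted under the same string ordering.
    Returns (matched, left_only, right_only).
    """
    matched, left_only, right_only = [], [], []
    i = j = 0
    while i < len(left_keys) and j < len(right_keys):
        l, r = left_keys[i], right_keys[j]
        if l == r:
            matched.append(l)
            i += 1
            j += 1
        elif l < r:  # string comparison: "10" < "2"
            left_only.append(l)
            i += 1
        else:
            right_only.append(r)
            j += 1
    # Whatever is left on either side has no partner.
    left_only.extend(left_keys[i:])
    right_only.extend(right_keys[j:])
    return matched, left_only, right_only


# Left side sorted lexically, as a string comparison expects.
good = align(["10", "2"], ["10"])

# Left side sorted numerically, but still compared as strings.
bad = align(["2", "10"], ["10"])
```

With the lexically sorted input, key "10" is matched and only "2" is reported as missing on the right. With the numerically sorted input, "10" is reported as missing on both sides even though it is present in both, which is exactly the kind of false diff a mismatched sort produces.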
If you are using a MagicPlan to diff File Sources, DiffKit has no data type information about the columns; MagicPlan doesn’t allow it. It’s just a text file, so DiffKit has to assume that all columns are data type String. In that case, DiffKit will use a lexical (String) comparison internally to compare rows, and you must ensure that you have also used a lexical sort when you sort the file. I believe that the default comparison for Unix sort is lexical, although its exact ordering depends on the locale.
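If you would rather not depend on the locale of the sort command, a lexical sort is easy to do in a few lines of Python. The sketch below is only an illustration: the function name `sort_lines_lexically` is hypothetical, it assumes the first line is a header and the delimiter never appears inside a field, and it assumes that Python's string ordering matches DiffKit's for your key values (which I would expect for plain ASCII keys).

```python
def sort_lines_lexically(lines, key_index, delimiter=","):
    """Sort data lines by one column, comparing the key as plain text.

    Assumes lines[0] is a header row that must stay on top.
    """
    header, data = lines[0], lines[1:]
    data_sorted = sorted(
        data,
        key=lambda line: line.split(delimiter)[key_index],
    )
    return [header] + data_sorted


lines = ["id,name", "2,b", "10,a", "1,c"]
lexical = sort_lines_lexically(lines, key_index=0)
```

Note that "10" sorts before "2" here, because the keys are compared character by character rather than as numbers.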
If you are using a PassthroughPlan to diff File Sources, you need to tell DiffKit the type of each column. If in the PassthroughPlan you have told DiffKit that the key column is type String, then DiffKit behaves exactly as in the case of the MagicPlan, and you must use a lexical sort on the file. However, if in the PassthroughPlan you tell DiffKit that the key has a numeric data type, then DiffKit will internally use a numeric comparison on the rows, and you must sort the file using a numeric comparison.
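For a numeric key, the same idea applies with the key parsed as a number before comparing. Again this is a hedged sketch, not anything DiffKit ships: `sort_lines_numerically` is a hypothetical name, the first line is assumed to be a header, and `Decimal` is used so that both integer and decimal key values parse.

```python
from decimal import Decimal


def sort_lines_numerically(lines, key_index, delimiter=","):
    """Sort data lines by one column, comparing the key as a number.

    Assumes lines[0] is a header row and every key parses as a decimal.
    """
    header, data = lines[0], lines[1:]
    data_sorted = sorted(
        data,
        key=lambda line: Decimal(line.split(delimiter)[key_index]),
    )
    return [header] + data_sorted


lines = ["id,name", "2,b", "10,a", "1,c"]
numeric = sort_lines_numerically(lines, key_index=0)
```

Here "10" lands after "2", which is the order a numeric comparison expects and the opposite of what a lexical sort gives.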
Bottom line:
DiffKit internal comparison ==(must equal)== comparison used to sort File. DiffKit internal comparison is based on data type(s) of the key.
MagicPlan always results in String data type(s) for the key. PassthroughPlan results in whatever column data types you specify for the key.
