Thursday, September 16, 2010

TextDiffor in 0.6.11

Version 0.6.11 introduces the TextDiffor, which is useful for diffing chunks of text that may have small formatting differences you would like to ignore. A good example is programming language source code. For instance, I was recently trying to diff SQL schemas (DDL) using metadata tables. One schema object that proved troublesome to diff was the TEXT definition of stored procedures. One side looked like this:

 DECLARE  
  number1 NUMBER(2);  
  number2 NUMBER(2)  := 17;       -- value default   
  text1  VARCHAR2(12) := 'Hello world';  
  text2  DATE     := SYSDATE;    -- current date and time  
 BEGIN  
  SELECT street_number  
   INTO number1  
   FROM address  
   WHERE name = 'INU';  
 END;  

while the other side looked like this:

 DECLARE  
  number1  NUMBER(2);  
  number2  NUMBER(2)  := 17;       -- value default   
  text1    VARCHAR2(12) := 'Hello world';  
  text2    DATE     := SYSDATE;    -- current date and time  
 BEGIN  
  SELECT street_number  INTO number1 FROM address WHERE name = 'INU';  
 END;  

Note that the variable declarations have small alignment differences between the two sides, and that the SQL SELECT statement spans multiple lines in the first case but only one line in the second. These two snippets are equivalent PL/SQL programs (they produce the same AST), but they differ textually.

The TextDiffor will, by default, see these two snippets as identical. It applies a very simple text normalization before performing the String comparison:

1) replace each tab and newline character ([\t\r\n]) with a single space character
2) compress every run of multiple whitespace characters to a single space character
3) trim all whitespace from both ends
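
The three steps above can be sketched in a few lines. This is only an illustration in Python, not the TextDiffor's actual source; the names normalize and is_same_text are hypothetical and chosen just for this example.

```python
import re


def normalize(text: str) -> str:
    """Hypothetical helper that mirrors the three normalization steps."""
    # Step 1: replace each tab, carriage return, and newline with a space.
    text = re.sub(r"[\t\r\n]", " ", text)
    # Step 2: collapse every run of two or more whitespace characters
    # into a single space.
    text = re.sub(r"\s{2,}", " ", text)
    # Step 3: trim whitespace from both ends.
    return text.strip()


def is_same_text(left: str, right: str) -> bool:
    """Compare two chunks of text after normalizing both (assumed behavior)."""
    return normalize(left) == normalize(right)


# The SELECT statement from the two snippets: multiline vs. single line.
multiline = (
    "  SELECT street_number  \n"
    "   INTO number1  \n"
    "   FROM address  \n"
    "   WHERE name = 'INU';  \n"
)
single_line = "  SELECT street_number  INTO number1 FROM address WHERE name = 'INU';  \n"

print(is_same_text(multiline, single_line))
```

One consequence of such a simple approach is that whitespace inside string literals (for example 'Hello world') is collapsed as well, because the text is never parsed as SQL.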

9 comments:

  1. Have you had a look at google-diff-match-patch? It has a Java version too. Just FYI.

    Cheers!

  2. Thanks for pointing that out, Ashwin. I'll see if it makes sense to plug that in instead of rolling my own. One requirement that has been posed is the ability to ignore comments in stored procedures. My initial thinking is to simply implement this as a list of regexes to be ignored, but perhaps google-diff-match-patch can already handle it.

    thanks,

    Joe
