Wednesday, November 10, 2010

How to regression test a database application (part 1)

This is part 1 of 2. In part 1 I characterize what is meant by "database application". I also quickly review some different types of regression testing, and then describe a hypothetical retail inventory example system which will serve as the subject of detailed discussion. In part 2, I explore the "final table" type of regression test in detail, and apply it to our retail inventory system. I conclude with an argument for why "final table" testing should always be included on your project.

Database Applications

Here, a database application means a software system that uses an RDBMS as its primary persistence mechanism. In other words, a database application is built on top of, and around, a relational database. The application logic might be implemented within the database itself, in the form of stored procedures and other database objects, or it might be implemented outside the database in a general purpose programming language such as Java or a .NET language. It’s quite common for Big Enterprise systems to be implemented using a layered mixture of different programming technologies. Another characteristic of database applications is that the application data, in the database, is the primary window through which the application teams (developers and admins) monitor the health of the system and evaluate the correctness of the results.
Regression testing is a well established technique to help ensure the quality of software systems. The intent of regression testing is to assure that a change, such as a bugfix, did not introduce new bugs. At its most basic level a regression test reruns a function that used to work and verifies that it still produces the same results. The basic elements of a regression test are:
  • a function to test. The function must have identifiable inputs and outputs. Function here is meant in the broadest possible sense. It could represent a database stored procedure, a Java method, an RDBMS View, or a high level shell script job that choreographs other jobs. Any software artifact that is callable or invocable.
  • predetermined, capturable (i.e. saveable) input data.
  • the predetermined, capturable, expected output data for the above input data.
  • an actual output, which results from applying the latest revision of the function to the input data above.
  • a means for comparing the expected output with the actual output to determine if they are the same or if they are different in a way that is expected.
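The elements listed above can be sketched in a few lines of Python. This is a minimal illustration, not part of AIR: the `RegressionCase` class, the `run_regression` helper, and the `units_to_order` function are all hypothetical names invented for this example, and the comparison is a plain equality check.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class RegressionCase:
    """One saved regression case: a captured input and its expected output."""
    name: str
    input_data: Any
    expected_output: Any


def run_regression(function_under_test: Callable[[Any], Any],
                   cases: List[RegressionCase]) -> List[str]:
    """Run every case and return the names of the cases whose output changed."""
    failures = []
    for case in cases:
        # Actual output: the latest revision applied to the saved input.
        actual_output = function_under_test(case.input_data)
        # Comparison: plain equality between expected and actual output.
        if actual_output != case.expected_output:
            failures.append(case.name)
    return failures


# Hypothetical function under test: units needed to refill one shelf.
def units_to_order(shelf: dict) -> int:
    return max(shelf["capacity"] - shelf["on_hand"], 0)


# Saved inputs paired with outputs captured from a known-good revision.
CASES = [
    RegressionCase("half_empty", {"capacity": 10, "on_hand": 4}, 6),
    RegressionCase("overstocked", {"capacity": 10, "on_hand": 12}, 0),
]

failed_cases = run_regression(units_to_order, CASES)
```

An empty `failed_cases` list means the current revision still reproduces every saved output. A real harness would likely load the cases from files or tables rather than hard-coding them.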
Regression testing can be applied at many different levels. Applied at the lowest, most primitive, level it is called a "Unit Test". Unit testing applies a laser focus on one low level functional unit. Unit tests are very popular amongst the general purpose programming language communities, such as Java and Python. Testing above the Unit level is generally referred to as functional testing. Again, functional testing can occur at different levels. It can test software modules that compose the lowest level operational units into higher level functions. Here, I call a high-level functional module that implements one complete business function a slice. At the top of the testing hierarchy is the system test. The system test exercises the fully integrated, end-to-end, system at the highest possible level. It touches all functions and modules. This usually corresponds to the same level at which the users experience the system.
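To make the difference between the unit level and the slice level concrete, here is a small sketch using Python's standard `unittest` module. The functions `average_daily_sales` and `forecast_week`, and the rule they implement, are hypothetical and chosen only to show one low-level unit being composed into a higher-level function.

```python
import unittest


# Hypothetical low-level unit: average of recent daily sales.
def average_daily_sales(daily_sales):
    return sum(daily_sales) / len(daily_sales) if daily_sales else 0.0


# Hypothetical slice: composes the unit above into one business function,
# a week-ahead forecast for every product.
def forecast_week(sales_by_sku):
    return {sku: round(average_daily_sales(sales) * 7)
            for sku, sales in sales_by_sku.items()}


class UnitLevelTest(unittest.TestCase):
    """Laser focus on one low-level function."""

    def test_average_daily_sales(self):
        self.assertEqual(average_daily_sales([2, 4, 6]), 4.0)

    def test_no_history(self):
        self.assertEqual(average_daily_sales([]), 0.0)


class SliceLevelTest(unittest.TestCase):
    """Functional test: saved input in, saved expected output out."""

    def test_forecast_week(self):
        saved_input = {"SKU-1": [2, 4, 6], "SKU-2": []}
        expected_output = {"SKU-1": 28, "SKU-2": 0}
        self.assertEqual(forecast_week(saved_input), expected_output)


def run_tests():
    """Run both test classes and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = unittest.TestSuite()
    suite.addTests(loader.loadTestsFromTestCase(UnitLevelTest))
    suite.addTests(loader.loadTestsFromTestCase(SliceLevelTest))
    return unittest.TextTestRunner(verbosity=0).run(suite)


result = run_tests()
```

The unit tests pin down one function in isolation, while the slice test only cares that the composed business function still maps the saved input to the saved output.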

A Concrete Database Application Example

To better illustrate the concepts and techniques in this article, I’ll invent a hypothetical database application called the "Automated Inventory Replenishment" system, or AIR. The AIR system is maintained by a midsized enterprise that is in the retail sales business. Every day, the business sells a portion of its inventory (stock) and must decide how to restock the store shelves. The AIR system is highly streamlined and simplified, in order to allow focus on just the salient points. The enterprise performs the restocking only once per day, after all of the stores have closed. As a result, AIR is run once per day, well after the close of business. AIR is run in batch mode: a single invocation of the system performs a complete, daily, restocking cycle.
Figure 1 shows the complete AIR system.
https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjJIjVAuvkdejj0Yqdek-Ds1GCNSqxqzFxBmCkPPdg8JyQfTbLF0YHNTL0Ufb_d7JJqj-3qigmeezC80BKahfpLOMCh0QfXeL3pVv9GYNfWwcmhY0S1zXlM-qGbRbs3EGLlHOGk9Tho4-Ej/s1600/inventory_system.png
Figure 1: System
There are many different functional modules, or slices, needed to complete the entire cycle. The inputs are: the current level of inventory for each product, pricing and availability for all products we might want to purchase, and a listing of all available unoccupied shelf space. AIR must determine which product orders to place in order to fill all of the unoccupied shelf space. So the outputs are simply a list of product orders (SKUs and quantities). The outputs are stored in a single table in the database, which is referred to as the "final table" because it is the end (terminal) of a chain of tables used during processing. One of the slices, the "Forecast Slice", is exploded in detail in Figure 2:
https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiT7PV9GY-5MWYimroYYD6ODW-L9uyfBU5VIr3yLM_pwDU8IROlw_7dhi2Z7uDFhBAehTud8-2BcTbZ-ZOG21kTO87SHizNLaOiYCQ_kVOiCvGDbcPygXl0P2uhkz4qTt4Hrm9yolripQJX/s1600/inventory_slice.png
Figure 2: Forecast slice
The forecast slice is responsible for guessing how much of each product will be needed (or wanted) for the next business week. These forecast results are one of many important inputs to the final ordering function. The final ordering determination has to include not just forecast information, but also price and availability. The forecast slice is implemented as a Java function (method), a DB stored procedure, and a collection of tables and views. Java was chosen for the detailed forecast business logic, because the calculations and flow control are very complicated and better suited for a general purpose programming language than for an embedded database procedural language (e.g. PL/SQL). The Java method calls the stored procedure getProductList(), which returns a complete list of all products that we might want to forecast for. The Java forecast method writes all of the forecasts (results) to a table.
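The shape of the forecast slice can be sketched with Python's built-in `sqlite3` module. Everything here is an assumption made for illustration: Python stands in for the Java method, the `get_product_list` function stands in for the getProductList() stored procedure (SQLite has no stored procedures), the table and column names are invented, and the "average day times seven" rule is a placeholder for the real forecast logic.

```python
import sqlite3


def get_product_list(conn):
    """Stand-in for the getProductList() stored procedure."""
    rows = conn.execute("SELECT sku FROM product ORDER BY sku").fetchall()
    return [row[0] for row in rows]


def run_forecast(conn):
    """Stand-in for the Java forecast method: writes one row per product."""
    conn.execute("DELETE FROM forecast")
    for sku in get_product_list(conn):
        (avg_units,) = conn.execute(
            "SELECT COALESCE(AVG(units), 0) FROM daily_sales WHERE sku = ?",
            (sku,),
        ).fetchone()
        # Placeholder rule, not the real business logic: average day times 7.
        conn.execute(
            "INSERT INTO forecast (sku, units_next_week) VALUES (?, ?)",
            (sku, round(avg_units * 7)),
        )
    conn.commit()


def build_test_database():
    """Create an in-memory database loaded with a small, saved input set."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE product (sku TEXT PRIMARY KEY);
        CREATE TABLE daily_sales (sku TEXT, sale_date TEXT, units INTEGER);
        CREATE TABLE forecast (sku TEXT PRIMARY KEY, units_next_week INTEGER);
        INSERT INTO product VALUES ('SKU-1'), ('SKU-2');
        INSERT INTO daily_sales VALUES
            ('SKU-1', '2010-11-08', 3),
            ('SKU-1', '2010-11-09', 5);
    """)
    return conn


# Expected contents of the forecast table for the saved input above.
EXPECTED_FORECAST = [("SKU-1", 28), ("SKU-2", 0)]

conn = build_test_database()
run_forecast(conn)
actual_forecast = conn.execute(
    "SELECT sku, units_next_week FROM forecast ORDER BY sku"
).fetchall()
```

A slice-level regression test then reduces to checking that `actual_forecast` equals `EXPECTED_FORECAST`. The inputs and outputs of the slice are both just table contents, which is what makes them easy to capture and compare.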
Suppose I’m the project manager for the AIR system, and it has been in production for several years. Let’s also suppose that new functional and operational requirements are fed to team AIR on a regular basis. My team is working on changing the system in order to provide new business functionality. But before I can release these changes into the production environment, I have to be confident that we are not breaking something that already works. Let’s suppose that the current level of operational integrity of AIR is fair; then, even minor bugs introduced as a result of change could very well cause noticeable degradation of operational integrity for AIR, and appreciable degradation of career prospects for staff. Clearly, some form of regression testing is called for. But it’s not at all clear which tests will give the highest level of confidence for the least amount of work. Say that some of the slices are covered by both unit and higher level functional regression tests. But, as in most real-world custom-built enterprise applications, most slices are not covered by rock-solid automated regression. For some slices, there are no regression tests at all. For others, there are some unit and functional tests that, history has shown, catch some, but not all, of the bugs that typically creep in as a result of system change.
As project manager, I now have to decide what mix of regression tests will provide the biggest bang-for-the-buck. In part 2 I dive into the details of the costs and benefits associated with the different types of regression tests.
