Comparison of Tree Graphs

When developing an information system that also includes various processing of design and technological documentation, I faced the following problem.

We have a certain product structure. During the day, different parts of this product are changed and by the evening, it is already unclear what has been changed. Sometimes, products can consist of more than 10 000 elements. The elements are not unique, and the reality is that the structure can be often modified, although the product is almost ready. The failure to understand the scope of changes complicates planning.

The product structure can be represented as a tree graph in PL/SQL. Since I could not find a suitable way to compare two graphs, I decided to create my own method.

Task

During the day, various changes are received by the accounting system from the design system. Production planning is based on the data from the accounting system. Conditions allow you to accept all the changes for the day and recalculate the product specification at night. However, as I wrote above, it is unclear how the yesterday state of the product differs from the today one.

I would like to see what was removed from the tree and what was added to it, as well as which part or assembly replaced another one. For example, if an intermediate node was added to the tree branch, it would be wrong to assume that all the downstream elements were removed from the old places and added to the new ones. They remained where they were, but the insert of the mediation node took place.  In addition, the element can ‘travel’ up and down only within one branch of the tree due to the specifics of the manufacturing process.

Preparation

Product specification is recalculated by the PL/SQL-procedure within Oracle. That’s why I decided to create my own search of changes in PL/SQL.

The  MR_ORDER_TREE table with the product tree looks in the following way:

order_idOrder ID where the tree is stored
item_idElement ID in the tree. Along with order_id, it forms the primary key and is unique within the order
item_refElement ID that includes the selected element
kts_item_idID from the catalog of parts and nodes
item_qtyQuantity
is_item_buyIdentifies if the item is purchased
item_positionPosition number in the assembly

The item_ref, kts_item_id bindings are unique. In addition to the purchase, position, and quantity, there are other attributes of the particular element.

When recalculating the specification, data of the modified orders is completely deleted and recalculated. Therefore, it is necessary to keep the current tree state until the recalculation process starts. To do this, use the MR_ORDER_TREE_PREV table.

The comparison result will be stored in the MR_ORDER_TREE_COMP table, which will include two additional columns:

statA column to mark the element state:
-1 – additional root element
0 – element is removed
1 – element is added
2 – element properties are changed
3 – unknown state when something went wrong
4 – element is not changed
commA comment, which provides additional data about the state

Take the previous and current trees and add them to the MR_ORDER_TREE_COMP table. In addition, we will add a common root to the trees, increase item_id of the current tree by (a maximum value plus 1) item_id of the tree with the previous state. All the elements of the old tree will be considered as deleted, while the elements of the new tree – as inserted.

Specify the level of nesting for the elements of the summary tree so that you do not need to calculate it in the future.

Store the state of the mr_order_tree_comp table for the order to be recalculated. I used the collection, but I think it is possible to use a temporary table.

“Addition” of trees

Now, we need to check the tree for changes by levels and nodes. First, determine the maximum level:

Now, we need to perform a cyclic check of the levels of this tree. At each level, we will select the child elements of each node, and look for an element with an “opposite sign”. In other words, we are going to look for kts_item_id with the same item_ref, but the state 1 is for 0 and 0 is for 1. After that, one of them is deleted, and the incoming elements are re-linked to the remaining node.

When the element is processed, insert it into a list of processed elements. To do this, I used the following procedure:

The following function processed the element:

The cycle looks like:

For (rl.stat = 1), the logic is the same.

When you find the same element, everything is simple – you just need to compare their properties. To do this, use the diff_items function. The situation, when more than one element is found, is an exception that indicates that something is wrong with the structure tree.

Search for identical elements

What to do when there was a change of one node to another, the node was inserted into the middle of the cycle or the node from the middle was removed?

To determine the described situations, I decided to compare the structures of two nodes in order to identify the ratio of the number of identical kts_item_id to the total number of elements. If the value of this ratio is greater than a certain value, the nodes are interchangeable. If the node has several replacement options at the current iteration of the cycle, the option with the highest “similarity index” is taken.

Let’s add a code to the main CASE.

I could find one element in this node.

Without any hesitation, let’s link the contents of the current node to it provided that both elements are assemblies. The current element is not deleted. Why? If there are no identical elements in nodes at all, everything will be returned as unchanged.

If we could find several elements, we take the most suitable one. To define the similarity, use the like_degree procedure. The lperc variable contains the value of the index for comparison.

As a result, some elements will be relinked, and the tree will have the following structure:

If there is no need in using an additional root element that was added at the beginning, delete it.

You can take all the remaining elements with the status 0 and 1, and go up to the root. If you find the same element with the “opposite status”, then you need to compare them, remove the element with 0 from the tree, and change the status of the element with 1.

Now, go through the remaining elements with the status 0 and restore their previous item_ref. To do this, use the tree_before_calc collection, which stores the initial state of the mr_order_tree_comp tree.

The tree has the following structure:

I think there are more beautiful, fast and correct ways to compare trees. This is my method and hope it will be useful to someone.

Dmitriy Vlasov

Dmitriy Vlasov participates in the development of ERP-system for a large machine-building enterprise. He develops program modules and is engaged in the administration of a DB a little bit. He writes on Delphi and uses Oracle (10g, 11g) as DBMS. Dmitriy tries to expand his knowledge in the field of algorithms and mathematics.

Latest posts by Dmitriy Vlasov (see all)

Dmitriy Vlasov

Dmitriy Vlasov participates in the development of ERP-system for a large machine-building enterprise. He develops program modules and is engaged in the administration of a DB a little bit. He writes on Delphi and uses Oracle (10g, 11g) as DBMS. Dmitriy tries to expand his knowledge in the field of algorithms and mathematics.