# «Abstract. Evolution of Knowledge Bases (KBs) expressed in Description Logics (DLs) proved its importance. Recent studies of the topic mostly focussed ...»

Capturing Global Semantics. ΠG(φ) checks whether φ of N T-contradicts A: ΠG(φ) is true iff ¬φ ∈ fcl_T(A) \ AlignAlg((T, A), N). Intuitively, SymAlg for global semantics works as follows: if N and A contradict each other on φ = B(c), then a change in the interpretation of B is inevitable. Since the semantics traces changes on symbols only, and B has already changed, one can drop from A all the assertions of the form B(d).
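The symbol-dropping step can be illustrated with a minimal sketch. It assumes an empty TBox, so that fcl_T(A) coincides with A and T-contradiction reduces to finding a literal and its complement. The names `Literal`, `contradicted_symbols`, and `drop_contradicted` are hypothetical and not part of SymAlg as defined in the paper.

```python
from typing import NamedTuple


class Literal(NamedTuple):
    """A possibly negated concept assertion such as B(c) or ¬B(c)."""
    concept: str
    constant: str
    positive: bool = True

    def negated(self) -> "Literal":
        return Literal(self.concept, self.constant, not self.positive)


def contradicted_symbols(abox, new_info):
    """Symbols B for which some phi = B(c) in N has its negation in A.

    Assumption: the TBox is empty, so the closure of A is A itself.
    """
    return {phi.concept for phi in new_info if phi.negated() in abox}


def drop_contradicted(abox, new_info):
    """Drop from A every assertion over a symbol whose interpretation must change."""
    changed = contradicted_symbols(abox, new_info)
    return {lit for lit in abox if lit.concept not in changed}


abox = {Literal("B", "c", False), Literal("B", "d"), Literal("C", "c")}
new_info = {Literal("B", "c")}
remaining = drop_contradicted(abox, new_info)
```

In this toy run only the assertion C(c) survives, because both assertions over B are dropped once B(c) in N contradicts ¬B(c) in A.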

Clearly, SymAlg(K, N, ΠG ) can be computed in time polynomial in |K ∪ N |. The following theorem shows correctness of this algorithm.

5 L^a_⊆ Evolution of DL-LiteR KBs

In the previous section we showed that atom-based MBAs behave well for DL-Lite^pr_R evolution settings, while symbol-based ones do not. This suggests investigating atom-based MBAs for the entire DL-LiteR. Moreover, one of the atom-based semantics, L^a_⊆, which is essentially the same as the so-called Winslett's semantics [15] (WS), was widely studied in the literature [6,8]. Liu, Lutz, Milicic, and Wolter studied WS for expressive DLs and for KBs with empty TBoxes [6]. Most of the DLs they considered are not closed under WS. Poggi, Lembo, De Giacomo, Lenzerini, and Rosati applied WS to the same setting as we have in this work: to what they called instance level (ABox) update for DL-Lite [8]. They proposed an algorithm to compute the result of updates, which has technical issues: it is neither sound nor complete [10]. They further use this algorithm to compute approximations of ABox updates in sublogics of DL-Lite, and these approximations inherit the same technical issues. In fact, an ABox update algorithm cannot exist, since Calvanese, Kharlamov, Nutt, and Zheleznyakov showed that DL-Lite is not closed under L^a_⊆ [11]. We now investigate L^a_⊆ evolution for DL-LiteR and first explain why DL-LiteR is not closed under L^a_⊆.

Evgeny Kharlamov and Dmitriy Zheleznyakov

5.1 Understanding Inexpressibility of Evolution in DL-LiteR

We now give an intuition why canonical models may be missing in K ⋄ N under L^a_⊆.

**Observe that in Example 10, the role R is affected by the old TBox T1 as follows:**

(i) T1 places (i.e., enforces the existence of) R-atoms in the evolution result, and on one of the coordinates of these R-atoms there are constants from specific sets, e.g., A ⊑ ∃R of T1 enforces R-atoms with constants from A on the first coordinate, and

(ii) T1 forbids R-atoms in K1 ⋄ N1 with specific constants on the other coordinate, e.g., ∃R⁻ ⊑ ¬C forbids R-atoms with C-constants on the second coordinate.

Due to this dual affection (both positive and negative) of the role R in T1, we were able to provide ABoxes A1 and N1 that together triggered the case analysis of modifications on the model I, that is, A1 and N1 were triggers for R. The existence of such an affected R and of the triggers A1 and N1 made K1 ⋄ N1 inexpressible in DL-LiteR. Therefore, we now formally define dually-affected roles, and then show how to detect them in a TBox T and how to determine whether they are triggered by A and N.

Capturing Instance Level Ontology Evolution for DL-Lite 11

Definition 11. Let T be a DL-LiteR TBox. Then a role R is dually-affected in T if for some concepts A and B it holds that T |= A ⊑ ∃R and T |= ∃R⁻ ⊑ ¬B. A dually-affected role R is triggered by N if there is a concept C such that T |= ∃R⁻ ⊑ ¬C and N |=_T C(b) for some constant b.
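The two conditions of Definition 11 can be checked mechanically once the relevant entailments are available. The following sketch assumes they have been precomputed by some reasoner and are passed in as plain sets of pairs; the parameter names `exists_super`, `range_disjoint`, and `n_entails` are hypothetical encodings chosen for illustration.

```python
def dually_affected(exists_super, range_disjoint):
    """Roles R with T |= A ⊑ ∃R for some A and T |= ∃R⁻ ⊑ ¬B for some B.

    exists_super:   set of (A, R) pairs, one per entailed inclusion A ⊑ ∃R
    range_disjoint: set of (R, B) pairs, one per entailed inclusion ∃R⁻ ⊑ ¬B
    """
    enforced = {role for (_concept, role) in exists_super}
    forbidden = {role for (role, _concept) in range_disjoint}
    return enforced & forbidden


def triggered_roles(exists_super, range_disjoint, n_entails):
    """Dually-affected roles R such that N |=_T C(b) for some C with ∃R⁻ ⊑ ¬C.

    n_entails: set of (C, b) pairs, one per assertion C(b) entailed by N w.r.t. T
    """
    entailed_concepts = {concept for (concept, _constant) in n_entails}
    return {
        r
        for r in dually_affected(exists_super, range_disjoint)
        if any(role == r and c in entailed_concepts for (role, c) in range_disjoint)
    }


# Toy input: A ⊑ ∃R and ∃R⁻ ⊑ ¬C in the TBox, C(b) entailed by N.
exists_super = {("A", "R"), ("B", "P")}
range_disjoint = {("R", "C")}
n_entails = {("C", "b")}
```

Here R is dually-affected and triggered, while P is not dually-affected because no disjointness on its second coordinate is entailed.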

As we saw in Example 10, even one dually-affected role in a TBox can cause inexpressibility of evolution. Moreover, if there is a dually-affected role, we can always find A and N that trigger it.

**We generalize this observation as follows:**

5.2 Prototypes

**A closer look at the sets of models K ⋄ N for DL-LiteR KBs K gives a surprising result:**

Theorem 13. The set of models K ⋄ N under L^a_⊆ can be divided (but in general not partitioned) into worst-case exponentially many (in |K ∪ N|) subsets S0, ..., Sn, where each Si has a canonical model Ji, which is a minimal element in K ⋄ N wrt homomorphisms.

We call these Ji prototypes. Thus, capturing K ⋄ N in some logic boils down to (i) capturing each Si with some theory KSi and (ii) taking the disjunction across all KSi. This gives the desired theory K = KS0 ∨ · · · ∨ KSn that captures K ⋄ N.

As we will see, some of the KSi are not DL-Lite theories (while they are SHOIQ theories, see Section 5.4 for details). We construct each KSi in two steps. First, we construct a DL-LiteR KB K(Ji), which is a sound approximation of Si, i.e., Si ⊆ Mod(K(Ji)).

Second, based on K and N, we construct a SHOIQ formula Ψ, which cancels out all the models in Mod(K(Ji)) \ Si, i.e., KSi = Ψ ∧ K(Ji). Finally, KS0 ∨ · · · ∨ KSn = (Ψ ∧ K(J0)) ∨ · · · ∨ (Ψ ∧ K(Jn)) = Ψ ∧ (K(J0) ∨ · · · ∨ K(Jn)).
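The last equality is ordinary distributivity of conjunction over disjunction. On the level of model sets it reads as intersection distributing over union, which the toy sketch below checks. The integers merely stand in for models, and the concrete sets are invented for illustration.

```python
def union_of_cuts(psi_models, approximations):
    """Models of (Ψ ∧ K(J0)) ∨ ... ∨ (Ψ ∧ K(Jn)): cut each approximation, then unite."""
    return set().union(*(psi_models & approx for approx in approximations))


def cut_of_union(psi_models, approximations):
    """Models of Ψ ∧ (K(J0) ∨ ... ∨ K(Jn)): unite the approximations, then cut once."""
    return psi_models & set().union(*approximations)


# Toy data: each set plays the role of Mod(K(Ji)), psi_models that of Mod(Ψ).
psi_models = {1, 2, 3, 4}
approximations = [{1, 2, 9}, {3, 8}, {4, 7}]

lhs = union_of_cuts(psi_models, approximations)
rhs = cut_of_union(psi_models, approximations)
```

Both sides yield the same set, so Ψ needs to be stated only once rather than once per prototype.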

To get a better intuition for our approach, consider Figure 3, where the result of evolution K ⋄ N is depicted as the figure with solid-line borders (each point within the figure is a model of K ⋄ N). For the sake of example, let K ⋄ N under L^a_⊆ be divided into four subsets S0, ..., S3. To emphasize this fact, K ⋄ N looks similar to a hand with four fingers, where each finger represents an Si. Consider the left part of Figure 3, where the canonical model Ji of each Si is depicted as a star. Using DL-LiteR, we can provide KBs K(Ji) that are sound approximations of the corresponding Si. We depict the models Mod(K(Ji)) as ovals with dashed-line borders. In the right part of Figure 3 we depict in grey the models Mod(K(Ji)) \ Si that are cut off by Ψ.

We now deﬁne prototypes formally and proceed to procedures discussed above.

Definition 14. Let K and N be an evolution setting. A prototypal set for K ⋄ N under L^a_⊆ is a minimal subset 𝒥 = {J0, ..., Jn} of K ⋄ N satisfying the following property: for every J ∈ K ⋄ N there exists Ji ∈ 𝒥 homomorphically embeddable in J.
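For finite interpretations, the covering property in this definition can be tested by brute force. The sketch below makes several assumptions: an interpretation is a finite set of atoms encoded as tuples such as `("A", "x")` or `("R", "x", "d")`, constants are interpreted by themselves, and a homomorphism must fix every constant. The function names are hypothetical, and neither minimality of the set nor infinite models are handled.

```python
from itertools import product


def elements(interp):
    """All domain elements that occur in some atom of the interpretation."""
    return {term for atom in interp for term in atom[1:]}


def embeds(src, dst, constants):
    """True if some homomorphism maps src into dst while fixing every constant."""
    free = sorted(elements(src) - constants)   # anonymous elements may move
    targets = sorted(elements(dst))
    for image in product(targets, repeat=len(free)):
        h = dict(zip(free, image))
        mapped = {(atom[0], *(h.get(t, t) for t in atom[1:])) for atom in src}
        if mapped <= dst:
            return True
    return False


def covers(prototypes, models, constants):
    """Every model must have some prototype homomorphically embeddable in it."""
    return all(any(embeds(p, m, constants) for p in prototypes) for m in models)


constants = {"a", "b", "d"}
proto = {("A", "x"), ("R", "x", "d")}                # x is an anonymous element
model = {("A", "a"), ("R", "a", "d"), ("C", "b")}
```

Mapping x to a embeds `proto` into `model`. The converse embedding fails, since all elements of `model` are constants and the atom C(b) has no counterpart in `proto`.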

We call every element Ji of a prototypal set a prototype for K ⋄ N. Note that prototypes generalize canonical models in the sense that every set of models with a canonical one, say Mod(K) for a DL-LiteR KB K, has a prototype, which is exactly this canonical model.

5.3 Computing L^a_⊆ Evolution for DL-LiteR

For ease of exposition of our procedure that computes the evolution K ⋄ N under the L^a_⊆ semantics, we restrict DL-LiteR by assuming that TBoxes T satisfy the following: for any two roles R and R′, T ⊭ ∃R ⊑ ∃R′ and T ⊭ ∃R ⊑ ¬∃R′. That is, we forbid direct interaction (subsumption and disjointness) between role projections, and we say that such a T is without direct role interactions. Some interaction between R and R′ is still possible, e.g., role projections may contain the same concept. This restriction allows us to analyze evolution that affects roles independently for every role. We will further comment on how the following techniques can be extended to the case when roles interact in an arbitrary way.

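Components for Computation. We now introduce several notions and notations that we further use in the description of our procedure. The notion of alignment was introduced in Section 4.1. An auxiliary set of atoms AA (Auxiliary Atoms) that, due to evolution, should be deleted from the original KB and have some extra condition on the first coordinate is:

$$\mathrm{AA}(\mathcal{T}, \mathcal{A}, \mathcal{N}) = \{R(a, b) \in \mathrm{fcl}_{\mathcal{T}}(\mathcal{A}) \mid \mathcal{T} \models A \sqsubseteq \exists R,\ \mathcal{A} \models_{\mathcal{T}} A(a),\ \mathcal{N} \models_{\mathcal{T}} \neg\exists R^{-}(b)\}.$$

If Ri is a dually-affected role of T triggered by A and N, then the set of forbidden atoms (of the original ABox) FA[T, A, N](Ri) for Ri is:

$$\mathrm{FA}[\mathcal{T}, \mathcal{A}, \mathcal{N}](R_i) = \{D(c) \in \mathrm{fcl}_{\mathcal{T}}(\mathcal{A}) \mid \exists R_i^{-}(c) \wedge D(c) \models_{\mathcal{T}} \bot,\ \mathcal{N} \not\models_{\mathcal{T}} D(c),\ \text{and}\ \mathcal{N} \not\models_{\mathcal{T}} \neg D(c)\}.$$

Consequently, the set of forbidden atoms for the entire KB (T, A) and N is

$$\mathrm{FA}(\mathcal{T}, \mathcal{A}, \mathcal{N}) = \bigcup_{R_i \in \mathrm{TR}} \mathrm{FA}[\mathcal{T}, \mathcal{A}, \mathcal{N}](R_i).$$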
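To make the set of auxiliary atoms concrete, here is a small sketch that filters a precomputed closure. It assumes the entailments have already been computed elsewhere: `closure` encodes fcl_T(A) with concept atoms as `(A, a)` and role atoms as `(R, a, b)`, `exists_super` lists the entailed inclusions A ⊑ ∃R as `(A, R)` pairs, and `n_neg_range` lists the pairs `(R, b)` with N |=_T ¬∃R⁻(b). All of these names and encodings are hypothetical.

```python
def auxiliary_atoms(closure, exists_super, n_neg_range):
    """Role atoms R(a, b) of the closure that satisfy the three AA conditions."""
    aa = set()
    for atom in closure:
        if len(atom) != 3:            # skip concept atoms (A, a)
            continue
        role, a, b = atom
        # T |= A ⊑ ∃R and A |=_T A(a) for some concept A
        enforced = any(
            (concept, a) in closure
            for (concept, r) in exists_super
            if r == role
        )
        # N |=_T ¬∃R⁻(b)
        if enforced and (role, b) in n_neg_range:
            aa.add(atom)
    return aa


# Toy input, not taken from the paper's running example.
closure = {("A", "a"), ("R", "a", "b"), ("R", "c", "b")}
exists_super = {("A", "R")}
n_neg_range = {("R", "b")}
aa = auxiliary_atoms(closure, exists_super, n_neg_range)
```

Only R(a, b) qualifies in this toy input: R(c, b) is skipped because no concept A with A ⊑ ∃R holds for c.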

Constructing Zero-Prototype. The procedure BZP(K, N) (Build Zero Prototype) in Figure 4 constructs the main prototype J0 for K and N, which we call the zero-prototype.

Based on J0 we will construct all the other prototypes. To build J0 one should align the canonical model I^can of K with N, and then delete from the resulting set of atoms all the auxiliary atoms R(a, b) of AA(K, N). If I^can contains no atoms R(a, β) ∈ AA(K, N) for some β, then we further delete the atoms root^at_T(∃R(a)) from J0, since otherwise we would get a contradiction with the TBox. Note that J0 can be infinite.

Constructing Other Prototypes. The procedure BP(K, N, J0) (Build Prototypes), which constructs the prototypal set, takes J0 and, based on it, builds the other prototypes by (i) dropping FA-atoms from J0 and then (ii) adding the atoms necessary to obtain a model of K ⋄ N.

This procedure can be found in Figure 5.

**We conclude the discussion on the procedures with a theorem:**

Continuing with Example 10, one can check that the prototypal set for K1 and N1 is {J0, J1, J2, J3}, where J0, J1, and J2 are as in the example and A^J3 = {x, y}, C^J3 = {b}, and R^J3 = {(x, d), (y, e)}.

We proceed to correctness of BP in capturing evolution in DL-LiteR, where we use the following set FC[T, A, N ](Ri ) = {c | D(c) ∈ FA[T, A, N ](Ri )}, that collects all the constants that participate in the forbidden atoms.
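The relation between forbidden atoms and the constants collected in FC can be sketched as follows. The sketch assumes the reading of the definition of FA in which N entails neither D(c) nor ¬D(c), and it approximates the condition ∃Ri⁻(c) ∧ D(c) |=_T ⊥ by a precomputed set `range_disjoint` of pairs `(R, D)` with T |= ∃R⁻ ⊑ ¬D. The inputs `n_pos` and `n_neg` (concept atoms entailed, respectively refuted, by N) are likewise hypothetical encodings.

```python
def forbidden_atoms(closure, role, range_disjoint, n_pos, n_neg):
    """Concept atoms D(c) of the closure that clash with ∃role⁻(c) and are undecided by N."""
    concept_atoms = (atom for atom in closure if len(atom) == 2)
    return {
        (d, c)
        for (d, c) in concept_atoms
        if (role, d) in range_disjoint     # T |= ∃role⁻ ⊑ ¬D (assumed precomputed)
        and (d, c) not in n_pos            # N does not entail D(c)
        and (d, c) not in n_neg            # N does not entail ¬D(c)
    }


def forbidden_constants(fa):
    """FC: all constants that participate in the forbidden atoms."""
    return {c for (_d, c) in fa}


# Toy input, invented for illustration.
closure = {("C", "d"), ("C", "e"), ("C", "b"), ("R", "a", "b")}
range_disjoint = {("R", "C")}
fa = forbidden_atoms(closure, "R", range_disjoint, n_pos={("C", "b")}, n_neg=set())
fc = forbidden_constants(fa)
```

In this toy input C(b) is decided by N and therefore not forbidden, so FC contains only d and e.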

Theorem 16. Let K = (T, A), N be an evolution setting, T without direct role interactions, and BP(K, N, BZP(K, N)) = {J0, ..., Jn} a prototypal set for K ⋄ N. Then K ⋄ N under L^a_⊆ is expressible in SHOIQ and moreover

What is missing in the theorem above is how to compute the ABoxes Ai. One can do it using a procedure similar to the one for constructing the Ji, with the difference that one has to take the original ABox A instead of I^can as the input. Note that A may include negative atoms, like ¬B(c), which should be treated in the same way as positive ones.

**Continuing with Example 10, the ABoxes A0 and A1 are as follows:**

A0 = {C(d), C(e), C(b)}; A1 = {A(x), C(e), C(b), R(x, d)}.

A2 and A3 can be built in a similar way. Note that only A0 is in DL-LiteR, while writing A1, ..., A3 requires variables in ABoxes. Variables, also known as soft constants, are not allowed in DL-LiteR ABoxes, while they are present in DL-LiteRS ABoxes. Soft constants x are constants not constrained by the Unique Name Assumption: it is not necessary that x^I = x. Since DL-LiteRS is tractable and first-order rewritable [12], expressing A1 in DL-LiteRS instead of DL-LiteR does not affect tractability.

Note that the number of prototypes is exponential in the number of constants, and therefore the size of the SHOIQ theory described in Theorem 16 is also exponential in the number of constants.

Capturing L^a_⊆ Semantics for DL-LiteR KBs with Direct Role Interactions. In this general case the BP procedure does return prototypes, but not all of them. To capture L^a_⊆ for such KBs one should iterate BP over (already constructed) prototypes until no new prototypes can be constructed. Intuitively, the reason is that BP deletes forbidden atoms (atoms of FA) and adds new atoms of the form R(a, b) for some triggered dually-affected role R, which may in turn trigger another dually-affected role, say P, and such triggering may require further modifications, now for P. These further modifications require a new run of BP. For example, if we have ∃R⁻ ⊑ ¬∃P⁻ in the TBox and we set R(a, b) in a prototype, say Jk, this modification triggers the role P, and we should run BP recursively with the prototype Jk as if it were the zero-prototype. We shall not discuss the general procedures in more detail due to space limits.
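The iteration described above is a standard worklist computation. The sketch below shows only that outer loop: `build_prototypes` is a hypothetical callable standing in for one run of BP on a given prototype, prototypes are assumed to be hashable values, and termination is assumed, which requires that only finitely many prototypes are reachable.

```python
from collections import deque


def all_prototypes(zero_prototype, build_prototypes):
    """Rerun the BP step on every newly found prototype until nothing new appears.

    build_prototypes(j) is assumed to return the prototypes obtained by
    treating j as if it were the zero-prototype.
    """
    seen = {zero_prototype}
    queue = deque([zero_prototype])
    while queue:
        current = queue.popleft()
        for proto in build_prototypes(current):
            if proto not in seen:
                seen.add(proto)
                queue.append(proto)
    return seen


# Toy stand-in for BP: a fixed table saying which prototypes each run produces.
bp_table = {"J0": ["J1", "J2"], "J1": ["J2", "J3"], "J2": [], "J3": []}
found = all_prototypes("J0", bp_table.__getitem__)
```

In the toy table, J3 is only discovered by rerunning the step on J1, which mirrors the situation where one triggered role triggers another.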

Proof. Clearly, all ABox assertions of Ac are over concepts, roles, and constants of K; thus there are at most quadratically many (in |K ∪ N|) of them, and we can simply test whether F ∈ Ac for each such assertion F. Since K ⋄ N is representable in SHOIQ, this test can be reduced to the subsumption problem for SHOIQ (checking whether K |= C(a) is equivalent to checking whether K |= {a} ⊑ C). Subsumption for SHOIQ is NExpTime-complete and can be tested using the algorithms of [16].
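The enumerate-and-test argument of the proof can be sketched as a simple filter over candidate assertions. The entailment test is passed in as a callable `entails`, a hypothetical stand-in for a SHOIQ reasoner deciding a subsumption such as K |= {a} ⊑ C; no particular reasoner API is assumed, and only positive assertions are enumerated for brevity.

```python
from itertools import product


def candidate_assertions(concepts, roles, constants):
    """All positive assertions C(a) and R(a, b) over the given signature."""
    for concept, a in product(concepts, constants):
        yield (concept, a)
    for role, a, b in product(roles, constants, constants):
        yield (role, a, b)


def certain_abox(concepts, roles, constants, entails):
    """Keep exactly those candidate assertions that the oracle reports as entailed."""
    return {
        f for f in candidate_assertions(concepts, roles, constants) if entails(f)
    }


# Toy oracle: a fixed set of entailed assertions instead of a real reasoner.
entailed = {("C", "b"), ("R", "a", "b")}
abox_c = certain_abox({"A", "C"}, {"R"}, {"a", "b"}, entails=lambda f: f in entailed)
```

The cost of the computation is dominated by the oracle calls, one per candidate assertion.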

The proposition above gives an upper bound for computing Kc. We do not know the lower bound, but conjecture that the computation is possible in polynomial time. Note that the NExpTime lower bound for SHOIQ subsumption checking holds for arbitrary SHOIQ concepts, while Theorem 16 gives us K with concepts of a specific kind. Moreover, the authors of [16] argue that, despite the high complexity of subsumption checking, their algorithms should behave well in many typically encountered cases. Note also that for DL-Lite^pr_R KBs certain approximations in fact capture the evolution result, that is, Mod(Kc) = K ⋄ N.

**6 Conclusion**

We studied model-based approaches to ABox evolution (update and revision) over DL-LiteR and its fragment DL-Lite^pr_R, which both extend (the first-order fragment of) RDFS. DL-Lite^pr_R is closed under most of the MBAs, while DL-LiteR is not closed under any of them. We showed that if the TBox of K entails a pair of assertions of the form A ⊑ ∃R and ∃R⁻ ⊑ ¬C, then an interplay of N and A may lead to inexpressibility of K ⋄ N. For DL-Lite^pr_R we provided algorithms to compute evolution results for six model-based approaches and to approximate them for the remaining two. For DL-LiteR we capture evolution of KBs under a local model-based approach with SHOIQ using novel techniques based on what we called prototypes. We believe that prototypes are important since they can be used to study evolution for ontology languages other than DL-LiteR. Finally, we showed how to approximate evolution when it is not expressible in DL-LiteR using what we called certain approximations.

This is the first attempt to provide an understanding of inexpressibility of MBAs for DL-Lite evolution. Without this understanding it is unclear how to proceed with the study of evolution in more expressive DLs and what to expect from MBAs in such logics. We also believe that our techniques of capturing semantics based on prototypes give a better understanding of how MBAs behave.