Paper title
Data Synchronization: A Complete Theoretical Solution for Filesystems
Authors
Abstract
Data reconciliation in general, and filesystem synchronization in particular, lacks rigorous theoretical foundation. This paper presents, for the first time, a complete analysis of synchronization for two replicas of a theoretical filesystem. Synchronization has two main stages: identifying the conflicts, and resolving them. All existing (both theoretical and practical) synchronizers are operation-based: they define, using some rationale or heuristics, how conflicts are to be resolved without considering the effect of the resolution on subsequent conflicts. Instead, our approach is declaration-based: we define what constitutes the resolution of all conflicts, and for each possible scenario we prove the existence of sequences of operations / commands which convert the replicas into a common synchronized state. These sequences consist of operations rolling back some local changes, followed by operations performed on the other replica. The set of rolled-back operations provides the user with clear and intuitive information on the proposed changes, so she can easily decide whether to accept them or ask for other alternatives. All possible synchronized states are described by specifying a set of conflicts, a partial order on the conflicts describing the order in which they need to be resolved, as well as the effect of each decision on subsequent conflicts. Using this classification, the outcomes of different conflict resolution policies can be investigated easily.
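The idea of rolling back some local changes and then replaying the other replica's operations can be illustrated with a toy sketch. This is not the paper's construction: it assumes a flat namespace in which operations on different paths commute, so it ignores the partial order on conflicts and the effect of one decision on later conflicts. The names `apply_ops`, `final_values`, `synchronize`, and the `prefer` policy callback are all hypothetical, introduced only for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple

State = Dict[str, str]            # path -> content (toy, flat filesystem)
Op = Tuple[str, Optional[str]]    # (path, new content); None means "delete"


def apply_ops(state: State, ops: List[Op]) -> State:
    """Return a copy of `state` with `ops` applied in order."""
    out = dict(state)
    for path, value in ops:
        if value is None:
            out.pop(path, None)
        else:
            out[path] = value
    return out


def final_values(ops: List[Op]) -> Dict[str, Optional[str]]:
    """Last value written to each path by a sequence of operations."""
    return {path: value for path, value in ops}


def synchronize(origin: State, ops_a: List[Op], ops_b: List[Op],
                prefer: Callable[[str], str]):
    """Toy two-replica sync: detect conflicts, roll back the losing side,
    then replay the other replica's surviving changes.

    `prefer(path)` is an assumed policy hook: it returns "a" to keep
    replica A's change on a conflicting path; anything else keeps B's.
    """
    fa, fb = final_values(ops_a), final_values(ops_b)
    # A conflict here is a path both replicas changed to different values.
    conflicts = sorted(p for p in fa.keys() & fb.keys() if fa[p] != fb[p])
    lost_b = [p for p in conflicts if prefer(p) == "a"]   # B rolls these back
    lost_a = [p for p in conflicts if prefer(p) != "a"]   # A rolls these back

    # Rolling back restores the value the path had in the common origin.
    rollback_a = [(p, origin.get(p)) for p in lost_a]
    rollback_b = [(p, origin.get(p)) for p in lost_b]
    # Forward operations: the other replica's changes that were not rolled back.
    forward_a = [(p, v) for p, v in fb.items() if p not in lost_b]
    forward_b = [(p, v) for p, v in fa.items() if p not in lost_a]

    new_a = apply_ops(apply_ops(apply_ops(origin, ops_a), rollback_a), forward_a)
    new_b = apply_ops(apply_ops(apply_ops(origin, ops_b), rollback_b), forward_b)
    return new_a, new_b, conflicts, rollback_a, rollback_b


origin = {"/doc": "v0", "/tmp": "x"}
ops_a = [("/doc", "A"), ("/new_a", "1")]   # replica A edits /doc, creates /new_a
ops_b = [("/doc", "B"), ("/tmp", None)]    # replica B edits /doc, deletes /tmp
new_a, new_b, conflicts, rollback_a, rollback_b = synchronize(
    origin, ops_a, ops_b, prefer=lambda path: "a")
```

In this toy run the only conflict is `/doc`. With a policy that always prefers replica A, replica B's edit to `/doc` is the single rolled-back operation, which is the kind of information the abstract says would be shown to the user before she accepts the proposal. Passing a different `prefer` function gives a simple way to compare resolution policies.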