Campus : Amritapuri
School : School of Engineering
Department : Computer Science
Year : 2007
Abstract : Sequences are well defined mathematical objects which include strings, permutations, and vectors. In this thesis, a “sequence” refers to an object which is either a string over a finite alphabet or a permutation. The measure of similarity among sequences is of fundamental importance. Given an operation, distance is a metric that measures similarity. That is, the fewer the number of operations required to transform one sequence to another, the greater the similarity.Several types of operations and their associated distances have been studied on permutations and strings. This thesis studies: (a) prefix transpositions and prefix reversals, as well as reversals of length 2 and the associated distance calculations on strings, (b) the prefix transposition operation on permutations and (c) the computational complexity of corresponding problems. For the operations mentioned above, upper and lower bounds, complexity results, and approximation strategies are given.Specifically, it is shown for prefix transpositions on strings that one can transform a string into a compatible string in at most n−α moves, where α is the frequency of occurrence of the most frequent symbol and the problem of finding the exact prefix transposition distance is NP-complete. For prefix transpositions on permutations of length n, it is shown that 2 n/3 is a lower bound and n−log8( n) is an upper bound.A proof of NP-completeness is shown for prefix reversal distance on strings, cut-and-paste move distance on strings, and simultaneous prefix reversal and prefix transposition distance on strings. In addition, it is shown that one can compute the number of reversals of length 2 to transform a given string S into a compatible string T over a k symbol alphabet in O( n·k) time.