TY - JOUR

T1 - Graph connectivity, partial words, and a theorem of Fine and Wilf

AU - Blanchet-Sadri, F.

AU - Bal, Deepak

AU - Sisodia, Gautam

PY - 2008/5

Y1 - 2008/5

N2 - The problem of computing periods in words, or finite sequences of symbols from a finite alphabet, has important applications in several areas including data compression, string searching, and pattern matching algorithms. The notion of a period of a word is central in combinatorics on words, and there are many fundamental results on periods of words. Among them is the well-known and basic periodicity result of Fine and Wilf, which intuitively determines how far two periodic events have to match in order to guarantee a common period. More precisely, any word of length at least p + q - gcd(p, q) having periods p and q also has period gcd(p, q), the greatest common divisor of p and q. Moreover, the bound p + q - gcd(p, q) is optimal, since counterexamples can be provided for words of smaller length. Partial words, or finite sequences that may contain a number of "do not know" symbols or holes, appear in natural ways in several areas of current interest such as molecular biology, data communication, and DNA computing. Any long enough partial word with h holes and having periods p and q also has period gcd(p, q). In this paper, we give closed formulas for the optimal bounds L(h, p, q) in the case where p = 2 and also in the case where q is large. In addition, we give upper bounds when q is small and h = 3, 4, 5, 6 or 7. No closed formulas for L(h, p, q) were known except for the cases where h = 0, 1 or 2. Our proofs are based on connectivity in graphs associated with partial words. A World Wide Web server interface has been established at www.uncg.edu/mat/research/finewilf3 for automated use of a program that, given a number of holes h and two periods p and q, computes the optimal bound L(h, p, q) and an optimal word for that bound (a partial word u with h holes of length L(h, p, q) - 1 is optimal if p and q are periods of u but gcd(p, q) is not a period of u).

AB - The problem of computing periods in words, or finite sequences of symbols from a finite alphabet, has important applications in several areas including data compression, string searching, and pattern matching algorithms. The notion of a period of a word is central in combinatorics on words, and there are many fundamental results on periods of words. Among them is the well-known and basic periodicity result of Fine and Wilf, which intuitively determines how far two periodic events have to match in order to guarantee a common period. More precisely, any word of length at least p + q - gcd(p, q) having periods p and q also has period gcd(p, q), the greatest common divisor of p and q. Moreover, the bound p + q - gcd(p, q) is optimal, since counterexamples can be provided for words of smaller length. Partial words, or finite sequences that may contain a number of "do not know" symbols or holes, appear in natural ways in several areas of current interest such as molecular biology, data communication, and DNA computing. Any long enough partial word with h holes and having periods p and q also has period gcd(p, q). In this paper, we give closed formulas for the optimal bounds L(h, p, q) in the case where p = 2 and also in the case where q is large. In addition, we give upper bounds when q is small and h = 3, 4, 5, 6 or 7. No closed formulas for L(h, p, q) were known except for the cases where h = 0, 1 or 2. Our proofs are based on connectivity in graphs associated with partial words. A World Wide Web server interface has been established at www.uncg.edu/mat/research/finewilf3 for automated use of a program that, given a number of holes h and two periods p and q, computes the optimal bound L(h, p, q) and an optimal word for that bound (a partial word u with h holes of length L(h, p, q) - 1 is optimal if p and q are periods of u but gcd(p, q) is not a period of u).

KW - Fine and Wilf's theorem

KW - Graph connectivity

KW - Partial words

KW - Periods

KW - Words

UR - http://www.scopus.com/inward/record.url?scp=41349098784&partnerID=8YFLogxK

U2 - 10.1016/j.ic.2007.11.007

DO - 10.1016/j.ic.2007.11.007

M3 - Article

AN - SCOPUS:41349098784

SN - 0890-5401

VL - 206

SP - 676

EP - 693

JO - Information and Computation

JF - Information and Computation

IS - 5

ER -