Graph connectivity, partial words, and a theorem of Fine and Wilf

F. Blanchet-Sadri, Deepak Bal, Gautam Sisodia

Research output: Contribution to journalArticleResearchpeer-review

9 Citations (Scopus)

Abstract

The problem of computing periods in words, or finite sequences of symbols from a finite alphabet, has important applications in several areas including data compression, string searching and pattern matching algorithms. The notion of period of a word is central in combinatorics on words. There are many fundamental results on periods of words. Among them is the well known and basic periodicity result of Fine and Wilf which intuitively determines how far two periodic events have to match in order to guarantee a common period. More precisely, any word with length at least p + q - gcd (p, q) having periods p and q has also period the greatest common divisor of p and q, gcd (p, q). Moreover, the bound p + q - gcd (p, q) is optimal since counterexamples can be provided for words of smaller length. Partial words, or finite sequences that may contain a number of "do not know" symbols or holes, appear in natural ways in several areas of current interest such as molecular biology, data communication, DNA computing, etc. Any long enough partial word with h holes and having periods p, q has also period gcd (p, q). In this paper, we give closed formulas for the optimal bounds Ł (h, p, q) in the case where p = 2 and also in the case where q is large. In addition, we give upper bounds when q is small and h = 3, 4, 5, 6 or 7. No closed formulas for Ł (h, p, q) were known except for the cases where h = 0, 1 or 2. Our proofs are based on connectivity in graphs associated with partial words. A World Wide Web server interface has been established at www.uncg.edu/mat/research/finewilf3 for automated use of a program which given a number of holes h and two periods p and q, computes the optimal bound Ł (h, p, q) and an optimal word for that bound (a partial word u with h holes of length Ł (h, p, q) - 1 is optimal if p and q are periods of u but gcd (p, q) is not a period of u).

Original languageEnglish
Pages (from-to)676-693
Number of pages18
JournalInformation and Computation
Volume206
Issue number5
DOIs
StatePublished - 1 May 2008

Fingerprint

Partial Words
Graph Connectivity
Molecular biology
Pattern matching
Data compression
World Wide Web
Interfaces (computer)
DNA
Servers
Communication
Theorem
Optimal Bound
Combinatorics on Words
DNA Computing
Closed
Highest common factor
Molecular Biology
Data Communication
Web Server
Data Compression

Keywords

  • Fine and Wilf's theorem
  • Graph connectivity
  • Partial words
  • Periods
  • Words

Cite this

Blanchet-Sadri, F. ; Bal, Deepak ; Sisodia, Gautam. / Graph connectivity, partial words, and a theorem of Fine and Wilf. In: Information and Computation. 2008 ; Vol. 206, No. 5. pp. 676-693.
@article{24f54dc48d5f475da70df39b363608d4,
title = "Graph connectivity, partial words, and a theorem of Fine and Wilf",
abstract = "The problem of computing periods in words, or finite sequences of symbols from a finite alphabet, has important applications in several areas including data compression, string searching and pattern matching algorithms. The notion of period of a word is central in combinatorics on words. There are many fundamental results on periods of words. Among them is the well known and basic periodicity result of Fine and Wilf which intuitively determines how far two periodic events have to match in order to guarantee a common period. More precisely, any word with length at least p + q - gcd (p, q) having periods p and q has also period the greatest common divisor of p and q, gcd (p, q). Moreover, the bound p + q - gcd (p, q) is optimal since counterexamples can be provided for words of smaller length. Partial words, or finite sequences that may contain a number of {"}do not know{"} symbols or holes, appear in natural ways in several areas of current interest such as molecular biology, data communication, DNA computing, etc. Any long enough partial word with h holes and having periods p, q has also period gcd (p, q). In this paper, we give closed formulas for the optimal bounds Ł (h, p, q) in the case where p = 2 and also in the case where q is large. In addition, we give upper bounds when q is small and h = 3, 4, 5, 6 or 7. No closed formulas for Ł (h, p, q) were known except for the cases where h = 0, 1 or 2. Our proofs are based on connectivity in graphs associated with partial words. A World Wide Web server interface has been established at www.uncg.edu/mat/research/finewilf3 for automated use of a program which given a number of holes h and two periods p and q, computes the optimal bound Ł (h, p, q) and an optimal word for that bound (a partial word u with h holes of length Ł (h, p, q) - 1 is optimal if p and q are periods of u but gcd (p, q) is not a period of u).",
keywords = "Fine and Wilf's theorem, Graph connectivity, Partial words, Periods, Words",
author = "F. Blanchet-Sadri and Deepak Bal and Gautam Sisodia",
year = "2008",
month = "5",
day = "1",
doi = "10.1016/j.ic.2007.11.007",
language = "English",
volume = "206",
pages = "676--693",
journal = "Information and Computation",
issn = "0890-5401",
publisher = "Elsevier Inc.",
number = "5",

}

Graph connectivity, partial words, and a theorem of Fine and Wilf. / Blanchet-Sadri, F.; Bal, Deepak; Sisodia, Gautam.

In: Information and Computation, Vol. 206, No. 5, 01.05.2008, p. 676-693.

Research output: Contribution to journalArticleResearchpeer-review

TY - JOUR

T1 - Graph connectivity, partial words, and a theorem of Fine and Wilf

AU - Blanchet-Sadri, F.

AU - Bal, Deepak

AU - Sisodia, Gautam

PY - 2008/5/1

Y1 - 2008/5/1

N2 - The problem of computing periods in words, or finite sequences of symbols from a finite alphabet, has important applications in several areas including data compression, string searching and pattern matching algorithms. The notion of period of a word is central in combinatorics on words. There are many fundamental results on periods of words. Among them is the well known and basic periodicity result of Fine and Wilf which intuitively determines how far two periodic events have to match in order to guarantee a common period. More precisely, any word with length at least p + q - gcd (p, q) having periods p and q has also period the greatest common divisor of p and q, gcd (p, q). Moreover, the bound p + q - gcd (p, q) is optimal since counterexamples can be provided for words of smaller length. Partial words, or finite sequences that may contain a number of "do not know" symbols or holes, appear in natural ways in several areas of current interest such as molecular biology, data communication, DNA computing, etc. Any long enough partial word with h holes and having periods p, q has also period gcd (p, q). In this paper, we give closed formulas for the optimal bounds Ł (h, p, q) in the case where p = 2 and also in the case where q is large. In addition, we give upper bounds when q is small and h = 3, 4, 5, 6 or 7. No closed formulas for Ł (h, p, q) were known except for the cases where h = 0, 1 or 2. Our proofs are based on connectivity in graphs associated with partial words. A World Wide Web server interface has been established at www.uncg.edu/mat/research/finewilf3 for automated use of a program which given a number of holes h and two periods p and q, computes the optimal bound Ł (h, p, q) and an optimal word for that bound (a partial word u with h holes of length Ł (h, p, q) - 1 is optimal if p and q are periods of u but gcd (p, q) is not a period of u).

AB - The problem of computing periods in words, or finite sequences of symbols from a finite alphabet, has important applications in several areas including data compression, string searching and pattern matching algorithms. The notion of period of a word is central in combinatorics on words. There are many fundamental results on periods of words. Among them is the well known and basic periodicity result of Fine and Wilf which intuitively determines how far two periodic events have to match in order to guarantee a common period. More precisely, any word with length at least p + q - gcd (p, q) having periods p and q has also period the greatest common divisor of p and q, gcd (p, q). Moreover, the bound p + q - gcd (p, q) is optimal since counterexamples can be provided for words of smaller length. Partial words, or finite sequences that may contain a number of "do not know" symbols or holes, appear in natural ways in several areas of current interest such as molecular biology, data communication, DNA computing, etc. Any long enough partial word with h holes and having periods p, q has also period gcd (p, q). In this paper, we give closed formulas for the optimal bounds Ł (h, p, q) in the case where p = 2 and also in the case where q is large. In addition, we give upper bounds when q is small and h = 3, 4, 5, 6 or 7. No closed formulas for Ł (h, p, q) were known except for the cases where h = 0, 1 or 2. Our proofs are based on connectivity in graphs associated with partial words. A World Wide Web server interface has been established at www.uncg.edu/mat/research/finewilf3 for automated use of a program which given a number of holes h and two periods p and q, computes the optimal bound Ł (h, p, q) and an optimal word for that bound (a partial word u with h holes of length Ł (h, p, q) - 1 is optimal if p and q are periods of u but gcd (p, q) is not a period of u).

KW - Fine and Wilf's theorem

KW - Graph connectivity

KW - Partial words

KW - Periods

KW - Words

UR - http://www.scopus.com/inward/record.url?scp=41349098784&partnerID=8YFLogxK

U2 - 10.1016/j.ic.2007.11.007

DO - 10.1016/j.ic.2007.11.007

M3 - Article

VL - 206

SP - 676

EP - 693

JO - Information and Computation

JF - Information and Computation

SN - 0890-5401

IS - 5

ER -