A "trie" is a data structure that converts "sentences" of symbols into a tree structure. In a trie all sequences ("n-grams") up to a certain length that appear in a "sentence" have as their parent "most" of that -gram, that is, all but the rightmost element of the original -gram. One also generally stores information on the number of times that a given -gram appears as the child of other -grams. Tries are useful for many purposes, including searching and finding "collocations", that is, -grams that appear more frequently than would occur randomly given their constituent parts.
In this Demonstration you create "sentences" by selecting an elementary cellular automaton rule and then choosing some rows of its evolution. Each row represents a sentence (although unlike linguistic sentences, the rows are "periodic", in the sense that the right end is connected with the left end). You also select the maximum size of the -grams you wish to appear in the trie. The system responds with a visualization of the trie as a network in which the nodes are the -grams (labeled in purple with the number of times they are hit by an incoming edge) and the thickness of the edges likewise reflects the number of times one particular -gram follows another.


  • [Snapshot]
  • [Snapshot]
  • [Snapshot]


There is nothing particularly special about using a cellular automaton to generate the sentences or the use of a binary alphabet. The alphabet could be linguistic symbols or words and the stream of "text" could be words or true sentences.
Snapshot 1: where the sentences have lots of patterns that appear far more frequently than mere chance, the trie reflects this with fewer branches
Snapshot 2: a trie for rule 110
Snapshot 3: a trie for a cellular automaton with simple behavior
    • Share:

Embed Interactive Demonstration New!

Just copy and paste this snippet of JavaScript code into your website or blog to put the live Demonstration on your site. More details »

Files require Wolfram CDF Player or Mathematica.