Fixed vs variable word forms in context

Some of the patterns listed in StringNet search results contain bold-faced words. Here’s why.

StringNet searches yield a list of patterns (hybrid n-grams). Patterns are represented as strings of type labels, and there are roughly two kinds of type labels: (1) words and (2) parts of speech (POSs). Here are two patterns that show up in a search for the target word ‘go’, one made up of words only and one that mixes words and POSs:

A pattern with words only: go to sleep

A pattern with words and POSs: go so far as [to vb]

Notice, though, that words are distinguished further: bold vs not bold. A bold word marks a slot where the form of that word is attested to vary. So in go to sleep, the bold go stands not just for the form go (which may not even be among the attested forms) but for other forms of the same lexeme, such as goes or went. The exact variation attested can be seen by clicking on the bold word.

The bold vs non-bold distinction comes into relief in another pattern that shows up for the search term ‘go’:

what the hell be going on

There are 60 tokens of this pattern in the BNC. The bold be indicates that various forms of the lexeme be occur in this slot across these 60 instances (and not necessarily the exact form ‘be’ itself). Be here is just the placeholder for this lexeme in its various word forms (is, ’s, was…). Again, the exact distribution of the attested forms of be shows up if you click on the be.

In contrast, going in that same pattern what the hell be going on is not in bold. The non-bold going indicates that going is the exact word form that appears in all 60 tokens of this pattern in the BNC.

(Incidentally, there’s a bit of grammar reflected in the bold vs non-bold distinction in be going in this case of what the hell be going on: The fixed -ing form of go is due to the preceding be (the auxiliary for progressive aspect), and the be shows variation because, as the leftmost in the verb string, it varies with tense and agreement demands.)
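To make the bold vs non-bold distinction concrete, here is a minimal Python sketch of how a pattern with fixed and variable word slots could be represented and matched. This is not how StringNet is implemented: the names Slot, matches and form_distribution, the toy LEMMAS table, and the sample sentences are all made up for illustration and are not BNC data.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical toy lemma table; a real system would use a proper lemmatizer.
LEMMAS = {
    "be": "be", "is": "be", "'s": "be", "was": "be", "are": "be",
    "go": "go", "goes": "go", "went": "go", "going": "go",
}


@dataclass(frozen=True)
class Slot:
    label: str       # the word shown in the pattern
    variable: bool   # True = bold (any form of the lexeme), False = exact form


def matches(pattern, tokens):
    """Return True if the token sequence is an instance of the pattern."""
    if len(pattern) != len(tokens):
        return False
    for slot, token in zip(pattern, tokens):
        token = token.lower()
        if slot.variable:
            # Bold slot: any word form whose lexeme is the label.
            if LEMMAS.get(token) != slot.label:
                return False
        elif token != slot.label:
            # Non-bold slot: the exact word form is required.
            return False
    return True


def form_distribution(pattern, index, instances):
    """Count the word forms attested at one slot across matching instances."""
    return Counter(
        tokens[index].lower() for tokens in instances if matches(pattern, tokens)
    )


# "what the hell be going on": only be is bold (variable).
PATTERN = [
    Slot("what", False), Slot("the", False), Slot("hell", False),
    Slot("be", True), Slot("going", False), Slot("on", False),
]

# Invented sample sentences, already split into tokens.
SAMPLES = [
    "what the hell is going on".split(),
    "What the hell was going on".split(),
    "what the hell 's going on".split(),
    "what the hell is go on".split(),  # rejected: 'go' is not the fixed form 'going'
]

if __name__ == "__main__":
    print(form_distribution(PATTERN, 3, SAMPLES))
```

Calling form_distribution on the be slot (index 3) tallies the forms is, was and 's from the three matching samples, which is roughly the kind of breakdown you see when you click a bold word. The fourth sample is skipped because its fifth token is not the fixed form going.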



