Incomplete neutralisation

Phonology, Phonetics
Karthik Durvasula
2025-05-14

Background

In the previous blog post (Durvasula 2025), I discussed how near mergers pose no problem for categorical representations. The error in claiming that they do lies in trying to interpret phonological categories directly from the phonetic exponence (or manifestation). Therefore, phonetic proximity and variation in effect sizes across speakers in a study by themselves can’t inform us about the merger of the underlying phonological categories.

In today’s blog post, I will discuss the related issue of Incomplete Neutralisation, where phonetic proximity (non-neutralisation) has been argued to show that the phonological representations are not the same. Again, this is faulty logic, as it rides on a strawman auxiliary hypothesis about how phonological rules are implemented, a view that has been argued to be wrong right from the beginning of the generative enterprise (Chomsky 1965; Chomsky and Halle 1968; Lyons 1974; Hammarberg 1982; Postal 1968). For a more in-depth discussion of the error, see Du and Durvasula (2024).

OK, let’s start with everyone’s favourite positional neutralisation phonological process, prosodic word-final devoicing in German (Wagner 2002).

     UR Meaning Singular  Plural
 /ʁaːt/ council   [ʁaːt] [ʁæːtə]
 /ʁaːd/   wheel   [ʁaːt] [ʁæːdɐ]

The data above (or similar words) make an appearance in standard introductions to phonology, and the crucial pattern of devoicing can be described in terms of a categorical phonological process as follows:
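Stated as a rule, in the notation used for the simulation later in this post (this is a standard textbook statement; the particular feature specification is one common choice):

\([-sonorant] \rightarrow [-voice]\ /\ \underline{\hspace{0.5cm}}\ ]_{Prosodic\ Word}\)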

The crux of the issue is that researchers have observed subtle differences in the closure duration, the burst amplitude, the pre-consonantal vowel duration, … between the derived and underlying targets. More specifically, the derived [t]/d/ always seems to retain traces of [d], and is never completely like the underlying [t]/t/. So, if we plot a specific phonetic measure on a one-dimensional plot, then you would see something like the following.

Figure 1: sub-lexical incremental processing

This observation has led to a gazillion papers studying similar effects in a variety of languages and patterns. There has also been incredible interest in studying what the effect stems from (orthography, phonological representations, task effects, …). I think it is fair to say that this was one of the defining domains of laboratory phonology at its inception, and many have used it to infer that phonological representations may not be categorical. In Du and Durvasula (2022) and Du and Durvasula (2024), we show that the effect stems from performance factors (specifically, how abstract phonological knowledge is implemented in speech production), and therefore tells us little about phonological representations themselves. I refer the reader to those publications for more extensive discussion of the issues and relevant citations. In this post, I will focus on something we had left for the reader to work out, namely, developing a formal statement of the specific theory of incomplete neutralisation we proposed, particularly for incomplete neutralisation with a small effect size.

Desiderata of any explanation of incomplete neutralisation

Before trying to explain the phenomenon, it is useful to lay out what it is exactly that we are trying to achieve with a theory of the phenomenon:

  1. No phenomenon-specific factors, if possible.
  2. Has to extend to nonce words or novel (or extremely low-frequency) phrases.
  3. Explain why some processes result in small effect sizes and others in large effect sizes.
    • i.e., the counter-factual should be predicted to be impossible.
  4. Explain why “over-neutralisation” isn’t observed.
  5. Explain how a feeding interaction is possible if there is phonetically incomplete neutralisation.
  6. Explain why incompletely neutralised segments can trigger phonological processes,
    but other phonetically similar segments don’t.

Here, it is important to be clear about terminology. Many people use “explain” and “account for” interchangeably, though there is a huge difference between the two verbs. By “explain”, I mean that the counter-factual should be predicted to be impossible. So, it is important to be clear about what the fact is that we are trying to explain, so that we can be clear about the counter-factual.

Notably, any sufficiently complex theory can account for data. For example, any theory that says everything is memorised will account for all the words in a language, but it doesn’t explain anything, including systematic facts in the words.1 That is, such accounts of the data don’t explain why the relevant facts are the way they are and not some other way. Such theories are sometimes called just-so theories.

Second, by over-neutralisation, I mean that we never see that the derived [t]/d/ is more extreme in its phonetic exponence than the underlying [t]/t/. Now, this is crucial to account for: if we just say that such exponences are lexicalised (as in exemplar representations), or appeal to theories that map representations to arbitrary phonetics, then it is easy to get such over-neutralisation; therefore, such theories don’t explain the phenomenon of incomplete neutralisation.

Figure 2: sub-lexical incremental processing

As mentioned above, in this post, I will focus on developing a formalisation of a specific theory of incomplete neutralisation, particularly the kind with a small effect size. However, I do want to point out that the last two desiderata are phonological desiderata — if we have categorical phonological representations, it is obvious how we achieve them; but it is far trickier (if at all possible) to achieve them with gradient/high-dimensional representations.

In Du and Durvasula (2024), we lay out a specific theory of incomplete neutralisation with a small effect size. The main insight there was that the phenomenon can be viewed as an average effect of multiple planned events stemming from incremental processing (Kilbourn-Ceron and Goldrick 2021; Tanner, Sonderegger, and Wagner 2017; Wagner 2012; Ferreira and Swets 2002).

Let’s return to the German final devoicing mentioned above. Imagine the UR /bad/. When the morpheme with a final voiced stop is encountered in planning, the speaker doesn’t yet know that the rule/process environment of prosodic word-finality is met. Therefore, they plan a morpheme-final voiced stop. Further incremental processing allows the speaker to plan subsequent morphemes, and if the first morpheme does appear at the end of a prosodic word, then the devoicing process is planned. Consequently, the speaker has planned a set of surface representations for the same underlying representation at the same time, and the more recently planned voiceless obstruent will blend with the previously planned surface voiced obstruent, causing the output to be more voiced than an underlying voiceless obstruent and resulting in incomplete neutralisation. Furthermore, the effect of the more recently planned voiceless obstruent is stronger due to a recency effect, so the output in production is predicted to be closer to a voiceless stop, which results in a small effect size in incomplete neutralisation.

It might help the reader to visualise this. Say that, for the UR /bad/, t4 is the point at which the following context is planned, so that is when the phonological rule applies. Then, the time-points before t4 will result in [d] being planned, and at and after t4, [t] is planned.

Figure 3: sub-lexical incremental processing

So, for a lexical item (or morpheme), there can be multiple planned events for each UR segment {[d], [d], [d], [t], [t], [t]}, but more recent planned events have a stronger effect, given the recency bias present in short-term memory (Glanzer and Cunitz 1966; Waugh and Norman 1965; Rundus 1971). Consequently, the phonetic exponence of the phonological mapping [t]/d/ will be trapped between the exponences of [t] and [d], and will be much closer to [t]. OK, but can we formalise these statements so that there is no fuzziness about what’s being claimed?

Formalising the theory that we laid out

The verbal theory that Du and Durvasula (2024) proposed can be mathematically formalised as follows:2
\(Expected\ exponence = \sum_{t=1}^{N} w_{t} \cdot planned\ phonetic\ target_{t}\)

Here, \(t\) indexes the planned events in order, \(N\) is the total number of planned events, \(planned\ phonetic\ target_{t}\) is the phonetic target planned at step \(t\), and the weights \(w_{t}\) sum to 1 and increase with recency.

Consequently, one gets a representation that is trapped between [d] and [t] and much closer to [t]. We call this the Incremental Unitary Planning Effect.

Simulation

Imagine there is a phonological process:
\(Cat_{1} \rightarrow Cat_{2}/ \underline{\hspace{0.5cm}}\ some\ environment\)

We need to make some assumptions to be able to implement the formalisation in the previous section. First, let’s say there are 10 incremental steps before the planned event is sent to production (not crucial). Second, the phonetic target for Category 1 is 10 (not crucial), and that for Category 2 is 20 (not crucial). Third, the phonological process is planned 4 steps before the final step. Finally, the weighting/decay function is a negative exponential (Wickelgren and Norman 1966; Goldinger 1998; Nosofsky 1988).

All simulations are done in R (R Core Team 2021) using tidyverse functions (Wickham 2017) as necessary.

library(tidyverse)

SwitchDistance = 4 # Distance of the rule application from the final planned target
NumberOfSteps = 10
# Phonetic targets for categories 1 and 2
CategoryTargets = data.frame(Category = c("Cat1","Cat2"), 
                             PhoneticTarget = c(10,20))

# Weighted predictions
DerivedCat2Target = data.frame(DistanceFromUtterance = NumberOfSteps:1) %>% 
  # Weighting function is a negative exponential, and then normalised
  mutate(weightingFunction = exp(-DistanceFromUtterance)) %>%
  mutate(weightingFunctionNormalised = weightingFunction/sum(weightingFunction)) %>% 
  # Calculating the contribution from each planned event
  mutate(contributoryTargetFromStep = 
           ifelse(DistanceFromUtterance > SwitchDistance,
                  weightingFunctionNormalised*CategoryTargets$PhoneticTarget[1],
                  weightingFunctionNormalised*CategoryTargets$PhoneticTarget[2])) %>% 
  # Calculating the net prediction
  mutate(PhoneticTarget = sum(contributoryTargetFromStep))
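For concreteness, the net prediction under these toy assumptions can be pulled out directly (the value follows purely from the numbers assumed above, not from any measured data):

```r
# Net predicted exponence of the derived category, to two decimals
round(unique(DerivedCat2Target$PhoneticTarget), 2)
# 19.82
```

That is, the derived category lands between the two targets (10 and 20), but very close to the target of the category it is neutralising towards.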

OK, let’s visualise the exponential decay function, which just says that more recent events have a larger effect on the production.
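A minimal sketch of such a plot, reusing the DerivedCat2Target data frame computed above (the axis labels are my own):

```r
# Plot the normalised weights against distance from the final planned event;
# the x-axis is reversed so that more recently planned events appear on the right
DerivedCat2Target %>%
  ggplot(aes(x = DistanceFromUtterance, y = weightingFunctionNormalised)) +
  geom_point() +
  geom_line() +
  scale_x_reverse() +
  theme_bw() +
  xlab("Distance from the final planned event") +
  ylab("Normalised weight")
```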

Finally, we can see what the predicted output of the derived representation is compared to the non-derived representations. You can see in the figure below that though the difference is sizeable between the exponences of the two phonemes in non-neutralising contexts, the phonetic exponence of the derived segment is really, really close to that of the intended category.

CategoryTargets %>% 
  rbind(data.frame(Category = c("Cat2-derived"), 
                   PhoneticTarget = unique(DerivedCat2Target$PhoneticTarget))) %>% 
  mutate(Category = factor(Category,levels = c("Cat1","Cat2-derived","Cat2"))) %>% 
  ggplot(aes(x=Category,y=PhoneticTarget,fill=Category))+
  geom_bar(stat="identity")+
  scale_fill_grey()+
  geom_text(aes(label=round(PhoneticTarget,2)),nudge_y=1)+
  theme_bw() + xlab("")

Some important consequences

First, there is nothing interesting being learnt about the incompleteness by a learner. The child only learns the categorical phonological pattern; the incompleteness effect comes for free with the planning story.

Some important predictions fall out of this view:

Conclusion

Any theory has to be situated in the context of auxiliary hypotheses (Lakatos 1970; Quine 1951; Duhem 1954). Our whole point is that the auxiliary hypotheses people have been using to test categorical representations and categorical phonology are simply wrong, and in fact contradict what proponents of categorical representations (or generative grammars) have said. So, we took it as a challenge to flesh out a specific auxiliary hypothesis related to performance, more specifically, of how categorical knowledge/representations are used in production.

To the extent that it is formalisable and the predictions are clear (and not based on subjective interpretations), I believe this is the right way to approach the issue. However, as we mention in previous work (Du and Durvasula 2024, 2022), even if the predictions laid out above turn out to be wrong, it doesn’t falsify the framework of categorical features — which have multiple strands of evidence. It would at most argue against our specific auxiliary hypotheses related to planning.

References

Chomsky, Noam. 1965. Aspects of the Theory of Syntax. Cambridge: MIT Press.
Chomsky, Noam, and Morris Halle. 1968. The Sound Pattern of English. New York: Harper & Row.
Du, Naiyan, and Karthik Durvasula. 2022. “Phonetically Incomplete Neutralization Can Be Phonologically Complete: Evidence from Huai’an Mandarin.” Phonology.
———. 2024. “Psycholinguistics and Phonology: The Forgotten Foundations of Generative Phonology.” Cambridge Elements in Phonology.
Duhem, Pierre Maurice Marie. 1954. The Aim and Structure of Physical Theory. Princeton University Press.
Durvasula, Karthik. 2025. “Karthik Durvasula: Near Mergers.” https://karthikdurvasula.gitlab.io/posts/2025-04-29-Near Mergers/.
Ferreira, Fernanda, and Benjamin Swets. 2002. “How Incremental Is Language Production? Evidence from the Production of Utterances Requiring the Computation of Arithmetic Sums.” Journal of Memory and Language 46 (1): 57–84.
Glanzer, Murray, and Anita R Cunitz. 1966. “Two Storage Mechanisms in Free Recall.” Journal of Verbal Learning and Verbal Behavior 5 (4): 351–60.
Goldinger, Stephen D. 1998. “Echoes of Echoes? An Episodic Theory of Lexical Access.” Review. Psychological Review 105 (2): 251–79. https://doi.org/10.1037/0033-295X.105.2.251.
Hammarberg, Robert. 1982. “On Redefining Coarticulation.” Journal of Phonetics 10 (2): 123–37.
Kilbourn-Ceron, Oriana, and Matt Goldrick. 2021. “Variable Pronunciations Reveal Dynamic Intra-Speaker Variation in Speech Planning.” Psychonomic Bulletin & Review. https://doi.org/10.3758/s13423-021-01886-0.
Lakatos, Imre. 1970. “Falsification and the Methodology of Scientific Research Programmes.” In Criticism and the Growth of Knowledge: Proceedings of the International Colloquium in the Philosophy of Science, London, 1965, edited by Imre Lakatos and Alan Musgrave, 4:91–196. Cambridge University Press. https://doi.org/10.1017/CBO9781139171434.009.
Lyons, J. 1974. “Linguistics.” Encyclopaedia Britannica 10: 1002.
Nosofsky, Robert M. 1988. “Exemplar-Based Accounts of Relations Between Classification, Recognition, and Typicality.” Journal of Experimental Psychology: Learning, Memory, and Cognition 14 (4): 700.
Postal, Paul Martin. 1968. Aspects of Phonological Theory. New York, USA: Harper & Row.
Quine, Willard V. O. 1951. “Two Dogmas of Empiricism.” Philosophical Review 60 (1): 20–43. https://doi.org/10.2307/2266637.
R Core Team. 2021. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. http://www.R-project.org.
Rundus, Dewey. 1971. “Analysis of Rehearsal Processes in Free Recall.” Journal of Experimental Psychology 89 (1): 63.
Tanner, James, Morgan Sonderegger, and Michael Wagner. 2017. “Production Planning and Coronal Stop Deletion in Spontaneous Speech.” Laboratory Phonology: Journal of the Association for Laboratory Phonology 8 (1): 15. https://doi.org/10.5334/labphon.96.
Wagner, Michael. 2002. “The Role of Prosody in Laryngeal Neutralization.” MIT Working Papers in Linguistics 42: 373–92.
———. 2012. “Locality in Phonology and Production Planning.” McGill Working Papers in Linguistics 22 (1): 1–18. http://prosodylab.org/~chael/papers/wagner2012production.pdf.
Waugh, Nancy C, and Donald A Norman. 1965. “Primary Memory.” Psychological Review 72 (2): 89.
Wickelgren, Wayne A, and Donald A Norman. 1966. “Strength Models and Serial Position in Short-Term Recognition Memory.” Journal of Mathematical Psychology 3 (2): 316–47.
Wickham, Hadley. 2017. Tidyverse: Easily Install and Load the “Tidyverse”. https://CRAN.R-project.org/package=tidyverse.

  1. This is very similar to saying that the R² of a statistical model can only increase with the addition of meaningless terms in the model. Therefore, R² is not a meaningful measure of model fit by itself — without an evaluation of and correction for model complexity, it is not possible to do model comparison.↩︎

  2. Note, this is fully consistent with the previous blog post, where targets were fleshed out in terms of random variables. Here, the focus is on the expected values, since we are talking about (unstandardised) effect sizes, and not the variance itself. So, I don’t present the formalisation in terms of random variables, to avoid unnecessary formal details. However, note that if w_{t} and the targets were represented as random variables, W and T, respectively, then (assuming the two are independent) E[W*T] = E[W]*E[T], and so the argument carries over straightforwardly.↩︎


Citation

For attribution, please cite this work as

Durvasula (2025, May 14). Karthik Durvasula: Incomplete neutralisation. Retrieved from https://karthikdurvasula.gitlab.io/posts/2025-05-14-Incomplete neutralisation/

BibTeX citation

@misc{durvasula2025incomplete,
  author = {Durvasula, Karthik},
  title = {Karthik Durvasula: Incomplete neutralisation},
  url = {https://karthikdurvasula.gitlab.io/posts/2025-05-14-Incomplete neutralisation/},
  year = {2025}
}