How I teach...speaking
Less hare, more tortoise when it comes to output.
A race to production
For a long time, I thought I was teaching speaking.
Students did role-plays. They answered questions. They “had a go”. I set up pair work, circulated, encouraged, corrected gently. To the casual observer, it probably looked fine.
But I felt that something was off.
A chasm seemed to be forming. The same chasm that forms when discovery learning (or whatever it calls itself these days) is favoured over explicit instruction. Using this approach is tantamount to privileging the already privileged, as Carl Hendrick said recently. The students who did OK or even thrived were usually the confident ones, the ones with good memories or strong literacy skills. Many others froze or, in the case of pair work, waited for their partner to do the heavy lifting. And despite all that “practice”, progress in speaking was slow, fragile, and uneven.
What I eventually realised — thanks largely to Extensive Processing Instruction (EPI) — is that I wasn’t really teaching speaking at all. I was mostly just asking for it. And I was asking too soon. Like someone you barely know asking for a lift to the airport.
Speaking is not a skill you practise into existence
One of the most persistent myths in languages teaching is that speaking develops primarily through speaking practice.
It feels intuitive: want students to speak better? Get them speaking more.
The problem is that speaking is a cognitively demanding, real-time skill, requiring learners to retrieve vocabulary, apply grammatical knowledge, and articulate meaning under time pressure (Levelt, 1989; DeKeyser, 2007).
For novice learners especially, this is cognitively brutal.
What is more, such an approach was probably subjecting my students to what Pink (2009) calls excessive challenge: repeated exposure to a task which is too difficult. As you can probably imagine, this leaves learners demotivated and possibly anxious about tasks of this kind.
EPI forced me to confront an uncomfortable truth: speaking quality is constrained by the quality of underlying language processing.
The Modelling phase and why it is crucial
At the heart of EPI is the Modelling (or Awareness-Raising) phase. This is where students are exposed to a great deal of high-quality, carefully structured input from the teacher.
This input is composed of rich, salient, meaningful language that is comprehensible, repeated, and purposeful. For this phase, I often go back to Dylan Viñales’ magnificent blog post.
In this phase, students listen, repeat in a controlled way, and practise identifying and experimenting with the phonological features of the language they are hearing. They are not expected to speak spontaneously. Their role is to listen, notice patterns in sound and form, and gradually build stable connections between sounds and meaning.
The more I’ve leaned into this phase, the more it’s reminded me of something obvious but often overlooked in languages teaching: how we learn our first language.
Parallels with first language acquisition
Before children speak, they listen a lot.
They are immersed in enormous amounts of natural, authentic input. Through this exposure, they begin to attune to the sounds of the language, recognise recurring words and phrases, map meaning onto form, and build expectations about how language works.
All of this happens long before they are capable of producing language themselves.
In many second language classrooms, we reverse this sequence. We ask students to speak almost immediately, often after minimal exposure.
EPI resists this reversal.
By flooding students with quality input in the modelling phase, we give them time to develop awareness of sounds, structures, and meaning before we ever ask them to retrieve language under pressure.
One of my current research interests is exploring how and to what extent insights from first language acquisition can inform second language teaching, particularly in the first year of language learning. While first and second language acquisition are not identical processes, the principle that comprehension and exposure precede confident production feels both intuitive and instructionally powerful.
EPI flips the speaking sequence
In an EPI classroom, speaking comes late. This is because it is demanding, not because it is unimportant.
The core idea is simple: learners need extensive, meaningful processing of language before they are expected to produce it.
That means:
sustained, structured listening
repeated encounters with the same language
carefully controlled variation
time for form–meaning connections to stabilise
Speaking isn’t removed. It’s earned.
From output-first to processing-first
Before EPI, my lessons often followed this pattern:
Teach some vocabulary and a grammar point
Do a couple of examples
“Now you try”
Under EPI, the sequence looks very different:
Input flood through teacher modelling
Guided processing that forces attention to meaning and form
Recycling of the same structures across lessons
Constrained speaking using already-processed language
Freer speaking once retrieval becomes more automatic
When students reach the stage at which they are able to produce accurate output, it provides an opportunity for them to showcase their learning.
Guided speaking is not “dumbing it down”
A common criticism of EPI — particularly of the modelling and early recycling phases — is that it can feel like parroting. Students listen, repeat, and encounter the same high-frequency language and chunks again and again, and critics argue that this work is artificial and ultimately unhelpful for developing spontaneous speech.
This concern, whilst understandable, rests on a misunderstanding of what repetition is doing in an EPI classroom.
The modelling stage is not about students mimicking sounds for their own sake. It is about building phonological, lexical, and syntactic representations that are stable enough to be retrieved later under real-time pressure. Repetition here is controlled, intentional, and meaning-driven. Students are not being asked to generate language; they are being asked to notice how it sounds, how meaning shifts, what stays constant, and what changes.
The same applies to the recycling of high-frequency language. This is not an artificial constraint but a deliberate attempt to prioritise depth over breadth. Spontaneous speech does not emerge from having seen lots of language once; it emerges from having processed a smaller amount of useful language (usually very high-frequency) repeatedly and successfully. Research on formulaic language and skill acquisition consistently shows that fluent speech relies heavily on automatised chunks rather than novel sentence construction under pressure (e.g. Levelt, 1989; DeKeyser, 2007; Wray, 2002).
What is often labelled “artificial” is, in fact, what makes spontaneity possible later on. By reducing cognitive load early — through familiar structures and predictable language — learners are freed to focus on meaning when speaking becomes less constrained. Without this foundation, so-called spontaneous speaking tasks frequently result in avoidance, silence, or reliance on memorised scripts.
From this perspective, constrained (or guided) speaking is not the opposite of authentic communication; it is the preparation for it. Spontaneity is not something we demand at the outset — it is something that emerges once learners have the linguistic resources to support it.
And in contrast to the cognitively demanding, anxiety-inducing alternative, confidence grows gradually from repeated success and from not being pushed into the deep end.
Fluency emerges. It isn’t forced
One of the most satisfying outcomes of shifting to EPI has been how fluency develops without me explicitly chasing it.
When learners have heard the same structures many times, processed them for meaning, and retrieved them in low-stakes speaking contexts, speech becomes faster and smoother almost as a by-product.
Fluency is built before it is displayed.
The Guinness approach
EPI doesn’t make students speak less. It makes them speak better and with far more confidence.
By practising patience and giving modelling and comprehension the time they deserve, and by resisting the urge to rush into output, speaking can become more accurate, more fluent, and more inclusive.
With speaking, like Guinness, good things come to those who wait.






