On the trickiness of hard evidence in ELT

Recently, there’s been a lot of debate about the need for evidence, method and research in ELT. I confess I’m truly fascinated by the topic. I enjoy delving into the debates as well as reading articles describing various experiments. I also love to experiment myself. However, I often catch myself having doubts, especially about the process, but also about how the results are interpreted and implemented. The other day I came across an interesting article called “How Tests Make Us Smarter”, where Henry L. Roediger III discusses the benefits of regular quizzing:
“When my colleagues and I took our research out of the lab and into a Columbia, Ill., middle school class, we found that students earned an average grade of A- on material that had been presented in class once and subsequently quizzed three times, compared with a C+ on material that had been presented in the same way and reviewed three times but not quizzed. The benefit of quizzing remained in a follow-up test eight months later”.

This is my concern: I suppose the author is talking about two different texts (material) and one group of students. The thing is that I’m not convinced that it’s possible to get reliable results with two different pieces of material. Who can guarantee that they were of exactly the same level of difficulty; that they included comparably demanding content? We have some scientific methods and tools that can measure readability scores, for example, but there are so many factors at play as far as the difficulty of texts is concerned. Nevertheless, a similar situation would occur if the researcher used the same text but two different groups: there would be no guarantee that the groups (or individual students) had the same ability, intelligence, aptitude, etc. 

I think that the problem is the attempt to turn ELT into a rigorous science. We call for research and concrete evidence, but in any research of this type there are people involved who react to stimuli, interact, respond, have different character traits, etc.; they simply behave differently and unpredictably in different situations and under different circumstances. Then there is the learning content: texts, images, equations, vocabulary, graphs, you name it. The people are undoubtedly a very unstable element, but the material is not a constant either, because the perceived difficulty of a piece of material depends heavily on who is perceiving it; it’s not merely a property of the material itself.

I remember conducting an experiment (as part of my MA studies) with a group of intermediate students (the same age, approximately the same level). It was based on I.S.P. Nation’s belief that we need to be familiar with 98% of the words in a text to be able to understand it sufficiently. First, I gave my students a paper version of Nation’s Vocabulary Levels Test (which can be easily accessed online) to assess their vocabulary knowledge. My intention was to find out to what extent the result corresponded with the readability score of the text they were going to read. Then I gave them a simple authentic short story by Ernest Hemingway and asked them to underline all the unknown vocabulary while reading. To make the results more precise, I asked them to use two different markers: one for words they didn’t know at all and couldn’t infer from the surrounding context, the other for words whose meaning they didn’t know but thought they could guess from the context and co-text. Then I asked them some comprehension questions to see the correlation between the number of unknown words and the ability to understand the story. Finally, I tested them on a random selection of the words they hadn’t underlined to see if they really knew them. I got all sorts of interesting results, such as 1) some students had ‘cheated’ and underlined fewer words than they should have, 2) one of the best students in the class had underlined the most unknown words, which, however, hadn’t prevented him from understanding the most important message of the story, 3) some students had underlined some words only to realize later that they actually knew them, 4) another very good student hadn’t underlined many words but his comprehension was rather weak, etc. Overall, I got a lot of hard evidence of how distorted the results can be when the human factor is involved. I don’t intend to go into further detail here.
The point is that all students got the same text, a fairly easy one from the linguistic point of view, but because the story was a literary text with lots of implicit and hidden messages and meanings, and each student came from a different background, with different experience and schemata, the level of comprehension didn’t and couldn’t correlate with the actual language knowledge.
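For anyone who wants to try a similar underlining task, the coverage figure behind the 98% claim is simple arithmetic: the share of words in the text that a reader did not mark as unknown. What follows is only a minimal sketch in Python under stated assumptions, not the procedure used in the experiment above. The function names `tokenize` and `lexical_coverage`, the naive tokenizer and the sample sentence are all hypothetical and chosen for illustration.

```python
import re


def tokenize(text):
    """Split text into lowercase word tokens (a naive tokenizer, assumed for illustration)."""
    return re.findall(r"[a-z']+", text.lower())


def lexical_coverage(text, unknown_words):
    """Return the share of running words (tokens) in `text` not marked as unknown."""
    tokens = tokenize(text)
    if not tokens:
        raise ValueError("text contains no words")
    unknown = {word.lower() for word in unknown_words}
    known = sum(1 for token in tokens if token not in unknown)
    return known / len(tokens)


# Hypothetical sample sentence and underlined words, not data from the experiment.
sample = "The old man sat in the shade of the terrace and drank his brandy slowly"
underlined = ["terrace", "brandy"]

coverage = lexical_coverage(sample, underlined)
print(f"coverage: {coverage:.1%}")
print("meets 98% threshold:", coverage >= 0.98)
```

The sketch counts running words (tokens) rather than distinct words, which is an assumption here. It also trusts the underlining completely, and as the results above show, that trust is exactly where the human factor creeps in.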

All in all, I believe that it’s the human factor that complicates ELT research and undermines the validity of any evidence. No matter how much we want to experiment, some of the data we get from our experiments will often be pretty unreliable and impossible to replicate. If I say something worked for my students and I even prove it, any educator or researcher can disprove my claims quite easily if they conduct the research at a different time, in a different environment, with different students and different material. No wonder that, to some, ELT research may appear a waste of time; they prefer to take all sorts of feeble arguments for granted and simply try what others have tried before without challenging their assumptions.


About Hana Tichá

I'm an EFL teacher based in the Czech Republic. I've been teaching English to learners of all ages for almost 25 years and I still love my job. You can find out more about my passion here on my blog.

8 Responses to On trickiness of hard evidence in ELT

  1. mallingual says:

    I think you're right to question research results; that's very important. But I also think that research is the best option we have. http://malingual.blogspot.co.uk/2013/04/the-least-worst-solution.html


  2. mallingual says:

    I've also written about the 'human element' here: http://malingual.blogspot.co.uk/2013/12/thought-terminating-cliches.html And I wonder, do you think psychology and medical science suffer from the same issues? A drug is tested on a group of people, but they are all different, so how can we know if the drug really works?


  3. Hi Hana,

    I think you raise some interesting points here, and agree that one piece of research probably doesn't really tell us much. I think that's why it is important for people to replicate studies to see if they get the same results. For my MA, I looked at motivation, using Dornyei's model of the L2 self. Quite a few studies have been done in this area, so I looked at what had already been done and attempted to see whether I would get the same results in Korea (I did). That gave me the confidence to say that what I had read was 'correct'.

    But I think you do make an excellent point about different results in different environments. I am by no means an expert in research, and it is something that I probably need to learn a lot more about. I do think, however, that there should be more of a push for teachers to do Action Research, similar to the vocabulary experiment you carried out. I think it is a very good way for individual teachers to get to know their particular group of students and then adapt the course and teaching based on those individuals.



  4. Hana Tichá says:

    I agree, Russ. I think you make a good point in your post when you question the value of experience. I'd never say that it's the only prerequisite of being a good teacher. However, can we come up with some measurable criteria of what a good teacher/lesson is (as opposed to a crap one)? Who dictates the criteria? And who dictated them a hundred years ago? Also, what one considers to be good, another may consider a disaster – be it from the viewpoint of a teacher, administrator, or a student. This is a philosophical problem, though. That's why I will always consider teaching and education a humanities field, not a rigorous science, even though I’m not against research and experimenting.


  5. Hana Tichá says:

    I'm not a psychologist. I'm a teacher so I speak about stuff I'm familiar with. But yes, I guess psychology does suffer from the same issues. Human beings are much more complicated than we think – they are not just a mass of flesh and bones. And this applies to teaching and pedagogy in general. There are lots of immeasurable things, such as the degree and value of the empathy and compassion a teacher shows. And these are the things that should be discussed because they are starting to disappear from our society. To be honest, that’s what worries me most, not whether vocabulary needs to be quizzed in order to be learnt. But I love reading your blog because it offers a different perspective on certain issues. And I thank you for that.


  6. Hana Tichá says:

    Hi David,
    What a coincidence; I looked at motivation in my BA thesis 🙂 I remember two names were in fashion back then – Gardner and Lambert – and their concept of integrative motivation. It was a long, long time ago and a lot has changed in the field since. One is amazed by the constant change and development in the field and, at the same time, by the fact some teachers apparently teach the same way they did 20 years ago. This doesn’t mean they’re not doing their job well, though.
    Anyway, like you, I'm in favour of Action Research and adapting. I believe theoretical considerations, research results and one’s own experience are needed when we are deciding what to change and what to leave untouched. Teaching is so fluid and so pliable. That’s why I’m a teacher, not a scientist.


  7. mallingual says:

    Can we indeed? I think we can, but I would have to ask whether or not you have a CELTA or DELTA. I ask because it is possible to fail both, so presumably Cambridge believe good teaching is 'measurable' in some sense.

    Teaching will never be a hard science, you're right, but as Swan says, that doesn't excuse us from having to provide evidence for our beliefs. 🙂


  8. Hana Tichá says:

    I agree that 'in some sense' yes. But the ones who judge whether it's good or bad are not exclusively CELTA or DELTA examiners, right? The thing is that I'm an EFL teacher but also a classic 'subject' teacher, i.e. the one teaching English as a subject in the state sector in my native country. This may be the source of the discrepancy in our opinions. But where there is no contradiction, there is no progress (or something along these lines). BTW, no, I hold neither of the certificates you mention but in my MA programme I was trained by excellent DELTA holders, so I’m familiar with the principles. I learnt a lot from them but that doesn’t mean I agreed with them all. Thanks for stopping by and commenting. I really appreciate it.

