Uncovering linguistic layers

From time to time a student composing an essay asks: “Shall I write in the street or on the street?”  As I don’t want to disturb the others by long lectures on prepositions, I usually say: “You can use both. Just choose”. I know it’s not quite accurate but I don’t think it’s a big deal either, at least at lower levels of proficiency.

As far as I remember, the way I learned this piece of lexicogrammar at school was something along these lines: in the street is mainly used in British English, and be on the street(s) means be homeless. Let’s have a closer look.

First of all, it seems that there is some difference between the phrase including a singular noun (street) and the phrase including a plural form of the noun.

Regarding the phrases including the singular form of the noun, in the street, according to the first graph below, was more frequent until about 1980, but then on the street, which, by the way, was almost non-existent back in 1800, started winning the race. I learned this by checking out Google Ngram Viewer (thanks, Sandy Millin, for sharing this). Anyway, after a rather sharp decline around 1945, a sudden increase in the use of on the street can be seen around 1965. One wonders why; has the issue of homelessness become more pressing recently?


Now, looking at the phrase including the plural form of the noun, I can see that in the streets has consistently been more frequent than on the streets. Like on the street, on the streets was almost non-existent in 1800 (see the second graph below).


If you look at some concordance lines of the chunk on the street(s), you will discover that, indeed, it is often related to homelessness.

  • She spends several years on the streets.
  • To fear being thrown on the street?
  • The average person on the street are not scientists
  • She was better off on the streets.
  • Will they sleep on the streets tonight?
  • A young girl who lives on the streets.
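
If you have never seen how concordance lines come about, here is a minimal sketch in Python. It is not how any particular corpus tool works; the function name kwic and the width setting are my own assumptions, and the sample is just a handful of the lines quoted above rather than a real corpus.

```python
import re

def kwic(sentences, phrase_pattern, width=25):
    """Return keyword-in-context lines for every regex match in the sentences."""
    lines = []
    for sentence in sentences:
        for match in re.finditer(phrase_pattern, sentence, flags=re.IGNORECASE):
            # Keep at most `width` characters of context on each side.
            left = sentence[:match.start()][-width:]
            right = sentence[match.end():][:width]
            lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines

# Toy sample: lines quoted in this post, not a real corpus.
sample = [
    "She spends several years on the streets.",
    "To fear being thrown on the street?",
    "She was better off on the streets.",
    "Will they sleep on the streets tonight?",
    "A young girl who lives on the streets.",
]

for line in kwic(sample, r"\bon the streets?\b"):
    print(line)
```

Lining up the search phrase in one column is what makes it easy to scan the words to its left and right for patterns.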

Things shift a bit if you add a little function word, though. If you search the phrase in the streets *of*, the most frequent right collocates are usually (and quite obviously) places/towns. The same happens with on the streets *of*. When studying the concordance lines, I didn’t discover any difference in connotation between these two chunks other than the number of hits per million. The chunk on the streets *of* is more frequent than in the streets *of*.
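
In case the idea of hits per million is new to you, it is simply the raw number of hits scaled to the size of the corpus, so that counts from corpora of different sizes can be compared. Here is a small sketch in Python; the function name and all the figures are invented for illustration and do not come from any corpus.

```python
def per_million(hits, corpus_size):
    """Convert a raw hit count into a frequency per million words."""
    return hits / corpus_size * 1_000_000

# Hypothetical figures for illustration only, not taken from any real corpus.
corpus_size = 100_000_000   # assumed corpus size in words
hits_on = 450               # assumed raw hits for "on the streets of"
hits_in = 300               # assumed raw hits for "in the streets of"

print(f"on the streets of: {per_million(hits_on, corpus_size):.2f} per million")
print(f"in the streets of: {per_million(hits_in, corpus_size):.2f} per million")
```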


What is interesting though is that the preposition *of* strips the phrase on the streets of its exclusivity related to the connotation of homelessness. In other words, it seems to me that it brings closer the connotations of in the streets and on the streets, i.e. the preposition simply doesn’t matter anymore.

I can’t help feeling that I’m only moving on the surface of the problem and that there’s much more behind it. Some of my conclusions may even be inaccurate and incomplete. Still, it’s a great adventure to slowly uncover the linguistic layers. What’s more, I’m learning a lot along the way.


Apparently, my blog has recently turned into a diary where I’ve been recording and sharing some of my corpora-related observations.

Here’s another anecdote in the series of posts: Yesterday, in class, we dealt with adjectives of feeling and emotions and the prepositions they take, such as angry with, depressed about, proud of, etcetera. As you know, some adjectives are quite tricky since they can take more than one preposition while the meaning stays roughly the same. One of the notorious ‘troublemakers’ is, for example, the word disappointed. 


I mentioned to my class that this adjective is usually followed by with, by or in. One of my students, curious, searched the internet and eventually confirmed what I had said. He came up with this page, which explains the slight shifts in meaning when different prepositions are used.


Disappointed by usually indicates that somebody has done something specific to cause you to be disappointed.

Disappointed with implies that the cause of the disappointment was something basic about the nature or attributes of the thing.

Disappointed in usually indicates a deeper level of disappointment with the nature of somebody or something, or repeated problems with them, and often indicates that the speaker has lost faith in someone’s ability to do what’s expected of them.

Although the author did his/her best to help the puzzled learner, it’s still a bit complicated, at least for a B1/B2 learner of English. So I’ve tried to figure it out for myself by looking at sets of concordance lines in the BNC. Here are the top collocates of the phrase disappointed + by/with/in (ranked by the MI index):

1) One can be disappointed by (the) lack of sth., failure, response, elections, results, decision 

2) One can be disappointed with results, players, performance, result, (the) lack of sth., decision, way

3) One can be disappointed in one’s expectation, love, (not) having …, me, you, him, her …

A closer scrutiny of the concordance lines prompts the following conclusion:

  • No1 > some external factor/situation caused my feelings of disappointment.
  • No2 > I’m not happy with the quality/state of something. Note: It seems that No1 and No2 can be used interchangeably with certain collocates with the meanings remaining very close.
  • No3 > a way to express disillusion or reproach.
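
A side note on the MI index mentioned above: as far as I understand it, the basic idea is to compare how often two words actually occur together with how often they would be expected to co-occur by chance. Below is a rough sketch in Python of that basic form. Corpus interfaces differ in the details (some also factor in the size of the search span), so please don't take this as the exact formula any given tool uses. The function name and all counts are made up for illustration.

```python
import math

def mi_score(cooccurrences, node_freq, collocate_freq, corpus_size):
    """Basic mutual information: log2 of observed over expected co-occurrences."""
    # Expected co-occurrences if the two words were unrelated.
    expected = node_freq * collocate_freq / corpus_size
    return math.log2(cooccurrences / expected)

# Hypothetical counts for illustration only, not real BNC figures.
score = mi_score(cooccurrences=20, node_freq=1_000,
                 collocate_freq=2_000, corpus_size=100_000_000)
print(round(score, 2))
```

Because the score rewards pairs that occur together more often than chance would predict, a list ranked by MI can look quite different from a list ranked by raw frequency.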



Google Fight

A couple of days ago I came across a post by Svetlana Kandybovich, in which she shares some great ideas for using Google in an L2 classroom. One of the tips I particularly liked was Googlefight, a website that allows users to compare the number of search results returned by Google for two given queries.

This tool is generally used for entertainment; you type two keywords and click on the ‘Fight’ button. The winner is the one which gets the biggest number of results on Google. So, I originally planned to use the tool in class for fun too, as a warm-up after Easter holidays, but at the same time, I secretly hoped for a sudden influx of sophisticated ideas related to language learning.

What obviously first came to mind was the concept of word frequency. It occurred to me that my students would find it interesting to see the differences in frequency counts of two words belonging to the same category/lexical set. To spice the activity up, we played a bidding game – each student made a bid on one of the words before I displayed the actual results on the screen. So, for example, we found out that cat got more hits than dog. Those who had voted for cat won a point. But you can go further with this; you can develop this somewhat meaningless game into a useful linguistic exercise. If you check Longman Communication 3000, you’ll see that both cat (as a noun) and dog (as a noun) are in the list of the 3000 most frequent words in both spoken and written English, but dog is a bit more frequent in written English than cat. If you’re brave enough to play with corpora a bit, you can go to COCA, where the word cat gets 17,284 hits, while the word dog gets 40,020 hits. Now, you can ask your students why they think it is so. Why the different results? Is it because there are more cats than dogs or vice versa? Does the word dog have only one meaning? Is it always a noun? What about cat? Does the fact that Google doesn’t sort out words according to parts of speech influence the frequency counts? Are the words displayed plural, singular or both? What about various abbreviations and acronyms?
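
To show why those questions matter, here is a small sketch in Python of what a plain word counter does. It knows nothing about parts of speech or meanings; it just matches strings. The function name and the text are made up for illustration.

```python
import re
from collections import Counter

def count_forms(text, forms):
    """Count whole-word occurrences of each listed form, ignoring case."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)
    return {form: counts[form] for form in forms}

# Made-up text for illustration only.
text = ("The dog chased the cat. Two cats watched the dogs. "
        "Worries dog him every day, and the CAT scan was clear.")

print(count_forms(text, ["cat", "cats", "dog", "dogs"]))
```

In this toy text the count for dog includes the verb (worries dog him) and the count for cat includes the acronym in CAT scan, while the plurals are counted separately. A raw number of hits hides all of that.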

The exact numbers are not terribly important, but through these activities, you can develop in your students the ability to notice some very important aspects of lexis. This can be a nice lead-in to some dictionary work. I personally like working with a paper edition of the Dictionary of Contemporary English because the meanings of words are listed in order of frequency, i.e. the most common meaning is shown first. The 3000 most common words in English are printed in red letters, which shows which are the most important words to learn/know. This is a very important piece of information some dictionaries neglect, and as a result, students can’t work with it.

I was very pleased with my students because they asked me some interesting questions during today’s lesson; for example, they asked me to type in pairs such as colour/color, favourite/favorite because they wanted to see which spelling was more frequent. Once again, it was interesting to think about the why. This pushed the discussion into a different dimension. Ironically, here in the Czech Republic we like to say and believe that we teach British English, using coursebooks published in the UK, yet from the global perspective, the American way of spelling of certain words seems to be more common. This finding may subsequently lead to an interesting debate about the role of English in today’s changing world. Some other words my students were interested in were: football/soccer, black/white, film/movie, cinema/theatre, etc.


Making lessons authentic via the use of corpora

In one of my previous posts, I talked about a simple way of using corpora in class. I truly believe that a corpus can be a handy tool for any language learner, but the size of a corpus, as well as its layout, usually appears daunting at first glance, especially to less proficient learners of English. There’s no need to ask your students to laboriously analyse L2 data from a huge corpus when they still struggle with the language itself. In other words, as corpora are collections of authentic language, I estimate that an average A2 learner will find the enormous amount of data and the level of the language somewhat off-putting. From a teacher’s point of view, one of the prerequisites of a successful corpus-based lesson is its simplicity; it’s sufficient to show one simple, practical thing at a time.

Let me give you an example. My intermediate students (B1-B2) need to practise various written genres. Last time they were asked to write a formal letter as a reply to a job advertisement. I normally stick to a very commonplace procedure: I collect the assignments and correct and grade them at home, using various colour codes and abbreviations. I select the most recurring errors from all the essays, and afterwards I give my students detailed feedback (I wrote about it in detail here). However, I’m convinced there is more you can do for your students’ language progress.

Here are a couple of excerpts from a student’s writing I’ve just corrected:

Dear Sir or Madam,

I am writting to apply for the post of a part-time job, which I saw on a billboard next to the hospital. 

………. I consider myself to be accommodating, hard-working and I am good in talking and playing with children. 

……. I am enclosing my CV.
……. I look forward to hearing from you. 

For starters, you can teach your students a very simple thing – how to check the frequency of certain phrases and how to look for alternatives. As you can see, my student had decided to use Dear Sir or Madam at the beginning of the letter. This is fine, but I can ask the class if they are familiar with other ways of addressing people in formal correspondence. Let’s first look at the student’s choice, namely at how frequent it is in comparison with other possibilities. Dear Sir or Madam got 7 hits in the British National Corpus.

I remember that when I was an intermediate learner myself, we were taught that we can also use a plural form, Dear Sirs or Madams. Let’s check what the BNC has to say …

The empty result may imply that this way of greeting people is pretty unnatural. When checking out other possibilities, you’ll come across an option that is more frequent than the other two above – Dear Sir/Madam (26 hits). However, although the corpus shows that this way of addressing people is quite common, it doesn’t say if it’s always the best option. It turns out that under certain circumstances, it is safer to opt for a different greeting.

Let’s have a look at another chunk from the student’s writing I find worth focusing on a bit: I am writing to [apply for]. Now, I’d zoom in on I am writing to …  The first question I would ask my students is: Apart from apply, what other verbs can follow? You can find out by looking at the right collocation candidates. You’ll get the following examples:

I am writing to confirm ….
I am writing to express ….
I am writing to inform ….
I am writing to thank ….
I am writing to offer ….

You might want to work with the above chunks and ask your students to complete them with their own answers. What do we normally confirm/express? What preposition do we typically use with thank/inform? What can you offer? Alternatively or additionally, you can check the corpus again; by clicking on a few example sentences you can see what other users of English opted for.
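
If you would rather not rely on a corpus interface, the idea behind right collocates can be sketched in a few lines of Python: find the phrase and count whichever word comes straight after it. This is only a rough illustration; the function name is my own, and the sentences are invented rather than taken from the BNC.

```python
import re
from collections import Counter

def right_collocates(sentences, phrase):
    """Count the word that immediately follows the phrase in each sentence."""
    pattern = re.compile(re.escape(phrase) + r"\s+([A-Za-z']+)", re.IGNORECASE)
    following = Counter()
    for sentence in sentences:
        for match in pattern.finditer(sentence):
            following[match.group(1).lower()] += 1
    return following

# Invented sentences for illustration only, not taken from the BNC.
sample = [
    "I am writing to confirm our meeting on Friday.",
    "I am writing to thank you for your help.",
    "I am writing to confirm my booking.",
    "I am writing to inform you of a change.",
]

for word, count in right_collocates(sample, "I am writing to").most_common():
    print(word, count)
```

A real corpus tool does much more than this (it can rank by association rather than raw counts, for instance), but the principle of looking at the slot to the right of the phrase is the same.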

There’s one more thing I’d definitely elaborate on in a follow-up lesson and that is the phrase: I consider myself …. You can let your students figure out for themselves that it’s possible to say I consider myself to be [adjective/noun] or just I consider myself [adjective/noun]. I believe that their own discovery will make the structures more memorable than if they just saw them in a grammar table in their coursebook. Encourage your students to only focus on the red parts of the sentences and their immediate surroundings. Although there are tons of other things you can do with the sample sentences, this is, for the time being, just enough for an intermediate learner of English.

What comes to mind now is a personalised speaking activity. What qualities would you ascribe to yourself/your friends/other members of your family? I consider myself/my best friend/my brother …

The activities I describe above constitute the base of a very authentic lesson, which draws on students’ own written production, as well as examples of writing of other users of English. Such a lesson elaborates on what students already know, plus it demonstrates how to work with a very useful online tool.

Apart from corpora, there are other tools that work in the same vein, such as FrazeIt, Just The Word, or Netspeak, which are probably more user-friendly since you don’t need to register for them. Needless to say, Google is the easiest and most accessible web-based tool for working with authentic language, and it comes in handy when one needs to look up something really quickly.