Using corpora to enhance translation practice, a webinar with Ana Frankenberg-Garcia

Hosted by the ITI French Network

Sarah Bowyer and Dean Evans have joined forces for this event report, sharing their different perspectives, as a newcomer to the world of corpora, and as a committed fan looking to delve deeper.

Getting my feet wet: Sarah Bowyer

I have long been embarrassed to admit how little I knew about corpora. I’d managed to get away without knowing what they were for quite some time since the topic only seemed to come up in passing and very few fellow translators had ever mentioned that they used corpora in their day-to-day work. Luckily, the French Network’s Events Officers Alanah Reynor and Holly-Anne Whyte showed me what I’d been missing with an enlightening webinar from Dr Ana Frankenberg-Garcia from the University of Surrey.

Assuming no prior knowledge (to my great relief!), Ana reassuringly explained that, at its most straightforward, a corpus is simply a body of text compiled for linguistic research. Corpora can be searched using specialist technology, known as corpus linguistic software, that has been optimised to answer questions about specific language use. Some general corpora can contain billions of words, while others are more targeted to specialist areas such as pharmaceutical language or even individual documents. As a legal translator, my ears also pricked up at the brief mention of parallel corpora (source texts and translations) such as those available to compare EU law.

Corpora are clearly useful in the academic world and I was interested to hear about how they are used in the study of languages and to compile dictionaries. But what really appealed to me was how I could put them to practical use as a translator. It is important to note that corpora give us “a snapshot of what language is really like” (which may even include mistakes, since none of us are – or should that be is? – perfect) and are not contrived examples or subjective opinions. Concordance searches can show us how terms are actually used in context (for example, when would we say “she is married with”, rather than “she is married to” in English?) and collocation queries can help us to see how words are used in combination and to unpick the differences between similar words such as “avis” and “opinion”. All of this is clearly helpful for terminology research and when justifying translation or editing decisions.
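For readers curious about what a concordance search and a collocation query actually do, here is a minimal Python sketch. It is only an illustration of the idea, not how Sketch Engine or any other tool works internally: the four sample sentences are invented, and the function names (tokenize, kwic, right_collocates) are hypothetical.

```python
import re
from collections import Counter

# Toy corpus invented for illustration; a real corpus holds millions of words.
SAMPLE_CORPUS = [
    "She is married to a doctor from Lyon.",
    "He was married to his work for years.",
    "She is married with two children.",
    "They have been married to each other since 1990.",
]


def tokenize(text):
    """Lowercase a sentence and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def kwic(sentences, keyword, width=3):
    """Keyword-in-context lines: up to `width` words either side of each hit."""
    lines = []
    for sentence in sentences:
        tokens = tokenize(sentence)
        for i, token in enumerate(tokens):
            if token == keyword:
                left = " ".join(tokens[max(0, i - width):i])
                right = " ".join(tokens[i + 1:i + 1 + width])
                lines.append(f"{left} [{token}] {right}".strip())
    return lines


def right_collocates(sentences, keyword):
    """Count the word that immediately follows each occurrence of `keyword`."""
    counts = Counter()
    for sentence in sentences:
        tokens = tokenize(sentence)
        for i, token in enumerate(tokens[:-1]):
            if token == keyword:
                counts[tokens[i + 1]] += 1
    return counts


if __name__ == "__main__":
    # Concordance view: every hit for "married" with its surrounding words.
    for line in kwic(SAMPLE_CORPUS, "married"):
        print(line)
    # Collocation view: which word follows "married", and how often.
    print(right_collocates(SAMPLE_CORPUS, "married").most_common())
```

In this toy sample “married to” turns up three times and “married with” once (in “married with two children”). That is the kind of evidence a real concordance search provides, only on a vastly larger scale.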

Ana suggested some great places to start for beginners like me, including Sketch Engine for Language Learning. I left the webinar entirely persuaded that incorporating the use of corpora into my translation process would be helpful in my own work. I have since made the reasonably affordable investment in Sketch Engine and use it almost every day. I am sure that I can learn a lot more about its functionality to get even more out of the investment, but that’s a topic for another session!

Diving deeper: Dean Evans

I wasn’t a complete corpora newbie before Ana’s webinar. My current tool of choice is Sketch Engine, and I’ve been using it for my translations – and general writing – for a couple of years now. Yet I’ve always had a niggling feeling that I wasn’t using corpora to their full potential, so it was great to listen to Ana delve deep into how we can really use them to enhance our translation practice.

I normally turn to Sketch Engine for collocation queries, but Ana also explained the concordance search feature and how to use it to decide which phrasing works best in a particular context. For instance, is it better to write “look forward to” or “looking forward to” in formal letters?

She also taught us that we can upload documents to corpus software to extract the most common terms. This would be useful when preparing for an interpreting assignment or for creating a glossary before getting on with a translation. And it would be particularly handy to ensure consistency if a project were to be split between a number of translators.
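As a rough idea of what term extraction involves, the Python sketch below counts the most frequent content words in a short text. It rests on some loud assumptions: the sample text is invented, the stopword list is tiny and hand-picked, and the function name extract_terms is hypothetical. Dedicated corpus tools are more sophisticated than this and typically do more than count raw frequencies.

```python
import re
from collections import Counter

# Tiny hand-picked stopword list (an assumption; real tools use far larger lists).
STOPWORDS = {
    "the", "a", "an", "of", "and", "to", "in", "is", "are", "be",
    "for", "on", "or", "by", "shall", "this", "that", "with", "may",
}

# Sample text invented for illustration.
SAMPLE_TEXT = (
    "The lessee shall pay the rent to the lessor. "
    "The lessor may terminate the lease if the rent is unpaid. "
    "The lease binds the lessee and the lessor."
)


def extract_terms(text, top_n=5):
    """Return the `top_n` most frequent words, ignoring stopwords and very short words."""
    words = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    return counts.most_common(top_n)


if __name__ == "__main__":
    # Print candidate glossary terms with their frequencies.
    for term, freq in extract_terms(SAMPLE_TEXT, top_n=4):
        print(f"{term}: {freq}")
```

A frequency list like this is only a starting point for a glossary. Each candidate term still needs checking in context before it is shared with other translators on a project.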

Ana explained that learning to use corpora was now compulsory on the MA course at Surrey. One of the most interesting parts of her talk for me was hearing about projects completed by her students, who have created their own bilingual corpora in diverse fields such as ceramics, fitness and knitting – what an excellent way to gain insight into the terminology used by specialists in your field and language combinations!

It was great to see a lively chat box too, with attendees sharing useful links to free corpus tools they use. Examples included the GloWbE web-based English corpus, Leipzig University’s corpus and WebCorp Live. For those better acquainted with corpus technology, Ana herself recommended AntConc.

A big thanks to Ana for a very informative talk, and to our events team Holly-Anne Whyte and Alanah Reynor for organising it!