24C3 - 1.01

24th Chaos Communication Congress
Volldampf voraus!

Speakers
Martin Haase/maha
Schedule
Day 2 (2007-12-28)
Room Saal 2
Start time 16:00
Duration 01:00
Info
ID 2284
Event type lecture
Track Science
Language en

Linguistic Hacking

How to know what a text in an unknown language is about?

Sometimes you need to know what a text is about even though it is written in a language you do not know. This is especially difficult if you do not even know which language it is written in. This talk shows how to identify the language of a written text and extract at least some information about its contents, so that you can decide whether a specialist is needed to learn more and, if so, which one.

The talk deals with the following issues:

1 How to identify a language

  • texts in non-Roman writing systems, and how the writing system can indicate which language we are dealing with,
  • how to identify languages with the help of sample texts,
  • tricks that help to make at least an intelligent guess.
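
As a rough illustration of the first two points, the following Python sketch guesses the writing system from Unicode character names and then compares the words of a text against short lists of function words. It is only a sketch under stated assumptions, not the method presented in the talk: the function names and the tiny word lists are hypothetical, and a usable profile would have to be built from real sample texts.

```python
import re
import unicodedata
from collections import Counter

# Hypothetical, hand-picked function words (assumption for illustration only);
# a real profile would be derived from sample texts in each language.
FUNCTION_WORDS = {
    "English": {"the", "and", "of", "to", "in", "is", "that", "it"},
    "German": {"der", "die", "das", "und", "nicht", "ist", "ein", "zu"},
    "French": {"le", "la", "les", "et", "de", "est", "un", "pas"},
}


def dominant_script(text):
    """Return the most common script among the letters of text, or None."""
    counts = Counter()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                # Unicode names of letters usually start with the script,
                # e.g. "CYRILLIC SMALL LETTER YA" or "GREEK SMALL LETTER ALPHA".
                counts[name.split()[0]] += 1
    return counts.most_common(1)[0][0] if counts else None


def guess_language(text, profiles=FUNCTION_WORDS):
    """Return the profile whose function words occur most often, or None."""
    words = re.findall(r"\w+", text.lower())
    scores = {
        lang: sum(word in vocab for word in words)
        for lang, vocab in profiles.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

For example, dominant_script("Привет, мир") returns "CYRILLIC", which narrows the candidates to languages written in Cyrillic but does not decide between them. The function-word comparison is a second, equally rough step for telling apart languages that share a script.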

2 How to get an idea about the contents of a text

  • identifying (important) content words and grammar,
  • quick and dirty translations,
  • how to translate a text from a language you hardly know.
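
One crude way to find candidate content words without knowing the language is to count word frequencies and ignore very short words, since function words tend to be short. The Python sketch below illustrates this heuristic. The function name and the length threshold are assumptions made for illustration, and the heuristic will fail for many languages (for example those with long function words or no spaces between words).

```python
import re
from collections import Counter


def candidate_content_words(text, min_len=4, top=5):
    """Return the most frequent words of at least min_len letters.

    Heuristic assumption: short, very frequent words are mostly function
    words, so the longer frequent words are likelier to carry the topic.
    """
    words = re.findall(r"\w+", text.lower())
    counts = Counter(word for word in words if len(word) >= min_len)
    return counts.most_common(top)
```

The words returned this way are the ones worth looking up first in a dictionary or a web resource when attempting a quick and dirty translation.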

The talk will introduce a variety of means, ranging from pre-internet (and pre-computational) approaches to contemporary web resources.