Information Extraction for Thai Documents

Sukhahuta, Rattasit and Smith, Dan J. (2000) Information Extraction for Thai Documents. In: 5th International Workshop on Information Retrieval with Asian Languages, 2000-09-30 - 2000-10-01.

Full text not available from this repository.

Abstract

An increasing amount of electronically available information is stored in Asian language documents, which makes Information Retrieval (IR) and Information Extraction (IE) for these languages important for a large number of users. Analysis and extraction of information in these languages present several interesting problems not seen in Western European languages; these are interesting in their own right and for the insights they can give into more general IR and IE techniques. We describe these problems and our system for Thai language IE. One of the main concerns when working with Thai natural language is that the structure of the language itself is highly ambiguous. The analyser therefore requires more sophisticated techniques and large amounts of domain knowledge to cope with these ambiguities. We describe our approach to a natural language analysis system that performs preprocessing for the Thai language, and the extraction module that retrieves specific information according to predefined concept definitions.

Item Type: Conference or Workshop Item (Paper)
Faculty \ School: Faculty of Science > School of Computing Sciences
UEA Research Groups: Faculty of Science > Research Groups > Smart Emerging Technologies
Depositing User: Vishal Gautam
Date Deposited: 26 Aug 2011 15:13
Last Modified: 15 Dec 2022 01:07
URI: https://ueaeprints.uea.ac.uk/id/eprint/23288
DOI: 10.1145/355214.355229
