Information Extraction for Thai Documents

Sukhahuta, Rattasit and Smith, Dan J. (2000) Information Extraction for Thai Documents. In: Proceedings of the fifth international workshop on Information retrieval with Asian languages (IRAL 2000), 2000-09-30 - 2000-10-01.

Full text not available from this repository.

Abstract

An increasing amount of electronically available information is stored in Asian language documents, which makes Information Retrieval (IR) and Information Extraction (IE) for these languages important for a large number of users. Analysis and extraction of information in these languages present several problems not seen in Western European languages; these are interesting in their own right and for the insights they can give into more general IR and IE techniques. We describe these problems and our system for Thai language IE. One of the main concerns when working with Thai is that the structure of the language itself is highly ambiguous. The analyser therefore requires more sophisticated techniques and large amounts of domain knowledge to cope with these ambiguities. We describe our approach to a natural language analysis system that performs preprocessing for the Thai language, together with an extraction module that retrieves specific information according to predefined concept definitions.

Item Type: Conference or Workshop Item (Paper)
Faculty \ School: Faculty of Science > School of Computing Sciences
Depositing User: Vishal Gautam
Date Deposited: 26 Aug 2011 15:13
Last Modified: 18 Mar 2020 08:31
URI: https://ueaeprints.uea.ac.uk/id/eprint/23288
DOI: 10.1145/355214.355229
