Details

Information Extraction: Algorithms and Prospects in a Retrieval Context


The Information Retrieval Series, Volume 21

by: Marie-Francine Moens

130,89 €

Publisher: Springer
Format: PDF
Published: 10 October 2006
ISBN/EAN: 9781402049934
Language: English
Number of pages: 246

DRM-protected eBook; to read it you need, for example, Adobe Digital Editions and an Adobe ID.

Descriptions

This book covers content recognition in text, elaborating on the most successful past and current algorithms and their application in a variety of settings: news filtering, mining of biomedical text, intelligence gathering, competitive intelligence, legal information searching, and processing of informal text. Today, there is considerable interest in integrating the results of information extraction into retrieval systems, because of the demand for search engines that return precise answers to flexible information queries.
Information extraction concerns the processes of structuring and combining content that is explicitly stated or implied in one or more unstructured information sources. It involves semantic classification and linking of certain pieces of information and is considered a light form of content understanding by the machine. Currently, there is considerable interest in integrating the results of information extraction into retrieval systems, because of the growing demand for search engines that return precise answers to flexible information queries. Advanced retrieval models satisfy that need, and they rely on tools that automatically build a probabilistic model of the content of a (multimedia) document.
The book focuses on content recognition in text. It elaborates on the most successful past and current algorithms and their application in a variety of domains (e.g., news filtering, mining of biomedical text, intelligence gathering, competitive intelligence, legal information searching, and processing of informal text). An important part discusses current statistical and machine learning algorithms for information detection and classification and integrates their results into probabilistic retrieval models. The book also presents a number of ideas towards an advanced understanding and synthesis of textual content.

The book is aimed at researchers and software developers interested in information extraction and retrieval, but its many illustrations and real-world examples also make it suitable as a handbook for students.
1 Information Extraction and Information Technology.- 1.1 Defining Information Extraction.- 1.2 Explaining Information Extraction.- 1.3 Information Extraction and Information Retrieval.- 1.4 Information Extraction and Other Information Processing Tasks.- 1.5 The Aims of the Book.- 1.6 Conclusions.- 1.7 Bibliography.
2 Information Extraction from an Historical Perspective.- 2.1 Introduction.- 2.2 A Historical Overview.- 2.3 The Common Extraction Process.- 2.4 A Cascade of Tasks.- 2.5 Conclusions.- 2.6 Bibliography.
3 The Symbolic Techniques.- 3.1 Introduction.- 3.2 Conceptual Dependency Theory and Scripts.- 3.3 Frame Theory.- 3.4 Actual Implementations of Symbolic Techniques.- 3.5 Conclusions.- 3.6 Bibliography.
4 Pattern Recognition.- 4.1 Introduction.- 4.2 What is Pattern Recognition?.- 4.3 The Classification Scheme.- 4.4 The Information Units to Extract.- 4.5 The Features.- 4.6 Conclusions.- 4.7 Bibliography.
5 Supervised Classification.- 5.1 Introduction.- 5.2 Support Vector Machines.- 5.3 Maximum Entropy Models.- 5.4 Hidden Markov Models.- 5.5 Conditional Random Fields.- 5.6 Decision Rules and Trees.- 5.7 Relational Learning.- 5.8 Conclusions.- 5.9 Bibliography.
6 Unsupervised Classification Aids.- 6.1 Introduction.- 6.2 Clustering.- 6.3 Expansion.- 6.4 Self-training.- 6.5 Co-training.- 6.6 Active Learning.- 6.7 Conclusions.- 6.8 Bibliography.
7 Integration of Information Extraction in Retrieval Models.- 7.1 Introduction.- 7.2 State of the Art of Information Retrieval.- 7.3 Requirements of Retrieval Systems.- 7.4 Motivation of Incorporating Information Extraction.- 7.5 Retrieval Models.- 7.6 Data Structures.- 7.7 Conclusions.- 7.8 Bibliography.
8 Evaluation of Information Extraction Technologies.- 8.1 Introduction.- 8.2 Intrinsic Evaluation of Information Extraction.- 8.3 Extrinsic Evaluation of Information Extraction in Retrieval.- 8.4 Other Evaluation Criteria.- 8.5 Conclusions.- 8.6 Bibliography.
9 Case Studies.- 9.1 Introduction.- 9.2 Generic versus Domain Specific Character.- 9.3 Information Extraction from News Texts.- 9.4 Information Extraction from Biomedical Texts.- 9.5 Intelligence Gathering.- 9.6 Information Extraction from Business Texts.- 9.7 Information Extraction from Legal Texts.- 9.8 Information Extraction from Informal Texts.- 9.9 Conclusions.- 9.10 Bibliography.
10 The Future of Information Extraction in a Retrieval Context.- 10.1 Introduction.- 10.2 The Human Needs and the Machine Performances.- 10.3 Most Important Findings.- 10.4 Algorithmic Challenges.- 10.5 The Future of IE in a Retrieval Context.- 10.6 Bibliography.
Comprehensive overview of current and past technology for information extraction
Innovative integration of information extraction and retrieval
Novel avenues for research and algorithmic development
Focus on applicability of the techniques in many domains