CS1012 Natural Language Processing Syllabus

CS1012        NATURAL LANGUAGE PROCESSING                3  0  0  100

The aim is to expose students to the basic principles of language processing and to typical applications of natural language processing systems.

•    To provide a general introduction, including the use of finite-state automata for language processing
•    To provide the fundamentals of syntax, including a basic parser
•    To explain advanced features such as feature structures and realistic parsing methodologies
•    To explain basic concepts of semantic processing
•    To give details about typical natural language processing applications

UNIT I        INTRODUCTION                            6
Introduction: Knowledge in speech and language processing – Ambiguity – Models and Algorithms – Language, Thought and Understanding. Regular Expressions and Automata: Regular expressions – Finite-State automata. Morphology and Finite-State Transducers: Survey of English morphology – Finite-State Morphological parsing – Combining FST lexicon and rules – Lexicon-Free FSTs: The Porter stemmer – Human morphological processing.

UNIT II     SYNTAX                                 10
Word classes and part-of-speech tagging: English word classes – Tagsets for English – Part-of-speech tagging – Rule-based part-of-speech tagging – Stochastic part-of-speech tagging – Transformation-based tagging – Other issues. Context-Free Grammars for English: Constituency – Context-Free rules and trees – Sentence-level constructions – The noun phrase – Coordination – Agreement – The verb phrase and subcategorization – Auxiliaries – Spoken language syntax – Grammar equivalence and normal form – Finite-State and Context-Free grammars – Grammars and human processing. Parsing with Context-Free Grammars: Parsing as search – A Basic Top-Down parser – Problems with the basic Top-Down parser – The Earley algorithm – Finite-State parsing methods.

UNIT III     ADVANCED FEATURES AND SYNTAX                11
Features and Unification: Feature structures – Unification of feature structures – Feature structures in the grammar – Implementing unification – Parsing with unification constraints – Types and Inheritance. Lexicalized and Probabilistic Parsing: Probabilistic context-free grammars – Problems with PCFGs – Probabilistic lexicalized CFGs – Dependency Grammars – Human parsing.

UNIT IV     SEMANTICS                                  10
Representing Meaning: Computational desiderata for representations – Meaning structure of language – First order predicate calculus – Some linguistically relevant concepts – Related representational approaches – Alternative approaches to meaning. Semantic Analysis: Syntax-Driven semantic analysis – Attachments for a fragment of English – Integrating semantic analysis into the Earley parser – Idioms and compositionality – Robust semantic analysis. Lexical Semantics: Relations among lexemes and their senses – WordNet: A database of lexical relations – The internal structure of words – Creativity and the lexicon.

UNIT V     APPLICATIONS                            8
Word Sense Disambiguation and Information Retrieval: Selectional restriction-based disambiguation – Robust word sense disambiguation – Information retrieval – Other information retrieval tasks. Natural Language Generation: Introduction to language generation – Architecture for generation – Surface realization – Discourse planning – Other issues. Machine Translation: Language similarities and differences – The transfer metaphor – The interlingua idea: Using meaning – Direct translation – Using statistical techniques – Usability and system development.
TOTAL : 45

TEXT BOOK
1.    Daniel Jurafsky & James H. Martin, “Speech and Language Processing”, Pearson Education (Singapore) Pte. Ltd., 2002.

REFERENCE
1.    James Allen, “Natural Language Understanding”, Pearson Education, 2003.
