Natural Language Processing and Text Mining

Instructors: Dr. Berberidis, Prof. Papadopoulos
Teaching Hours and Credit Allocation: 30 Hours, 6 Credits
Course Assessment: Exam & Coursework

 

 

Aims

This course covers the main principles and techniques of Natural Language Processing (NLP) and its associated computational tools, especially with regard to written text. It provides the required background in computational linguistics and statistical language analysis and describes the machine-learning-based models that are widely used for such analysis. Typical NLP tasks, such as text parsing, classification, and translation, are also described, and students gain familiarity with widely used software tools for these tasks.

 

Learning Outcomes

On completing the course, students will be able to:

  • Understand how natural language processing (NLP) draws upon other areas of computer science and data analysis.
  • Design and build computer systems and software for a range of NLP tasks.
  • Understand and implement the most important algorithms and techniques in NLP and text mining.
  • Formulate models and construct computational solutions to text- and speech-based processing problems.

 

Content

  • Introduction to natural language processing and its challenges.
  • Syntax and parsing (syntactic, semantic).
  • Language and speech modeling.
  • Text classification and clustering.
  • Sentiment analysis.
  • Machine translation.

 

Reading

  • Manning C., Schütze H. (1999), Foundations of statistical natural language processing, MIT Press.
  • Jurafsky D., Martin J. (2008), Speech and language processing, Prentice Hall, 2nd edition.
  • Bird S., Klein E., Loper E. (2009), Natural language processing with Python: analyzing text with the Natural Language Toolkit, O’Reilly.