Programming for Data Science

Instructor(s): Dr. C. Berberidis
Teaching Hours and Credit Allocation: 30 Hours, 6 Credits
Course Assessment: Exam and Coursework

Aims

The course examines fundamental programming concepts and principles in the context of Data Science and teaches students to approach problems the way a Data Scientist does. It covers data selection, iteration and abstraction, functional decomposition, and algorithm design as applied in the programming languages, tools and APIs typically used in Data Science. Students will also learn to produce high-quality code by solving real Data Science problems.

Learning Outcomes

On completing the course, students will be able to:

  • Understand and apply computational thinking in terms of programming methods and data structures.
  • Capture and represent data, and carry out basic data analysis, processing and visualization tasks.
  • Become proficient in the basic data analysis algorithms and their implementation.
  • Use software tools and programming languages that are particularly suitable for data science and analytics.

Content

  • Data science methodologies.
  • Types of data, hierarchy and representation.
  • Basic data processing and analysis tasks and algorithms.
  • Data analysis software tools and programming languages.
  • Parallel and distributed programming acceleration techniques.

Reading

  • Igual L., Seguí S., Vitrià J. et al. (2017), Introduction to data science: a Python approach to concepts, techniques and applications, Springer.
  • McKinney W. (2012), Python for data analysis: data wrangling with Pandas, NumPy and IPython, O’Reilly.
  • Wickham H., Grolemund G. (2017), R for data science: import, tidy, transform, visualize and model data, O’Reilly.