About
In July 2024, the 3CA.UBI organized a comprehensive 24-hour course titled “Introduction to Machine Learning and Data Science with Python.” The course brought together more than 30 students from diverse academic backgrounds, at Bachelor’s, Master’s, and PhD level. It was designed to provide a robust foundation in machine learning, deep learning, and data science, with practical, hands-on experience in popular tools such as Python, PyTorch, Pandas, Matplotlib, and Jupyter Notebooks.
The Challenge
There was a growing need to equip students from various academic disciplines with practical, cutting-edge machine learning and data science skills. The challenge was to introduce students to these advanced topics in a way that would not only cover the theoretical aspects but also empower them with the ability to solve real-world problems using data science and machine learning.
The Solution
The course curriculum was designed to balance theory with hands-on applications. Starting with Python programming basics, students were progressively introduced to more advanced topics such as machine learning algorithms and deep neural networks. Participants also learned the basics of the Linux command line and how to use SSH to access HPC environments remotely, laying the foundation for future supercomputing work.
Students worked on a real-world dataset, making predictions using various machine learning models. Alternatively, they could choose datasets from their own fields of study, leading to a variety of projects in areas such as mechanical engineering, sports analytics, and image recognition.
Participants applied machine learning techniques like data cleaning, feature engineering, and hyperparameter tuning, using tools like Scikit-learn to build accurate models. The course also featured an introduction to deep learning with PyTorch, where students built simple neural networks and explored convolutional neural networks (CNNs) for image recognition tasks.
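As an illustration only, the kind of workflow described above can be sketched with Scikit-learn as follows. This is a minimal sketch under stated assumptions, not the material used in the course: the data is synthetic, the missing values are injected artificially, feature scaling stands in for feature engineering, and the choice of logistic regression and of the values in the hypothetical `param_grid` is arbitrary.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a real dataset (assumption: not the course data).
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)

# Blank out roughly 5% of the values to mimic gaps found in real data.
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Data cleaning (median imputation), feature scaling, then a classifier.
pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
    ("model", LogisticRegression(max_iter=1000)),
])

# Hyperparameter tuning: 5-fold cross-validated search over the
# regularization strength C (illustrative values only).
param_grid = {"model__C": [0.01, 0.1, 1.0, 10.0]}
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

best_params = search.best_params_
test_accuracy = search.score(X_test, y_test)
print("Best parameters:", best_params)
print("Held-out accuracy:", round(test_accuracy, 3))
```

Keeping the imputation and scaling steps inside the pipeline means they are refit on each cross-validation fold, so no information from the validation folds leaks into preprocessing.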
Each student presented their findings, demonstrating their understanding of the concepts. The use of visualization tools like Matplotlib and Seaborn further helped them interpret and present their data effectively.
Services Provided
The instructor of the course was a postdoctoral researcher at the 3CA.UBI, supported by the EUROCC project.
Impact
The course had a significant impact on participants, enhancing their understanding of machine learning and data science. Students now possess a well-rounded skillset that enables them to tackle real-world challenges using machine learning techniques. The blend of practical and theoretical training has prepared them for future careers and academic pursuits, particularly in applying machine learning across various domains like engineering, sports, and computer vision.
Though students did not run large computations on HPC during the course, they are now equipped with the knowledge to utilize these resources effectively. The upcoming course, Hands-on Supercomputing in Machine Learning, offered by the 3CA.UBI, will build on this foundation, enabling participants to further optimize their models using HPC.