
Machine Learning in Python with scikit-learn

Graduated Program in Life Science, Department of Biology, ENS-PSL
BIO-AA-PG- | Machine Learning in Python with scikit-learn (ENS/Biology)
Level | Semester : PhD and Postdocs | S2
Where : Biology department, ENS
Duration : 6 weeks
Dates : April 8th – May 27th, 2024
Maximum class size : 16 students



Aurélien Wyngaard, Department of Biology, ENS
Denis Thieffry, Department of Biology, ENS




Python | Programming | Linux | Machine learning | scikit-learn

Course prerequisites

Basic familiarity with Linux and a solid grounding in Python (being able to handle NumPy arrays, ideally pandas DataFrames, and knowing how to make plots).
If you have no Linux/Unix background, you can check the first sections of an online course such as

Course objectives and description

The objective of the course is to introduce early-career life scientists to the basics of machine learning and to show how to apply it in Python with the scikit-learn package.
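To give a rough idea of what using scikit-learn looks like, here is a minimal sketch of fitting and scoring a model. It is not taken from the course material: the data are synthetic, and the variable names, the choice of logistic regression, and the 75/25 split are assumptions made purely for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic, purely illustrative data: 200 samples with 5 numerical features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
# Hypothetical binary label that depends only on the first two features.
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Hold out 25% of the samples to estimate performance on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit a linear classifier on the training set only.
model = LogisticRegression()
model.fit(X_train, y_train)

# For classifiers, score() returns the accuracy on the given data.
accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```

The same fit / score pattern applies to most scikit-learn estimators, which is why a single small example can stand in for many models.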


The course will include twelve classes (two per week), each two hours long, over a period of six weeks (with a one-week break), in April-May 2024.
A large part of each class will be devoted to practical coding exercises.
Participants should expect a few hours of homework per week.


• Participants will regularly be asked to explain their code during the classes.
• Coding exercises and quizzes will be assigned throughout the course.

Course material

The course will be based on the INRIA open online course, adapted towards biology.

2024-2025 program

• April 8th : Introduction and tabular data exploration
• April 11th : Fitting a scikit-learn model on numerical data (1)
• April 15th : Fitting a scikit-learn model on numerical data (2) and on categorical data
• April 18th : Selecting the best model (1)
• April 22nd : Selecting the best model (2) and dealing with hyperparameters
• April 25th : Linear models (1)
• April 29th : Linear models (2)
• May 2nd : Linear models (3)

Week off (May 6th-12th)

• May 13th : Decision tree models
• May 16th : Evaluating model performance (1)
• May 21st : Evaluating model performance (2) (note: this class is on a Tuesday, not a Monday)
• May 27th : Evaluating model performance (3) and ensemble of models
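
As a rough illustration of the kind of workflow the sessions above touch on (linear models, decision trees, and evaluating model performance), the sketch below compares two models with cross-validation. It is an assumption-laden example, not the course's own exercise: the data are synthetic, and the model settings (`max_depth=3`, 5 folds) are arbitrary choices.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic, purely illustrative data: 300 samples with 4 numerical features.
rng = np.random.default_rng(42)
X = rng.normal(size=(300, 4))
# Hypothetical binary label built from two of the features.
y = (X[:, 0] - 0.5 * X[:, 2] > 0).astype(int)

# Two candidate models; the linear one is preceded by feature scaling.
models = {
    "logistic regression": make_pipeline(StandardScaler(), LogisticRegression()),
    "decision tree": DecisionTreeClassifier(max_depth=3, random_state=0),
}

scores = {}
for name, model in models.items():
    # 5-fold cross-validation; for classifiers the default score is accuracy.
    scores[name] = cross_val_score(model, X, y, cv=5)
    print(f"{name}: {scores[name].mean():.2f} +/- {scores[name].std():.2f}")
```

Reporting the mean together with the spread across folds gives a more honest picture of performance than a single train/test split.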