W2. Homework - Pipeline
Each group must choose a classification or regression dataset from those available in the literature (no
more than 5000 examples, to avoid lengthy training) and select off-the-shelf models, namely
those covered in class and those available in the Scikit-learn Python library. Public
repositories for finding datasets of varying size and complexity include:
- UCI Machine Learning Repository
- https://github.com/caesar0301/awesome-public-datasets
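As a starting point, the following is a minimal sketch of what such a pipeline might look like. It is only an illustration under stated assumptions: the breast cancer dataset bundled with Scikit-learn, the StandardScaler preprocessing step, the LogisticRegression model and the 5-fold cross-validation are all placeholder choices made for this example, not requirements of the homework. Each group should substitute its own dataset and models.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Size limit stated in the assignment.
MAX_EXAMPLES = 5000

# Assumption: a small classification dataset bundled with Scikit-learn,
# used here only as a stand-in for the dataset the group chooses.
X, y = load_breast_cancer(return_X_y=True)
n_examples = X.shape[0]
if n_examples > MAX_EXAMPLES:
    raise ValueError(f"Dataset has {n_examples} examples; the limit is {MAX_EXAMPLES}.")

# Assumption: one preprocessing step followed by one off-the-shelf model.
pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("model", LogisticRegression(max_iter=1000)),
])

# 5-fold cross-validation; the scaler is refit on each training fold,
# so no information from the validation fold leaks into preprocessing.
scores = cross_val_score(pipeline, X, y, cv=5)
print(f"{n_examples} examples, mean accuracy: {scores.mean():.3f}")
```

Wrapping the preprocessing and the model in a single Pipeline object keeps the whole procedure reproducible from one script, which also makes it easier to regenerate the figures and results included in the report.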
The report can be a DOC document, a PDF document (accompanied by the Python scripts that generate the reported
figures and results), or a Jupyter Notebook (with saved checkpoint). Other formats (e.g., a link to Google
Colab) must be agreed with the professor.
When uploading the report, please indicate the name, surname, and ID (DNI number) of every member of the
team.