This website serves content about the Machine Learning course project. In this project we will follow several stages:

  • Data annotation
  • Feature extraction
  • Model building and training
  • Competition

Data annotation

We now begin the first and one of the most important stages of our machine learning course project. Each of you will annotate a portion of the dataset that will be provided as training data. Your models can only be as good as the data you annotate, so please pay special attention to this stage.

I have developed a system for you to get your annotation files. I have created an assignment for data annotation (called Project-Annotation), and feedback for this assignment is already available. Please use the special code provided to you in the feedback section (similar to CS412-ac7ba96d49fd). You will also use this code to access your personalized reports, so please make sure you can locate it.

Please download your data from 'http://www.onurvarol.com/CS412_2025_InstaInfluencers/data/tasks_CODE-HERE.csv'. Replace CODE-HERE with the personal code provided to you. In this file you will find 150 links for your annotation tasks. Each link takes you to a Google survey, where you will need to follow the instructions given there. You must complete all annotation tasks to pass this stage and continue to the next stages of the project. Because Instagram enforces a strict policy against scraping, please annotate about 25 accounts each day. You will also need an Instagram account to access profile information.
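As a minimal sketch of the download and pacing steps above, the Python snippet below builds the task-file URL from a personal code and splits the task links into daily batches of about 25. The helper names `task_file_url` and `daily_batches` are hypothetical, and the column layout of the CSV file is not described here, so the sketch works on a plain list of links rather than parsing the file itself.

```python
from urllib.parse import quote

# Base address taken from the download link given above.
BASE_URL = "http://www.onurvarol.com/CS412_2025_InstaInfluencers/data"


def task_file_url(code: str) -> str:
    """Substitute a personal code for CODE-HERE in the task-file URL."""
    return f"{BASE_URL}/tasks_{quote(code.strip())}.csv"


def daily_batches(links: list[str], per_day: int = 25) -> list[list[str]]:
    """Split the task links into consecutive batches of at most `per_day`."""
    return [links[i:i + per_day] for i in range(0, len(links), per_day)]


# The code below is the example format quoted above, not a real student code.
print(task_file_url("CS412-ac7ba96d49fd"))

# Placeholder links standing in for the 150 survey links in the task file.
placeholder_links = [f"task-link-{i}" for i in range(1, 151)]
batches = daily_batches(placeholder_links)
print(len(batches), "days of", len(batches[0]), "annotations each")
```

At 25 accounts per day, the 150 tasks spread over six days, so it may help to start well before the deadline.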

Your grade and the success of your models will depend on your performance in this phase, so please give it adequate time and concentration.

Annotation Statistics

We can study how long it takes to annotate one instance. Below you can see the distribution of the number of seconds an annotator takes to process one instance.

We can analyze the progress of different annotators and how many instances they annotated for each category.



The score distribution for the annotation task is shown below. The final score has two components: the number of annotations completed (70%) and the mean accuracy of the annotations (30%).
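To make the 70/30 weighting concrete, here is a small Python sketch of how such a score could be computed. The normalization is my assumption for illustration only: the completed count is divided by the 150 assigned tasks, the mean accuracy is taken as a fraction between 0 and 1, and the result is scaled to 0-100. The function name `annotation_score` is hypothetical and the actual grading script may differ.

```python
def annotation_score(completed: int, mean_accuracy: float, assigned: int = 150) -> float:
    """Weighted score on a 0-100 scale: 70% completion, 30% mean accuracy.

    Assumptions (not confirmed by the course page): completion is the fraction
    of the assigned tasks that were annotated, capped at 1.0, and
    mean_accuracy is a fraction between 0 and 1.
    """
    if assigned <= 0:
        raise ValueError("assigned must be positive")
    if not 0.0 <= mean_accuracy <= 1.0:
        raise ValueError("mean_accuracy must be between 0 and 1")
    completion = min(max(completed, 0), assigned) / assigned
    return 100.0 * (0.7 * completion + 0.3 * mean_accuracy)


# Example: 120 of 150 tasks completed with a mean accuracy of 0.9.
print(round(annotation_score(120, 0.9), 1))
```

Under these assumptions, an annotator who finishes 120 of the 150 tasks with a mean accuracy of 0.9 scores about 83, which shows how strongly completion dominates the result.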

Feature extraction and modelling

You will be provided a sample pipeline for feature extraction and modelling when we release the information for the first round. Our competition continues with model training and evaluation. We are providing raw data for feature extraction and a labeled dataset for initial training. In the first round you will build models and submit your predictions as text files.

Please first download your own annotations using the link created for you: 'http://www.onurvarol.com/CS412_2025_InstaInfluencers/reports/report_CODE-HERE.html'. Replace CODE-HERE with the personal code provided to you. (The report will be available once the annotation task is completed!)

Competition

We will run this competition in 3 rounds. Details of the competition can be found below.
R#  Status     Start date  End date    Results
1   COMPLETED  16/12/2024  25/12/2024  RESULTS PAGE
2   COMPLETED  26/12/2024  05/01/2025  RESULTS PAGE
3   ACTIVE     06/01/2025  10/01/2025  RESULTS PAGE