This website serves content about the Machine Learning course project. In this project we will follow several stages:

  • Data annotation
  • Feature extraction
  • Model building and training
  • Competition

Data annotation

We now begin the first and one of the most important stages of our machine learning course project. Each of you has the task of annotating a portion of the dataset that will be provided as training data. Your models can only be as good as the data you annotate, so please pay special attention to this stage.

I have developed a system for you to get your annotation files. I have created an assignment for data annotation (called Project-Annotation), and feedback for this assignment is already available. Please use the special code provided to you in the feedback section (similar to CS412ac7ba96d49fd). You will also use this code to access your personalized reports, so please make sure you can locate it.

First, please visit the link below and complete a 1-minute survey about your social media usage. FORM LINK HERE (You can log in with your SU account.)

Next, use the following link to download your data: 'http://www.onurvarol.com/Annotation-CS412-202201/data/tasks_CODE-HERE.csv'. Replace CODE-HERE with the personal code provided to you. In this file you will find 400 links for your annotation tasks (200 tweets, 200 users). Each link in the task file will take you to a Google survey, where you will need to follow the instructions given there. You must complete all annotation tasks to successfully finish this stage and continue to the next stages of the project.
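As a convenience, the URL substitution described above can be scripted. The following is a minimal sketch, not part of the official course tooling: the function names `task_file_url` and `download_task_file` are hypothetical, and the code assumes the task file is served as a plain file at the address pattern given above.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base address taken from the download link in the instructions.
BASE_URL = "http://www.onurvarol.com/Annotation-CS412-202201"


def task_file_url(code: str) -> str:
    """Build the task-file URL by substituting the personal code (hypothetical helper)."""
    code = code.strip()
    if not code:
        raise ValueError("personal code must not be empty")
    # quote() leaves alphanumeric codes unchanged and escapes anything unexpected.
    return f"{BASE_URL}/data/tasks_{quote(code)}.csv"


def download_task_file(code: str, destination: str) -> None:
    """Save the task file to a local path (hypothetical helper, needs network access)."""
    with urlopen(task_file_url(code)) as response, open(destination, "wb") as out:
        out.write(response.read())
```

For example, `task_file_url("CS412ac7ba96d49fd")` builds the address for the sample code shown earlier; use your own code from the feedback section instead.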

Your grade and the success of your models will depend on your performance in this phase, so please allow adequate time and concentration for this phase.

Annotation Statistics

We can study how long it takes to annotate one instance. Below you can see the distribution of the number of seconds an annotator takes to process one instance.

We can analyze the progress of different annotators and how many instances they annotated for each category.



The score distribution for the annotation task is shown below. The final score has two components: the number of annotations completed (60%) and the mean accuracy of the annotations (40%).
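To make the weighting concrete, here is a small sketch of how such a score could be computed. It rests on assumptions that the page does not state: that completion is measured as the fraction of the 400 assigned tasks, that mean accuracy lies between 0 and 1, and that the result is scaled to 100 points. The function name `annotation_score` is hypothetical, and the actual grading procedure may differ.

```python
TOTAL_TASKS = 400  # 200 tweets + 200 users, as listed in the task file


def annotation_score(completed: int, mean_accuracy: float,
                     total: int = TOTAL_TASKS) -> float:
    """Combine completion (60%) and mean accuracy (40%) into a 0-100 score.

    Assumption: completion is normalised by the number of assigned tasks.
    """
    if not 0 <= completed <= total:
        raise ValueError("completed must be between 0 and total")
    if not 0.0 <= mean_accuracy <= 1.0:
        raise ValueError("mean_accuracy must be between 0 and 1")
    completion = completed / total
    return 100 * (0.6 * completion + 0.4 * mean_accuracy)
```

Under these assumptions, an annotator who finishes half of the tasks with a mean accuracy of 0.5 would receive half of the available points.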

Feature extraction and modelling

Our competition continues with model training and evaluation. We are providing raw data for feature extraction and a labeled dataset for initial training. In the first round you will build models and submit your predictions as text files.

Please first download your own annotations using the link created for you: 'http://www.onurvarol.com/Annotation-CS412-202201/reports/report_CODE-HERE.html'. Replace CODE-HERE with the personal code provided to you. (Some students forgot to fill out the user survey, so their personal reports were generated using their 5-digit student ID. If you are one of these students, please use your student ID in place of the code.)

Competition

We will run this competition in 3 rounds. Details of the competition can be found below.
Round  Status    Start date  End date    Results
1      FINISHED  30/12/2022  10/01/2022  RESULTS PAGE
2      FINISHED  11/01/2022  20/01/2022  RESULTS PAGE
3      FINISHED  21/01/2022  24/01/2022  RESULTS PAGE