Data-centric reports
Dashboard Reports
The project dashboard presents data-centric reports about the project's data. It currently consists of the following reports.
Check entity data bias from entity distribution
Entity distribution shows how many annotations belong to each entity. If a particular entity has noticeably fewer or more annotations than the other entities, the dataset is biased against or towards that entity. This report indicates what kind of new data should be sourced into the project to balance the entity distribution.
Entities can be edited directly by clicking the edit button.
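The idea behind the entity distribution report can be sketched in a few lines of Python. This is an illustration only, not the dashboard's implementation: the (word, entity) input format, the function names, and the tolerance threshold are all assumptions made for the example.

```python
from collections import Counter


def entity_distribution(annotations):
    """Count how many annotations belong to each entity.

    `annotations` is a hypothetical list of (word, entity) pairs.
    """
    return Counter(entity for _, entity in annotations)


def flag_imbalance(distribution, tolerance=0.5):
    """Label each entity relative to the mean annotation count.

    `tolerance` is an assumed threshold: an entity is flagged when its
    count deviates from the mean by more than this fraction.
    """
    if not distribution:
        return {}
    mean = sum(distribution.values()) / len(distribution)
    flags = {}
    for entity, count in distribution.items():
        if count < mean * (1 - tolerance):
            flags[entity] = "under-represented"
        elif count > mean * (1 + tolerance):
            flags[entity] = "over-represented"
        else:
            flags[entity] = "balanced"
    return flags


# Made-up sample data for illustration.
annotations = [
    ("London", "LOCATION"), ("Paris", "LOCATION"), ("Berlin", "LOCATION"),
    ("Rome", "LOCATION"), ("Madrid", "LOCATION"), ("Oslo", "LOCATION"),
    ("Alice", "PERSON"), ("Bob", "PERSON"), ("Carol", "PERSON"),
    ("Acme", "ORGANIZATION"),
]
distribution = entity_distribution(annotations)
flags = flag_imbalance(distribution)
for entity, count in distribution.most_common():
    print(entity, count, flags[entity])
```

In this sample, an under-represented entity such as ORGANIZATION would be the natural target when sourcing new data.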
Classifications
Classifications lists the words classified against each entity. This table helps identify which words are classified as which entities. If a word is classified as more than one entity, it is important to review those annotations and confirm that each one is valid.
This table also shows the number of occurrences of each annotated word in the dataset.
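As a rough sketch of what this table captures, the following Python groups annotations by word and picks out the words that carry more than one entity. The (word, entity) pair format and the function names are hypothetical, chosen only for this example.

```python
from collections import Counter, defaultdict


def classification_table(annotations):
    """Map each word to a count of the entities it was classified as.

    `annotations` is a hypothetical list of (word, entity) pairs.
    """
    table = defaultdict(Counter)
    for word, entity in annotations:
        table[word][entity] += 1
    return table


def ambiguous_words(table):
    """Return the words classified as more than one entity."""
    return {
        word: dict(counts)
        for word, counts in table.items()
        if len(counts) > 1
    }


# Made-up sample data for illustration.
annotations = [
    ("Apple", "ORGANIZATION"),
    ("Apple", "PRODUCT"),
    ("Apple", "ORGANIZATION"),
    ("Paris", "LOCATION"),
]
table = classification_table(annotations)
to_review = ambiguous_words(table)
occurrences = {word: sum(counts.values()) for word, counts in table.items()}
print(to_review)
print(occurrences)
```

Words returned by ambiguous_words are the ones worth verifying by hand; a word can legitimately belong to several entities, so the result is a review list rather than a list of errors.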
Missed Classifications
Missed classifications show the words that have been classified in one place in the dataset but missed at other locations. This report is important for identifying mistakes where the annotator might have either missed classifying a word or classified it wrongly.
Sorting on Missed Records surfaces the annotations that the annotator might have missed.
Sorting on Classification Counts surfaces the annotations that the annotator might have annotated wrongly.
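One way to picture this report is shown below. It is a sketch under stated assumptions, not the dashboard's actual logic: the dataset is modelled as a flat list of tokens, annotations as a hypothetical mapping from token index to entity, and words are matched by exact, case-sensitive comparison. The row keys mirror the column names mentioned above but are otherwise invented.

```python
from collections import Counter


def missed_classifications(tokens, labels):
    """List words that are annotated somewhere but unannotated elsewhere.

    `tokens` is a list of words; `labels` is a hypothetical dict mapping
    a token index to the entity assigned at that position.
    """
    total = Counter(tokens)
    classified = Counter(tokens[index] for index in labels)
    rows = []
    for word, count in classified.items():
        missed = total[word] - count
        if missed > 0:
            rows.append({
                "word": word,
                "classification_count": count,
                "missed_records": missed,
            })
    return rows


# Made-up sample data for illustration.
tokens = "Acme hired Alice . Alice joined Acme after Alice left Initech .".split()
labels = {0: "ORGANIZATION", 2: "PERSON", 10: "ORGANIZATION"}

rows = missed_classifications(tokens, labels)

# Most-missed words first: likely spots the annotator skipped.
by_missed = sorted(rows, key=lambda r: r["missed_records"], reverse=True)

# Fewest classifications first: a word labelled only once among many
# unlabelled occurrences may itself be the wrong annotation.
by_count = sorted(rows, key=lambda r: r["classification_count"])

for row in by_missed:
    print(row)
```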
Anomalies
Anomalies highlight annotations that the system suspects might be mistakes. The annotations in the underlying data might need to be corrected.
Unclassified words
This table helps identify unclassified words in the project. Instead of browsing the dataset, the user can sort the words by word length and number of occurrences.
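A minimal sketch of such a table follows. It assumes, for illustration only, that the dataset is a flat list of tokens, that annotations are a hypothetical mapping from token index to entity, and that a word counts as unclassified when it is never annotated anywhere.

```python
from collections import Counter


def unclassified_words(tokens, labels):
    """Build rows for words that are never annotated in the dataset.

    `tokens` is a list of words; `labels` is a hypothetical dict mapping
    a token index to the entity assigned at that position.
    """
    classified = {tokens[index] for index in labels}
    counts = Counter(tok for tok in tokens if tok not in classified)
    return [
        {"word": word, "length": len(word), "occurrences": n}
        for word, n in counts.items()
    ]


# Made-up sample data for illustration.
tokens = "the server restarted and the server failed".split()
labels = {2: "EVENT"}  # only "restarted" is annotated

rows = unclassified_words(tokens, labels)

# Most frequent first, longer words first among ties.
rows.sort(key=lambda r: (-r["occurrences"], -r["length"]))
for row in rows:
    print(row)
```

Sorting this way tends to push frequent, longer words to the top, which are often better annotation candidates than short function words.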
Filtering words
The default filter hides the words that the user has marked as hidden. This can be changed using the dropdown.
Default shows only the words that are not marked hidden, Show All shows all words, and Hidden word shows only the hidden words.
Words can be hidden or shown based on the user's preference. To hide or show a word, click the hide button on the left side of the row associated with that word.
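The three dropdown options behave like a simple filter over the word list. The sketch below illustrates that behaviour; the WordFilter enum and the apply_filter function are hypothetical names for this example and do not reflect how the dashboard is implemented.

```python
from enum import Enum


class WordFilter(Enum):
    """Dropdown options, named after the labels in the dashboard."""
    DEFAULT = "Default"
    SHOW_ALL = "Show All"
    HIDDEN = "Hidden word"


def apply_filter(words, hidden, mode=WordFilter.DEFAULT):
    """Return the words visible under the selected filter mode.

    `hidden` is the set of words the user has marked as hidden.
    """
    if mode is WordFilter.SHOW_ALL:
        return list(words)
    if mode is WordFilter.HIDDEN:
        return [word for word in words if word in hidden]
    # Default: only the words that are not marked hidden.
    return [word for word in words if word not in hidden]


# Made-up sample data for illustration.
words = ["server", "the", "failed", "and"]
hidden = {"the", "and"}

print(apply_filter(words, hidden))
print(apply_filter(words, hidden, WordFilter.SHOW_ALL))
print(apply_filter(words, hidden, WordFilter.HIDDEN))
```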
Algorithms trained
Shows at a glance how well each algorithm has been trained on the dataset, and which algorithm is the best trained.