An Entity of Type: Thing, from Named Graph: http://dbpedia.org, within Data Space: dbpedia-live.demo.openlinksw.com

Type of coefficient

Property Value
dbo:description
  • type of coefficient (en)
  • subtype of a coefficient (translated from de)
  • a measure of association for two binary variables (translated from ca)
dbo:wikiPageWikiLink
dbp:author
  • Davide Chicco (en)
dbp:date
  • November 2020 (en)
  • January 2025 (en)
dbp:reason
  • The article only ever talks about the 2x2 case and binary variables. How does the phi coefficient extend to other cases? (en)
  • This section is a summary of claims from a single source, and contains misconceptions about the use of the F1 score in binary classification with no clear "positive" class. (en)
dbp:text
  • In order to have an overall understanding of your prediction, you decide to take advantage of common statistical scores, such as accuracy, and F1 score. However, even if accuracy and F1 score are widely employed in statistics, both can be misleading, since they do not fully consider the size of the four classes of the confusion matrix in their final score computation. Suppose, for example, you have a very imbalanced validation set made of 100 elements, 95 of which are positive elements, and only 5 are negative elements. And suppose also you made some mistakes in designing and training your machine learning classifier, and now you have an algorithm which always predicts positive. Imagine that you are not aware of this issue. By applying your only-positive predictor to your imbalanced validation set, therefore, you obtain values for the confusion matrix categories: TP = 95, FP = 5; TN = 0, FN = 0. These values lead to the following performance scores: accuracy = 95%, and F1 score = 97.44%. By reading these over-optimistic scores, then you will be very happy and will think that your machine learning algorithm is doing an excellent job. Obviously, you would be on the wrong track. On the contrary, to avoid these dangerous misleading illusions, there is another performance score that you can exploit: the Matthews correlation coefficient [40] (MCC). By considering the proportion of each class of the confusion matrix in its formula, its score is high only if your classifier is doing well on both the negative and the positive elements. In the example above, the MCC score would be undefined. By checking this value, instead of accuracy and F1 score, you would then be able to notice that your classifier is going in the wrong direction, and you would become aware that there are issues you ought to solve before proceeding. Consider this other example.
You ran a classification on the same dataset which led to the following values for the confusion matrix categories: TP = 90, FP = 4; TN = 1, FN = 5. In this example, the classifier has performed well in classifying positive instances, but was not able to correctly recognize negative data elements. Again, the resulting F1 score and accuracy scores would be extremely high: accuracy = 91%, and F1 score = 95.24%. Similarly to the previous case, if a researcher analyzed only these two score indicators, without considering the MCC, they would wrongly think the algorithm is performing quite well in its task, and would have the illusion of being successful. On the other hand, checking the Matthews correlation coefficient would be pivotal once again. In this example, the value of the MCC would be 0.14, indicating that the algorithm is performing similarly to random guessing. Acting as an alarm, the MCC would be able to inform the data mining practitioner that the statistical model is performing poorly. For these reasons, we strongly encourage to evaluate each test performance through the Matthews correlation coefficient, instead of the accuracy and the F1 score, for any binary classification problem. (en)
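
The figures in the two examples quoted above can be checked with a short script. This is a minimal sketch, not part of the quoted source: the function name `binary_scores` is hypothetical, and the formulas are the standard confusion-matrix definitions, with MCC taken as (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). A score whose denominator is zero is reported as `None`, matching the "undefined" MCC in the first example.

```python
import math


def binary_scores(tp, fp, tn, fn):
    """Return (accuracy, F1, MCC) for a 2x2 confusion matrix.

    Hypothetical helper for illustration. A score is None when its
    denominator is zero, i.e. when it is undefined.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else None

    f1_denominator = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denominator if f1_denominator else None

    # The MCC denominator vanishes whenever a whole row or column of the
    # confusion matrix is empty, e.g. when nothing is predicted negative.
    mcc_denominator = math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    mcc = (tp * tn - fp * fn) / mcc_denominator if mcc_denominator else None

    return accuracy, f1, mcc


# First example: a classifier that always predicts "positive".
always_positive = binary_scores(tp=95, fp=5, tn=0, fn=0)

# Second example: good on positives, poor on negatives.
weak_on_negatives = binary_scores(tp=90, fp=4, tn=1, fn=5)

for name, (acc, f1, mcc) in [
    ("always positive", always_positive),
    ("weak on negatives", weak_on_negatives),
]:
    mcc_text = "undefined" if mcc is None else f"{mcc:.2f}"
    print(f"{name}: accuracy={acc:.2%}, F1={f1:.2%}, MCC={mcc_text}")
```

Under these definitions the script reproduces the quoted values: accuracy and F1 stay above 90% in both cases, while the MCC is undefined in the first and about 0.14 in the second.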
dbp:title
  • Ten quick tips for machine learning in computational biology (en)
dbp:wikiPageUsesTemplate
dct:subject
rdfs:label
  • Phi coefficient (en)
  • Coeficiente phi (es)
  • 파이 계수 (ko)
  • Współczynnik fi (pl)
  • Phi相關係數 (zh)
owl:sameAs
prov:wasDerivedFrom
foaf:isPrimaryTopicOf
is dbo:wikiPageDisambiguates of
is dbo:wikiPageRedirects of
is dbo:wikiPageWikiLink of
is foaf:primaryTopic of
This content was extracted from Wikipedia and is licensed under the Creative Commons Attribution-ShareAlike 4.0 International