About: Phi coefficient

Property	Value
dbo:description	type of coefficient (en) subtype of a coefficient (de, translated) a measure of association for two binary variables (ca, translated)
dbo:wikiPageWikiLink	dbc:Machine_learning dbr:Machine_learning dbc:Cheminformatics dbc:Statistical_ratios dbr:Udny_Yule dbc:Information_retrieval_evaluation dbr:Confusion_matrix dbr:Fowlkes–Mallows_index dbr:Statistics dbr:Karl_Pearson dbr:Binary_classification dbr:Computational_biology dbr:Accuracy dbc:Bioinformatics dbr:Pearson_correlation_coefficient dbr:Correlation_and_dependence dbr:Polychoric_correlation dbr:Point-biserial_correlation_coefficient dbr:Markedness dbr:Youden's_J_statistic dbc:Statistical_classification dbr:Geometric_mean dbr:Brian_Matthews_(biochemist) dbc:Computational_chemistry dbc:Summary_statistics_for_contingency_tables dbr:Contingency_table dbr:Cohen's_kappa dbr:Dual_(mathematics) dbr:BMC_Genomics dbr:BioData_Mining dbr:False_positive dbr:False_negative dbr:F1_score dbr:Binary_variables dbr:Regression_coefficient dbr:Cramér's_V_(statistics) dbr:Pearson's_chi-square_test dbr:Informedness dbr:True_negative dbr:True_positive dbr:Measure_of_association
dbp:author	Davide Chicco (en)
dbp:date	November 2020 (en) January 2025 (en)
dbp:reason	The article only ever talks about the 2x2 case and binary variables. How does the phi coefficient extend to other cases? (en) This section is a summary of claims from a single source, and has some misconceptions about the use of the F1 score in binary classification with no clear "positive" class. (en)
dbp:text	In order to have an overall understanding of your prediction, you decide to take advantage of common statistical scores, such as accuracy, and F1 score. However, even if accuracy and F1 score are widely employed in statistics, both can be misleading, since they do not fully consider the size of the four classes of the confusion matrix in their final score computation. Suppose, for example, you have a very imbalanced validation set made of 100 elements, 95 of which are positive elements, and only 5 are negative elements. And suppose also you made some mistakes in designing and training your machine learning classifier, and now you have an algorithm which always predicts positive. Imagine that you are not aware of this issue. By applying your only-positive predictor to your imbalanced validation set, therefore, you obtain values for the confusion matrix categories: TP = 95, FP = 5; TN = 0, FN = 0. These values lead to the following performance scores: accuracy = 95%, and F1 score = 97.44%. By reading these over-optimistic scores, then you will be very happy and will think that your machine learning algorithm is doing an excellent job. Obviously, you would be on the wrong track. On the contrary, to avoid these dangerous misleading illusions, there is another performance score that you can exploit: the Matthews correlation coefficient [40]. By considering the proportion of each class of the confusion matrix in its formula, its score is high only if your classifier is doing well on both the negative and the positive elements. In the example above, the MCC score would be undefined. By checking this value, instead of accuracy and F1 score, you would then be able to notice that your classifier is going in the wrong direction, and you would become aware that there are issues you ought to solve before proceeding. Consider this other example.
You ran a classification on the same dataset which led to the following values for the confusion matrix categories: TP = 90, FP = 4; TN = 1, FN = 5. In this example, the classifier has performed well in classifying positive instances, but was not able to correctly recognize negative data elements. Again, the resulting F1 score and accuracy scores would be extremely high: accuracy = 91%, and F1 score = 95.24%. Similarly to the previous case, if a researcher analyzed only these two score indicators, without considering the MCC, they would wrongly think the algorithm is performing quite well in its task, and would have the illusion of being successful. On the other hand, checking the Matthews correlation coefficient would be pivotal once again. In this example, the value of the MCC would be 0.14, indicating that the algorithm is performing similarly to random guessing. Acting as an alarm, the MCC would be able to inform the data mining practitioner that the statistical model is performing poorly. For these reasons, we strongly encourage to evaluate each test performance through the Matthews correlation coefficient, instead of the accuracy and the F1 score, for any binary classification problem. (en)
dbp:title	Ten quick tips for machine learning in computational biology (en)
dbp:wikiPageUsesTemplate	dbt:Cleanup_section dbt:Main dbt:Reflist dbt:Color dbt:Statistics dbt:Diagnostic_testing_diagram dbt:Explain dbt:Long_quote dbt:Machine_learning_evaluation_metrics dbt:Diagonal_split_header dbt:Short_description dbt:Blockquote
dct:subject	dbc:Machine_learning dbc:Cheminformatics dbc:Statistical_ratios dbc:Information_retrieval_evaluation dbc:Bioinformatics dbc:Statistical_classification dbc:Computational_chemistry dbc:Summary_statistics_for_contingency_tables
rdfs:label	Phi coefficient (en) Coeficiente phi (es) 파이 계수 (ko) Współczynnik fi (pl) Phi相關係數 (zh)
owl:sameAs	freebase:Phi coefficient wikidata:Phi coefficient dbpedia-tr:Phi coefficient dbpedia-zh:Phi coefficient dbpedia-he:Phi coefficient dbpedia-es:Phi coefficient dbpedia-pl:Phi coefficient dbpedia-ko:Phi coefficient dbpedia-global:Phi coefficient dbr:Phi coefficient
prov:wasDerivedFrom	wikipedia-en:Phi_coefficient?oldid=1291853978&ns=0
foaf:isPrimaryTopicOf	wikipedia-en:Phi_coefficient
is dbo:wikiPageDisambiguates of	dbr:Phi_(disambiguation)
is dbo:wikiPageRedirects of	dbr:Matthews_correlation_coefficient dbr:Matthews_Correlation_Coefficient dbr:Matthews_coefficient dbr:Matthews_correlation_measure dbr:Matthews_measure dbr:Mean_square_contingency dbr:Mean_square_contingency_coefficient dbr:Pearson's_phi
is dbo:wikiPageWikiLink of	dbr:P4-metric dbr:List_of_analyses_of_categorical_data dbr:Karl_Pearson dbr:Effect_size dbr:Cramér's_V dbr:Binary_classification dbr:Unistat dbr:Contingency_table dbr:Phi_(disambiguation) dbr:Tschuprow's_T dbr:List_of_statistics_articles dbr:Matthews_correlation_coefficient dbr:Evaluation_of_binary_classifiers dbr:Partial_Area_Under_the_ROC_Curve dbr:Matthews_Correlation_Coefficient dbr:Matthews_coefficient dbr:Matthews_correlation_measure dbr:Matthews_measure dbr:Mean_square_contingency dbr:Mean_square_contingency_coefficient dbr:Pearson's_phi
is foaf:primaryTopic of	wikipedia-en:Phi_coefficient
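
The figures quoted in dbp:text can be checked by hand. The following is a minimal sketch, not part of the DBpedia record, that recomputes accuracy, F1 score and MCC for the two confusion matrices in that passage. It assumes the standard definitions: accuracy = (TP + TN) / total, F1 = 2TP / (2TP + FP + FN), and MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The function name confusion_scores and the choice to return None for an undefined score are illustrative assumptions.

```python
import math


def confusion_scores(tp, fp, tn, fn):
    """Return (accuracy, f1, mcc) for a 2x2 confusion matrix.

    A score whose denominator is zero is returned as None (undefined).
    This convention is an assumption made for the sketch.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total

    f1_denominator = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denominator if f1_denominator else None

    # The MCC denominator is zero whenever a whole row or column
    # of the confusion matrix is empty.
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denominator if mcc_denominator else None

    return accuracy, f1, mcc


# First example from the quoted text: a classifier that always predicts positive.
always_positive = confusion_scores(tp=95, fp=5, tn=0, fn=0)
print(always_positive)  # accuracy 0.95, F1 about 0.9744, MCC None (undefined)

# Second example: good on positives, poor on negatives.
weak_on_negatives = confusion_scores(tp=90, fp=4, tn=1, fn=5)
print(weak_on_negatives)  # accuracy 0.91, F1 about 0.9524, MCC about 0.14
```

In the first case no element is predicted negative (TN + FN = 0), so the MCC denominator vanishes, which matches the "undefined" result in the quoted passage. In the second case the MCC is close to zero while accuracy and F1 stay above 0.9.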