Possibilistic Similarity Measures for Data Science and Machine Learning Applications

Cited by: 4

Authors:

Charfi, Amal [1]

Bouhamed, Sonda Ammar [1,2]

Bosse, Eloi [2,3]

Kallel, Imene Khanfir [1,2]

Bouchaala, Wassim [4]

Solaiman, Basel [2]

Derbel, Nabil [1]

Affiliations:

[1] Univ Sfax, Natl Sch Engineers Sfax, Control & Energy Management CEM Lab, Sfax 3038, Tunisia

[2] IMT Atlantique, Image & Informat Proc Dept iTi, F-838182923 Brest, France

[3] Expertises Parafuse Inc, Quebec City, PQ G1W 4N1, Canada

[4] Tunisian Profess Training Agcy, Sfax 3000, Tunisia

Source:

IEEE ACCESS | 2020 / Vol. 8

Keywords:

Uncertainty; Possibility theory; Measurement uncertainty; Machine learning; Atmospheric measurements; Particle measurements; Indexes; Classification; distance; entropy; learning; measures of specificity; possibility distributions; similarity; uncertainty; INFORMATION; UNCERTAINTY; NOTION;

DOI:

10.1109/ACCESS.2020.2979553

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Measuring similarity is of great interest in many research areas, such as data science, machine learning, pattern recognition, text analysis, and information retrieval, to name a few. The literature has shown that possibility is an attractive notion in the context of distinguishability assessment and can lead to very efficient and computationally inexpensive learning schemes. This paper focuses on determining the similarity between two possibility distributions. A review of existing similarity measures within the possibilistic framework is presented first. Then, the similarity measures are analyzed with respect to their capacity to satisfy a set of properties that a similarity measure should have. Most of the existing possibilistic similarity measures produce undesirable outcomes since they generally depend on the application context. A new similarity measure, called InfoSpecificity, is introduced, and the similarity measures are categorized into three main methods: morphic-based, amorphic-based, and hybrid. Two experiments are conducted using four benchmark databases. The aim of the experiments is to compare the efficiency of the possibilistic similarity measures when applied to real data. The experiments show good results for the hybrid methods, particularly with the InfoSpecificity measure. In general, the hybrid methods outperform the other two categories when evaluated on small samples, i.e., in a poor-data context (or poorly informed environment), where possibility theory can be used to the greatest benefit.


Pages: 49198-49211

Page count: 14

