Determining Extractive Summary for a Single Document Based on Collaborative Filtering Frequency Prediction and Mean Shift Clustering

Faculty: Computer Science
Year: 2019
Type of Publication: ZU Hosted
Journal: IAENG International Journal of Computer Science
Keywords: Determining Extractive Summary, Single Document Based
Abstract:
This paper presents a new unsupervised algorithm for determining an extractive summary of a single document. It combines term frequency prediction, obtained from a memory-based collaborative filtering (CF) approach, with the Mean Shift clustering algorithm. The algorithm uses Term-Sentence Collaborative Filtering (TSCF) to predict term frequencies, which are then used to rank sentences according to the presence percentage of each word/term in each sentence. TSCF computes term frequencies for terms that are either present or missing (sparse) in a sentence via a collaborative filtering prediction algorithm. Mean Shift clustering is applied as the final stage to group sentences according to their ranks, with the aim of producing more coherent summaries. Experiments show the effect of different weighting functions: Term Frequency (TF), Term Frequency-Inverse Document Frequency (TF-IDF), and binary TF. They also show the effect of different distance metrics that support sparse matrix representations (cosine, Euclidean, and Manhattan) and of L1 and L2 normalization. ROUGE is used as a fully automatic evaluation metric for text summarization on the DUC2002 datasets. Results are reported as average recall, precision, and F-measure scores for ROUGE-1, ROUGE-2, ROUGE-L, and ROUGE-SU4. They indicate that the proposed TSCF algorithm is promising and outperforms related baseline techniques on many ROUGE scores.
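The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the prediction formula, the ranking formula, or the clustering bandwidth, so everything below is an assumption made for illustration. In particular, the function names (term_sentence_matrix, cf_predict, mean_shift_1d, summarize), the similarity-weighted average used for prediction, the sum-based sentence score, the flat-kernel mean shift, and the bandwidth value are all hypothetical.

```python
import numpy as np

def term_sentence_matrix(sentences):
    """Raw term-frequency matrix: one row per sentence, one column per term."""
    vocab = sorted({w for s in sentences for w in s.lower().split()})
    index = {w: j for j, w in enumerate(vocab)}
    tf = np.zeros((len(sentences), len(vocab)))
    for i, s in enumerate(sentences):
        for w in s.lower().split():
            tf[i, index[w]] += 1.0
    return tf, vocab

def cf_predict(tf):
    """Memory-based CF (assumed form): fill each zero entry with the
    cosine-similarity-weighted average of the other sentences' frequencies."""
    norms = np.linalg.norm(tf, axis=1, keepdims=True)
    norms[norms == 0] = 1.0
    unit = tf / norms
    sim = unit @ unit.T                 # sentence-to-sentence cosine similarity
    np.fill_diagonal(sim, 0.0)          # a sentence does not predict itself
    weight = sim.sum(axis=1, keepdims=True)
    weight[weight == 0] = 1.0
    predicted = (sim @ tf) / weight
    return np.where(tf > 0, tf, predicted)  # observed values are kept as-is

def mean_shift_1d(values, bandwidth, n_iter=50):
    """Flat-kernel mean shift on scalar scores; returns (labels, centers)."""
    values = np.asarray(values, dtype=float)
    modes = values.copy()
    for _ in range(n_iter):
        for i in range(len(modes)):
            window = values[np.abs(values - modes[i]) <= bandwidth]
            if window.size:
                modes[i] = window.mean()
    labels, centers = [], []
    for m in modes:                     # merge modes that converged together
        for k, c in enumerate(centers):
            if abs(m - c) <= bandwidth:
                labels.append(k)
                break
        else:
            centers.append(m)
            labels.append(len(centers) - 1)
    return labels, np.array(centers)

def summarize(sentences, bandwidth=0.1):
    """Assumed selection rule: keep the cluster with the highest mean score."""
    tf, _ = term_sentence_matrix(sentences)
    filled = cf_predict(tf)
    scores = filled.sum(axis=1)
    scores = scores / scores.max()      # scale scores into (0, 1]
    labels, centers = mean_shift_1d(scores, bandwidth)
    best = int(np.argmax(centers))
    return [s for s, lab in zip(sentences, labels) if lab == best]

if __name__ == "__main__":
    doc = [
        "collaborative filtering predicts term frequency",
        "mean shift clustering groups ranked sentences",
        "the weather was pleasant",
        "term frequency prediction drives sentence ranking",
    ]
    for sentence in summarize(doc):
        print(sentence)
```

The sketch uses raw TF weighting and cosine similarity only. The other weightings (TF-IDF, binary TF), distance metrics (Euclidean, Manhattan), and normalizations (L1, L2) mentioned in the abstract would replace the corresponding steps in term_sentence_matrix and cf_predict.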
   
     
 
       
