EPOBPA: Extensible Parallelizable Optimized Buddy Prima Algorithm

Mona Farouk
Mohamed Abdel Gawwad

Abstract

Data is a growing natural resource: many organizations today hold vast amounts of data and need methods to extract value from it. Every day, systems generate more than 2.5 quintillion bytes of data, and 90% of the data has been created in the last 10 years. Frequent Itemset Mining (FIM) finds valuable associations and correlations among large sets of data items. Association rules describe attribute-value conditions that frequently occur together in a given dataset. Existing FIM algorithms depend mainly either on candidate set generation, as in the Apriori algorithm, or on constructing a dedicated data structure to handle the dataset, as in FP-Growth. In the big data era, these techniques incur high time overhead. E-POBPA is a new FIM technique that handles big data with neither a candidate generation step nor a dedicated data structure. The proposed algorithm builds on the original Buddy Prima algorithm and includes a distribution method that makes it customizable for any hardware architecture. Experimental results show that E-POBPA surpasses state-of-the-art techniques in time performance: the improvement over other approaches ranges from 36% to 99%, depending on the dataset and the minimum support used.
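To make the terms "frequent itemset" and "minimum support" concrete, here is a minimal brute-force sketch. It is not E-POBPA, Buddy Prima, Apriori, or FP-Growth; it simply counts every subset of every transaction, which is only workable for tiny inputs. The function name `frequent_itemsets` and the toy transactions are assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for itemsets with support >= min_support.

    Illustrative brute force only: enumerates every subset of every
    transaction, so the cost grows exponentially with transaction size.
    """
    n = len(transactions)
    counts = {}
    for transaction in transactions:
        items = sorted(set(transaction))  # canonical order, no duplicates
        for k in range(1, len(items) + 1):
            for combo in combinations(items, k):
                counts[combo] = counts.get(combo, 0) + 1
    # Support = fraction of transactions that contain the itemset.
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


# Hypothetical toy dataset: four shopping baskets.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]
result = frequent_itemsets(transactions, min_support=0.5)
for itemset, support in sorted(result.items()):
    print(itemset, support)
```

With a minimum support of 0.5, each single item and each pair qualifies, while the three-item set appears in only one of the four baskets and is dropped. Raising the minimum support shrinks the result, which is one reason reported running times depend on the support threshold.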

Version published to 10.21203/rs.3.rs-6844956/v1 on Research Square
Jul 31, 2025

Related articles

Sassy: Searching Short DNA Strings in the 2020s

This article has 2 authors:
1. Rick Beeloo
2. Ragnar Groot Koerkamp
This article has no evaluations
Latest version Jul 26, 2025

TriFMatch: a flash subgraph matching algorithm with effective filtering techniques

This article has 6 authors:
1. jiezhong He
2. Yixin Chen
3. Menghan Jia
4. Zhouyang Liu
5. Dongsheng Li
6. Tankian Lee
This article has no evaluations
Latest version Jul 18, 2025

Comparative Performance Analysis of Four RDBMS Systems Integrated with Django's ORM

This article has 3 authors:
1. Mahmoud Nasr
2. Desoky Abdelqawy
3. Mohammad El-Ramly
This article has no evaluations
Latest version Jul 16, 2025