Thursday, April 9, 2015

Comparison of Algorithms



As an example of thesis, a study comparison among data mining algorithms might be a good one. Take for instance, approximate string matching algorithms. They can be coded in a suitable language and applied on data sets and their run times can be found to get the approximate matched string in response to the query string. And an analysis of about which algorithm is optimal to use can be done.
Here is an analysis of three approximate string matching algorithms:

Lipschitz Embedding Algorithm uses a straight forward procedure to match an approximate string to the query string. It creates reference strings, uses mathematical formula to find out minimum edit distance and the corresponding nearest string to the query string. Ball partitioning algorithm (which I haven’t covered but can be found from suitable web resources) creates a VP tree and visits left and right nodes of the tree to find out minimum edit distance and the corresponding string which is nearest to the query string. Execution time increases with the increase of visited nodes. Brute force algorithm for approximate string matching uses the most inefficient way to find out the nearest string. It calculates edit distances of all the text strings from the query string and then checks all the distances to find out the minimum distance.As a result, logically as well as based on their run times on data sets, among the three approximate string matching algorithms, the fastest is Lipschitz Embeddings Algorithm, then the Ball Partitioning algorithm and the slowest is Brute Force algorithm.



No comments:

Post a Comment