# MAT 2051 Unit 5 Discussion DQ1: Data Mining, Time Complexity, and Algorithms

In Appendix C, you read about different programming control structures used to write pseudocode and actual computer algorithms, such as if statements, while and for loops, and function calls. For this discussion, assume you work for a data mining company and your job is to write a program to find information on various Web sites pertaining to sales of the Lenovo X200. After your algorithm finds this data, more complex analysis will be done to extract more meaningful information from the data.

Your algorithm is going to scan different sites and search for the character string “Lenovo X200.” Assume you decide to use an algorithm similar to Text Search (see Algorithm 4.2.1 on page 178 of your text for an explanation of what this is). If the algorithm finds a site that contains the string (that is, Lenovo X200), assume that it then stores all of the text on that site in a storage area.
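
To make the scenario concrete, here is a minimal Python sketch of a naive text search in the spirit of the one described above. It is an illustration under stated assumptions, not the textbook's exact pseudocode: the names `text_search` and `scan_sites`, the 0-based indexing, the `-1` return value for "not found," and the dictionary used as the storage area are all choices made for this example, and your text's version may use different conventions.

```python
def text_search(p: str, t: str) -> int:
    """Return the smallest 0-based index at which pattern p occurs in text t.

    Returns -1 if p does not occur in t (a convention assumed for this sketch).
    """
    m, n = len(p), len(t)
    for i in range(n - m + 1):              # every possible starting position in t
        j = 0
        while j < m and t[i + j] == p[j]:   # compare one character at a time
            j += 1
        if j == m:                          # all m characters of p matched
            return i
    return -1


def scan_sites(sites: dict[str, str], pattern: str = "Lenovo X200") -> dict[str, str]:
    """Keep the full text of every site that contains the pattern.

    `sites` is a hypothetical mapping from a site name to that site's text;
    the returned dictionary stands in for the storage area.
    """
    return {name: text for name, text in sites.items()
            if text_search(pattern, text) != -1}
```

For example, `scan_sites({"site-a": "Lenovo X200 on sale", "site-b": "no laptops here"})` would keep only the entry for `site-a`. The nested loops are what you will want to look at when you reason about run time.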

To understand this problem fully, answer the following questions:

What is data mining?

What is a character string?

What is the worst-case run time of this algorithm in big-O notation, expressed in terms of its inputs p, m, t, and n?

How long do you think it will take this algorithm to run? State the time complexity in big-O notation, as a function of n.

Assume that each Web site contains, on average, 10,000 characters of text and that the character string “Lenovo X200” is 11 characters long. How many computations will the algorithm need to make per site?

Why are speed and the analysis of algorithm speed so important?
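
As a starting point for the per-site question above, the following rough sketch computes an upper bound on character comparisons. It assumes a naive search that tries each of the n - m + 1 starting positions and compares at most m characters at each one; the function name `worst_case_comparisons` is made up for this illustration, and real pages will usually need far fewer comparisons because most positions mismatch on the first character.

```python
def worst_case_comparisons(m: int, n: int) -> int:
    """Upper bound on character comparisons for a naive text search.

    m is the pattern length and n is the text length. This assumes each of
    the n - m + 1 starting positions costs at most m comparisons.
    """
    if m > n:
        return 0                    # the pattern cannot fit, so nothing is compared
    return m * (n - m + 1)


# Values taken from the question: an 11-character pattern, 10,000 characters per site.
print(worst_case_comparisons(11, 10_000))   # 11 * 9,990 = 109890
```

Treat the result as a bound to reason from, not as the exact count for any particular site.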

