Previous: 10 Sorting and Searching
Up: 10 Sorting and Searching
Next: 10.2 Improving Search - Sorting the Data
Previous Page: 10 Sorting and Searching

10.1 Finding a Data Item - The Search Problem

Suppose we have a collection of data items of some specific type (e.g. integers), and we wish to determine if a particular data item is in the collection. The particular data item we want to find is called the key and out task is to search the records in the data base to find one which ``matches'' the key.

The first decision we must make is how to represent the collection of data items. In Chapter we saw a data structure which could hold a collection of data items all of the same type: the array. So we can consider our data base of integer values to be stored in an array. Our task then becomes:

Task:

SRCH0: Search an array for an index where the key is located; if key is not present, print a message. Repeat until an end of file is entered for the key.

In this task, we choose to return the index where the key is located because this index will allow us to retrieve the entire record in the case where our array is part of a database. The simplest approach to determine if a key is present in an array is to make an exhaustive search of the array. Start with the first element, if the key matches, we are done; otherwise move on to the next element and compare the key, and so on. We simply traverse the array in sequence from the first element to the last as shown in Figure 10.1. Each element is compared to the key. If the key is found in the array, the corresponding array index is returned. If the item is not found in the array, an invalid index, say -1, is returned. This type of search is called Sequential Search or Linear Search because we sequentially examine the elements of the array. In the worst case, the number of elements that must be compared with the key is linearly proportional to the size of the array.

Linear search is not the most efficient way to search for an item in a collection of items; however, it is very simple to implement. Moreover, if the array elements are arranged in random order, it is the only reasonable way to search. In addition, efficiency becomes important only in large arrays; if the array is small, there aren't many elements to search and the amount of time it takes is not even noticed by the user. Thus, for many situations, linear search is a perfectly valid approach. Here is a linear search algorithm which returns the index in the array where key is found or -1 if key is not found in the array:

initialize index i to 0
     traverse the array until exhausted
          if array[i] matches key
               return i;
     return -1.

We will implement the algorithm for search as a function, seqsrch() since searching an array may be required in many programs, and a function incorporating linear search can be used for many applications. The function is passed the array of integers, x[], the number of elements in the array, lim, and the key to search for, key. The function is shown in Figure 10.2.

The loop compares each element of the array with key. If an element with the same value is found at an index i, i is returned. The loop terminates when the array limit is reached; in which case, no element equal to key was found in the array, and -1 is returned. If there is more than one element in the array with the same value as key, the function terminates the search as soon as the first element is found, returning its index.

We have defined DEBUG in sortsrch.c so that debug statements we may add to the code will be compiled. We have also included file sortsrch.h in sortsrch.c which, as shown in Figure 10.3 contains the prototypes for functions defined in sortsrch.c since some of the functions defined in sortsrch.c may be used by other functions to be written in the file. We will continue to add more functions to sortsrch.c and corresponding prototypes to as we proceed through this chapter; however, we will not always show the additions of prototypes to sortsrch.h.

Our task now requires us to write a simple driver to repeatedly call the function, seqsrch(). For simplicity, we will declare an initialized array. The driver uses a function, pr_aray_line(), to print an array with ten elements per line to save space. Figure 10.4 shows the program driver.

The driver first prints an initialized array, id[], and then searches for items entered by the user. If an item is found in the array, its index is printed. Otherwise, a message is printed. The function, pr_aray_line() is included in sortsrch.c and its prototype in sortsrch.h. These additions are shown in Figure 10.5.

The function prints at successive elements on one line until the array index modulo 10 is zero when it prints a newline and continues. A sample session for the program srcharay.c is shown below:



Previous: 10 Sorting and Searching
Up: 10 Sorting and Searching
Next: 10.2 Improving Search - Sorting the Data
Previous Page: 10 Sorting and Searching

tep@wiliki.eng.hawaii.edu
Sat Sep 3 06:58:41 HST 1994