There are two basic activities that will introduce challenges to the usefulness of arrays and linked lists: searching and inserting data. We'll talk about searching first, as it will motivate the need to carefully insert data.
Let's say you have a list of numbers: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19. A typical activity is based on the desire to search that list for a particular value that may or may not be in the list and return its location. If I asked you to find the location of '9', for example, you would probably scan across the list of values until you spotted it, and announce "It's right there!" You might even say "It's the sixth number on the list!" This is precisely the strategy of searching that is used in a linked list.
Code:
+---------+ +---------+ +---------+ +---------+ +---------+
head --> | 1, next---> | 3, next---> | 5, next---> | 7, next---> | 9, next---> ...
+---------+ +---------+ +---------+ +---------+ +---------+
^ ^ ^ ^ ^
| | | | |
search 1 search 2 search 3 search 4 found on 5linkedsearch.cpp:
Code:
struct node
{
int data;
node* next;
};
node* find(node* head,int tofind)
{
node* temp=head;
while (temp->data < tofind)
{
temp=temp->next;
}
if (temp->data == tofind)
return temp;
else
return NULL;
}Code:
list size: 10 100 1,000 10,000 100,000 1,000,000 ave. comps.: 6 51 501 5,001 50,001 500,001
Look at the list of numbers again: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19. Since it's in order, I can approach the problem from a different perspective: look at the middle value (9 in this case) Is it the value we're looking for? Is it lower than what we're looking for? If it isn't the value, then we can focus our search on the top or bottom half of our list. 2 comparisons throw out half the list. Repeating the process will result in every 2 comparisons chopping the potential chunk of list in half. With arrays, we can find the middle of our list easily. Below is the logic to find 13:
Code:
Index bounds are 0 - 9, midpoint is index 4 +----++----++----++----++----++----++----++----++----++----+ | 1 || 3 || 5 || 7 || 9 || 11 || 13 || 15 || 17 || 19 | +----++----++----++----++----++----++----++----++----++----+ 0 1 2 3 *4* 5 6 7 8 9 9<13 so index bounds are 5-9, midpoint is index 7 +----++----++----++----++----+ | 11 || 13 || 15 || 17 || 19 | +----++----++----++----++----+ 5 6 *7* 8 9 15>13 so index bounds are 5-6, midpoint is index 5 +----++----+ | 11 || 13 | +----++----+ *5* 6 11<13 so index bounds are 6-6, midpoint is index 6 +----+ | 13 | +----+ *6* found at index 6!
arraysearch.cpp:
Code:
int find(int* searcharray, int tofind, int size)
{
int min=0;
while (min <= size)
{
if(searcharray[(min+size)/2]=tofind)
{
return (min+size)/2;
}
else if(searcharray[(min+size)/2]<tofind)
{
min=(min+size)/2+1;
}
else
{
size=(min+size)/2-1;
}
}
return -1;
}Code:
list size: 10 100 1,000 10,000 100,000 1,000,000 ave. comps.: 7 13 20 27 33 40
Based on what we've seen so far, this clearly indicates that arrays are far better than linked lists, and we should immediately stop using linked lists, right? Not so fast! There was a requirement on that array: it had to contain the elements in order! That doesn't just happen by magic. Every time you add a new element to your list of data, there is an overhead cost. If you are not willing to accept that overhead cost, your data will not be sorted, and an array cannot pull that neat trick. An array search would then be just as slow as a linked list search: go through all the data and hope you find it early.
There are two ways to deal with this situation. 1) sort the data before doing a search, or 2) make sure the data remains sorted as you add new values. Since doing a sort is an expensive operation ( O(n*log(n)) or worse, depending on algorithm ), we're going to opt for keeping our data sorted as we add values. This means we need to look at how expensive it will be to insert an item. To keep things simple, we'll just look at the cost of insertion, and remember that there is always going to be an overhead cost (finding where the data goes). The total cost on an insert will be [search cost] + [insert cost], which we will deal with after we look at the cost of an insert.
A linked list is pretty simple, just adjust the pointers (in this case to add 6):
Code:
this:
+---------+ +---------+
head --> ... --> | 5, next---> | 7, next---> ...
+---------+ +---------+
+---------+
| 6, next---> NULL
+---------+
becomes:
+---------+ +---------+
head --> ... --> | 5, next | | 7, next---> ...
+-----|---+ +---------+
| ^
| +---------+ |
+-> | 6, next---+
+---------+linkedinsert.cpp:
Code:
node* insert(node* head, int toadd)
{
node* temp;
if (head->data < toadd)
{
temp=new node;
temp->data = toadd;
temp->next = head;
return temp;
}
else
{
while ((temp->next!=NULL)&&((temp->next)->data < toadd))
{
temp=temp->next;
}
if (temp->data < toadd)
{
temp->next = new node;
temp=temp->next;
temp->data = toadd;
}
else
{
node* temp2=new node;
temp2->data = toadd;
temp2->next = temp->next;
temp->next = temp2;
temp=temp2;
}
return temp;
}
}Code:
Insert 6 at index 3 +----++----++----++----++----++----++----++----++----++----++----+ | 1 || 3 || 5 || 7 || 9 || 11 || 13 || 15 || 17 || 19 || | +----++----++----++----++----++----++----++----++----++----++----+ 0 1 2 3 4 5 6 7 8 9 10 move elements to the right to free space +----++----++----++----++----++----++----++----++----++----++----+ | 1 || 3 || 5 || 7 || 9 || 11 || 13 || 15 || 17 || 19 || 19 | +----++----++----++----++----++----++----++----++----++----++----+ 0 1 2 3 4 5 6 7 8 9 10 +----++----++----++----++----++----++----++----++----++----++----+ | 1 || 3 || 5 || 7 || 9 || 11 || 13 || 15 || 17 || 17 || 19 | +----++----++----++----++----++----++----++----++----++----++----+ 0 1 2 3 4 5 6 7 8 9 10 +----++----++----++----++----++----++----++----++----++----++----+ | 1 || 3 || 5 || 7 || 9 || 11 || 13 || 15 || 15 || 17 || 19 | +----++----++----++----++----++----++----++----++----++----++----+ 0 1 2 3 4 5 6 7 8 9 10 +----++----++----++----++----++----++----++----++----++----++----+ | 1 || 3 || 5 || 7 || 9 || 11 || 13 || 13 || 15 || 17 || 19 | +----++----++----++----++----++----++----++----++----++----++----+ 0 1 2 3 4 5 6 7 8 9 10 +----++----++----++----++----++----++----++----++----++----++----+ | 1 || 3 || 5 || 7 || 9 || 11 || 11 || 13 || 15 || 17 || 19 | +----++----++----++----++----++----++----++----++----++----++----+ 0 1 2 3 4 5 6 7 8 9 10 +----++----++----++----++----++----++----++----++----++----++----+ | 1 || 3 || 5 || 7 || 9 || 9 || 11 || 13 || 15 || 17 || 19 | +----++----++----++----++----++----++----++----++----++----++----+ 0 1 2 3 4 5 6 7 8 9 10 +----++----++----++----++----++----++----++----++----++----++----+ | 1 || 3 || 5 || 7 || 7 || 9 || 11 || 13 || 15 || 17 || 19 | +----++----++----++----++----++----++----++----++----++----++----+ 0 1 2 3 4 5 6 7 8 9 10 and now add 6 at index 3 +----++----++----++----++----++----++----++----++----++----++----+ | 1 || 3 || 5 || 6 || 7 || 9 || 11 || 13 || 15 || 17 || 19 | +----++----++----++----++----++----++----++----++----++----++----+ 0 1 2 3 4 5 6 7 8 9 10
arrayinsert.cpp:
Code:
int insert(int* searcharray, int toadd, int size)
{
int min=0;
int max=size;
while (min <= max)
{
if(searcharray[(min+max)/2]=toadd)
{
min = (min+max)/2;
for(int i=size;i>min;i--)
{
searcharray[i+1]=searcharray[i];
}
searcharray[min]=toadd;
return min;
}
else if(searcharray[(min+max)/2]<toadd)
{
min=(min+max)/2+1;
}
else
{
max=(min+max)/2-1;
}
}
min=(min+max)/2;
if (searcharray[min+1]<toadd)
{
min=min+1;
}
for(int i=size;i>min;i--)
{
searcharray[i+1]=searcharray[i];
}
searcharray[min]=toadd;
return min;
}The total cost of search plus insert becomes: linked list = n/2 + 1 + 2, array = 2*log_2(n) + (n/2)*(cost of move). Both are O(n) for the total insert, with array being potentially much slower depending on the type of data involved. Based on these two methods for storing data, it looks like you should use arrays if you are mainly searching for data that is small, and you should use linked lists if you have a lot of inserts or the data is large.
In the next tutorial, we'll try to get the best of both worlds
discuss this topic to forum
