Castle Paradox Forum Index Castle Paradox

 
 FAQFAQ   SearchSearch   MemberlistMemberlist   UsergroupsUsergroups   RegisterRegister 
 ProfileProfile   Log in to check your private messagesLog in to check your private messages   Log inLog in 
 Gamelist   Review List   Song List   All Journals   Site Stats   Search Gamelist   IRC Chat Room

Arrays vs. Linked Lists
Goto page 1, 2, 3  Next
 
Post new topic   Reply to topic    Castle Paradox Forum Index -> HELP!
View previous topic :: View next topic  
Author Message
Mr B




Joined: 20 Mar 2003
Posts: 382

PostPosted: Sat Nov 12, 2005 1:26 pm    Post subject: Arrays vs. Linked Lists Reply with quote

Hi guys,

I am having a bit of difficulty deciding whether to use an array or a linked list for a certain OHR application.

I have an array of 20 element sets. My game has a master script that loops continuously, checking the array during every cycle. It only needs to check the "active" elements of the array, which are probably going to be around 2-5 in general, though it can briefly explode to 18 or so.

Each element set contains the ID of a script and the clock cycle time at which the script is to be executed. Since these scripts will have all sorts of delays, there is no way to guarantee that the elements inserted last will be the first to finish.

If I cycle through the array as an array, then I may need to cycle up to the last element even if I have only one active element! This would be a waste of time.

If I make the elements into a singly-linked-list, then I will only have to traverse as many elements as are active. However, this will mean creating several additional scripts to handle insertion, deletion, pointer re-allocation, etc.

Either way, I will be having a stack of references to unused elements so that insertion doesn't require cycling.

So here is my question: since I will be having only 20 or so elements in my list, which method will take the least time to use? I hate the inefficiencies of cycling through all of those elements (even if I stop the cycling after it passes the active number of elements), but I am concerned about the time overhead of calling a large number of additional scripts.

At this point I am leaning towards a linked list, but I would appreciate your input before jumping into several hours of scripting.
Back to top
View user's profile Send private message
Moogle1
Scourge of the Seas
Halloween 2006 Creativity Winner
Halloween 2006 Creativity Winner



Joined: 15 Jul 2004
Posts: 3377
Location: Seattle, WA

PostPosted: Sat Nov 12, 2005 9:03 pm    Post subject: Reply with quote

Linked list will definitely be faster if you don't anticipate a large array size. If it only has 2-5 on the average case, you're looking at 50% less traversal time at least.
_________________
Back to top
View user's profile Send private message Visit poster's website AIM Address
Inferior Minion
Metric Ruler



Joined: 03 Jan 2003
Posts: 741
Location: Santa Barbara, CA

PostPosted: Sun Nov 13, 2005 12:02 am    Post subject: Reply with quote

Moogle1 wrote:
Linked list will definitely be faster if you don't anticipate a large array size. If it only has 2-5 on the average case, you're looking at 50% less traversal time at least.


Not always true. Arrays have the advantage of being a static block of memory. Linked Lists have the overhead for insertion, deletion, and seeking. You can directly access any block in an array at any time. You have to traverse through the linked list to get to the nth element.

Mind you, I have never looked at plotscripting and have no idea how either data structure would be implemented in the language, so I'm basing this on general concepts of arrays and linked lists.

Note: I'm using Big-O notation. O(#) refers to how many operations on average it takes to complete the task. I'm assuming memory creation/deletion are simple operations. I've also made the assumption you've created a singly linked list (only 1 pointer to the next element in the list) and you keep track of both the head and tail of the list.

Array Implementation:
* Accessing any element: O(1): array[element]
* Storing to any element: O(1): array[element] = something
* Deleting an element: O(1): array[element] = 0 (or -1 or some other constant)
* Traversing the array: O(25)

Linked List:
* Accessing the nth element in the list: O(n)
* Storing any element: O(3)
- Create a new node: O(1)
- Append to the tail: O(1)
- Move the tail: O(1)
* Deleting the nth element: O(n + 3)
- Access the nth element in the list: O(n)
- Make the n-1th pointer point to the n+1th element (if not deleting the head): O(1)
- Delete the nth element: O(1)
- Update the head and or tail as needed: O(1)
* Traversing a list of n elements: O(n)

You lose traversal time, but you gain in storing, deleting, and accessing. If you don't keep track of the tail, storing shoots up to O(n+2) since you have to find the end of the list before you insert.

You only gain roughly 6 operations for the linked list, which which is significantly lower than the constant 25 operations to traverse the array. Remember you'll have to update a "previous" node if you want to delete anything. Or you can create a doubly linked list (point forwards and backwards). That would bring the operations up to O(8). This is assuming the time cost for dynamically allocating memory is relatively small. This also means that Linked Lists will take longer when you have a relatively full list.
_________________


Last edited by Inferior Minion on Tue Nov 15, 2005 6:39 pm; edited 1 time in total
Back to top
View user's profile Send private message Send e-mail Visit poster's website AIM Address MSN Messenger
Mike Caron
Technomancer




Joined: 26 Jul 2003
Posts: 889
Location: Why do you keep asking?

PostPosted: Tue Nov 15, 2005 10:26 am    Post subject: Reply with quote

You could always allow 20+1 elements, and then put a marker (such as -1) to mark the end of the list. Why not zero? Because the default is zero. If a script in the middle finishes early, there will be fragmentation. So if you have a map like this:


[0]: 251
[1]: 0
[2]: 12

Then 2 will never be accessed.

Vs.

[0]: 251
[1]: 0
[2]: 12
[3]: -1

Then your script will ignore 0s, and stop at -1.

Often, in other languages, I find myself storing an arbitrary number of <whatever>s, and having to return IDs. Simply going up by one every time isn't practical, since they're often stored in a flat array. So, my code usually looks something like this (in pseudo code, aObj() is the array):

Code:
function getObj() as obj
  dim i as integer
  for i = 0 to ubound(aObj)
    is aObj(i) taken
      insert new obj in i
      return i
    end is
  next
  redim aObj(i)
  insert new obj in i
  return i
end function


With an implementation like that, you don't need to keep track of which elements are taken (0 is free, -1 is EoL).

I'd probably write that in HamsterSpeak like:

Code:
script, add element, id, begin
  #pretend the first array element is global #10
  #should probably be a constand
  variable(i, found,inloop)
  inloop := true
  i := 0
  while(in loop) do, begin
    if(read global(10 + i) == -1) then (inloop := false)
    if(read global(10 + i) == 0) then, begin
      write global(10 + i,id)
      found := true
      inlooop := false
    end
    i += 1
  end
  i -= 1 #since it'll be point past the -1, so we need to back up
  if(found == 0) then, begin
    write global(10 + i,id)
    write global(10 + i + 1, -1) #set the new end of array
  end
end


Deleting an element is easy (just set it to 0), and it automatically takes up the first available slot, so it doesn't expand unnecissarily.
_________________
I stand corrected. No rivers ran blood today. At least, none that were caused by us.

Final Fantasy Q
OHR Developer BLOG
Official OHRRPGCE Wiki and FAQ
Back to top
View user's profile Send private message Send e-mail Visit poster's website MSN Messenger
Mr B




Joined: 20 Mar 2003
Posts: 382

PostPosted: Tue Nov 15, 2005 12:10 pm    Post subject: Reply with quote

Okay; it looks as if the advantage of linked lists completely overwhelms everything else. I am going to do a singly-linked-list, because there are only so many global variables. I will use an array for the stack, though.

Some more info on what I am doing; the nature of the game I am making is such that I will need scripts to be called with some sort of delay after a trigger (AI scripts, event coordination, etc.), but other scripts need to be processed in the mean time.

Therefore I will have a master script that will cycle continuously, each time incrementing a variable (which will double back on itself after it reaches a certain value). I will also have a linked list of global variable groups, each one containing the reference ID of a script and the cycle time at which that script is to be executed (among other things, such as values to be passed to the scripts).

At each cycle of the master script, the list will be examined to see if any scripts need to be executed.

As for the list itself, I think that I will insert the elements in terms of their execution time, from soonest to furthest off. I am also thinking of having a global variable that contains the cycle time of the next script that needs to be executed (this will be updated every time that a script is added or deleted). This way I only need to check a single variable once per cycle to see if I should even bother going through the list or not.

Ordinarily I would just examine the execution time of the first element in the list, but because of the cycling around that the cycle variables do, the smallest execution time is not always the next to be executed (if I wrap the cycle values around at 1000, and I am at 990 and have a script that needs to be executed in 20 cycles, it would be executed when the global cycle variable = 10, not 1010; however, I may have a script that needs to be executed at 998, which is sooner).

Does anyone see any pitfalls to this approach?
Back to top
View user's profile Send private message
Inferior Minion
Metric Ruler



Joined: 03 Jan 2003
Posts: 741
Location: Santa Barbara, CA

PostPosted: Tue Nov 15, 2005 1:08 pm    Post subject: Reply with quote

It sounds like you're tryin got build a Priority Queue using a linked list. The downside is in the worst case it'll take O(n) to insert elements. Keeping things sorted should make your life easier, but I don't think the Linked List is your best solution.

Check out Heaps. They're Trees implemented using Arrays to create Priority Queues.

The following link builds a Max Heap (the root is the largest value in the tree) but a Min Heap can easily be created by flipping a few if statements:

http://www.iti.fh-flensburg.de/lang/algorithmen/sortieren/heap/heapen.htm

Implementation in HS requires an array, a variable for numElements, and the Heap Sort functions from that link. This should speed up insertion, deletion, and give you access to the raw data via the array if needed. The way I see your event scheduler working is a simple external counter, as you mentioned. You insert elements into your Priority Queue using an offset relative to that counter. Each time you increment your counter, you simply compare it against the first element in the array. When it's time to execute an event, you pop off the root element of your Priority Queue, processing each element until the root no longer matches the counter.

Checking becomes O(1), deleting becomes O(log n), insertion becomes O(log n), and if you ever really feel like traversing the array, it's O(n) because you know exactly how many elements are in the tree.

Hope that helps.


Edit: Fixed the deletion Big O, was a bit off. Also, was thinking about the external counter. If it loops, you'll screw up your heap. When your counter loops, those internal triggers will all shoot to smaller values than anything in your heap. You'll want to compare Time To Live (internal trigger minus external counter, taking looping into consideration), not simply the internal trigger values. This adds an extra calculation per compare, but should still be faster than using a Linked List or standard array implementation.
_________________
Back to top
View user's profile Send private message Send e-mail Visit poster's website AIM Address MSN Messenger
TMC
On the Verge of Insanity




Joined: 05 Apr 2003
Posts: 3240
Location: Matakana

PostPosted: Wed Nov 16, 2005 4:12 am    Post subject: Reply with quote

Horray, complicated uses of HS! :D:D:D

But Mike, why would we want an expanding array here? It's sort of irrelevant (and dangerous to use... at least, I tend to squeeze in my arrays as tightly and haphazardly as possible)

Yeah, there is no point doing this if you are going to only sort depending on the counter value instead of calculating time to live. I assume that each event needs to be only called once for one tick. If an event needs to be called every tick for 10, say, then things change and IM might correct me, but I don't think heaps will efficiently handle that (no deleting means recursive traversion). So, what exactly did you mean by "active elements"?

However, moving globals around in HS is slow, there is no command to do so (I've been meaning to implement one) which means it has to be done manually. So it would be a bad idea to use the heap idea directly. You could instead have a heap with pointers to the elements in the array along with activation time, an array of 20 elements, and a stack of unused elements. This is getting a little complicated and global hungry.

Incase you do want to use a heap, and because I'd never implemented one before (especially not in HS!) and wanted to have some fun, I got carried away and wrote these scripts (which are entirely not bug proof but 90% logic proof):

Code:
global variable (
1, current counter
2, stack ptr
3, num elements
)

define constant (
100, STACK BASE
120, HEAP BASE
)

#example of your master loop
script, master, begin
  while (1) do (
    while (read global (HEAP BASE) == current counter) do (
      #process element at read global (HEAP BASE + 1)
      delete heap root ()
    )
    wait
    current counter := (current counter + 1), mod, 1000
  )
end


#inserts values into the heap, deals with the stack, and returns a ptr to
#an element to fill with data
script, new element, run timer, begin
  variable (element ptr, live time)

  # find a free element

  stack ptr -= 1
  if (stack ptr << 0) then () #throw an error?
  element ptr := read global (STACK BASE + stack ptr)

  live time := run timer -- current counter
  if (live time << 0) then (live time += 1000)


  # insert into heap (at the bottom)

  variable (cur ptr, parent ptr)

  write global (HEAP BASE + num elements * 2, run timer)
  write global (HEAP BASE + num elements * 2 + 1, element ptr)
  cur ptr := num elements
  num elements += 1


  # sort heap from bottom up

  while (cur ptr) do (   #until reach root
    # find parent
    parent ptr := (cur ptr -- 1) / 2

    live parent := read global (HEAP BASE + parent ptr * 2) -- current counter
    if (live parent << 0) then (live parent += 1000)
 
    if (live parent >> live time) then (
      live time := live parent
      heap swap (HEAP BASE + parent ptr * 2, HEAP BASE + heap ptr * 2)
      cur ptr := parent ptr
    ) else (cur ptr := 0)  #jump out of loop

  )

  return (element ptr)
end

# after pulling a reference off the heap and processing the element, call this is it is done
script, delete heap root, begin

  #stick it on the stack
  write global (STACK BASE + stack ptr, read global (HEAP BASE + 1))
  stack ptr += 1

  #put last heap node at the root
  num elements -= 1
  write global (HEAP BASE, read global (HEAP BASE + num elements * 2))
  write global (HEAP BASE + 1, read global (HEAP BASE + num elements * 2 + 1))


  #sort heap from top down

  variable (cur ptr, child1, child2, live1, live2, live parent)
  cur ptr := 0
  live parent := read global (HEAP BASE) -- current counter
  if (live parent << 0) then (live parent += 1000)

  while (cur ptr * 2 << num elements) do (  #has children


    child1 := cur ptr * 2 + 1
    child2 := child1 + 1

    #if more than 1 child, find the lowest life
    if (child2 <= num elements) then (

      live1 := read global (HEAP BASE + child1 * 2) -- current counter
      if (live1 << 0) then (live1 += 1000)
      live2 := read global (HEAP BASE + child2 * 2) -- current counter
      if (live2 << 0) then (live2 += 1000)

      if (live1 >> live2) then (child1 := child2, live1 := live2)

    )

    if (live parent >> live1) then (
      heap swap (HEAP BASE + child1 * 2, HEAP BASE + cur ptr * 2)
      cur ptr := child1
      live parent := live1
    ) else (cur ptr := 999)  #quit

  )
end

script, heap swap, ptr1, ptr2, begin
  variable (temp)
  temp := read global (ptr1)
  write global (ptr1, read global(ptr2))
  write global (ptr2, temp)
  temp := read global (ptr1 + 1)
  write global (ptr1 + 1, read global(ptr2 + 1))
  write global (ptr2 + 1, temp)
end
 


I'm sorry.

I'm not so sure heaps are the way to go under the limits of HS. On the other hand, it didn't work out too badly and is the way to go for very large lists.

PS: I found that getting extra globals by swapping out to save files works ready well, and it's not slow either (as long as you don't do it every tick or so). In Pheoret, I had a little set mem page script which swapped the correct set of 7 or 800 globals from savefiles to global array, checking against a tracking variable to check what the current was (and saving that if needed). I needed to set a mempagechanged variable when I modified the contents of the page.
_________________
"It is so great it is insanely great."
Back to top
View user's profile Send private message Send e-mail
Inferior Minion
Metric Ruler



Joined: 03 Jan 2003
Posts: 741
Location: Santa Barbara, CA

PostPosted: Wed Nov 16, 2005 9:39 am    Post subject: Reply with quote

The Mad Cacti wrote:
Horray, complicated uses of HS! Big grinBig grinBig grin

But Mike, why would we want an expanding array here? It's sort of irrelevant (and dangerous to use... at least, I tend to squeeze in my arrays as tightly and haphazardly as possible)

Yeah, there is no point doing this if you are going to only sort depending on the counter value instead of calculating time to live. I assume that each event needs to be only called once for one tick. If an event needs to be called every tick for 10, say, then things change and IM might correct me, but I don't think heaps will efficiently handle that (no deleting means recursive traversion). So, what exactly did you mean by "active elements"?


This could be handled by having the script which wants to run multiple times re-insert itself with a Time to Live of 1, but that would require a global variable for the script to decrement (start at 10 and as long as it is greater than 0, re-insert with TTL of 1). You're looking at a maximum of 4 swaps (log 20 in base 2) per re-insertion.

Linked list implementation:

* Execution Time
* Script Number
* Next Global (no guarantee the next item in the list is the next set of global variables)
* Number of Executions

When the TTL reaches 0, if number of executions is greater than 1, decrement number of executions and increment the Execution Time.

An array implementation would use the same implementation as Linked Lists, except you don't need Next Global. When the Execution Time is negative, skip the element, otherwise check to see if it needs execution.

The Mad Cacti wrote:
However, moving globals around in HS is slow, there is no command to do so (I've been meaning to implement one) which means it has to be done manually. So it would be a bad idea to use the heap idea directly. You could instead have a heap with pointers to the elements in the array along with activation time, an array of 20 elements, and a stack of unused elements. This is getting a little complicated and global hungry.


I don't see why the stack is needed. You're keeping track of 2 values: run timer and execution script. Throwing in an element pointer uses a global to point to a single global to return the script execution value. If more elements really are needed, your code would work, but you need to change your stack. It only allows for 1 variable per element right now and is a waste of globals.

Now that I look at it more closely, I think I see what you're trying to do with the stack, but you haven't initialized it anywhere and I'm not entirely sure you're using it correctly. I'll take another look at it when it's not 4 AM See the bottom of my post for how I would implement it

The Mad Cacti wrote:
Incase you do want to use a heap, and because I'd never implemented one before (especially not in HS!) and wanted to have some fun, I got carried away and wrote these scripts (which are entirely not bug proof but 90% logic proof):

Code:
global variable (
1, current counter
2, stack ptr
3, num elements
)

define constant (
100, STACK BASE
120, HEAP BASE
)

#example of your master loop
script, master, begin
  while (1) do (
    while (read global (HEAP BASE) == current counter) do (
      #process element at read global (HEAP BASE + 1)
      delete heap root ()
    )
    wait
    current counter := (current counter + 1), mod, 1000
  )
end


#inserts values into the heap, deals with the stack, and returns a ptr to
#an element to fill with data
script, new element, run timer, begin
  variable (element ptr, live time)

  # find a free element

  stack ptr -= 1
  if (stack ptr << 0) then () #throw an error?
  element ptr := read global (STACK BASE + stack ptr)

  live time := run timer -- current counter
  if (live time << 0) then (live time += 1000)


  # insert into heap (at the bottom)

  variable (cur ptr, parent ptr)

  write global (HEAP BASE + num elements * 2, run timer)
  write global (HEAP BASE + num elements * 2 + 1, element ptr)
  cur ptr := num elements
  num elements += 1


  # sort heap from bottom up

  while (cur ptr) do (   #until reach root
    # find parent
    parent ptr := (cur ptr -- 1) / 2

    live parent := read global (HEAP BASE + parent ptr * 2) -- current counter
    if (live parent << 0) then (live parent += 1000)
 
    if (live parent >> live time) then (
      live time := live parent
      heap swap (HEAP BASE + parent ptr * 2, HEAP BASE + heap ptr * 2)
      cur ptr := parent ptr
    ) else (cur ptr := 0)  #jump out of loop

  )

  return (element ptr)
end

# after pulling a reference off the heap and processing the element, call this is it is done
script, delete heap root, begin

  #stick it on the stack
  write global (STACK BASE + stack ptr, read global (HEAP BASE + 1))
  stack ptr += 1

  #put last heap node at the root
  num elements -= 1
  write global (HEAP BASE, read global (HEAP BASE + num elements * 2))
  write global (HEAP BASE + 1, read global (HEAP BASE + num elements * 2 + 1))


  #sort heap from top down

  variable (cur ptr, child1, child2, live1, live2, live parent)
  cur ptr := 0
  live parent := read global (HEAP BASE) -- current counter
  if (live parent << 0) then (live parent += 1000)

  while (cur ptr * 2 << num elements) do (  #has children


    child1 := cur ptr * 2 + 1
    child2 := child1 + 1

    #if more than 1 child, find the lowest life
    if (child2 <= num elements) then (

      live1 := read global (HEAP BASE + child1 * 2) -- current counter
      if (live1 << 0) then (live1 += 1000)
      live2 := read global (HEAP BASE + child2 * 2) -- current counter
      if (live2 << 0) then (live2 += 1000)

      if (live1 >> live2) then (child1 := child2, live1 := live2)

    )

    if (live parent >> live1) then (
      heap swap (HEAP BASE + child1 * 2, HEAP BASE + cur ptr * 2)
      cur ptr := child1
      live parent := live1
    ) else (cur ptr := 999)  #quit

  )
end

script, heap swap, ptr1, ptr2, begin
  variable (temp)
  temp := read global (ptr1)
  write global (ptr1, read global(ptr2))
  write global (ptr2, temp)
  temp := read global (ptr1 + 1)
  write global (ptr1 + 1, read global(ptr2 + 1))
  write global (ptr2 + 1, temp)
end
 


I'm sorry.

I'm not so sure heaps are the way to go under the limits of HS. On the other hand, it didn't work out too badly and is the way to go for very large lists.

PS: I found that getting extra globals by swapping out to save files works ready well, and it's not slow either (as long as you don't do it every tick or so). In Pheoret, I had a little set mem page script which swapped the correct set of 7 or 800 globals from savefiles to global array, checking against a tracking variable to check what the current was (and saving that if needed). I needed to set a mempagechanged variable when I modified the contents of the page.


Regarding your code, I don't see where stack ptr is ever initially set. Your first line in new element decrements stack ptr and throws an error if it's less than 0. The first line in hte master script should probably be stack ptr = 20. Don't you also need to store 20 global variables in those spots?

The second issue I see is in delete heap root. You increment the stack ptr without checking if there are any elements. Given an empty heap, you'll be incrementing the stack ptr every time current counter == 0.

Apart from that, and given my understanding of HS, I'd say it should work.

Given that Mr. B is still unsure which data structure to use for this, I'm going to do a quick overview of each structure:

All implementations:
* Running counter
* Stack of free elements (see Heap for explanation)
- The number of globals needed per element is different for each data structure.

Array:
* 3 globals per element:
- run timer
- script execution
- number of executions
* 1 global to keep track of how many elements you have in your list.
- You only have 20 free elements in your stack, so you can't have more than 20 elements in your list

Your master loop will have to check all 20 elements every run through.
Your insertion time is O(1)

Linked List:
* 4 globals per element:
- run time
- script execution
- number of executions
- next element in list
* 1 global to point to the head of the list
* 1 global to point to the tail of the list
* 1 global for number of elements
- Same as the array implementation

If you sort the list, you're looking at O(n) per insertion
+ You don't have to traverse the entire list in the master loop
- Slower insertion time
However, if you simply append to the tail, you're looking a O(1) per insertion
+ Faster Insertion
- Have to traverse the entire list in the master loop

Heap:

* 2 globals per array element
- Run Time
- Element Pointer
* 2 globals per element
- Script to execute
- Number of executions
* 1 global for the number of elements

You need to initialize your free elements stack. I'd take TMC's script, throw in a third constant, ELEMENT BASE, and throw in a for loop from 0 to 19 before the while(1) do in master. In that for loop, set STACK BASE + i = ELEMENT BASE + i * 2. That way, you allot 2 globals per element and you have 20 empty elements in your stack.

Insertion is done by popping an element off the stack, inserting values into that element, and storing the run time and element pointer into the array. The array is then sorted using the heap sort algorithm in TMC's script. Worst case is you have to swap the new node all the way to the root, log n swaps, which leaves you with a max of 4 swaps in a full heap.

Deletion is the reverse of insertion. You stick the deleted element pointer back into the stack, move the last element to the head of the heap and propogate down as needed. Again, max swaps is 4 in your case.

Since your list is sorted, the only time you ever traverse the entire list is when all elements in the list need to be executed. You simply check the first element's run time against your counter, process and delete as needed. By throwing in the Number of Executions variable, you can re-insert elements with Run Time +1 and Number of Executions - 1. At a possible 4 swaps per re-execution, this is minimal when compared to the time you save and the run time to do the same thing in arrays and linked lists.


Given you said you're looking at a small list most of the time, heaps are ideal. In terms of implementation, it's a sorted linked list with O(log n) instead of O(n). By sorting the list, you dramatically increase your run time. The arrays is 20 executions PER counter increment. Worst case of your linked list is also 20 executions. You're not going to be executing a script EVERY counter increment, and the heap gives you 1 operation per increment. If you do need to run a script multiple times, you gain a max of 4 swaps to re-insert, but that's faster than traversing through a possible 20 elements to sort your linked list on EVERY insertion. A few extra operations to swap elements pays off greatly when you consider the increase in the master loop.
_________________
Back to top
View user's profile Send private message Send e-mail Visit poster's website AIM Address MSN Messenger
Mike Caron
Technomancer




Joined: 26 Jul 2003
Posts: 889
Location: Why do you keep asking?

PostPosted: Wed Nov 16, 2005 10:31 am    Post subject: Reply with quote

Hmm, I suppose that my idea could be intepreted as a linked list.

But, I inteded it to be a means of speeding up a fixed size array. In his very first post, he says that he will only use 2-5 elements, with around 18 being a heavy-usage situation.

So, using that crazy big O notation, an insertion would be between O(1) and O(n), depending on how much fragmentation there is, a deletion would be O(1) (you already know where it is, just set it to zero), and accessing it would be O(1), for the same reason.

Unless, of course, multiplication gets slower depending on how many elements there are >_^
_________________
I stand corrected. No rivers ran blood today. At least, none that were caused by us.

Final Fantasy Q
OHR Developer BLOG
Official OHRRPGCE Wiki and FAQ
Back to top
View user's profile Send private message Send e-mail Visit poster's website MSN Messenger
Mr B




Joined: 20 Mar 2003
Posts: 382

PostPosted: Wed Nov 16, 2005 1:39 pm    Post subject: Reply with quote

Wow..! I haven't thought about heaps for a year.

I think that I will end up using a linked list for this particular problem. The heap looks really attractive, but in this particular case I don't think that it will offset the expense of additional variables and scripting complication. I don't know exactly how many actions I will end up storing (~2-5 on average was just a random guess), but I don't anticipate having so many as would make a heap shine.

I intend to have a global variable containing the cycle value at which the next most recent action is to be executed (updated for additions and deletions), which should allow me to avoid cycling through the list when there are no scripts to be executed.

I considered swapping variables and and out of other save slots, but I wasn't confident about the time delay. If it is as low as you seem to be saying, I may very well have to look in to it (considering how much data is popping up in this project).

Thanks for all of this data. I wasn't really certain whether such a thing could be done in the OHR, but this really boosts my confidence.

So um yeah. Thanks for taking the trouble for all of this research -- I really appreciate it.
Back to top
View user's profile Send private message
TMC
On the Verge of Insanity




Joined: 05 Apr 2003
Posts: 3240
Location: Matakana

PostPosted: Wed Nov 16, 2005 5:06 pm    Post subject: Reply with quote

Quote:
I don't see why the stack is needed. You're keeping track of 2 values: run timer and execution script. Throwing in an element pointer uses a global to point to a single global to return the script execution value. If more elements really are needed, your code would work, but you need to change your stack. It only allows for 1 variable per element right now and is a waste of globals.


Mr. B said "containing the reference ID of a script and the cycle time at which that script is to be executed (among other things, such as values to be passed to the scripts)" so I assumed that the elements might be large and allowed them to be any size without slowing down the sorting horrendously.

I was slack and didn't write any initiation code.

Right. The while loop in the master loop should read
Code:
    while (read global (HEAP BASE) == current counter && num elements >> 0) do (
      #process element at read global (HEAP BASE + 1)
      delete heap root ()
    )


Quote:
Given you said you're looking at a small list most of the time, heaps are ideal.


Did you mean linked lists? At 2-5 actions scheduled, linked lists would be faster because you don't have to resort to delete the first element.
_________________
"It is so great it is insanely great."
Back to top
View user's profile Send private message Send e-mail
Inferior Minion
Metric Ruler



Joined: 03 Jan 2003
Posts: 741
Location: Santa Barbara, CA

PostPosted: Wed Nov 16, 2005 5:46 pm    Post subject: Reply with quote

Mr B wrote:
Wow..! I haven't thought about heaps for a year.

I think that I will end up using a linked list for this particular problem. The heap looks really attractive, but in this particular case I don't think that it will offset the expense of additional variables and scripting complication. I don't know exactly how many actions I will end up storing (~2-5 on average was just a random guess), but I don't anticipate having so many as would make a heap shine.

I intend to have a global variable containing the cycle value at which the next most recent action is to be executed (updated for additions and deletions), which should allow me to avoid cycling through the list when there are no scripts to be executed.

I considered swapping variables and and out of other save slots, but I wasn't confident about the time delay. If it is as low as you seem to be saying, I may very well have to look in to it (considering how much data is popping up in this project).

Thanks for all of this data. I wasn't really certain whether such a thing could be done in the OHR, but this really boosts my confidence.

So um yeah. Thanks for taking the trouble for all of this research -- I really appreciate it.


By my count, a Linked List should require more globals: 20 for the stack (I don't know how you'd manage the "memory" efficiently without it), 4 "glob al" and 4 per element in your implementation, 3 per element if you throw out the number of executions variable for multiple executions. That's 104 or 84 globals total.

The Heap uses 20 for the stack, 1 "global" for the number of elements, 2 gobals per array element and 2 globals per "node". That's 101 globals. Just like with Linked Lists, you can remove that number of executions variable, only in heaps it saves you a massive amount of global variables. The only reason we're using a stack in the heap is to allow an element in the array to have more than one value associated with it. Remove number of executions and you remove the stack. You replace element ptr with the script number you wish to execute and you're left with 2 globals per array element and 1 "global" global for the number of elements. That's 41 globals in total.

In terms of efficiency, the biggest hit in the heap is swapping. However, you're dealing with a tree and you only swap up and down levels. At 20 elements, the most you can swap a single element is 4 times. Under 7 elements and you're looking at 2 swaps maximum per insertion. Using 4 elements as an example (a little long, and was considering an animated gif to illustrate my point, but was too lazy to build it Raspberry!):

Insert 1st element:
Linked List (Sorted):
- Grab from stack
- Set 4 variables
- Set head
- Set tail
- increment number of elements
* Running total (rough estimate): 7 operations
Linked List (Unsorted):
- Grab from stack
- Set 4 variables
- Set head
- Set tail
- increment number of elements
* Running total (rough estimate): 7 operations
Heap:
- Grab from stack
- Set 4 variables
- increment number of elements
* Running total (rough estimate): 5 operations

Insert 2nd element:
Linked List (Sorted):
- Grab from stack
- Set 4 variables
- Sort 2 elements (maximum 1 check)
- Set Next Element variable
- Update tail
- Possibly update head
- Increment number of elements
* Running total (Assuming worst case every time): 17 operations
Linked List (Unsorted):
- Grab from stack
- Set 4 variables
- Append to tail, setting the Next Element variable
- Update tail
- Increment number of elements
* Running total: 14 operations
Heap:
- Grab from stack
- Set 4 variables
- Maximuim 1 swap
- Increment number of elements
* Running total: 12 operations

Insert 3rd element:
Linked List (Sorted):
- Grab from stack
- Set 4 variables
- Sort 3 elements (maximum 2 checks)
- Set Next Element variable
- Update tail
- Possibly update head
- Increment number of elements
* Running total (worst case doesn't update head): 28 operations
Linked List (Unsorted):
- Grab from stack
- Set 4 variables
- Append to tail, setting the Next Element variable
- Update tail
- Increment number of elements
* Running total: 21 operations
Heap:
- Grab from stack
- Set 4 variables
- Maximuim 1 swap
- Increment number of elements
* Running total: 19 operations

Insert 4th element:
Linked List (Sorted):
- Grab from stack
- Set 4 variables
- Sort 4 elements (maximum 3 checks)
- Set Next Element variable
- Update tail
- Possibly update head
- Increment number of elements
* Running total: 40 operations
Linked List (Unsorted):
- Grab from stack
- Set 4 variables
- Append to tail, setting the Next Element variable
- Update tail
- Increment number of elements
* Running total: 29 operations
Heap:
- Grab from stack
- Set 4 variables
- Maximuim 2 swaps
- Increment number of elements
* Running total: 27 operations

Time to execute a script:
Linked List (Sorted):
- Analyze head, perform script accordingly
- Move head pointer
- Return global base to stack
- Decrement number of elements
- Check head and repeat if needed
* Running total (No repeat): 45 operations
Linked List (Unsorted):
- Traverse through list (4 checks max)
- Analyze node
- Update list to handle removing a node
- Possibly update head
- Possibly update tail
- Return global base to stack
- Decrement number of elements
- Continue through list and repeat if needed (You will always have to traverse the entire list in case multiple scripts need executing)
* Running total: 38 operations
Heap:
- Analyze root
- Return global base to stack
- Delete root (Maximum 1 swap)
- Decrement number of elements
- Check root and repeat if needed
* Running total: 32 operations

At 4 elements you're already seeing increased performance from the heap. The difference between the Sorted and Unsorted Linked List is when you traverse the entire list. In a Sorted version, you traverse the entire list on every insertion. In an Unsorted list, you traverse the entire list on every deletion. For every insertion, you'll have a deletion, but you get increased performance with the unsorted list when multiple scripts share the same run time.

You can extend my example out. For elements 5 - 7, the heap will have a max swap of 2 for both insertion and deletion. 8 will have 3 for insertion 2 for deletion. 9 - 16 will have 3 for both insertion and deletion. 17 will have 4 for insertion, 3 for deletion. 18 - 20 will have 4 for both insertion and deletion. These are maximum swaps, meaning worst case you do that many, but you won't always. An Unsorted Linked List maintains it's fast insertion, but deletion increases linearly. 5 elements requires 5 checks when you delete. Your max swap is always less than 5. In the end, you're gaining speed.
_________________
Back to top
View user's profile Send private message Send e-mail Visit poster's website AIM Address MSN Messenger
Mr B




Joined: 20 Mar 2003
Posts: 382

PostPosted: Thu Nov 17, 2005 2:20 pm    Post subject: Reply with quote

Wow...my head is spinning.

How can an unsorted linked list consistently require fewer actions than a sorted one?

The heap sounds good, but I don't know if it is taking into account my specific needs. Let me run over what I am thinking of doing with the linked list.

1.) The linked list will contain elements. Each element will be a set of global variables. The variables will contain A) a pointer to the next element in the list (or 0/-1 if the end), B) a cycle stamp showing the cycle it will be executed at, C) the ID of the script to be executed, D,E,F,G) four variables with data specific to the script being called.

2.) The linked list will contain x elements sorted by execution time; the element at the top of the list has the lowest cycle signature, and the one at the end has the largest. The element with the lowest cycle signature is not necessarily the element to be executed next, due to integer wrapping (or whatever it's called). However, the element to be executed next will almost always be at or near the top. The element at the bottom will almost always be the last one to be executed.

3.) There is a global variable that points to the first element in the list -- the one with the lowest cycle stamp (but not necessarily the one to be executed next).

4.) There is a global variable that points to the final element in the list -- the one with the highest cycle stamp. Whenever an element is inserted, it is checked against the final element variable to see if it is larger. If so, it is inserted after the final element and then the pointer is advanced.

5.) There is a global variable that registers the cycle at which the next action/element will be executed. This allows the scripts to completely bypass searching through the list if there is no script to be executed that cycle. This variable is updated every time an action is inserted or deleted.

6.) Whenever the scripts determine that an action/element needs to be executed, it begins at the first element in the list and continues/executes/deletes until it reaches an element with a cycle stamp larger than the current cycle.

It seems to me that the only time a heap would have an advantage is when integer wrapping occurs -- other than that, the execution process is linear right down the list, right?
Back to top
View user's profile Send private message
Inferior Minion
Metric Ruler



Joined: 03 Jan 2003
Posts: 741
Location: Santa Barbara, CA

PostPosted: Thu Nov 17, 2005 7:27 pm    Post subject: Reply with quote

Mr B wrote:
Wow...my head is spinning.

How can an unsorted linked list consistently require fewer actions than a sorted one?


You're working with a Linked List on a very small set of data. The intrinsic problem is you can't arbitrarily access any element in that list. In order to sort, you have to access each element individually to determine where the new element goes. Best case, it's appended to the front of the list, but that still requires 1 check. Worst case, it's added to the end of the list, which requires a full list traversal.

An unsorted Linked List always appends to the end of the list without checking any values. It's downfall is it requires a full traversal every deletion. You delete everything you insert. Therefore, in a worst case scenario your sorted Linked List only inserts elements that have a larger TTL than anything in the list and requires a full traversal during every insertion. During deletion, it must check the next node in case you have multiple nodes with the same Cycle Execution values. An unsorted Linked List requires a full list traversal anyways, so you have a slight performance boost whenever multiple elements share a Cycle Execution value.

In other words:
Sorted Linked List Worst Case
Insertion: Full list traversal
Deletion: Minimum 2 checks

Unsorted Linked LIst
Insertion: 1 operation, no checks
Deletion: Full list traversal, which means deleting multiple elements doesn't cost any more than deleting a single element. It's a slight performance boost, and only means increased performance on a small list, which you think you'll have most of the time.

Notice how both require a Full List Traversal, however the insertion on an unsorted list is quicker than deletion on a sorted list. Since you're primarily dealing with, as you estimated, 2 - 3 elements and a possible boost to 20 here and there, your sorted list is more than likely going to be running this worst case most of the time. Probability of the newly inserted element being the largest element in a list of 3 elements is 1 in 4, 25%. Yes, the larger the list gets, the less likely a worst case outcome becomes, but you're not dealing with a large list most of the time. Hindered improvement during a short burst of 20 elements makes up for consistent hindering followed by short bursts of improvement, in my opinion.

The easiest way to test would be to run the code and see what happens.

Mr B wrote:
The heap sounds good, but I don't know if it is taking into account my specific needs. Let me run over what I am thinking of doing with the linked list.

1.) The linked list will contain elements. Each element will be a set of global variables. The variables will contain A) a pointer to the next element in the list (or 0/-1 if the end), B) a cycle stamp showing the cycle it will be executed at, C) the ID of the script to be executed, D,E,F,G) four variables with data specific to the script being called.

2.) The linked list will contain x elements sorted by execution time; the element at the top of the list has the lowest cycle signature, and the one at the end has the largest. The element with the lowest cycle signature is not necessarily the element to be executed next, due to integer wrapping (or whatever it's called). However, the element to be executed next will almost always be at or near the top. The element at the bottom will almost always be the last one to be executed.

3.) There is a global variable that points to the first element in the list -- the one with the lowest cycle stamp (but not necessarily the one to be executed next).

4.) There is a global variable that points to the final element in the list -- the one with the highest cycle stamp. Whenever an element is inserted, it is checked against the final element variable to see if it is larger. If so, it is inserted after the final element and then the pointer is advanced.

5.) There is a global variable that registers the cycle at which the next action/element will be executed. This allows the scripts to completely bypass searching through the list if there is no script to be executed that cycle. This variable is updated every time an action is inserted or deleted.

6.) Whenever the scripts determine that an action/element needs to be executed, it begins at the first element in the list and continues/executes/deletes until it reaches an element with a cycle stamp larger than the current cycle.

It seems to me that the only time a heap would have an advantage is when integer wrapping occurs -- other than that, the execution process is linear right down the list, right?


First off, if you're sorting, you should use the TTL, not the Cycle Execution to sort. You're defeating the purpose of sorting the list if you start sticking scripts that won't be run in front of scripts that need to be run. Plus, the TTL calculation is 2 extra calculations.

The execution process is linear in a linked list. The more elements in your list, the longer the execution time. That's not the case in a heap. It's logarithmic. Your run time is log(n). For your application, the heap takes care of 2 - 6 much more efficiently than the linked list. I've altered TMC's code to match your specific needs:

Code:
global variable (
1, current counter
2, stack ptr
3, num elements
)

define constant (
100, STACK BASE
120, ARRAY BASE
160, FREE ELEMENTS BASE
5  , ELEMENT VARS
)

#Array Model:
# ARRAY BASE: Cycle Execution Time
#        + 1: Element Base Pointer

#Element Model:
# ELEMENT BASE: Script ID
#          + 1: Script Arg 1
#          + 2: Script Arg 2
#          + 3: Script Arg 3
#          + 4: Script Arg 4


#example of your master loop
script, master, begin
  variable (i)
  while ( i << 20 ) (
    # Write the available element base locations into the stack
    write global ( STACK BASE + stack ptr, FREE ELEMENTS BASE + i * ELEMENT VARS )
    i += 1
    stack ptr += 1
  )
  while (1) do (
    while (num elements >> 0,and,read global (ARRAY BASE) == current counter) do (
      #process element at read global (ARRAY BASE + 1)
      # As far as processing goes, ARRAY BASE + 1 stores the global for the element base
      # See the Element Model for accessing the global variables
      delete heap root ()
    )
    wait
    current counter := (current counter + 1), mod, 1000
  )
end


#inserts values into the heap, deals with the stack, and returns a ptr to
#an element to fill with data
script, new element, run timer, begin
  variable (element ptr, live time)

  # find a free element

  stack ptr -= 1

  # Any elements left?
  if (stack ptr << 0) then () #throw an error?

  element ptr := read global (STACK BASE + stack ptr)

  live time := run timer -- current counter
  if (live time << 0) then (live time += 1000)


  # insert into heap (at the bottom)

  variable (cur ptr, parent ptr)

  write global (ARRAY BASE + num elements * 2, run timer)
  write global (ARRAY BASE + num elements * 2 + 1, element ptr)
  cur ptr := num elements
  num elements += 1

  # sort heap from bottom up

  while (cur ptr) do (   #until reach root
    # find parent
    parent ptr := (cur ptr -- 1) / 2

    live parent := read global (ARRAY BASE + parent ptr * 2) -- current counter
    if (live parent << 0) then (live parent += 1000)

    if (live parent >> live time) then (
      live time := live parent
      heap swap (ARRAY BASE + parent ptr * 2, ARRAY BASE + heap ptr * 2)
      cur ptr := parent ptr
    ) else (cur ptr := 0)  #jump out of loop

  )

  return (element ptr)
end

# after pulling a reference off the heap and processing the element, call this is it is done
script, delete heap root, begin

  # Any elements in the heap?
  if ( num elements <= 0 ) then () #throw an error?

  #stick the element base pointer back on the stack
  write global (STACK BASE + stack ptr, read global (ARRAY BASE + 1))
  stack ptr += 1

  #put last heap node at the root
  num elements -= 1
  write global (ARRAY BASE, read global (ARRAY BASE + num elements * 2))
  write global (ARRAY BASE + 1, read global (ARRAY BASE + num elements * 2 + 1))


  #sort heap from top down

  variable (cur ptr, child1, child2, live1, live2, live parent)
  cur ptr := 0
  live parent := read global (ARRAY BASE) -- current counter
  if (live parent << 0) then (live parent += 1000)

  while (cur ptr * 2 << num elements) do (  #has children


    child1 := cur ptr * 2 + 1
    child2 := child1 + 1

    live1 := read global (ARRAY BASE + child1 * 2) -- current counter
    if (live1 << 0) then (live1 += 1000)

    #if more than 1 child, find the lowest life
    if (child2 <= num elements) then (

      live2 := read global (ARRAY BASE + child2 * 2) -- current counter
      if (live2 << 0) then (live2 += 1000)

      if (live1 >> live2) then (child1 := child2, live1 := live2)

    )

    if (live parent >> live1) then (
      heap swap (ARRAY BASE + child1 * 2, ARRAY BASE + cur ptr * 2)
      cur ptr := child1
      live parent := live1
    ) else (cur ptr := 999)  #quit

  )
end

script, heap swap, ptr1, ptr2, begin
  variable (temp)
  temp := read global (ptr1)
  write global (ptr1, read global(ptr2))
  write global (ptr2, temp)
  temp := read global (ptr1 + 1)
  write global (ptr1 + 1, read global(ptr2 + 1))
  write global (ptr2 + 1, temp)
end


You'll notice the code doesn't do anything when you've run out of elements. Not sure how to throw an error in HS. As long as you don't insert more than 20 elements, you should be fine. There are 2 comments showing where the script should crap out if you run out of elements or try to delete an element that doesn't exist. Technically, the latter could be fixed to just do nothing, but it's probably better to warn yourself that the script is doing something it shouldn't be doing instead of just ignoring it.

I've left the actual processing of an element that needs executing to you. When you want to insert an element, use the following code:

Code:
variable(temp)
temp := new element( Cycle Execution Time )
write global( temp, Script ID )
write global( temp + 1, Arg1 )
write global( temp + 2, Arg2 )
write global( temp + 3, Arg3 )
write global( temp + 4, Arg4 )

_________________
Back to top
View user's profile Send private message Send e-mail Visit poster's website AIM Address MSN Messenger
Mr B




Joined: 20 Mar 2003
Posts: 382

PostPosted: Fri Nov 18, 2005 10:10 am    Post subject: Reply with quote

You're right; it should be sorted by TTL instead of the cycle value. However, if integer wrapping were ignored, the cycle value would be identical to the TTL, right?

Inferior Minion wrote:
The execution process is linear in a linked list. The more elements in your list, the longer the execution time. That's not the case in a heap. It's logarithmic. Your run time is log(n). For your application, the heap takes care of 2 - 6 much more efficiently than the linked list.


Wait, how does this work? If I sort the linked list by TTL, then the first element(s) will always be the one to be executed (if any are to be executed at all). If the management script just continues down the list until it hits an element that shouldn't be executed, it gets nearly 1 execution per element traversal. However, the first few elements in a heap are not at all guaranteed to be the ones to be executed, right?

Additionally, if I have a pointer to the last element in the sorted list, then I will very rarely need to traverse the entire list in order to insert a new element. I mean, it could still happen, but not likely.

I guess what I have been considering for the linked list is a sort of modified queue, where new elements can be inserted at any point. Since I don't know how many elements I will actually be using most of the time, insertion is probably a lot more important that I have been treating it. And insertion is where a heap would vastly overwhelm a linked list...
Back to top
View user's profile Send private message
Display posts from previous:   
Post new topic   Reply to topic    Castle Paradox Forum Index -> HELP! All times are GMT - 8 Hours
Goto page 1, 2, 3  Next
Page 1 of 3

 
Jump to:  
You can post new topics in this forum
You can reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum


Powered by phpBB © 2001, 2005 phpBB Group