Heapsort Problems<

windows

Guest
I got a heapsort algorithm of the net (<!-- m --><a class="postlink" href="http://linux.wku.edu/~lamonml/algor/sort/heap.html">http://linux.wku.edu/~lamonml/algor/sort/heap.html</a><!-- m -->) and modified it from C++ into PHP, and for a slightly more complicated problem.

Now I am having a little difficulty with it working.

here is my code:


function test_heapSort($arr, $arrSize)
{
for ($i = ($arrSize / 2) - 1; $i >= 0; $i--)
$arr = test_siftDown($arr, $i, $arrSize);

for ($i = $arrSize - 1; $i >= 1; $i--)
{
$tmp = $arr[0];
$arr[0] = $arr[$i];
$arr[$i] = $tmp;
$arr = test_siftDown($arr, 0, $i-1);
}

return $arr;
}

function test_siftDown($arr, $root, $btm)
{
$done = 0;

while(($root*2 <= $btm) && (!done))
{
if ($root*2 == $btm)
$maxChild = $root * 2;
elseif ($arr[$root*2]['time'] > $arr[$root*2+1]['time'])
$maxChild = $root *2;
else
$maxChild = $root * 2 + 1;

if ($arr[$root]['time'] < $arr[$maxChild]['time'])
{
$tmp = $arr[$root];
$arr[$root] = $arr[$maxChild];
$arr[$maxChild] = $tmp;
$root = $maxChild;
}
else
$done = 1;
}

return $arr;
}


I put in this array:

Time : 2003-12-18 11:17:58 ID : 60 Value : -48.18
Time : 2003-12-18 11:18:07 ID : 60 Value : -37.96
Time : 2003-12-18 11:18:17 ID : 60 Value : 5.92
Time : 2003-12-18 11:17:58 ID : 59 Value : 32.80
Time : 2003-12-18 11:18:07 ID : 59 Value : 40.95
Time : 2003-12-18 11:17:58 ID : 61 Value : -15.37
Time : 2003-12-18 11:18:07 ID : 61 Value : -5.16
Time : 2003-12-18 11:18:07 ID : 61 Value : 2.99
Time : 2003-12-18 11:18:17 ID : 61 Value : 46.87

and got this out:

Time : 2003-12-18 11:18:07 ID : 60 Value : -37.96
Time : 2003-12-18 11:18:17 ID : 60 Value : 5.92
Time : 2003-12-18 11:17:58 ID : 59 Value : 32.80
Time : 2003-12-18 11:18:07 ID : 59 Value : 40.95
Time : 2003-12-18 11:17:58 ID : 61 Value : -15.37
Time : 2003-12-18 11:18:07 ID : 61 Value : -5.16
Time : 2003-12-18 11:18:07 ID : 61 Value : 2.99
Time : 2003-12-18 11:18:17 ID : 61 Value : 46.87
Time : 2003-12-18 11:17:58 ID : 60 Value : -48.18


seeing as how I am trying to sort for time - it isn't going well is it...

any ideas?
Cheers in advance,

HKif all you are doing is sorting an array then use asort() or one of the others.

how are you calling that function?just calling it as

$arr = heapsort($arr, sizeof($arr));



Problem is - I am handling lots of data (my array could have upto 10,000 timestamps in there), and went to find more efficient algorithm - the Heapsort seemed the best performer.Well it seems it just throws the rows around a bit, at the moment...
So let me see if I get this right,
in the array $arr, you have this(?):

$arr[n]['time'] // contains strings like "2003-12-18 11:17:58"
$arr[n]['id'] // contains strings/ints like "60"
$arr[n]['value'] // contains strings/floats like "-48.18"

And you want to sort by date(ascending or descending? too lazy to examine the code that close). Comparing strings like "2003-12-18 11:17:58" will not work, as far as I know, and that's what you're doing at the moment... You have to make them into timestamps first(integers), and then compare them, right?
mktime() (<!-- m --><a class="postlink" href="http://www.php.net/manual/en/function.mktime.php">http://www.php.net/manual/en/function.mktime.php</a><!-- m -->) would do that for you... Or am I not seeing the real problem?it does sort with asort() - but i need to cut down on the time spent sorting.

because strings and characters can be given ordinal values - ie, a comes before b, as much as 1 comes before 2

i will try the timestamps idea anyway - cheersYeah maybe sort() function could do it.. But I think I would use the timestamps, it's just my style ;)
If you are so concerned with the time spent sorting the array, you should use another language I think, something that works on a lower level and that is compiled. PHP just doesn't perform very well with this type of thing.. But you probably know that already ;)job constraints.

PHP is free and relatively painless to setup on w2k servers.

we can't charge for the tool I'm working on, so it has to be cheap to make.


to distribute compiled programs means paying for the compiler and licences and stuff - never get my boss to sign off on it.


but enough about the troubles.

the speed thing is not too much of a problem - the servers will have very low user base anyway (between 1 and 5), so the server load is pretty low.

but the heapsort does improve the performance of a numeric only array by about 50%.there are a lot of sorting for arrays. make sure you get the corret one.

sort() : sorts an array.
ksory() : sort by keys
asort() : Sort an array and maintain index association
arsort() : Sort an array in reverse order and maintain index association
uasort() : Sort an array with a user-defined comparison function and maintain index association
uksort() : Sort an array by keys using a user-defined comparison function
usort() : Sort an array by values using a user-defined comparison function

the last three you could do as well. just an idea.do you know what algorithm they use in those functions?

I think its a simple bubble sort - which is too slow for the amount of data i am trying to handle.I wouldn't think you can get any faster that an internal function.well, it still needs to run through an algorithm which passes through the array at least once.

the question is how much effort did they put in coding that function.


fact is, I tested my heapsort on an array of integers against asort(), and heapsort won.
problem I have now is that for some reason it won't sort the dates properly.umm horus, you have a couple of problems. the code works. you just have to fix your typos.

function test_siftDown($arr, $root, $btm)
{
$done = 0;

while(($root*2 <= $btm) && (!$done))
{
if ($root*2 == $btm)
$maxChild = $root * 2;
elseif ($arr[$root*2]['time'] > $arr[$root*2+1]['time'])
$maxChild = $root *2;
else
$maxChild = $root * 2 + 1;

if ($arr[$root]['time'] < $arr[$maxChild]['time'])


also double check your text case on these

$arr[$maxChild]['time'])okay - the times for each variable are correct now, but they are still grouped:


Time : 2003-12-18 11:17:58 ID : 60 Value : -48.18
Time : 2003-12-18 11:18:07 ID : 60 Value : -37.96
Time : 2003-12-18 11:18:17 ID : 60 Value : 5.92

Time : 2003-12-18 11:17:58 ID : 59 Value : 32.80
Time : 2003-12-18 11:18:07 ID : 59 Value : 40.95

Time : 2003-12-18 11:17:58 ID : 61 Value : -15.37
Time : 2003-12-18 11:18:07 ID : 61 Value : 2.99
Time : 2003-12-18 11:18:17 ID : 61 Value : 46.87




also double check your text case on these

$arr[$maxChild]['time'])

:confused: how do you mean?still grouped? how are you printing the array to the screen?

$arr[$maxChild]['time'])
$arr[$maxChild]['Time'])

make sure you have the correct one as those are totally different.damn damn damn damn.

well i fixed it all now!

just to let you see the big picture, I'll attach my code

there was some problem in the heapsort which you guys helped to fix, but there was another problem in the order I was doing things - I originally applied my unique function to before the heapsort.

i put that right, and hey presto, it's ordered correctly now.well when I ran the code you have here and the array I got the output just right. glad you got it fixed.i tried it on a larger data population last night - bugger!

it's fine until you get halfway, and then the timestamps start getting random.


you don't want me to post all of that data :)sounds like it just loses it train of thought after awhile. can you attach a page here with just the array. it doesn't matter how big it is, well the allotted amount for a attatchment. :)
 
Back
Top