TCO 13: Marathon Finals, part 3


Current Status

While I was doing all of this tweaking, the situation on our leaderboard got quite interesting. ACRush was steadily climbing through the ranks, and at 16:45 he made a submission that suddenly put him right behind me – my lead shrank from 10%+ to somewhere between 2% and 5%. On the other hand, no one else seemed to be doing better. The strangest part was that wleite hadn’t submitted anything since 14:40 – I had no idea what the reason could be, especially since he’s not the type to hide his best submission until the last seconds. And he was still in 3rd place. Everyone else had low scores, and all of them rarely submitted anything, so I assumed they took an approach where they implement something complex right from the start (the complete opposite of what I was doing). To sum things up – I couldn’t have felt less confident about my temporary first place.

Between one batch of local tests and another, I tried to come up with some more advanced strategies than the silly one I was currently using. One thing that had me worried was that it was quite probable I’d run out of all the small things I could do to improve my score before the contest finished. In other words, I thought the potential of my solution (simulating a lot of different greedy strategies, without using any partial results) was pretty low, and it was quite probable that I’d hit a dead end 1-2 hours before the end of the contest. I’d be too tired to rewrite half of my code, and I wouldn’t have enough time to properly test it – which should be my top priority, considering that I was making a ton of bugs throughout this contest.

So once again I stared at the visualizer, waiting for miracle ideas. I noticed two very important drawbacks of my program. First, when a worker gets back to a base, there are usually 2 different equidistant tiles where he can stop. Right now I’m choosing either of them arbitrarily, but it’s quite probable that one of them is closer to the next resource. Each such wrong decision means 2 wasted turns for the worker, and this can add up very quickly, especially in the early turns. The second one is much more noticeable: if a high-value resource is surrounded/blocked by ones with low value, it’s very probable that I will never harvest it, even if it’s very close to a base. This happens very often in tests with high density, where most of the resources have low value. I spent 10-20 minutes overall thinking about how to fix either of them, but unfortunately I didn’t come up with any reasonable solution. In the case of the first problem, every idea that popped up had the flaw of requiring more BFSes, so any potential gain might be lost because my code would get much slower. And in the case of the second problem, I simply had no ideas at all. But at least I made a mental note that I had found two new places for possible improvement.

Advanced Concepts: Distribution of Tests

From now on I’m moving away from the chronological play-by-play description. It’s no longer suitable, because I was doing many things simultaneously – tweaking, bug-fixing, analysis, and implementing/testing advanced strategies.

Way more than half of the competition is behind us, and now it’s perfectly clear that our solutions (mine and my competitors’) are going to be quite bad compared to what could be created if this lasted two weeks. This means I want to focus only on grabbing easy points. A method that can often help with that, and one that I hadn’t used so far, is analyzing the test case distribution.

My first impression is that the size, the number of bases/workers, and the number of turns are not really important. The strategy stays pretty much the same, and the main difference is the number of calls to simulate() I can make. At least I haven’t noticed any major differences while watching the visualizer. The two variables I’ve left out are density (between 25% and 75%) and the resource distribution (somewhere between linear and power law). It turns out that low-density tests are pretty uninteresting for two very simple reasons: they contain so few resources that it’s very easy to harvest all, or almost all, of the valuable ones, and it’s very easy to place bases, since you can get to most places unobstructed. This means that score differences on those tests will be very small.

I can use this knowledge in several ways. I can stop caring about the low-density tests and focus only on high-density ones – if I’m able to solve dense tests efficiently, I’ll still get a good solution on those with very few resources. I could even remove low-density tests from my batch testing, which would reduce the testing time by about 50%. A safe way of finding those sparse tests is to calculate the variance (or even (maximum_score - minimum_score) / maximum_score) over the last few submissions and remove the tests with the lowest values. The last thing is that it provides me with an interesting insight – each 1% improvement is actually a 2% improvement on half of the tests. This means that I’m much further away from a good solution than I thought.
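This kind of filtering is easy to script. Below is a minimal sketch, assuming the per-test scores of the last few runs are already loaded into a matrix; the function name and the 0.01 threshold are my illustrative choices, not what I actually used:

    #include <algorithm>
    #include <vector>
    using namespace std;

    // scores[run][test] = absolute score of one of the recent runs on a given test.
    // Returns the indices of tests worth keeping in the local batch.
    vector<int> interestingTests(const vector<vector<double>>& scores, double minSpread = 0.01) {
        int tests = scores[0].size();
        vector<int> keep;
        for (int t = 0; t < tests; t++) {
            double lo = scores[0][t], hi = scores[0][t];
            for (const auto& run : scores) {
                lo = min(lo, run[t]);
                hi = max(hi, run[t]);
            }
            // (max - min) / max close to zero means every run scores nearly the
            // same, so the test can't differentiate solutions and can be dropped.
            if (hi > 0 && (hi - lo) / hi >= minSpread) keep.push_back(t);
        }
        return keep;
    }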

During the contest, I decided not to remove the low-density tests, because by the time I figured this out there was not enough time left to make the change worthwhile.

Advanced Concepts: Phases

One very important (if not the most important) skill in problem solving is to never have your mind fixed on a single idea. I’m always trying to look at what I’m doing from different angles. At first I tried to intuitively find a solution for the whole problem, and I failed pretty hard at that. Then I switched my thinking to the base placement subproblem, assuming I could do everything else naively and still get a decent result. After that, I assumed I was sticking to what I had written and just tried to find little improvements here and there. But as I said earlier, I had a strong feeling that it wouldn’t be enough. So once again I start up the visualizer, and this time I’m trying to understand what my solution actually does.

I noticed that there are 3 distinct phases in my solution:

  1. Single Base Phase: As we already know, dealing with a single base is a few orders of magnitude easier than dealing with many. Our goal here is to build all of the workers ASAP. In addition, all of our workers are generally close to our single base, so their positions don’t really matter. This makes it possible to actually use a scoring function to evaluate non-terminal states, i.e. states of the simulation before the final turn. Another important aspect is that the time needed to simulate the early turns should be rather low. And it doesn’t matter if we save ourselves 2 turns at the start of the game or at the end – the effect is absolutely the same, because we will still end up having 2 “additional” turns.
  2. Base Building Phase: This is the main part and up to this point, when I looked at the problem, this was the only phase I saw. It’s so complex that it’s uninteresting right now :)
  3. End Game Phase: After we finish building all of the bases and workers, the game is again highly simplified. We no longer have to build anything, which means that we only care about the final amount of resources we have on the very last turn.

Let’s expand on the concepts behind the first phase. Currently my approach is inefficient for two reasons: my solution is really bad at removing “blocking” resource tiles, and often it might be better to harvest a resource with a lower “resource-per-turn” ratio but one that is much closer, so that the next worker is created sooner.

I decided to solve the first phase with an alternative state-search kind of algorithm. My state is defined as an array of resource tiles. And when simulating, instead of maximizing my greedy evaluation function, I’m going for the first tile on the list that’s not yet harvested and has a valid path to it. My transposition function is to roll left/roll right – I’m choosing one element (tile) in the array and moving it to a different place, preferably somewhere close. The quality (score) of a state is the turn number on which all of the workers are built. The initial solution can easily be created by running simulate() and saving the order in which I harvested the resources. And as a metaheuristic I’ll use hill climbing.
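Here’s a minimal sketch of that hill-climbing loop. The simulateOrder() helper (which plays the greedy simulation following the given tile order) and the neighborhood size are my illustrative assumptions, not my actual code:

    #include <algorithm>
    #include <random>
    #include <vector>
    using namespace std;

    // Assumed helper: plays the greedy simulation following the given tile order
    // and returns the turn on which the last worker is built (lower is better).
    int simulateOrder(const vector<int>& order);

    vector<int> hillClimb(vector<int> order, int iterations) {
        mt19937 rng(12345);
        int best = simulateOrder(order);
        for (int it = 0; it < iterations; it++) {
            vector<int> cand = order;
            // "Roll": take one tile out and reinsert it at a nearby position.
            int from = rng() % cand.size();
            int tile = cand[from];
            cand.erase(cand.begin() + from);
            int shift = 1 + rng() % 3;
            int to = rng() % 2 ? from + shift : from - shift;
            to = max(0, min((int)cand.size(), to));
            cand.insert(cand.begin() + to, tile);
            int score = simulateOrder(cand);
            if (score <= best) { best = score; order = cand; } // accept ties so the search can drift
        }
        return order;
    }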

As for the “End Game” phase, there’s a nice observation. Let’s assume that we want to harvest some predefined set of resource tiles. If we want to do it fast, we should prioritize removing resources that, when removed, decrease the distance to other resources – and the simplest way to do so is to harvest the nearest one first. We can use this in the following way: run our simulate() two times, the first time to save the information about which tiles were harvested, and the second to make use of it.
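A minimal sketch of that two-pass trick; the simulate() signature with the record/target parameters is an illustrative assumption, not my real one:

    #include <vector>
    using namespace std;

    // Assumed helper: plays one full game and returns the final score. If `record`
    // is non-null it stores the tiles that were harvested; if `target` is non-null
    // the end game harvests only those tiles, nearest-first.
    int simulate(vector<int>* record, const vector<int>* target);

    int endGameScore() {
        vector<int> harvested;
        simulate(&harvested, nullptr);        // pass 1: learn which tiles we manage to harvest in time
        return simulate(nullptr, &harvested); // pass 2: harvest exactly that set, nearest-first
    }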

Back to Reality

Now we have all of the elements and I can finally explain what exactly happened. Around 16:00, I spent 40-60 minutes working on the single base phase. I used exactly the same algorithm that I just explained, and it changed literally nothing. To be precise, the new state always got a worse result than the starting one. I couldn’t figure out the reason – was it a bug, or was it just a bad concept right from the start? Finally, at around 17:00, I completely gave up on it – I was way too tired to keep going, and I still had a small lead (and there was a huge gap between the two of us and everyone else).

At 17:30 we were supposed to have a dinner break, so I went back to tweaking. That’s when I added weights for edges when computing base placement, permutations on order, better time-limit handling, and path recalculation when building a new base. The dinner break was moved to 17:50 and was very short – it was shortened by a “unanimous” decision while I was in the toilet ;) But I lost only 2, maybe 3 minutes, so I won’t complain.

When I got back, I worked on improving the speed of my solution. I added an early exit for the BFSes. I added a small random value to the base placement algorithm. I removed all of the unused code, since it was so messy that I could barely understand what was happening. Meanwhile, on the leaderboard, ACRush and I continued the fight for first place. The difference usually stayed between 1% and 2.5%. Every time I improved a bit, he quickly caught up, and vice versa. Here you can see a pretty accurate snapshot of the leaderboard at 19:00. The most interesting part was that neither wleite nor ainu7 had submitted anything for the past few hours.

The contest finishes at 21:00, so there are only around 2 hours left. At this moment I thought I had finished most of the small things that could be done (I missed a huge one – look at the comments of part 2). Anyway, it was the last moment to try anything more advanced. I decided not to go back to debugging the single base phase. Instead, out of the blue, I had this very strange idea of using hill climbing on base positions. The basic idea is as follows – I run my basic algorithm for most of the available time (let’s say 80%), then I take my best run so far and save the position and build time of each base that was placed. During the next simulations I rerun everything with the same parameters, but change the position/time of one of the bases. So this is very similar to what I wanted to do during the single base phase (which didn’t work out). Guess what – it didn’t work out either :) This time I believe I had no bugs – the idea was just bad. Even if a base is moved by only 1 tile, the run ends up with a completely different solution. And each simulation costs me a lot of time.
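For reference, a minimal sketch of that perturbation loop. The Base struct, the runWithPlan() helper (which replays the greedy simulation while forcing the given base plan), and the mutation magnitudes are all illustrative assumptions:

    #include <chrono>
    #include <random>
    #include <vector>
    using namespace std;

    struct Base { int x, y, turn; }; // where and when a base gets built

    // Assumed helper: replays the greedy simulation while forcing the given base
    // plan, and returns the final amount of resources.
    double runWithPlan(const vector<Base>& plan);

    vector<Base> perturbBases(vector<Base> plan, double seconds) {
        auto start = chrono::steady_clock::now();
        mt19937 rng(42);
        double best = runWithPlan(plan);
        while (chrono::duration<double>(chrono::steady_clock::now() - start).count() < seconds) {
            vector<Base> cand = plan;
            Base& b = cand[rng() % cand.size()];
            if (rng() % 2) { b.x += (int)(rng() % 3) - 1; b.y += (int)(rng() % 3) - 1; } // nudge position by 1 tile
            else           b.turn += (int)(rng() % 11) - 5;                              // nudge build turn by up to 5
            double score = runWithPlan(cand);
            if (score > best) { best = score; plan = cand; }
        }
        return plan;
    }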

Around 19:35, while I was still wasting time on my new and shiny concept, ACRush made a huge 5% jump with a single submission. Suddenly, I was the one struggling to catch up. It wouldn’t have been that bad if I hadn’t been in the middle of doing nothing, with no prospects. I put my new creation (HC on base placement) on hold and quickly add the “end game” phase algorithm that I already explained. It gives me around 1.5%, but I’m still around 1.5% behind ACRush. I go back to base placement. 15 more minutes of wasted time and it’s 20:00. Meanwhile, nhzp339 was quickly catching up to us. The only good news is that wleite finally submitted (after 5 hours!), and he’s still far from the top.

There’s one thing that I haven’t talked about so far: the submission queue. It took around 2 minutes to run 100 tests on the server. This is pretty fast, but we had eleven people sharing one queue. If everyone submits a solution at the end (and I knew they would), it’s at least 12-15 minutes before I get another chance to resubmit mine. Having a somewhat secure second place, I didn’t want to risk it by submitting anything during the last 10 minutes, unless the change was very small or I fell out of the top 2. Realistically, I had around 40-45 minutes of available time at most.

As I was only around 1% behind ACRush, I decided to try small tweaks. After all, doing those had got me a steady 5% per hour for the last few hours. This looked like a good decision, especially since it somewhat guaranteed me at least a small improvement. I was probably sacrificing a bit of monetary expected value (first place is $15K and second is $5K) for a higher chance of ending up as the runner-up. But that was fine. At this point I was probably more motivated by avoiding failure (dropping out of the top 2 after leading for several hours) than by hoping for a win. I was also tired as hell – the lack of any physical exercise takes its toll. If anyone had offered me second place in exchange for withdrawing from the contest right then, I’d have happily taken it :)

The following 30 minutes were very depressing for me. Every small tweak I tried was either worse or statistically insignificant. I no longer had time for full batch testing – I usually had to abort runs as soon as I noticed they were worse, and unfortunately I had to do this by hand (my tester doesn’t have such a feature), just by looking at the result files. Overall I gained ~0.4% at most, which was almost as good as nothing. While I desperately searched for any points, nhzp339 was closing the distance to me. The only good news was that ACRush seemed to have the same problem as me.

The Finish Line

This is the leaderboard at 20:30. More or less, I have enough time for one more shot. I’m pretty sure I have no bugs – the only exception is a possible time-out, but I don’t really want to think about that now. I go back to watching the visualizer one last time, waiting for a miracle idea. This is my penultimate submission, #47 (it’s an overall 10% improvement over #26):

Immediately I recalled the mental note about the two possible improvements mentioned earlier: when a worker returns to a base, choose the best tile instead of an arbitrary one (to minimize the distance to the next resource); and sometimes high-value resources are surrounded by worthless ones, and because of this I never harvest them. The second problem is very obvious in seed #7 (in all of the videos it’s the third test). The only problem is that I had already lost a significant amount of time on both of these – and, more importantly, I did it while I still had some energy left. Now, after nearly 12 hours, I can no longer think clearly and I can’t really code anything complex. Or at least I shouldn’t.

I take the first problem and again I can’t make any progress. I don’t have enough time to do a proper analysis, but my intuition tells me that the potential slowdown (fewer simulations) will offset any gain from better worker management. For the second problem, it’s clear that I have to find a very simple approach. A complex one would be to create multi-stop paths for workers, so that I can plan more than one move ahead – but there’s no way I’m rewriting the core parts of my code at this point. I did find a simple way, though: I can artificially increase the value of blocking resources, which increases the probability that they will be harvested. Definitely not a flawless solution, but it should work.

I’m now left with two subproblems: how to detect blocking resources, and by how much to increase their value. I saw in the visualizer that the most common scenario is a high-value resource directly blocked from all 4 sides by low-value ones. Inspired by this, I decided to detect blocked resources in the simplest possible way – checking whether a tile is surrounded on all 4 sides. And the bonus for the blocking resources was simply the value of the resource in the middle, divided by some magic constant. Unfortunately, I wasn’t thinking exactly straight and I wrote this whole “bonus value” functionality in a way that recalculates it every turn – which resulted in a massive slowdown. There’s no need for such a thing – it’s more than enough to recalculate only the positions that have changed. But I didn’t do that either, since I wanted to submit something soon. Or more like now. And if I submitted right away, I’d still have 2-4 minutes to resubmit my solution in case there was a hidden bug somewhere. So I simply changed my recalculation routine to run every 10 turns – definitely not ideal, but every minute was gold at this moment. I started my local testing, and when I saw that on test #7 I got a 60% (!!!) improvement, I decided to submit – that was enough information for me. At 20:45 I made my last submission; below you can see how it performed – there are no noticeable differences except on test #7 (the third one).
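For reference, here’s a minimal sketch of that bonus computation. The grid representation and the MAGIC divisor are illustrative assumptions, and border tiles are skipped for brevity:

    #include <vector>
    using namespace std;

    const double MAGIC = 4.0; // hand-tuned divisor; the exact value here is illustrative

    // value[y][x] > 0 for resource tiles, 0 for empty/base tiles.
    // Adds a bonus to the 4 neighbors of every fully surrounded resource,
    // so the greedy simulation becomes more willing to harvest them.
    vector<vector<double>> blockingBonus(const vector<vector<int>>& value) {
        int h = value.size(), w = value[0].size();
        vector<vector<double>> bonus(h, vector<double>(w, 0.0));
        const int dx[] = {1, -1, 0, 0}, dy[] = {0, 0, 1, -1};
        for (int y = 1; y + 1 < h; y++)
            for (int x = 1; x + 1 < w; x++) {
                if (value[y][x] == 0) continue;
                bool blocked = true;
                for (int d = 0; d < 4; d++)
                    if (value[y + dy[d]][x + dx[d]] == 0) blocked = false;
                if (!blocked) continue;
                // Reward removing the tiles that wall off this resource.
                for (int d = 0; d < 4; d++)
                    bonus[y + dy[d]][x + dx[d]] += value[y][x] / MAGIC;
            }
        return bonus;
    }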

The submission queue was already full, but it seemed that I’d get the results before 21:00 and have one more chance to submit. The scoreboard is going crazy – almost everyone has a submission in the queue, and it’s extremely hard to tell what the actual scores look like. That’s because when relative scoring is used and someone has a solution in the queue, his score is frozen and his results are not taken into account when calculating the scores of the other participants. So when 2/3 of all competitors submit at once, it gets very messy. Meanwhile, my local tests finish and I see that the new idea gave me a 1.5% improvement, which should be enough to overtake ACRush, but only by a small margin. I’m torn between not doing anything more (because another submission could be very risky) and trying to polish my new enhancement. Ideally, I would resubmit only if someone was very close to me (either just above or just below); in any other situation I would leave my last submission intact – but I won’t have that option here if anyone submits a few minutes before the end. I decided to try a few more tweaks, but unfortunately nothing gave me any statistically significant amount of points. No more submits for me.

With 1-2 minutes to go I started packing my things – stealing official TopCoder pencils, to be exact ;) – and got up from my competitor station. The queue was still full. ACRush submitted with 2.5 minutes left on the clock, nhzp339 with 2, and ainu7 in the last minute. I’m pretty much shaking – I barely have enough strength to walk. Those last 30 minutes were nerve-wracking for me, and I have to wait 10-15 more minutes to see the “final” provisional scoreboard. I was able to exchange a few words with ACRush, and he told me that his last submission was actually a resubmit of an old solution. I felt a bit safer, because I knew I had already beaten it on the provisional tests. I get off the stage. Rustyoldman and darnley (sorry if I confused you with someone else!) ask me what actually happened. For spectators, the leaderboard was probably even less readable than for me. I try to explain everything I can, but I’m pretty sure I’m just mumbling some nonsense. When I’m so exhausted, “I can’t English”.

We look at the big leaderboard while the results slowly kick in, and with each new result I feel a little better. ACRush is indeed around 1% behind me. And both nhzp339 and ainu7 are another 1% lower, both having made a significant jump with their last submissions (in ainu7’s case it was 10%!). This is what we saw. I feel extremely lucky. If the competition had ended 30 minutes sooner or later, I’m pretty sure you would not be seeing this blog ;)

The contest has ended, but the worst part is actually ahead of me (and a few others). We have to wait two more days to see the results. My lead is rather significant and I’m guessing I have around a 60% chance of winning, as I’m still worried about the possibility of a time-out. That waiting is absolutely the most unpleasant part of any competition. You no longer have any influence over what happens, but your mind is still overwhelmed with the past, constantly over-analyzing various “what-if” situations. And it won’t know peace until it sees the final scores. This is the third TCO in a row where this has happened to me. It’s one of the very few moments when I can honestly say “I hate my life” :)

And here goes artificial suspense… ;)

Continue to Part 4

5 thoughts on “TCO 13: Marathon Finals, part 3”

  1. Mimino

    How many local tests do you usually run to determine whether the new solution is better/worse?
    Do you run the tests for the full 10s time limit (that comes to 360 tests per hour) or do you reduce the limit on local runs?

    1. Psyho

      The number of tests is not fixed. I usually start at 100 (although I may base my decisions even on the first 10 or so) and later it might be increased when it’s no longer enough. Such decisions (to increase the number of tests) are made by hand.

      During the finals I believe I used a constant number of 100 tests. It was fast enough (although I wouldn’t mind if it were 10x faster), but barely accurate enough – improvements of 0.2-0.3% could easily perform worse.

      Normally I use local time limits that match the same number of iterations / the same speed as on the server. For example, here it was enough to print out the number of BFSes and scale the limit appropriately.

      I’d say that in at least 70-80% of contests my local testing just mimics the server’s. The most common exception is using a subset of seeds, just as explained in the write-up.

