TCO 13: Marathon Finals, part 2

Multibase Solution

It’s 11:30, we’re 2.5 hours into the competition, and I’m now absolutely forced to do the multibase strategy. No more running away. And I still have absolutely no clue how to do it. If something looks way too complex, it’s a good idea to try to split it into smaller parts. In this exact situation, I came up with the following subproblems:

  1. Which worker should build the base?
  2. Where should it be placed?
  3. When should it be placed?
  4. Once I have more than 1 base, how should I manage the workers?
  5. How many bases should I build?

Now I’m looking for the simplest way to accomplish each of those subproblems. First of all, let’s assume I’m trying to build at most 1 base at a time. (1) is pretty trivial: let’s build it with any worker that can do it. Since there’s a constraint that only a worker that isn’t carrying any resources can build a base, let’s use only those. And if all of them are busy carrying resources, just wait for one of them to be free. (2) is the scary part, so I just imagine that it isn’t here and move on with my list. The simplest solution to (3) is to build each base as soon as we can (i.e. as soon as we have the resources), and to build bases only after I have built all of my workers. Remember that in each simulation I know how many workers I’m going to get. For (4) I can just move each of my workers to the closest base. My intuition is that the workers will “manage themselves” after some time. It’s definitely far from perfect, but on the upside, I already have that implemented, so this is an ideal solution. (5) is even more trivial, because I use the number of bases as an input to my simulate() function and I’m assuming that I’ll just iterate over all possible combinations of workers and bases.
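The closest-base rule for (4) can be sketched in a few lines. This is an illustrative reconstruction, not the contest code; the function names are mine.

```python
# Sketch of subproblem (4): each worker delivers to its closest base
# (Manhattan metric). Names are mine, not from the original solution.

def manhattan(a, b):
    """Manhattan distance between two (x, y) cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def closest_base(worker, bases):
    """Pick the base nearest to this worker; ties broken arbitrarily."""
    return min(bases, key=lambda base: manhattan(worker, base))
```

With this in place, worker management reduces to calling `closest_base` whenever a worker needs to deliver resources, and letting the workers “manage themselves” as described above.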

So I’m left with only (2), because it turned out that all the other subproblems were either trivial, or I could just ignore their complexity without losing too many points. I figured out that I have 2 different types of strategies for finding a possible placement for bases. Either I can pre-calculate the best positions for bases and try to build them in those places, or I can build them on the fly as I’m simulating, probably using some kind of greedy function.

Now, the important part here is that at 12:00 we have an obligatory lunch. We can take our pencils & papers with us, but that’s all. The lunch break lasts around 30-45 minutes. And it’s the best moment for a long toilet break ;) When the break happens I don’t want to be in the middle of coding, so I’m doing all of the small things that can be done before the break and leaving the thinking about (2) for lunch. Again, good time management.

It also turned out I had to rewrite my simulation and path-generation code, because for some stupid reason I had thought it was a good idea to remove a resource tile as soon as I generated a path for a worker to gather it. That can’t work, because if I end up using that worker to build a base instead, he will never gather that resource. A bit more wasted time.

Next was lunch and sadly, it was uneventful for me. I had already decided before lunch that I’d go with the more ambitious idea of pre-calculating the base positions. Again, it seemed like a complex task, so I decided to split it into subproblems. The first one is to write a function that will evaluate how good my chosen positions are, and the other is to select them. The evaluation function is pretty straightforward and goes in the same spirit as my basic greedy approach – I compute the average value of resource_value / manhattan_distance_to_closest_base. With such an evaluation function, the task of finding the correct placement for bases is now very easy – with a little MM experience you should know that any basic variation of hill climbing will do the job. One final note is that I completely forgot about the order in which I should place the bases and did nothing about it, which means I’m placing them in one fixed order (always the same).
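A sketch of that evaluation function plus a basic hill climb might look as follows. The neighborhood move, iteration count, and all names are my assumptions; only the scoring formula comes from the text.

```python
import random

def evaluate(bases, resources):
    """Average of resource_value / manhattan_distance_to_closest_base,
    as described in the text (higher is better)."""
    total = 0.0
    for (rx, ry), value in resources.items():
        d = min(abs(rx - bx) + abs(ry - by) for bx, by in bases)
        total += value / max(d, 1)  # guard against a resource on a base
    return total / len(resources)

def hill_climb(bases, resources, size, iters=1000, rng=random):
    """Basic hill climbing: nudge one random base by one cell,
    keep the move only if the evaluation improves."""
    bases = list(bases)
    best = evaluate(bases, resources)
    for _ in range(iters):
        i = rng.randrange(len(bases))
        old = bases[i]
        bases[i] = (min(size - 1, max(0, old[0] + rng.randint(-1, 1))),
                    min(size - 1, max(0, old[1] + rng.randint(-1, 1))))
        score = evaluate(bases, resources)
        if score > best:
            best = score
        else:
            bases[i] = old  # revert a non-improving move
    return bases, best
```

Any “basic variation of hill climbing” in this spirit would do; the point is only that the evaluation function makes the search trivial to set up.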

Now when it came to practice and my wonderful idea had a reality check, it turned out that in many tests most cells are simply inaccessible to workers, so I can’t put a base wherever I want. At this point I figured out that computing positions for new bases without taking the actual resource layout into account just won’t be good enough. More importantly, since I’m using only a greedy function for picking the next resource to mine and there’s no planning ahead, I can’t really predict what will happen. And this will only get worse with more bases. But it’s been nearly 2 hours without me submitting anything and I feel an urge to finish it ASAP. Otherwise I’ll lose a lot of motivation. I just want to do anything – even placing random bases would be much better than not placing them at all. I thought about the random approach for a bit, but I decided that I’ll just place each base in the closest spot that is accessible. This doesn’t sound that dumb at first, but wait till you see how it performs ;)

It’s almost 13:30, and the funny thing is that I actually felt good about my base placement algorithm. After all, I had just spent nearly an hour coding something quite complex, and it took me over 100 lines of code to get it. By that time wleite had a massive lead on everyone, and it was clear that he was using multiple bases. I believe I was still in 2nd place at that time. Anyway, I’m testing my new and shiny solution on a few seeds. It looks awful (you can see the results below), but since I’m simulating everything and choosing the best solution I’m able to generate, in theory I can’t get any worse result. In theory. So I’m expecting a massive improvement. My local tester crashes on seed #11, but who cares about my local tester :) I submit my solution (#7) and I get a decent improvement of… -20%. Not what I expected. Here you can find the blog post made by Rustyoldman that also contains the scoreboard after I made submission #7 (or to be more exact, just before my #8 was scored).

Again, priorities first. Let’s eliminate the bugs. Over the next 30 minutes I’m making a few more submits, each time removing yet another bug introduced in my new path-generation routines. And finally at 14:15, with submission #11, my local tests don’t crash anymore. Since I had no more crashes, I could run tests locally and see that I had a nice 20% improvement over #6, but I was still behind wleite. Then I discovered that, due to being worried about the time limit, I was using at most 2 bases. After all, this already gives me 20-40 full simulations (number of workers * 2), and since I’m measuring the time for everything, I could already see that some tests were very close to timing out. I changed my code to test all combinations of bases/workers and added an early exit if I’m close to the time limit. Another 7-8% improvement (submission #13), but I believe it was still behind wleite.

Even a quick glance at how my solution performs gives a clear message that my base placement algorithm sucks badly. But the main difference between now and two hours ago is that it exists now, and I can endlessly stare at the visualizer and maybe some day I’ll understand why it sucks so badly. But let’s note its obvious defects. The main problem that immediately pops up is that I very often put the bases insanely close to each other. After all, my logic didn’t take into account the current placement of bases; it only accounted for where the bases were supposed to go. So I scrapped my precious 100+ lines of placement logic and replaced them with a simple greedy “maximize the distance to the closest edge and to other bases”, which took less than 10 lines. One batch of local tests later and I’m screaming inside “give me back my lost time”, because I just got an over 10% improvement by essentially discarding my complex logic and using something very naive as a replacement. This goes out as my submission #15, which I’m pretty sure put me in 1st place, and now it’s tweaking time.
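That replacement greedy really does fit in under 10 lines. Here is one plausible reading of “maximize the distance to the closest edge and to other bases” (hypothetical names, Manhattan distances assumed):

```python
def place_base(size, bases, accessible):
    """Greedy placement: among accessible cells, pick the one that
    maximizes the minimum of (distance to nearest map edge,
    Manhattan distance to each existing base)."""
    def score(cell):
        x, y = cell
        edge = min(x, y, size - 1 - x, size - 1 - y)
        to_bases = [abs(x - bx) + abs(y - by) for bx, by in bases]
        return min([edge] + to_bases)
    return max(accessible, key=score)
```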

As you see, the solution is still pretty stupid, but at least I no longer have the problem of putting 2 bases very close to each other, which is actually a big step forward.

At this point I feel quite comfortable. It’s around the halfway point of the contest, and I have a very simple solution and lots of places where I know I can make improvements. To summarize, my solution works as follows. I have a function simulate() that simulates the whole process from scratch and takes the number of workers and bases as parameters. First I try to build all of the workers, then I try to build all of the bases. Each worker harvests the resource that maximizes the value / distance ratio, and each base is built on the tile that maximizes the distance to other bases and to the edges of the map. There’s no planning ahead in my algorithm and everything is pretty random. The strong point is that I’m calling simulate() with all of the combinations of the number of workers and bases to build. And since I’m keeping the best result, one of those simulations hopefully produces a quite decent result. And finally, so far every combination of parameters is run only once.
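The outer loop of this summary might look roughly like the sketch below. The simulate() signature, its (score, moves) return value, and the parameter ranges are my assumptions.

```python
def solve(max_workers, max_bases, simulate):
    """Run simulate() for every (workers, extra bases) combination
    and keep the best-scoring result. simulate is assumed to return
    a (score, moves) pair; bases counts additional bases to build."""
    best_score, best_moves = float("-inf"), None
    for workers in range(1, max_workers + 1):
        for bases in range(max_bases + 1):
            score, moves = simulate(workers, bases)
            if score > best_score:
                best_score, best_moves = score, moves
    return best_moves
```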

The Midpoint

This is the perfect moment to start experimenting with the solution. What I do is rapidly test a lot of small, different ideas – sometimes well thought out, sometimes almost entirely random – and see how they affect my score. It’s very important to distinguish what I’m doing now from what I did before. I stopped applying my problem-solving skills, and right now I’m just a curious child. I want to toy around with every part of the code that I believe could affect the results, and if a change improves them, I’ll keep it. I’ll also do a lot of jumping around between different parts of the code – mainly because I don’t want to miss any obvious improvement.

One batch of testing takes between 2 and 3 minutes. The goal is to have the tester running almost the whole time. The way I do this is actually pretty simple – I work on adding some small functionality, and if it’s not done by the time the tester finishes, I simply tweak one of the parameters and run the tester with the tweak while I’m still working on the functionality. It takes some practice, but at some point the slowing effect of doing this becomes negligible. Unfortunately, sometimes I have to rewrite big chunks of code and I’m unable to do it. This process would be much more effective if I had a “batch testing” functionality in my local tester. Maybe one day I’ll add it. It would definitely be handy during the breaks.

Here’s the list of things I could do:

  1. Improve the speed of the BFS. Currently I’m doing a full BFS that has to visit every cell, but I can add an early exit as soon as my search is deep enough. For finding resources, this happens when I can’t obtain any better value / distance ratio; for returning a resource, it’s just the first base I encounter. The other optimization I can do is to reduce the arrays that describe the map from 2D to 1D. Based on my past experience this can give a 20-30% speed improvement. It’s also always easy to evaluate whether improving speed is worth it – just increase your local time limit and see how it affects the scores.
  2. Experiment with my greedy function for worker management. It’s probably very little work and potentially can give me some very much needed points.
  3. Experiment with base placement – don’t always build the base right away, and check if maybe the distance to the closest base should be more important (have a bigger weight) than the distance to the edge of the map.
  4. If I’m able to evaluate all of the workers/bases combinations, then I have a lot of time left. The easiest way to use it is to repeat the simulations with the same parameters, but add a small randomization to the process. And as I explained earlier, even the slightest change can produce a completely different solution.
  5. Remove the workers/bases combinations that definitely give worse results. It’s quite intuitive that for a fixed number of bases, the optimal result as a function of the number of workers should be concave, or very close to it. The opposite (fixing the number of workers and using the number of bases as the function argument) should also be true.
  6. Don’t always prioritize building workers over building bases. There may be situations where it makes sense to build the first base before all of the workers are done. Also, this can add some additional randomness to the solution.

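For item (1) in the list above, a sketch of a BFS over a 1D-flattened map with the early exit for the resource search could look like this. MAX_VALUE and all names are assumptions; the real bound would come from the problem statement.

```python
from collections import deque

def best_resource(start, passable, values, width, height):
    """BFS over a map flattened to a 1D array (index = y * width + x).
    Stops early once even a maximum-value resource at the current
    depth could no longer beat the best value/distance ratio found."""
    MAX_VALUE = 1000  # assumed upper bound on a single resource's value
    dist = [-1] * (width * height)
    dist[start] = 0
    queue = deque([start])
    best_cell, best_ratio = -1, 0.0
    while queue:
        cell = queue.popleft()
        d = dist[cell]
        # Early exit: BFS visits cells in non-decreasing distance order,
        # so no remaining cell can improve on best_ratio.
        if best_cell != -1 and MAX_VALUE / max(d, 1) <= best_ratio:
            break
        if d > 0 and values[cell] / d > best_ratio:
            best_cell, best_ratio = cell, values[cell] / d
        x, y = cell % width, cell // width
        for nx, ny in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)):
            if 0 <= nx < width and 0 <= ny < height:
                ncell = ny * width + nx
                if passable[ncell] and dist[ncell] < 0:
                    dist[ncell] = d + 1
                    queue.append(ncell)
    return best_cell
```

The 1D flattening avoids nested-array indexing in the hot loop, which is where the claimed 20-30% speedup would come from in a lower-level language.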
If this were a normal match (i.e. something that lasts at least a week), I’d have this written down on paper and titled “TODO”. I’d also prioritize all of the tasks. The most basic way is to order them by expected score improvement divided by the time it takes to implement them. But this is a huge oversimplification, because each task has many more characteristics: expected information gain (how it helps me with creating new ideas), required focus (what state of concentration I have to be in to implement it correctly), variance of the expected improvement (how confident I am about the idea), and the chance of it becoming outdated (the probability that some other idea will replace it and the time spent on it will go to waste). Believe it or not, I really do take all of those factors into account when choosing what to do next. And as soon as I figure out that spending another minute on prioritizing is worth less than a minute of coding, I stop doing it.

The preceding paragraph is slightly off-topic, but it’s here to give you the bigger picture. In the case of the finals, I try to keep such a list in my head, but it’s easy to forget something. The weights I give to each characteristic are also drastically different. I stay away from things that require focus, because they drain a lot of mental stamina. I also go for tasks that have a short implementation time, because I’m trying to stay motivated, and the easiest and most efficient way is to have small, consistent improvements. And finally, it seems that, at least for me, the correct strategy for the finals is to minimize the variance as much as I can. This, plus being lucky :)

Tweaking Time

As mentioned earlier, I’m jumping straight into my chaotic tweaking mode. The first thing to do was to make full use of the available execution time for each test case. I needed a way to run simulations with the same number of workers and bases, but in a way that provided me with different (and hopefully not very similar) results. I added an additional parameter to simulate() that specifies the probability of building a base in each turn, assuming I already have enough money for it and should still build more bases. So the original implementation just had this probability set to 1.0. If there’s still time available after running all of the workers/bases combinations, I run everything again, but this time with that probability set to 0.05. This is #17 (#16 had a stupid bug) and it gives a ~1% improvement. At this point each 1% is actually a decent-sized step forward, since I guessed that a 2% lead in the provisional tests would be close to a guaranteed victory. My #18 adds a 3rd run with a probability of 0.02.
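Put together, this restart schedule could look like the sketch below. The probability list follows the text; the simulate() signature, the 9-second budget, and everything else are my assumptions.

```python
import time

def run_all(simulate, max_workers, max_bases, time_limit=9.0):
    """One deterministic pass (build probability 1.0), then extra
    randomized passes at 0.05 and 0.02 while time remains.
    simulate(workers, bases, build_prob) is an assumed signature
    returning a (score, moves) pair."""
    start = time.time()
    best_score, best_moves = float("-inf"), None
    for build_prob in (1.0, 0.05, 0.02):
        for workers in range(1, max_workers + 1):
            for bases in range(max_bases + 1):
                if time.time() - start > time_limit:
                    return best_moves  # early exit near the time limit
                score, moves = simulate(workers, bases, build_prob)
                if score > best_score:
                    best_score, best_moves = score, moves
    return best_moves
```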

While running the visualizer I noticed that the score my solution prints out quite often doesn’t match the score provided by the official visualizer, and I was pretty sure that a few hours earlier it had worked fine. So, another bug, and I’m putting away all of the tweaking for a while. Since my solution never gave any errors, it was clear that I never overspent any resources. It took me around 15 minutes to find a very nasty error. Due to how the problem is constructed, when workers return to a base with harvested resources, you get the money at the end of the turn. My implementation used a single variable to accumulate how much money I earned in one turn. But instead of writing +=, I wrote =. So when more than one worker got to a base on the same turn, I only added the money from the last one. That’s the reason why some languages don’t have such operators :) This bug was actually pretty devastating. Not only was I unable to correctly select the best solution, but I also didn’t build the workers and bases as fast as I could. As a bonus, while searching for the bug, I was running local tests with a different magic number in my greedy worker-management function, and I was able to find an improvement here too. Fixing the bug and updating the constant gave me a nice 4% improvement (submission #19), and at this point I had a very comfortable lead.
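The bug in miniature – the same per-turn accumulation written with = and with +=:

```python
# Toy reproduction of the described bug. Several workers deliver
# resources on the same turn; income must be accumulated, not assigned.
deliveries = [3, 5, 2]  # money from each worker arriving this turn

# Buggy version: '=' keeps only the last worker's delivery.
money_this_turn = 0
for amount in deliveries:
    money_this_turn = amount
assert money_this_turn == 2   # only the last delivery survives

# Fixed version: '+=' accumulates every delivery.
money_this_turn = 0
for amount in deliveries:
    money_this_turn += amount
assert money_this_turn == 10  # all three deliveries counted
```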

Over the next hour I got back to my tweaking plan and tested a lot of small ideas. I added some more runs with predefined values of the probability for building bases (the reason those probabilities were predefined/hardcoded is that I believed their distribution should be non-linear, and since I need to generate them only once, I might as well type them by hand). I also added a conditional statement to discard strategies with way too many workers. And I wrote a specialized BFS for generating the path to a resource; this way I could reduce the number of if statements in my main BFS function.

Below you can see how my submission #26 performs. It was submitted at 16:37 and it’s an improvement of 7-8% over my submission #15 (the previous clip). It’s hard to notice any major differences; it just seems to be better and do fewer stupid things. And this is the result of all the small improvements I made, which revolve around making more simulations. So it may act smarter, but realistically it’s just faster.

Next in line was tweaking the algorithm that selects the placement for new bases. So far I’m trying to maximize the minimum Manhattan distance to all of the edges of the map and to all of the existing bases. It seems quite decent, but I really want to check whether it’s indeed best to give the same weight to the edge distance and the base distance. After a few runs with various constants, I settled on an edge weight of 1.4. This, and finally using the closest worker to build bases, gives another 2-3%.
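One plausible reading of that 1.4 edge weight is to scale the edge-distance term before taking the minimum. The names are hypothetical, and whether the weight multiplies the edge term or the base term is my assumption.

```python
def place_base_weighted(size, bases, accessible, edge_weight=1.4):
    """Weighted variant of the placement greedy: the distance to the
    nearest map edge is scaled by edge_weight before taking the
    minimum with the Manhattan distances to existing bases."""
    def score(cell):
        x, y = cell
        edge = edge_weight * min(x, y, size - 1 - x, size - 1 - y)
        to_bases = [abs(x - bx) + abs(y - by) for bx, by in bases]
        return min([edge] + to_bases)
    return max(accessible, key=score)
```

With edge_weight > 1, the edge term is penalized less, so the search effectively cares a bit more about spreading bases apart than about hugging the map center.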

I also experimented a bit with the time limit (so far I had used only 9 of the 10 available seconds), and with using different orders in my strategy (i.e. sometimes trying to build the next base before finishing all of the workers). I always keep a very close eye on the scores. While making various submits, I noticed that often when I make a change that affects the running time (changing the time limit, improving speed, etc.), my new provisional score doesn’t match my local scores. I assumed this was due to some timeouts on the server, since, apart from those situations, the scores were always very stable for me. After correcting this (which essentially meant reducing the time limit and adding a few more checks for timing out), my provisional score jumped another few percent on the leaderboard.

Continue to Part 3

6 thoughts on “TCO 13: Marathon Finals, part 2”

  1. Ivan Kazmenko

    The simulation videos suggest that an extra weight added to the resources very close to base may help in the greedy function choosing the next resource, since their presence usually lengthens the way to quite a few other resources. Did you experiment with something like that?

    1. Psyho

      If you add more weight to closer resources, you will also harvest all of those resource blocks that were not in the way and had very low value. So there’s a big trade-off, and a simple weight adjustment based on distance will never make this efficient. In part 3 I’ll describe an idea that handles this problem much better.

      1. Ivan Kazmenko

        I’ve just tested my claim out of curiosity. Taking your submission #15 from the contest and hard-coding a 5x multiplication of the weights at distance 1 and a 2x multiplication of the weights at distance 2, I got 1.4% improvement in the practice room which I believe is greater than a statistical error. Perhaps one can do better with the idea.

        On the other hand, if you were planning to replace the greedy approach altogether, it would probably be counter-productive anyway to waste the time trying to apply small improvements to it.

        1. Psyho

          I feel a bit dumb now :)

          In short, you’re right – I even toyed around with this idea and managed to get a total of 4% improvement in my final solution using those tricks. And 4% is insane difference – it’s more than the difference between the final 1st and 4th place.

          Although my earlier comment still stands – simple weight fixing won’t completely solve this, but frankly neither will my “advanced” algorithm, since it’s used only while we still have just 1 base.

          So it seems I missed something very simple after all. Props to you for noticing it. Funnily enough, if I had this change and I’d mention it in my blog notes, I’d probably call it “obvious”, as it’s completely obvious once you see it :)

      2. gassafm

        Don’t feel dumb, you’ve done great nevertheless :) !

        The stronger version of this feeling, like “how could I not notice this, it’s obvious!”, is what I get virtually every time after taking part in marathon-like competitions.

        1. Psyho

          In some way, the whole point of this write-up is to show people how I approach problems in order to reduce the probability of missing some “obvious” yet crucial observations – and how to later use them to build more advanced solutions.

          Well, at least in theory. And you gave a great example that it’s just impossible to always notice every important detail :)
