07 September 2018

Reflections on DEFCON CTF 2018

Played DEFCON CTF for my alma mater's CTF team from the 10th to 12th of August.  Having returned home and mostly-fixed my sleep cycle, reflections.  For another perspective on the same events, see Down to the Wire.

Why / how:

We had some summer interns this year from the college CTF team, and they were talking about DEFCON a lot during lunches.  I caught the bug and decided to see if I could get a spot.  This was a source of some consternation for team leadership, since more people is more expensive (food and badges), and I hadn't actually played any CTF since...  Hack.Lu of 2014, probably, and certainly not DEFCON quals of 2018, which is usually the heuristic for finals attendance (in my defense, I was in a plane over the Pacific for much of quals).  I had also never played in DEFCON finals before.  The interns thought I'd be worthwhile to have on the team anyway, and went to bat for me.  I sweetened the deal by getting my employer to cover my lodging.  Still, I was left with the feeling that I was on thin ice, and consequently felt a strong desire to earn my keep.

Preparation:

The decision that I was going to finals occurred in early June, leaving me about two months to get back into CTF-shape.  I began playing through pwnable.kr after work, practicing with pwntools, and worked on reverse-engineering problems during Google CTF on June 22-23.  During Google CTF, I noticed a number of deficiencies in my play.  Lack of a decompiler put me at a big speed disadvantage relative to other reverse-engineers; fortunately one of the challenges was self-modifying and not very amenable to decompilation, so I contributed there as well as on a cross-architecture reversing challenge.  I also noticed that I felt acutely stressed / under pressure during the event, and this caused me to make mistakes which cost me time.  My sleep discipline leading up to Google CTF had been poor (about 6 hours a night for the preceding three weeks), and I had difficulty maintaining intensity of effort across a 14-hour day of reversing.  Generally I was happy with the state of my knowledge and ability to assimilate new documentation, but my execution needed improvement.  Unfortunately, most of these issues would recur during DEFCON finals in some form or other.

One place where my knowledge was deficient compared to other reversers was in IDA Pro's more esoteric hotkeys.  I set up Mnemosyne flashcards for some of the ones that I didn't know by reflex but saw others use during Google CTF, and they proved useful - I was able to recall them quickly and accurately during finals under stress, despite never having used some of them in practice.  I could see doing the same for opcodes, syscall numbers, and other structureless trivia.

A coworker who had played in finals before mentioned that I'd probably be useful for binary patching.  It's a fairly uncommon skillset that isn't exercised by most of the jeopardy-style CTFs that the team plays.  There's only one dedicated veteran patcher on the team, and he has to sleep occasionally.  I resolved to do patching during finals, and started playing pwnable.kr as "patchable.kr" - first pwn the challenge, and then generate a patched version that defeats your exploit.  This was helpful, and I soon had a list of attack-types and corresponding patches that I should be able to generate.

I lost about the first two weeks of July to videogames (nights and weekends / practice time).

When I returned to patching prep, I worked on tooling to make some common types of patches easy to apply.  I also took several approaches to training for speed under stress; the simplest was timing my pwnable/patchable.kr plays.  Per Randall Collins' work on meditation allowing people to overcome their emotional barriers to competent violence, I extrapolated that meditation might be useful for overcoming stress and aversions during CTF.  Unfortunately, I was undisciplined in my approach to meditation, and it is hard to tell if it bore any fruit.  If I were to do it again, I would screen-record myself playing pwnable/patchable, and use that to better reflect on and improve my execution.  The knowledge that you are being recorded also adds a layer of stress which I think would be useful.

Around 19 July, I began work on some patching tools to create huge amounts of space inside ELF binaries, which could then be used for very extensive patches, potentially up to relocating whole functions or pulling in whole functions from compiled C (in recent years, the DEFCON CTF organizers have denied contestants the ability to use LD_PRELOAD defenses, but this approach would allow us to put equivalent defenses inside the binary itself).  This project of creating space occupied most of my prep effort for the remaining three weeks leading up to DEFCON.  I also assisted one of the interns in her preparations, building a repository of docker images for testing patches and exploits.

My sleep in July was mostly good (7+ hours a night), but interrupted in several cases by taking visiting coworkers out on the town.  It began deteriorating towards six hours a night in August, as I started staying up late to put more time into preparing tooling.  This was a mistake.

I arrived in Las Vegas for finals on the evening of the 8th of August.  The hard-drive of my laptop began acting up on the flight, but recovered on arrival - I suspect it was just intolerant of the vibrations of the engine.  Still, perhaps it is time to replace my spinning-rust with an SSD.  The time-shift did my sleep cycle no favors.  I spent the 9th socializing and adding a few last-minute features to the tool for making space for patches.  Our main patcher thought it was a splendid tool; something to the effect of "I've always wanted this but didn't want to write it".  He began building tooling on top of it.

Finals:

I woke up very early on the Friday of finals, as my biological clock was still set to EST.  This year finals were run by the Order of the Overflow, an organization including parts of Shellphish, some academics at Arizona State University, and a smattering of others.  We weren't really sure what to expect; new organizers always change things up.  Finals began slightly delayed (not surprising), and opened with a new type of challenge, King of the Hill, where teams competed to maximize their score on a game of assembling and disassembling (instead of a more traditional attack-defense challenge, where teams try to exploit services that other teams are running).  I ran into difficulty with my networking setup: I was unable to connect to our team's VPN because my version of OpenSSL was too new, so I sat out the reversing King of the Hill while I compiled libraries and eventually ended up building a virtual machine for VPN access.  In addition to introducing King of the Hill challenges, the rules also mentioned that patches would not be made available to other teams (unlike the previous two DEFCON CTFs), but there would be limits on the number of bytes a patch could change in a service.  This rendered all of our patch-tooling preparations useless immediately.  While this was obviously disheartening at the time, on further reflection I think it was a very reasonable decision - since the Cyber Grand Challenge, some teams have had very strong tooling for patches, and compared to them our stuff was probably pretty weak.  So this change ultimately worked in our favor, I think.  Also, it meant that I got to spend my weekend doing old-school manual patches instead of writing and speed-debugging software (like my day-job), and I had a lot of fun.  The manual patching practice I had done on pwnable.kr definitely paid off.

Other changes to the rules included a big delay on the release of network traffic to teams, which meant that pulling exploits off the wire was no longer a viable strategy.  This would've made testing our patches more difficult too, but the organizers provided automatic testing of patches on upload, and rejected patches which failed functionality, instead of deploying them and then docking your score.  This allowed us to patch pretty aggressively and then roll back to more conservative patches if necessary without much of a penalty, which was great.
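
To make the byte-limit rule concrete, here is a minimal sketch of a pre-upload check that counts how many bytes a patched binary changes relative to the original.  The function names and the way a length difference is counted are assumptions for illustration; the post does not say how the organizers actually measured a patch.

```python
def count_changed_bytes(original: bytes, patched: bytes) -> int:
    """Count byte positions at which the two images differ.

    Assumption: any bytes added or removed at the end also count as changed.
    """
    differing = sum(a != b for a, b in zip(original, patched))
    return differing + abs(len(original) - len(patched))


def within_budget(original: bytes, patched: bytes, limit: int) -> bool:
    """Return True if the patch changes no more than `limit` bytes."""
    return count_changed_bytes(original, patched) <= limit


if __name__ == "__main__":
    before = bytes([0x39, 0xD8, 0x7C, 0x05])  # stand-in for a service binary
    after = bytes([0x39, 0xD8, 0x72, 0x05])   # same bytes with one opcode changed
    print(count_changed_bytes(before, after), within_budget(before, after, 1))
```

A check like this is cheap to run locally before spending an upload on a patch that would be rejected for size.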

The first attack-defense service, twoplustwo, came out around lunchtime on Friday.  Our veteran patcher was playing the King of the Hill, so one of the interns and I fielded patches as bugs were discovered.  We didn't first-blood the challenge, but discovered the bug shortly after and had working patches before we had working exploits, so I felt good about that.

The second attack-defense challenge, pointless, opened in the afternoon.  It was a MIPS binary that did a bunch of crypto.  Our most-senior exploiters went to town on it and had an exploit for it about two hours before the competition closed for the night, but were unable to throw it due to organizer error for about half an hour.  This caused much frustration.  Once the competition closed for the night around 8PM, they came up and briefed us on the bugs.  Our main patcher handled most of the patching for it, as I was pretty bushed.  Another King of the Hill challenge, on polyglot shellcoding, was also released just before the end of the day.  I slept from about 0200 PST to 0730 PST; not bad for a CTF.

Saturday morning, we showed up bright and early with a bunch of new patches and exploits to deploy, only to find the game start shifted back half-hour by half-hour for two hours or so, with a corresponding one-hour shift of the end of the game.  So we sat there on high alert for a couple of hours, because we knew we would probably have to revert and redeploy patches ASAP once things launched.  On the upside this gave me some time to debug and fix a battery problem with my laptop (another TODO).  When things did start, our patches for pointless failed because we had to leave a directory traversal in for intended functionality.  We spent a while going back and forth on how to patch pointless properly; I think we got a patch we were happy with in the early evening.

The twoplustwo service was retired (maybe a little early IMO, as there were still teams unpatched), and another service named poool was released.  This was a Monero pool mining server (which I was really hoping would require Andersen-style optimization), and required pretty extensive reverse-engineering.  Here again I was at a disadvantage due to lack of familiarity/practice with decompilation.  Unlike previous services, the problem description for poool did not specify a limit on the number of bytes we could patch, but did mention that we could only submit five patched versions.  This made testing our patches our responsibility again, and a coworker of mine who was playing took it upon himself to write a testing framework for poool.  That evening, DEFKOR started exploiting poool, while we were nowhere close to having an exploit or patch ready.  We figured that the bug was related to multiple-counting of submitted shares, and our main patcher was asleep, so I hammered out a patch for that using make_space and not worrying about the number of bytes I changed.  It took me a smidge over an hour, during which I was sweating and shaking / experiencing physiological stress, and during which I made some stupid assembler mistakes.  At the end it passed our tests beautifully, so we submitted it about an hour and a half after exploits had started hitting us.  It promptly failed the SLA check, because there was actually a limit in bytes, and the organizers had changed the description of the problem without alerting anyone.  This also cost us one of our five shots at patching the binary.  We tried to get it working without make_space but it was segfaulting and there was very little time left in the game for the day.  So that whole evening was a bit rough, but also somewhat satisfying, because I didn't choke under the stress - I did a totally reasonable thing given the information I had available to me, it worked, and it turned out the information I had was wrong.

That night, we discovered that there was a much easier bug in poool, which was probably the one that DEFKOR had been exploiting.  It was a one-byte patch.  Our main patcher also took over and shrank the multiple-counting patch better than I had, and we added the new variants of these bugs to the test suite.  A new service, vchat, had been released just before closing time, and most of our exploiters were working on that.  I think there was also a web challenge that came out that evening?  I didn't look at it.  I went to bed around 0200 PST and got up on Sunday around 0700 PST.

Sunday morning, the limit on the number of times we could patch poool had been removed, which made all the work on the testing framework sort of wasted.  We submitted our well-tested patches for poool and they passed SLA from the get-go.  Had some problems with the web patch.  Our main patcher was wiped out from being up all night and he delegated the one bug that had been found overnight in vchat to me.  It was actually a very nice patch and I was proud of it, even though the bug turned out to not be exploitable (nobody ended up exploiting vchat at all).  Another service, reeducation, a Rust attack-defense binary, was released that morning.  I wasn't involved in patching it; I think one of the interns handled it.

Sunday around lunchtime, a new King of the Hill was released, this time focused on patching: the goal was to minimize the number of bytes you changed to modify the functionality of a binary that hashed itself, while still preserving the hash.  Honestly my mind was kind of blown by the whole thing, and I helped out with debugging our patch scripts for it but didn't make any tremendous structural contributions.

Closing:

Overall, the types of bugs that were being exploited were different from the ones that pwnable.kr had prepared me to patch: several signedness bugs, heap leaks, bytecode execution, and one-byte writes off the end of a heap object, rather than format strings and buffer overflows (except the MIPS challenge, which had a straightforward stack buffer overflow once you got through all the crypto).  Not really surprising, just not quite what I practiced for.

Though my preparations were for naught and I could've certainly done better, I felt that I was useful and I had fun.  I think choosing to operate as a dedicated patcher was a mistake in terms of team labor-allocation strategy; while one of our captains said it was "our best year ever for patching, though not by much", ultimately we needed more exploits, which would require more eyes on reversing.  If I want to be more useful next year, bug discovery is probably a good thing for me to work on (though there were definitely subtleties of patching that could stand improvement too).  I think the reverse was also true - given that the stakes for an incorrect patch were much lower than in previous years (due to automatic SLA), there was little reason that our reversers shouldn't also be doing patching, especially given their greater familiarity with the services.  In a number of cases, they handed off one-byte patches to us ("just change this signed compare to an unsigned compare"), which felt a bit silly.  I have little doubt that they have the necessary background; there's just sort of a mental block around patching, I think (I know I once had such a block).  Likewise, our network defense folks really didn't shift to something more useful when it became apparent that pcaps were not forthcoming.  Our labor allocation was too rigid; we didn't adapt enough to the changed environment, and our command hierarchy / senior players didn't push us to.
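
As an illustration of the "change this signed compare to an unsigned compare" kind of patch, here is a minimal sketch for x86, where the signedness lives in the conditional jump that follows the compare.  It assumes a short-form Jcc opcode at a known file offset; the function name and the sample bytes are made up for the example, and the two-byte near forms (0F 8x) are not handled.

```python
# Short-form x86 conditional jumps: signed condition -> unsigned counterpart.
SIGNED_TO_UNSIGNED = {
    0x7C: 0x72,  # jl  -> jb
    0x7D: 0x73,  # jge -> jae
    0x7E: 0x76,  # jle -> jbe
    0x7F: 0x77,  # jg  -> ja
}


def make_jump_unsigned(image: bytes, offset: int) -> bytes:
    """Return a copy of `image` with the signed Jcc at `offset` made unsigned."""
    opcode = image[offset]
    if opcode not in SIGNED_TO_UNSIGNED:
        raise ValueError(f"byte {opcode:#04x} at offset {offset} is not a signed short Jcc")
    patched = bytearray(image)
    patched[offset] = SIGNED_TO_UNSIGNED[opcode]  # exactly one byte changes
    return bytes(patched)


if __name__ == "__main__":
    # cmp eax, ebx ; jl +5  (hypothetical bounds check with a signedness bug)
    code = bytes([0x39, 0xD8, 0x7C, 0x05])
    print(make_jump_unsigned(code, 2).hex())
```

The jump displacement is left untouched, so the control flow is identical apart from how the comparison result is interpreted.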
