Patrick Craig > Other Stuff > ICFP 2001
ICFP 2001
I submitted a lightning entry to the
ICFP 2001 contest. I am fairly confident of my
entry being first. First to be eliminated that is! As long as you have at least one
redundant unformatted whitespace character at the end of your SML file, don't
use plain or underline tags and don't have too many color, number or emphasis tags
occurring together, my program works fine. It's also very fast and never gives a result
bigger than the input. So if you just reverse the judgement criteria:
1. Speed of optimisation.
2. Output size.
3. Stupidity.
4. Correctness.
My program will do quite well, but then a one line program (int main() {return 0;})
will do even better! It could all have been so different...
Timetable (Japanese time)
Friday, July 27 0:00
Task description published. Read it at home and thought there must be some mistake
as it sounded far too easy. Went to bed and thought about how I would solve the problem
in a few hours.
Friday, July 27 10:00
Reread the task description and realised it was a bit more complicated than I first thought.
Friday, July 27 12:00
Made a start on a parser in my lunch hour.
Friday, July 27 17:00
Work over, started coding up the simplification algorithm. The idea was to record the
current format associated with each text character during the initial pass.
You then recreate the document using the change in format between characters.
Had to give up at 22:00pm. Submitted what I had written by then.
source (1,600 lines of code)
README
Friday night
While lying in bed, realised there was a major problem with the algorithm when
dealing with plain tags.
Saturday
Spent the day with the family and came up with an idea to solve the plain problem.
Sunday
Managed to pursaude my wife to let me work on my entry (ooh err missus). Realised there were several
problems with the code apart from dealing with plain tags. Two simple errors in the
code meant that the final character from the input and any closing tags associated
with this character were missing in the output and the underline tags were not
handled properly. There were two problems with the EM tags, first they were treated
as on/off tags and not switch tags. The second was my assumption about when they
were redundant was wrong, i.e. all the EM tags in this sequence are redundant:
bar <EM> <EM> foo </EM> </EM>.
but only the first two are in this sequence.
bar <EM> <EM> foo </EM>x</EM>.
Fixed the errors in my code relating to the missing last character, underline tag
and EM tags.
Next realised that unlike other tags, the ordering of color and number tags that
occur together is important, e.g.
<0><1>one</1></0>
is obviously not the same as
<1><0>one</0></1>
Fixed this by having buffers to record the ordering of color and number tags.
At the same time added color and number stacks to record the current color
and number to remove redundant color and number tags.
Started working on the plain fix. The idea is that you have a plain level that
is incremented every time you encounter an opening plain tag and decremented
every time you encounter a closing plain tag. In the first pass, you ignore
(i.e. just treat as text) all S, EM, I, B, U and TT tags that occur in parts
of the text where the plain level is not equal to zero. In the second pass
you you ignore all these tags where the plain count is not one etc. If you
record the positions of the plain tags along with the plain level,
in effect you are doing less than two passes through the text.
Wrote a random SML file generator and started testing my program properly.
Found and fixed several bugs but couldn't fix them all by the deadline
(still needed to impement the color nesting and plain shortcut simplifications
too).
After reading other entrants pages I think if I had got my entry working it
would be one of the fastest implementations (I doubt it would ever require
more than three minutes for a 5 Mb file) and I think it would have done quite
well on the output file size (especially on large file sizes where other
entrants were time constrained).
Unlike a few years ago I now have a life (of sorts). If I could have spent the
three days working on a fast machine maybe I could have completed my entry.
(Yeah right - that's what they all say).
Maybe next year...