This week I learned 21
This week I learned about state machines in AWK.
The AWK book gives an example of parsing multiline records with a simple state machine.
Consider a simple example, again an address list, but this time each record begins with a header that indicates some characteristic, such as occupation, of the person whose name follows, and each record (except possibly the last) is terminated by a trailer consisting of a blank line:
    accountant
    Adam Smith
    1234 Wall St., Apt. 5C
    New York, NY 10021

    doctor - ophthalmologist
    Dr. Will Seymour
    798 Maple Blvd.
    Berkeley Heights, NJ 07922

    lawyer
    David W. Copperfield
    221 Dickens Lane
    Monterey, CA 93940

    doctor - pediatrician
    Dr. Susan Mark
    600 Mountain Avenue
    Murray Hill, NJ 07974
To print the doctor records without headers, we can use
    /^doctor/ { p = 1; next }
    p == 1
    /^$/      { p = 0; next }
Pretty neat.
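To see why this works, it helps to read the variable p as the state of a two-state machine: "printing" (p is 1) or "not printing" (p is 0 or unset). Here is a minimal runnable sketch of the same three rules with a comment on each one. The wrapper name print_doctors and the shortened sample input are made up for illustration.

```sh
# print_doctors is a hypothetical wrapper name; the AWK rules are the ones quoted above.
print_doctors() {
    awk '
        /^doctor/ { p = 1; next }   # header line: switch to the printing state, skip the header
        p == 1                      # printing state: no action given, so the line is printed
        /^$/      { p = 0; next }   # blank trailer: switch back to the not-printing state
    ' "$@"
}

# Shortened sample input: two doctor records with a lawyer record in between.
printf '%s\n' \
    'doctor - ophthalmologist' 'Dr. Will Seymour' '' \
    'lawyer' 'David W. Copperfield' '' \
    'doctor - pediatrician' 'Dr. Susan Mark' |
    print_doctors
# Prints:
# Dr. Will Seymour
#
# Dr. Susan Mark
```

Rule order matters here. Because the p == 1 rule comes before the blank-line rule, the trailer of a doctor record is printed before the state is reset, which keeps a blank line between the records in the output.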
I wrote a small program to convert an HTML file into plain text in just a few lines of AWK using this technique.
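A minimal sketch of how the same flag technique could apply to HTML follows. It is an illustration only, not the actual program mentioned above. It assumes lowercase tags, tags that never span lines, and script and style tags that begin their own lines. The names html_to_text and skip are made up for this example, and character entities such as &amp; are left undecoded.

```sh
# html_to_text is a hypothetical name; this is a sketch under the assumptions stated above.
html_to_text() {
    awk '
        /<(script|style)/    { skip = 1 }         # opening tag: enter the skip state
        /<\/(script|style)>/ { skip = 0; next }   # closing tag: leave the skip state, drop the line
        skip                 { next }             # skip state: drop script and style content
        {
            gsub(/<[^>]*>/, "")                   # strip tags that open and close on this line
            gsub(/^[ \t]+|[ \t]+$/, "")           # trim leading and trailing whitespace
            if ($0 != "") print                   # print only lines that still have text
        }
    ' "$@"
}

printf '%s\n' \
    '<html><head><title>Hi</title>' \
    '<style>' \
    'p { color: red }' \
    '</style></head>' \
    '<body><p>Hello <b>world</b></p>' \
    '<script>alert(1)</script>' \
    '</body></html>' |
    html_to_text
# Prints:
# Hi
# Hello world
```

The skip flag plays the same role as p in the doctor example, only inverted: it marks the lines to drop rather than the lines to keep.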