Binary Pattern Matching in Elixir
I've been working on a web client for my side project Grapevine, and part of that includes parsing a telnet stream. Before you ask, "A telnet stream!?", the web client connects to text-based games called MUDs, which primarily use telnet because that's how they started in the late 1980s.
Telnet is mostly a TCP stream that contains only text to be output to the screen. Hidden inside the otherwise plain TCP stream are byte sequences that start with IAC (Interpret as Command). IAC is the byte 255, and it is followed by at least one other byte that describes what the command is.
An example stream might be:
"Welcome! What is your name? " <> <<255, 249>>
The trailing <<255, 249>> is the Go Ahead (GA) command. It signals to the remote client that the server is done sending and that the client should display the text it received. In order to act on this, the Grapevine web client needs to be able to parse this binary stream to find the commands.
Luckily Elixir has a wonderful thing called binary pattern matching that saves the day.
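To make the byte layout concrete, here is a minimal sketch of matching a command at the front of a binary. The module name IacExample and the function starts_with_ga?/1 are hypothetical and exist only for illustration; they are not part of Grapevine.

```elixir
# Hypothetical module for illustration only.
defmodule IacExample do
  @iac 255  # Interpret as Command
  @ga 249   # Go Ahead

  # Builds the example stream: plain text followed by the two command bytes.
  def stream, do: "Welcome! What is your name? " <> <<@iac, @ga>>

  # Matches only when the binary starts with the two-byte Go Ahead command.
  def starts_with_ga?(<<@iac, @ga, _rest::binary>>), do: true
  def starts_with_ga?(_other), do: false
end

# The stream is ordinary text, so its first bytes are not a command.
IO.inspect(IacExample.starts_with_ga?(IacExample.stream()))

# A binary that begins with 255, 249 does match.
IO.inspect(IacExample.starts_with_ga?(<<255, 249, "more text">>))
```

The first call prints false and the second prints true: the pattern only looks at the head of the binary, which is why the parser has to walk the stream to find commands in the middle or at the end.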
Pattern Matching on Binaries
In order to find these telnet commands, I have a function that takes the full binary and pops off one byte at a time until it finds a known byte sequence, such as the Go Ahead.
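@iac 255  # Interpret as Command
@ga 249   # Go Ahead

def options(binary, current \\ <<>>, processed \\ [])

# Out of data: return the processed chunks plus the working binary, dropping empty chunks.
def options(<<>>, current, processed) do
  Enum.reject(processed ++ [current], &(&1 == <<>>))
end

# A known command at the head: flush the working binary, then record the command.
def options(<<@iac, @ga, data::binary>>, current, processed) do
  options(data, <<>>, processed ++ [current, <<@iac, @ga>>])
end

# Otherwise move one byte from the head onto the working binary.
def options(<<byte::size(8), data::binary>>, current, processed) do
  options(data, current <> <<byte>>, processed)
end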
With this you can now call options/3 with the above binary and get ["Welcome! What is your name? ", <<255, 249>>].
This works by treating the last options function clause as the default case: it takes the first byte of the binary, byte::size(8), and appends it to the current working binary. It then calls itself again with the rest of the binary, minus the first byte.
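If a known byte sequence is encountered at the beginning of the binary, that function clause is called instead, because Elixir tries clauses from top to bottom and it is defined before the default case. The current working binary and the known pattern are appended to the processed data, and the working binary is reset to empty.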
The function continues to recurse by calling itself with the unprocessed binary until it runs out of binary data. The first clause, which matches the empty binary, then matches and returns the processed data along with the final working binary.
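The recursion is easier to follow on a tiny input. Below is a minimal, self-contained sketch of the same clauses wrapped in a module so it can be run as a script. The module name OptionsSketch is hypothetical, and dropping empty chunks in the final clause is an assumption made here so that the result contains only real data.

```elixir
# Hypothetical module name, used only for this sketch.
defmodule OptionsSketch do
  @iac 255  # Interpret as Command
  @ga 249   # Go Ahead

  def options(binary, current \\ <<>>, processed \\ [])

  # Assumption: empty chunks are dropped so callers only see real data.
  def options(<<>>, current, processed) do
    Enum.reject(processed ++ [current], &(&1 == <<>>))
  end

  # Known command at the head of the binary.
  def options(<<@iac, @ga, data::binary>>, current, processed) do
    options(data, <<>>, processed ++ [current, <<@iac, @ga>>])
  end

  # Default case: move one byte onto the working binary.
  def options(<<byte::size(8), data::binary>>, current, processed) do
    options(data, current <> <<byte>>, processed)
  end
end

# Call-by-call trace for a tiny input:
#   options("hi" <> <<255, 249>>, "", [])
#   options("i" <> <<255, 249>>, "h", [])
#   options(<<255, 249>>, "hi", [])
#   options("", "", ["hi", <<255, 249>>])
#   returns ["hi", <<255, 249>>]
IO.inspect(OptionsSketch.options("hi" <> <<255, 249>>))
```

Each step either moves one byte from the unprocessed binary to the working binary or, when the command bytes are at the head, flushes the working binary and the command into the processed list.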
Transforming Binary into Actionable Data
With this list of broken-up binary data, I then loop over the list, transforming each item into something useful for the application.
def transform(<<@iac, @ga>>), do: {:ga}
def transform(string), do: {:string, string}
This results in the following list, continuing with the example: [{:string, "Welcome! What is your name? "}, {:ga}]. The application can then easily act on each of these pieces of data, processing as needed.
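As one illustration of acting on the transformed data, here is a minimal sketch that buffers text and emits it as a complete line whenever a Go Ahead arrives. The module name TransformSketch and the render/1 function are hypothetical; the real application may handle these tuples quite differently.

```elixir
# Hypothetical module for illustration only.
defmodule TransformSketch do
  @iac 255  # Interpret as Command
  @ga 249   # Go Ahead

  def transform(<<@iac, @ga>>), do: {:ga}
  def transform(string), do: {:string, string}

  # Hypothetical consumer: returns {completed_lines, leftover_buffer}.
  def render(chunks) do
    chunks
    |> Enum.map(&transform/1)
    |> Enum.reduce({[], ""}, fn
      # Text is appended to the buffer until the server says it is done.
      {:string, text}, {lines, buffer} -> {lines, buffer <> text}
      # Go Ahead: the buffered text is complete and can be displayed.
      {:ga}, {lines, buffer} -> {lines ++ [buffer], ""}
    end)
  end
end

IO.inspect(TransformSketch.render(["Welcome! What is your name? ", <<255, 249>>]))
```

Because every item is a tagged tuple, the consumer can pattern match on the tag instead of inspecting raw bytes again.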
Conclusion
Binary pattern matching is incredibly powerful and I suggest seeing what else is available in the Elixir Special Forms documentation.
The full code for the example above is also available on GitHub, both the options function and the transform function. The real code is much more complicated because some telnet options don't have a fixed length, and the real options function parses those out with more pattern matching!
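To give a feel for the variable-length case, here is a minimal sketch of parsing a telnet subnegotiation, which starts with IAC SB (255, 250) and runs until IAC SE (255, 240). This is not the Grapevine implementation: the module name SubnegotiationSketch and its return values are hypothetical, and the sketch ignores the rule that a literal 255 inside the payload is escaped as two 255 bytes.

```elixir
# Hypothetical module for illustration only; not the Grapevine code.
defmodule SubnegotiationSketch do
  @iac 255  # Interpret as Command
  @sb 250   # Subnegotiation Begin
  @se 240   # Subnegotiation End

  # Entry point: only matches a binary that starts with IAC SB.
  def parse(<<@iac, @sb, data::binary>>), do: collect(data, <<>>)
  def parse(_other), do: :error

  # IAC SE closes the subnegotiation: return the payload and whatever follows.
  defp collect(<<@iac, @se, rest::binary>>, payload), do: {:ok, payload, rest}

  # Ran out of data before IAC SE: the remainder may arrive in a later packet.
  defp collect(<<>>, _payload), do: :incomplete

  # Any other byte belongs to the payload.
  defp collect(<<byte::size(8), data::binary>>, payload) do
    collect(data, payload <> <<byte>>)
  end
end

# 201 is an arbitrary option byte chosen for this example.
IO.inspect(SubnegotiationSketch.parse(<<255, 250, 201, "hello", 255, 240, "rest">>))
```

The shape is the same as the Go Ahead parser: a clause for the terminating sequence, a clause for running out of data, and a default clause that consumes one byte at a time.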