Scripting Language
In a K3PO script, you define the exact sequence of events that a network connection goes through in its lifetime. If this exact sequence of events does not happen, the script is considered to have “failed”. A K3PO script thus defines a “behavior”. Testing, then, comprises defining the expected behaviors, and then running the script against your code to see if those expectations are met.
TL;DR
Sample scripts can be found here
K3PO scripts can be authored using different protocols. At the time of this writing there are only two protocols, HTTP and TCP. TCP is the lowest level and therefore offers the script author the most precise control. When authoring scripts in HTTP, it is generally assumed that you are testing applications or protocols on top of a presumably stable HTTP stack. HTTP semantics are therefore assumed to be correct, and the HTTP K3PO keywords are provided mainly as a convenience to the script author. For example, when using HTTP you do not need to worry about details such as the order of headers. This guide starts by introducing the K3PO language using the TCP semantics. HTTP semantics are in general an extension of the TCP semantics and are described in [HTTP Scripting Language](Scripting with HTTP).
A K3PO script is considered a “session”. Within a script, you will define a set of expectations for one or more network connections. Each of these network connections (and their associated expectations) is considered a “channel”. A session might comprise a single short-lived channel, or several interacting channels.
Every channel has roughly the same lifecycle:
started --> connected --> …. --> close --> closed
Between the connected and closed states, K3PO is used to send and receive messages (sequences of bytes “on the wire”) from the network.
Before we start learning about the various operations supported by the K3PO language, and how to use the language to express our expectations about network behavior, let’s first look at an example K3PO script. I find that looking at a concrete illustration of a subject, before discussing the various details of the subject, helps to put that discussion in perspective. So:
# tcp.client.connect-then-close
connect tcp://localhost:9876
connected
close
closed
That is the entire script. In this script session, there is only a single channel. As you might guess from the connect word there, this channel is a client channel, i.e. we are acting as a client and connecting to the server at the address/port appearing in the URI.
The first line, starting with the # character, is a comment. The K3PO language supports single line comments, as you’ve seen, using the shell-style ‘#’ character:
# This is a comment
accept tcp://localhost:9876
This can be used for multi-line comments as well, e.g.:
# This is a multi-line comment.
# It spans multiple lines
# and can be used for long explanations.
accept tcp://localhost:9876
The next line, connect tcp://localhost:9876, starts the channel, describing what kind of channel it will be (client channels use connect, server channels use accept), and the URI to use. The URI in this example indicates that a plain TCP connection will be used, connecting to localhost, port 9876.
Next we see a line comprising just one word: connected. What does this mean, exactly? The K3PO language has two main kinds of keywords: states and actions. If you think of the channel as a state machine, a K3PO script describes the states that the channel moves through, and the actions which occur when transitioning to the next state. The K3PO state keywords are modeled using the past tense: a client connected, a channel was closed, a message was read, etc. Contrast this with K3PO actions, which are modeled using present tense verbs: you write a message, you close a channel. The connected line thus is the starting state of the channel, and we are ready to progress to the next state via some action.
In our example script, the action that we perform after reaching the connected state is: close. The scripted behavior in this example has us creating a TCP connection successfully, then immediately closing that connection. Not very useful (except for demonstrating some of the K3PO language).
The last line in the channel is the ending state for every connection: closed.
Thus concludes your first glimpse into the K3PO language.
Without further ado, then, we present the boring, dry reference material for the K3PO language. But don’t worry: after wading through (or skipping over entirely) this section, you can see more [examples of K3PO scripts](Sample K3PO Scripts).
accept <URI> [as <name>] [notify <barrier>]
The accept keyword indicates that the channel will be a server channel with the optional name <name>. The URI indicates the address and port on which the server will listen; the URI scheme will additionally signal any transport-level abstractions to apply to the channel IO. Optionally notifies the given barrier when the operation (bind) is complete.
connect <URI> [await <barrier>]
The connect keyword indicates that the channel will be a client channel. The URI indicates the address and port to which the client will connect; the URI scheme will additionally signal any transport-level abstractions to apply to the channel IO. Optionally delay connecting until the given barrier has been notified (see Barriers).
closed
The closed keyword indicates that the channel has reached the closed state; no further actions will occur on this channel. Note that the closed keyword is required for every channel.
accepted
The accepted keyword indicates that the server channel accepted a child channel. The child channel is not yet connected.
connected
The connected keyword indicates that the channel has reached the connected state; actions on this channel (such as reading/writing messages) can now commence.
close
The close keyword is used to close the current channel. Note that attempting to perform any other action after close, or to expect any state other than disconnected, unbound, or closed, is an error.
read <message>
The read keyword is used to read in the next bytes from the network as a message. Messages are covered in greater detail in the next section. Technically, read is not an action but an event.
write <message>
The write keyword is used to write out the bytes of a message to the network. Messages are covered in greater detail in the next section.
For convenience, both read and write can take multiple messages, which are processed in order:
write <message1> <message2> <message3>
read <message1> <message2> <message3>
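To complement the client script shown earlier, here is a minimal sketch of a server-side channel built only from the keywords described above. The port and the message strings are assumptions chosen for illustration.

```
# tcp.server.read-then-write (illustrative sketch)
# Listen on localhost:9876 (the port is an arbitrary choice)
accept tcp://localhost:9876
# A client connects, creating a child channel...
accepted
# ...which then reaches the connected state
connected
# Expect the client's greeting, then reply
read "Hello, World!"
write "Hello, programmer"
# Close the child channel and expect the closed state
close
closed
```

If the client sends anything other than the expected greeting, the read expectation is not met and the script should be reported as failed.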
Barriers can be used to control and verify correct sequencing of events and actions. You can notify when an event has occurred (data has been read or written, or a server channel has been bound), and delay actions (connect or write) until a barrier has been notified.
read await <barrier>
Require that the barrier named “<barrier>” have been notified before the next input event in the script occurs (normally read). If that event occurs before the barrier has been notified this will result in a script failure.
write await <barrier>
Wait for the barrier named “<barrier>” to be notified before processing subsequent output actions. Essentially, block output actions until the barrier has been notified.
connect await <barrier>
Wait for the barrier named “<barrier>” to be notified before executing a subsequent connect command. Delay connecting until the barrier has been notified.
read notify <barrier>
The read notify keyword is used to indicate that the named barrier has been reached; any channels currently at an await keyword for that named barrier will then be able to proceed. Semantically equivalent to write notify. For readability use read notify when notifying a barrier that an input event has occurred.
write notify <barrier>
The write notify keyword is used to indicate that the named barrier has been reached; any channels currently at an await keyword for that named barrier will then be able to proceed. Semantically equivalent to ‘read notify’. For readability use “write notify” when notifying a barrier that an output event has occurred.
accept <URI> notify <barrier>
Notify when the URI has been bound. Often used in conjunction with connect <URI> await <barrier>.
A barrier is a named point of execution in the K3PO script. Barriers are often used to coordinate interactions across multiple channels, as when working with a protocol which uses multiple connections (SIP or FTP come to mind). However, they can be useful in a single-connection test as well.
Let’s assume that you want to write a message, and then read a message back. You want to ensure that the outgoing message has been completely written out to the network before you read the incoming message; furthermore, if the incoming message arrives before the outgoing message was completely written out, it should be an error. How would you write the K3PO script to express this? As you might guess, the solution involves barriers:
# tcp.client.write-then-read
connect tcp://localhost:9876
connected
write "Hello, World!"
write notify WRITTEN
read await WRITTEN
read "Hello, programmer"
close
closed
This says to write the string “Hello, World!”, and when that string has been successfully (and completely) written out, to notify that the barrier named “WRITTEN” was reached. The next statement says to wait for the barrier named “WRITTEN” to be reached, then read in the string “Hello, programmer”.
Note that for convenience, when a read or series of reads is followed by a write, the write will not be done until all the preceding reads have completed. So it is not necessary to add barriers to a script like the following:
read "Request line 1"
read "Request line 2"
write "Response"
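For the multi-channel case mentioned earlier, a sketch of two client channels coordinated by a single barrier might look like the following. The ports, the barrier name READY, and the message contents are all assumptions made for illustration.

```
# Two client channels coordinated by one barrier (illustrative sketch)

# Channel 1: wait for the server to say it is ready
connect tcp://localhost:9876
connected
read "ready"
# Signal that the input event above has occurred
read notify READY
close
closed

# Channel 2: hold back output until channel 1 has seen "ready"
connect tcp://localhost:9877
connected
write await READY
write "go"
close
closed
```

Here the barrier expresses an ordering between the two connections that neither channel could state on its own.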
Messages are a central part of the K3PO language. The term “message” was deliberately chosen, as it can mean many things, depending on the protocol and the transport used. HTTP requests and responses, TLS records, WebSocket frames and messages, etc. are all examples of protocol/transport-specific messages. In the K3PO language, a message refers to any sequence of bytes that can be read in or written out to the network. Nice and vague, right?
There are currently three main flavors of K3PO messages: text strings, hex strings, and patterns. Text strings are sequences of characters enclosed within double quotation marks ("), just as you would see in C, Perl, Java, etc. Sending the text string “Hello, World!” would thus look like this in K3PO:
write "Hello, World!"
Reading the text string “Hello, World!” looks like this:
read "Hello, World!"
Depending on the protocol you are testing, and on the type of message you want to send for that protocol, you will find that text strings aren’t enough. What if you absolutely must specify the raw bytes to send, e.g. to send compiled code over the network? For cases like this, you would use a hex string. A hex string consists of the byte values that you want to send, each written as a 0x-prefixed hex value, one value per byte; the entire sequence of bytes is enclosed in square brackets ([]). So to send “Hello, World!” followed by a newline (the final 0xa byte) as a hex string, you would do this in K3PO:
write [0x48 0x65 0x6c 0x6c 0x6f 0x2c 0x20 0x57 0x6f 0x72 0x6c 0x64 0x21 0xa]
Reading the same bytes looks like this:
read [0x48 0x65 0x6c 0x6c 0x6f 0x2c 0x20 0x57 0x6f 0x72 0x6c 0x64 0x21 0xa]
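Because read and write accept several messages in one statement, text and hex strings can presumably be combined. As a sketch, and assuming the text is encoded as ASCII/UTF-8, the following should put the same 14 bytes on the wire as the hex string above, which ends with a newline byte (0x0a):

```
# Text for the printable part, a hex string for the trailing newline
write "Hello, World!" [0x0a]
```

This keeps the readable part of the message readable while still pinning down the exact terminator byte.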
Often we may not care what bytes are actually read. We just want to read N of them. So k3po provides the syntax:
read [0..N]
where N is a value that evaluates to an integer.
This may be a fixed constant (e.g. 1, 10, 100) or the result of an EL expression using previously captured variables (of numeric type byte, short, int or long) or a function, as explained later.
Sometimes you will want to read in a number of bytes and then use them later. You can capture a number of bytes into a variable via the following:
read ([0..N]:capture)
where "capture" is the name of the variable.
In order to truly test for protocol regressions, we want to test for more than just sending and receiving bytes -- we will want to test what those bytes actually look like, particularly for bytes that we have received. Using literal text (or hex) strings for comparisons against the received bytes works, but only up to a point. More powerful assertions over the content/structure of the received bytes are made using pattern messages, which use Java regular expressions. To read a sequence of bytes and match it against a regular expression, you place the Java regular expression in between ‘/’ characters.
read /.*World/
The above read expression will read bytes until a newline character is seen or it obtains a match. Once the newline character is seen, it will attempt to match the received bytes against the regular expression and fail if they do not match.
Note that this implies that when matching a line, you should include the newline character inside the regular expression to avoid test failures due to packet fragmentation (see issue #200). For example, use:
read /.*\n/
instead of
read /.*/ "\n"
In general, Java regular expressions are supported.
In some cases you may want to capture part of a regex match and reuse the value later. To do this, use a named group in the regex:
read /(?<capture>[abc]+)/
"capture" will then be available as a variable for future use.
In k3po releases 3.0.0-alpha-85 and later, you can use the keywords byte, short, int, and long to read 1, 2, 4, and 8 bytes respectively:
read byte
# Note: this currently does not work, see https://github.com/k3po/k3po/issues/185, write byte 0xnn is ok
read short
read int
read long
In the same fashion you may ask it to read the specified number of bytes and match it against a signed value. As a shorthand, you can also use the suffix L or s to indicate long or short literals; no suffix means int. You can use the same syntax with write as well.
Example | Meaning |
---|---|
read byte 2 | Read a byte and match it against the value 0x02 |
read short 2 | Read two bytes and match it against the value 0x0002 |
read int 2 | Read four bytes and match it against the value 0x00000002 |
read long 2 | Read eight bytes and match it against the value 0x0000000000000002 |
read 2s | Equivalent to read short 2 |
read 2 | Equivalent to read int 2 |
read 2L | Equivalent to read long 2 |
write 2s | Write the value 2 as a 2 byte integer |
write int -47 | Write the value -47 as a 4 byte integer |
Numeric values are read and written in network (big endian) byte order, except that from k3po 3.0.0-alpha-81 the Agrona transport reads and writes them in native byte order (most commonly little endian).
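As a sketch of how the typed forms combine with text messages, the script below exchanges length-prefixed strings. The framing (a 2-byte length followed by the text) is an assumption made up for this example, not something K3PO defines; the port and messages are likewise illustrative.

```
# Length-prefixed framing (illustrative sketch)
connect tcp://localhost:9876
connected
# 13 is the length of "Hello, World!"; in network byte order: 0x00 0x0d
write short 13
write "Hello, World!"
# Expect a 2-byte length of 17 (0x00 0x11), then the 17-character reply
read short 17
read "Hello, programmer"
close
closed
```

With a transport that uses native byte order, as noted above for Agrona, the bytes on the wire for the two length fields would differ accordingly.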
As of k3po 3.0.0-alpha-81 integer literals can be expressed in the form 0xhhhh where h represents a hexadecimal digit (0 through 9, a through f, A through F) and long literals can be expressed in the form 0xhhhL or 123L (decimal). In addition, the underscore character '_' may be used as a separator for clarity, for example: 0x0001_00000000000cL.
Reading and writing values is a great start, but we also need to be able to read and write values stored from previous reads. So we introduce the idea of capturing the results of our read directives into temporary variables. Variables are scoped to the current channel and can only be assigned once. Each form of read directive that allows the bytes read to vary also has a form that captures the result into a temporary variable. Reads of fixed literals cannot be captured to a variable, since the value is constant. Variables have a type. In most cases the type of the variable is an array of bytes (byte[]). The exception is read directives that include a type qualifier (byte, short, int, long); those variables have the type named in the statement, for example:
read (byte:subscription)
Reads one byte into a variable called "subscription".
Once you have captured a value into a variable, you may later use this variable to read and match the exact same bytes, write the variable out, or specify how many bytes a read statement should read.
read ${var}
Expects to read the previously captured value stored in var.
write ${var}
Writes out the previously captured value stored in var.
Variables can be set in a number of ways, including regex named capture groups, reading a number of bytes, or setting a property.
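Putting the capture forms together, the following sketch captures a typed value, uses it as a byte count, and reuses both a byte capture and a regex named group. The variable names (length, payload, word), the port, and the toy protocol itself are assumptions for illustration; the `[0..${length}]` form is inferred from the description of N as an EL expression.

```
# Capture and reuse (illustrative sketch)
connect tcp://localhost:9876
connected
# Capture a 4-byte length as an int, then capture that many bytes
read (int:length)
read ([0..${length}]:payload)
# Echo the captured payload back unchanged
write ${payload}
# Capture a run of lowercase letters with a named group...
read /(?<word>[a-z]+)\n/
# ...and expect the peer to send the same word again
read ${word} "\n"
close
closed
```

Since variables can only be assigned once, each capture in a channel needs its own name.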
In some instances scripts will be required to implement dynamic behavior, such as generating random bytes. A script author can use the Functions SPI (TODO add link) to accomplish this. The SPI takes a prefix and a method name. Once implemented, functions can be called as follows:
${prefix:function(parameters)}
Properties are the equivalent of variable definitions. They typically occur at the top of the script.
property variable <value>
where value can be a text literal, function, or specific bytes.
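A minimal sketch of a property holding specific bytes, then used like any other variable. The property name magic, its byte values, and the port are made up for this example.

```
# Property sketch (illustrative)
# Four arbitrary bytes (ASCII "K3PO") defined once at the top of the script
property magic [0x4b 0x33 0x50 0x4f]

connect tcp://localhost:9876
connected
# Write the bytes held in the property, then expect them echoed back
write ${magic}
read ${magic}
close
closed
```

Defining the bytes once keeps the write and the matching read from drifting apart when the value changes.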
The instruction read aborted (or just aborted prior to k3po version 3.0.0-alpha-77) is used to indicate that the transport layer underneath you ended unexpectedly. The instruction write abort (formerly just abort) is used to terminate the transport layer underneath you. These instructions are currently only implemented for http, where write abort can be used to kill the tcp connection (e.g. in the middle of a request/response or websocket session).
The above instructions, and in addition read abort and write aborted, are defined in the grammar and can be implemented by k3po extensions (for example K3PO Nukleus Extension).