Diffstat (limited to 'doc/querylanguage.md')
| -rw-r--r-- | doc/querylanguage.md | 63 |
1 files changed, 63 insertions, 0 deletions
diff --git a/doc/querylanguage.md b/doc/querylanguage.md
new file mode 100644
index 0000000..41e95de
--- /dev/null
+++ b/doc/querylanguage.md
@@ -0,0 +1,63 @@
+DTail Query Language
+====================
+
+The query language allows you to run mapreduce queries on log files. This page is the language reference.
+
+## Prerequisites
+
+For this to work, DTail needs to understand your log format. DTail already understands its own log format. See the examples on the [examples](./examples.md) page that use `-query` (all of the `dmap` examples and some of the `dtail` examples).
+
+DTail also ships with a generic log format, which supports only very basic queries. See the [log format](./logformats.md) documentation for details. That page also documents how to implement your own log format parser.
+
+## The language
+
+These are the fundamental types of the query language:
+
+```shell
+NUMBER := A whole number, e.g. 42
+FLOAT := A float number, e.g. 3.14
+STRING := A quoted string, e.g. "foo"
+FIELD := BAREWORD|$VARIABLE
+BAREWORD := A bare string without quotes, e.g. foo. This usually contains a value
+    extracted from a log line.
+$VARIABLE := Like a bareword, but with a $ prefix, e.g. $foo. This usually contains
+    a special value set by DTail itself (not necessarily from the log line).
+```
+
+This is the overall structure of a query:
+
+```shell
+QUERY := select SELECT1[,SELECT2...]
+    [from TABLE]
+    [where CONDITION1[,CONDITION2...]]
+    [group by FIELD1[,FIELD2...]]
+    [order|rorder by ORDERFIELD]
+    [set SET1[,SET2...]]
+    [interval NUMBER]
+    [limit NUMBER]
+    [outfile STRING]
+    [logformat LOGFORMAT]
+```
+
+... where:
+
+```shell
+TABLE := The mapreduce table name, e.g. STATS in MAPREDUCE:STATS
+SELECT := FIELD|AGGREGATION(FIELD)
+CONDITION := ARG1 OPERATOR ARG2
+ARG := FIELD|FLOAT|STRING
+OPERATOR := FLOATOPERATOR|STRINGOPERATOR
+FLOATOPERATOR := One of: == != < <= > >=
+STRINGOPERATOR := eq|ne|contains|ncontains|lacks|hasprefix|nhasprefix|hassuffix|nhassuffix
+ORDERFIELD := FIELD|AGGREGATION(FIELD)
+SET := $VARIABLE = FLOAT|STRING|FIELD|FUNCTION(FIELD)
+LOGFORMAT := default|generic|generickv|...
+AGGREGATION := count|sum|min|max|avg|last|len
+FUNCTION := md5sum|maskdigits
+```
+
+*Notes:*
+
+* `rorder` stands for reverse order.
+* `lacks` is an alias for `ncontains` (not contains).
+* The available fields (variables and barewords) depend on the log format used. Check out the [log format](./logformats.md) documentation for more information.
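
To make the grammar in this diff more concrete, here is a minimal sketch of how a query string could be assembled from its clauses. It is not part of DTail: `build_query` and `condition` are hypothetical helpers written for illustration, they cover only a subset of the clauses (`select`, `from`, `where`, `group by`, `order|rorder by`, `limit`), and the field names `$hostname` and `$line` are assumptions that may not exist in your log format. The clauses are emitted in the order the grammar lists them; whether DTail requires that order is not stated in the documentation above.

```python
from typing import Optional, Sequence

# Operators copied from the FLOATOPERATOR and STRINGOPERATOR rules of the grammar.
FLOAT_OPERATORS = {"==", "!=", "<", "<=", ">", ">="}
STRING_OPERATORS = {
    "eq", "ne", "contains", "ncontains", "lacks",
    "hasprefix", "nhasprefix", "hassuffix", "nhassuffix",
}


def condition(arg1: str, operator: str, arg2: str) -> str:
    """Render CONDITION := ARG1 OPERATOR ARG2, rejecting unknown operators."""
    if operator not in FLOAT_OPERATORS | STRING_OPERATORS:
        raise ValueError(f"unknown operator: {operator!r}")
    return f"{arg1} {operator} {arg2}"


def build_query(
    select: Sequence[str],
    table: Optional[str] = None,
    where: Sequence[str] = (),
    group_by: Sequence[str] = (),
    order_by: Optional[str] = None,
    reverse: bool = False,
    limit: Optional[int] = None,
) -> str:
    """Join the given clauses in the order used by the QUERY rule (hypothetical helper)."""
    if not select:
        raise ValueError("a query needs at least one SELECT item")
    parts = ["select " + ",".join(select)]
    if table is not None:
        parts.append("from " + table)
    if where:
        parts.append("where " + ",".join(where))
    if group_by:
        parts.append("group by " + ",".join(group_by))
    if order_by is not None:
        # rorder is the reverse-order variant of order.
        parts.append(("rorder" if reverse else "order") + " by " + order_by)
    if limit is not None:
        parts.append(f"limit {limit}")
    return " ".join(parts)


# Example field names ($hostname, $line) are assumptions, not taken from the diff.
query = build_query(
    select=["count($line)", "$hostname"],
    table="STATS",
    where=[condition("$hostname", "hasprefix", '"web"')],
    group_by=["$hostname"],
    order_by="count($line)",
    reverse=True,
    limit=10,
)
print(query)
```

The resulting string is what would be passed to the `-query` option mentioned in the Prerequisites section; validating operators up front is a design choice of this sketch, not something the documentation prescribes.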
