From 1fa1f0329442c555635460221c809fecdaa977a7 Mon Sep 17 00:00:00 2001
From: Paul Buetow
Date: Tue, 7 Dec 2021 09:23:00 +0000
Subject: initial logformats and querylanguage documentation

---
 doc/querylanguage.md | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)
 create mode 100644 doc/querylanguage.md

(limited to 'doc/querylanguage.md')

diff --git a/doc/querylanguage.md b/doc/querylanguage.md
new file mode 100644
index 0000000..9f38f5e
--- /dev/null
+++ b/doc/querylanguage.md
@@ -0,0 +1,18 @@
+DTail Query Language
+====================
+
+The query language allows you to run mapreduce queries on log files. This page intends to be a reference to the language.
+
+## Prerequisites
+
+For this to work, DTail needs to understand your log format. DTail already understands its own log format. You can have a look at all examples of the [examples](./examples.md) page using `-query` (these would be all examples of the `dmap` command, and some examples using the `dtail` command).
+
+DTail also ships with a generic log format, which only allows you to run very basic queries. Check out the [log formats](./logformats.md) documentation for this.
+
+To implement your own log format, please also check out the [log formats](./logformats.md) documentation.
+
+## The complete language
+
+```
+TODO: Add EBNF
+```
--
cgit v1.2.3

From 18d1783378732b6abca0eb89e29636cc81c02db8 Mon Sep 17 00:00:00 2001
From: Paul Buetow
Date: Thu, 9 Dec 2021 09:53:48 +0000
Subject: Add query language EBNF and variables.

---
 doc/querylanguage.md | 72 ++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 70 insertions(+), 2 deletions(-)

(limited to 'doc/querylanguage.md')

diff --git a/doc/querylanguage.md b/doc/querylanguage.md
index 9f38f5e..c6b9beb 100644
--- a/doc/querylanguage.md
+++ b/doc/querylanguage.md
@@ -13,6 +13,74 @@ To implement your own log format, please also check out the [log formats](./logf
 
 ## The complete language
 
+```shell
+QUERY :=
+  select SELECT1[,SELECT2...]
+  from TABLE
+  [where COND1[,COND2...]]
+  [group by GROUPFIELD1[,GROUPFIELD2...]]
+  [order|rorder by ORDERFIELD]
+  [interval SECONDS]
+  [limit NUM]
+  [outfile "FILENAME.csv"]
+SELECT := FIELD|AGGREGATION(FIELD)
+TABLE := The mapreduce table name, e.g. WRITE in MAPREDUCE:WRITE
+AGGREGATION := count|sum|min|max|avg|last|len
+COND := ARG1 OPERATOR ARG2
+ARG := This is either
+  a string: "foo bar"
+  a float number: 3.14
+  a bareword e.g.: responsecode
+  or a $variable (see below).
+OPERATOR := This is one of ...
+  Floating point operators:
+    == != < <= > >=
+  String operators:
+    eq ne contains lacks (lacks is the opposite of contains, e.g.
+      "not contains")
+GROUPFIELD := bareword|$variable
+ORDERFIELD := This must be a AGGREGATION(FIELD) or FIELD which was specified in
+  select clause already.
 ```
-TODO: Add EBNF
-```
+
+## Predefined variables
+
+This is the list of pre-defined variables. Please note that these vary depending on the log format used.
+
+### Common variables:
+
+The common variables may exist in all log formats.
+
+* `$empty` - The empty string `""`
+* `$hostname` - The server FQDN
+* `$line` - The current log line
+* `$server` - Alias for `$hostname`
+* `$timeoffset` - Offset of $timezone
+* `$timezone` - The current time zone
+* `* (special placeholder)
+
+### DTail default log format:
+
+These variables may only exist when your logs are in the DTail default log format:
+
+*Date and time:*
+
+* `$hour` - The current hour in format HH
+* `$minute` - The current minute in format MM
+* `$second` - The current second in format SS.
+* `$time` - The current time in format YYYYMMDD-HHMMSS
+
+*Log level/severity:*
+
+* `$loglevel` - Alias for `$severity`
+* `$severity` - The log severity
+
+*System and Go runtime:*
+
+* `$caller` - DTail server caller of the logger
+* `$cgocalls` - Num of DTail server CGo calls
+* `$cpus` - Num of DTail server CPUs used
+* `$goroutines` - Num of DTail server Goroutines used
+* `$loadavg` - 1 min. average load average
+* `$pid` - DTail server process ID
+* `$uptime` - DTail server uptime
--
cgit v1.2.3

From a9372bc8a882b59fcdd3997a56acc2338776f602 Mon Sep 17 00:00:00 2001
From: Paul Buetow
Date: Thu, 9 Dec 2021 10:22:25 +0000
Subject: Documenting log formats

---
 doc/querylanguage.md | 53 ++++++----------------------------------------------
 1 file changed, 6 insertions(+), 47 deletions(-)

(limited to 'doc/querylanguage.md')

diff --git a/doc/querylanguage.md b/doc/querylanguage.md
index c6b9beb..96d0fd1 100644
--- a/doc/querylanguage.md
+++ b/doc/querylanguage.md
@@ -7,9 +7,7 @@ The query language allows you to run mapreduce queries on log files. This page i
 
 For this to work, DTail needs to understand your log format. DTail already understands its own log format. You can have a look at all examples of the [examples](./examples.md) page using `-query` (these would be all examples of the `dmap` command, and some examples using the `dtail` command).
 
-DTail also ships with a generic log format, which only allows you to run very basic queries. Check out the [log formats](./logformats.md) documentation for this.
-
-To implement your own log format, please also check out the [log formats](./logformats.md) documentation.
+DTail also ships with a generic log format, which only allows you to run very basic queries. Check out the [log format](./logformats.md) documentation for this. To implement your own log format, please also check out the log format documentation.
 
 ## The complete language
 
@@ -23,6 +21,7 @@ QUERY :=
   [interval SECONDS]
   [limit NUM]
   [outfile "FILENAME.csv"]
+  [logformat LOGFORMAT]
 SELECT := FIELD|AGGREGATION(FIELD)
 TABLE := The mapreduce table name, e.g. WRITE in MAPREDUCE:WRITE
 AGGREGATION := count|sum|min|max|avg|last|len
@@ -31,56 +30,16 @@ ARG := This is either
   a string: "foo bar"
   a float number: 3.14
   a bareword e.g.: responsecode
-  or a $variable (see below).
+  a field or a $variable
 OPERATOR := This is one of ...
   Floating point operators:
     == != < <= > >=
   String operators:
-    eq ne contains lacks (lacks is the opposite of contains, e.g.
-      "not contains")
+    eq ne contains lacks (lacks is the opposite of contains, e.g. "not contains")
 GROUPFIELD := bareword|$variable
 ORDERFIELD := This must be a AGGREGATION(FIELD) or FIELD which was specified in
   select clause already.
+LOGFORMAT := The name of the log format implementation. It's 'default' by default.
 ```
 
-## Predefined variables
-
-This is the list of pre-defined variables. Please note that these vary depending on the log format used.
-
-### Common variables:
-
-The common variables may exist in all log formats.
-
-* `$empty` - The empty string `""`
-* `$hostname` - The server FQDN
-* `$line` - The current log line
-* `$server` - Alias for `$hostname`
-* `$timeoffset` - Offset of $timezone
-* `$timezone` - The current time zone
-* `* (special placeholder)
-
-### DTail default log format:
-
-These variables may only exist when your logs are in the DTail default log format:
-
-*Date and time:*
-
-* `$hour` - The current hour in format HH
-* `$minute` - The current minute in format MM
-* `$second` - The current second in format SS.
-* `$time` - The current time in format YYYYMMDD-HHMMSS
-
-*Log level/severity:*
-
-* `$loglevel` - Alias for `$severity`
-* `$severity` - The log severity
-
-*System and Go runtime:*
-
-* `$caller` - DTail server caller of the logger
-* `$cgocalls` - Num of DTail server CGo calls
-* `$cpus` - Num of DTail server CPUs used
-* `$goroutines` - Num of DTail server Goroutines used
-* `$loadavg` - 1 min. average load average
-* `$pid` - DTail server process ID
-* `$uptime` - DTail server uptime
+Note, that the available fields and variables vary from the log format used. There is also a subtle difference between a field and a variable. Check out the [log format](./logformats.md) documentation for more information.
--
cgit v1.2.3

From c1e52e7a7fd925ae8450708c4337d6808e014ba4 Mon Sep 17 00:00:00 2001
From: Paul Buetow
Date: Fri, 10 Dec 2021 10:36:40 +0000
Subject: remove trace logging

---
 doc/querylanguage.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

(limited to 'doc/querylanguage.md')

diff --git a/doc/querylanguage.md b/doc/querylanguage.md
index 96d0fd1..2819a77 100644
--- a/doc/querylanguage.md
+++ b/doc/querylanguage.md
@@ -29,8 +29,8 @@ COND := ARG1 OPERATOR ARG2
 ARG := This is either
   a string: "foo bar"
   a float number: 3.14
-  a bareword e.g.: responsecode
-  a field or a $variable
+  a bareword (aka a field) e.g.: responsecode
+  or a $variable
 OPERATOR := This is one of ...
   Floating point operators:
     == != < <= > >=
@@ -39,7 +39,7 @@ OPERATOR := This is one of ...
 GROUPFIELD := bareword|$variable
 ORDERFIELD := This must be a AGGREGATION(FIELD) or FIELD which was specified in
   select clause already.
-LOGFORMAT := The name of the log format implementation. It's 'default' by default.
+LOGFORMAT := The name of the log format implementation. It's "default" by default.
 ```
 
 Note, that the available fields and variables vary from the log format used. There is also a subtle difference between a field and a variable. Check out the [log format](./logformats.md) documentation for more information.
--
cgit v1.2.3

From 040ea682db4df81c4888b771a785418262397e00 Mon Sep 17 00:00:00 2001
From: Paul Buetow
Date: Sun, 12 Dec 2021 21:12:14 +0000
Subject: Also document query language functions

---
 doc/querylanguage.md | 73 ++++++++++++++++++++++++++++++++--------------------
 1 file changed, 45 insertions(+), 28 deletions(-)

(limited to 'doc/querylanguage.md')

diff --git a/doc/querylanguage.md b/doc/querylanguage.md
index 2819a77..4fffdc3 100644
--- a/doc/querylanguage.md
+++ b/doc/querylanguage.md
@@ -9,37 +9,54 @@ For this to work, DTail needs to understand your log format. DTail already under
 
 DTail also ships with a generic log format, which only allows you to run very basic queries. Check out the [log format](./logformats.md) documentation for this. To implement your own log format, please also check out the log format documentation.
 
-## The complete language
+## The language
+
+These are the fundamental types of the query language:
+
+```shell
+NUMBER := A whole number (e.g. 42)
+FLOAT := A float number, e.g. 3.14
+STRING := A quoted string, e.g. "foo"
+FIELD := BAREWORD|VARIABLE
+BAREWORD := A bare string without quotes, e.g. foo. This usually contains a value
+  extracted from a log line.
+VARIABLE := Like a bareword, but with a $ prefix, e.g. $foo. This usually contains
+  a special value set by DTail itself (not necessary from the log line).
+```
+
+This is the overall structure of a query:
 
 ```shell
-QUERY :=
-  select SELECT1[,SELECT2...]
-  from TABLE
-  [where COND1[,COND2...]]
-  [group by GROUPFIELD1[,GROUPFIELD2...]]
-  [order|rorder by ORDERFIELD]
-  [interval SECONDS]
-  [limit NUM]
-  [outfile "FILENAME.csv"]
-  [logformat LOGFORMAT]
+QUERY := from TABLE
+         select SELECT1[,SELECT2...]
+         [where CONDITION1[,CONDITION2...]]
+         [group by FIELD1[,FIELD2...]]
+         [order|rorder by ORDERFIELD]
+         [set SET1,[,SET2...]]
+         [interval NUMBER]
+         [limit NUMBER]
+         [outfile STRING]
+         [logformat LOGFORMAT]
+```
+
+Whereas....
+
+```shell
+TABLE := The mapreduce table name, e.g. STATS in MAPREDUCE:STATS
 SELECT := FIELD|AGGREGATION(FIELD)
-TABLE := The mapreduce table name, e.g. WRITE in MAPREDUCE:WRITE
+CONDITION := ARG1 OPERATOR ARG2
+ARG := FIELD|FLOAT|STRING
+OPERATOR := FLOATOPERATOR|STRINGOPERATOR
+FLOATOPERATOR := One of: == != < <= > >=
+STRINGOPERATOR := eq|ne|contains|lacks
+ORDERFIELD := FIELD|AGGREGATION(FIELD)
+SET := VARIABLE = FLOAT|STRING|FIELD|FUNCTION(FIELD)
+LOGFORMAT := default|generic|generickv|...
 AGGREGATION := count|sum|min|max|avg|last|len
-COND := ARG1 OPERATOR ARG2
-ARG := This is either
-  a string: "foo bar"
-  a float number: 3.14
-  a bareword (aka a field) e.g.: responsecode
-  or a $variable
-OPERATOR := This is one of ...
-  Floating point operators:
-    == != < <= > >=
-  String operators:
-    eq ne contains lacks (lacks is the opposite of contains, e.g. "not contains")
-GROUPFIELD := bareword|$variable
-ORDERFIELD := This must be a AGGREGATION(FIELD) or FIELD which was specified in
-  select clause already.
-LOGFORMAT := The name of the log format implementation. It's "default" by default.
+FUNCTION := md5sum|maskdigits
 ```
 
-Note, that the available fields and variables vary from the log format used. There is also a subtle difference between a field and a variable. Check out the [log format](./logformats.md) documentation for more information.
+*Notes:*
+
+* `lacks` is the inverse of `contains`)
+* Available fields (variables and barewords) vary from the log format used. Check out the [log format](./logformats.md) documentation for more information.
--
cgit v1.2.3

From 242d419f1b31755d1d1b3d1a1fd0e7bf61f7768e Mon Sep 17 00:00:00 2001
From: Paul Buetow
Date: Sun, 12 Dec 2021 21:21:14 +0000
Subject: link from example docs to query language and log format docs

---
 doc/querylanguage.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

(limited to 'doc/querylanguage.md')

diff --git a/doc/querylanguage.md b/doc/querylanguage.md
index 4fffdc3..a603bd9 100644
--- a/doc/querylanguage.md
+++ b/doc/querylanguage.md
@@ -58,5 +58,6 @@ FUNCTION := md5sum|maskdigits
 
 *Notes:*
 
-* `lacks` is the inverse of `contains`)
+* `lacks` is the inverse of `contains`
+* `rorder` stands for reverse order and is the inverse of `order`
 * Available fields (variables and barewords) vary from the log format used. Check out the [log format](./logformats.md) documentation for more information.
--
cgit v1.2.3

From b1f3760dc2f452c3dba7883a538fd14d62a581e9 Mon Sep 17 00:00:00 2001
From: Paul Buetow
Date: Tue, 14 Dec 2021 10:27:23 +0000
Subject: Refactor makeWhereConditions

---
 doc/querylanguage.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

(limited to 'doc/querylanguage.md')

diff --git a/doc/querylanguage.md b/doc/querylanguage.md
index a603bd9..725b635 100644
--- a/doc/querylanguage.md
+++ b/doc/querylanguage.md
@@ -48,7 +48,7 @@ CONDITION := ARG1 OPERATOR ARG2
 ARG := FIELD|FLOAT|STRING
 OPERATOR := FLOATOPERATOR|STRINGOPERATOR
 FLOATOPERATOR := One of: == != < <= > >=
-STRINGOPERATOR := eq|ne|contains|lacks
+STRINGOPERATOR := eq|ne|contains|ncontains|lacks|hasprefix|nhasprefix|hassuffix|nhassuffix
 ORDERFIELD := FIELD|AGGREGATION(FIELD)
 SET := VARIABLE = FLOAT|STRING|FIELD|FUNCTION(FIELD)
 LOGFORMAT := default|generic|generickv|...
@@ -58,6 +58,6 @@ FUNCTION := md5sum|maskdigits
 
 *Notes:*
 
-* `lacks` is the inverse of `contains`
+* `lacks` is an alias for `ncontains` (not contains)
 * `rorder` stands for reverse order and is the inverse of `order`
 * Available fields (variables and barewords) vary from the log format used. Check out the [log format](./logformats.md) documentation for more information.
--
cgit v1.2.3

From 895ed15df5144e367a5143d1c36d8abe2fec8f08 Mon Sep 17 00:00:00 2001
From: Paul Buetow
Date: Wed, 15 Dec 2021 16:06:48 +0000
Subject: documenting how to implement a custom log format

---
 doc/querylanguage.md | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

(limited to 'doc/querylanguage.md')

diff --git a/doc/querylanguage.md b/doc/querylanguage.md
index 725b635..41e95de 100644
--- a/doc/querylanguage.md
+++ b/doc/querylanguage.md
@@ -1,34 +1,34 @@
 DTail Query Language
 ====================
 
-The query language allows you to run mapreduce queries on log files. This page intends to be a reference to the language.
+The query language allows you to run mapreduce queries on log files. This page is the reference to the language.
 
 ## Prerequisites
 
 For this to work, DTail needs to understand your log format. DTail already understands its own log format. You can have a look at all examples of the [examples](./examples.md) page using `-query` (these would be all examples of the `dmap` command, and some examples using the `dtail` command).
 
-DTail also ships with a generic log format, which only allows you to run very basic queries. Check out the [log format](./logformats.md) documentation for this. To implement your own log format, please also check out the log format documentation.
+DTail also ships with a generic log format, which only allows you to run very basic queries. Check out the [log format](./logformats.md) documentation for this. That page also documents how to implement your own log format parser.
 
 ## The language
 
-These are the fundamental types of the query language:
+This are the fundamental types of the query language:
 
 ```shell
 NUMBER := A whole number (e.g. 42)
 FLOAT := A float number, e.g. 3.14
 STRING := A quoted string, e.g. "foo"
-FIELD := BAREWORD|VARIABLE
+FIELD := BAREWORD|$VARIABLE
 BAREWORD := A bare string without quotes, e.g. foo. This usually contains a value
   extracted from a log line.
-VARIABLE := Like a bareword, but with a $ prefix, e.g. $foo. This usually contains
+$VARIABLE := Like a bareword, but with a $ prefix, e.g. $foo. This usually contains
   a special value set by DTail itself (not necessary from the log line).
 ```
 
 This is the overall structure of a query:
 
 ```shell
-QUERY := from TABLE
-         select SELECT1[,SELECT2...]
+QUERY := select SELECT1[,SELECT2...]
+         [from TABLE]
          [where CONDITION1[,CONDITION2...]]
          [group by FIELD1[,FIELD2...]]
          [order|rorder by ORDERFIELD]
@@ -39,7 +39,7 @@ QUERY := from TABLE
          [logformat LOGFORMAT]
 ```
 
-Whereas....
+... whereas:
 
 ```shell
 TABLE := The mapreduce table name, e.g. STATS in MAPREDUCE:STATS
@@ -50,7 +50,7 @@ OPERATOR := FLOATOPERATOR|STRINGOPERATOR
 FLOATOPERATOR := One of: == != < <= > >=
 STRINGOPERATOR := eq|ne|contains|ncontains|lacks|hasprefix|nhasprefix|hassuffix|nhassuffix
 ORDERFIELD := FIELD|AGGREGATION(FIELD)
-SET := VARIABLE = FLOAT|STRING|FIELD|FUNCTION(FIELD)
+SET := $VARIABLE = FLOAT|STRING|FIELD|FUNCTION(FIELD)
 LOGFORMAT := default|generic|generickv|...
 AGGREGATION := count|sum|min|max|avg|last|len
 FUNCTION := md5sum|maskdigits
@@ -58,6 +58,6 @@ FUNCTION := md5sum|maskdigits
 
 *Notes:*
 
-* `lacks` is an alias for `ncontains` (not contains)
-* `rorder` stands for reverse order and is the inverse of `order`
+* `rorder` stands for reverse order.
+* `lacks` is an alias for `ncontains` (not contains).
 * Available fields (variables and barewords) vary from the log format used. Check out the [log format](./logformats.md) documentation for more information.
--
cgit v1.2.3
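
The `CONDITION`, `FLOATOPERATOR` and `STRINGOPERATOR` rules in the last revision of the grammar can be illustrated with a small evaluator. This is only a Python sketch written for illustration, not DTail's implementation: the function name `evaluate_condition` and both lookup tables are hypothetical. The notes confirm that `lacks` is an alias for `ncontains`; reading the other `n`-prefixed operators as negations, and comparing float operands after conversion with `float()`, are assumptions.

```python
import operator

# FLOATOPERATOR rule: both arguments are compared as floats (assumed).
FLOAT_OPERATORS = {
    "==": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}

# STRINGOPERATOR rule: semantics inferred from the operator names (assumed).
STRING_OPERATORS = {
    "eq": lambda a, b: a == b,
    "ne": lambda a, b: a != b,
    "contains": lambda a, b: b in a,
    "ncontains": lambda a, b: b not in a,
    "lacks": lambda a, b: b not in a,  # documented alias for ncontains
    "hasprefix": lambda a, b: a.startswith(b),
    "nhasprefix": lambda a, b: not a.startswith(b),
    "hassuffix": lambda a, b: a.endswith(b),
    "nhassuffix": lambda a, b: not a.endswith(b),
}


def evaluate_condition(arg1, op, arg2):
    """Evaluate one 'ARG1 OPERATOR ARG2' condition on already resolved values."""
    if op in FLOAT_OPERATORS:
        return FLOAT_OPERATORS[op](float(arg1), float(arg2))
    if op in STRING_OPERATORS:
        return STRING_OPERATORS[op](str(arg1), str(arg2))
    raise ValueError(f"unknown operator: {op}")
```

For example, `evaluate_condition("GET /index", "hasprefix", "GET")` is true, and `lacks` gives the same result as `ncontains` for every pair of arguments.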