# awk

an awk-ward (or not really) language.

some notes in process.

# projects

g2e is an opinionated gempub to epub converter written in awk:

=> https://tildegit.org/sejo/g2e g2e

my solutions for {advent of code 2021} have been written in awk.

# resources

=> https://www.gnu.org/software/gawk/manual/html_node/index.html gawk user's guide
=> https://www.tutorialspoint.com/awk/index.htm awk tutorial: tutorialspoint

# some built-in variables

* FS: input field separator (by default, a single space, which splits on runs of spaces, tabs and newlines)
* RS: input record separator (by default, newline)
* NF: number of fields in the current record
* NR: number of the current record, counted across all input files
* FNR: number of the current record relative to the current file
* OFS: output field separator (by default, space)
* ORS: output record separator (by default, newline)
* RSTART: index of the first character of the string matched by match( ), or 0 if there was no match
* RLENGTH: length of the string matched by match( ), or -1 if there was no match
* ARGV: array that stores the command-line arguments. index starts at 0, where the name of the awk command is stored; options and the program text are not included
* ARGC: number of elements in ARGV
* ARGIND: index in ARGV of the file currently being processed (gawk extension, not available in all awks)

$0 represents the entire input record, and $n the nth field in the current record (counting from 1)

=> https://www.tutorialspoint.com/awk/awk_built_in_variables.htm awk tutorial: built-in variables

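a minimal sketch of a few of these variables working together. the input lines and the ":" separator are made up for illustration, and the program is run from a shell only so that it is easy to try:

```shell
# sample input is hypothetical: three colon-separated fields per line
out=$(printf 'ana:3:red\nbob:5:blue\n' | awk '
BEGIN { FS = ":"; OFS = " | " }   # split input on ":", join printed fields with " | "
{
    # NR: record number, NF: field count, $1 and $NF: first and last field
    print NR, NF, $1, $NF
}
END { print "records: " NR }      # in END, NR still holds the total record count
')
printf '%s\n' "$out"
```

the commas in print are what insert OFS between the values; concatenating with spaces, as in the END rule, does not use it.
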
# some built-in functions

=> https://www.tutorialspoint.com/awk/awk_built_in_functions.htm awk tutorial: built-in functions
=> https://www.tutorialspoint.com/awk/awk_miscellaneous_functions.htm miscellaneous functions
=> https://www.tutorialspoint.com/awk/awk_string_functions.htm string functions

## strings

the index of the first character is 1!

* index( string, sub ): index of the first occurrence of sub in string, or 0 if it is not found
* length( string ): number of characters in string
* match( string, regex ): index of the leftmost, longest match of regex in string, or 0 if there is none. sets RSTART and RLENGTH
* split( string, arr, regex ): split string into the array arr using regex as separator, and return the number of elements. if regex is omitted, FS is used
* printf( format, expr-list ): formatted output. sprintf( ) takes the same arguments and returns the string instead of printing it
* strtonum( string ): gawk extension. useful to convert from hexadecimal (0x prefix) or octal (0 prefix)
* gsub( regex, sub, string ): global substitution of regex with sub in string. if string is omitted, $0 is used
* sub( regex, sub, string ): substitute regex with sub in string, once. if string is omitted, $0 is used
* substr( string, start, len ): returns the substring from start index, with length len. if len is omitted, it goes until the end of the string
* tolower( str )
* toupper( str )

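the functions above can be tried out in a BEGIN block, which needs no input. this is only a sketch: the sample string and the variable names (s, t, parts, n) are made up, and strtonum( ) is left out because it is gawk-only:

```shell
out=$(awk 'BEGIN {
    s = "hello awk world"
    print index(s, "awk")              # position of the first "awk": 7
    print length(s)                    # 15 characters
    if (match(s, /wor[a-z]+/))         # match returns 0 when nothing matches
        print RSTART, RLENGTH, substr(s, RSTART, RLENGTH)
    n = split("a,b,c", parts, ",")     # n is the number of pieces
    print n, parts[1], parts[3]
    t = s                              # work on a copy: gsub and sub modify their target
    gsub(/o/, "0", t)                  # every "o" becomes "0"
    print t
    sub(/l/, "L", t)                   # only the first "l" becomes "L"
    print t
    print toupper(substr(s, 7, 3))     # three characters starting at index 7
}')
printf '%s\n' "$out"
```

note that gsub( ) and sub( ) change the string in place and return the number of substitutions, not the new string.
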
## misc

* getline: reads the next record. it also has several other forms:
=> https://www.gnu.org/software/gawk/manual/html_node/Getline.html Explicit Input with getline

> The getline command is used in several different ways and should not be used by beginners.
(?)

* next: stops processing the current record and starts over with the next one
* system: executes the specified command and returns its exit status
* delete: deletes an element from an array, or a whole array

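a small sketch that puts the four of them in one program. the input lines, the seen array and the "exit 3" command are all invented for the example, and only the plain form of getline is shown:

```shell
out=$(printf '# comment\none\ntwo\nthree\nfour\n' | awk '
/^#/ { next }                  # skip comment lines; no later rule runs for them
$0 == "one" {
    # plain getline replaces $0 with the following record and increments NR;
    # it returns 1 on success, 0 at end of input and -1 on error
    if ((getline) > 0)
        print "after one comes " $0
    next
}
{ seen[$0] = NR }              # remaining records: "three" and "four"
END {
    delete seen["three"]       # remove a single element
    n = 0
    for (k in seen) n++
    print "left in seen: " n
    status = system("exit 3")  # run a shell command and keep its exit status
    print "status: " status
}')
printf '%s\n' "$out"
```

because getline consumed "two" inside the second rule, that line never reaches the third rule.
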
# gawk

=> https://www.gnu.org/software/gawk/manual/html_node/index.html gawk user's guide
=> https://www.gnu.org/software/gawk/manual/html_node/Bitwise-Functions.html bit-manipulation functions

note: to run in compatibility mode, without gnu extensions, use the -c or --traditional flag:

```
$ gawk -c
```

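# other notes

* counters/accumulators don't need to be initialized at 0: an unset variable evaluates to 0 in numeric context (and to "" in string context)
* boolean values are 0 or 1: comparisons and logical operators return one of those, and any nonzero number or non-empty string counts as true
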
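both notes can be checked with a short program. this is a sketch with made-up input and variable names (total, count, positive, a, b, c):

```shell
out=$(printf 'apple 3\npear 0\napple 4\n' | awk '
{
    total += $2              # total was never set, so it starts out as 0
    count[$1]++              # array elements come into existence the same way
    if ($2 > 0) positive++
}
END {
    print total, count["apple"], positive
    a = (2 > 1); b = (2 < 1); c = !5   # comparisons and negation yield 1 or 0
    print a, b, c
}')
printf '%s\n' "$out"
```
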
## record separation

records separated by empty lines can be extracted with:

```
RS = ""
```

without modifying FS, fields will be separated by any whitespace, including newlines.

gawk allows a regexp in RS; traditional awk only uses a single character.

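a minimal sketch of this paragraph mode, assuming an invented input of two blocks separated by one empty line:

```shell
out=$(printf 'ana\n31\n\nbob\n27\n' | awk '
BEGIN { RS = "" }     # records are now blocks separated by empty lines
{
    # FS is untouched, so the newline inside each block also separates fields
    print NR ": " $1 " is " $2 " (" NF " fields)"
}')
printf '%s\n' "$out"
```
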
## field separation

one can get one character per field by setting FS to the empty string. this works in gawk and is a common extension, but posix does not specify it:

```
FS = ""
```

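when an empty FS cannot be relied on, a portable fallback is to walk the record with substr( ). this is a swapped-in technique, not the FS approach itself, and the input is made up:

```shell
out=$(printf 'abc\n' | awk '
{
    n = length($0)
    for (i = 1; i <= n; i++)
        # print position and character, separated by spaces
        printf "%s%d:%s", (i > 1 ? " " : ""), i, substr($0, i, 1)
    print ""          # end the line
}')
printf '%s\n' "$out"
```
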
## loop through the elements of an array

the order in which for-in visits the elements is not defined, so this approach might yield the results in a different order depending on the awk implementation.

```
BEGIN {
    arr["a"] = 1
    arr["b"] = 2

    for (key in arr)
        print key ": " arr[key]
}
```

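if a stable order is needed, one portable option is to pipe the printed lines through the external sort command. this is a sketch that assumes sort is available on the system; the third element is added only to make the reordering visible:

```shell
out=$(awk 'BEGIN {
    arr["b"] = 2
    arr["a"] = 1
    arr["c"] = 3
    for (key in arr)
        print key ": " arr[key] | "sort"   # each line goes to the sort process
    close("sort")                          # flush and wait for sort to finish
}')
printf '%s\n' "$out"
```

the lines come out ordered by key whatever order for-in used, because sort only prints once its input is closed.
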