# awk

an awk-ward (or not really) language.

some notes in progress.

# resources

=> https://www.gnu.org/software/gawk/manual/html_node/index.html gawk user's guide
=> https://www.tutorialspoint.com/awk/index.htm awk tutorial: tutorialspoint

# some built-in variables

* FS: input field separator (by default, a single space, which splits on runs of whitespace)
* RS: input record separator (by default, newline)
* NF: number of fields in the current record
* NR: number of the current record (total records read so far)
* FNR: number of the current record relative to the current file
* OFS: output field separator (by default, space)
* ORS: output record separator (by default, newline)
* RLENGTH: length of the string matched by match( ), or -1 if there was no match

$0 represents the entire input record, and $n the nth field in the current record.

=> https://www.tutorialspoint.com/awk/awk_built_in_variables.htm awk tutorial: built-in variables
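
a minimal sketch of how these variables interact, assuming some made-up colon-separated input (the names and values are only for illustration, and the shell variables input and out are just helpers):

```shell
# made-up sample input: one colon-separated record per line
input='alice:30:admin
bob:25:user'

# FS=":" splits each record on colons; OFS="-" is what print puts between its arguments
out=$(printf '%s\n' "$input" | awk '
BEGIN { FS = ":"; OFS = "-" }
{ print NR, NF, $1, $NF }    # record number, field count, first and last field
END { print "total", NR }     # in END, NR holds the number of records read
')
printf '%s\n' "$out"
```

with this input, the output should be "1-3-alice-admin", "2-3-bob-user" and "total-2", one per line.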

# some built-in functions

=> https://www.tutorialspoint.com/awk/awk_built_in_functions.htm awk tutorial: built-in functions
=> https://www.tutorialspoint.com/awk/awk_miscellaneous_functions.htm miscellaneous functions
=> https://www.tutorialspoint.com/awk/awk_string_functions.htm string functions

## strings

the index of the first character is 1!

* index( string, sub ): index of the first occurrence of sub in string, or 0 if it is not found
* length( string ): length of string
* match( string, regex ): index of the leftmost, longest match of regex in string, or 0 if there is none. it also sets RSTART and RLENGTH
* split( string, arr, regex ): split string into the array arr using regex as separator, and return the number of elements. if regex is omitted, FS is used
* printf( format, expr-list ): print expr-list formatted according to format
* strtonum( string ): gawk only. useful to convert from hexadecimal (0x prefix) or octal (0 prefix)
* gsub( regex, sub, string ): global substitution of regex with sub in string, returning the number of substitutions. if string is omitted, $0 is used
* sub( regex, sub, string ): substitute regex with sub in string, once. if string is omitted, $0 is used
* substr( string, start, len ): returns the substring from start index, with length len. if len is omitted, it goes until the end of the string
* tolower( str )
* toupper( str )
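
a small sketch exercising several of the functions above on a made-up string. it sticks to functions available in any POSIX awk, so strtonum is left out; the variable names (s, t, parts, count, out) are arbitrary:

```shell
out=$(awk 'BEGIN {
    s = "hello world"
    print index(s, "world")          # 7: indices start at 1
    print length(s)                  # 11
    if (match(s, /o+/))
        print RSTART, RLENGTH        # 5 1: start and length of the match
    print substr(s, 7)               # world: no len, so up to the end
    print substr(s, 1, 5)            # hello
    n = split("a,b,c", parts, ",")
    print n, parts[1], parts[3]      # 3 a c
    t = s
    count = gsub(/o/, "0", t)        # replaces every match, returns how many
    print count, t                   # 2 hell0 w0rld
    sub(/l/, "L", t)                 # replaces only the first match
    print t                          # heLl0 w0rld
    print toupper(s)                 # HELLO WORLD
}')
printf '%s\n' "$out"
```

a BEGIN-only program like this one reads no input, so it can be run without any file or pipe.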

## misc

* getline: reads the next record into $0
* next: stops processing the current record and starts over with the next one
* system( command ): executes the specified command and returns its exit status
* delete: deletes an element from an array, or a whole array
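
a rough sketch of these four in one program, assuming a made-up input with one comment line. the array name seen and the command "exit 3" are only for illustration:

```shell
# made-up sample input
input='# comment
one
two
three'

out=$(printf '%s\n' "$input" | awk '
/^#/ { next }                  # skip comment lines: no later rule runs for them
$0 == "one" {
    getline                    # replace $0 with the following record ("two")
    print "after one:", $0
    next
}
{ seen[$0] = 1; print "plain:", $0 }
END {
    delete seen["three"]       # remove a single element
    print (("three" in seen) ? "still there" : "deleted")
    status = system("exit 3")  # run a shell command and get its exit status
    print "status:", status
}')
printf '%s\n' "$out"
```

note that "two" is never seen by the last rule, because getline consumed it inside the rule for "one".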

# gawk

=> https://www.gnu.org/software/gawk/manual/html_node/index.html gawk user's guide
=> https://www.gnu.org/software/gawk/manual/html_node/Bitwise-Functions.html bit-manipulation functions

note: to run in compatibility mode, without gnu extensions, use the -c or --traditional flag:

```
$ awk -c
```

# other notes
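
* counters/accumulators don't need to be initialized to 0: an unset variable evaluates to 0 in numeric context and to "" in string context
* boolean values are 0 or 1: comparisons return 1 (true) or 0 (false), and any non-zero number or non-empty string counts as true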
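
a quick sketch of both points, using three made-up numbers as input. the variable names (sum, count, big, unset) are arbitrary:

```shell
out=$(printf '%s\n' 3 4 5 | awk '
{ sum += $1; count++ }         # sum and count start out as 0 without initialization
$1 > 3 { big++ }
END {
    print sum, count, big      # 12 3 2
    t = (2 > 1); f = (2 < 1)   # comparisons evaluate to 1 or 0
    print t, f                 # 1 0
    if ("x") print "non-empty string is true"
    if (!unset) print "unset variable is false"
}')
printf '%s\n' "$out"
```

the comparisons are assigned to variables first because a bare > inside print would be taken as output redirection.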

## record separation

records separated by empty lines can be extracted with:

```
BEGIN { RS = "" }
```

without modifying FS, fields will be separated by any whitespace, including newlines.

gawk allows a regexp in RS; traditional awk only accepts a single character.
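
a minimal sketch of records separated by empty lines, on made-up input. the first run keeps the default FS, the second one assumes you want one field per line and sets FS to a newline:

```shell
# made-up sample input: two records separated by an empty line
input='name: alice
role: admin

name: bob
role: user'

# default FS: spaces and newlines both separate fields
out_ws=$(printf '%s\n' "$input" | awk '
BEGIN { RS = "" }
{ print NR ": " NF " fields, second word is " $2 }
')

# FS = "\n": each line of the record becomes one field
out_nl=$(printf '%s\n' "$input" | awk '
BEGIN { RS = ""; FS = "\n" }
{ print NF " lines, first is " $1 }
')

printf '%s\n' "$out_ws" "$out_nl"
```

the first run reports 4 fields per record, the second one 2.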

## loop through the elements of an array

```
BEGIN {
    arr["a"] = 1
    arr["b"] = 2

    # the order in which keys are visited is not specified
    for (key in arr)
        print key ": " arr[key]
}
```