rspamd_util
This module contains general-purpose utilities that can be useful for testing and production rules.
Functions:
| Function | Description |
|---|---|
| util.create_event_base() | Creates new event base for processing asynchronous events. |
| util.load_rspamd_config(filename) | Load rspamd config from the specified file. |
| util.config_from_ucl(any, string) | Load rspamd config from ucl represented by any lua table. |
| util.encode_base64(input[, str_len, [newlines_type]]) | Encodes data in base64 breaking lines if needed. |
| util.encode_qp(input[, str_len, [newlines_type]]) | Encodes data in quoted printable breaking lines if needed. |
| util.decode_qp(input) | Decodes data from quoted printable. |
| util.decode_base64(input) | Decodes data from base64 ignoring whitespace characters. |
| util.encode_base32(input, [b32type = 'default']) | Encodes data in base32 breaking lines if needed. |
| util.decode_base32(input, [b32type = 'default']) | Decodes data from base32 ignoring whitespace characters. |
| util.decode_url(input) | Decodes data from url encoding. |
| util.tokenize_text(input[, exceptions]) | Create tokens from a text using optional exceptions list. |
| util.tanh(num) | Calculates hyperbolic tangent of the specified floating point value. |
| util.parse_html(input) | Parses HTML and returns the corresponding text. |
| util.levenshtein_distance(s1, s2) | Returns Levenshtein distance between two strings. |
| util.fold_header(name, value, [how, [stop_chars]]) | Fold rfc822 header according to the folding rules. |
| util.is_uppercase(str) | Returns true if a string is all uppercase. |
| util.humanize_number(num) | Returns humanized representation of given number (like 1k instead of 1000). |
| util.get_tld(host) | Returns effective second level domain part (eSLD) for the specified host. |
| util.glob(pattern) | Returns results for the glob match for the specified pattern. |
| util.parse_mail_address(str, [pool]) | Parses email address and returns a table of tables. |
| util.strlen_utf8(str) | Returns length of string encoded in utf-8 in characters. |
| util.lower_utf8(str) | Converts utf8 string to lower case. |
| util.normalize_utf8(str) | Gets a string in UTF8 and normalises it to NFKC_Casefold form. |
| util.transliterate(str) | Converts utf8 encoded string to latin transliteration. |
| util.strequal_caseless(str1, str2) | Compares two strings regardless of their case using ascii comparison. |
| util.strequal_caseless_utf8(str1, str2) | Compares two utf8 strings regardless of their case using utf8 collation rules. |
| util.get_ticks() | Returns current number of ticks as floating point number. |
| util.get_time() | Returns current time as unix time in floating point representation. |
| util.time_to_string(seconds) | Converts time from Unix time to HTTP date format. |
| util.stat(fname) | Performs stat(2) on a specified filepath and returns table of values. |
| util.unlink(fname) | Removes the specified file from the filesystem. |
| util.lock_file(fname, [fd]) | Lock the specified file. |
| util.unlock_file(fd, [close_fd]) | Unlock the specified file closing the file descriptor associated. |
| util.create_file(fname, [mode]) | Creates the specified file with the default mode 0644. |
| util.close_file(fd) | Closes descriptor fd. |
| util.random_hex(size) | Returns random hex string of the specified size. |
| util.zstd_compress(data, [level=1]) | Compresses input using zstd compression. |
| util.zstd_decompress(data) | Decompresses input using zstd algorithm. |
| util.gzip_decompress(data, [size_limit]) | Decompresses input using gzip algorithm. |
| util.inflate(data, [size_limit]) | Decompresses input using inflate algorithm. |
| util.gzip_compress(data, [level=1]) | Compresses input using gzip compression. |
| util.normalize_prob(prob, [bias = 0.5]) | Normalize probabilities using a polynomial. |
| util.is_utf_spoofed(str, [str2]) | Returns true if a string is spoofed (possibly with another string str2). |
| util.get_string_stats(str) | Returns table with number of letters and digits in string. |
| util.is_valid_utf8(str) | Returns true if a string is valid UTF8 string. |
| util.has_obscured_unicode(str) | Returns true if a string has obscure UTF symbols (zero width spaces, order marks), ignores invalid utf characters. |
| util.readline([prompt]) | Returns string read from stdin with history and editing support. |
| util.readpassphrase([prompt]) | Returns string read from stdin disabling echo. |
| util.file_exists(file) | Checks if a specified file exists and is available for reading. |
| util.mkdir(dir[, recursive]) | Creates a specified directory. |
| util.umask(mask) | Sets new umask. |
| util.isatty() | Returns if stdout is a tty. |
| util.pack(fmt, ...) | Backport of Lua 5.3 string.pack function. |
| util.packsize(fmt) | Returns size of the packed binary string returned for the same fmt argument by util.pack. |
| util.unpack(fmt, s [, pos]) | Unpacks string s according to the format string fmt as described in util.pack. |
| util.caseless_hash(str[, seed]) | Calculates caseless non-crypto hash from a string or rspamd text. |
| util.caseless_hash_fast(str[, seed]) | Calculates caseless non-crypto hash from a string or rspamd text. |
| util.get_hostname() | Returns hostname for this machine. |
| util.parse_content_type(ct_string, mempool) | Parses content-type string to a table. |
| util.mime_header_encode(hdr[, is_structured]) | Encodes header if needed. |
| util.btc_polymod(input_values) | Performs bitcoin polymod function. |
| util.parse_smtp_date(str[, local_tz]) | Converts an SMTP date string to unix timestamp. |
The module rspamd_util defines the following functions.
util.create_event_base()
Creates new event base for processing asynchronous events
Parameters:
No parameters
Returns:
{ev_base}
: new event processing base
util.load_rspamd_config(filename)
Load rspamd config from the specified file
Parameters:
filename {string}
: path to the configuration file
Returns:
{config}
: new configuration object suitable for access
util.config_from_ucl(any, string)
Load rspamd config from ucl represented by any lua table
Parameters:
No parameters
Returns:
{config}
: new configuration object suitable for access
util.encode_base64(input[, str_len, [newlines_type]])
Encodes data in base64 breaking lines if needed
Parameters:
input {text or string}
: input data
str_len {number}
: optional size of lines or 0 if split is not needed
Returns:
{rspamd_text}
: encoded data chunk
util.encode_qp(input[, str_len, [newlines_type]])
Encodes data in quoted printable breaking lines if needed
Parameters:
input {text or string}
: input data
str_len {number}
: optional size of lines or 0 if split is not needed
Returns:
{rspamd_text}
: encoded data chunk
util.decode_qp(input)
Decodes data from quoted printable
Parameters:
input {text or string}
: input data
Returns:
{rspamd_text}
: decoded data chunk
util.decode_base64(input)
Decodes data from base64 ignoring whitespace characters
Parameters:
input {text or string}
: data to decode; if rspamd{text} is used then the string is modified in-place
Returns:
{rspamd_text}
: decoded data chunk
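As a rough illustration of the two base64 helpers, the sketch below round-trips a short string. It assumes the module is loaded with require "rspamd_util" inside an rspamd Lua environment (a plugin or rspamadm lua, for instance); the payload is arbitrary.

```lua
local util = require "rspamd_util"

-- encode_base64 returns rspamd_text, so convert it with tostring() before comparing
local encoded = tostring(util.encode_base64("hello"))

-- decode_base64 ignores whitespace in its input and also returns rspamd_text
local decoded = tostring(util.decode_base64(encoded))

assert(decoded == "hello")
```

Passing a non-zero str_len as the second argument to util.encode_base64 asks for the output to be split into lines of that size.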
util.encode_base32(input, [b32type = 'default'])
Encodes data in base32 breaking lines if needed
Parameters:
input {text or string}
: input data
b32type {string}
: base32 type (default, bleach, rfc)
Returns:
{rspamd_text}
: encoded data chunk
util.decode_base32(input, [b32type = 'default'])
Decodes data from base32 ignoring whitespace characters
Parameters:
input {text or string}
: data to decode
b32type {string}
: base32 type (default, bleach, rfc)
Returns:
{rspamd_text}
: decoded data chunk
util.decode_url(input)
Decodes data from url encoding
Parameters:
input {text or string}
: data to decode
Returns:
{rspamd_text}
: decoded data chunk
util.tokenize_text(input[, exceptions])
Create tokens from a text using optional exceptions list
Parameters:
input {text/string}
: input data
exceptions {table}
: a table of pairs containing <start_pos,length> of exceptions in the input
Returns:
{table/strings}
: list of strings representing words in the text
util.tanh(num)
Calculates hyperbolic tangent of the specified floating point value
Parameters:
num {number}
: input number
Returns:
{number}
: hyperbolic tangent of the variable
util.parse_html(input)
Parses HTML and returns the corresponding text
Parameters:
input {string|text}
: input HTML
Returns:
{rspamd_text}
: processed text with no HTML tags
util.levenshtein_distance(s1, s2)
Returns Levenshtein distance between two strings
Parameters:
s1 {string}
: the first string
s2 {string}
: the second string
Returns:
{number}
: number of differences in two strings
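One possible use of the distance is a tolerance check between two names. The following is a minimal sketch, assuming an rspamd Lua environment; is_near_match and the sample strings are hypothetical and not part of the module.

```lua
local util = require "rspamd_util"

-- Hypothetical helper: true when a and b differ by at most max_edits edits
local function is_near_match(a, b, max_edits)
  return util.levenshtein_distance(a, b) <= max_edits
end

-- Identical strings have a distance of 0
assert(util.levenshtein_distance("example", "example") == 0)

print(is_near_match("paypal.com", "paypa1.com", 1))
```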
util.fold_header(name, value, [how, [stop_chars]])
Fold rfc822 header according to the folding rules
Parameters:
name {string}
: name of the header
value {string}
: value of the header
how {string}
: “cr” for \r, “lf” for \n and “crlf” for \r\n (default)
stop_chars {string}
: also fold header when the
Returns:
{string}
: Folded value of the header
util.is_uppercase(str)
Returns true if a string is all uppercase
Parameters:
str {string}
: input string
Returns:
{bool}
: true if a string is all uppercase
util.humanize_number(num)
Returns humanized representation of given number (like 1k instead of 1000)
Parameters:
num {number}
: number to humanize
Returns:
{string}
: humanized representation of a number
util.get_tld(host)
Returns effective second level domain part (eSLD) for the specified host
Parameters:
host {string}
: hostname
Returns:
{string}
: eSLD part of the hostname or the full hostname if eSLD was not found
util.glob(pattern)
Returns results for the glob match for the specified pattern
Parameters:
pattern {string}
: glob pattern to match (‘?’ and ‘*’ are supported)
Returns:
{table/string}
: list of matched files
util.parse_mail_address(str, [pool])
Parses email address and returns a table of tables in the following format:
- raw - the original value without any processing
- name - name of internet address in UTF8, e.g. for Vsevolod Stakhov <blah@foo.com> it returns Vsevolod Stakhov
- addr - address part of the address
- user - user part (if present) of the address, e.g. blah
- domain - domain part (if present), e.g. foo.com
- flags - table with following keys set to true if given condition fulfilled:
<blah@foo.com> address

Parameters:
str {string}
: input string
pool {rspamd_mempool}
: memory pool to use
Returns:
{table/tables}
: parsed list of mail addresses
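The returned structure can be walked like any other Lua list. This is a small sketch, assuming an rspamd Lua environment and relying on the optional pool argument being omitted; the address is the sample one used in the description.

```lua
local util = require "rspamd_util"

local addrs = util.parse_mail_address("Vsevolod Stakhov <blah@foo.com>")

-- Guard against a failed parse before iterating over the entries
for _, a in ipairs(addrs or {}) do
  print(a.name, a.addr, a.user, a.domain)
end
```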
util.strlen_utf8(str)
Returns length of string encoded in utf-8 in characters. If invalid characters are found, then this function returns number of bytes.
Parameters:
str {string}
: utf8 encoded string
Returns:
{number}
: number of characters in string
util.lower_utf8(str)
Converts utf8 string to lower case
Parameters:
str {string}
: utf8 encoded string
Returns:
{string}
: lowercased utf8 string
util.normalize_utf8(str)
Gets a string in UTF8 and normalises it to NFKC_Casefold form
Parameters:
str {string}
: utf8 encoded string
Returns:
{string,integer}
: lowercased utf8 string + result of the normalisation (use bit.band to check):
- RSPAMD_UNICODE_NORM_NORMAL = 0
- RSPAMD_UNICODE_NORM_UNNORMAL = (1 << 0)
- RSPAMD_UNICODE_NORM_ZERO_SPACES = (1 << 1)
- RSPAMD_UNICODE_NORM_ERROR = (1 << 2)
- RSPAMD_UNICODE_NORM_OVERFLOW = (1 << 3)
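A sketch of checking the second return value with bit.band follows. It assumes an rspamd Lua environment built on LuaJIT, so that the bit library is available; the local flag names are defined only for this example, with values taken from the description of this function.

```lua
local util = require "rspamd_util"
local bit = require "bit"

-- Local copies of the documented flag values (names are illustrative only)
local NORM_ZERO_SPACES = bit.lshift(1, 1)
local NORM_ERROR = bit.lshift(1, 2)

-- "pay" + U+200B (zero width space, written as UTF-8 bytes) + "pal"
local normalized, flags = util.normalize_utf8("pay\226\128\139pal")

if bit.band(flags, NORM_ERROR) ~= 0 then
  print("input could not be normalised")
elseif bit.band(flags, NORM_ZERO_SPACES) ~= 0 then
  print("zero width spaces were found")
else
  print("normalised value: " .. normalized)
end
```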
util.transliterate(str)
Converts utf8 encoded string to latin transliteration
Parameters:
str {string/text}
: utf8 encoded string
Returns:
{text}
: transliterated string
util.strequal_caseless(str1, str2)
Compares two strings regardless of their case using ascii comparison. Returns true if str1 is equal to str2
Parameters:
str1 {string}
: utf8 encoded string
str2 {string}
: utf8 encoded string
Returns:
{bool}
: result of comparison
util.strequal_caseless_utf8(str1, str2)
Compares two utf8 strings regardless of their case using utf8 collation rules. Returns true if str1 is equal to str2
Parameters:
str1 {string}
: utf8 encoded string
str2 {string}
: utf8 encoded string
Returns:
{bool}
: result of comparison
util.get_ticks()
Returns current number of ticks as floating point number
Parameters:
No parameters
Returns:
{number}
: number of current clock ticks (monotonically increasing)
util.get_time()
Returns current time as unix time in floating point representation
Parameters:
No parameters
Returns:
{number}
: number of seconds since 01.01.1970
util.time_to_string(seconds)
Converts time from Unix time to HTTP date format
Parameters:
seconds {number}
: unix timestamp
Returns:
{string}
: date as HTTP date
util.stat(fname)
Performs stat(2) on a specified filepath and returns table of values
- size: size of file in bytes
- type: type of filepath: regular, directory, special
- mtime: modification time as unix time

Parameters:
fname {string}
: path to examine
Returns:
{string,table}
: a string is returned when an error has occurred
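The sketch below shows one way to consume the result. It assumes an rspamd Lua environment and that the error string comes first and the table second, following the order of the documented return types; the root directory is used only because it exists on any Unix system.

```lua
local util = require "rspamd_util"

-- Assumed return order: error string (or nil), then the table of values
local err, st = util.stat("/")

if err then
  print("stat failed: " .. err)
else
  print(st.type, st.size, st.mtime)
end
```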
util.unlink(fname)
Removes the specified file from the filesystem
Parameters:
fname {string}
: filename to remove
Returns:
{boolean,[string]}
: true if file has been deleted or false,’error string’
util.lock_file(fname, [fd])
Lock the specified file. This function returns {number} which must be passed to util.unlock_file after usage or you’ll have a resource leak
Parameters:
fname {string}
: filename to lock
fd {number}
: use the specified fd instead of opening one
Returns:
{number|nil,string}
: number if locking was successful or nil + error otherwise
util.unlock_file(fd, [close_fd])
Unlock the specified file closing the file descriptor associated.
Parameters:
fd {number}
: descriptor to unlock
close_fd {boolean}
: close descriptor on unlocking (default: TRUE)
Returns:
{boolean[,string]}
: true if a file was unlocked
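The lock and unlock calls are meant to be paired. Here is a minimal sketch of that pairing, assuming an rspamd Lua environment and a writable /tmp; the lock file path is hypothetical, and the file is created first with util.create_file so that its descriptor can be passed to util.lock_file.

```lua
local util = require "rspamd_util"

-- Hypothetical path used only for this example
local path = "/tmp/rspamd_util_example.lock"

-- Create the file (default mode 0644) and keep its descriptor
local fd, err = util.create_file(path)
if not fd then
  error("cannot create lock file: " .. tostring(err))
end

-- Lock using the descriptor we already hold instead of opening a new one
local locked, lock_err = util.lock_file(path, fd)
if not locked then
  util.close_file(fd)
  error("cannot lock file: " .. tostring(lock_err))
end

-- ... work that must not run concurrently goes here ...

-- Unlocking closes the descriptor by default, which avoids the resource leak
local unlocked = util.unlock_file(locked)
util.unlink(path)
```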
util.create_file(fname, [mode])
Creates the specified file with the default mode 0644
Parameters:
fname {string}
: filename to create
mode {number}
: open mode (you should use octal number here)
Returns:
{number|nil,string}
: file descriptor or pair nil + error string
util.close_file(fd)
Closes descriptor fd
Parameters:
fd {number}
: descriptor to close
Returns:
{boolean[,string]}
: true if a file was closed
util.random_hex(size)
Returns random hex string of the specified size
Parameters:
size {number}
: length of desired string in bytes
Returns:
{string}
: string with random hex digits
util.zstd_compress(data, [level=1])
Compresses input using zstd compression
Parameters:
data {string/rspamd_text}
: input data
Returns:
{rspamd_text}
: compressed data
util.zstd_decompress(data)
Decompresses input using zstd algorithm
Parameters:
data {string/rspamd_text}
: compressed data
Returns:
{error,rspamd_text}
: pair of error + decompressed text
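Because decompression returns a pair, the error has to be checked before the text is used. A round-trip sketch, assuming an rspamd Lua environment; the input is an arbitrary repetitive string.

```lua
local util = require "rspamd_util"

local input = string.rep("rspamd ", 100)

-- zstd_compress returns rspamd_text, which zstd_decompress accepts directly
local compressed = util.zstd_compress(input)

-- The first value is the error (nil on success), the second is the text
local err, restored = util.zstd_decompress(compressed)
if err then
  error("decompression failed: " .. tostring(err))
end

assert(tostring(restored) == input)
```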
util.gzip_decompress(data, [size_limit])
Decompresses input using gzip algorithm
Parameters:
data {string/rspamd_text}
: compressed data
size_limit {integer}
: optional size limit
Returns:
{rspamd_text}
: decompressed text
util.inflate(data, [size_limit])
Decompresses input using inflate algorithm
Parameters:
data {string/rspamd_text}
: compressed data
size_limit {integer}
: optional size limit
Returns:
{rspamd_text}
: decompressed text
util.gzip_compress(data, [level=1])
Compresses input using gzip compression
Parameters:
data {string/rspamd_text}
: input data
Returns:
{rspamd_text}
: compressed data
util.normalize_prob(prob, [bias = 0.5])
Normalize probabilities using a polynomial
Parameters:
prob {number}
: probability param
bias {number}
: number to subtract for making the final solution
Returns:
{number}
: normalized number
util.is_utf_spoofed(str, [str2])
Returns true if a string is spoofed (possibly with another string str2)
Parameters:
str {string}
: string to check
str2 {string}
: optional second string to check against
Returns:
{boolean}
: true if a string is spoofed
util.get_string_stats(str)
Returns table with number of letters and digits in string
Parameters:
str {string}
: input string
Returns:
{table}
: table with string stats; keys are “digits” and “letters”
util.is_valid_utf8(str)
Returns true if a string is valid UTF8 string
Parameters:
str {string}
: input string
Returns:
{boolean}
: true if a string is a valid UTF8 string
util.has_obscured_unicode(str)
Returns true if a string has obscure UTF symbols (zero width spaces, order marks), ignores invalid utf characters
Parameters:
str {string}
: input string
Returns:
{boolean}
: true if a string has obscured unicode characters (+ character and offset if found)
util.readline([prompt])
Returns string read from stdin with history and editing support
Parameters:
prompt {string}
: optional prompt
Returns:
{string}
: string read from the input (with line endings stripped)
util.readpassphrase([prompt])
Returns string read from stdin disabling echo
Parameters:
prompt {string}
: optional prompt
Returns:
{string}
: string read from the input (with line endings stripped)
util.file_exists(file)
Checks if a specified file exists and is available for reading
Parameters:
file {string}
: path of the file to check
Returns:
{boolean,string}
: true if file exists + string error if not
util.mkdir(dir[, recursive])
Creates a specified directory
Parameters:
dir {string}
: directory to create
recursive {boolean}
: optional recursive flag
Returns:
{boolean[,error]}
: true if directory has been created
util.umask(mask)
Sets new umask. Accepts either a numeric octal string, e.g. ‘022’, or a plain number, e.g. 0x12 (since Lua does not support octal integer literals)
Parameters:
mask {string|number}
: new umask
Returns:
{number}
: old umask
util.isatty()
Returns if stdout is a tty
Parameters:
No parameters
Returns:
{boolean}
: true in case of output being tty
util.pack(fmt, ...)
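Backport of Lua 5.3 string.pack function:
Returns a binary string containing the values v1, v2, etc. packed (that is, serialized in binary form) according to the format string fmt.
A format string is a sequence of conversion options. The conversion options are as follows:
- “<”: sets little endian
- “>”: sets big endian
- “=”: sets native endian
- “![n]”: sets maximum alignment to n (default is native alignment)
- “b”: a signed byte (char)
- “B”: an unsigned byte (char)
- “h”: a signed short (native size)
- “H”: an unsigned short (native size)
- “l”: a signed long (native size)
- “L”: an unsigned long (native size)
- “j”: a lua_Integer
- “J”: a lua_Unsigned
- “T”: a size_t (native size)
- “i[n]”: a signed int with n bytes (default is native size)
- “I[n]”: an unsigned int with n bytes (default is native size)
- “f”: a float (native size)
- “d”: a double (native size)
- “n”: a lua_Number
- “s[n]”: a string preceded by its length coded as an unsigned integer with n bytes (default is a size_t)
- “z”: a zero-terminated string
- “x”: one byte of padding
- “Xop”: an empty item that aligns according to option op (which is otherwise ignored)
- “ ”: (empty space) ignored

(A “[n]” means an optional integral numeral.) Except for padding, spaces, and configurations (options “xX <=>!”), each option corresponds to an argument (in string.pack) or a result (in string.unpack).
For options “!n”, “sn”, “in”, and “In”, n can be any integer between 1 and 16. All integral options check overflows; string.pack checks whether the given value fits in the given size; string.unpack checks whether the read value fits in a Lua integer.
Any format string starts as if prefixed by “!1=”, that is, with maximum alignment of 1 (no alignment) and native endianness.
Alignment works as follows: For each option, the format gets extra padding until the data starts at an offset that is a multiple of the minimum between the option size and the maximum alignment; this minimum must be a power of 2. Options “c” and “z” are not aligned; option “s” follows the alignment of its starting integer.
All padding is filled with zeros by string.pack (and ignored by unpack).
Parameters:
fmt {string}
: format string
... {any}
: values to pack
Returns:
{string}
: binary string containing the packed values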
util.packsize(fmt)
Returns size of the packed binary string returned for the same fmt argument by util.pack
Parameters:
fmt {string}
: format string
Returns:
{number}
: size of the packed string in bytes
util.unpack(fmt, s [, pos])
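Unpacks string s according to the format string fmt as described in util.pack
Parameters:
fmt {string}
: format string
s {string}
: packed binary string
pos {number}
: optional position in s to start reading from
Returns:
values unpacked from s according to fmt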
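The three packing functions are easiest to read together. The sketch below assumes an rspamd Lua environment and that the backport follows the Lua 5.3 string.pack format options; the values are arbitrary.

```lua
local util = require "rspamd_util"

-- "<" selects little endian, "I4" and "I2" are unsigned ints of 4 and 2 bytes
local fmt = "<I4I2"

local packed = util.pack(fmt, 1234, 25)

-- With the default maximum alignment of 1 no padding is added,
-- so the packed string is exactly as long as packsize reports
assert(#packed == util.packsize(fmt))

-- unpack returns the stored values in the order they were packed
local id, port = util.unpack(fmt, packed)
assert(id == 1234 and port == 25)
```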
util.caseless_hash(str[, seed])
Calculates caseless non-crypto hash from a string or rspamd text
Parameters:
str {string|text}
: string or lua_text
seed {number}
: optional seed (0xdeadbabe by default)
Returns:
{int64}
: boxed int64_t
util.caseless_hash_fast(str[, seed])
Calculates caseless non-crypto hash from a string or rspamd text
Parameters:
str {string|text}
: string or lua_text
seed {number}
: optional seed (0xdeadbabe by default)
Returns:
{number}
: number from int64_t
util.get_hostname()
Returns hostname for this machine
Parameters:
No parameters
Returns:
{string}
: hostname
util.parse_content_type(ct_string, mempool)
Parses content-type string to a table:
- type
- subtype
- charset
- boundary

Parameters:
ct_string {string}
: content type as string
mempool {rspamd_mempool}
: needed to store temporary data (e.g. task pool)
Returns:
{table}
: parsed content type
util.mime_header_encode(hdr[, is_structured])
Encodes header if needed
Parameters:
hdr {string}
: input header
is_structured {boolean}
: if true, then we encode as structured header (e.g. encode all non alpha-numeric characters)
Returns:
{string}
: encoded header
util.btc_polymod(input_values)
Performs bitcoin polymod function
Parameters:
input_values {table|numbers}
: no description
Returns:
{boolean}
: true if polymod has been successful
util.parse_smtp_date(str[, local_tz])
Converts an SMTP date string to unix timestamp
Parameters:
str {string}
: input string
local_tz {boolean}
: convert to local tz if true
Returns:
{number}
: time as unix timestamp (converted to float)
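The timestamp returned here can be passed straight to util.time_to_string. A short sketch, assuming an rspamd Lua environment; the date string is an arbitrary sample.

```lua
local util = require "rspamd_util"

-- Parse a Date header value into a unix timestamp
local ts = util.parse_smtp_date("Mon, 02 Jan 2006 15:04:05 +0000")

-- Render the same instant as an HTTP date
local http_date = util.time_to_string(ts)

print(ts, http_date)
```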