Biber::Utils − Various utility subs used in Biber
All functions are exported by default.
glob_data_file
Expands a data file glob to a list of filenames
locate_data_file
Searches for a data file by
The exact path if the filename is absolute
In the input_directory, if defined
In the output_directory, if defined
Relative to the current directory
In the same directory as the control file
Using kpsewhich, if available
Checks for the existence
of NFC/NFD variants of the filename and returns the correct one.
Accounts for Windows file encodings.
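The search order above can be sketched as a simple walk over candidate directories. This is only an illustration: locate_sketch and the control_dir option name are hypothetical, and the kpsewhich fallback and the NFC/NFD and Windows handling are left out.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper, not Biber's locate_data_file(): it tries the documented
# locations in order and returns the first path that exists.
sub locate_sketch {
    my ($name, %opt) = @_;

    # An absolute filename is used exactly as given.
    if (File::Spec->file_name_is_absolute($name)) {
        return (-e $name) ? $name : undef;
    }

    # Otherwise try each directory in the documented order,
    # skipping options that are not defined.
    my @dirs = grep { defined } (
        $opt{input_directory},
        $opt{output_directory},
        File::Spec->curdir,
        $opt{control_dir},       # assumed name for the control file's directory
    );
    for my $dir (@dirs) {
        my $path = File::Spec->catfile($dir, $name);
        return $path if -e $path;
    }
    return;    # a kpsewhich lookup would go here
}

# Demo: the file only exists in the output directory.
my $in     = tempdir(CLEANUP => 1);
my $out    = tempdir(CLEANUP => 1);
my $target = File::Spec->catfile($out, 'refs.bib');
open my $fh, '>', $target or die "Cannot create $target: $!";
close $fh;

my $found = locate_sketch('refs.bib',
    input_directory => $in, output_directory => $out);
print defined $found ? "found: $found\n" : "not found\n";
```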
biber_warn
Wrapper around various warnings bits and pieces
Logs a warning, adds it to the list of .bbl warnings and
optionally
increments the warning count in the Biber object, if present
biber_error
Wrapper around error logging
Forces an exit.
makenamesid
Given a Biber::Names object, return an underscore normalised
concatenation of all of the full name strings.
makenameid
Given a Biber::Name object, return an underscore normalised
concatenation of the full name strings.
latex_recode_output
Tries to convert UTF-8 to TeX macros in the passed
string
strip_noinit
Removes elements which are not to be considered during
initials generation
in names
strip_nosort
Removes elements which are not to be used in sorting a name
from a string
normalise_string_label
Remove some things from a string for label generation.
Don’t strip \p{Dash} as this is needed to process
compound names or label generation.
normalise_string_sort
Removes LaTeX macros, and all punctuation, symbols,
separators as well as leading and trailing whitespace for
sorting strings. Control chars don’t need to be
stripped as they are completely ignorable in
DUCET
normalise_string_bblxml
Some string normalisation for bblxml output
normalise_string
Removes LaTeX macros, and all punctuation, symbols,
separators and control characters, as well as leading and
trailing whitespace for sorting strings. Only decodes LaTeX
character macros into Unicode if output is
UTF-8
normalise_string_common
Common bit for normalisation
normalise_string_hash
Normalise strings used for hashes. We collapse LaTeX macros
into a vestige
so that hashes are unique between things like:
Smith
{\v S}mith
We replace macros like this to preserve their vestiges:
\v S -> v:
\" -> 34:
normalise_string_underscore
Like normalise_string, but also substitutes ~ and whitespace
with underscore.
escape_label
Escapes a few special characters which might be used in
labels
unescape_label
Unescapes a few special characters which might be used in
labels but which need
sorting without escapes
reduce_array
reduce_array(\@a, \@b) returns all elements in @a that are
not in @b
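A minimal sketch of this set difference, assuming plain string elements. reduce_array_sketch is a hypothetical stand-in, not the module's own code.

```perl
use strict;
use warnings;

# Hypothetical stand-in for reduce_array(): keep the elements of the first
# array (passed by reference) that do not occur in the second.
sub reduce_array_sketch {
    my ($from, $exclude) = @_;
    my %drop = map { $_ => 1 } @$exclude;    # lookup table of excluded items
    return grep { !$drop{$_} } @$from;       # original order is preserved
}

my @left = reduce_array_sketch([qw(a b c d)], [qw(b d)]);
print "@left\n";    # prints "a c"
```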
remove_outer
Remove surrounding curly brackets:
'{string}' -> 'string'
but not
'{string} {string}' -> 'string} {string'
Returns (boolean if stripped, string)
has_outer
Returns a boolean indicating whether the string is surrounded by braces
add_outer
Add surrounding curly brackets:
'string' -> '{string}'
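The brace helpers above might look roughly like the following. The *_sketch names are hypothetical, and the test for "more than one brace group" is a simple heuristic chosen for illustration, not necessarily what the module does.

```perl
use strict;
use warnings;

# Hypothetical sketch of remove_outer(): strip one pair of outer braces,
# but leave '{a} {b}' alone because its outer braces do not match each other.
sub remove_outer_sketch {
    my $s = shift;
    return (0, $s) if $s =~ m/\}\s*\{/;    # heuristic: several brace groups
    my $stripped = ($s =~ s/\A\{(.+)\}\z/$1/s) ? 1 : 0;
    return ($stripped, $s);
}

# Hypothetical sketch of add_outer(): wrap the string in braces.
sub add_outer_sketch {
    my $s = shift;
    return '{' . $s . '}';
}

my ($ok, $str)     = remove_outer_sketch('{string}');
my ($ok2, $str2)   = remove_outer_sketch('{string} {string}');
print "$ok $str\n";      # prints "1 string"
print "$ok2 $str2\n";    # prints "0 {string} {string}"
print add_outer_sketch('string'), "\n";    # prints "{string}"
```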
ucinit
Upper-cases the initial letters in a string
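One way to upper-case initial letters is a word-boundary substitution, sketched below. ucinit_sketch is hypothetical, and whether the real sub also changes the case of the remaining letters is not stated here, so this sketch leaves them untouched.

```perl
use strict;
use warnings;

# Hypothetical sketch: upper-case every lowercase letter that starts a word.
sub ucinit_sketch {
    my $s = shift;
    $s =~ s/\b(\p{Ll})/\u$1/g;
    return $s;
}

print ucinit_sketch('jean-paul sartre'), "\n";    # prints "Jean-Paul Sartre"
```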
is_undef
Checks for undefness of arbitrary things, including
composite method chain calls which don't reliably work
with defined() (see perldoc for defined()).
This works because we are just testing the value passed
to this sub. So, for example, this is randomly unreliable
even if the resulting value of the arg to defined() is
"undef":
defined($thing->method($arg)->method)
whereas:
is_undef($thing->method($arg)->method)
works since we only test the return value of all the methods
with defined()
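The idea can be sketched as follows: the method chain is evaluated by the caller, and the helper only tests the single value it receives. The Link class and is_undef_sketch are hypothetical names used for illustration.

```perl
use strict;
use warnings;

# Hypothetical sketch of is_undef(): test only the value that was passed in.
sub is_undef_sketch {
    my $val = shift;
    return defined($val) ? 0 : 1;
}

# Tiny illustrative class with chainable accessors.
{
    package Link;
    sub new   { my ($class, %f) = @_; return bless {%f}, $class }
    sub child { return $_[0]{child} }
    sub label { return $_[0]{label} }
}

my $tree    = Link->new(child => Link->new(label => undef));
my $missing = is_undef_sketch($tree->child->label);    # chain ends in undef
my $present = is_undef_sketch($tree->child);           # chain ends in an object
print "label undef: $missing, child undef: $present\n";
```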
is_def
Checks for definedness in the same way as is_undef()
is_undef_or_null
Checks for undef or nullness (see is_undef() above)
is_def_and_notnull
Checks for def and unnullness (see is_undef() above)
is_def_and_null
Checks for def and nullness (see is_undef() above)
is_null
Checks for nullness
is_notnull
Checks for notnullness
is_notnull_scalar
Checks for notnullness of a scalar
is_notnull_array
Checks for notnullness of an array (passed by ref)
is_notnull_hash
Checks for notnullness of a hash (passed by ref)
is_notnull_object
Checks for notnullness of an object (passed by ref)
stringify_hash
Turns a hash into a string of keys and values
normalise_utf8
Normalise any UTF-8 encoding string immediately to
exactly what we want.
We want the strict perl utf8 "UTF-8"
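Perl's Encode distinguishes the lax "utf8" encoding from the strict "UTF-8" one, so the various spellings need mapping to the strict name. A minimal sketch on a plain string, assuming a hypothetical normalise_utf8_sketch that takes the encoding name as its argument (the real sub may take its input from elsewhere):

```perl
use strict;
use warnings;

# Hypothetical sketch: map utf8, UTF8, utf-8 and so on to the strict "UTF-8";
# any other encoding name is returned unchanged.
sub normalise_utf8_sketch {
    my $enc = shift;
    return $enc unless defined $enc;
    return ($enc =~ m/\Autf-?8\z/i) ? 'UTF-8' : $enc;
}

print normalise_utf8_sketch('utf8'), "\n";      # prints "UTF-8"
print normalise_utf8_sketch('latin1'), "\n";    # prints "latin1"
```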
inits
We turn the initials into an array so we can be flexible
with them later.
The tie here is used only so we know what to split on. We
don't want to make
any typesetting decisions in Biber, like what to use to join
initials, so on
output to the .bbl we only use BibLaTeX macros.
join_name
Replace all join typesetting elements in a name part (space,
ties) with BibLaTeX macros
so that typesetting decisions are made in BibLaTeX, not
hard-coded in Biber
filter_entry_options
Process any per_entry option transformations which are
necessary on output
imatch
Do an interpolating (neg)match using a match RE and a string
passed in as variables.
Uses /g on matches so that $1, $2 etc. can be populated from
repeated matches of
the same capture group as well as different groups
ireplace
Do an interpolating match/replace using a match RE,
replacement RE
and string passed in as variables
validate_biber_xml
Validate a biber/biblatex XML metadata file against an RNG
XML schema
map_boolean
Converts booleans between strings and numbers, because
the standard XML "boolean"
datatype considers "true" and "1" the
same, and so on.
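A lookup-table sketch of such a conversion is shown below. The function name, the 'tonum'/'tostring' direction argument and the pass-through of unrecognised values are all assumptions made for illustration, not the module's documented interface.

```perl
use strict;
use warnings;

# Lookup tables covering both spellings of each boolean.
my %TO_NUM = (true => 1,      false => 0,       1 => 1,      0 => 0);
my %TO_STR = (true => 'true', false => 'false', 1 => 'true', 0 => 'false');

# Hypothetical sketch: $dir is 'tonum' or 'tostring' (assumed names).
sub map_boolean_sketch {
    my ($value, $dir) = @_;
    my $table = ($dir eq 'tonum') ? \%TO_NUM : \%TO_STR;
    my $key   = lc $value;
    return exists $table->{$key} ? $table->{$key} : $value;
}

print map_boolean_sketch('true', 'tonum'), "\n";    # prints "1"
print map_boolean_sketch(0, 'tostring'), "\n";      # prints "false"
```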
process_entry_options
Set per-entry options
merge_entry_options
Merge entry options, dealing with conflicts
expand_option_input
Expand options such as meta-options coming from
biblatex
parse_date_range
Parses an ISO8601 date range.
Returns a two-element array ref: [start DT object, end
DT object]
parse_date_unspecified
Parses the ISO8601-2:2016 4.3 unspecified format into a
date range.
Returns the range plus a specification of the granularity of the
unspecified part
parse_date_start
Convenience wrapper
parse_date_end
Convenience wrapper
parse_date
Parses EDTF dates
date_monthday
Force month/day to ISO8601-2:2016 format with leading
zero
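Zero-padding a month or day is a one-line sprintf. The sketch below uses a hypothetical date_monthday_sketch and assumes that an undefined input should stay undefined.

```perl
use strict;
use warnings;

# Hypothetical sketch: pad a month or day number to two digits.
sub date_monthday_sketch {
    my $md = shift;
    return defined($md) ? sprintf('%02d', $md) : undef;
}

print date_monthday_sketch(3), "\n";       # prints "03"
print date_monthday_sketch('12'), "\n";    # prints "12"
```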
biber_decode_utf8
Perform NFD form conversion as well as UTF-8
conversion. Used to normalize
bibtex input as the T::B (Text::BibTeX) interface doesn't allow
neat whole-file slurping.
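Decoding followed by NFD normalisation can be sketched with the core Encode and Unicode::Normalize modules. decode_nfd_sketch is a hypothetical name, and the real sub may differ in detail.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8);
use Unicode::Normalize qw(NFD);

# Hypothetical sketch: turn UTF-8 bytes into characters, then decompose them.
sub decode_nfd_sketch {
    my $bytes = shift;
    return NFD(decode_utf8($bytes));
}

my $bytes = "\xC3\xA9";                  # UTF-8 bytes for U+00E9 (e with acute)
my $chars = decode_nfd_sketch($bytes);   # "e" followed by U+0301
printf "%d characters: %s\n", length($chars),
    join(' ', map { sprintf 'U+%04X', ord } split //, $chars);
```

The usage line prints "2 characters: U+0065 U+0301", showing that the precomposed character was split into a base letter and a combining accent.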
out
Output to target. Outputs NFC UTF-8 if output is
UTF-8
process_comment
Fix up some problems with comments after being processed by
btparse
locale2bcp47
Map babel/polyglossia language options to a sensible CLDR
(bcp47) locale default
Return input string if there is no mapping
bcp472locale
Map CLDR (bcp47) locale to a babel/polyglossia locale
Return input string if there is no mapping
rangelen
Calculate the length of a range field
Range fields are an array ref of two-element array
refs [range_start, range_end].
range_end can be empty for an open-ended range or
undef.
Deals with Unicode and ASCII roman numerals via the magic of
Unicode NFKD form.
m-n -> [m, n]
m -> [m, undef]
m- -> [m, '']
-n -> ['', n]
- -> ['', undef]
match_indices
Return array ref of array refs of matches and start indices
of matches
for provided array of compiled regexps into string
parse_range
Parses a range of values into a two-value array ref.
Ranges with no starting value default to "1".
Ranges can be open-ended and it's up to surrounding
code to interpret this.
Ranges can be single figures, which is shorthand for
1-x
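The rules above can be sketched with two regular expressions. parse_range_sketch is hypothetical and, as a simplifying assumption, only accepts decimal digits separated by an ASCII hyphen.

```perl
use strict;
use warnings;

# Hypothetical sketch of the documented rules:
#   "m-n" gives [m, n], "-n" gives [1, n], "m-" gives [m, undef], "x" gives [1, x]
sub parse_range_sketch {
    my $range = shift;

    # Form with a separator: either side may be missing.
    if ($range =~ m/\A\s*(\d+)?\s*-\s*(\d+)?\s*\z/) {
        my ($start, $end) = ($1, $2);
        return [ $start // 1, $end ];    # undef end means open-ended
    }

    # Single figure: shorthand for 1-x.
    if ($range =~ m/\A\s*(\d+)\s*\z/) {
        return [ 1, $1 ];
    }

    return;    # not a range this sketch understands
}

for my $in ('3-7', '-7', '3-', '5') {
    my $r = parse_range_sketch($in);
    printf "%-4s => [%s, %s]\n", $in, $r->[0], $r->[1] // 'undef';
}
```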
strip_annotation
Removes annotation marker from a field name
parse_range_alt
Parses a range of values into a two-value array ref.
Either start or end can be undef and it's up to surrounding
code to interpret this
maploopreplace
Replace loop markers with values.
get_transliterator
Get a ref to a transliterator for the given from/to
We are abstracting this in this way because it is not clear
what the future
of the transliteration library is. We want to be able to
switch.
call_transliterator
Run a transliterator on passed text. Hides call semantics of
transliterator
so we can switch engine in the future.
Philip Kime <philip at kime.org.uk>
Please report any bugs or feature requests on our Github tracker at <https://github.com/plk/biber/issues>.
Copyright 2012-2019 Philip Kime, all rights reserved.
This module is free software. You can redistribute it and/or modify it under the terms of the Artistic License 2.0.
This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.