search.services.index.util module
Helpers for building Elasticsearch (ES) queries.
-
search.services.index.util.
DATE_PARTIAL
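= '(?:^|[\\s])(\\d{2})((?:0[1-9]{1})|(?:1[0-2]{1}))(?:$|[\\s])'
Used to match the parts of paper IDs that encode the announcement date: a two-digit year followed by a two-digit month (01 to 12), delimited by whitespace or by the start or end of the string.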
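As an illustration of how this pattern behaves, here is a minimal sketch. The helper name find_date_partial is hypothetical and not part of this module; the pattern is copied from the value above, written as a raw string so each backslash appears once.

```python
import re

# Same pattern as DATE_PARTIAL above, written as a raw string.
DATE_PARTIAL = r'(?:^|[\s])(\d{2})((?:0[1-9]{1})|(?:1[0-2]{1}))(?:$|[\s])'


def find_date_partial(term):
    """Hypothetical helper: return (yy, mm) for the first date partial, or None."""
    match = re.search(DATE_PARTIAL, term)
    if match is None:
        return None
    # Group 1 is the two-digit year, group 2 the two-digit month.
    return match.group(1), match.group(2)


print(find_date_partial('1401'))              # ('14', '01')
print(find_date_partial('lattice 0712 qcd'))  # ('07', '12')
print(find_date_partial('1413'))              # None: 13 is not a valid month
print(find_date_partial('1401.1234'))         # None: '.' is not a delimiter
```

The last case shows that a full identifier is not matched, because the four digits must be followed by whitespace or the end of the string.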
-
search.services.index.util.
MAX_RESULTS
= 10000
The maximum result offset for pagination.
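One way such a limit might be applied is to clamp a requested page window before building the query. This is only a sketch: the function clamp_window and its parameters start and size are assumptions for illustration, not part of this module.

```python
MAX_RESULTS = 10_000  # same value as documented above


def clamp_window(start, size):
    """Hypothetical helper: keep start + size within MAX_RESULTS."""
    start = max(0, min(start, MAX_RESULTS))
    size = max(0, min(size, MAX_RESULTS - start))
    return start, size


print(clamp_window(0, 50))      # (0, 50)
print(clamp_window(9990, 50))   # (9990, 10)
print(clamp_window(20000, 50))  # (10000, 0)
```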
-
search.services.index.util.
OLD_ID_NUMBER
= '(910[7-9]|911[0-2]|9[2-9](0[1-9]|1[0-2])|0[0-6](0[1-9]|1[0-2])|070[1-3])(00[1-9]|0[1-9][0-9]|[1-9][0-9][0-9])'
The number part of an old arXiv identifier has the form YYMMNNN.
The old arXiv identifier scheme was used from 1991-07 through 2007-03 (inclusive).
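The date range encoded in the pattern can be checked with a short sketch. The helper split_old_number is hypothetical; it uses re.fullmatch so that only a complete 7-digit string is accepted, which is an assumption about how the pattern is meant to be applied.

```python
import re

# Same pattern as OLD_ID_NUMBER above, split over two lines for readability.
OLD_ID_NUMBER = (
    r'(910[7-9]|911[0-2]|9[2-9](0[1-9]|1[0-2])|0[0-6](0[1-9]|1[0-2])|070[1-3])'
    r'(00[1-9]|0[1-9][0-9]|[1-9][0-9][0-9])'
)


def split_old_number(number):
    """Hypothetical helper: return (YYMM, NNN) or None if not an old ID number."""
    match = re.fullmatch(OLD_ID_NUMBER, number)
    if match is None:
        return None
    # Group 1 is YYMM; groups 2 and 3 are inner month groups; group 4 is NNN.
    return match.group(1), match.group(4)


print(split_old_number('9107001'))  # ('9107', '001'): first month of the scheme
print(split_old_number('0703999'))  # ('0703', '999'): last month of the scheme
print(split_old_number('0704001'))  # None: after 2007-03
print(split_old_number('9107000'))  # None: sequence numbers start at 001
```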
-
search.services.index.util.
Q_
(qtype, field, value, operator='or')
Construct a Q, but handle wildcards first.
Return type: Q
-
search.services.index.util.
STRING_LITERAL
= re.compile('([\\"][^\\"]*[\\"])')
Pattern for string literals (quoted) in search queries.
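A brief sketch of what this pattern extracts from a query follows. The helper find_literals is hypothetical and only illustrates the pattern; it is not part of this module.

```python
import re

# Same pattern as STRING_LITERAL above, written as a raw string.
STRING_LITERAL = re.compile(r'([\"][^\"]*[\"])')


def find_literals(query):
    """Hypothetical helper: return every double-quoted literal, quotes included."""
    return STRING_LITERAL.findall(query)


print(find_literals('title "dark matter" and "halo"'))  # ['"dark matter"', '"halo"']
print(find_literals('no quotes here'))                  # []
```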
-
search.services.index.util.
escape
(term, quotes=False)
Escape special characters.
Return type: str
-
search.services.index.util.
has_wildcard
(term)
Determine whether or not term contains a wildcard.
Return type: bool
-
search.services.index.util.
is_literal_query
(term)
Determine whether the term is intended to be treated as a literal.
Return type: bool
-
search.services.index.util.
is_old_papernum
(term)
Check whether term matches the 7-digit pattern for old arXiv ID numbers.
Return type: bool
-
search.services.index.util.
is_tex_query
(term)
Determine whether the term is intended as a TeX query.
Return type: bool
-
search.services.index.util.
parse_date
(term)
Attempt to find date-related information in the query.
Parameters: term (str) – Search term.
Returns: First element is the corresponding date-related fragment; second element is the remainder of term (without the date).
Return type: Tuple[str, str]
Raises: ValueError – Raised if no date-related information is found in term.
-
search.services.index.util.
parse_date_partial
(term)
Convert a 4-digit ID date partial into a full year-month value.
This can be used to search for papers by announcement date.
Parameters: term (str) – Search term.
Returns: Date in yyyy-MM format, if found.
Return type: Optional[str]
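The conversion could look roughly like the sketch below. The name parse_date_partial_sketch marks it as hypothetical, and the century rule (two-digit years 91 to 99 map to the 1990s, everything else to the 2000s) is an assumption based on arXiv identifiers starting in 1991, not something this page states.

```python
import re
from typing import Optional

# Same pattern as the module's DATE_PARTIAL, written as a raw string.
DATE_PARTIAL = r'(?:^|[\s])(\d{2})((?:0[1-9]{1})|(?:1[0-2]{1}))(?:$|[\s])'


def parse_date_partial_sketch(term: str) -> Optional[str]:
    """Hypothetical sketch: turn a YYMM partial into 'yyyy-MM', or return None."""
    match = re.search(DATE_PARTIAL, term)
    if match is None:
        return None
    yy, mm = match.group(1), match.group(2)
    # Assumption: years 91-99 belong to the 1990s, all others to the 2000s.
    century = '19' if int(yy) >= 91 else '20'
    return f'{century}{yy}-{mm}'


print(parse_date_partial_sketch('1401'))  # 2014-01
print(parse_date_partial_sketch('9107'))  # 1991-07
print(parse_date_partial_sketch('abcd'))  # None
```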
-
search.services.index.util.
remove_single_characters
(term)
Remove any single characters in the search string.
Return type: str
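A minimal sketch of this kind of filtering, assuming that "single characters" means whitespace-separated tokens of length one. The actual function may treat quoted literals or other cases differently; the name remove_single_characters_sketch is hypothetical.

```python
def remove_single_characters_sketch(term: str) -> str:
    """Hypothetical sketch: drop whitespace-separated tokens of length one."""
    # str.split() with no argument also collapses runs of whitespace.
    return ' '.join(token for token in term.split() if len(token) > 1)


print(remove_single_characters_sketch('a study of b mesons'))  # study of mesons
```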
-
search.services.index.util.
sort
(query, search)
Apply sorting to a Search.
Return type: Search
-
search.services.index.util.
strip_punctuation
(s)
Remove all punctuation characters from a string.
Return type: str
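One common way to do this in Python is with a translation table. The sketch below assumes that "punctuation" means the ASCII set in string.punctuation, which may not match the module's own definition; the name strip_punctuation_sketch is hypothetical.

```python
import string

# Translation table that deletes every ASCII punctuation character.
_PUNCTUATION_TABLE = str.maketrans('', '', string.punctuation)


def strip_punctuation_sketch(s: str) -> str:
    """Hypothetical sketch: remove ASCII punctuation from s."""
    return s.translate(_PUNCTUATION_TABLE)


print(strip_punctuation_sketch('hep-th/9901001!'))  # hepth9901001
```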
-
search.services.index.util.
wildcard_escape
(querystring)
Detect wildcard characters, and escape any that occur within a literal.
Parameters: querystring (str)
Returns:
- str – Query string in which wildcard characters enclosed in literals are escaped.
- bool – True if a non-literal wildcard character is present.
Return type: Tuple[str, bool]
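The behaviour described above can be sketched as follows. This is not the module's implementation: the name wildcard_escape_sketch is hypothetical, and the sketch assumes that the wildcard characters are * and ? and that escaping means prefixing a backslash.

```python
import re
from typing import Tuple

# Same pattern as the module's STRING_LITERAL, written as a raw string.
STRING_LITERAL = re.compile(r'([\"][^\"]*[\"])')
# Assumption: '*' and '?' are the wildcard characters.
WILDCARD = re.compile(r'([*?])')


def wildcard_escape_sketch(querystring: str) -> Tuple[str, bool]:
    """Hypothetical sketch: escape wildcards inside quoted literals only."""
    # Splitting on a pattern with one capturing group puts the quoted
    # literals at odd indices and the unquoted text at even indices.
    parts = STRING_LITERAL.split(querystring)
    has_free_wildcard = False
    escaped = []
    for index, part in enumerate(parts):
        if index % 2 == 1:
            # Inside a literal: prefix each wildcard with a backslash.
            escaped.append(WILDCARD.sub(r'\\\1', part))
        else:
            if WILDCARD.search(part):
                has_free_wildcard = True
            escaped.append(part)
    return ''.join(escaped), has_free_wildcard


print(wildcard_escape_sketch('"a*b" c?'))  # ('"a\\*b" c?', True)
print(wildcard_escape_sketch('"what?"'))   # ('"what\\?"', False)
```

The printed tuples show Python's repr, so each escaping backslash appears doubled.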