agent.process.classification_and_content module
Extract text, and get suggestions, features, and flags from Classifier.
class agent.process.classification_and_content.CheckStopwordCount(submission_id, process_id=None)
    Bases: agent.process.base.Process

    Check the submission content for a stopword count that is too low.

    check_stop_count(previous, trigger, emit)
        Flag the submission if the number of stopwords is too low.
        Return type: None

    steps = [<function CheckStopwordCount.check_stop_count>]
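To make the count check concrete, here is a minimal sketch of a threshold test on stopword counts. It is not the module's implementation: the stopword list, the threshold value, and both function names are hypothetical, and the real process may read the count from a classifier feature (the `stops` key in `CLASSIFIER_FLAGS` hints at this) rather than tokenizing text itself.

```python
import re

# Hypothetical stopword list and threshold; this reference gives neither.
STOPWORDS = frozenset({"a", "an", "and", "in", "is", "of", "the", "to"})
MIN_STOPWORD_COUNT = 3


def count_stopwords(text: str) -> int:
    """Count the tokens in ``text`` that appear in ``STOPWORDS``."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(1 for token in tokens if token in STOPWORDS)


def has_too_few_stopwords(text: str, minimum: int = MIN_STOPWORD_COUNT) -> bool:
    """Return True when the stopword count falls below ``minimum``."""
    return count_stopwords(text) < minimum
```

A very low count usually means the extracted text is not ordinary prose, for example a failed extraction or a document in an unexpected language.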
class agent.process.classification_and_content.CheckStopwordPercent(submission_id, process_id=None)
    Bases: agent.process.base.Process

    Check the submission content for a percentage of stopwords that is too low.

    check_stop_percent(previous, trigger, emit)
        Flag the submission if the percentage of stopwords is too low.
        Return type: None

    steps = [<function CheckStopwordPercent.check_stop_percent>]
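The percentage check can be sketched the same way. The threshold, the function names, and the choice to express the percentage as a fraction between 0 and 1 are all assumptions made for illustration; the reference does not say how the module scales or sources this value.

```python
# Hypothetical threshold, expressed as a fraction of all words.
MIN_STOPWORD_PERCENT = 0.10


def stopword_percent(stopword_count: int, word_count: int) -> float:
    """Return the share of stopwords as a fraction between 0 and 1."""
    if word_count <= 0:
        # Treat empty content as having no stopwords instead of dividing by zero.
        return 0.0
    return stopword_count / word_count


def has_low_stopword_percent(stopword_count: int, word_count: int,
                             minimum: float = MIN_STOPWORD_PERCENT) -> bool:
    """Return True when the stopword share falls below ``minimum``."""
    return stopword_percent(stopword_count, word_count) < minimum
```

A ratio complements the absolute count: a short submission can have few stopwords in total and still show a normal proportion, while a long one can pass a count threshold with an unusually low proportion.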
class agent.process.classification_and_content.PlainTextExtraction(submission_id, process_id=None)
    Bases: agent.process.base.Process

    Extract plain text from a compiled PDF.

    handle_plaintext_exception(exc)
        Handle exceptions raised when calling the plain text service.
        Return type: None

    poll_extraction(previous, trigger, emit)
        Poll the plain text service until extraction is complete.
        Return type: None

    start_extraction(previous, trigger, emit)
        Request extraction by the plain text service.
        Return type: None

    steps = [<function PlainTextExtraction.start_extraction>, <function PlainTextExtraction.poll_extraction>, <function PlainTextExtraction.retrieve_content>]
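The request-then-poll pattern in `steps` can be illustrated with a small, self-contained sketch. Everything here is hypothetical: `run_steps`, `poll_until`, and `FakeTextService` are not part of the module, and the sketch assumes that each step receives the previous step's return value as its first argument, which the `previous` parameter name suggests but this reference does not confirm.

```python
import time
from typing import Any, Callable, List, Optional


def poll_until(check: Callable[[], Optional[Any]], interval: float = 0.0,
               max_tries: int = 10) -> Any:
    """Call ``check`` until it returns something other than None."""
    for _ in range(max_tries):
        result = check()
        if result is not None:
            return result
        time.sleep(interval)
    raise TimeoutError(f"no result after {max_tries} tries")


def run_steps(steps: List[Callable[..., Any]], trigger: Any,
              emit: Callable[[Any], None], initial: Any = None) -> Any:
    """Run steps in order, passing each one the previous step's return value."""
    previous = initial
    for step in steps:
        previous = step(previous, trigger, emit)
    return previous


class FakeTextService:
    """Hypothetical in-memory stand-in for the plain text service."""

    def __init__(self, ready_after: int = 3) -> None:
        self.ready_after = ready_after
        self.status_calls = 0

    def status(self) -> Optional[str]:
        """Return 'done' once enough status checks have been made."""
        self.status_calls += 1
        return "done" if self.status_calls >= self.ready_after else None


service = FakeTextService()


def start_extraction(previous, trigger, emit):
    emit("extraction requested")
    return service


def poll_extraction(previous, trigger, emit):
    poll_until(previous.status)  # previous is the service handle from the prior step
    emit("extraction complete")
    return "extracted plain text"


events: List[str] = []
content = run_steps([start_extraction, poll_extraction],
                    trigger=None, emit=events.append)
```

Bounding the number of polls keeps a stalled extraction from blocking the process indefinitely; in the sketch that case raises `TimeoutError`, which an exception handler in the style of `handle_plaintext_exception` could then deal with.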
class agent.process.classification_and_content.RunAutoclassifier(submission_id, process_id=None)
    Bases: agent.process.classification_and_content.PlainTextExtraction

    Extract plain text and poll the autoclassifier.

    In addition to generating classification suggestions, the current implementation of the autoclassifier also generates features (like word counts) and content flags (e.g. possible language issues, line numbers).

    CLASSIFIER_FLAGS = {'%stop': None, 'charset': <Type.CHARACTER_SET: 'character set'>, 'language': <Type.LANGUAGE: 'language'>, 'linenos': <Type.LINE_NUMBERS: 'line numbers'>, 'stops': None}

    call_classifier(content, trigger, emit)
        Send plain text content to the autoclassifier.
        Return type: None

    handle_classifier_exception(exc)
        Handle exceptions raised when calling the classifier service.
        Return type: None

    process_result(result, trigger, emit)
        Process the results returned by the autoclassifier.
        Return type: None

    steps = [<function PlainTextExtraction.start_extraction>, <function PlainTextExtraction.poll_extraction>, <function PlainTextExtraction.retrieve_content>, <function RunAutoclassifier.call_classifier>, <function RunAutoclassifier.process_result>]
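One plausible reading of `CLASSIFIER_FLAGS` is a lookup table from classifier flag keys to content flag types, where a `None` value means the key is not turned into a content flag directly. The sketch below illustrates that reading only. `FlagType` is a hypothetical stand-in for the `Type` enum shown in the mapping, `translate_flags` is an invented helper, and the shape of the classifier's raw output (a dict of key to string value) is assumed.

```python
from enum import Enum
from typing import Dict, List, Optional, Tuple


class FlagType(Enum):
    """Hypothetical stand-in for the ``Type`` enum shown in CLASSIFIER_FLAGS."""

    CHARACTER_SET = "character set"
    LANGUAGE = "language"
    LINE_NUMBERS = "line numbers"


CLASSIFIER_FLAGS: Dict[str, Optional[FlagType]] = {
    "%stop": None,
    "charset": FlagType.CHARACTER_SET,
    "language": FlagType.LANGUAGE,
    "linenos": FlagType.LINE_NUMBERS,
    "stops": None,
}


def translate_flags(raw_flags: Dict[str, str]) -> List[Tuple[FlagType, str]]:
    """Map classifier flag keys to flag types.

    Keys mapped to None, and keys missing from the table, are skipped
    (an assumption about what None means here).
    """
    translated = []
    for key, value in raw_flags.items():
        flag_type = CLASSIFIER_FLAGS.get(key)
        if flag_type is None:
            continue
        translated.append((flag_type, value))
    return translated
```

Under this reading, the `stops` and `%stop` keys would be left for the separate stopword checks, `CheckStopwordCount` and `CheckStopwordPercent`, to evaluate against their own thresholds.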