TenshiTranslator.Util.TextProcessor module
Contains functions that process text, such as removing indentation and splitting paragraphs into sentences.
- TenshiTranslator.Util.TextProcessor.isEmptyLine(line: str) → bool
Checks if a line is empty
- Parameters:
line – line to be checked
- Returns:
‘True’ if the line is empty, ‘False’ otherwise
- TenshiTranslator.Util.TextProcessor.isTimeoutMessage(line: str) → bool
Checks if the line is a timeout message from the Sugoi Translator site
- Parameters:
line – line to be checked
- Returns:
‘True’ if the line is a timeout message, ‘False’ otherwise
- TenshiTranslator.Util.TextProcessor.makeOutputFilePath(inputFilePath: str) → str
Builds the output file path from the input file path
- Parameters:
inputFilePath – path to the desired file
- Returns:
path of the output file, in format of <input file name>-Translated.<extension>
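The naming rule above can be illustrated with a minimal sketch. This is not the module's actual implementation; the function name `make_output_file_path_sketch` is hypothetical, and the sketch assumes the suffix is inserted before the final extension only.

```python
from pathlib import Path


def make_output_file_path_sketch(input_file_path: str) -> str:
    """Hypothetical stand-in for makeOutputFilePath, for illustration only."""
    path = Path(input_file_path)
    # path.stem is the file name without its final extension;
    # path.suffix is that extension, including the leading dot.
    return str(path.with_name(f"{path.stem}-Translated{path.suffix}"))


print(make_output_file_path_sketch("novel.txt"))  # novel-Translated.txt
```

Using `with_name` keeps any parent directory unchanged, so only the file name itself is rewritten.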
- TenshiTranslator.Util.TextProcessor.noJapaneseCharacters(line: str) → bool
Checks if the line contains no Japanese characters
- Parameters:
line – line to be checked
- Returns:
‘True’ if the line contains no Japanese characters, ‘False’ otherwise
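One plausible way to perform such a check is a regular expression over Unicode ranges. The sketch below is an assumption, not the module's code: the name `no_japanese_characters_sketch` is hypothetical, and "Japanese characters" is taken to mean hiragana, katakana, and the CJK Unified Ideographs block.

```python
import re

# Assumption: hiragana (U+3040-U+309F), katakana (U+30A0-U+30FF),
# and CJK Unified Ideographs (U+4E00-U+9FFF) count as Japanese characters.
_JAPANESE_PATTERN = re.compile(r"[\u3040-\u309F\u30A0-\u30FF\u4E00-\u9FFF]")


def no_japanese_characters_sketch(line: str) -> bool:
    """Hypothetical stand-in for noJapaneseCharacters, for illustration only."""
    # search() returns None when no character in the line matches the ranges.
    return _JAPANESE_PATTERN.search(line) is None
```

Under these assumptions, a line such as `"Hello, world"` yields `True`, while any line containing kana or kanji yields `False`.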
- TenshiTranslator.Util.TextProcessor.removeIndent(line: str) → str
Removes the indent from a line
- Parameters:
line – line to be modified
- Returns:
line without indent
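As a rough sketch of indent removal, assuming the indent consists of leading ASCII spaces, tabs, or the full-width ideographic space (U+3000) that often opens paragraphs in Japanese prose. The name `remove_indent_sketch` is hypothetical and the real function may treat indentation differently.

```python
def remove_indent_sketch(line: str) -> str:
    """Hypothetical stand-in for removeIndent, for illustration only."""
    # Assumption: indent = leading spaces, tabs, or ideographic spaces (U+3000).
    # Trailing characters, including the newline, are left untouched.
    return line.lstrip(" \t\u3000")
```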
- TenshiTranslator.Util.TextProcessor.retrieveLines(inputFilePath: str) → list[str]
Retrieves every line from a .txt file. The code terminates if the file is not found.
- Parameters:
inputFilePath – path to the desired file
- Raises:
FileNotFoundError if the file is not found
- Returns:
list of lines from the file
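The behaviour described above resembles a plain file read. The following sketch is an assumption rather than the module's code: the name `retrieve_lines_sketch` is hypothetical, UTF-8 encoding is assumed, and each returned line keeps its trailing newline.

```python
def retrieve_lines_sketch(input_file_path: str) -> list[str]:
    """Hypothetical stand-in for retrieveLines, for illustration only."""
    # open() itself raises FileNotFoundError when the path does not exist.
    # Assumption: the input file is UTF-8 encoded.
    with open(input_file_path, encoding="utf-8") as input_file:
        return input_file.readlines()
```

A caller that prefers a clean exit over a traceback can wrap the call in `try`/`except FileNotFoundError`.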
- TenshiTranslator.Util.TextProcessor.splitToSentence(line: str, maxLength: int) → list
If the Japanese paragraph is longer than the maximum allowed length, splits it into smaller sentences.
The sentences are split at the period character. This is used to comply with the Sugoi Translator site’s 100-character limit.
- Parameters:
line – line to be split
maxLength – max allowed length of a sentence
- Returns:
list of sentences
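The splitting rule can be sketched as follows. This is illustrative only and makes several assumptions not stated in the reference: the period character is the Japanese full stop `。`, each sentence keeps its closing period, and a single sentence that still exceeds the limit is returned unchanged. The name `split_to_sentence_sketch` is hypothetical.

```python
def split_to_sentence_sketch(line: str, max_length: int) -> list[str]:
    """Hypothetical stand-in for splitToSentence, for illustration only."""
    # Short enough already: return the paragraph as a single item.
    if len(line) <= max_length:
        return [line]

    sentences = []
    current = ""
    for char in line:
        current += char
        # Assumption: the Japanese full stop ends a sentence and stays attached to it.
        if char == "。":
            sentences.append(current)
            current = ""
    # Keep any trailing text that does not end with a period.
    if current:
        sentences.append(current)
    return sentences


print(split_to_sentence_sketch("今日は晴れ。明日は雨。", 5))
# ['今日は晴れ。', '明日は雨。']
```

With a limit of 100, as on the Sugoi Translator site, the same input would come back as a single-item list because it is well under the limit.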