Text Manipulation
Extensions that manipulate with text
Last updated
Extensions that manipulate with text
Last updated
Textify has a main goal of providing you with text extensions that allow you to perform various operations on a string, and they will be listed individually in this page to have a quick overview of what each function does.
Text tools can be found in the TextTools
class under the Textify.Global
namespace. They make it easier for you to manipulate with strings.
The below tools allow you to get the width of a character.
GetCharWidth()
This uses the Unicode width database that Textify maintains internally to be able to determine whether a character use one, two, or zero cells. Some of the characters are either assigned as ambiguous, are private use characters, or unassigned characters, and their handling can be controlled by the following properties:
This is primarily used for console operations, and is a good start to implement console applications that support CJK, such as console applications that use .
GetCharWidthType()
This function allows you to easily get the character width type from a specified Unicode character codepoint. This will return one of the following types:
Private
NonPrinting
Combining
DoubleWidth
Ambiguous
Emoji
Unassigned
The following functions allow you to wrap a long string into a specified length, both character-wise and word-wise.
GetWrappedSentences()
This function allows you to wrap a long string into a list of strings that represent resultant lines. These lines correspond wrapped sentences by a specified length, and only wraps by the amount of characters. In addition to that, you can also specify an indentation length for the first line of the wrapped sentence.
Nitrocid
wrapped into four characters with no indentation will result in the following lines:
Nitr
ocid
Nitrocid
wrapped into four characters with 2 characters indentation will result in the following lines:
Ni
troc
id
GetWrappedSentencesByWords()
This function allows you to wrap a long string into a list of strings that represent resultant lines. These lines correspond wrapped sentences by a specified length, but also takes words into account, like word processors, for readability. In addition to that, you can also specify an indentation length for the first line of the wrapped sentence.
Nitrocid KS kernel sim
wrapped into four characters with no indentation will result in the following lines:
Nitr
ocid
KS
kern
el
sim
Nitrocid KS kernel sim
wrapped into four characters with 2 characters indentation will result in the following lines:
Ni
troc
id
KS
kern
el
sim
The following functions manipulate with the double quotation in the string.
SplitEncloseDoubleQuotes()
This function splits a string with either a new line, a specific character, or a character condition, with support for splitting with double quotation marks (single quote, double quote, or backticks), while releasing the quotation marks that surround the string.
The primary reason was that because we were uncomfortable with the usage of Microsoft.VisualBasic
in C# applications as we were migrating to C# for Nitrocid KS in 2022.
A string, First "Second Third" Fourth
, will be split like this:
First
Second Third
Fourth
Partial quote split characters can also be specified, but you'll need to be aware of the implications when using it, so it's best not to specify unless you're dealing with a very specific string.
SplitEncloseDoubleQuotesNoRelease()
This function splits a string with either a new line, a specific character, or a character condition, with support for splitting with double quotation marks (single quote, double quote, or backticks), without releasing the quotation marks that surround the string.
The primary reason was that because we were uncomfortable with the usage of Microsoft.VisualBasic
in C# applications as we were migrating to C# for Nitrocid KS in 2022.
A string, First "Second Third" Fourth
, will be split like this:
First
"Second Third"
Fourth
Partial quote split characters can also be specified, but you'll need to be aware of the implications when using it, so it's best not to specify unless you're dealing with a very specific string.
ReleaseDoubleQuotes()
This function allows you to remove surrounding double quotes from the beginning and the end of the string. For example, the following strings will be changed:
"Double quotes"
-> Double quotes
'Single quotes'
-> Single quotes
`Backticks`
-> Backticks
GetEnclosedDoubleQuotesType()
This function allows you to determine the type of the double quotation that is found in the beginning and the end of the string. The following strings will be processed this way:
"Double quotes"
-> EnclosedDoubleQuotesType.DoubleQuotes
'Single quotes'
-> EnclosedDoubleQuotesType.SingleQuotes
`Backticks`
-> EnclosedDoubleQuotesType.Backticks
Normal
-> EnclosedDoubleQuotesType.None
The following functions manipulate with the new lines in the string.
SplitNewLines()
This function allows you to easily split the string by new lines. You can optionally exclude empty lines by setting emptyStrings
to false
. This is platform-agnostic so that you don't have to specify what kind of new line you're splitting with.
A string that contains line breaking characters, such as "First\r\nSecond\r\nThird"
, will be split like this:
First
Second
Third
UnixifyNewLines()
This function allows you to normalize the new line characters to convert them to Unix-based newlines (LF
). This supports common and uncommon new line characters, such as CR
+ LF
for Windows, CR
for Mac OS 9, and others. This conforms to the Unicode standards.
"First\r\nSecond\r\nThird"
-> "First\nSecond\nThird"
"First\rSecond\rThird"
-> "First\nSecond\nThird"
The following functions allow you to perform different kinds of beginning, ending, and substring detection in the string.
StartsWithAnyOf()
and StartsWithAllOf()
This checks the string for a list of prefixes in the OR and the AND logical condition, respectively.
StartsWithAnyOf()
checks to see if any of the prefixes is found within the beginning of the string.
For example, this string "pre_rel-01-Servicing"
returns true
if any of "pre_"
and "rel_"
prefixes match.
StartsWithAllOf()
checks to see if all of the prefixes is found within the beginning of the string.
For example, this string "dotnet-hostfxr-8.0"
returns true
if all of "dotnet-"
and "dotnet-hostfxr-"
prefixes match, but returns false
if one of the prefixes don't match, such as "dotnet-runtime-8.0"
.
EndsWithAnyOf()
and EndsWithAllOf()
This checks the string for a list of suffixes in the OR and the AND logical condition, respectively.
EndsWithAnyOf()
checks to see if any of the suffixes is found within the ending of the string.
For example, this string "Release-5.0-OOB"
returns true
if either the "-OOB"
or the "-RTM"
suffixes match.
EndsWithAllOf()
checks to see if all of the suffixes is found within the ending of the string.
For example, this string "Release-5.0-OOB"
returns true
if both the "-OOB"
and the "-5.0-OOB"
suffixes match, but returns false
if one of the suffixes doesn't match, for example, "Release-4.6-OOB"
.
ContainsAnyOf()
and ContainsAllOf()
This checks the string for a list of substrings in the OR and the AND logical condition, respectively.
ContainsAnyOf()
checks to see if any of the substrings is found within the string.
For example, this string "Branch-Prod-5.0"
returns true
if either the "Prod"
or the "Staging"
substrings match.
ContainsAllOf()
checks to see if all of the substrings is found within the string.
For example, this string "Branch-Prod-5.0"
returns true
if both the "Prod"
and the "Branch"
substrings match, but returns false
if one of the substrings doesn't match, for example, "Branch-Staging-5.0"
.
The following functions allow you to perform replacement operations on a string.
ReplaceAll()
This function allows you to perform a replacement of a list of specified characters or substrings with either a single string or a single character that will be used for replacement.
Please <replace> Nitrocid. This sub is a unit <replace2>.
Replacement string to replace <replace>
and <replace2>
: test
Result: Please test Nitrocid. This sub is a unit test.
ReplaceAllRange()
This function allows you to perform a replacement of a list of specified characters or substrings with either a list of strings or characters that will be used for replacement. This is a bulk replacement.
Please <replace> Nitrocid. This sub is a unit <replace2>.
Replacement strings to replace <replace>
and <replace2>
: test the integrity of
, test
Result: Please test the integrity of Nitrocid. This sub is a unit test.
In order for this to work, the length of both the replacement source array and the target array must be at the same length.
ReplaceLastOccurrence()
This function allows you to replace the last occurrence of either a character or a substring with the target replacement character or substring.
Nitrocid is awesome and is great!
Replacement string to replace the last is
: its features are
Result: Nitrocid is awesome and its features are great!
ReplaceChar()
This function allows you to replace a character in a specified index with a replacement character.
Textyfy
Character to replace in index 4
(char 5): i
Result: Textify
The following functions allow you to perform index operations on a string.
AllIndexesOf()
This function allows you to get all indexes of either a target string or a target character. This allows for more specific replacements or analysis.
Nitrocid is awesome and is great!
Character to get its index in a string: a
Results
First index: 12
(char 13)
Second index: 20
(char 21)
Third index: 30
(char 31)
The following functions allow you to perform formatting operations on a string.
FormatString()
This function allows you to format a string using a string extension. This makes use of the String.Format()
function, but doesn't throw an exception. If formatting fails, the string is returned unmodified.
Nitrocid KS 0.0.1 first launched {0}/{1}/{2}.
Formatted variables: 2
, 22
, 2018
Result: Nitrocid KS 0.0.1 first launched 2/22/2018.
IsStringNumeric()
This function checks to see whether this string can be expressed as a number or not. This function also supports double-precision floating point values.
String "1"
returns true
String "a"
returns false
The following functions allow you to perform operations on a prefix or a suffix within a string.
AddPrefix()
and AddSuffix()
These functions allow you to add either a prefix or a suffix to the string, respectively. This makes the task easier for you by automatically checking to see if the string already starts with a prefix or ends with a suffix.
Prefixes
Adding prefixes with checking: Hello
with str
as prefix becomes strHello
, and strHello
with str
as prefix becomes strHello
.
Adding prefixes without checking: Hello
with str
as prefix becomes strHello
, and strHello
with str
as prefix becomes strstrHello
.
Suffixes
Adding suffixes with checking:
If you want to turn automatic checking off, you can set the check
argument value to false
.
RemovePrefix()
and RemoveSuffix()
These functions allow you to remove either a prefix or a suffix to the string, respectively. This operation checks to see if the string already starts with a prefix or ends with a suffix.
VerifyPrefix()
and VerifySuffix()
These functions allow you to verify a prefix and a suffix within a string by comparing the following:
Prefix at the beginning of the string
Suffix at the end of the string
Currently, this comparison is case-sensitive according to your current culture settings determined by your operating system. However, the comparison
argument lets you control case-sensitivity and culture-specific settings. For instance, you can make use of OrdinalIgnoreCase
to verify the prefix or the suffix ordinally without checking for case sensitivity.
The following functions allow you to encode and decode your string easily.
GetBase64Encoded()
This encodes a specified string and returns a BASE64 encoded string that can be decoded.
GetBase64Decoded()
This decodes a specified BASE64 string and returns a decoded string that can be encoded.
The following functions allow you to manipulate with cases in a string.
UpperFirst()
and LowerFirst()
This allows you to make the first character in a string upper case or lower case.
ToTitleCase()
This function allows you to change the casing of all words in a string except the small words that should be kept lowercase, such as the following:
of
the
a
an
in
on
to
from
These tools allow you to escape and unescape some of the illegal characters.
The following characters are escaped:
\
, *
, +
, ?
, |
, {
, [
, (
, )
, ^
, $
, .
, #
,
, -
, "
, '
, `
, !
Escape()
This function allows you to escape some of the illegal characters for string parsing.
Unescape()
This function allows you to unescape some of the illegal characters for human readability.
The functions that fall into this category allow you to determine the letter repetition pattern by the number of steps.
GetLetterRepetitionPattern()
This function allows you to get a number that represents a letter repetition pattern (LRP) that determines how many times a program needs to step n
characters, which is specified in the steps
parameter, before the final step round reaches the end of the string.
GetLetterRepetitionPatternTable()
These functions allow you to get a read-only dictionary that represents a number of steps taken multipled by the number of iterations.
The first function overload allows you to specify either a single iteration or double iterations.
The second function overload allows you to specify a number of iterations.
GetListOfRepeatedLetters()
This allows you to get a list of repeated letters in a read-only dictionary form:
The key is a single unique character found in a string
The value represents how many times a character has occurred in a string
By default, this function populates characters that are only populated once. If you're not interested in this detail, you can remove them by passing the removeSingle
argument as true
.
These tools allow you to perform logical comparison operations in a string.
CompareLogical()
This function allows you to compare two strings logically (that is, alphanumerically) similar to how Windows Explorer sorts files. This returns either a result of CompareTo()
against two strings or a result of the same function against two numeric chunks detected.
Usually, you'll only need to use the LogicalComparer
class as a comparer when sorting strings this way.
OrderLogically()
and OrderDescendLogically()
This function simplifies the usage of the LogicalComparer
class by wrapping it with the OrderBy()
and the OrderByDescending()
functions.
These tools allow you to test strings for equality, prefixes, and suffixes in different ways.
EqualsNoCase()
This function allows you to test string equality easily without checking for case sensitivity.
The comparison argument must be supplied with one of the following comparison options:
StringComparison.CurrentCultureIgnoreCase
StringComparison.InvariantCultureIgnoreCase
StringComparison.OrdinalIgnoreCase
EqualsCase()
This function allows you to test string equality easily while checking for case sensitivity.
The comparison argument must be supplied with one of the following comparison options:
StringComparison.CurrentCulture
StringComparison.InvariantCulture
StringComparison.Ordinal
StartsWithNoCase()
This function allows you to test string prefix easily without checking for case sensitivity.
The comparison argument must be supplied with one of the following comparison options:
StringComparison.CurrentCultureIgnoreCase
StringComparison.InvariantCultureIgnoreCase
StringComparison.OrdinalIgnoreCase
StartsWithCase()
This function allows you to test string prefix easily while checking for case sensitivity.
The comparison argument must be supplied with one of the following comparison options:
StringComparison.CurrentCulture
StringComparison.InvariantCulture
StringComparison.Ordinal
EndsWithNoCase()
This function allows you to test string suffix easily without checking for case sensitivity.
The comparison argument must be supplied with one of the following comparison options:
StringComparison.CurrentCultureIgnoreCase
StringComparison.InvariantCultureIgnoreCase
StringComparison.OrdinalIgnoreCase
EndsWithCase()
This function allows you to test string suffix easily while checking for case sensitivity.
The comparison argument must be supplied with one of the following comparison options:
StringComparison.CurrentCulture
StringComparison.InvariantCulture
StringComparison.Ordinal
ContainsWithNoCase()
This function allows you to check a substring without checking for case sensitivity.
The following functions allow you to perform operations with wide strings and characters on a string.
GetWideChars()
This function allows you to get a list of wide characters that are found within a string. A wide character description can be found in this page.
The following functions allow you to perform even more operations on a string.
ShiftLetters()
This function allows you to easily shift characters within a string by a number of character shifting steps that can be specified from -255
to 255
.
Shift Hello
by 1
: Ifmmp
Shift Hello
by -1
: Gdkkn
TruncateString()
This function allows you to truncate a string into a specified string length. This helps in situations where wrapping is not possible or the user needs a truncated string.
Reverse()
This function allows you to easily reverse the order of characters in a string. For example, Reversed
is desreveR
.
GetEnclosedWordFromIndex()
This function splits the string by spaces internally, then determines what word is from a specified source index. You can also include the symbols in the resultant enclosed word. Here are the following examples:
Without symbols
Hello world!
at index 2
: Hello
Hello world!
at index 8
: World
With symbols
Hello world!
at index 2
: Hello
Hello world!
at index 8
: World!
GetIndexOfEnclosedWordFromIndex()
This function splits the string by spaces internally, then determines what word is from a specified source index, and gets the index of its first character. You can also include the symbols in the resultant enclosed word. Here are the following examples:
Without symbols
!Hello world!
at index 2
: 1
Hello world!
at index 8
: 6
With symbols
!Hello world!
at index 2
: 0
Hello world!
at index 8
: 6
ReadNullTerminatedString()
This function allows you to read a null-terminated string, optionally chopping the source string starting from offset
index.
IsPalindrome()
This function checks to see if a specified string is a palindrome or not. A string is considered to be a palindrome if the other half of the string is a mirror of the first half, such as madam
or noon
. Strings such as Word
and Laura
are not palindromes.
ToStringBuilder()
This function allows you to easily get a string builder from a string to perform operations on a string without allocations.
BreakSurrogates()
This function allows you to break a string that consists of high surrogate and low surrogate characters into their individual character representations of the surrogates.
Character management tools can be found in the CharManager
class under the Textify.General
namespace. They make it easier for you to manipulate with individual characters.
NewLine
This property returns a new line that is returned by the Environment.NewLine
property. This changes depending on the operating system, such as CR
+ LF
on Windows and LF
on Unix systems.
GetAllAsciiChars()
Gets all 256 ASCII characters. You can refer to the ASCII table here.
GetAllChars()
Gets all Unicode characters ranging from \u0000
to \uFFFF
in hexadecimal representation.
GetAllLettersAndNumbers()
Gets all letter and number characters. If unicode
is set to true, it returns all letter and number characters found within Unicode. Otherwise, it uses the ASCII table to look up letters and numbers.
GetAllLetters()
Gets all letter characters. If unicode
is set to true, it returns all letter characters found within Unicode. Otherwise, it uses the ASCII table to look up letters.
GetAllNumbers()
Gets all number characters. If unicode
is set to true, it returns all number characters found within Unicode. Otherwise, it uses the ASCII table to look up numbers.
GetAllDigitChars()
Gets all characters that represent a digit. If unicode
is set to true, it returns all such characters found within Unicode. Otherwise, it uses the ASCII table to look up such characters.
GetAllControlChars()
Gets all control characters from the whole Unicode table, such as escape character, bell character, and more.
GetAllRealControlChars()
Gets all control characters from the whole Unicode table, such as escape character, bell character, and more.
GetAllSurrogateChars()
Gets all high and low surrogate characters from the whole Unicode table.
GetAllHighSurrogateChars()
Gets all high surrogate characters from the whole Unicode table.
GetAllLowSurrogateChars()
Gets all low surrogate characters from the whole Unicode table.
GetAllLowerChars()
Gets all lower case characters. If unicode
is set to true, it returns all such characters found within Unicode. Otherwise, it uses the ASCII table to look up such characters.
GetAllUpperChars()
Gets all upper case characters. If unicode
is set to true, it returns all such characters found within Unicode. Otherwise, it uses the ASCII table to look up such characters.
GetAllPunctuationChars()
Gets all punctuation characters. If unicode
is set to true, it returns all such characters found within Unicode. Otherwise, it uses the ASCII table to look up such characters.
GetAllSeparatorChars()
Gets all separator characters. If unicode
is set to true, it returns all such characters found within Unicode. Otherwise, it uses the ASCII table to look up such characters.
GetAllSymbolChars()
Gets all symbol characters. If unicode
is set to true, it returns all such characters found within Unicode. Otherwise, it uses the ASCII table to look up such characters.
GetAllWhitespaceChars()
Gets all whitespace characters. If unicode
is set to true, it returns all such characters found within Unicode. Otherwise, it uses the ASCII table to look up such characters.
GetEsc()
Gets an escape character. This function is a wrapper of the escape character, \u001b
.
IsControlChar()
Checks whether the specified character is a real control character. This checks to see if the following conditions are true:
The character is greater than the NULL character (\u0000
) and less than the BACKSPACE character (\u0008
)
The character is greater than the CARRIAGE RETURN character (\u000D
) and less than the SUBSTITUTE character (\u001A
)
Mathematically, the algorithm that this function uses can be described as:
The below functions are also found in Terminaux, though they also employ VT sequence support to help process them. For console applications, it's better to use the version.
This function also found in Terminaux, though they also employ VT sequence support to help process them. For console applications, it's better to use the version.