Split Text with RegEx

Usage

This action splits a text into a collection of strings wherever a specified pattern matches.

Important Notes

Zenphi provides three actions that utilize RegEx for text processing, each serving a distinct purpose. It’s important not to confuse them:

  1. Split Text with RegEx (this action)
    Splits a text into a collection of strings based on a specified RegEx pattern.
    Use Case: Separating a paragraph into sentences or splitting a CSV string into individual values.

  2. Find Values with RegEx
    Finds and extracts all values in a text that match a specified RegEx pattern.
    Use Case: Extracting all email addresses, phone numbers, or dates from a document.

  3. Find Value with RegEx
    Finds a single value that matches a specified RegEx pattern.
    Use Case: Extracting the first occurrence of a specific pattern, such as the first email address in a text.

Understanding these differences ensures you select the right action for your specific workflow requirements.
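
As a rough illustration of how the three actions differ, the sketch below uses Python's standard `re` module as a stand-in. Zenphi's own engine and output format are not shown here, so mapping each action to a Python function is an assumption, and the sample text is made up.

```python
import re

text = "Order 12, order 345, order 6"  # made-up sample input

# Split Text with RegEx: cut the text wherever the pattern matches.
parts = re.split(r",\s*", text)

# Find Values with RegEx: collect every match of the pattern.
all_numbers = re.findall(r"\d+", text)

# Find Value with RegEx: take only the first match.
first_number = re.search(r"\d+", text).group()

print(parts)         # ['Order 12', 'order 345', 'order 6']
print(all_numbers)   # ['12', '345', '6']
print(first_number)  # 12
```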

Fields

1. Text - The text you want to split.

2. Pattern - The RegEx pattern to split on.

3. Case Insensitive Toggle - Specifies whether the operation ignores letter case.

4. REGEX Toggle - Specifies whether the operation uses RegEx.
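
To make the four fields concrete, here is a minimal Python sketch of how they might combine. The function `split_text` and its parameter names are hypothetical, and treating the pattern as literal text when the REGEX toggle is off is an assumption about the action's behavior, not something this page confirms.

```python
import re

def split_text(text, pattern, case_insensitive=False, use_regex=True):
    """Hypothetical stand-in for the action's four fields."""
    if not use_regex:
        # Assumption: with the toggle off, the pattern is matched literally.
        pattern = re.escape(pattern)
    flags = re.IGNORECASE if case_insensitive else 0
    return re.split(pattern, text, flags=flags)

literal_dot = split_text("a.b.c", ".", use_regex=False)
case_sensitive = split_text("oneANDtwoandthree", "and")
ignore_case = split_text("oneANDtwoandthree", "and", case_insensitive=True)

print(literal_dot)     # ['a', 'b', 'c']
print(case_sensitive)  # ['oneANDtwo', 'three']
print(ignore_case)     # ['one', 'two', 'three']
```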

Demonstration on how to use it in a flow

1. Drag and drop the RegEx - Split action into the flow.

2. The Name section is pre-filled with the action name, but you can change it as you prefer.

3. Click the gear icon to open its settings.

4. Enter the text.

5. Enter the REGEX pattern.

6. Specify whether the operation should be case insensitive.

7. Specify whether REGEX should be used.


Introduction to Regular Expressions (RegEx)

Regular Expressions (RegEx) are a powerful tool used for pattern matching and text manipulation. They allow you to search for specific patterns within strings (texts). Below are the basics to help you create simple patterns:

1. Literal Characters

Matches the exact characters you type.

Example:

  • Pattern: apple
  • Matches: "apple" in "I like apple pie."
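
When a literal pattern is used for splitting, the matched text acts as the separator and is dropped from the result. A small sketch with Python's `re` module, used here only as a stand-in for the action's engine:

```python
import re

# The literal pattern "apple" is the separator; the pieces around it remain.
parts = re.split(r"apple", "I like apple pie.")

print(parts)  # ['I like ', ' pie.']
```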

2. Dot (.)

Matches any single character except a newline.

Example:

  • Pattern: a.b
  • Matches: "aab", "axb", "acb" in "aab axb acb"
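
The dot example above can be checked with a short Python sketch. Python's `re` module is an assumed stand-in here; other engines may treat newlines slightly differently.

```python
import re

# The dot stands for exactly one character between "a" and "b".
same_line = re.findall(r"a.b", "aab axb acb")

# By default the dot does not match a newline, so nothing is found here.
across_newline = re.findall(r"a.b", "a\nb")

print(same_line)       # ['aab', 'axb', 'acb']
print(across_newline)  # []
```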

3. Character Classes

Matches any character inside the square brackets [].

Examples:

  • Pattern: [abc]
    • Matches: "a", "b", "c" in "apple banana cat"
  • Pattern: [0-9]
    • Matches: Any digit (0 through 9) in "123 abc"
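
A brief Python sketch of the character classes above, plus one way a class might serve as a split pattern. The `re` module stands in for the action's engine, and the `"a,b;c"` input is made up.

```python
import re

letters = re.findall(r"[abc]", "apple banana cat")
digits = re.findall(r"[0-9]", "123 abc")

# A class is handy for splitting on any one of several separators.
fields = re.split(r"[,;]", "a,b;c")

print(letters)  # ['a', 'b', 'a', 'a', 'a', 'c', 'a']
print(digits)   # ['1', '2', '3']
print(fields)   # ['a', 'b', 'c']
```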

4. Negated Character Classes

Matches any character except those inside the square brackets [^].

Example:

  • Pattern: [^a-z]
    • Matches: Any character that is not a lowercase letter, such as "1", "2", "3" in "apple123"
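
The negated class can be tried out with this Python sketch (again using `re` as an assumed stand-in). The second call, with the made-up input `"apple123banana"`, shows how a negated class could be used to split on everything that is not a lowercase letter.

```python
import re

# Every character that is NOT a lowercase letter.
non_lower = re.findall(r"[^a-z]", "apple123")

# Splitting on runs of such characters keeps only the lowercase words.
words = re.split(r"[^a-z]+", "apple123banana")

print(non_lower)  # ['1', '2', '3']
print(words)      # ['apple', 'banana']
```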

5. Quantifiers

Specifies how many times a character or group should appear.

Examples:

  • *: Zero or more times
    • Pattern: a*
    • Matches: "aaa", "a", "" in "aaa apple a"
  • +: One or more times
    • Pattern: a+
    • Matches: "aaa", "a" in "aaa apple a"
  • ?: Zero or one time
    • Pattern: a?
    • Matches: "a", "" in "apple"
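
The quantifiers above can be sketched in Python as follows. The `colou?r` pattern and the spaced-out input are made-up illustrations, and `re` is only a stand-in for the action's engine.

```python
import re

# +: one or more "a" characters in a row.
runs_of_a = re.findall(r"a+", "aaa apple a")

# ?: the "u" is optional, so both spellings match.
spellings = re.findall(r"colou?r", "color colour")

# *: zero or more, so even the empty string is a full match.
empty_ok = re.fullmatch(r"a*", "") is not None

# Splitting on one or more spaces collapses repeated separators.
words = re.split(r" +", "one  two   three")

print(runs_of_a)  # ['aaa', 'a', 'a']
print(spellings)  # ['color', 'colour']
print(empty_ok)   # True
print(words)      # ['one', 'two', 'three']
```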

6. Anchors

Matches positions in the text (start or end).

Examples:

  • ^: Matches the start of a string
    • Pattern: ^apple
    • Matches: "apple" at the beginning of a string
  • $: Matches the end of a string
    • Pattern: pie$
    • Matches: "pie" at the end of a string
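
A short Python sketch of the two anchors, with made-up sample strings. It assumes default (single-line) matching, where ^ and $ refer to the start and end of the whole string.

```python
import re

start_hit = re.search(r"^apple", "apple pie") is not None
start_miss = re.search(r"^apple", "green apple") is not None
end_hit = re.search(r"pie$", "apple pie") is not None
end_miss = re.search(r"pie$", "pie chart") is not None

print(start_hit, start_miss)  # True False
print(end_hit, end_miss)      # True False
```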

7. Escape Characters

A backslash (\) before a special character such as ., *, or + makes that character match literally.

Example:

  • Pattern: \.
  • Matches: "." (literal dot) in "file.txt"
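
Escaping matters most when splitting, because an unescaped dot matches every character. The Python sketch below (with `re` as an assumed stand-in and made-up inputs) shows the difference; `re.escape` is Python's helper for escaping a literal string.

```python
import re

# Escaped dot: split only at the literal "." character.
escaped = re.split(r"\.", "file.txt")

# Unescaped dot: every character is a separator, leaving only empty strings.
unescaped = re.split(r".", "a.b")

# re.escape adds the backslashes for you.
auto_escaped = re.escape("file.txt")

print(escaped)       # ['file', 'txt']
print(unescaped)     # ['', '', '', '']
print(auto_escaped)  # file\.txt
```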

8. Groups and Pipes (Alternation)

Groups multiple characters and allows alternation between them.

Examples:

  • () for grouping
    • Pattern: (abc|def)
    • Matches: "abc" or "def"
  • | for alternation
    • Pattern: apple|banana
    • Matches: "apple" or "banana"
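
The sketch below tries alternation in Python and shows one detail that is specific to splitting: in Python's `re.split`, a capturing group keeps the separators in the result, while a non-capturing group `(?:...)` drops them. Whether the Zenphi action behaves the same way is not stated on this page, so treat that part as an assumption to verify.

```python
import re

fruits = re.findall(r"apple|banana", "apple, banana, cherry")

# Non-capturing group: separators are dropped.
without_separators = re.split(r"(?:,|;)", "a,b;c")

# Capturing group: Python keeps the matched separators in the output.
with_separators = re.split(r"(,|;)", "a,b;c")

print(fruits)              # ['apple', 'banana']
print(without_separators)  # ['a', 'b', 'c']
print(with_separators)     # ['a', ',', 'b', ';', 'c']
```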

Example Patterns:

  • Email Address:

    • Pattern: \b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b
    • Matches: Email addresses of the form name@domain.tld
  • Phone Number:

    • Pattern: \+?\d{1,4}[\s-]?\(?\d{1,3}\)?[\s-]?\d{3,4}[\s-]?\d{3,4}
    • Matches: "+123 456-7890"
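
Both example patterns can be exercised with a short Python sketch. The email addresses below are made-up placeholders, and `re` is only an assumed stand-in for the engine the action uses.

```python
import re

EMAIL = r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"
PHONE = r"\+?\d{1,4}[\s-]?\(?\d{1,3}\)?[\s-]?\d{3,4}[\s-]?\d{3,4}"

# Placeholder addresses, not taken from the original page.
emails = re.findall(EMAIL, "Write to ana@example.com or bo@example.org.")

# The phone sample from the text matches the pattern in full.
phone_ok = re.fullmatch(PHONE, "+123 456-7890") is not None

print(emails)    # ['ana@example.com', 'bo@example.org']
print(phone_ok)  # True
```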

Summary:

  • Use literal characters for exact matches.
  • Use special characters like . and [] for flexible matching.
  • Control repetition with *, +, and ?.
  • Use anchors (^, $) to match the start or end of a string.
  • Escape special characters with \.
  • Group and alternate with () and |.

These basics will allow you to create simple RegEx patterns for various text-processing tasks!