DataProcessor Documentation

Version: 2.5.0
Scope: GLOBAL
Keywords: 10


Author: Mary Jane
Maintainer: Docs Team
License: MIT
Robot Framework: >=7.0
Python: >=3.8

Sample Usage


*** Settings ***
Library    DataProcessor

*** Test Cases ***
Example
    [Documentation]    Demonstrates using DataProcessor
    # add your keyword calls here

Introduction

A powerful data processing library for Robot Framework.

This library provides keywords for:

Data validation and transformation
JSON and dictionary manipulation
String processing and formatting
List operations and filtering
Data comparison and merging

Validate Json Structure

Arguments

json_data : Union[str, dict]

schema : Dict[str, Any]

Return Type

bool

Documentation

This keyword validates that the provided JSON data matches the expected schema structure. It checks for required fields, data types, and nested structures.

Arguments:

json_data: JSON string or dictionary to validate
schema: Dictionary defining the expected structure with type information

Returns: True if validation passes, False otherwise.

Example:

*** Settings ***
Library    DataProcessor


*** Test Cases ***
Validate JSON Example
    ${schema}    Create Dictionary    name=str    age=int    email=str
    ${valid}    Validate Json Structure    {"name": "John", "age": 30, "email": "john@example.com"}    ${schema}
    Should Be True     ${valid}

Schema Format:

Field	Type	Description
Field name	str/int/bool/list/dict	Expected data type
Nested dict	dict	For nested objects, use dict with sub-schema
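
The schema rules above can be illustrated with a short Python sketch. This is not the library's implementation: the function name `validate_structure` and the `TYPE_NAMES` lookup are hypothetical, and the sketch assumes that every field listed in the schema is required and that type names are given as strings such as `str` or `int`.

```python
import json
from typing import Any, Dict, Union

# Hypothetical lookup from schema type names to Python types.
TYPE_NAMES = {"str": str, "int": int, "bool": bool, "list": list, "dict": dict}


def validate_structure(json_data: Union[str, dict], schema: Dict[str, Any]) -> bool:
    """Return True if every schema field is present with the expected type."""
    # Accept either a JSON string or an already parsed dictionary.
    data = json.loads(json_data) if isinstance(json_data, str) else json_data
    if not isinstance(data, dict):
        return False
    for field, expected in schema.items():
        if field not in data:
            return False  # assumption: every schema field is required
        value = data[field]
        if isinstance(expected, dict):
            # A dict in the schema is treated as a sub-schema for a nested object.
            if not isinstance(value, dict) or not validate_structure(value, expected):
                return False
            continue
        expected_type = TYPE_NAMES.get(expected)
        if expected_type is None or not isinstance(value, expected_type):
            return False
        # bool is a subclass of int in Python, so reject True/False for "int".
        if expected_type is int and isinstance(value, bool):
            return False
    return True
```

In this sketch an unknown type name fails validation rather than raising an error; the real keyword may behave differently.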

See also:

Transform Json Data for data transformation
Merge Dictionaries for combining data structures

Transform Json Data

Arguments

data : Union[str, Dict]

mapping : Dict[str, str]

default_value : Optional[Any] default: None

Return Type

Dict[str, Any]

Documentation

Transform JSON data using field mapping rules.

Maps fields from source data to target structure based on the provided mapping dictionary. Supports nested field access using dot notation.

Arguments:

data: Source JSON data (string or dictionary)
mapping: Dictionary mapping target fields to source fields (e.g., {"new_name": "old_name"})
default_value: Default value for missing fields (default: None)

Returns: Transformed dictionary with mapped fields.

Example:

*** Settings ***
Library    DataProcessor


*** Test Cases ***
Transform JSON Example
    ${source}    Create Dictionary    first_name=John    last_name=Doe    age=30
    ${mapping}    Create Dictionary    name=first_name    surname=last_name    years=age
    ${result}    Transform Json Data    ${source}    ${mapping}
    # Result: {"name": "John", "surname": "Doe", "years": 30}

Advanced Mapping:

*** Settings ***
Library    DataProcessor


*** Test Cases ***
Advanced Transform Example
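    # Build a nested source so the dot-notation paths in the mapping can resolve
    ${contact}    Create Dictionary    email=john@example.com
    ${user}    Create Dictionary    name=John    contact=${contact}
    ${source}    Create Dictionary    user=${user}
    ${mapping}    Create Dictionary    username=user.name    email=user.contact.email
    ${result}    Transform Json Data    ${source}    ${mapping}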
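
To make the dot-notation lookup concrete, here is a minimal Python sketch of how such a mapping could be resolved. The names `get_by_path` and `transform` are hypothetical, and the sketch assumes that a path which cannot be followed falls back to `default_value`; the library's own behavior may differ.

```python
import json
from typing import Any, Dict, Optional, Union

_MISSING = object()  # sentinel that distinguishes "absent" from a stored None


def get_by_path(data: Any, path: str) -> Any:
    """Follow a dotted path such as "user.contact.email" through nested dicts."""
    current = data
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return _MISSING
        current = current[part]
    return current


def transform(
    data: Union[str, Dict[str, Any]],
    mapping: Dict[str, str],
    default_value: Optional[Any] = None,
) -> Dict[str, Any]:
    """Build a new dict whose keys are the mapping's target field names."""
    source = json.loads(data) if isinstance(data, str) else data
    result = {}
    for target, source_path in mapping.items():
        value = get_by_path(source, source_path)
        # Assumption: unresolved paths take the default value.
        result[target] = default_value if value is _MISSING else value
    return result
```

The sentinel object lets a source field that legitimately holds `None` pass through unchanged instead of being mistaken for a missing field.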

Filter List By Condition

Arguments

items : List[Any]

condition : str

value : Optional[Any] default: None

Return Type

List[Any]

Documentation

Filter a list based on a condition.

Filters list items using one of the supported conditions: equality, substring containment, numeric comparison, type checking, or regex matching.

Arguments:

items: List of items to filter
condition: Condition type: equals, contains, greater_than, less_than, type_is, matches_regex
value: Value to compare against (required for most conditions)

Returns: Filtered list containing only matching items.

Example:

*** Settings ***
Library    DataProcessor


*** Test Cases ***
Filter List Example
    ${numbers}    Create List    1    5    10    15    20
    ${filtered}    Filter List By Condition    ${numbers}    greater_than    10
    # Result: [15, 20]

Supported Conditions:

Condition	Description	Example
`equals`	Exact match	`equals` with `value="test"`
`contains`	Substring search	`contains` with `value="key"`
`greater_than`	Numeric comparison	`greater_than` with `value=10`
`less_than`	Numeric comparison	`less_than` with `value=5`
`type_is`	Type checking	`type_is` with `value=int`
`matches_regex`	Pattern matching	`matches_regex` with `value="^[A-Z]+$"`
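
One way to picture the condition table is as a lookup from condition name to predicate. The Python sketch below is an illustration only: `CONDITIONS`, `TYPE_NAMES`, and `filter_by_condition` are hypothetical names, and coercing both sides to `float` for the numeric comparisons is an assumption made so that string arguments coming from Robot Framework data still compare numerically.

```python
import re
from typing import Any, Callable, Dict, List, Optional

# Hypothetical lookup from type names to Python types for "type_is".
TYPE_NAMES = {"int": int, "float": float, "str": str, "bool": bool, "list": list, "dict": dict}

# Each condition name maps to a predicate taking (item, value).
CONDITIONS: Dict[str, Callable[[Any, Any], bool]] = {
    "equals": lambda item, value: item == value,
    "contains": lambda item, value: str(value) in str(item),
    # Assumption: numeric conditions coerce both sides to float.
    "greater_than": lambda item, value: float(item) > float(value),
    "less_than": lambda item, value: float(item) < float(value),
    "type_is": lambda item, value: type(item) is TYPE_NAMES[value],
    "matches_regex": lambda item, value: re.search(value, str(item)) is not None,
}


def filter_by_condition(items: List[Any], condition: str, value: Optional[Any] = None) -> List[Any]:
    """Return the items for which the named condition holds."""
    try:
        predicate = CONDITIONS[condition]
    except KeyError:
        raise ValueError(f"Unknown condition: {condition!r}") from None
    return [item for item in items if predicate(item, value)]
```

A lookup table keeps each condition independent, so supporting another condition means adding one entry rather than extending a chain of `if` statements.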

Regex Example:

*** Settings ***
Library    DataProcessor


*** Test Cases ***
Filter By Regex Example
    ${emails}    Create List    user@example.com    invalid    admin@test.org
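    # Backslashes in the pattern are doubled because a single backslash is an escape character in Robot Framework data
    ${valid}    Filter List By Condition    ${emails}    matches_regex    ^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$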

Merge Dictionaries

Arguments

*dictionaries : Dict[str, Any]

strategy : str default: "override"

Return Type

Dict[str, Any]

Documentation

Merge multiple dictionaries into a single dictionary.

Combines multiple dictionaries using different merge strategies. Handles nested dictionaries intelligently based on the chosen strategy.

Arguments:

*dictionaries: Variable number of dictionaries to merge
strategy: Merge strategy: override (default), keep_first, keep_last, deep_merge

Returns: Merged dictionary containing all key-value pairs.

Example:

*** Settings ***
Library    DataProcessor


*** Test Cases ***
Merge Dictionaries Example
    ${dict1}    Create Dictionary    name=John    age=30
    ${dict2}    Create Dictionary    age=35    city=NYC
    ${merged}    Merge Dictionaries    ${dict1}    ${dict2}
    # Result: {"name": "John", "age": 35, "city": "NYC"} (override strategy)

Merge Strategies:

Strategy	Behavior	Use Case
`override`	Later dicts override earlier ones	Default configuration overrides
`keep_first`	First value wins	Preserve original data
`keep_last`	Last value wins	Latest updates take precedence
`deep_merge`	Recursively merge nested dicts	Complex nested structures
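
The strategies in the table can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the library's code: `merge_dictionaries` is a hypothetical function name, `keep_last` is treated the same as `override` because the table describes both as "later wins", and `deep_merge` is assumed to let later non-dict values win on conflict.

```python
from typing import Any, Dict

STRATEGIES = ("override", "keep_first", "keep_last", "deep_merge")


def merge_dictionaries(*dictionaries: Dict[str, Any], strategy: str = "override") -> Dict[str, Any]:
    """Merge dictionaries left to right without mutating the inputs."""
    if strategy not in STRATEGIES:
        raise ValueError(f"Unknown strategy: {strategy!r}")
    result: Dict[str, Any] = {}
    for current in dictionaries:
        for key, value in current.items():
            if key not in result:
                result[key] = value
            elif strategy == "keep_first":
                continue  # the first value seen for a key is kept
            elif (
                strategy == "deep_merge"
                and isinstance(result[key], dict)
                and isinstance(value, dict)
            ):
                # Recurse so nested keys from both sides survive.
                result[key] = merge_dictionaries(result[key], value, strategy="deep_merge")
            else:
                result[key] = value  # override / keep_last: the later value wins
    return result
```

The recursive call builds a new nested dictionary each time, which is why the input dictionaries are left untouched.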

Deep Merge Example:

*** Settings ***
Library    DataProcessor


*** Test Cases ***
Deep Merge Example
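    # Nested values are built first so the example is self-contained
    ${base_user}    Create Dictionary    name=John
    ${update_user}    Create Dictionary    city=NYC
    ${base}    Create Dictionary    user=${base_user}
    ${update}    Create Dictionary    user=${update_user}
    ${merged}    Merge Dictionaries    ${base}    ${update}    strategy=deep_merge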

Format String Template

Arguments

template : str

variables : Dict[str, Any]

missing_handler : str default: "skip"

Return Type

str

Documentation

Format a string template with variable substitution.

Replaces placeholders in a template string with values from a dictionary. Supports various placeholder formats and missing value handling strategies.

Arguments:

template: String template with placeholders (e.g., "Hello {name}, you have {count} messages")
variables: Dictionary mapping placeholder names to values
missing_handler: How to handle missing variables - skip, error, or default

Returns: Formatted string with variables substituted.

Example:

*** Settings ***
Library    DataProcessor


*** Test Cases ***
Format Template Example
    ${template}    Set Variable    Welcome {username}! You have {message_count} new messages.
    ${vars}    Create Dictionary    username=John    message_count=5
    ${result}    Format String Template    ${template}    ${vars}
    # Result: "Welcome John! You have 5 new messages."

Placeholder Formats:

{variable} - Simple placeholder
{variable:default} - With default value
{variable!upper} - With transformation (upper, lower, title)

Missing Handler Options:

Handler	Behavior
`skip`	Leave placeholder as-is
`error`	Raise an error
`default`	Use empty string
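
The placeholder formats and handler options can be combined in a single regex substitution. The Python sketch below is illustrative only: `PLACEHOLDER` and `format_template` are hypothetical names, and it assumes that an inline default such as `{user:Guest}` takes precedence over `missing_handler`, which the documentation does not state explicitly.

```python
import re
from typing import Any, Dict

# Matches {name}, {name:default} and {name!upper|lower|title}.
PLACEHOLDER = re.compile(r"\{(\w+)(?::([^{}!]*))?(?:!(upper|lower|title))?\}")


def format_template(template: str, variables: Dict[str, Any], missing_handler: str = "skip") -> str:
    """Substitute placeholders in the template with values from variables."""

    def substitute(match: "re.Match[str]") -> str:
        name, inline_default, transform = match.groups()
        if name in variables:
            text = str(variables[name])
        elif inline_default is not None:
            text = inline_default  # assumption: inline default beats the handler
        elif missing_handler == "skip":
            return match.group(0)  # leave the placeholder as-is
        elif missing_handler == "error":
            raise KeyError(f"Missing template variable: {name}")
        elif missing_handler == "default":
            text = ""
        else:
            raise ValueError(f"Unknown missing_handler: {missing_handler!r}")
        if transform:
            # transform is one of upper, lower, title; all are str methods.
            text = getattr(text, transform)()
        return text

    return PLACEHOLDER.sub(substitute, template)
```

Under these assumptions, the template `User: {user:Guest} | Status: {status!upper} | Count: {count:0}` with only `status=active` supplied formats to `User: Guest | Status: ACTIVE | Count: 0`.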

Advanced Example:

*** Settings ***
Library    DataProcessor


*** Test Cases ***
Advanced Template Example
    ${template}    Set Variable    User: {user:Guest} | Status: {status!upper} | Count: {count:0}
    ${vars}    Create Dictionary    status=active
    ${result}    Format String Template    ${template}    ${vars}    missing_handler=default

Extract Data By Pattern

Arguments

text : str

pattern : str

group : Optional[int] default: None

Return Type

Union[str, List[str]]

Documentation

Extract data from text using regex patterns.

Extracts matching text from a string using regular expressions. Supports single match, all matches, and specific capture group extraction.

Arguments:

text: Text to search in
pattern: Regular expression pattern
group: Capture group number (None for all matches, 0 for full match, 1+ for specific group)

Returns: Extracted string or list of strings.

Example:

*** Settings ***
Library    DataProcessor


*** Test Cases ***
Extract Pattern Example
    ${text}    Set Variable    Contact: john@example.com or admin@test.org
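    # Backslashes in the pattern are doubled because a single backslash is an escape character in Robot Framework data
    ${emails}    Extract Data By Pattern    ${text}    [a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}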
    # Result: ["john@example.com", "admin@test.org"]

Capture Group Example:

*** Settings ***
Library    DataProcessor


*** Test Cases ***
Extract With Group Example
    ${text}    Set Variable    Date: 2024-01-15 Time: 14:30:00
    # \\d reaches the regex engine as \d; a single backslash would be consumed by Robot Framework
    ${date}    Extract Data By Pattern    ${text}    (\\d{4}-\\d{2}-\\d{2})    group=1
    # Result: "2024-01-15"

Common Patterns:

Pattern	Description	Example
`\d+`	Numbers	`"123"` from `"Price: 123 dollars"`
`[A-Z][a-z]+`	Capitalized words	`"John"` from `"Hello John Doe"`
`\w+@\w+\.\w+`	Email addresses	`"user@example.com"`
`\$\{[^}]+\}`	Robot variables	`${variable}` from Robot Framework code
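
The three `group` modes map naturally onto Python's `re` module. The sketch below is a rough illustration rather than the library's implementation: `extract_by_pattern` is a hypothetical name, and returning an empty string when nothing matches is an assumption.

```python
import re
from typing import List, Optional, Union


def extract_by_pattern(text: str, pattern: str, group: Optional[int] = None) -> Union[str, List[str]]:
    """Extract all matches (group=None) or one group of the first match."""
    if group is None:
        # finditer + group(0) yields full-match strings even when the
        # pattern contains capture groups (re.findall would return the groups).
        return [match.group(0) for match in re.finditer(pattern, text)]
    match = re.search(pattern, text)
    # Assumption: no match yields an empty string.
    return match.group(group) if match else ""
```

Called from Python, the patterns are best written as raw strings, for example `extract_by_pattern("Price: 123 dollars", r"\d+", group=0)`, which avoids the backslash doubling needed in Robot Framework data.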

Compare Data Structures

Arguments

data1 : Union[Dict, List]

data2 : Union[Dict, List]

ignore_order : bool default: False

Return Type

Dict[str, Any]

Documentation

Compare two data structures and return differences.

Performs deep comparison of dictionaries or lists, identifying added, removed, and modified items. Useful for data validation and testing.

Arguments:

data1: First data structure (baseline)
data2: Second data structure (to compare)
ignore_order: If True, list order is ignored (default: False)

Returns: Dictionary with comparison results:

equal: Boolean indicating if structures are identical
added: Items present in data2 but not in data1
removed: Items present in data1 but not in data2
modified: Items that differ between structures

Example:

*** Settings ***
Library    DataProcessor


*** Test Cases ***
Compare Structures Example
    ${original}    Create Dictionary    name=John    age=30
    ${updated}    Create Dictionary    name=John    age=31    city=NYC
    ${diff}    Compare Data Structures    ${original}    ${updated}
    # Result: {"equal": False, "added": {"city": "NYC"}, "removed": {}, "modified": {"age": (30, 31)}}

List Comparison:

*** Settings ***
Library    DataProcessor


*** Test Cases ***
Compare Lists Example
    ${list1}    Create List    a    b    c
    ${list2}    Create List    c    b    a
    ${diff}    Compare Data Structures    ${list1}    ${list2}    ignore_order=True
    # Result: {"equal": True, ...}
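
The shape of the comparison result can be reproduced with a simplified Python sketch. It is not the library's implementation: `compare_structures` is a hypothetical name, dictionaries are diffed at the top level only (nested values are compared by equality), and the result layout for lists (index-keyed `modified`, list-valued `added` and `removed`) is an assumption.

```python
from typing import Any, Dict, List, Union


def compare_structures(
    data1: Union[Dict[str, Any], List[Any]],
    data2: Union[Dict[str, Any], List[Any]],
    ignore_order: bool = False,
) -> Dict[str, Any]:
    """Return equal/added/removed/modified for two dicts or two lists."""
    if isinstance(data1, dict) and isinstance(data2, dict):
        added = {k: v for k, v in data2.items() if k not in data1}
        removed = {k: v for k, v in data1.items() if k not in data2}
        modified = {
            k: (data1[k], data2[k])
            for k in data1
            if k in data2 and data1[k] != data2[k]
        }
    elif ignore_order:
        # Multiset comparison: cancel out each item of data2 against data1.
        remaining = list(data1)
        added = []
        for item in data2:
            if item in remaining:
                remaining.remove(item)
            else:
                added.append(item)
        removed = remaining
        modified = {}
    else:
        # Position-by-position comparison over the shared prefix.
        shared = min(len(data1), len(data2))
        added = list(data2[shared:])
        removed = list(data1[shared:])
        modified = {i: (data1[i], data2[i]) for i in range(shared) if data1[i] != data2[i]}
    return {
        "equal": not (added or removed or modified),
        "added": added,
        "removed": removed,
        "modified": modified,
    }
```

The multiset branch uses list membership rather than sets, so it also works for unhashable items such as nested dictionaries, at the cost of quadratic time.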

Group Items By Key

Arguments

items : List[Dict[str, Any]]

key : str

default_group : str default: "other"

Return Type

Dict[str, List[Dict[str, Any]]]

Documentation

Group a list of dictionaries by a common key.

Organizes a list of dictionaries into groups based on the value of a specified key. Useful for categorizing and organizing data.

Arguments:

items: List of dictionaries to group
key: Key name to group by
default_group: Group name for items missing the key (default: "other")

Returns: Dictionary mapping group names to lists of items.

Example:

*** Settings ***
Library    DataProcessor


*** Test Cases ***
Group Items Example
    ${users}    Create List
    ...    ${dict1}    # {"name": "John", "role": "admin"}
    ...    ${dict2}    # {"name": "Jane", "role": "user"}
    ...    ${dict3}    # {"name": "Bob", "role": "admin"}
    ${grouped}    Group Items By Key    ${users}    role
    # Result: {"admin": [dict1, dict3], "user": [dict2]}

Use Cases:

Organizing test results by status
Grouping API responses by category
Categorizing log entries by level
Sorting data by type or classification

Nested Key Example:

*** Settings ***
Library    DataProcessor


*** Test Cases ***
Group By Nested Key Example
    # Each item is expected to look like {"user": {"department": "IT"}}
    ${items}    Create List    ${dict1}    ${dict2}
    # Use dot notation to reach the nested key
    ${grouped}    Group Items By Key    ${items}    user.department
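
Grouping with a default bucket and dot-notation keys can be sketched in Python as follows. The function name `group_items_by_key` is hypothetical, and converting group values to strings (to match the documented `Dict[str, ...]` return type) is an assumption.

```python
from typing import Any, Dict, List


def group_items_by_key(
    items: List[Dict[str, Any]], key: str, default_group: str = "other"
) -> Dict[str, List[Dict[str, Any]]]:
    """Group dictionaries by the value found at key (dots reach nested dicts)."""
    grouped: Dict[str, List[Dict[str, Any]]] = {}
    for item in items:
        value: Any = item
        found = True
        for part in key.split("."):
            if isinstance(value, dict) and part in value:
                value = value[part]
            else:
                found = False
                break
        # Assumption: group names are the string form of the value.
        group_name = str(value) if found else default_group
        grouped.setdefault(group_name, []).append(item)
    return grouped
```

Because `setdefault` creates groups in first-seen order, the resulting dictionary preserves the order in which group values appear in the input.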

Calculate Statistics

Arguments

numbers : List[Union[int, float]]

include_percentiles : bool default: False

Return Type

Dict[str, float]

Documentation

Calculate statistical measures for a list of numbers.

Computes common statistical metrics including mean, median, min, max, and optionally percentiles. Useful for data analysis and reporting.

Arguments:

numbers: List of numeric values
include_percentiles: If True, calculate 25th, 50th, 75th, and 95th percentiles

Returns: Dictionary with statistical measures:

count: Number of values
sum: Sum of all values
mean: Average value
median: Middle value
min: Minimum value
max: Maximum value
range: Difference between max and min
percentiles: (if enabled) Dictionary with percentile values

Example:

*** Settings ***
Library    DataProcessor


*** Test Cases ***
Calculate Statistics Example
    ${scores}    Create List    85    90    78    92    88
    ${stats}    Calculate Statistics    ${scores}
    # Result: {"count": 5, "mean": 86.6, "median": 88, "min": 78, "max": 92, ...}

With Percentiles:

*** Settings ***
Library    DataProcessor


*** Test Cases ***
Statistics With Percentiles Example
    ${data}    Create List    10    20    30    40    50    60    70    80    90    100
    ${stats}    Calculate Statistics    ${data}    include_percentiles=True
    # Includes: "percentiles": {"25": 30, "50": 55, "75": 80, "95": 95}

Visualization:

Distribution: [min]----[25%]----[median]----[75%]----[max]
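
The listed measures can be computed with Python's standard `statistics` module, as in the sketch below. This is an illustration, not the library's code: `calculate_statistics` is a hypothetical name, and the percentile method (`statistics.quantiles` with `method="inclusive"`) is an assumption. Percentile values depend on the interpolation method, so the library's results may differ from this sketch.

```python
import statistics
from typing import Any, Dict, List, Union


def calculate_statistics(
    numbers: List[Union[int, float]], include_percentiles: bool = False
) -> Dict[str, Any]:
    """Compute count, sum, mean, median, min, max, range and optional percentiles."""
    if not numbers:
        raise ValueError("numbers must not be empty")
    values = [float(n) for n in numbers]
    result: Dict[str, Any] = {
        "count": len(values),
        "sum": sum(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
    }
    if include_percentiles:
        if len(values) < 2:
            raise ValueError("percentiles need at least two values")
        # n=100 returns 99 cut points; cuts[p - 1] is the p-th percentile.
        cuts = statistics.quantiles(values, n=100, method="inclusive")
        result["percentiles"] = {str(p): cuts[p - 1] for p in (25, 50, 75, 95)}
    return result
```

With the inclusive method, percentiles are linearly interpolated between the two nearest data points, treating the smallest and largest values as the 0th and 100th percentiles.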

Create Data Snapshot

Arguments

data : Any

include_metadata : bool default: True

Return Type

Dict[str, Any]

Documentation

Create a snapshot of data with optional metadata.

Captures a point-in-time snapshot of data structures, including metadata like timestamp, data type, and size information.

Arguments:

data: Data structure to snapshot (dict, list, string, etc.)
include_metadata: If True, include metadata in snapshot (default: True)

Returns: Dictionary containing snapshot data and metadata.

Example:

*** Settings ***
Library    DataProcessor


*** Test Cases ***
Create Snapshot Example
    ${data}    Create Dictionary    name=John    age=30
    ${snapshot}    Create Data Snapshot    ${data}
    # Result: {"data": {...}, "metadata": {"timestamp": "...", "type": "dict", "size": 2}}

Metadata Fields:

Field	Description
`timestamp`	ISO format timestamp
`type`	Data type (dict, list, str, etc.)
`size`	Size/length of data structure
`hash`	Hash of serialized data (for comparison)
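
A snapshot carrying the metadata fields above could be assembled as in the following Python sketch. The function name `create_snapshot`, the choice of SHA-256, UTC timestamps, and JSON serialization with sorted keys are all assumptions made for illustration; the library may serialize and hash differently.

```python
import copy
import hashlib
import json
from datetime import datetime, timezone
from typing import Any, Dict


def create_snapshot(data: Any, include_metadata: bool = True) -> Dict[str, Any]:
    """Return a deep copy of data, optionally with descriptive metadata."""
    # Deep copy so later changes to the original do not alter the snapshot.
    snapshot: Dict[str, Any] = {"data": copy.deepcopy(data)}
    if include_metadata:
        # Sorted keys make the hash independent of dictionary insertion order;
        # default=str is a fallback for values JSON cannot encode natively.
        serialized = json.dumps(data, sort_keys=True, default=str)
        snapshot["metadata"] = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "type": type(data).__name__,
            "size": len(data) if hasattr(data, "__len__") else None,
            "hash": hashlib.sha256(serialized.encode("utf-8")).hexdigest(),
        }
    return snapshot
```

Comparing the `hash` fields of two snapshots is then a cheap way to check whether the underlying data changed between two points in time.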

Use Cases:

Data versioning and comparison
Audit trails
Debugging data transformations
Testing data integrity

Need Help?

Found an issue or have a feature request for this library? Let the maintainers know by opening a new GitHub issue. Please include environment details and relevant log output to help us reproduce the problem.
