UDFs in Flow V2
UDFs provide custom functionality to your Flow.
Adding the Apply UDF step
You can add the Apply UDF step to a Flow.
- Open an .ibflow file and click Tools > Add Step > Select Step > Apply UDF.
- Set the input and output extensions.
- Set the input and output folders.
- Add a registered custom function to the Formula field.
The Apply UDF step is added as the last step.
To place the Apply UDF step at a different position, delete the existing steps that should come after it, add the Apply UDF step, and then add the removed steps back to your Flow. Be sure to verify the associated input and output folders, scripts directories, and other supporting file structures for the steps that you move.
Pre- and post-run hooks
Custom functions that are defined in the scripts directory and registered under the appropriate custom function name can be run before and after a Flow.
- Pre-Flow UDFs run immediately after the output folder is set up.
- Post-Flow UDFs run after the entire Run Flow, Run Flows, or Run Metaflow operation completes.
The post-Flow UDFs for Metaflow work only on binary (.ibflowbin) files.
Scripts location
For Run Flow, Run Flows, and Run Metaflow, you must define the pre-Flow and post-Flow UDFs within a scripts directory. The scripts directory must be in the same folder as the Flows. This folder is usually named Workflows/.
To use the Flow root directory and the file client during a custom hook, write your UDF to accept the flow_info_json and clients parameters.
Special input variables
The flow_info_json dictionary is of type FlowInfoDict and contains runtime information about the Flow.
from typing import Text, TypedDict

FlowInfoDict = TypedDict('FlowInfoDict',
                         {'root_output_folder': Text})
- root_output_folder is the absolute path to the Run Flow/Flows/Metaflow operation’s output directory.
- input_folder is the absolute path to the input directory on which the Flow runs.
- CONFIG is a set of key-value pairs that are dynamically passed into a Flow at runtime. An example runtime config:
  {"key1": "val1", "key2": "val2"}
- clients is an object that contains a property called ibfile.
To enable forward compatibility, update your existing functions to accept arbitrary keyword arguments (**kwargs).
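The following minimal sketch illustrates why accepting **kwargs helps with forward compatibility. The keyword argument `future_option` is hypothetical and stands in for any parameter that a later release might pass. The `None` passed for `clients` is only a placeholder for local experimentation.

```python
import logging


def custom_prep_fn(flow_info_json, clients, **kwargs):
    """Hook that tolerates keyword arguments it does not know about."""
    if kwargs:
        # Unknown arguments are logged by name and otherwise ignored.
        logging.info('Ignoring extra arguments: %s', sorted(kwargs))
    return flow_info_json['root_output_folder']


def strict_prep_fn(flow_info_json, clients):
    """Same hook without **kwargs; it fails if a new argument is passed."""
    return flow_info_json['root_output_folder']


info = {'root_output_folder': '/tmp/out'}

# 'future_option' is a hypothetical argument, not part of any documented API.
result = custom_prep_fn(info, None, future_option=True)

try:
    strict_prep_fn(info, None, future_option=True)
    strict_failed = False
except TypeError:
    # Python rejects the unexpected keyword argument.
    strict_failed = True
```

The version with **kwargs keeps working when an extra argument appears, while the strict version raises a TypeError.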
Logging in UDF
Use Python’s standard logging library to log messages from an Apply UDF step, a pre-Flow UDF, or a post-Flow UDF. Logs appear in Flow Dashboard. To see only the logs from UDFs, select the “Show Developer Logs Only” option.
Flow logs currently have a default size limit of 20 MB per job ID. As a good practice, avoid logging binary values (like images), entire IBDOCs, or extraction results that might contain PII. Logs are stored in the file system.
Previously, UDFs logged through the LOGGER object from the function context. Although LOGGER is still supported, we recommend using Python’s logging library directly.
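As a sketch of the guidance above, the following UDF logs through Python's standard `logging` module and records only a short summary of a value rather than its full contents. The helper `summarize_for_log`, the logger name `my_udf`, and the 200-character cap are illustrative choices, not platform requirements.

```python
import logging

logger = logging.getLogger('my_udf')  # hypothetical logger name

MAX_LOG_CHARS = 200  # illustrative cap, not a platform limit


def summarize_for_log(value, limit=MAX_LOG_CHARS):
    """Return a short, log-safe description of value."""
    if isinstance(value, (bytes, bytearray)):
        # Never log raw binary data such as images; log only the size.
        return '<{} bytes of binary data>'.format(len(value))
    text = str(value)
    if len(text) > limit:
        # Truncate long text and note the original length.
        return text[:limit] + '... ({} chars total)'.format(len(text))
    return text


def custom_prep_fn(flow_info_json, clients, **kwargs):
    logger.info('Pre-Flow UDF started, output folder: %s',
                summarize_for_log(flow_info_json['root_output_folder']))
```

Truncation keeps log volume down but does not remove PII from the logged prefix, so sensitive values are best left out of log calls entirely.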
Pre-Flow UDF example
Sample pre-run UDF:
import logging

def custom_prep_fn(flow_info_json, clients, **kwargs):
    logging.info('Flow info json %s', flow_info_json)
    root_out = flow_info_json['root_output_folder']
    logging.info('Custom UDF started')
    # Write a marker file into the root output folder when a client is available.
    if clients:
        clients.ibfile.write_file(root_out + '/custom_output.txt', root_out)
    logging.info('Custom UDF ended')

def register(name_to_fn):
    # Map the custom function name to the hook implementation.
    more_fns = {
        'custom_fn_name': {
            'fn': custom_prep_fn,
            'ex': '',
            'desc': ''
        }
    }
    name_to_fn.update(more_fns)
Replace the custom_fn_name key with one of these function names for the desired run type:
Run type | Custom function name |
---|---|
Run Flow | flow_custom_prep |
Run Flows | multiflow_custom_prep |
Run Metaflow | metaflow_custom_prep |
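Before deploying a hook, it can help to exercise it locally. The sketch below calls a pre-Flow UDF like the sample above with an in-memory stand-in for `clients`. `FakeIbfile` and `FakeClients` are invented for this illustration and mimic only the `write_file(path, content)` call used in the sample; the real `ibfile` client may return values or report errors differently.

```python
import logging


class FakeIbfile:
    """Hypothetical in-memory stand-in for clients.ibfile (testing only)."""

    def __init__(self):
        self.files = {}

    def write_file(self, path, content):
        # Record the write in a dict instead of touching a file system.
        self.files[path] = content


class FakeClients:
    """Hypothetical stand-in that exposes only the ibfile property."""

    def __init__(self):
        self.ibfile = FakeIbfile()


def custom_prep_fn(flow_info_json, clients, **kwargs):
    root_out = flow_info_json['root_output_folder']
    logging.info('Custom UDF started')
    if clients:
        clients.ibfile.write_file(root_out + '/custom_output.txt', root_out)
    logging.info('Custom UDF ended')


fake_clients = FakeClients()
custom_prep_fn({'root_output_folder': '/tmp/flow_out'}, fake_clients)
written = fake_clients.ibfile.files
```

After the call, `written` holds the single file that the hook would have created, keyed by its path.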
Post-Flow UDF example
Sample post-run UDF:
import logging

def custom_post_fn(flow_info_json, clients, **kwargs):
    logging.info('Flow info json %s', flow_info_json)
    # Without a file client there is nothing to read or write.
    if not clients:
        return
    root_out = flow_info_json['root_output_folder']
    # Copy the class output folder listing and record the root output path.
    classes = clients.ibfile.read_file(root_out + '/class_output_folders.json')
    clients.ibfile.write_file(root_out + '/class_output_folders_new.json', classes)
    clients.ibfile.write_file(root_out + '/root_output_folder.txt', root_out)
    logging.info('Written files successfully')

def register(name_to_fn):
    # Map the custom function name to the hook implementation.
    more_fns = {
        'custom_fn_name': {
            'fn': custom_post_fn,
            'ex': '',
            'desc': ''
        }
    }
    name_to_fn.update(more_fns)
Replace the custom_fn_name key with a custom function name. The following table shows the run type that corresponds to each custom function name:
Run type | Custom function name |
---|---|
Run Flow | flow_custom_finish |
Run Flows | multiflow_custom_finish |
Run Metaflow | metaflow_custom_finish |
Because the post-Flow and pre-Flow function names are different, you can register both UDF types at the same time so they can be executed for the appropriate run type at the appropriate time.
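A minimal sketch of registering both hooks in one `register` function, using the Run Flow names from the tables above. The hook bodies are placeholders, and the plain dict passed to `register` at the end is only a local stand-in for whatever registry the platform supplies.

```python
import logging


def custom_prep_fn(flow_info_json, clients, **kwargs):
    logging.info('Pre-Flow hook for %s', flow_info_json['root_output_folder'])


def custom_post_fn(flow_info_json, clients, **kwargs):
    logging.info('Post-Flow hook for %s', flow_info_json['root_output_folder'])


def register(name_to_fn):
    # Both hooks use the Run Flow names; swap in the multiflow_* or
    # metaflow_* names from the tables for the other run types.
    more_fns = {
        'flow_custom_prep': {'fn': custom_prep_fn, 'ex': '', 'desc': ''},
        'flow_custom_finish': {'fn': custom_post_fn, 'ex': '', 'desc': ''},
    }
    name_to_fn.update(more_fns)


# Local check with a plain dict standing in for the registry.
registry = {}
register(registry)
```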