---
id: update-earlier-provenance-tracked-functions
title: Update earlier provenance tracked functions
weight: 55
description: Update functions that were created in Instabase versions earlier than June 2020 to use the new API through the Value class. Remove .tracker() in code references, except as required for Informational Methods.
---
Provenance-tracked functions from Instabase versions earlier than June 2020 must be updated to use the new API through the Value class.
What’s different
The new Value class has functions that operate on the Value and provenance tracker at the same time.
Earlier provenance-tracked functions perform string operations on the string contained in Value
and mimic these operations on the tracker object with built-in functions from the ProvenanceTracker
class, followed by setting the modified tracker back onto the Value object.
For example, an earlier provenance tracked function might look like this:
def my_substr(input_text: Value[Text], start: Value[int], end: Value[int], **kwargs) -> Value[Text]:
    start_idx = start.value()
    end_idx = end.value()
    final_string = input_text.value()[start_idx:end_idx]
    # Wrap the sliced string, then mirror the same slice on the tracker
    result = Value(final_string)
    if input_text.tracker():
        new_tracker = input_text.tracker().substring(input_text.value(), start_idx, end_idx)
        result.set_tracker(new_tracker)
    return result
In Instabase versions June 2020 and later, you can remove most provenance tracking code that performs substring, concatenation, replacement, regex search, and list tracking operations on the tracker object, and replace that code with Value functions that operate on the Value object.
Generally, most operations on a string need to deal only with the Value object, rather than accessing the underlying value or provenance tracker directly. A good rule to follow is to omit all .tracker() references in your code. The only valid use of .tracker() is with Provenance Tracking Informational Methods that still access the provenance tracker directly.
Example of built-in slice operator
With the new Value class, you can use the built-in slice operator on a Value object:
def my_substr(input_text: Value[Text], start: Value[int], end: Value[int], **kwargs) -> Value[Text]:
    start_idx = start.value()
    end_idx = end.value()
    return input_text[start_idx:end_idx]
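To make the idea behind the slice operator concrete, here is a minimal toy sketch of how a wrapper can keep a string and its provenance aligned under slicing. The `TrackedText` class and its `origins` list are hypothetical stand-ins invented for illustration; they are not Instabase's Value or ProvenanceTracker, whose internals may work differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TrackedText:
    """Toy stand-in for a provenance-tracked string (hypothetical, not Instabase's Value)."""
    text: str
    origins: List[int]  # origins[i] is the source-document index of text[i]

    @classmethod
    def from_source(cls, text: str) -> "TrackedText":
        # A freshly read value maps every character to its own position.
        return cls(text, list(range(len(text))))

    def __getitem__(self, key: slice) -> "TrackedText":
        # Apply the same slice to the text and to the origins so they stay aligned.
        if not isinstance(key, slice):
            raise TypeError("this sketch only supports slices")
        return TrackedText(self.text[key], self.origins[key])


doc = TrackedText.from_source("Invoice 12345 due")
number = doc[8:13]
print(number.text)     # 12345
print(number.origins)  # [8, 9, 10, 11, 12]
```

Because the slice is applied to both fields in one place, the caller never touches the provenance data, which is the same property that lets the converted `my_substr` drop its `.tracker()` calls.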
Example of complex operations
Use the new Value class for complex operations like joining or adding Value objects:
def strip_lines(input_text: Value[Text]) -> Value[Text]:
    lines: List[Value[Text]] = input_text.split('\n')
    lines_stripped = [line.strip() for line in lines]
    return Value.join('\n', lines_stripped)
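As a rough illustration of what split, strip, and join have to do to keep provenance intact, the following toy sketch tracks a source index for every character. The `Tracked` tuple, the helper names, and the convention that `-1` marks an inserted separator character are all assumptions made for this example, not part of the Instabase API.

```python
from typing import List, Tuple

# (text, source index of each character); -1 marks a character with no source.
Tracked = Tuple[str, List[int]]


def track(text: str) -> Tracked:
    return (text, list(range(len(text))))


def split_lines(value: Tracked) -> List[Tracked]:
    # Cut text and origins at the same newline positions.
    text, origins = value
    parts, start = [], 0
    for i, ch in enumerate(text):
        if ch == "\n":
            parts.append((text[start:i], origins[start:i]))
            start = i + 1
    parts.append((text[start:], origins[start:]))
    return parts


def strip(value: Tracked) -> Tracked:
    # Drop leading and trailing whitespace from both text and origins.
    text, origins = value
    left = len(text) - len(text.lstrip())
    right = len(text.rstrip())
    return (text[left:right], origins[left:right])


def join(sep: str, parts: List[Tracked]) -> Tracked:
    # Separator characters are new, so they get the placeholder origin -1.
    out_text, out_origins = [], []
    for n, (text, origins) in enumerate(parts):
        if n > 0:
            out_text.append(sep)
            out_origins.extend([-1] * len(sep))
        out_text.append(text)
        out_origins.extend(origins)
    return ("".join(out_text), out_origins)


stripped = join("\n", [strip(p) for p in split_lines(track("  a \n b"))])
print(stripped)  # ('a\nb', [2, -1, 6])
```

Writing this bookkeeping by hand for every function is the work that the Value functions are meant to absorb.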
Example of regex operations
In earlier versions, regex operations required extensive substring tracking. Use the new Value functions to perform regex operations:
def get_numbers(input_text: Value[Text]) -> Value[List[Value[Text]]]:
    # Each element is provenance tracked, and the resulting list is also provenance tracked
    # Raw strings keep the backslash in the pattern from being treated as a string escape
    return Value.regex_findall(r'\d+', input_text)

def remove_numbers(input_text: Value[Text]) -> Value[Text]:
    return Value.regex_sub(r'\d', ' ', input_text)
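For intuition about the substring tracking that regex operations involve, this small sketch uses only Python's standard `re` module to pair each match with the character span it came from. The helper name `findall_with_spans` is hypothetical, and the sketch says nothing about how `Value.regex_findall` is implemented.

```python
import re
from typing import List, Tuple


def findall_with_spans(pattern: str, text: str) -> List[Tuple[str, Tuple[int, int]]]:
    # re.finditer yields match objects, which expose the (start, end) span
    # of each match in the original text.
    return [(m.group(0), m.span()) for m in re.finditer(pattern, text)]


matches = findall_with_spans(r"\d+", "Order 42 has 7 items")
print(matches)  # [('42', (6, 8)), ('7', (13, 14))]
```

Each span is the piece of provenance that earlier functions had to carry onto the tracker manually for every match.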
Example of converting the strip function
Here is an example of converting the strip function from earlier versions to use the new Value class.
Earlier version strip function
For example, this original implementation of the strip function needs to be converted:
def trim(s: Value[Text], trim_char=Value(None), **kwargs) -> Value[Text]:
    # Make deep copies of trackers
    s, trim_char = make_value_deep_copies(s, trim_char)
    if trim_char.value() is not None:
        t_c = trim_char.value().strip("'")
        value = Value(s.value().strip(t_c))
        tracker = s.tracker().strip(
            s.value(), char_to_strip=t_c) if s.tracker() else None
        value.set_tracker(tracker)
        return value
    value = Value(s.value().strip())
    tracker = s.tracker().strip(s.value()) if s.tracker() else None
    value.set_tracker(tracker)
    return value
Converted strip function
This converted function uses the Value class:
def trim(s, trim_char=Value(None), **kwargs):
    # Make deep copies of trackers
    s, trim_char = make_value_deep_copies(s, trim_char)
    if trim_char is not None and trim_char.value() is not None:
        trim_char = trim_char.strip("'")
        return s.strip(trim_char.value())
    return s.strip()
However, to insert provenance information from trim_char into the result in this strip function example, access to the tracker is still required:
def trim(s, trim_char=Value(None), **kwargs):
    # Make deep copies of trackers
    s, trim_char = make_value_deep_copies(s, trim_char)
    if trim_char is not None and trim_char.value() is not None:
        trim_char = trim_char.strip("'")
        result = s.strip(trim_char.value())
    else:
        # No trim character given: strip whitespace only
        result = s.strip()
    # Inserting information from trim_char still requires direct tracker access
    if result.tracker() and trim_char is not None:
        result.tracker().insert_information_from(trim_char.tracker())
    return result