KFP: Cannot parse dictionary as input #3047

@Lejboelle

Description

I have a Kubeflow component that was compiled using the Kubeflow Pipelines SDK's func_to_container_op method. One of the component's inputs is a dictionary.

The component specification would be as follows:

name: Test dict inputs
description: Test Elyra dict inputs
inputs:
- {name: dict_input, type: JsonObject, description: some dict}
implementation:
  container:
    image: python:3.9
    command:
    - sh
    - -ec
    - |
      program_path=$(mktemp)
      printf "%s" "$0" > "$program_path"
      python3 -u "$program_path" "$@"
    - |
      def _make_parent_dirs_and_return_path(file_path: str):
          import os
          os.makedirs(os.path.dirname(file_path), exist_ok=True)
          return file_path

      def test_dict(dict_input):
          print(dict_input)
          

      import json
      import argparse
      _parser = argparse.ArgumentParser(prog='Test dict input', description='Test Elyra dict inputs')
      _parser.add_argument("--dict-input", dest="dict_input", type=json.loads, required=True, default=argparse.SUPPRESS)
      _parsed_args = vars(_parser.parse_args())

      test_dict(**_parsed_args)
    args:
    - --dict-input
    - {inputValue: dict_input}

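For reference, the argument handling in the generated program above can be reproduced in isolation with only the standard library. This is a minimal sketch (not the actual KFP SDK code) showing that the parser succeeds when the container receives the JSON payload as a single, correctly quoted argument:

```python
import argparse
import json

# Minimal reproduction of the argparse setup the KFP SDK generates
# for a JsonObject input (mirrors the component spec above).
_parser = argparse.ArgumentParser(prog='Test dict input',
                                  description='Test Elyra dict inputs')
_parser.add_argument("--dict-input", dest="dict_input",
                     type=json.loads, required=True)

# Simulates the container being invoked with:
#   --dict-input '{"test_dict": 99}'
_parsed = _parser.parse_args(["--dict-input", '{"test_dict": 99}'])

print(_parsed.dict_input)        # a real Python dict
print(type(_parsed.dict_input))
```

So the generated program itself handles JSON fine; the failure happens earlier, when Elyra renders the pipeline DSL.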
However, if I try to create a pipeline using this component and specify an input, e.g., {"test_dict": 99}, either in the Inputs box or in a text file, the compiler fails:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/elyra/pipeline/kfp/processor_kfp.py", line 287, in process
    pipeline_dsl = self._generate_pipeline_dsl(
  File "/opt/conda/lib/python3.8/site-packages/elyra/pipeline/kfp/processor_kfp.py", line 576, in _generate_pipeline_dsl
    pipeline_dsl = black.format_str(fix_code(pipeline_dsl), mode=black.FileMode())
  File "src/black/__init__.py", line 1067, in format_str
  File "src/black/__init__.py", line 1077, in _format_str_once
  File "src/black/parsing.py", line 126, in lib2to3_parse
black.parsing.InvalidInput: Cannot parse: 61:22:         dict_input="{"test_dict": 99}",

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/tornado/web.py", line 1704, in _execute
    result = await result
  File "/opt/conda/lib/python3.8/site-packages/elyra/pipeline/handlers.py", line 156, in post
    response = await PipelineProcessorManager.instance().process(pipeline)
  File "/opt/conda/lib/python3.8/site-packages/elyra/pipeline/processor.py", line 166, in process
    res = await asyncio.get_event_loop().run_in_executor(None, processor.process, pipeline)
  File "/opt/conda/lib/python3.8/asyncio/futures.py", line 260, in __await__
    yield self  # This tells Task to wait for completion.
  File "/opt/conda/lib/python3.8/asyncio/tasks.py", line 349, in __wakeup
    future.result()
  File "/opt/conda/lib/python3.8/asyncio/futures.py", line 178, in result
    raise self._exception
  File "/opt/conda/lib/python3.8/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/opt/conda/lib/python3.8/site-packages/elyra/pipeline/kfp/processor_kfp.py", line 304, in process
    raise RuntimeError(
RuntimeError: Error compiling pipeline 'elyra_pipeline' with engine 'argo'.

This happens because Elyra wraps the value in double quotes, but the JSON content itself also contains double quotes, so the generated DSL line is not valid Python (single quotes around the JSON string would work, since json.loads only cares about the quotes inside the string).
I guess Elyra is not compatible with dictionary (and, for that matter, list) inputs if the component was compiled using the KFP SDK?
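The quote collision can be demonstrated in a few lines; this is a sketch of the symptom, not Elyra's actual code. The statement in the traceback fails to parse, whereas serializing with json.dumps and letting repr() choose the outer quotes produces a valid literal that json.loads can still consume downstream:

```python
import ast
import json

dict_arg = {"test_dict": 99}

# What the traceback shows Elyra emitting: the JSON string is wrapped
# in plain double quotes, which collide with JSON's own double quotes.
broken = 'dict_input="{"test_dict": 99}"'
try:
    ast.parse(broken)
except SyntaxError as e:
    print("invalid Python:", e.msg)

# json.dumps + repr picks single outer quotes, yielding valid Python
# whose value json.loads can still parse inside the container.
fixed = f"dict_input={json.dumps(dict_arg)!r}"
print(fixed)   # dict_input='{"test_dict": 99}'
ast.parse(fixed)  # parses cleanly
```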

  • Elyra version: 3.13
  • Operating system: linux
  • Deployment type: Kubeflow [notebook server]
  • Kubeflow version: 1.6.0

Pipeline runtime environment

  • Kubeflow Pipelines
