import segments from CSV: can't adapt type 'dict' #12717


Closed
5 tasks done
Tracked by #12927
leoterry-ulrica opened this issue Jan 14, 2025 · 2 comments · Fixed by #12929

Labels
🐞 bug Something isn't working

Comments

@leoterry-ulrica
Contributor

Self Checks

  • This is only for bug report, if you would like to ask a question, please head to Discussions.
  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [FOR CHINESE USERS] Please submit issues in English, otherwise they will be closed. Thank you! :)
  • Please do not modify this template :) and fill in all the required fields.

Dify version

0.15.0

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

In QA mode, an error occurred when importing segment information in bulk by selecting a CSV file:

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/app/api/tasks/batch_create_segment_to_index_task.py", line 71, in batch_create_segment_to_index_task
    .scalar()
     ^^^^^^^^
  File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/orm/query.py", line 2805, in scalar
    ret = self.one()
          ^^^^^^^^^^
  File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/orm/query.py", line 2778, in one
    return self._iter().one()  # type: ignore
           ^^^^^^^^^^^^
  File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/orm/query.py", line 2827, in _iter
    result: Union[ScalarResult[_T], Result[_T]] = self.session.execute(
                                                  ^^^^^^^^^^^^^^^^^^^^^
  File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/orm/session.py", line 2362, in execute
    return self._execute_internal(
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/orm/session.py", line 2226, in _execute_internal
    ) = compile_state_cls.orm_pre_session_exec(
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/orm/context.py", line 561, in orm_pre_session_exec
    session._autoflush()
  File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/orm/session.py", line 3061, in _autoflush
    raise e.with_traceback(sys.exc_info()[2])
  File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/orm/session.py", line 3050, in _autoflush
    self.flush()
  File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/orm/session.py", line 4352, in flush
    self._flush(objects)
  File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/orm/session.py", line 4487, in _flush
    with util.safe_reraise():
         ^^^^^^^^^^^^^^^^^^^
  File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/util/langhelpers.py", line 146, in __exit__
    raise exc_value.with_traceback(exc_tb)
  File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/orm/session.py", line 4448, in _flush
    flush_context.execute()
  File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/orm/unitofwork.py", line 466, in execute
    rec.execute(self)
  File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/orm/unitofwork.py", line 642, in execute
    util.preloaded.orm_persistence.save_obj(
  File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/orm/persistence.py", line 93, in save_obj
    _emit_insert_statements(
  File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/orm/persistence.py", line 1233, in _emit_insert_statements
    result = connection.execute(
             ^^^^^^^^^^^^^^^^^^^
  File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 1418, in execute
    return meth(
           ^^^^^
  File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/sql/elements.py", line 515, in _execute_on_connection
    return connection._execute_clauseelement(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 1640, in _execute_clauseelement
    ret = self._execute_context(
          ^^^^^^^^^^^^^^^^^^^^^^
  File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 1846, in _execute_context
    return self._exec_single_context(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 1986, in _exec_single_context
    self._handle_dbapi_exception(
  File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 2355, in _handle_dbapi_exception
    raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
  File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 1967, in _exec_single_context
    self.dialect.do_execute(
  File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/engine/default.py", line 941, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.ProgrammingError: (raised as a result of Query-invoked autoflush; consider using a session.no_autoflush block if this flush is occurring prematurely)
(psycopg2.ProgrammingError) can't adapt type 'dict'

SQL:

[SQL: INSERT INTO document_segments (tenant_id, dataset_id, document_id, position, content, answer, word_count, tokens, index_node_id, index_node_hash, hit_count, disabled_at, disabled_by, status, created_by, updated_by, indexing_at, completed_at, error, stopped_at) VALUES (%(tenant_id)s::UUID, %(dataset_id)s::UUID, %(document_id)s::UUID, %(position)s, %(content)s, %(answer)s, %(word_count)s, %(tokens)s, %(index_node_id)s, %(index_node_hash)s, %(hit_count)s, %(disabled_at)s, %(disabled_by)s::UUID, %(status)s, %(created_by)s::UUID, %(updated_by)s::UUID, %(indexing_at)s, %(completed_at)s, %(error)s, %(stopped_at)s) RETURNING document_segments.id, document_segments.enabled, document_segments.created_at, document_segments.updated_at]
[parameters: {'tenant_id': 'e7aa71ec-5c60-4914-99ad-01cb83ae3ac7', 'dataset_id': '959f494e-75e1-4987-9072-82f7e073864f', 'document_id': 'fb6dc544-ed9d-4f6d-a99d-0382e9ba9084', 'position': 1, 'content': [{'content': '问题 1', 'answer': '答案 1'}, {'content': '问题 2', 'answer': '答案 2'}], 'answer': '答案 1', 'word_count': 6, 'tokens': 6, 'index_node_id': 'd04e97ce-09aa-4f83-a581-58bfba94f1e2', 'index_node_hash': 'cddb52bf326a94bfe8608fb815af15ce8bc2bbd5759c69e24a3de0248554cbd1', 'hit_count': 0, 'disabled_at': None, 'disabled_by': None, 'status': 'completed', 'created_by': '1507e141-470b-4719-8c4a-babf6fb730a1', 'updated_by': None, 'indexing_at': datetime.datetime(2025, 1, 14, 4, 51, 57, 968109), 'completed_at': datetime.datetime(2025, 1, 14, 4, 51, 57, 968129), 'error': None, 'stopped_at': None}]
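The parameters above show the root cause: the 'content' value is a Python list of dicts (the whole parsed QA CSV) rather than a plain string, and psycopg2 has no adapter for dict, so the autoflushed INSERT fails. A minimal sketch of a pre-insert guard that would surface this earlier (a hypothetical helper, not part of Dify's code):

```python
def validate_segment_params(params: dict) -> None:
    """Reject values psycopg2 cannot adapt before they reach the INSERT.

    Hypothetical illustration of the failure mode: 'content' must be a
    scalar (str/int/datetime/None), never a dict or a list of dicts.
    """
    for key, value in params.items():
        if isinstance(value, (dict, list)):
            raise TypeError(
                f"column {key!r} got a {type(value).__name__}; "
                "expected a scalar value"
            )


# Shape of the failing call: the whole parsed CSV landed in 'content'.
bad = {
    "content": [{"content": "question 1", "answer": "answer 1"},
                {"content": "question 2", "answer": "answer 2"}],
    "answer": "answer 1",
}
# What the schema expects: one segment, plain-text fields.
good = {"content": "question 1", "answer": "answer 1"}
```

Running the guard on `bad` raises `TypeError` for the 'content' column, which is exactly the value psycopg2 rejects with "can't adapt type 'dict'".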

✔️ Expected Behavior

In QA mode, segment information can be imported in bulk via CSV.

❌ Actual Behavior

No response


dosubot bot commented Jan 14, 2025

The error "can't adapt type 'dict'" suggests that the CSV import process is attempting to insert a dictionary directly into a database field, which is not supported. This typically happens when the data structure being inserted is not properly flattened or serialized.

In the context of the batch_create_segment_to_index_task function, the 'content' field in the CSV file should be a string, not a dictionary. Each row in the CSV should represent a single segment, and if the document form is "qa_model", it should include both "content" and "answer" as separate fields.

To resolve this issue, ensure that the CSV file is structured correctly. Each row should have separate columns for "content" and "answer" (if applicable), and these should be plain text strings, not dictionaries. The CSV processing code should extract these fields and pass them as individual strings to the database insertion logic.

Here's a brief outline of how the CSV should be structured and processed:

  1. CSV Structure: Ensure each row in the CSV has separate columns for "content" and "answer" (if applicable).

  2. Data Processing: When reading the CSV, extract the "content" and "answer" fields as strings, not as dictionaries.

  3. Database Insertion: Ensure that the data being inserted into the database is in the correct format, with 'content' and 'answer' as text fields.

By following these steps, you should be able to avoid the "can't adapt type 'dict'" error during the CSV import process.
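The three steps above can be sketched with the standard-library csv module (the column names `content` and `answer` are assumptions based on this issue; Dify's actual CSV template may differ):

```python
import csv
import io

# Hypothetical QA-mode CSV: one segment per row, plain-text columns.
csv_text = "content,answer\nquestion 1,answer 1\nquestion 2,answer 2\n"

segments = []
for row in csv.DictReader(io.StringIO(csv_text)):
    # Each field is extracted as an individual string (never a dict or
    # a list of rows), so the downstream INSERT only sees adaptable
    # scalar values.
    segments.append({"content": row["content"], "answer": row["answer"]})
```

The key point is the per-row loop: each CSV row becomes one segment with string fields, rather than the entire parsed file being assigned to a single segment's 'content' column.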


@dosubot dosubot bot added the 🐞 bug Something isn't working label Jan 14, 2025
@leoterry-ulrica
Contributor Author

demo data file:
template-test.csv
