Commit 7a49bec

kslohith (Lohith K S) and Lohith K S authored
Verified that issue #1067 is resolved and added documentation for load pdf functionality. (#1343)
Issue #1067, about not being able to load PDF files, was verified to be working with the EvaDB documentation PDF, and a new page for loading PDFs was added to the documentation.

Co-authored-by: Lohith K S <[email protected]>
1 parent bb45db4 commit 7a49bec

2 files changed: 17 additions, 0 deletions

docs/_toc.yml

Lines changed: 1 addition & 0 deletions
@@ -45,6 +45,7 @@ parts:
 - file: source/reference/evaql/load_csv
 - file: source/reference/evaql/load_image
 - file: source/reference/evaql/load_video
+- file: source/reference/evaql/load_pdf
 - file: source/reference/evaql/select
 - file: source/reference/evaql/explain
 - file: source/reference/evaql/show_functions
Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+LOAD PDF
+==========
+
+.. _load-pdf:
+
+.. code:: mysql
+
+   LOAD PDF 'test_pdf.pdf' INTO MyPDFs;
+
+PDFs can be directly imported into a table, where the PDF document is segmented into pages and paragraphs.
+Each row in the table corresponds to a paragraph extracted from the PDF, and the resulting table includes columns for ``name``, ``page``, ``paragraph``, and ``data``.
+
+| ``name`` signifies the title of the uploaded PDF.
+| ``page`` signifies the specific page number from which the data is retrieved.
+| ``paragraph`` signifies the individual paragraph within a page from which the data is extracted.
+| ``data`` refers to the text extracted from the paragraph on the given page.
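
The row layout described in the new page can be pictured with a small, self-contained Python sketch. This is only an illustration of the one-row-per-paragraph shape, not EvaDB's implementation. The helper `pdf_text_to_rows` is hypothetical. The sketch assumes the page text has already been extracted, that paragraphs are separated by blank lines, and that page and paragraph numbers start at 1; none of these details are confirmed by the commit.

```python
from typing import Dict, List, Union

Row = Dict[str, Union[str, int]]


def pdf_text_to_rows(name: str, pages: List[str]) -> List[Row]:
    """Build one row per paragraph from already-extracted page text.

    Hypothetical helper for illustration only; it does not read a PDF
    and is not part of EvaDB.
    """
    rows: List[Row] = []
    # Assumption: page numbers start at 1.
    for page_no, page_text in enumerate(pages, start=1):
        # Assumption: paragraphs are separated by blank lines.
        paragraphs = [p.strip() for p in page_text.split("\n\n") if p.strip()]
        # Assumption: paragraph numbers restart at 1 on every page.
        for para_no, text in enumerate(paragraphs, start=1):
            rows.append(
                {"name": name, "page": page_no, "paragraph": para_no, "data": text}
            )
    return rows


# Two pages of made-up text standing in for extracted PDF content.
pages = [
    "Intro paragraph.\n\nSecond paragraph.",
    "Only paragraph on page two.",
]
rows = pdf_text_to_rows("test_pdf.pdf", pages)
for row in rows:
    print(row)
```

Each dictionary mirrors one table row with the ``name``, ``page``, ``paragraph``, and ``data`` columns, so a two-page document with three paragraphs yields three rows.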
