Commit 260c61e

dristysrivastava and dristy.cd authored
Updated doc for 0.1.18 version (#506)
* Updated doc for 0.1.18 version

---------

Co-authored-by: dristy.cd <[email protected]>
1 parent 2af69bc commit 260c61e

6 files changed: +33 -33 lines changed

docs/gh_pages/docs/entityclassifier.md

+1
@@ -18,6 +18,7 @@ Below is the list of `entities` supported by Pebblo -
 1. US Bank Account Number
 1. IBAN Code
 1. US ITIN
+1. IP Address
 1. GitHub Access Token
 1. Slack Access Token
 1. AWS Access Key

docs/gh_pages/docs/pebblo_ui.md

+1 -1
@@ -61,4 +61,4 @@ Load History provides details about latest 5 loads of this app. It provides the

 It will also provide you with a list of these Datasource, accompanied by additional details such as the size, source path, the count of topics & entities across the datasource.

-4. **Snippets**: This sections provides the actual text inspected by the Pebblo Server using the Pebblo Topic Classifier and Pebblo Entity Classifier. This will be useful to quickly inspect and remediate text that should not be ingested into the Gen-AI RAG application. Each snippet shows the exact file the snippet is loaded from easy remediation.
+4. **Snippets**: This section details the text analyzed by the Pebblo Server using the Pebblo Topic Classifier and Pebblo Entity Classifier. It is designed to help quickly inspect and remediate text that should not be ingested into the Gen-AI RAG application. Each snippet shows the exact file for easy reference, with sensitive information labeled with confidence scores: HIGH, MEDIUM, or LOW.

docs/gh_pages/docs/safe_loader.md

+1 -1
@@ -65,4 +65,4 @@ Load History provides details about latest 5 loads of this app. It provides the

 It will also provide you with a list of these Datasource, accompanied by additional details such as the size, source path, the count of topics & entities across the datasource.

-4. **Snippets**: This sections provides the actual text inspected by the Pebblo Server using the Pebblo Topic Classifier and Pebblo Entity Classifier. This will be useful to quickly inspect and remediate text that should not be ingested into the Gen-AI RAG application. Each snippet shows the exact file the snippet is loaded from easy remediation.
+4. **Snippets**: This section details the text analyzed by the Pebblo Server using the Pebblo Topic Classifier and Pebblo Entity Classifier. It is designed to help quickly inspect and remediate text that should not be ingested into the Gen-AI RAG application. Each snippet shows the exact file for easy reference, with sensitive information labeled with confidence scores: HIGH, MEDIUM, or LOW.

docs/gh_pages/docs/topicclassifier.md

+11 -10
@@ -8,21 +8,22 @@ and improvements to enrich its accuracy and effectiveness.

 Below is the list of `topics` supported by Pebblo -

+1. Medical Advice
+1. Harmful Advice
 1. Board Meeting
-1. Enterprise Agreement
-1. Patent Application Filling
-1. Financial Report
-1. Loan and Security Agreement
 1. Consulting Agreement
-1. Sexual Harassment
-1. Settlement Agreement
-1. Price List
-1. Distribution/Partner Agreement
 1. Customer List
+1. Enterprise Agreement
 1. Executive Severance Agreement
-1. Employee Agreement
+1. Financial Report
+1. Loan And Security Agreement
 1. Merger Agreement
-1. Non-Disclosure Agreement
+1. Patent Application Fillings
+1. Price List
+1. Employee Agreement
+1. Sexual Content
+1. Sexual Incident Report
+1. Internal Product Roadmap Agreement

 User can get details of classified topics for their loader source files in Pebblo report.
 Different sections of Pebblo report such as , `Top Files With Most Findings`, `Data Source Findings Table` and `Snippets` helps to get overview of pebblo topic classifier output for user's rag application.

pebblo/entity_classifier/README.md

+2
@@ -10,6 +10,7 @@ Currently, we are supporting following Entities:
 5. US Bank Account Number
 6. IBAN code
 7. US ITIN
+8. IP Address

 And following Secret Entities:
 1. Github Token
@@ -28,4 +29,5 @@ entities, total_count, anonymized_text, entity_details = entity_classifier_obj.p
 print(f"Entity Group: {entity_groups}")
 print(f"Entity Count: {total_entity_count}")
 print(f"Anonymized Text: {anonymized_text}")
+print(f"Entity Details: {entity_details}")
 ```

pebblo/topic_classifier/README.md

+17 -21
@@ -2,27 +2,22 @@

 This is Topic Classifier.
 Currently, we are supporting following Topics:
-1. Normal Advice
-2. Medical Advice
-3. Harmful Advice
-4. Board Meeting
-5. Consulting Agreement
-6. Customer List
-7. Distribution/Partner Agreement
-8. Enterprise License Agreement
-9. Executive Severance Agreement
-10. Financial Report
-11. Internal Use Only
-12. Loan And Security Agreement
-13. Merger Agreement
-14. NDA
-15. Patent Application Fillings
-16. Price List
-17. Settlement Agreement
-18. Employee Agreement
-19. Enterprise Agreement
-20. Sexual Content
-21. Sexual Incident Report
+1. Medical Advice
+1. Harmful Advice
+1. Board Meeting
+1. Consulting Agreement
+1. Customer List
+1. Enterprise Agreement
+1. Executive Severance Agreement
+1. Financial Report
+1. Loan And Security Agreement
+1. Merger Agreement
+1. Patent Application Fillings
+1. Price List
+1. Employee Agreement
+1. Sexual Content
+1. Sexual Incident Report
+1. Internal Product Roadmap Agreement

 ## How to use

@@ -34,4 +29,5 @@ topic_classifier_obj = TopicClassifier()
 topics, total_topic_count, topic_details = topic_classifier_obj.predict(text)
 print(f"Topic Response: {topics}")
 print(f"Topic Count: {total_topic_count}")
+print(f"Topic Details: {topic_details}")
 ```
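
Note on the `pebblo/entity_classifier/README.md` hunk: in the excerpt shown, the context line unpacks the result into `entities` and `total_count`, while the prints reference `entity_groups` and `total_entity_count`. Unless the full README defines those names elsewhere, the snippet would raise a `NameError`. Below is a minimal sketch of the same unpack-and-print pattern with consistent names. `fake_predict` is a hypothetical stand-in so the example runs without Pebblo installed, and the shape of every value it returns is an assumption made for illustration, not Pebblo's actual output.

```python
from typing import Dict, List, Tuple


def fake_predict(text: str) -> Tuple[Dict[str, int], int, str, Dict[str, List[dict]]]:
    """Hypothetical stand-in for a classifier's predict call.

    The return shapes here are assumptions for illustration only.
    """
    marker = "203.0.113.7"  # documentation-range IP used as sample input
    found = marker in text
    entities = {"ip-address": 1} if found else {}
    total_count = sum(entities.values())
    anonymized_text = text.replace(marker, "<IP_ADDRESS>")
    entity_details = {"ip-address": [{"confidence_score": "HIGH"}]} if found else {}
    return entities, total_count, anonymized_text, entity_details


text = "Server reachable at 203.0.113.7"

# Unpack once, then print using exactly the names that were unpacked.
entities, total_count, anonymized_text, entity_details = fake_predict(text)
print(f"Entity Group: {entities}")
print(f"Entity Count: {total_count}")
print(f"Anonymized Text: {anonymized_text}")
print(f"Entity Details: {entity_details}")
```

The topic classifier hunk already follows this pattern: it prints `topics`, `total_topic_count`, and `topic_details`, the same names it unpacks.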
