Log improvements in pebblo-server. #124

Merged
merged 10 commits into daxa-ai:main on Feb 13, 2024

rahul-trip
Contributor

No description provided.

@rahul-trip
Contributor Author

@shreyas-damle / @srics

Kindly review.

@rahul-trip
Contributor Author

logs:

Downloading models if needed...                                                                                                                                             
 30%|████████████████████████████████████████▊                                                                                               | 3/10 [00:05<00:13,  1.87s/it]
Topic Classifier Initializing.                                                                                                                                              
Topic Classifier Initialized...                                                                                                                                             
Entity Classifier Initializing.                                                                                                                                             
Entity Classifier Initialized...                                                                                                                                            
Pebblo server Starting.                                                                                                                                                     
Pebblo server Running. Hi!                                                                                                                                                  
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:08<00:00,  1.16it/s]
INFO:     Started server process [817071]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://localhost:8000 (Press CTRL+C to quit)
INFO:     127.0.0.1:36326 - "POST /v1/app/discover HTTP/1.1" 200 OK
INFO:     127.0.0.1:36336 - "POST /v1/loader/doc HTTP/1.1" 200 OK
^CINFO:     Shutting down
INFO:     Waiting for application shutdown.
INFO:     Application shutdown complete.
INFO:     Finished server process [817071]
Pebblo server Stopped. BYE!

Comment on lines +4 to +5
from io import StringIO
from tqdm import tqdm
Collaborator

I suppose all three libraries are either part of the standard library in Python 3.9+ or already installed as a dependency of another package, so there is no need to add them to requirements.txt/pyproject.toml.

Contributor Author

Yes, that's right.
io is part of the standard library, and tqdm comes in as a dependency in the current environment, so we should be good.


# running local server
uvicorn.run(app, host="localhost", port=8000, log_level="info")
p_bar.write("Pebblo server Stopped. BYE!")
Collaborator

I see four sections of logs overall: downloading the models, Topic Classifier initialization, Entity Classifier initialization, and starting the Pebblo server. Is it possible to move these sections into separate functions so that the code is more readable? We have some changes to this file coming soon.

Contributor Author

Hi @shreyas-damle
I can't think of a way to move the scattered logging into a common function, since it is all written to stdout. We advance the progress bar and print a status message one step at a time, which is why there are this many log calls.

However, I think we can have a common function that updates the progress bar and prints the status message, with both passed in as arguments. That would make the code a bit more readable.

What do you think? Please suggest.
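
A minimal sketch of the common helper proposed above, assuming the progress bar is a tqdm instance (or any object exposing `write` and `update`). The names `report_step`, `advance`, and `demo`, as well as the step sizes, are hypothetical and not taken from the PR:

```python
def report_step(p_bar, message, advance=1):
    """Print a status line above the bar, then move the bar forward.

    `p_bar` is assumed to expose tqdm's `write` and `update` methods.
    `report_step` and `advance` are illustrative names, not the PR's code.
    """
    p_bar.write(message)   # prints the message without breaking the bar
    p_bar.update(advance)  # advance the bar by `advance` units


def demo():
    # Imported lazily so the helper itself has no hard dependency on tqdm.
    from tqdm import tqdm

    with tqdm(total=10) as p_bar:
        report_step(p_bar, "Downloading models if needed...", advance=3)
        report_step(p_bar, "Topic Classifier Initialized...", advance=3)
        report_step(p_bar, "Entity Classifier Initialized...", advance=3)
        report_step(p_bar, "Pebblo server Running. Hi!", advance=1)
```

Keeping the message and the bar update in one call means each startup stage is a single line at the call site.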

Contributor Author

Hi @shreyas-damle
I have made changes in the daemon file as per our recent discussion.

@shreyas-damle
Collaborator

@rahul-trip Overall, the logs are better now, but the API info logs are missing.

Signed-off-by: Rahul Tripathi <[email protected]>
@rahul-trip
Contributor Author

Latest output:

(.venv311) [email protected]@BAN-LAP-TRIPA:~/repos/11/pebblo/dist$ pebblo
Downloading models if needed...                                                                                                                                             
Topic Classifier Initializing.                                                                                                                                              
Topic Classifier Initialized...                                                                                                                                             
Entity Classifier Initializing.                                                                                                                                             
Entity Classifier Initialized...                                                                                                                                            
Pebblo server Starting.                                                                                                                                                     
Pebblo server Running. Hi!                                                                                                                                                  
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:07<00:00,  1.32it/s]
INFO:     Started server process [842707]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://localhost:8000 (Press CTRL+C to quit)
INFO:     127.0.0.1:38094 - "POST /v1/app/discover HTTP/1.1" 200 OK
INFO:     127.0.0.1:38110 - "POST /v1/loader/doc HTTP/1.1" 200 OK
INFO:     127.0.0.1:39854 - "POST /v1/app/discover HTTP/1.1" 200 OK
INFO:     127.0.0.1:39870 - "POST /v1/loader/doc HTTP/1.1" 200 OK
^CINFO:     Shutting down
INFO:     Waiting for application shutdown.
INFO:     Application shutdown complete.
INFO:     Finished server process [842707]
Pebblo server Stopped. BYE!

@rahul-trip
Contributor Author

Screencast.from.09-02-24.12.08.02.AM.IST.webm

Also:
Screenshot from 2024-02-09 00-09-20

We need to suppress logs from router initialisation because it prints the torch package's warnings (see below): the router imports the TopicClassifier class, which in turn imports from presidio, and so on.

If we don't suppress stdout and stderr during router initialisation, we get the following logs:


Downloading models if needed...                                                                                                                                           
  0%|                                                                                                                                              | 0/10 [00:00<?, ?it/s]/home/ad.msystechnologies.com/rahul.tripathi/repos/.venv311/lib/python3.11/site-packages/torch/cuda/__init__.py:628: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
/home/ad.msystechnologies.com/rahul.tripathi/repos/.venv311/lib/python3.11/site-packages/torch/cuda/__init__.py:758: UserWarning: CUDA initialization: The NVIDIA driver on your system is too old (found version 11040). Please update your GPU driver by downloading and installing a new version from the URL: http://www.nvidia.com/Download/index.aspx Alternatively, go to: https://pytorch.org to install a PyTorch version that has been compiled with your version of the CUDA driver. (Triggered internally at ../c10/cuda/CUDAFunctions.cpp:108.)
  return torch._C._cuda_getDeviceCount() if nvml_count < 0 else nvml_count
/home/ad.msystechnologies.com/rahul.tripathi/repos/.venv311/lib/python3.11/site-packages/transformers/utils/generic.py:441: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
/home/ad.msystechnologies.com/rahul.tripathi/repos/.venv311/lib/python3.11/site-packages/transformers/utils/generic.py:309: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
Some weights of the model checkpoint at daxa-ai/pebblo-classifier were not used when initializing DistilBertForSequenceClassification: ['pre_classifier.lora_A.default.weight', 'classifier.lora_A.default.weight', 'classifier.lora_B.default.weight', 'pre_classifier.lora_B.default.weight']
- This IS expected if you are initializing DistilBertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing DistilBertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
/home/ad.msystechnologies.com/rahul.tripathi/repos/.venv311/lib/python3.11/site-packages/transformers/pipelines/text_classification.py:105: UserWarning: `return_all_scores` is now deprecated,  if want a similar functionality use `top_k=None` instead of `return_all_scores=True` or `top_k=1` instead of `return_all_scores=False`.
  warnings.warn(
Topic Classifier Initializing.                                                                                                                                            
 30%|████████████████████████████████████████▏                                                                                             | 3/10 [00:05<00:13,  1.96s/it]Some weights of the model checkpoint at daxa-ai/pebblo-classifier were not used when initializing DistilBertForSequenceClassification: ['pre_classifier.lora_A.default.weight', 'classifier.lora_A.default.weight', 'classifier.lora_B.default.weight', 'pre_classifier.lora_B.default.weight']
- This IS expected if you are initializing DistilBertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing DistilBertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Topic Classifier Initialized...                                                                                                                                           
Entity Classifier Initializing.                                                                                                                                           
Entity Classifier Initialized...                                                                                                                                          
Pebblo server Starting.                                                                                                                                                   
Pebblo server Running. Hi!                                                                                                                                                
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:09<00:00,  1.05it/s]
INFO:     Started server process [849771]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://localhost:8000 (Press CTRL+C to quit)
2024-02-09 00:15:36,623 - Pebblo Logger - INFO - App Discover Request Processed Successfully
INFO:     127.0.0.1:48506 - "POST /v1/app/discover HTTP/1.1" 200 OK
2024-02-09 00:15:50,411 - Pebblo Logger - INFO - PDF report generated at : /home/ad.msystechnologies.com/rahul.tripathi/.pebblo/csv_app_1/pebblo_report.pdf
2024-02-09 00:15:50,411 - Pebblo Logger - INFO - Loader Doc request Request processed successfully.
INFO:     127.0.0.1:48514 - "POST /v1/loader/doc HTTP/1.1" 200 OK



^CINFO:     Shutting down
INFO:     Waiting for application shutdown.
INFO:     Application shutdown complete.
INFO:     Finished server process [849771]
Pebblo server Stopped. BYE!
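
The suppression of stdout and stderr described above can be sketched with the standard library's `contextlib` redirects and a `StringIO` buffer. This is only an illustration under assumptions: `run_quietly` and `noisy_import` are hypothetical names, and `noisy_import` is a stand-in for the real router import, not the PR's code:

```python
import sys
from contextlib import redirect_stderr, redirect_stdout
from io import StringIO


def run_quietly(step):
    """Run `step()` with stdout and stderr captured instead of printed.

    Returns the step's result plus whatever it wrote, so nothing is lost.
    `run_quietly` is a hypothetical helper, not the PR's implementation.
    """
    out, err = StringIO(), StringIO()
    with redirect_stdout(out), redirect_stderr(err):
        result = step()
    return result, out.getvalue(), err.getvalue()


def noisy_import():
    # Stand-in for the router import that triggers library warnings.
    print("Some weights of the model checkpoint were not used")
    print("UserWarning: Can't initialize NVML", file=sys.stderr)
    return "router"


router, captured_out, captured_err = run_quietly(noisy_import)
```

Note that these redirects only capture Python-level writes to `sys.stdout` and `sys.stderr`; output written directly to the underlying file descriptors by native code is not affected.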

@rahul-trip
Contributor Author

Hi @shreyas-damle
As you suggested, we can actually use uvicorn's logging instance to send our logs.
See: #128

We can simply send the logs to the uvicorn logging instance, as it has a context of its own. 🙌

cc: @srics
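
A rough sketch of sending application messages through uvicorn's logger, as described above. It assumes the `uvicorn.error` logger name, which uvicorn uses for its general server messages; `log_status` is a hypothetical helper, and the actual change in #128 may look different:

```python
import logging

# uvicorn configures loggers named "uvicorn", "uvicorn.error" and
# "uvicorn.access"; "uvicorn.error" carries its general server messages.
uvicorn_logger = logging.getLogger("uvicorn.error")


def log_status(message):
    """Emit an application message through uvicorn's logger (hypothetical helper)."""
    uvicorn_logger.info(message)
```

Messages sent this way are only emitted once a handler and level are configured for the logger, which `uvicorn.run` normally does at startup; INFO messages logged before that point are likely to be dropped under Python's default logging settings.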

@rahul-trip
Contributor Author

rahul-trip commented Feb 8, 2024

Another option to print our logs would be to create another context with stdout and stderr restored whenever we need to print the logs.
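
This alternative can be sketched as a pair of nested context managers: an outer one that silences both streams, and an inner one that temporarily restores them for our own messages. The names `suppressed` and `restored` are hypothetical, and this is an illustration rather than the code in the PR:

```python
import sys
from contextlib import contextmanager, redirect_stderr, redirect_stdout
from io import StringIO


@contextmanager
def suppressed():
    """Silence stdout/stderr; yield a context factory that restores them."""
    real_out, real_err = sys.stdout, sys.stderr

    @contextmanager
    def restored():
        # Point both streams back at the originals for the inner block only.
        with redirect_stdout(real_out), redirect_stderr(real_err):
            yield

    with redirect_stdout(StringIO()), redirect_stderr(StringIO()):
        yield restored


with suppressed() as restored:
    print("library noise")                        # swallowed
    with restored():
        print("Topic Classifier Initialized...")  # visible
    print("more library noise")                   # swallowed again
```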

@rahul-trip
Contributor Author

rahul-trip commented Feb 13, 2024

Latest video:

Screencast.from.13-02-24.01.34.45.PM.IST.webm

@srics / @shreyas-damle

Collaborator

@shreyas-damle shreyas-damle left a comment

Changes look good to me.

@shreyas-damle shreyas-damle merged commit 8915843 into daxa-ai:main Feb 13, 2024