
[QEff Finetune]: Use logger in place of print statements in finetuning scripts #371


Draft · wants to merge 2 commits into main

Conversation

quic-mamta (Contributor)

Use logger in place of print statements in finetuning scripts

@quic-mamta requested review from quic-swatia and vbaddi and removed the request for ochougul and quic-rishinr on April 21, 2025
@quic-mamta self-assigned this on April 21, 2025
@vbaddi (Contributor) left a comment

Can we merge this?

@quic-meetkuma (Contributor) left a comment

Overall, it is a good change, Mamta! Let us try to include more things as suggested.


 # Try importing QAIC-specific module, proceed without it if unavailable
 try:
     import torch_qaic  # noqa: F401
 except ImportError as e:
-    print(f"Warning: {e}. Proceeding without QAIC modules.")
+    logger.warning(f"{e}. Moving ahead without these qaic modules.")
Contributor left a comment

As we discussed offline, please set the logger level to INFO in this file before using it, and also add a file handler to the logger.

What I suggest in this case: define a function in the finetuning code that takes the logger from "from QEfficient.utils.logging_utils import logger", applies the log level, and attaches the file handler, and then returns the finetuning logger. All files in the finetuning code can then call that function to get the updated logger instance for all the finetuning use cases. Let us brainstorm along these lines.
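For instance, a minimal sketch of such a helper (the function name get_finetune_logger, the log file path, and the format string are illustrative assumptions, not something settled in this PR):

 import logging

 from QEfficient.utils.logging_utils import logger

 def get_finetune_logger(log_file="finetune.log", level=logging.INFO):
     # Hypothetical helper: configure the shared QEfficient logger once
     # and hand back the same instance to every finetuning module.
     logger.setLevel(level)
     # Guard against attaching duplicate file handlers on repeated calls.
     if not any(isinstance(h, logging.FileHandler) for h in logger.handlers):
         fh = logging.FileHandler(log_file)
         fh.setFormatter(
             logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
         )
         logger.addHandler(fh)
     return logger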

@@ -63,6 +65,6 @@ def get_data_collator(dataset_processer, dataset_config):
     try:
         return getattr(module, func_name)(dataset_processer)
     except AttributeError:
Contributor left a comment

This applies to all the cases like L62.

Whenever we raise an exception, it is good to log it as well; otherwise the dumped file logs paint a different picture than the console logs.

We can subclass the logger to add a custom log-and-raise method, as below.

 import logging

 class CustomLogger(logging.Logger):
     def raise_runtimeerror(self, message):
         # Log the error first so file and console logs stay in sync,
         # then raise it.
         self.error(message)
         raise RuntimeError(message)

 # Install the subclass before any getLogger() calls so new loggers use it.
 logging.setLoggerClass(CustomLogger)
 logger = logging.getLogger(__name__)
 logging.basicConfig(level=logging.DEBUG, filename='app.log', filemode='a',
                     format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')
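Call sites that currently log and then raise separately could collapse into one call; for example (a sketch reusing a message from this diff, assuming the CustomLogger above is installed before the first getLogger call):

 # Logs the error through the configured handlers and raises it, so the
 # dumped file log and the console tell the same story. Note this sketch
 # raises RuntimeError, while the original code raised ValueError.
 logger.raise_runtimeerror(f"Config '{config_name}' does not have parameter: '{param_name}'")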

What do you say?

@@ -20,7 +22,7 @@ def __init__(self, tokenizer, csv_name=None, context_length=None):
                 delimiter=",",
             )
         except Exception as e:
-            print(
+            logger.error(
                 "Loading of grammar dataset failed! Please see [here](https://github.com/meta-llama/llama-recipes/blob/main/src/llama_recipes/datasets/grammar_dataset/grammar_dataset_process.ipynb) for details on how to download the dataset."
Contributor left a comment

I think the message content is incorrect. The "[here](url)" style is suitable for markdown output; on the console it is not needed. Print the URL directly.
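Something along these lines would read better on the console (a sketch; the exact wording is up to the authors):

 logger.error(
     "Loading of grammar dataset failed! Please see "
     "https://github.com/meta-llama/llama-recipes/blob/main/src/llama_recipes/datasets/grammar_dataset/grammar_dataset_process.ipynb "
     "for details on how to download the dataset."
 )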

@@ -53,8 +54,7 @@ def update_config(config, **kwargs):
                 raise ValueError(f"Config '{config_name}' does not have parameter: '{param_name}'")
             else:
                 config_type = type(config).__name__
-                # FIXME (Meet): Once logger is available put this in debug level.
-                print(f"[WARNING]: Unknown parameter '{k}' for config type '{config_type}'")
+                logger.warning(f"Unknown parameter '{k}' for config type '{config_type}'")
Contributor left a comment

Shall we use logger.debug here?

@quic-meetkuma (Contributor) left a comment

Address this one as well.

@@ -113,26 +114,26 @@ def train(
     for epoch in range(train_config.num_epochs):
         if loss_0_counter.item() == train_config.convergence_counter:
             if train_config.enable_ddp:
-                print(
+                logger.info(
Contributor left a comment

One important change is required: in the DDP case, we need to log to the console only for rank == 0. Otherwise a 64x or 48x DDP run will fill the console with a lot of output, which is not user friendly.
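One possible shape for that guard, assuming torch.distributed backs the DDP setup here (the helper name and the message are illustrative):

 import torch.distributed as dist

 def is_rank_zero():
     # Treat non-distributed runs as rank 0 so single-device logging is unchanged.
     return not (dist.is_available() and dist.is_initialized()) or dist.get_rank() == 0

 if is_rank_zero():
     logger.info("Loss has stayed at 0 for convergence_counter steps; stopping training.")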
