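What you are looking for is a custom Formatter class. Using one is more Pythonic, because it is part of Python's logging system itself, and it gives you better flexibility and more readable code.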

import logging

import pandas as pd


class DataFrameFormatter(logging.Formatter):
    """Format a DataFrame message line by line, so every row gets the log prefix."""

    def __init__(self, fmt: str, n_rows: int = 4) -> None:
        self.n_rows = n_rows
        super().__init__(fmt)

    def format(self, record: logging.LogRecord) -> str:
        if not isinstance(record.msg, pd.DataFrame):
            return super().format(record)
        # A per-record n_rows (passed via extra) overrides the default
        # for this record only; self.n_rows is left untouched.
        n_rows = getattr(record, 'n_rows', self.n_rows)
        lines = record.msg.head(n_rows).to_string().splitlines()
        if hasattr(record, 'header'):
            lines.insert(0, record.header.strip())
        original_msg = record.msg
        formatted = []
        try:
            # Format each line as its own message so it carries the full prefix.
            for line in lines:
                record.msg = line
                formatted.append(super().format(record))
        finally:
            # Restore the DataFrame so other handlers still see the original message.
            record.msg = original_msg
        return '\n'.join(formatted)


formatter = DataFrameFormatter('%(asctime)s %(levelname)-8s %(message)s', n_rows=4)
logger = logging.getLogger()
logger.setLevel(logging.DEBUG)
ch = logging.StreamHandler()
ch.setFormatter(formatter)
logger.addHandler(ch)

df = pd.DataFrame({'a': [1, 2, 3, 4, 5], 'bb': [10, 20, 30, 40, 50]})
logger.info(df, extra={'header': 'this is the header line'})
logger.debug('foo')
logger.info(df, extra={'n_rows': 2})

This code produces the following log:

2024-01-09 15:09:53,384 INFO     this is the header line
2024-01-09 15:09:53,384 INFO        a  bb
2024-01-09 15:09:53,384 INFO     0  1  10
2024-01-09 15:09:53,384 INFO     1  2  20
2024-01-09 15:09:53,384 INFO     2  3  30
2024-01-09 15:09:53,384 INFO     3  4  40
2024-01-09 15:09:53,385 DEBUG    foo
2024-01-09 15:09:53,385 INFO        a  bb
2024-01-09 15:09:53,385 INFO     0  1  10
2024-01-09 15:09:53,385 INFO     1  2  20

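This way, you can easily control the header (a heading line) and n_rows (the number of DataFrame rows to show) through the extra parameter. If you do not supply them, the defaults apply: no header line, and the n_rows value given to the formatter.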
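
The reason this works is that every key passed in `extra` becomes a plain attribute on the `LogRecord`, so a formatter can look for it on the record. Below is a minimal, stdlib-only sketch of just that mechanism, without pandas. The names `HeaderFormatter`, `render` and the logger name `extra_demo` are hypothetical and exist only for this illustration.

```python
import io
import logging


class HeaderFormatter(logging.Formatter):
    """Hypothetical formatter: emits an optional header line before the message."""

    def format(self, record: logging.LogRecord) -> str:
        body = super().format(record)
        # Keys passed via `extra` show up as plain attributes on the record.
        header = getattr(record, 'header', None)
        if header is None:
            return body
        return f'{record.levelname} {header}\n{body}'


def render(msg: str, **extra) -> str:
    """Log `msg` once through HeaderFormatter and return the captured text."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(HeaderFormatter('%(levelname)s %(message)s'))
    demo_logger = logging.getLogger('extra_demo')
    demo_logger.setLevel(logging.INFO)
    demo_logger.propagate = False  # keep the demo off the root logger's handlers
    demo_logger.addHandler(handler)
    try:
        demo_logger.info(msg, extra=extra)
    finally:
        demo_logger.removeHandler(handler)
    return stream.getvalue()


with_header = render('rows follow', header='my table')
without_header = render('rows follow')
print(with_header, end='')
print(without_header, end='')
```

One thing to keep in mind: `extra` keys must not collide with attributes the logging module already sets on the record (for example `message` or `asctime`), otherwise the logging call raises a `KeyError`.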