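I'm parsing an HTML table and processing the data with pandas. The output is correct, but I also get the following warning:
"FutureWarning: Passing literal html to 'read_html' is deprecated and will be removed in a future version. To read from a literal string, wrap it in a 'StringIO' object."
Here is the relevant code: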
    import pandas as pd
    from bs4 import BeautifulSoup

    def parse_html(box_scores):
        with open(box_scores) as f:
            html = f.read()
        soup = BeautifulSoup(html, features="lxml")
        # Drop the extra header rows so pandas sees a single header per table
        [s.decompose() for s in soup.select("tr.over_header")]
        [s.decompose() for s in soup.select("tr.theader")]
        return soup

    def read_line_score(soup):
        # This call triggers the FutureWarning: str(soup) is a literal HTML string
        line_score = pd.read_html(str(soup), attrs = {'id': 'line_score'})[0]
        cols = list(line_score.columns)
        cols[0] = "team"
        cols[-1] = "total"
        line_score.columns = cols
        line_score = line_score[["team", "total"]]
        return line_score

    def read_stats(soup, team, stat):
        # Same pattern here: a literal HTML string is passed to read_html
        df = pd.read_html(str(soup), attrs={"id": f"box-{team}-game-{stat}"}, index_col=0)[0]
        df = df.apply(pd.to_numeric, errors="coerce")
        return df
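
For reference, the change the warning seems to ask for is to wrap the string in `io.StringIO` before handing it to `pd.read_html`. Below is a minimal sketch of that, assuming an lxml-backed pandas install. `SAMPLE_HTML` and the helper name `read_line_score_from_html` are made up for illustration and stand in for `str(soup)` and `read_line_score` above; the real pages will have more columns.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample markup standing in for str(soup); not real box-score data.
SAMPLE_HTML = """
<table id="line_score">
  <thead>
    <tr><th>Team</th><th>Q1</th><th>Q2</th><th>T</th></tr>
  </thead>
  <tbody>
    <tr><td>AAA</td><td>25</td><td>30</td><td>55</td></tr>
    <tr><td>BBB</td><td>20</td><td>28</td><td>48</td></tr>
  </tbody>
</table>
"""


def read_line_score_from_html(html):
    """Same logic as read_line_score, but the HTML string is wrapped in StringIO."""
    # StringIO turns the literal string into a file-like object,
    # which is the form of input the warning asks for.
    tables = pd.read_html(StringIO(html), attrs={"id": "line_score"})
    line_score = tables[0]
    cols = list(line_score.columns)
    cols[0] = "team"
    cols[-1] = "total"
    line_score.columns = cols
    return line_score[["team", "total"]]


line_score = read_line_score_from_html(SAMPLE_HTML)
```

If this is right, the same wrapping would presumably apply in `read_stats`, i.e. `pd.read_html(StringIO(str(soup)), ...)` with the other arguments unchanged.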