Description
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- (optional) I have confirmed this bug exists on the master branch of pandas.
Code Sample, a copy-pastable example
import numpy as np
import pandas as pd
from pandas import HDFStore, DataFrame

# Create (or open) an HDF5 file; HDFStore opens it in append mode
currenttimedata = {
    "time": "20210101080808",
    "desc": "Ford",
    "status": "success",
    # detail0 .. detail50, all set to the same placeholder string
    **{f"detail{i}": "somevalue" for i in range(51)},
}
hdf = HDFStore('storage.h5')
data = {}
for key, value in currenttimedata.items():
    data[key] = [value]
print("data: ", data)
df = DataFrame(data, columns=list(currenttimedata.keys()))
print("df: ", df)
hdf.put('d1', df, format='table', data_columns=True)
print("hdf[d1] 1: ", hdf['d1'])
for x in range(100):
    print("x: ", x)
    hdf.append('d1', df, format='table', data_columns=True)
print("hdf[d1] 2: ", hdf['d1'])
hdf.close()  # closes the file
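To put a number on how slow the appends in the sample above are, a small timing helper could be wrapped around each call. This is only a sketch: `time_calls` is a hypothetical helper name, and the workload below is a stand-in so the snippet runs without an HDF5 file.

```python
import time


def time_calls(fn, n):
    """Call fn(i) for i in range(n) and return the per-call durations in seconds."""
    durations = []
    for i in range(n):
        start = time.perf_counter()
        fn(i)
        durations.append(time.perf_counter() - start)
    return durations


# Stand-in workload. For the sample above, the callable would instead be:
#   lambda i: hdf.append('d1', df, format='table', data_columns=True)
durations = time_calls(lambda i: sum(range(1000)), 5)
print(f"total: {sum(durations):.6f}s, slowest call: {max(durations):.6f}s")
```

Reporting the total and the slowest single call would make it easier to tell whether every append is slow or whether the cost grows as the table fills up.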
Problem description
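`HDFStore.append` is very slow here. The loop above appends a single row (54 string columns, `data_columns=True`) per call, 100 times.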
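One possible workaround, not verified against this report, is to collect the rows first and write them with a single `append`, passing `index=False` and building the table index once at the end with `create_table_index`. The sketch below assumes PyTables is installed; `batch_rows` and `append_batch` are hypothetical helper names, and the row is shortened to three columns for brevity.

```python
import pandas as pd


def batch_rows(rows):
    """Combine many one-row records (dicts) into a single DataFrame."""
    return pd.DataFrame(rows)


def append_batch(path, key, batch):
    """Append `batch` in one call and create the index afterwards.

    Requires PyTables. Whether this removes the slowdown described
    in this report is an assumption, not a measured result.
    """
    with pd.HDFStore(path) as store:
        # index=False skips index creation during the append itself
        store.append(key, batch, format="table", data_columns=True, index=False)
        # build the index once, after all rows are written
        store.create_table_index(key, columns=True, optlevel=9, kind="full")


# Shortened version of the row from the code sample
row = {"time": "20210101080808", "desc": "Ford", "status": "success"}
batch = batch_rows([row] * 100)
print(batch.shape)
```

With the store from the sample, this would be called as `append_batch('storage.h5', 'd1', batch)` in place of the 100 single-row appends.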
Expected Output
Output of pd.show_versions()
INSTALLED VERSIONS
commit : 2cb9652
python : 3.8.5.final.0
python-bits : 64
OS : Linux
OS-release : 5.4.0-42-generic
Version : #46-Ubuntu SMP Fri Jul 10 00:24:02 UTC 2020
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.2.4
numpy : 1.20.3
pytz : 2021.1
dateutil : 2.8.1
pip : 20.0.2
setuptools : 44.0.0
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.6.3
html5lib : None
pymysql : None
psycopg2 : 2.8.6 (dt dec pq3 ext lo64)
jinja2 : 3.0.0
IPython : None
pandas_datareader: 0.9.0
bs4 : 4.9.3
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.4.2
numexpr : 2.7.3
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyxlsb : None
s3fs : None
scipy : 1.6.3
sqlalchemy : None
tables : 3.6.1
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : None