Changelog for python39-pandas-1.4.4-1.2.x86_64.rpm:

* Sat Sep 10 2022 Arun Persaud
- specfile:
  * update required version
- update to version 1.4.4:
  * Fixed regressions
    + Fixed regression in DataFrame.fillna() not working on a DataFrame with a MultiIndex (GH47649)
    + Fixed regression in taking NULL objects from a DataFrame causing a segmentation violation. These NULL values are created by numpy.empty_like() (GH46848)
    + Fixed regression in concat() materializing the Index during sorting even if the Index was already sorted (GH47501)
    + Fixed regression in concat() or merge() handling of all-NaN ExtensionArrays with custom attributes (GH47762)
    + Fixed regression in calling bitwise numpy ufuncs (for example, np.bitwise_and) on Index objects (GH46769)
    + Fixed regression in cut() when using a datetime64 IntervalIndex as bins (GH46218)
    + Fixed regression in DataFrame.select_dtypes() where include="number" included BooleanDtype (GH46870)
    + Fixed regression in DataFrame.loc() raising error when indexing with a NamedTuple (GH48124)
    + Fixed regression in DataFrame.loc() not updating the cache correctly after values were set (GH47867)
    + Fixed regression in DataFrame.loc() not aligning index in some cases when setting a DataFrame (GH47578)
    + Fixed regression in DataFrame.loc() setting a length-1 array like value to a single value in the DataFrame (GH46268)
    + Fixed regression when slicing with DataFrame.loc() with DatetimeIndex with a DateOffset object for its freq (GH46671)
    + Fixed regression in setting None or non-string value into a string-dtype Series using a mask (GH47628)
    + Fixed regression in updating a DataFrame column through Series __setitem__ (using chained assignment) not updating column values inplace and using too much memory (GH47172)
    + Fixed regression in DataFrame.select_dtypes() returning a view on the original DataFrame (GH48090)
    + Fixed regression using custom Index subclasses (for example, used in xarray) with reset_index() or Index.insert() (GH47071)
    + Fixed regression in intersection() when the DatetimeIndex has dates crossing daylight savings time (GH46702)
    + Fixed regression in merge() throwing an error when passing a Series with a multi-level name (GH47946)
    + Fixed regression in DataFrame.eval() creating a copy when updating inplace (GH47449)
    + Fixed regression where getting a row using DataFrame.iloc() with SparseDtype would raise (GH46406)
  * Bug fixes
    + The FutureWarning raised when passing arguments (other than filepath_or_buffer) as positional in read_csv() is now raised at the correct stacklevel (GH47385)
    + Bug in DataFrame.to_sql() when method was a callable that did not return an int and would raise a TypeError (GH46891)
    + Bug in DataFrameGroupBy.value_counts() where subset had no effect (GH46383)
    + Bug when getting values with DataFrame.loc() with a list of keys causing an internal inconsistency that could lead to a disconnect between frame.at[x, y] vs frame[y].loc[x] (GH22372)
    + Bug in the Series.dt.strftime() accessor returning a float instead of object dtype Series for all-NaT input, which also caused a spurious deprecation warning (GH45858)
  * Other
    + The minimum version of Cython needed to compile pandas is now 0.29.32 (GH47978)

* Sat Jul 09 2022 Arun Persaud
- update to version 1.4.3:
  * Behavior of concat with empty or all-NA DataFrame columns
    The behavior change in version 1.4.0 to stop ignoring the data type of empty or all-NA columns with float or object dtype in concat() (see "Ignoring dtypes in concat with empty or all-NA columns" in the 1.4.0 notes) has been reverted (GH45637).
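The case described above can be reproduced with a short script. This is a minimal sketch, not taken from the upstream notes: the helper name concat_with_all_na_column and the sample values are made up for illustration, and the resulting dtype depends on the installed pandas version, so it is printed rather than assumed.

```python
import numpy as np
import pandas as pd


def concat_with_all_na_column(values):
    """Concatenate a float column with an all-NA object column (illustrative helper)."""
    left = pd.DataFrame({"a": values})                    # float64 data
    right = pd.DataFrame({"a": [np.nan]}, dtype=object)   # all-NA column, object dtype
    return pd.concat([left, right], ignore_index=True)


result = concat_with_all_na_column([1.0, 2.0])
# Whether the all-NA object column influences the result dtype is exactly what
# changed between releases, so inspect it instead of relying on a fixed value.
print(pd.__version__, result["a"].dtype)
```

Running this under different pandas releases is one way to check which behavior a given installation has. Newer releases may also emit a FutureWarning for this call.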
  * Fixed regressions
    + Fixed regression in DataFrame.replace() when the replacement value was explicitly None when passed in a dictionary to to_replace also casting other columns to object dtype even when there were no values to replace (GH46634)
    + Fixed regression in DataFrame.to_csv() raising error when DataFrame contains extension dtype categorical column (GH46297, GH46812)
    + Fixed regression in representation of dtypes attribute of MultiIndex (GH46900)
    + Fixed regression when setting values with DataFrame.loc() updating RangeIndex when index was set as new column and column was updated afterwards (GH47128)
    + Fixed regression in DataFrame.fillna() and DataFrame.update() creating a copy when updating inplace (GH47188)
    + Fixed regression in DataFrame.nsmallest() leading to wrong results when the sorting column has np.nan values (GH46589)
    + Fixed regression in read_fwf() raising ValueError when widths was specified with usecols (GH46580)
    + Fixed regression in concat() not sorting columns for mixed column names (GH47127)
    + Fixed regression in Groupby.transform() and Groupby.agg() failing with engine="numba" when the index was a MultiIndex (GH46867)
    + Fixed regression in NaN comparison for Index operations where the same object was compared (GH47105)
    + Fixed regression in Styler.to_latex() and Styler.to_html() where buf failed in combination with encoding (GH47053)
    + Fixed regression in read_csv() with index_col=False identifying first row as index names when header=None (GH46955)
    + Fixed regression in DataFrameGroupBy.agg() when used with list-likes or dict-likes and axis=1 that would give incorrect results; now raises NotImplementedError (GH46995)
    + Fixed regression in DataFrame.resample() and DataFrame.rolling() when used with list-likes or dict-likes and axis=1 that would raise an unintuitive error message; now raises NotImplementedError (GH46904)
    + Fixed regression in testing.assert_index_equal() when check_order=False and Index has extension or object dtype (GH47207)
    + Fixed regression in read_excel() returning ints as floats on certain input sheets (GH46988)
    + Fixed regression in DataFrame.shift() when axis is columns and fill_value is absent, freq is ignored (GH47039)
    + Fixed regression in DataFrame.to_json() causing a segmentation violation when DataFrame is created with an index parameter of the type PeriodIndex (GH46683)
  * Bug fixes
    + Bug in pandas.eval(), DataFrame.eval() and DataFrame.query() where passing empty local_dict or global_dict was treated as passing None (GH47084)
    + Most I/O methods no longer suppress OSError and ValueError when closing file handles (GH47136)
  * Other
    + The minimum version of Cython needed to compile pandas is now 0.29.30 (GH41935)

* Tue Apr 05 2022 Ben Greiner
- Update to version 1.4.2
  * Fixed regression in DataFrame.drop() and Series.drop() when Index had extension dtype and duplicates (GH45860)
  * Fixed regression in read_csv() killing python process when invalid file input was given for engine="c" (GH45957)
  * Fixed memory performance regression in Series.fillna() when called on a DataFrame column with inplace=True (GH46149)
  * Provided an alternative solution for passing custom Excel formats in Styler.to_excel(), which was a regression based on stricter CSS validation.
    Examples available in the documentation for Styler.format() (GH46152)
  * Fixed regression in DataFrame.replace() when a replacement value was also a target for replacement (GH46306)
  * Fixed regression in DataFrame.replace() when the replacement value was explicitly None when passed in a dictionary to to_replace (GH45601, GH45836)
  * Fixed regression when setting values with DataFrame.loc() losing MultiIndex names if DataFrame was empty before (GH46317)
  * Fixed regression when rendering boolean datatype columns with Styler() (GH46384)
  * Fixed regression in Groupby.rolling() with a frequency window that would raise a ValueError even if the datetimes within each group were monotonic (GH46061)
  * Fix some cases for subclasses that define their _constructor properties as general callables (GH46018)
  * Fixed “longtable” formatting in Styler.to_latex() when column_format is given in extended format (GH46037)
  * Fixed incorrect rendering in Styler.format() with hyperlinks="html" when the url contains a colon or other special characters (GH46389)
  * Improved error message in Rolling when window is a frequency and NaT is in the rolling axis (GH46087)
- Copy back the installed package into the source tree
  * mimics upstream's test setup of an editable install
  * avoids conftest.py collection errors with pytest 7

* Sat Feb 12 2022 Arun Persaud
- update to version 1.4.1:
  * Fixed regressions
    + Regression in Series.mask() with inplace=True and PeriodDtype and an incompatible other coercing to a common dtype instead of raising (GH45546)
    + Regression in assert_frame_equal() not respecting check_flags=False (GH45554)
    + Regression in DataFrame.loc() raising ValueError when indexing (getting values) on a MultiIndex with one level (GH45779)
    + Regression in Series.fillna() with downcast=False incorrectly downcasting object dtype (GH45603)
    + Regression in api.types.is_bool_dtype() raising an AttributeError when evaluating a categorical Series (GH45615)
    + Regression in DataFrame.iat() setting values leading to not propagating correctly in subsequent lookups (GH45684)
    + Regression when setting values with DataFrame.loc() losing Index name if DataFrame was empty before (GH45621)
    + Regression in join() with overlapping IntervalIndex raising an InvalidIndexError (GH45661)
    + Regression when setting values with Series.loc() raising with all False indexer and Series on the right hand side (GH45778)
    + Regression in read_sql() with a DBAPI2 connection that is not an instance of sqlite3.Connection incorrectly requiring SQLAlchemy be installed (GH45660)
    + Regression in DateOffset when constructing with an integer argument with no keywords (e.g. pd.DateOffset(n)) would behave like datetime.timedelta(days=0) (GH45643, GH45890)
  * Bug fixes
    + Fixed segfault in DataFrame.to_json() when dumping tz-aware datetimes in Python 3.10 (GH42130)
    + Stopped emitting unnecessary FutureWarning in DataFrame.sort_values() with sparse columns (GH45618)
    + Fixed window aggregations in DataFrame.rolling() and Series.rolling() to skip over unused elements (GH45647)
    + Fixed builtin highlighters in Styler to be responsive to NA with nullable dtypes (GH45804)
    + Bug in apply() with axis=1 raising an erroneous ValueError (GH45912)
  * Other
    + Reverted performance speedup of DataFrame.corr() for method=pearson to fix precision regression (GH45640, GH42761)

* Tue Jan 25 2022 Ben Greiner
- Skip more tests on non-intel architectures boo#1167730

* Sun Jan 23 2022 Ben Greiner
- Update to version 1.4.0
  * https://pandas.pydata.org/docs/whatsnew/v1.4.0.html
  * Enhancements
    - Improved warning messages
    - Index can hold arbitrary ExtensionArrays
    - Enhancements in Styler
    - Multi-threaded CSV reading with a new CSV Engine based on pyarrow
    - Rank function for rolling and expanding windows
    - Groupby positional indexing
    - DataFrame.from_dict and DataFrame.to_dict have new 'tight' option
  * Notable bug fixes
    - Inconsistent date string parsing
    - Ignoring dtypes in concat with empty or all-NA columns
    - Null-values
      are no longer coerced to NaN-value in value_counts and mode
    - mangle_dupe_cols in read_csv no longer renames unique columns conflicting with target names
    - unstack and pivot_table no longer raise ValueError for result that would exceed int32 limit
    - groupby.apply consistent transform detection
  * API changes
    - Index.get_indexer_for() no longer accepts keyword arguments (other than target); in the past these would be silently ignored if the index was not unique (GH42310)
    - Change in the position of the min_rows argument in DataFrame.to_string() due to change in the docstring (GH44304)
    - Reduction operations for DataFrame or Series now raise a ValueError when None is passed for skipna (GH44178)
    - read_csv() and read_html() no longer raise an error when one of the header rows consists only of Unnamed: columns (GH13054)
    - Changed the name attribute of several holidays in USFederalHolidayCalendar to match official federal holiday names.
  * Deprecations
    - Deprecated Int64Index, UInt64Index & Float64Index
    - Deprecated DataFrame.append and Series.append
- Split out test runs into separate flavors, optimize memory usage in pytest-xdist runs

* Tue Jan 04 2022 Ben Greiner
- Update to version 1.3.5
  * Fixed regression in Series.equals() when comparing floats with dtype object to None (GH44190)
  * Fixed regression in merge_asof() raising error when array was supplied as join key (GH42844)
  * Fixed regression when resampling DataFrame with DatetimeIndex with empty groups and uint8, uint16 or uint32 columns incorrectly raising RuntimeError (GH43329)
  * Fixed regression in creating a DataFrame from a timezone-aware Timestamp scalar near a Daylight Savings Time transition (GH42505)
  * Fixed performance regression in read_csv() (GH44106)
  * Fixed regression in Series.duplicated() and Series.drop_duplicates() when Series has Categorical dtype with boolean categories (GH44351)
  * Fixed regression in GroupBy.sum() with timedelta64[ns] dtype containing NaT failing to treat that value as NA (GH42659)
  * Fixed regression in RollingGroupby.cov() and RollingGroupby.corr() when other had the same shape as each group would incorrectly return superfluous groups in the result (GH42915)

* Wed Oct 20 2021 Guillaume GARDET
- Update to version 1.3.4
  * Fixed regression in DataFrame.convert_dtypes() incorrectly converting byte strings to strings (GH43183)
  * Fixed regression in GroupBy.agg() where it was failing silently with mixed data types along axis=1 and MultiIndex (GH43209)
  * Fixed regression in merge() with integer and NaN keys failing with outer merge (GH43550)
  * Fixed regression in DataFrame.corr() raising ValueError with method="spearman" on 32-bit platforms (GH43588)
  * Fixed performance regression in MultiIndex.equals() (GH43549)
  * Fixed performance regression in GroupBy.first() and GroupBy.last() with StringDtype (GH41596)
  * Fixed regression in Series.cat.reorder_categories() failing to update the categories on the Series (GH43232)
  * Fixed regression in Series.cat.categories() setter failing to update the categories on the Series (GH43334)
  * Fixed regression in read_csv() raising UnicodeDecodeError exception when memory_map=True (GH43540)
  * Fixed regression in DataFrame.explode() raising AssertionError when column is any scalar which is not a string (GH43314)
  * Fixed regression in Series.aggregate() attempting to pass args and kwargs multiple times to the user supplied func in certain cases (GH43357)
  * Fixed regression when iterating over a DataFrame.groupby.rolling object causing the resulting DataFrames to have an incorrect index if the input groupings were not sorted (GH43386)
  * Fixed regression in DataFrame.groupby.rolling.cov() and DataFrame.groupby.rolling.corr() computing incorrect results if the input groupings were not sorted (GH43386)
  * Fixed bug in pandas.DataFrame.groupby.rolling() and pandas.api.indexers.FixedForwardWindowIndexer leading to segfaults and window endpoints being mixed across groups (GH43267)
  * Fixed bug in GroupBy.mean() with
    datetimelike values including NaT values returning incorrect results (GH43132)
  * Fixed bug in Series.aggregate() not passing the first args to the user supplied func in certain cases (GH43357)
  * Fixed memory leaks in Series.rolling.quantile() and Series.rolling.median() (GH43339)

* Mon Sep 20 2021 Ben Greiner
- Update to version 1.3.3
  * Fixed regression in DataFrame constructor failing to broadcast for defined Index and len one list of Timestamp (GH42810)
  * Fixed regression in GroupBy.agg() incorrectly raising in some cases (GH42390)
  * Fixed regression in GroupBy.apply() where nan values were dropped even with dropna=False (GH43205)
  * Fixed regression in GroupBy.quantile() which was failing with pandas.NA (GH42849)
  * Fixed regression in merge() where on columns with ExtensionDtype or bool data types were cast to object in right and outer merge (GH40073)
  * Fixed regression in RangeIndex.where() and RangeIndex.putmask() raising AssertionError when result did not represent a RangeIndex (GH43240)
  * Fixed regression in read_parquet() where the fastparquet engine would not work properly with fastparquet 0.7.0 (GH43075)
  * Fixed regression in DataFrame.loc.__setitem__() raising ValueError when setting array as cell value (GH43422)
  * Fixed regression in is_list_like() where objects with __iter__ set to None would be identified as iterable (GH43373)
  * Fixed regression in DataFrame.__getitem__() raising error for slice of DatetimeIndex when index is non monotonic (GH43223)
  * Fixed regression in Resampler.aggregate() when used after column selection would raise if func is a list of aggregation functions (GH42905)
  * Fixed regression in DataFrame.corr() where Kendall correlation would produce incorrect results for columns with repeated values (GH43401)
  * Fixed regression in DataFrame.groupby() where aggregation on columns with object types dropped results on those columns (GH42395, GH43108)
  * Fixed regression in Series.fillna() raising TypeError when filling float Series with
    list-like fill value having a dtype which could not be cast losslessly (like float32 filled with float64) (GH43424)
  * Fixed regression in read_csv() raising AttributeError when the file handle is a tempfile.SpooledTemporaryFile object (GH43439)
  * Fixed performance regression in core.window.ewm.ExponentialMovingWindow.mean() (GH42333)
  * Performance improvement for DataFrame.__setitem__() when the key or value is not a DataFrame, or key is not list-like (GH43274)
  * Fixed bug in DataFrameGroupBy.agg() and DataFrameGroupBy.transform() with engine="numba" where index data was not being correctly passed into func (GH43133)
- Release 1.3.2
  * Performance regression in DataFrame.isin() and Series.isin() for nullable data types (GH42714)
  * Regression in updating values of Series using boolean index, created by using DataFrame.pop() (GH42530)
  * Regression in DataFrame.from_records() with empty records (GH42456)
  * Fixed regression in DataFrame.shift() where TypeError occurred when shifting DataFrame created by concatenation of slices and fills with values (GH42719)
  * Regression in DataFrame.agg() when the func argument returned lists and axis=1 (GH42727)
  * Regression in DataFrame.drop() doing nothing if MultiIndex has duplicates and indexer is a tuple or list of tuples (GH42771)
  * Fixed regression where read_csv() raised a ValueError when parameters names and prefix were both set to None (GH42387)
  * Fixed regression in comparisons between Timestamp object and datetime64 objects outside the implementation bounds for nanosecond datetime64 (GH42794)
  * Fixed regression in Styler.highlight_min() and Styler.highlight_max() where pandas.NA was not successfully ignored (GH42650)
  * Fixed regression in concat() where copy=False was not honored in axis=1 Series concatenation (GH42501)
  * Regression in Series.nlargest() and Series.nsmallest() with nullable integer or float dtype (GH42816)
  * Fixed regression in Series.quantile() with Int64Dtype (GH42626)
  * Fixed regression in Series.groupby() and DataFrame.groupby() where supplying the by argument with a Series named with a tuple would incorrectly raise (GH42731)
  * Bug in read_excel() modifying the dtypes dictionary when reading a file with duplicate columns (GH42462)
  * 1D slices over extension types turn into N-dimensional slices over ExtensionArrays (GH42430)
  * Fixed bug in Series.rolling() and DataFrame.rolling() not calculating window bounds correctly for the first row when center=True and window is an offset that covers all the rows (GH42753)
  * Styler.hide_columns() now hides the index name header row as well as column headers (GH42101)
  * Styler.set_sticky() has amended CSS to control the column/index names and ensure the correct sticky positions (GH42537)
  * Bug in de-serializing datetime indexes in PYTHONOPTIMIZED mode (GH42866)

* Tue Aug 17 2021 Fabian Vogt
- Drop suggests of python-numba (pulls in LLVM10) and python-QtPy (pulls in Qt3D, python-qt5 is enough) to make the TW DVD fit again

* Thu Aug 12 2021 Ben Greiner
- Update to version 1.3.1
  Fixed regressions
  * Pandas could not be built on PyPy (GH42355)
  * DataFrame constructed with an older version of pandas could not be unpickled (GH42345)
  * Performance regression in constructing a DataFrame from a dictionary of dictionaries (GH42248)
  * Fixed regression in DataFrame.agg() dropping values when the DataFrame had an Extension Array dtype, a duplicate index, and axis=1 (GH42380)
  * Fixed regression in DataFrame.astype() changing the order of noncontiguous data (GH42396)
  * Performance regression in DataFrame in reduction operations requiring casting such as
    DataFrame.mean() on integer data (GH38592)
  * Performance regression in DataFrame.to_dict() and Series.to_dict() when orient argument one of “records”, “dict”, or “split” (GH42352)
  * Fixed regression in indexing with a list subclass incorrectly raising TypeError (GH42433, GH42461)
  * Fixed regression in DataFrame.isin() and Series.isin() raising TypeError with nullable data containing at least one missing value (GH42405)
  * Regression in concat() between objects with bool dtype and integer dtype casting to object instead of to integer (GH42092)
  * Bug in Series constructor not accepting a dask.Array (GH38645)
  * Fixed regression for SettingWithCopyWarning displaying incorrect stacklevel (GH42570)
  * Fixed regression for merge_asof() raising KeyError when one of the by columns is in the index (GH34488)
  * Fixed regression in to_datetime() returning pd.NaT for inputs that produce duplicated values, when cache=True (GH42259)
  * Fixed regression in SeriesGroupBy.value_counts() that resulted in an IndexError when called on a Series with one row (GH42618)
  * Fixed bug in DataFrame.transpose() dropping values when the DataFrame had an Extension Array dtype and a duplicate index (GH42380)
  * Fixed bug in DataFrame.to_xml() raising KeyError when called with index=False and an offset index (GH42458)
  * Fixed bug in Styler.set_sticky() not handling index names correctly for single index columns case (GH42537)
  * Fixed bug in DataFrame.copy() failing to consolidate blocks in the result (GH42579)

* Thu Jul 22 2021 Arun Persaud
- specfile:
  * update requirements
  * README.rst -> README.md
- update to version 1.3.0:
  * long changelog, see https://pandas.pydata.org/pandas-docs/stable/whatsnew/v1.3.0.html
- changes from version 1.2.5:
  * Fixed regression in concat() between two DataFrame where one has an Index that is all-None and the other is DatetimeIndex incorrectly raising (GH40841)
  * Fixed regression in DataFrame.sum() and DataFrame.prod() when min_count and numeric_only are both given (GH41074)
  * Fixed regression in read_csv() when using memory_map=True with a non-UTF8 encoding (GH40986)
  * Fixed regression in DataFrame.replace() and Series.replace() when the values to replace are a NumPy float array (GH40371)
  * Fixed regression in ExcelFile() when a corrupt file is opened but not closed (GH41778)
  * Fixed regression in DataFrame.astype() with dtype=str failing to convert NaN in categorical columns (GH41797)
- Unpack some files required for testing

* Mon May 03 2021 Arun Persaud
- update to version 1.2.4:
  * Fixed regressions
    + Fixed regression in DataFrame.sum() where passing a min_count greater than the DataFrame shape resulted in a ValueError (GH39738)
    + Fixed regression in DataFrame.to_json() raising AttributeError when run on PyPy (GH39837)
    + Fixed regression in (in)equality comparison of pd.NaT with a non-datetimelike numpy array returning a scalar instead of an array (GH40722)
    + Fixed regression in DataFrame.where() not returning a copy in the case of an all True condition (GH39595)
    + Fixed regression in DataFrame.replace() raising IndexError when regex was a multi-key dictionary (GH39338)
    + Fixed regression in repr of floats in an object column not respecting float_format when printed in the console or outputted through DataFrame.to_string(), DataFrame.to_html(), and DataFrame.to_latex() (GH40024)
    + Fixed regression in NumPy ufuncs such as np.add not passing through all arguments for DataFrame (GH40662)

* Wed Mar 03 2021 Arun Persaud
- update to version 1.2.3:
  * Fixed regressions
    + Fixed regression in to_excel() raising KeyError when giving duplicate columns with columns attribute (GH39695)
    + Fixed regression in nullable integer unary ops propagating mask on assignment (GH39943)
    + Fixed regression in DataFrame.__setitem__() not aligning DataFrame on right-hand side for boolean indexer (GH39931)
    + Fixed regression in to_json() failing to use compression with URL-like paths that are internally opened in binary mode or with user-provided file
      objects that are opened in binary mode (GH39985)
    + Fixed regression in Series.sort_index() and DataFrame.sort_index(), which exited with an ungraceful error when having kwarg ascending=None passed. Passing ascending=None is still considered invalid, and the improved error message suggests a proper usage (ascending must be a boolean or a list-like of boolean) (GH39434)
    + Fixed regression in DataFrame.transform() and Series.transform() giving incorrect column labels when passed a dictionary with a mix of list and non-list values (GH40018)

* Sun Feb 14 2021 Ben Greiner
- Update to version 1.2.2
  * https://pandas.pydata.org/docs/whatsnew/v1.2.2.html
  * fixed regressions and bugfixes
- Update to version 1.2.1
  * https://pandas.pydata.org/docs/whatsnew/v1.2.1.html
  * fixed regressions and bugfixes
  * Calling NumPy ufuncs on non-aligned DataFrames
  * The deprecated attributes _AXIS_NAMES and _AXIS_NUMBERS of DataFrame and Series will no longer show up in dir or inspect.getmembers calls (GH38740)
  * Bumped minimum fastparquet version to 0.4.0 to avoid AttributeError from numba (GH38344)
  * Bumped minimum pymysql version to 0.8.1 to avoid test failures (GH38344)
  * Added reference to backwards incompatible check_freq arg of testing.assert_frame_equal() and testing.assert_series_equal() in pandas 1.1.0 whats new (GH34050)
- Update to version 1.2.0
  * https://pandas.pydata.org/docs/whatsnew/v1.2.0.html
  * WARNING: The xlwt package for writing old-style .xls excel files is no longer maintained. The xlrd package is now only for reading old-style .xls files.
    Previously, the default argument engine=None to read_excel() would result in using the xlrd engine in many cases, including new Excel 2007+ (.xlsx) files. If openpyxl is installed, many of these cases will now default to using the openpyxl engine. See the read_excel() documentation for more details.
    Thus, it is strongly encouraged to install openpyxl to read Excel 2007+ (.xlsx) files.
    Please do not report issues when using xlrd to read .xlsx files. This is no longer supported; switch to using openpyxl instead.
    Attempting to use the xlwt engine will raise a FutureWarning unless the option io.excel.xls.writer is set to "xlwt". While this option is now deprecated and will also raise a FutureWarning, it can be globally set and the warning suppressed. Users are advised to write .xlsx files using the openpyxl engine instead.
  Enhancements
  * Optionally disallow duplicate labels
  * Passing arguments to fsspec backends
  * Support for binary file handles in to_csv
  * Support for short caption and table position in to_latex
  * Change in default floating precision for read_csv and read_table
  * Experimental nullable data types for float data
  * Index/column name preservation when aggregating
  * GroupBy supports EWM operations directly
  Deprecations
  * https://pandas.pydata.org/docs/whatsnew/v1.2.0.html#deprecations
- Skip python36 build: New minimum supported Python is 3.7.1
- Only Suggest instead of Recommend optional dependencies. Nobody wants to pull in all of those packages by default.
- Remove pandas-pytest.ini
- Rework test deselection
- Limit to 4 pytest-xdist workers, as collection consumes a lot of memory

* Fri Oct 30 2020 Arun Persaud
- update to version 1.1.4:
  * Fixed regressions
    + Fixed regression in read_csv() raising a ValueError when names was of type dict_keys (GH36928)
    + Fixed regression in read_csv() with more than 1M rows and specifying an index_col argument (GH37094)
    + Fixed regression where attempting to mutate a DateOffset object would no longer raise an AttributeError (GH36940)
    + Fixed regression where DataFrame.agg() would fail with TypeError when passed positional arguments to be passed on to the aggregation function (GH36948).
    + Fixed regression in RollingGroupby with sort=False not being respected (GH36889)
    + Fixed regression in Series.astype() converting None to "nan" when casting to string (GH36904)
    + Fixed regression in Series.rank() method failing for read-only data (GH37290)
    + Fixed regression in RollingGroupby causing a segmentation fault with Index of dtype object (GH36727)
    + Fixed regression in DataFrame.resample(...).apply(...) raising AttributeError when input was a DataFrame and only a Series was evaluated (GH36951)
    + Fixed regression in DataFrame.groupby(..).std() with nullable integer dtype (GH37415)
    + Fixed regression in PeriodDtype comparing both equal and unequal to its string representation (GH37265)
    + Fixed regression where slicing DatetimeIndex raised AssertionError on irregular time series with pd.NaT or on unsorted indices (GH36953 and GH35509)
    + Fixed regression in certain offsets (pd.offsets.Day() and below) no longer being hashable (GH37267)
    + Fixed regression in StataReader which required chunksize to be manually set when using an iterator to read a dataset (GH37280)
    + Fixed regression in setitem with DataFrame.iloc() which raised error when trying to set a value while filtering with a boolean list (GH36741)
    + Fixed regression in setitem with a Series getting aligned before setting the values (GH37427)
    + Fixed regression in MultiIndex.is_monotonic_increasing returning wrong results with NaN in at least one of the levels (GH37220)
    + Fixed regression in inplace arithmetic operation on a Series not updating the parent DataFrame (GH36373)
  * Bug fixes
    + Bug causing groupby(...).sum() and similar to not preserve metadata (GH29442)
    + Bug in Series.isin() and DataFrame.isin() raising a ValueError when the target was read-only (GH37174)
    + Bug in GroupBy.fillna() that introduced a performance regression after 1.0.5 (GH36757)
    + Bug in DataFrame.info() raising a KeyError when the DataFrame has integer column names (GH37245)
    + Bug in DataFrameGroupby.apply() would
      drop a CategoricalIndex when grouped on (GH35792)

* Mon Oct 05 2020 Arun Persaud
- specfile:
  * updated cython version
- update to version 1.1.3:
  * Development Changes
    + The minimum version of Cython is now the most recent bug-fix version (0.29.21) (GH36296).
  * Fixed regressions
    + Fixed regression in DataFrame.agg(), DataFrame.apply(), Series.agg(), and Series.apply() where internal suffix is exposed to the users when no relabelling is applied (GH36189)
    + Fixed regression in IntegerArray unary plus and minus operations raising a TypeError (GH36063)
    + Fixed regression where adding a timedelta_range() to a Timestamp raised a ValueError (GH35897)
    + Fixed regression in Series.__getitem__() incorrectly raising when the input was a tuple (GH35534)
    + Fixed regression in Series.__getitem__() incorrectly raising when the input was a frozenset (GH35747)
    + Fixed regression in modulo of Index, Series and DataFrame using numexpr using C not Python semantics (GH36047, GH36526)
    + Fixed regression in read_excel() with engine="odf" causing UnboundLocalError in some cases where cells had nested child nodes (GH36122, GH35802)
    + Fixed regression in DataFrame.replace() giving inconsistent results when using a float in the replace method (GH35376)
    + Fixed regression in Series.loc() on a Series with a MultiIndex containing Timestamp raising InvalidIndexError (GH35858)
    + Fixed regression in DataFrame and Series comparisons between numeric arrays and strings (GH35700, GH36377)
    + Fixed regression in DataFrame.apply() with raw=True and user-function returning string (GH35940)
    + Fixed regression when setting empty DataFrame column to a Series in preserving name of index in frame (GH36527)
    + Fixed regression in Period giving an incorrect value for an ordinal over the maximum timestamp (GH36430)
    + Fixed regression in read_table() raising ValueError when delim_whitespace was set to True (GH35958)
    + Fixed regression in Series.dt.normalize() where the result was shifted one day when normalizing pre-epoch dates (GH36294)
  * Bug fixes
    + Bug in read_spss() where passing a pathlib.Path as path would raise a TypeError (GH33666)
    + Bug in Series.str.startswith() and Series.str.endswith() with category dtype not propagating na parameter (GH36241)
    + Bug in Series constructor where integer overflow would occur for sufficiently large scalar inputs when an index was provided (GH36291)
    + Bug in DataFrame.sort_values() raising an AttributeError when sorting on a key that casts column to categorical dtype (GH36383)
    + Bug in DataFrame.stack() raising a ValueError when stacking MultiIndex columns based on position when the levels had duplicate names (GH36353)
    + Bug in Series.astype() showing too much precision when casting from np.float32 to string dtype (GH36451)
    + Bug in Series.isin() and DataFrame.isin() when using NaN and a row length above 1,000,000 (GH22205)
    + Bug in cut() raising a ValueError when passed a Series of labels with ordered=False (GH36603)
  * Other
    + Reverted enhancement added in pandas-1.1.0 where timedelta_range() infers a frequency when passed start, stop, and periods (GH32377)

* Sat Sep 12 2020 Arun Persaud
- update to version 1.1.2:
  * Fixed regressions
    + Regression in DatetimeIndex.intersection() incorrectly raising AssertionError when intersecting against a list (GH35876)
    + Fix regression in updating a column inplace (e.g.
using df['col'].fillna(.., inplace=True)) (GH35731) + Fix regression in DataFrame.append() mixing tz-aware and tz-naive datetime columns (GH35460) + Performance regression for RangeIndex.format() (GH35712) + Regression where MultiIndex.get_loc() would return a slice spanning the full index when passed an empty list (GH35878) + Fix regression in invalid cache after an indexing operation; this can manifest when setting which does not update the data (GH35521) + Regression in DataFrame.replace() where a TypeError would be raised when attempting to replace elements of type Interval (GH35931) + Fix regression in pickle roundtrip of the closed attribute of IntervalIndex (GH35658) + Fixed regression in DataFrameGroupBy.agg() where a ValueError: buffer source array is read-only would be raised when the underlying array is read-only (GH36014) + Fixed regression in Series.groupby.rolling() number of levels of MultiIndex in input was compressed to one (GH36018) + Fixed regression in DataFrameGroupBy on an empty DataFrame (GH36197) * Bug fixes + Bug in DataFrame.eval() with object dtype column binary operations (GH35794) + Bug in Series constructor raising a TypeError when constructing sparse datetime64 dtypes (GH35762) + Bug in DataFrame.apply() with result_type="reduce" returning with incorrect index (GH35683) + Bug in Series.astype() and DataFrame.astype() not respecting the errors argument when set to "ignore" for extension dtypes (GH35471) + Bug in DateTimeIndex.format() and PeriodIndex.format() with name=True setting the first item to "None" where it should be "" (GH35712) + Bug in Float64Index.__contains__() incorrectly raising TypeError instead of returning False (GH35788) + Bug in Series constructor incorrectly raising a TypeError when passed an ordered set (GH36044) + Bug in Series.dt.isocalendar() and DatetimeIndex.isocalendar() that returned incorrect year for certain dates (GH36032) + Bug in DataFrame indexing returning an incorrect Series in some cases 
when the series has been altered and a cache not invalidated (GH33675) + Bug in DataFrame.corr() causing subsequent indexing lookups to be incorrect (GH35882) + Bug in import_optional_dependency() returning incorrect package names in cases where package name is different from import name (GH35948) + Bug when setting empty DataFrame column to a Series in preserving name of index in frame (GH31368) * Other + factorize() now supports na_sentinel=None to include NaN in the uniques of the values and remove dropna keyword which was unintentionally exposed to public facing API in 1.1 version from factorize() (GH35667) + DataFrame.plot() and Series.plot() raise UserWarning about usage of FixedFormatter and FixedLocator (GH35684 and GH35945) * Sat Sep 05 2020 Arun Persaud - specfile: * updated versions of some requirements, require numpy during build * removed pandas-pr34991-npconstructor.patch, included upstream * removed sed commands that are not needed anymore * skip test to see if pandas is installed- update to version 1.1.1: * Fixed regressions + Fixed regression in CategoricalIndex.format() where, when stringified scalars had different lengths, the shorter string would be right-filled with spaces, so it had the same length as the longest string (GH35439) + Fixed regression in Series.truncate() when trying to truncate a single-element series (GH35544) + Fixed regression where DataFrame.to_numpy() would raise a RuntimeError for mixed dtypes when converting to str (GH35455) + Fixed regression where read_csv() would raise a ValueError when pandas.options.mode.use_inf_as_na was set to True (GH35493) + Fixed regression where pandas.testing.assert_series_equal() would raise an error when non-numeric dtypes were passed with check_exact=True (GH35446) + Fixed regression in .groupby(..).rolling(..) 
where column selection was ignored (GH35486) + Fixed regression where DataFrame.interpolate() would raise a TypeError when the DataFrame was empty (GH35598) + Fixed regression in DataFrame.shift() with axis=1 and heterogeneous dtypes (GH35488) + Fixed regression in DataFrame.diff() with read-only data (GH35559) + Fixed regression in .groupby(..).rolling(..) where a segfault would occur with center=True and an odd number of values (GH35552) + Fixed regression in DataFrame.apply() where functions that altered the input in-place only operated on a single row (GH35462) + Fixed regression in DataFrame.reset_index() would raise a ValueError on empty DataFrame with a MultiIndex with a datetime64 dtype level (GH35606, GH35657) + Fixed regression where pandas.merge_asof() would raise a UnboundLocalError when left_index, right_index and tolerance were set (GH35558) + Fixed regression in .groupby(..).rolling(..) where a custom BaseIndexer would be ignored (GH35557) + Fixed regression in DataFrame.replace() and Series.replace() where compiled regular expressions would be ignored during replacement (GH35680) + Fixed regression in aggregate() where a list of functions would produce the wrong results if at least one of the functions did not aggregate (GH35490) + Fixed memory usage issue when instantiating large pandas.arrays.StringArray (GH35499) * Bug fixes + Bug in Styler whereby cell_ids argument had no effect due to other recent changes (GH35588) (GH35663) + Bug in pandas.testing.assert_series_equal() and pandas.testing.assert_frame_equal() where extension dtypes were not ignored when check_dtypes was set to False (GH35715) + Bug in to_timedelta() fails when arg is a Series with Int64 dtype containing null values (GH35574) + Bug in .groupby(..).rolling(..) 
where passing closed with column selection would raise a ValueError (GH35549) + Bug in DataFrame constructor failing to raise ValueError in some cases when data and index have mismatched lengths (GH33437)- changes from version 1.1.0: * Enhancements + KeyErrors raised by loc specify missing labels + All dtypes can now be converted to "StringDtype" + Non-monotonic PeriodIndex Partial String Slicing + Comparing two `DataFrame` or two `Series` and summarizing the differences + Allow NA in groupby key + Sorting with keys + Fold argument support in Timestamp constructor + Parsing timezone-aware format with different timezones in to_datetime + Grouper and resample now support the arguments origin and offset + fsspec now used for filesystem handling * see https://pandas.pydata.org/pandas-docs/stable/whatsnew/v1.1.0.html for complete list * Wed Jul 22 2020 Benjamin Greiner - support newest numpy by removing old test gh#pandas-dev/pandas#34991 pandas-pr34991-npconstructor.patch- move testing to multibuild flavor- run slow tests only on x86_64- replace gcc10-skip-one-test.patch with pytest -k deselection- tidy SKIP_TESTS declarations- add pandas-pytest.ini as pytest.ini in order to support the custom marks and filter some warnings- remove random hash seed * Tue Jun 30 2020 Matej Cepl - Skip test_raw_roundtrip on i586 * Wed Jun 24 2020 Todd R - Update to version 1.0.5 * Fixed regressions + Fix regression in read_parquet() when reading from file-like objects (GH34467). + Fix regression in reading from public S3 buckets (GH34626). Note this disables the ability to read Parquet files from directories on S3 again (GH26388, GH34632), which was added in the 1.0.4 release, but is now targeted for pandas 1.1.0. 
+ Fixed regression in replace() raising an AssertionError when replacing values in an extension dtype with values of a different dtype (GH34530) * Bug fixes + Fixed building from source with Python 3.8 fetching the wrong version of NumPy * Sat May 30 2020 Arun Persaud - update to version 1.0.4: * Fixed regressions + Fix regression where :meth:`Series.isna` and :meth:`DataFrame.isna` would raise for categorical dtype when pandas.options.mode.use_inf_as_na was set to True (:issue:`33594`) + Fix regression in :meth:`GroupBy.first` and :meth:`GroupBy.last` where None is not preserved in object dtype (:issue:`32800`) + Fix regression in DataFrame reductions using numeric_only=True and ExtensionArrays (:issue:`33256`). + Fix performance regression in memory_usage(deep=True) for object dtype (:issue:`33012`) + Fix regression where :meth:`Categorical.replace` would replace with NaN whenever the new value and replacement value were equal (:issue:`33288`) + Fix regression where an ordered :class:`Categorical` containing only NaN values would raise rather than returning NaN when taking the minimum or maximum (:issue:`33450`) + Fix regression in :meth:`DataFrameGroupBy.agg` with dictionary input losing ExtensionArray dtypes (:issue:`32194`) + Fix to preserve the ability to index with the "nearest" method with xarray's CFTimeIndex, an :class:`Index` subclass (pydata/xarray#3751, :issue:`32905`). 
+ Fix regression in :meth:`DataFrame.describe` raising TypeError: unhashable type: 'dict' (:issue:`32409`) + Fix regression in :meth:`DataFrame.replace` casts columns to object dtype if items in to_replace not in values (:issue:`32988`) + Fix regression in :meth:`Series.groupby` would raise ValueError when grouping by :class:`PeriodIndex` level (:issue:`34010`) + Fix regression in :meth:`GroupBy.rolling.apply` ignores args and kwargs parameters (:issue:`33433`) + Fix regression in error message with np.min or np.max on unordered :class:`Categorical` (:issue:`33115`) + Fix regression in :meth:`DataFrame.loc` and :meth:`Series.loc` throwing an error when a datetime64[ns, tz] value is provided (:issue:`32395`) * Bug fixes + Bug in :meth:`SeriesGroupBy.first`, :meth:`SeriesGroupBy.last`, :meth:`SeriesGroupBy.min`, and :meth:`SeriesGroupBy.max` returning floats when applied to nullable Booleans (:issue:`33071`) + Bug in :meth:`Rolling.min` and :meth:`Rolling.max`: Growing memory usage after multiple calls when using a fixed window (:issue:`30726`) + Bug in :meth:`~DataFrame.to_parquet` was not raising PermissionError when writing to a private s3 bucket with invalid creds. (:issue:`27679`) + Bug in :meth:`~DataFrame.to_csv` was silently failing when writing to an invalid s3 bucket. (:issue:`32486`) + Bug in :meth:`read_parquet` was raising a FileNotFoundError when passed an s3 directory path. (:issue:`26388`) + Bug in :meth:`~DataFrame.to_parquet` was throwing an AttributeError when writing a partitioned parquet file to s3 (:issue:`27596`) + Bug in :meth:`GroupBy.quantile` causes the quantiles to be shifted when the by axis contains NaN (:issue:`33200`, :issue:`33569`) * Mon May 25 2020 Martin Liška - Add gcc10-skip-one-test.patch in order to fix a failing test-case on i586. 
* Sat Mar 28 2020 Arun Persaud - update to 1.0.3: * Fixed regressions + Fixed regression in resample.agg when the underlying data is non-writeable (GH31710) + Fixed regression in DataFrame exponentiation with reindexing (GH32685)- Increase memory _constraints to 8GB RAM. * Mon Mar 16 2020 Tomáš Chvátal - Skip i586 failing tests with upstream ticket * Fri Mar 13 2020 Hans-Peter Jansen - Update to 1.0.2: * see https://pandas.pydata.org/pandas-docs/stable/whatsnew/v1.0.2.html- Add pyperclip and Jinja2 as test dependencies * Mon Mar 09 2020 Dirk Mueller - Update to 1.0.1: * see https://pandas.pydata.org/pandas-docs/stable/whatsnew/v1.0.1.html * see https://pandas.pydata.org/pandas-docs/stable/whatsnew/v1.0.0.html * Tue Jan 14 2020 Tomáš Chvátal - Skip one test that fails on 32bit: test_encode_non_c_locale * Mon Nov 11 2019 Steve Kowalik - Update to version 0.25.3 + Support Python 3.8 + Bug fixes > Indexing * Fix regression in DataFrame.reindex() not following the limit argument * Fix regression in RangeIndex.get_indexer() for decreasing RangeIndex where target values may be improperly identified as missing/present > I/O * Fix regression in notebook display where tags were missing for DataFrame.index values * Regression in to_csv() where writing a Series or DataFrame indexed by an IntervalIndex would incorrectly raise a TypeError * Fix to_csv() with ExtensionArray with list-like values > Groupby/resample/rolling * Bug incorrectly raising an IndexError when passing a list of quantiles to pandas.core.groupby.DataFrameGroupBy.quantile() * Bug in pandas.core.groupby.GroupBy.shift(), pandas.core.groupby.GroupBy.bfill() and pandas.core.groupby.GroupBy.ffill() where timezone information would be dropped * Bug in DataFrameGroupBy.quantile() where NA values in the grouping could cause segfaults or incorrect results * Fri Sep 20 2019 Tomáš Chvátal - Use xdist to run tests in threads, it takes ages otherwise * Wed Aug 28 2019 Todd R - Update to version 0.25.1 + Bug fixes > 
Categorical * Bug in :meth:`Categorical.fillna` that would replace all values, not just those that are ``NaN`` > Datetimelike * Bug in :func:`to_datetime` where passing a timezone-naive :class:`DatetimeArray` or :class:`DatetimeIndex` and ``utc=True`` would incorrectly return a timezone-naive result * Bug in :meth:`Period.to_timestamp` where a :class:`Period` outside the :class:`Timestamp` implementation bounds (roughly 1677-09-21 to 2262-04-11) would return an incorrect :class:`Timestamp` instead of raising ``OutOfBoundsDatetime`` * Bug in iterating over :class:`DatetimeIndex` when the underlying data is read-only > Timezones * Bug in :class:`Index` where a numpy object array with a timezone aware :class:`Timestamp` and ``np.nan`` would not return a :class:`DatetimeIndex` > Numeric * Bug in :meth:`Series.interpolate` when using a timezone aware :class:`DatetimeIndex` * Bug when printing negative floating point complex numbers would raise an ``IndexError`` * Bug where :class:`DataFrame` arithmetic operators such as :meth:`DataFrame.mul` with a :class:`Series` with axis=1 would raise an ``AttributeError`` on :class:`DataFrame` larger than the minimum threshold to invoke numexpr * Bug in :class:`DataFrame` arithmetic where missing values in results were incorrectly masked with ``NaN`` instead of ``Inf`` > Conversion * Improved the warnings for the deprecated methods :meth:`Series.real` and :meth:`Series.imag` > Interval * Bug in :class:`IntervalIndex` where `dir(obj)` would raise ``ValueError`` > Indexing * Bug in partial-string indexing returning a NumPy array rather than a ``Series`` when indexing with a scalar like ``.loc[\'2015\']`` * Break reference cycle involving :class:`Index` and other index classes to allow garbage collection of index objects without running the GC. * Fix regression in assigning values to a single column of a DataFrame with a ``MultiIndex`` columns. * Fix regression in ``.ix`` fallback with an ``IntervalIndex``. 
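The Timestamp bounds quoted in the Period.to_timestamp item above (roughly 1677-09-21 to 2262-04-11) can be reproduced with plain arithmetic. The sketch below uses only the Python standard library and assumes that a Timestamp is stored as a signed 64-bit count of nanoseconds since 1970-01-01; the names EPOCH, INT64_MAX_NS, span, lower and upper are illustrative and are not part of pandas.

```python
from datetime import datetime, timedelta

# Assumption: a Timestamp is a signed 64-bit nanosecond offset from the Unix epoch.
EPOCH = datetime(1970, 1, 1)
INT64_MAX_NS = 2**63 - 1

# datetime only resolves microseconds, so the last three digits are dropped.
span = timedelta(microseconds=INT64_MAX_NS // 1000)

upper = EPOCH + span  # latest representable instant (to the microsecond)
lower = EPOCH - span  # earliest representable instant (to the microsecond)

print(lower.date(), upper.date())  # 1677-09-21 2262-04-11
```

A Period that falls outside this window cannot be held in such a value, which is why the fixed behaviour is to raise ``OutOfBoundsDatetime`` rather than return a wrapped-around date.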
> Missing * Bug in :func:`pandas.isnull` or :func:`pandas.isna` when the input is a type e.g. ``type(pandas.Series())`` > I/O * Avoid calling ``S3File.s3`` when reading parquet, as this was removed in s3fs version 0.3.0 * Better error message when a negative header is passed in :func:`pandas.read_csv` * Follow the ``min_rows`` display option (introduced in v0.25.0) correctly in the HTML repr in the notebook. > Plotting * Added a ``pandas_plotting_backends`` entrypoint group for registering plot backends. See :ref:`extending.plotting-backends` for more. * Fixed the re-instatement of Matplotlib datetime converters after calling :meth:`pandas.plotting.deregister_matplotlib_converters`. * Fix compatibility issue with matplotlib when passing a pandas ``Index`` to a plot call. > Groupby/resample/rolling * Fixed regression in :meth:`pandas.core.groupby.DataFrameGroupBy.quantile` raising when multiple quantiles are given * Bug in :meth:`pandas.core.groupby.DataFrameGroupBy.transform` where applying a timezone conversion lambda function would drop timezone information * Bug in :meth:`pandas.core.groupby.GroupBy.nth` where ``observed=False`` was being ignored for Categorical groupers * Bug in windowing over read-only arrays * Fixed segfault in `pandas.core.groupby.DataFrameGroupBy.quantile` when an invalid quantile was passed > Reshaping * A ``KeyError`` is now raised if ``.unstack()`` is called on a :class:`Series` or :class:`DataFrame` with a flat :class:`Index` passing a name which is not the correct one * Bug where :meth:`merge_asof` could not merge :class:`Timedelta` objects when passing the `tolerance` kwarg * Bug in :meth:`DataFrame.crosstab` when ``margins`` set to ``True`` and ``normalize`` is not ``False``, an error is raised. 
* :meth:`DataFrame.join` now suppresses the ``FutureWarning`` when the sort parameter is specified * Bug in :meth:`DataFrame.join` raising with readonly arrays > Sparse * Bug in reductions for :class:`Series` with Sparse dtypes > Other * Bug in :meth:`Series.replace` and :meth:`DataFrame.replace` when replacing timezone-aware timestamps using a dict-like replacer * Bug in :meth:`Series.rename` when using a custom type indexer. Now any value that isn't callable or dict-like is treated as a scalar. * Mon Jul 22 2019 Todd R - Update to Version 0.25.0 + Warning * Starting with the 0.25.x series of releases, pandas only supports Python 3.5.3 and higher. * The minimum supported Python version will be bumped to 3.6 in a future release. * Panel has been fully removed. For N-D labeled data structures, please use xarray * read_pickle and read_msgpack are only guaranteed backwards compatible back to pandas version 0.20.3 + Enhancements * Groupby aggregation with relabeling: Pandas has added special groupby behavior, known as "named aggregation", for naming the output columns when applying multiple aggregation functions to specific columns. * Groupby Aggregation with multiple lambdas: You can now provide multiple lambda functions to a list-like aggregation in pandas.core.groupby.GroupBy.agg. * Better repr for MultiIndex: Printing of MultiIndex instances now shows tuples of each row and ensures that the tuple items are vertically aligned, so it's now easier to understand the structure of the MultiIndex. * Shorter truncated repr for Series and DataFrame: Currently, the default display options of pandas ensure that when a Series or DataFrame has more than 60 rows, its repr gets truncated to this maximum of 60 rows (the display.max_rows option). However, this still gives a repr that takes up a large part of the vertical screen estate. 
Therefore, a new option display.min_rows is introduced with a default of 10 which determines the number of rows shown in the truncated repr. * Json normalize with max_level param support: json_normalize normalizes the provided input dict to all nested levels. The new max_level parameter provides more control over which level to end normalization. * Series.explode to split list-like values to rows: Series and DataFrame have gained the Series.explode and DataFrame.explode methods to transform list-likes to individual rows. * DataFrame.plot keywords logy, logx and loglog can now accept the value 'sym' for symlog scaling. * Added support for ISO week year format ('%G-%V-%u') when parsing datetimes using to_datetime * Indexing of DataFrame and Series now accepts zero-dimensional np.ndarray * Timestamp.replace now supports the fold argument to disambiguate DST transition times * DataFrame.at_time and Series.at_time now support datetime.time objects with timezones * DataFrame.pivot_table now accepts an observed parameter which is passed to underlying calls to DataFrame.groupby to speed up grouping categorical data. * Series.str has gained the Series.str.casefold method to remove all case distinctions present in a string * DataFrame.set_index now works for instances of abc.Iterator, provided their output is of the same length as the calling frame * DatetimeIndex.union now supports the sort argument. The behavior of the sort parameter matches that of Index.union * RangeIndex.union now supports the sort argument. If sort=False an unsorted Int64Index is always returned. 
sort=None is the default and returns a monotonically increasing RangeIndex if possible or a sorted Int64Index if not * TimedeltaIndex.intersection now also supports the sort keyword * DataFrame.rename now supports the errors argument to raise errors when attempting to rename nonexistent keys * Added api.frame.sparse for working with a DataFrame whose values are sparse * RangeIndex has gained ~RangeIndex.start, ~RangeIndex.stop, and ~RangeIndex.step attributes * datetime.timezone objects are now supported as arguments to timezone methods and constructors * DataFrame.query and DataFrame.eval now support quoting column names with backticks to refer to names with spaces * merge_asof now gives a clearer error message when merge keys are categoricals that are not equal * pandas.core.window.Rolling supports exponential (or Poisson) window type * Error message for missing required imports now includes the original import error's text * DatetimeIndex and TimedeltaIndex now have a mean method * DataFrame.describe now formats integer percentiles without decimal point * Added support for reading SPSS .sav files using read_spss * Added new option plotting.backend to be able to select a plotting backend different than the existing matplotlib one. Use pandas.set_option('plotting.backend', '') where * pandas.offsets.BusinessHour supports multiple opening hours intervals * read_excel can now use openpyxl to read Excel files via the engine='openpyxl' argument. This will become the default in a future release * pandas.io.excel.read_excel supports reading OpenDocument tables. Specify engine='odf' to enable. 
Consult the IO User Guide for more details * Interval, IntervalIndex, and ~arrays.IntervalArray have gained an ~Interval.is_empty attribute denoting if the given interval(s) are empty + Backwards incompatible API changes * Indexing with date strings with UTC offsets: Indexing a DataFrame or Series with a DatetimeIndex with a date string with a UTC offset would previously ignore the UTC offset. Now, the UTC offset is respected in indexing. * MultiIndex constructed from levels and codes: Constructing a MultiIndex with NaN levels or codes value < -1 was allowed previously. Now, construction with codes value < -1 is not allowed and NaN levels' corresponding codes would be reassigned as -1. * Groupby.apply on DataFrame evaluates first group only once: The implementation of DataFrameGroupBy.apply() previously evaluated the supplied function consistently twice on the first group to infer if it is safe to use a fast code path. Particularly for functions with side effects, this was an undesired behavior and may have led to surprises. * Concatenating sparse values: When passed DataFrames whose values are sparse, concat will now return a Series or DataFrame with sparse values, rather than a SparseDataFrame. * The .str-accessor performs stricter type checks: Due to the lack of more fine-grained dtypes, Series.str so far only checked whether the data was of object dtype. Series.str will now infer the dtype data *within* the Series; in particular, 'bytes'-only data will raise an exception (except for Series.str.decode, Series.str.get, Series.str.len, Series.str.slice). * Categorical dtypes are preserved during groupby: Previously, columns that were categorical, but not the groupby key(s) would be converted to object dtype during groupby operations. Pandas now will preserve these dtypes. * Incompatible Index type unions: When performing Index.union operations between objects of incompatible dtypes, the result will be a base Index of dtype object. 
This behavior holds true for unions between Index objects that previously would have been prohibited. The dtype of empty Index objects will now be evaluated before performing union operations rather than simply returning the other Index object. Index.union can now be considered commutative, such that A.union(B) == B.union(A). * DataFrame groupby ffill/bfill no longer return group labels: The methods ffill, bfill, pad and backfill of DataFrameGroupBy previously included the group labels in the return value, which was inconsistent with other groupby transforms. Now only the filled values are returned. * DataFrame describe on an empty categorical / object column will return top and freq: When calling DataFrame.describe with an empty categorical / object column, the 'top' and 'freq' columns were previously omitted, which was inconsistent with the output for non-empty columns. Now the 'top' and 'freq' columns will always be included, with numpy.nan in the case of an empty DataFrame * __str__ methods now call __repr__ rather than vice versa: Pandas has until now mostly defined string representations in a Pandas object's __str__/__unicode__/__bytes__ methods, and called __str__ from the __repr__ method, if a specific __repr__ method is not found. This is not needed for Python3. In Pandas 0.25, the string representations of Pandas objects are now generally defined in __repr__, and calls to __str__ in general now pass the call on to the __repr__, if a specific __str__ method doesn't exist, as is standard for Python. This change is backward compatible for direct usage of Pandas, but if you subclass Pandas objects *and* give your subclasses specific __str__/__repr__ methods, you may have to adjust your __str__/__repr__ methods. * Indexing an IntervalIndex with Interval objects: Indexing methods for IntervalIndex have been modified to require exact matches only for Interval queries. IntervalIndex methods previously matched on any overlapping Interval. 
Behavior with scalar points, e.g. querying with an integer, is unchanged. * Binary ufuncs on Series now align: Applying a binary ufunc like numpy.power now aligns the inputs when both are Series. * Categorical.argsort now places missing values at the end: Categorical.argsort now places missing values at the end of the array, making it consistent with NumPy and the rest of pandas. * Column order is preserved when passing a list of dicts to DataFrame: Starting with Python 3.7 the key-order of dict is guaranteed. In practice, this has been true since Python 3.6. The DataFrame constructor now treats a list of dicts in the same way as it does a list of OrderedDict, i.e. preserving the order of the dicts. This change applies only when pandas is running on Python>=3.6. * Increased minimum versions for dependencies * DatetimeTZDtype will now standardize pytz timezones to a common timezone instance * Timestamp and Timedelta scalars now implement the to_numpy method as aliases to Timestamp.to_datetime64 and Timedelta.to_timedelta64, respectively. * Timestamp.strptime will now raise a NotImplementedError * Comparing Timestamp with unsupported objects now returns NotImplemented instead of raising TypeError. This implies that unsupported rich comparisons are delegated to the other object, and are now consistent with Python 3 behavior for datetime objects * Bug in DatetimeIndex.snap which didn't preserve the name of the input Index * The arg argument in pandas.core.groupby.DataFrameGroupBy.agg has been renamed to func * The arg argument in pandas.core.window._Window.aggregate has been renamed to func * Most Pandas classes had a __bytes__ method, which was used for getting a python2-style bytestring representation of the object. 
This method has been removed as a part of dropping Python2 * The .str-accessor has been disabled for 1-level MultiIndex, use MultiIndex.to_flat_index if necessary * Removed support of gtk package for clipboards * Using an unsupported version of Beautiful Soup 4 will now raise an ImportError instead of a ValueError * Series.to_excel and DataFrame.to_excel will now raise a ValueError when saving timezone aware data. * ExtensionArray.argsort places NA values at the end of the sorted array. * DataFrame.to_hdf and Series.to_hdf will now raise a NotImplementedError when saving a MultiIndex with extension data types for a fixed format. * Passing duplicate names in read_csv will now raise a ValueError + Deprecations * Sparse subclasses: The SparseSeries and SparseDataFrame subclasses are deprecated. Their functionality is better-provided by a Series or DataFrame with sparse values. * msgpack format: The msgpack format is deprecated as of 0.25 and will be removed in a future version. It is recommended to use pyarrow for on-the-wire transmission of pandas objects. * The deprecated .ix[] indexer now raises a more visible FutureWarning instead of DeprecationWarning. * Deprecated the units=M (months) and units=Y (year) parameters for units of pandas.to_timedelta, pandas.Timedelta and pandas.TimedeltaIndex * pandas.concat has deprecated the join_axes-keyword. Instead, use DataFrame.reindex or DataFrame.reindex_like on the result or on the inputs * The SparseArray.values attribute is deprecated. You can use np.asarray(...) or the SparseArray.to_dense method instead. * The functions pandas.to_datetime and pandas.to_timedelta have deprecated the box keyword. Instead, use to_numpy or Timestamp.to_datetime64 or Timedelta.to_timedelta64. * The DataFrame.compound and Series.compound methods are deprecated and will be removed in a future version. * The internal attributes _start, _stop and _step attributes of RangeIndex have been deprecated. 
Use the public attributes ~RangeIndex.start, ~RangeIndex.stop and ~RangeIndex.step instead. * The Series.ftype, Series.ftypes and DataFrame.ftypes methods are deprecated and will be removed in a future version. Instead, use Series.dtype and DataFrame.dtypes. * The Series.get_values, DataFrame.get_values, Index.get_values, SparseArray.get_values and Categorical.get_values methods are deprecated. One of np.asarray(..) or ~Series.to_numpy can be used instead. * The 'outer' method on NumPy ufuncs, e.g. np.subtract.outer has been deprecated on Series objects. Convert the input to an array with Series.array first * Timedelta.resolution is deprecated and replaced with Timedelta.resolution_string. In a future version, Timedelta.resolution will be changed to behave like the standard library datetime.timedelta.resolution * read_table has been undeprecated. * Index.dtype_str is deprecated. * Series.imag and Series.real are deprecated. * Series.put is deprecated. * Index.item and Series.item are deprecated. * The default value ordered=None in ~pandas.api.types.CategoricalDtype has been deprecated in favor of ordered=False. When converting between categorical types ordered=True must be explicitly passed in order to be preserved. * Index.contains is deprecated. Use key in index (__contains__) instead. * DataFrame.get_dtype_counts is deprecated. 
* Categorical.ravel will return a Categorical instead of a np.ndarray + Removal of prior version deprecations/changes * Removed Panel * Removed the previously deprecated sheetname keyword in read_excel * Removed the previously deprecated TimeGrouper * Removed the previously deprecated parse_cols keyword in read_excel * Removed the previously deprecated pd.options.html.border * Removed the previously deprecated convert_objects * Removed the previously deprecated select method of DataFrame and Series * Removed the previously deprecated behavior of Series treated as list-like in ~Series.cat.rename_categories * Removed the previously deprecated DataFrame.reindex_axis and Series.reindex_axis * Removed the previously deprecated behavior of altering column or index labels with Series.rename_axis or DataFrame.rename_axis * Removed the previously deprecated tupleize_cols keyword argument in read_html, read_csv, and DataFrame.to_csv * Removed the previously deprecated DataFrame.from_csv and Series.from_csv * Removed the previously deprecated raise_on_error keyword argument in DataFrame.where and DataFrame.mask * Removed the previously deprecated ordered and categories keyword arguments in astype * Removed the previously deprecated cdate_range * Removed the previously deprecated True option for the dropna keyword argument in SeriesGroupBy.nth * Removed the previously deprecated convert keyword argument in Series.take and DataFrame.take + Performance improvements * Significant speedup in SparseArray initialization that benefits most operations, fixing performance regression introduced in v0.20.0 * DataFrame.to_stata() is now faster when outputting data with any string or non-native endian columns * Improved performance of Series.searchsorted. 
The speedup is especially large when the dtype is int8/int16/int32 and the searched key is within the integer bounds for the dtype * Improved performance of pandas.core.groupby.GroupBy.quantile * Improved performance of slicing and other selected operation on a RangeIndex * RangeIndex now performs standard lookup without instantiating an actual hashtable, hence saving memory * Improved performance of read_csv by faster tokenizing and faster parsing of small float numbers * Improved performance of read_csv by faster parsing of N/A and boolean values * Improved performance of IntervalIndex.is_monotonic, IntervalIndex.is_monotonic_increasing and IntervalIndex.is_monotonic_decreasing by removing conversion to MultiIndex * Improved performance of DataFrame.to_csv when writing datetime dtypes * Improved performance of read_csv by much faster parsing of MM/YYYY and DD/MM/YYYY datetime formats * Improved performance of nanops for dtypes that cannot store NaNs. Speedup is particularly prominent for Series.all and Series.any * Improved performance of Series.map for dictionary mappers on categorical series by mapping the categories instead of mapping all values * Improved performance of IntervalIndex.intersection * Improved performance of read_csv by faster concatenating date columns without extra conversion to string for integer/float zero and float NaN; by faster checking the string for the possibility of being a date * Improved performance of IntervalIndex.is_unique by removing conversion to MultiIndex * Restored performance of DatetimeIndex.__iter__ by re-enabling specialized code path * Improved performance when building MultiIndex with at least one CategoricalIndex level * Improved performance by removing the need for a garbage collect when checking for SettingWithCopyWarning * For to_datetime changed default value of cache parameter to True * Improved performance of DatetimeIndex and PeriodIndex slicing given non-unique, monotonic data . 
* Improved performance of pd.read_json for index-oriented data. * Improved performance of MultiIndex.shape . + Bug fixes > Categorical * Bug in DataFrame.at and Series.at that would raise exception if the index was a CategoricalIndex * Fixed bug in comparison of ordered Categorical that contained missing values with a scalar which sometimes incorrectly resulted in True * Bug in DataFrame.dropna when the DataFrame has a CategoricalIndex containing Interval objects incorrectly raised a TypeError > Datetimelike * Bug in to_datetime which would raise an (incorrect) ValueError when called with a date far into the future and the format argument specified instead of raising OutOfBoundsDatetime * Bug in to_datetime which would raise InvalidIndexError: Reindexing only valid with uniquely valued Index objects when called with cache=True, with arg including at least two different elements from the set {None, numpy.nan, pandas.NaT} * Bug in DataFrame and Series where timezone aware data with dtype=\'datetime64[ns] was not cast to naive * Improved Timestamp type checking in various datetime functions to prevent exceptions when using a subclassed datetime * Bug in Series and DataFrame repr where np.datetime64(\'NaT\') and np.timedelta64(\'NaT\') with dtype=object would be represented as NaN * Bug in to_datetime which does not replace the invalid argument with NaT when error is set to coerce * Bug in adding DateOffset with nonzero month to DatetimeIndex would raise ValueError * Bug in to_datetime which raises unhandled OverflowError when called with mix of invalid dates and NaN values with format=\'%Y%m%d\' and error=\'coerce\' * Bug in isin for datetimelike indexes; DatetimeIndex, TimedeltaIndex and PeriodIndex where the levels parameter was ignored. 
* Bug in to_datetime which raises TypeError for format='%Y%m%d' when called for invalid integer dates with length >= 6 digits with errors='ignore' * Bug when comparing a PeriodIndex against a zero-dimensional numpy array * Bug in constructing a Series or DataFrame from a numpy datetime64 array with a non-ns unit and out-of-bound timestamps generating rubbish data, which will now correctly raise an OutOfBoundsDatetime error. * Bug in date_range with unnecessary OverflowError being raised for very large or very small dates * Bug where adding Timestamp to a np.timedelta64 object would raise instead of returning a Timestamp * Bug where comparing a zero-dimensional numpy array containing a np.datetime64 object to a Timestamp would incorrectly raise TypeError * Bug in to_datetime which would raise ValueError: Tz-aware datetime.datetime cannot be converted to datetime64 unless utc=True when called with cache=True, with arg including datetime strings with different offset > Timedelta * Bug in TimedeltaIndex.intersection where for non-monotonic indices in some cases an empty Index was returned when in fact an intersection existed * Bug with comparisons between Timedelta and NaT raising TypeError * Bug when adding or subtracting a BusinessHour to a Timestamp with the resulting time landing in a following or prior day respectively * Bug when comparing a TimedeltaIndex against a zero-dimensional numpy array > Timezones * Bug in DatetimeIndex.to_frame where timezone aware data would be converted to timezone naive data * Bug in to_datetime with utc=True and datetime strings that would apply previously parsed UTC offsets to subsequent arguments * Bug in Timestamp.tz_localize and Timestamp.tz_convert does not propagate freq * Bug in Series.at where setting Timestamp with timezone raises TypeError * Bug in DataFrame.update when updating with timezone aware data would return timezone naive data * Bug in to_datetime where an uninformative RuntimeError was raised when passing a 
naive Timestamp with datetime strings with mixed UTC offsets * Bug in to_datetime with unit=\'ns\' would drop timezone information from the parsed argument * Bug in DataFrame.join where joining a timezone aware index with a timezone aware column would result in a column of NaN * Bug in date_range where ambiguous or nonexistent start or end times were not handled by the ambiguous or nonexistent keywords respectively * Bug in DatetimeIndex.union when combining a timezone aware and timezone unaware DatetimeIndex * Bug when applying a numpy reduction function (e.g. numpy.minimum) to a timezone aware Series > Numeric * Bug in to_numeric in which large negative numbers were being improperly handled * Bug in to_numeric in which numbers were being coerced to float, even though errors was not coerce * Bug in to_numeric in which invalid values for errors were being allowed * Bug in format in which floating point complex numbers were not being formatted to proper display precision and trimming * Bug in error messages in DataFrame.corr and Series.corr. Added the possibility of using a callable. * Bug in Series.divmod and Series.rdivmod which would raise an (incorrect) ValueError rather than return a pair of Series objects as result * Raises a helpful exception when a non-numeric index is sent to interpolate with methods which require numeric index. * Bug in ~pandas.eval when comparing floats with scalar operators, for example: x < -0.1 * Fixed bug where casting all-boolean array to integer extension array failed * Bug in divmod with a Series object containing zeros incorrectly raising AttributeError * Inconsistency in Series floor-division (//) and divmod filling positive//zero with NaN instead of Inf > Conversion * Bug in DataFrame.astype() when passing a dict of columns and types the errors parameter was ignored. 
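As a rough illustration of the Series.divmod entry in the Numeric list above (this sketch is not part of the upstream release notes, and the sample values are made up), the call is expected to hand back a pair of Series, quotient first and remainder second:

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration.
s = pd.Series([7, 8, 9])

# divmod returns a tuple of two Series: (floor quotient, remainder).
quotient, remainder = s.divmod(2)

print(quotient.tolist())
print(remainder.tolist())
```

Unpacking the result directly, as done here, only works when the call returns the pair rather than raising, which is the behaviour the entry describes.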
> Strings * Bug in the __name__ attribute of several methods of Series.str, which were set incorrectly * Improved error message when passing Series of wrong dtype to Series.str.cat > Interval * Construction of Interval is restricted to numeric, Timestamp and Timedelta endpoints * Fixed bug in Series/DataFrame not displaying NaN in IntervalIndex with missing values * Bug in IntervalIndex.get_loc where a KeyError would be incorrectly raised for a decreasing IntervalIndex * Bug in Index constructor where passing mixed closed Interval objects would result in a ValueError instead of an object dtype Index > Indexing * Improved exception message when calling DataFrame.iloc with a list of non-numeric objects. * Improved exception message when calling .iloc or .loc with a boolean indexer with different length. * Bug in KeyError exception message when indexing a MultiIndex with a nonexistent key not displaying the original key. * Bug in .iloc and .loc with a boolean indexer not raising an IndexError when too few items are passed. * Bug in DataFrame.loc and Series.loc where KeyError was not raised for a MultiIndex when the key was less than or equal to the number of levels in the MultiIndex. * Bug in which DataFrame.append produced an erroneous warning indicating that a KeyError will be thrown in the future when the data to be appended contains new columns. * Bug in which DataFrame.to_csv caused a segfault for a reindexed data frame, when the indices were single-level MultiIndex. 
* Fixed bug where assigning an arrays.PandasArray to a pandas.core.frame.DataFrame would raise error * Allow keyword arguments for callable local reference used in the DataFrame.query string * Fixed a KeyError when indexing a MultiIndex level with a list containing exactly one label, which is missing * Bug which produced AttributeError on partial matching Timestamp in a MultiIndex * Bug in Categorical and CategoricalIndex with Interval values when using the in operator (__contains__) with objects that are not comparable to the values in the Interval * Bug in DataFrame.loc and DataFrame.iloc on a DataFrame with a single timezone-aware datetime64[ns] column incorrectly returning a scalar instead of a Series * Bug in CategoricalIndex and Categorical incorrectly raising ValueError instead of TypeError when a list is passed using the in operator (__contains__) * Bug in setting a new value in a Series with a Timedelta object incorrectly casting the value to an integer * Bug in Series setting a new key (__setitem__) with a timezone-aware datetime incorrectly raising ValueError * Bug in DataFrame.iloc when indexing with a read-only indexer * Bug in Series setting an existing tuple key (__setitem__) with timezone-aware datetime values incorrectly raising TypeError > Missing * Fixed misleading exception message in Series.interpolate if argument order is required, but omitted. 
* Fixed class type displayed in exception message in DataFrame.dropna if invalid axis parameter passed * A ValueError will now be thrown by DataFrame.fillna when limit is not a positive integer > MultiIndex * Bug in which incorrect exception raised by Timedelta when testing the membership of MultiIndex > I/O * Bug in DataFrame.to_html() where values were truncated using display options instead of outputting the full content * Fixed bug in missing text when using to_clipboard if copying utf-16 characters in Python 3 on Windows * Bug in read_json for orient=\'table\' when it tries to infer dtypes by default, which is not applicable as dtypes are already defined in the JSON schema * Bug in read_json for orient=\'table\' and float index, as it infers index dtype by default, which is not applicable because index dtype is already defined in the JSON schema * Bug in read_json for orient=\'table\' and string of float column names, as it makes a column name type conversion to Timestamp, which is not applicable because column names are already defined in the JSON schema * Bug in json_normalize for errors=\'ignore\' where missing values in the input data, were filled in resulting DataFrame with the string \"nan\" instead of numpy.nan * DataFrame.to_html now raises TypeError when using an invalid type for the classes parameter instead of AssertionError * Bug in DataFrame.to_string and DataFrame.to_latex that would lead to incorrect output when the header keyword is used * Bug in read_csv not properly interpreting the UTF8 encoded filenames on Windows on Python 3.6+ * Improved performance in pandas.read_stata and pandas.io.stata.StataReader when converting columns that have missing values * Bug in DataFrame.to_html where header numbers would ignore display options when rounding * Bug in read_hdf where reading a table from an HDF5 file written directly with PyTables fails with a ValueError when using a sub-selection via the start or stop arguments * Bug in read_hdf not properly 
closing store after a KeyError is raised * Improved the explanation for the failure when value labels are repeated in Stata dta files and suggested work-arounds * Improved pandas.read_stata and pandas.io.stata.StataReader to read incorrectly formatted 118 format files saved by Stata * Improved the col_space parameter in DataFrame.to_html to accept a string so CSS length values can be set correctly * Fixed bug in loading objects from S3 that contain # characters in the URL * Adds use_bqstorage_api parameter to read_gbq to speed up downloads of large data frames. This feature requires version 0.10.0 of the pandas-gbq library as well as the google-cloud-bigquery-storage and fastavro libraries. * Fixed memory leak in DataFrame.to_json when dealing with numeric data * Bug in read_json where date strings with Z were not converted to a UTC timezone * Added cache_dates=True parameter to read_csv, which allows to cache unique dates when they are parsed * DataFrame.to_excel now raises a ValueError when the caller\'s dimensions exceed the limitations of Excel * Fixed bug in pandas.read_csv where a BOM would result in incorrect parsing using engine=\'python\' * read_excel now raises a ValueError when input is of type pandas.io.excel.ExcelFile and engine param is passed since pandas.io.excel.ExcelFile has an engine defined * Bug while selecting from HDFStore with where=\'\' specified . * Fixed bug in DataFrame.to_excel() where custom objects (i.e. 
PeriodIndex) inside merged cells were not being converted into types safe for the Excel writer * Bug in read_hdf where reading a timezone aware DatetimeIndex would raise a TypeError * Bug in to_msgpack and read_msgpack which would raise a ValueError rather than a FileNotFoundError for an invalid path * Fixed bug in DataFrame.to_parquet which would raise a ValueError when the dataframe had no columns * Allow parsing of PeriodDtype columns when using read_csv > Plotting * Fixed bug where api.extensions.ExtensionArray could not be used in matplotlib plotting * Bug in an error message in DataFrame.plot. Improved the error message if non-numerics are passed to DataFrame.plot * Bug in incorrect ticklabel positions when plotting an index that are non-numeric / non-datetime * Fixed bug causing plots of PeriodIndex timeseries to fail if the frequency is a multiple of the frequency rule code * Fixed bug when plotting a DatetimeIndex with datetime.timezone.utc timezone > Groupby/resample/rolling * Bug in pandas.core.resample.Resampler.agg with a timezone aware index where OverflowError would raise when passing a list of functions * Bug in pandas.core.groupby.DataFrameGroupBy.nunique in which the names of column levels were lost * Bug in pandas.core.groupby.GroupBy.agg when applying an aggregation function to timezone aware data * Bug in pandas.core.groupby.GroupBy.first and pandas.core.groupby.GroupBy.last where timezone information would be dropped * Bug in pandas.core.groupby.GroupBy.size when grouping only NA values * Bug in Series.groupby where observed kwarg was previously ignored * Bug in Series.groupby where using groupby with a MultiIndex Series with a list of labels equal to the length of the series caused incorrect grouping * Ensured that ordering of outputs in groupby aggregation functions is consistent across all versions of Python * Ensured that result group order is correct when grouping on an ordered Categorical and specifying observed=True * Bug in 
pandas.core.window.Rolling.min and pandas.core.window.Rolling.max that caused a memory leak * Bug in pandas.core.window.Rolling.count and pandas.core.window.Expanding.count was previously ignoring the axis keyword * Bug in pandas.core.groupby.GroupBy.idxmax and pandas.core.groupby.GroupBy.idxmin with datetime column would return incorrect dtype * Bug in pandas.core.groupby.GroupBy.cumsum, pandas.core.groupby.GroupBy.cumprod, pandas.core.groupby.GroupBy.cummin and pandas.core.groupby.GroupBy.cummax with categorical column having absent categories, would return incorrect result or segfault * Bug in pandas.core.groupby.GroupBy.nth where NA values in the grouping would return incorrect results * Bug in pandas.core.groupby.SeriesGroupBy.transform where transforming an empty group would raise a ValueError * Bug in pandas.core.frame.DataFrame.groupby where passing a pandas.core.groupby.grouper.Grouper would return incorrect groups when using the .groups accessor * Bug in pandas.core.groupby.GroupBy.agg where incorrect results are returned for uint64 columns. * Bug in pandas.core.window.Rolling.median and pandas.core.window.Rolling.quantile where MemoryError is raised with empty window * Bug in pandas.core.window.Rolling.median and pandas.core.window.Rolling.quantile where incorrect results are returned with closed=\'left\' and closed=\'neither\' * Improved pandas.core.window.Rolling, pandas.core.window.Window and pandas.core.window.EWM functions to exclude nuisance columns from results instead of raising errors and raise a DataError only if all columns are nuisance * Bug in pandas.core.window.Rolling.max and pandas.core.window.Rolling.min where incorrect results are returned with an empty variable window * Raise a helpful exception when an unsupported weighted window function is used as an argument of pandas.core.window.Window.aggregate > Reshaping * Bug in pandas.merge adds a string of None, if None is assigned in suffixes instead of remain the column name as-is . 
* Bug in merge when merging by index name would sometimes result in an incorrectly numbered index (missing index values are now assigned NA) * to_records now accepts dtypes to its column_dtypes parameter * Bug in concat where order of OrderedDict (and dict in Python 3.6+) is not respected, when passed in as objs argument * Bug in pivot_table where columns with NaN values are dropped even if dropna argument is False, when the aggfunc argument contains a list * Bug in concat where the resulting freq of two DatetimeIndex with the same freq would be dropped. * Bug in merge where merging with equivalent Categorical dtypes was raising an error * Bug in DataFrame where instantiating with a dict of iterators or generators (e.g. pd.DataFrame({'A': reversed(range(3))})) raised an error. * Bug in DataFrame where instantiating with a range (e.g. pd.DataFrame(range(3))) raised an error. * Bug in DataFrame constructor when passing non-empty tuples would cause a segmentation fault * Bug in Series.apply where it failed when the series is a timezone aware DatetimeIndex * Bug in pandas.cut where large bins could incorrectly raise an error due to an integer overflow * Bug in DataFrame.sort_index where an error is thrown when a multi-indexed DataFrame is sorted on all levels with the initial level sorted last * Bug in Series.nlargest where True was treated as smaller than False * Bug in DataFrame.pivot_table with an IntervalIndex as pivot index would raise TypeError * Bug in which DataFrame.from_dict ignored order of OrderedDict when orient='index'. 
* Bug in DataFrame.transpose where transposing a DataFrame with a timezone-aware datetime column would incorrectly raise ValueError * Bug in pivot_table when pivoting a timezone aware column as the values would remove timezone information * Bug in merge_asof when specifying multiple by columns where one is datetime64[ns, tz] dtype > Sparse * Significant speedup in SparseArray initialization that benefits most operations, fixing performance regression introduced in v0.20.0 * Bug in SparseFrame constructor where passing None as the data would cause default_fill_value to be ignored * Bug in SparseDataFrame when adding a column in which the length of values does not match length of index, AssertionError is raised instead of raising ValueError * Introduce a better error message in Series.sparse.from_coo so it returns a TypeError for inputs that are not coo matrices * Bug in numpy.modf on a SparseArray. Now a tuple of SparseArray is returned . > Build Changes * Fix install error with PyPy on macOS > ExtensionArray * Bug in factorize when passing an ExtensionArray with a custom na_sentinel . * Series.count miscounts NA values in ExtensionArrays * Added Series.__array_ufunc__ to better handle NumPy ufuncs applied to Series backed by extension arrays . * Keyword argument deep has been removed from ExtensionArray.copy > Other * Removed unused C functions from vendored UltraJSON implementation * Allow Index and RangeIndex to be passed to numpy min and max functions * Use actual class name in repr of empty objects of a Series subclass . 
* Bug in DataFrame where passing an object array of timezone-aware datetime objects would incorrectly raise ValueError- Remove upstream-included pandas-tests-memory.patch * Sat Mar 16 2019 Arun Persaud - specfile: * require pytest-mock- update to version 0.24.2: * Fixed Regressions + Fixed regression in DataFrame.all() and DataFrame.any() where bool_only=True was ignored (GH25101) + Fixed issue in DataFrame construction with passing a mixed list of mixed types could segfault. (GH25075) + Fixed regression in DataFrame.apply() causing RecursionError when dict-like classes were passed as argument. (GH25196) + Fixed regression in DataFrame.replace() where regex=True was only replacing patterns matching the start of the string (GH25259) + Fixed regression in DataFrame.duplicated(), where empty dataframe was not returning a boolean dtyped Series. (GH25184) + Fixed regression in Series.min() and Series.max() where numeric_only=True was ignored when the Series contained Categorical data (GH25299) + Fixed regression in subtraction between Series objects with datetime64[ns] dtype incorrectly raising OverflowError when the Series on the right contains null values (GH25317) + Fixed regression in TimedeltaIndex where np.sum(index) incorrectly returned a zero-dimensional object instead of a scalar (GH25282) + Fixed regression in IntervalDtype construction where passing an incorrect string with ‘Interval’ as a prefix could result in a RecursionError. (GH25338) + Fixed regression in creating a period-dtype array from a read-only NumPy array of period objects. (GH25403) + Fixed regression in Categorical, where constructing it from a categorical Series and an explicit categories= that differed from that in the Series created an invalid object which could trigger segfaults. (GH25318) + Fixed regression in to_timedelta() losing precision when converting floating data to Timedelta data (GH25077). 
+ Fixed pip installing from source into an environment without NumPy (GH25193) + Fixed regression in DataFrame.replace() where large strings of numbers would be coerced into int64, causing an OverflowError (GH25616) + Fixed regression in factorize() when passing a custom na_sentinel value with sort=True (GH25409). + Fixed regression in DataFrame.to_csv() writing duplicate line endings with gzip compress (GH25311) * Bug Fixes + I/O o Better handling of terminal printing when the terminal dimensions are not known (GH25080) o Bug in reading a HDF5 table-format DataFrame created in Python 2, in Python 3 (GH24925) o Bug in reading a JSON with orient=\'table\' generated by DataFrame.to_json() with index=False (GH25170) o Bug where float indexes could have misaligned values when printing (GH25061) + Reshaping o Bug in transform() where applying a function to a timezone aware column would return a timezone naive result (GH24198) o Bug in DataFrame.join() when joining on a timezone aware DatetimeIndex (GH23931) o Visualization o Bug in Series.plot() where a secondary y axis could not be set to log scale (GH25545) + Other o Bug in Series.is_unique() where single occurrences of NaN were not considered unique (GH25180) o Bug in merge() when merging an empty DataFrame with an Int64 column or a non-empty DataFrame with an Int64 column that is all NaN (GH25183) o Bug in IntervalTree where a RecursionError occurs upon construction due to an overflow when adding endpoints, which also causes IntervalIndex to crash during indexing operations (GH25485) o Bug in Series.size raising for some extension-array-backed Series, rather than returning the size (GH25580) o Bug in resampling raising for nullable integer-dtype columns (GH25580) * Fri Feb 22 2019 Tomáš Chvátal - Add patch to fix testrun on 32bit: https://github.com/pandas-dev/pandas/issues/25384 * pandas-tests-memory.patch * Thu Feb 21 2019 Tomáš Chvátal - Add requirement for at least 4 GB of physical memory * Tue Feb 19 2019 Tomáš 
Chvátal - Do not delete tests, they are used even by other inheriting packages for their testing- Execute tests * Tue Feb 05 2019 Todd R - Update to 0.24.1 * The default ``sort`` value for :meth:`Index.union` has changed from ``True`` to ``None`` (:issue:`24959`). The default *behavior *, however, remains the same * Fixed regression in :meth:`DataFrame.to_dict` with ``records`` orient raising an ``AttributeError`` when the ``DataFrame`` contained more than 255 columns, or wrongly converting column names that were not valid python identifiers (:issue:`24939`, :issue:`24940`). * Fixed regression in :func:`read_sql` when passing certain queries with MySQL/pymysql (:issue:`24988`). * Fixed regression in :class:`Index.intersection` incorrectly sorting the values by default (:issue:`24959`). * Fixed regression in :func:`merge` when merging an empty ``DataFrame`` with multiple timezone-aware columns on one of the timezone-aware columns (:issue:`25014`). * Fixed regression in :meth:`Series.rename_axis` and :meth:`DataFrame.rename_axis` where passing ``None`` failed to remove the axis name (:issue:`25034`) * Fixed regression in :func:`to_timedelta` with `box=False` incorrectly returning a ``datetime64`` object instead of a ``timedelta64`` object (:issue:`24961`) * Fixed regression where custom hashable types could not be used as column keys in :meth:`DataFrame.set_index` (:issue:`24969`) * Bug in :meth:`DataFrame.groupby` with :class:`Grouper` when there is a time change (DST) and grouping frequency is ``\'1d\'`` (:issue:`24972`) * Fixed the warning for implicitly registered matplotlib converters not showing. See :ref:`whatsnew_0211.converters` for more (:issue:`24963`). 
* Fixed AttributeError when printing a DataFrame\'s HTML repr after accessing the IPython config object (:issue:`25036`) * Mon Jan 28 2019 Todd R - Update to 0.24.0 Highlights include: * Optional Integer NA Support * New APIs for accessing the array backing a Series or Index * A new top-level method for creating arrays * Store Interval and Period data in a Series or DataFrame * Support for joining on two MultiIndexes * Wed Aug 08 2018 jengelhAATTinai.de- Ensure neutrality of description. Remove future visions. Use noun phrase in summary. * Sat Aug 04 2018 toddrme2178AATTgmail.com- Update to 0.23.4 * Python 3.7 with Windows gave all missing values for rolling variance calculations (:issue:`21813`) * Bug where calling :func:`DataFrameGroupBy.agg` with a list of functions including ``ohlc`` as the non-initial element would raise a ``ValueError`` (:issue:`21716`) * Bug in ``roll_quantile`` caused a memory leak when calling ``.rolling(...).quantile(q)`` with ``q`` in (0,1) (:issue:`21965`) * Bug in :func:`Series.clip` and :func:`DataFrame.clip` cannot accept list-like threshold containing ``NaN`` (:issue:`19992`) * Sat Jul 14 2018 arunAATTgmx.de- update to version 0.23.3: * This release fixes a build issue with the sdist for Python 3.7 (GH21785) There are no other changes. * Sat Jul 07 2018 arunAATTgmx.de- update to version 0.23.2: * Fixed Regressions + Fixed regression in to_csv() when handling file-like object incorrectly (GH21471) + Re-allowed duplicate level names of a MultiIndex. Accessing a level that has a duplicate name by name still raises an error (GH19029). + Bug in both DataFrame.first_valid_index() and Series.first_valid_index() raised for a row index having duplicate values (GH21441) + Fixed printing of DataFrames with hierarchical columns with long names (GH21180) + Fixed regression in reindex() and groupby() with a MultiIndex or multiple keys that contains categorical datetime-like values (GH21390). 
+ Fixed regression in unary negative operations with object dtype (GH21380) + Bug in Timestamp.ceil() and Timestamp.floor() when timestamp is a multiple of the rounding frequency (GH21262) + Fixed regression in to_clipboard() that defaulted to copying dataframes with space delimited instead of tab delimited (GH21104) * Build Changes + The source and binary distributions no longer include test data files, resulting in smaller download sizes. Tests relying on these data files will be skipped when using pandas.test(). (GH19320) * Bug Fixes * Conversion + Bug in constructing Index with an iterator or generator (GH21470) + Bug in Series.nlargest() for signed and unsigned integer dtypes when the minimum value is present (GH21426) * Indexing + Bug in Index.get_indexer_non_unique() with categorical key (GH21448) + Bug in comparison operations for MultiIndex where error was raised on equality / inequality comparison involving a MultiIndex with nlevels == 1 (GH21149) + Bug in DataFrame.drop() behaviour is not consistent for unique and non-unique indexes (GH21494) + Bug in DataFrame.duplicated() with a large number of columns causing a ‘maximum recursion depth exceeded’ (GH21524). 
* I/O + Bug in read_csv() that caused it to incorrectly raise an error when nrows=0, low_memory=True, and index_col was not None (GH21141) + Bug in json_normalize() when formatting the record_prefix with integer columns (GH21536) * Categorical + Bug in rendering Series with Categorical dtype in rare conditions under Python 2.7 (GH21002) * Timezones + Bug in Timestamp and DatetimeIndex where passing a Timestamp localized after a DST transition would return a datetime before the DST transition (GH20854) + Bug in comparing DataFrame`s with tz-aware :class:`DatetimeIndex columns with a DST transition that raised a KeyError (GH19970) * Timedelta + Bug in Timedelta where non-zero timedeltas shorter than 1 microsecond were considered False (GH21484) * Wed Jun 13 2018 toddrme2178AATTgmail.com- Update to 0.23.1 + Fixed Regressions * Reverted change to comparing a Series holding datetimes and a datetime.date object * Reverted the ability of to_sql() to perform multivalue inserts as this caused regression in certain cases (GH21103). In the future this will be made configurable. * Fixed regression in the DatetimeIndex.date and DatetimeIndex.time attributes in case of timezone-aware data: DatetimeIndex.time returned a tz-aware time instead of tz-naive (GH21267) and DatetimeIndex.date returned incorrect date when the input date has a non-UTC timezone (GH21230). * Fixed regression in pandas.io.json.json_normalize() when called with None values in nested levels in JSON, and to not drop keys with value as None (GH21158, GH21356). 
* Bug in to_csv() causes encoding error when compression and encoding are specified (GH21241, GH21118) * Bug preventing pandas from being importable with -OO optimization (GH21071) * Bug in Categorical.fillna() incorrectly raising a TypeError when value the individual categories are iterable and value is an iterable (GH21097, GH19788) * Fixed regression in constructors coercing NA values like None to strings when passing dtype=str (GH21083) * Regression in pivot_table() where an ordered Categorical with missing values for the pivot’s index would give a mis-aligned result (GH21133) * Fixed regression in merging on boolean index/columns (GH21119). + Performance Improvements * Improved performance of CategoricalIndex.is_monotonic_increasing(), CategoricalIndex.is_monotonic_decreasing() and CategoricalIndex.is_monotonic() (GH21025) * Improved performance of CategoricalIndex.is_unique() (GH21107) + Bug fixes * Groupby/Resample/Rolling > Bug in DataFrame.agg() where applying multiple aggregation functions to a DataFrame with duplicated column names would cause a stack overflow (GH21063) > Bug in pandas.core.groupby.GroupBy.ffill() and pandas.core.groupby.GroupBy.bfill() where the fill within a grouping would not always be applied as intended due to the implementations’ use of a non-stable sort (GH21207) > Bug in pandas.core.groupby.GroupBy.rank() where results did not scale to 100% when specifying method=\'dense\' and pct=True > Bug in pandas.DataFrame.rolling() and pandas.Series.rolling() which incorrectly accepted a 0 window size rather than raising (GH21286) * Data-type specific > Bug in Series.str.replace() where the method throws TypeError on Python 3.5.2 (:issue: 21078) > Bug in Timedelta: where passing a float with a unit would prematurely round the float precision (:issue: 14156) > Bug in pandas.testing.assert_index_equal() which raised AssertionError incorrectly, when comparing two CategoricalIndex objects with param check_categorical=False (GH19776) * Sparse > 
Bug in SparseArray.shape which previously only returned the shape of SparseArray.sp_values (GH21126) * Indexing > Bug in Series.reset_index() where an appropriate error was not raised with an invalid level name (GH20925) > Bug in interval_range() when start/periods or end/periods are specified with float start or end (GH21161) > Bug in MultiIndex.set_names() where an error was raised for a MultiIndex with nlevels == 1 (GH21149) > Bug in IntervalIndex constructors where creating an IntervalIndex from categorical data was not fully supported (GH21243, GH21253) > Bug in MultiIndex.sort_index() which was not guaranteed to sort correctly with level=1; this was also causing data misalignment in particular DataFrame.stack() operations (GH20994, GH20945, GH21052) * Plotting > New keywords (sharex, sharey) to turn on/off sharing of x/y-axis by subplots generated with pandas.DataFrame().groupby().boxplot() (GH20968) * I/O > Bug in IO methods specifying compression='zip' which produced uncompressed zip archives (GH17778, GH21144) > Bug in DataFrame.to_stata() which prevented exporting DataFrames to buffers and most file-like objects (GH21041) > Bug in read_stata() and StataReader which did not correctly decode utf-8 strings on Python 3 from Stata 14 files (dta version 118) (GH21244) > Bug in IO JSON read_json() where reading an empty JSON schema with orient='table' back to a DataFrame caused an error (GH21287) * Reshaping > Bug in concat() where an error was raised when concatenating Series with numpy scalar and tuple names (GH21015) > Bug in concat() warning message providing the wrong guidance for future behavior (GH21101) * Other > Tab completion on Index in IPython no longer outputs deprecation warnings (GH21125) > Bug preventing pandas being used on Windows without C++ redistributable installed (GH21106) * Mon May 21 2018 toddrme2178AATTgmail.com- Update dependencies * Thu May 17 2018 tchvatalAATTsuse.com- Update to 0.23.0: * Round-trippable JSON format with ‘table’ orient. 
* Instantiation from dicts respects order for Python 3.6+. * Dependent column arguments for assign. * Merging / sorting on a combination of columns and index levels. * Extending Pandas with custom types. * Excluding unobserved categories from groupby. * Changes to make output shape of DataFrame.apply consistent. * Thu May 17 2018 tchvatalAATTsuse.com- Do not bother generating the pandas docs, as upstream already provides them in both HTML and PDF; just point to the URL * Thu Jan 11 2018 tchvatalAATTsuse.com- Drop commented code to allow us a py3-only build * Wed Jan 03 2018 arunAATTgmx.de- specfile: * update copyright year- update to version 0.22.0: * Pandas 0.22.0 changes the handling of empty and all-NA sums and products. The summary is that: + The sum of an empty or all-NA Series is now 0 + The product of an empty or all-NA Series is now 1 + We’ve added a min_count parameter to .sum() and .prod() controlling the minimum number of valid values for the result to be valid. If fewer than min_count non-NA values are present, the result is NA. The default is 0. To return NaN, the 0.21 behavior, use min_count=1.
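
The 0.22.0 change to empty and all-NA sums and products can be illustrated with a minimal sketch. It assumes pandas 0.22 or later with numpy installed; the variable names are made up for illustration and are not part of the changelog.

```python
import numpy as np
import pandas as pd

# Illustrative inputs: an empty Series and an all-NA Series.
empty = pd.Series([], dtype="float64")
all_na = pd.Series([np.nan, np.nan])

# With the default min_count=0, the sum is 0 and the product is 1.
empty_sum = empty.sum()
empty_prod = empty.prod()
all_na_sum = all_na.sum()
all_na_prod = all_na.prod()

# min_count=1 requires at least one non-NA value, so these results are NaN
# (the 0.21 behavior described in the entry above).
strict_sum = all_na.sum(min_count=1)
strict_prod = empty.prod(min_count=1)

# A Series with exactly one valid value satisfies min_count=1 but not min_count=2.
partial = pd.Series([1.0, np.nan])
partial_ok = partial.sum(min_count=1)
partial_na = partial.sum(min_count=2)

print(empty_sum, empty_prod, all_na_sum, all_na_prod)
print(strict_sum, strict_prod, partial_ok, partial_na)
```

Code that relied on NaN to signal "no data" would, under this reading, need to pass min_count=1 explicitly after upgrading.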