Working with time series data involves a consistent set of tasks. Raw data arrives at irregular intervals and needs resampling. Anomalous spikes need to be identified before they distort any downstream analysis. Trends and seasonal patterns need separating from noise. And when you have multiple series, understanding how they relate to each other takes more than a quick visual scan.
The five Python scripts below handle these common tasks. Each is designed to work with standard CSV or Excel inputs, produce clean outputs, and be straightforward to configure for different datasets.
You can get all the scripts on GitHub.
Real-world time series data rarely arrives at uniform intervals. Sensor readings, transaction logs, and event streams have gaps, duplicates, and inconsistent timestamps. Before any meaningful analysis, the data needs to be aligned to a consistent frequency.
Takes a CSV or Excel file with a datetime column and one or more value columns, resamples it to a frequency you specify, and applies aggregation functions per column. Fills or flags gaps and writes a clean output file with a summary of what was changed.
The script parses the datetime column with pandas, sets it as the index, and uses resample() with configurable frequency strings. Per-column aggregation methods are defined in a config, so a temperature column can use mean while a sales column uses sum. Missing intervals after resampling are handled with forward-fill, interpolation, or explicit NaN flagging depending on your setting. A gap report lists every interval where data was absent in the original.
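A minimal sketch of that flow, assuming an illustrative `timestamp` column, a `temperature` column aggregated by mean, a `sales` column aggregated by sum, and an hourly target frequency:

```python
import pandas as pd

# Per-column aggregation config (illustrative column names)
AGG_CONFIG = {"temperature": "mean", "sales": "sum"}
FREQ = "1h"  # pandas frequency string

df = pd.DataFrame({
    "timestamp": pd.to_datetime(
        ["2024-01-01 00:05", "2024-01-01 00:40",
         "2024-01-01 02:10", "2024-01-01 02:50"]),
    "temperature": [20.1, 20.5, 21.0, 21.4],
    "sales": [3, 5, 2, 4],
})

df = df.set_index("timestamp").sort_index()
resampled = df.resample(FREQ).agg(AGG_CONFIG)

# Gap report: intervals that contained no source rows at all.
# (Note: sum over an empty bin yields 0 while mean yields NaN,
# so counting rows per bin is the reliable way to find gaps.)
counts = df.resample(FREQ).size()
gaps = counts.index[counts == 0]

# Fill missing values (forward-fill shown; interpolation or
# explicit NaN flagging are the other options described above)
clean = resampled.ffill()
```

Here the 01:00 interval has no source rows, so it appears in the gap report and its missing temperature is forward-filled from the 00:00 bin.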
⏩ Get the time series resampler script
A single anomalous spike or drop in a time series can skew averages, break downstream models, and mask real trends. Identifying these points manually by scanning plots or raw values is impractical at any meaningful data volume.
Scans one or more numeric columns in a time series file and flags data points that fall outside expected bounds using a choice of three detection methods: z-score, interquartile range (IQR), or rolling statistics. Outputs an annotated file with anomaly flags and a separate summary report.
The z-score method flags points where the standardized value exceeds a configurable threshold (default ±3). The interquartile range (IQR) method flags points more than 1.5× the IQR below the first quartile or above the third quartile. The rolling method computes a moving mean and standard deviation over a configurable window and flags points that deviate significantly from the local context, which is useful for series with strong trends or seasonality. All three can be run together; the output column records which method flagged each point. An optional --plot flag saves a chart for each column with anomalies highlighted.
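The three methods can be sketched as small functions on a pandas Series; the thresholds match the defaults above, and the sample data is illustrative:

```python
import pandas as pd

def zscore_flags(s: pd.Series, threshold: float = 3.0) -> pd.Series:
    # A single extreme point inflates the global std and can mask itself,
    # which is one reason to run multiple methods together.
    z = (s - s.mean()) / s.std()
    return z.abs() > threshold

def iqr_flags(s: pd.Series, k: float = 1.5) -> pd.Series:
    q1, q3 = s.quantile(0.25), s.quantile(0.75)
    iqr = q3 - q1
    return (s < q1 - k * iqr) | (s > q3 + k * iqr)

def rolling_flags(s: pd.Series, window: int = 5,
                  threshold: float = 3.0) -> pd.Series:
    # Deviation from a centered moving mean, in local std units
    mean = s.rolling(window, center=True, min_periods=1).mean()
    std = s.rolling(window, center=True, min_periods=1).std()
    return (s - mean).abs() > threshold * std

values = pd.Series([10, 11, 10, 12, 95, 11, 10, 12, 11, 10])
print(values.index[iqr_flags(values)].tolist())  # → [4]
```

Combining the boolean outputs per point gives the "which method flagged it" column described above.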
⏩ Get the anomaly detector script
A time series is usually a combination of several components: a long-term trend, a repeating seasonal pattern, and irregular residual noise. Analyzing the series as a whole makes it hard to understand any one component clearly.
Applies classical time series decomposition to a numeric column, separating the observed series into trend, seasonal, and residual components. Supports both additive and multiplicative decomposition models. Exports each component as a column in the output file and saves a multi-panel chart.
The script uses statsmodels.tsa.seasonal.seasonal_decompose() on the target column after resampling to a consistent frequency if needed. The decomposition period is configurable. Additive decomposition suits series where seasonal variation is roughly constant in magnitude; multiplicative suits series where it scales with the trend level. The output Excel file contains the original series alongside the three extracted components. The saved chart shows all four panels stacked.
⏩ Get the time series decomposition script
Producing a forecast from a time series typically involves model selection, parameter tuning, and validation steps that require statistical knowledge to get right. Setting this up from scratch each time is time-consuming, and doing it informally produces forecasts that are hard to trust or reproduce.
Fits a seasonal autoregressive integrated moving average (SARIMA) model to a time series column, generates a forecast for a configurable number of periods, and writes results to an output file including the forecast values, confidence intervals, and basic accuracy metrics on a held-out validation period. Optionally auto-selects model parameters using Akaike information criterion (AIC) minimization.
The script uses statsmodels.tsa.statespace.sarimax.SARIMAX for model fitting. When --auto-order is set, it performs a lightweight grid search over a configurable range of ARIMA and seasonal parameters, selecting the combination with the lowest AIC. The series is split into a training set and a held-out test set configurable as a number of periods. Accuracy is reported on the test set using mean absolute error (MAE) and root mean squared error (RMSE) before the final model is re-fit on the full series to produce the forward forecast. Results include the point forecast and 95% confidence intervals. A forecast chart is saved showing the historical series, the test period actuals vs. predictions, and the forward forecast with confidence bands.
⏩ Get the SARIMA forecasting script
When working with several related time series (different products, regions, sensors, or metrics), understanding how they move together requires more than viewing them on the same chart. Correlation analysis, lag relationships, and aligned summary statistics all need computing, and doing this across many pairs of series quickly becomes unwieldy.
Takes a file with multiple time series columns, aligns them to a common frequency, and produces a multi-tab comparison report covering pairwise correlations, lag analysis (cross-correlation up to a configurable lag), and a side-by-side summary statistics table. Charts are generated for the top correlated pairs.
The script uses pandas to align all columns to a shared datetime index after resampling. Pairwise Pearson and Spearman correlations are computed and written to a correlation matrix tab. Cross-correlation is computed for each pair up to a configurable maximum lag, identifying the lag at which each pair peaks, which is useful for finding leading/lagging relationships. A summary tab includes mean, standard deviation, min, max, and trend direction (positive/negative slope from a linear fit) for each series. The top five most correlated pairs each get a dual-axis line chart in a dedicated charts tab.
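A sketch of the correlation and lag-analysis steps, assuming two illustrative daily columns where one lags the other by three periods:

```python
import numpy as np
import pandas as pd

# Two aligned daily series; b is a noisy, 3-day-delayed copy of a
idx = pd.date_range("2024-01-01", periods=100, freq="D")
rng = np.random.default_rng(2)
a = pd.Series(rng.normal(0, 1, 100), index=idx)
b = a.shift(3) + rng.normal(0, 0.2, 100)
df = pd.DataFrame({"a": a, "b": b})

# Pairwise correlation matrices (NaN pairs are dropped automatically)
pearson = df.corr(method="pearson")
spearman = df.corr(method="spearman")

def peak_lag(x: pd.Series, y: pd.Series, max_lag: int = 6) -> int:
    """Lag (in periods) at which y best matches x;
    positive means y lags x."""
    corrs = {lag: x.corr(y.shift(-lag))
             for lag in range(-max_lag, max_lag + 1)}
    return max(corrs, key=lambda lag: abs(corrs[lag]))

print(peak_lag(df["a"], df["b"]))  # → 3

# Trend direction for the summary tab: sign of a linear-fit slope
slope_a = np.polyfit(np.arange(len(df)), df["a"].to_numpy(), 1)[0]
```

Sweeping the shift and recording where the correlation peaks is what surfaces the leading/lagging relationships mentioned above; here the peak at lag 3 recovers the built-in delay.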
⏩ Get the multi-series comparison script
These five scripts cover the core tasks involved in working with time series data. They are designed to be used independently or sequentially: resample first, detect anomalies, decompose, forecast, then compare across series.
To get started, first download the script you plan to use and install all the dependencies listed in its README file. Next, update the configuration section at the top of the script so it aligns with your specific data and column names. Before running it on your full dataset, test the script on a small sample to confirm the output is correct. Once you're satisfied with the results, you can schedule it or integrate it into your existing data pipeline.
Happy analyzing!
Bala Priya C is a developer and technical writer from India. She likes working at the intersection of math, programming, data science, and content creation. Her areas of interest and expertise include DevOps, data science, and natural language processing. She enjoys reading, writing, coding, and coffee! Currently, she's working on learning and sharing her knowledge with the developer community by authoring tutorials, how-to guides, opinion pieces, and more. Bala also creates engaging resource overviews and coding tutorials.