29 nov 2023

For my assignment today, I used regression techniques to predict property values. Think of it like estimating the worth of a house by looking at several of its attributes: I fit lines and curves to the data to make forecasts, much like a very smart calculator that can estimate a home's value from its size, location, and so on. The tool I am building will help people make property decisions by giving a buyer or seller a reasonable estimate of a residence's value. This phase felt like placing the final piece of the puzzle: it is where all of the math and statistics come together into something genuinely practical. So today I not only developed new mathematical skills, but also built something that could make a real difference in property assessment.
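As a rough illustration of what fitting a line to property data means here, below is a minimal sketch using scikit-learn's LinearRegression. The feature names and numbers are hypothetical stand-ins, not my actual assessment data.

```python
# Minimal sketch: predicting property value from a few features.
# Feature names and data are hypothetical, for illustration only.
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data: [square_feet, distance_to_city_km]
X = np.array([[1200, 5.0], [1500, 3.2], [900, 8.1], [2000, 2.5], [1100, 6.4]])
y = np.array([250_000, 320_000, 180_000, 410_000, 230_000])  # assessed values

model = LinearRegression().fit(X, y)
estimate = model.predict([[1400, 4.0]])  # estimate for a new property
print(f"Estimated value: ${estimate[0]:,.0f}")
```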

27 nov 2023

Analyzing property values: uncovering insights with Z-tests

Today for my project, I learned how important the property assessment dataset is to my data science report. I immersed myself in advanced statistical techniques, focusing on how to use the Z-test to uncover significant insights. To begin with, I cleaned and preprocessed the property assessment dataset; this step was necessary for my subsequent analyses to be reliable. The Z-test, a powerful statistical tool, served as the basis for my research and allowed me to draw sound conclusions about population means. As I used the Z-test to evaluate different aspects of property values, my appreciation of its role in deriving reliable conclusions from data grew. The rigor of the process showed me how important statistical techniques are for extracting useful information from large datasets.
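For reference, here is a minimal sketch of a one-sample Z-test using statsmodels; the neighborhood values and the citywide mean are made up for illustration.

```python
# Minimal sketch: one-sample Z-test on assessed values.
# The data and the citywide mean are hypothetical.
import numpy as np
from statsmodels.stats.weightstats import ztest

neighborhood_values = np.array([310_000, 295_000, 330_000, 305_000,
                                340_000, 315_000, 300_000, 325_000])
citywide_mean = 300_000  # assumed population mean under H0

z_stat, p_value = ztest(neighborhood_values, value=citywide_mean)
print(f"z = {z_stat:.2f}, p = {p_value:.4f}")
# A small p-value would suggest this neighborhood's mean assessment
# differs significantly from the citywide mean.
```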

24 nov 2023

I worked on the survey dataset: I split it into training and test sets, calculated the average service time for each survey type, and defined the features and target variable for a linear regression model. The script then predicts service times for the test set, evaluates the model with RMSE (root mean square error), and uses matplotlib to display the regression line. The final result is a figure showing how average service time is predicted by a linear regression model depending on the survey type.
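A minimal sketch of that workflow looks roughly like this; the column names and numbers are hypothetical stand-ins for the survey dataset.

```python
# Minimal sketch: train/test split, linear regression, RMSE, and plot.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

df = pd.DataFrame({
    "survey_type": [1, 1, 2, 2, 3, 3, 4, 4, 1, 2, 3, 4],   # encoded survey type
    "service_time": [12, 14, 20, 22, 31, 29, 40, 43, 13, 21, 30, 41],
})

X = df[["survey_type"]].to_numpy()   # feature
y = df["service_time"].to_numpy()    # target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
print(f"RMSE: {rmse:.2f}")

# Plot the fitted regression line over the test points
xs = np.linspace(X.min(), X.max(), 100).reshape(-1, 1)
plt.scatter(X_test, y_test, label="actual")
plt.plot(xs, model.predict(xs), color="red", label="regression line")
plt.xlabel("survey type"); plt.ylabel("service time"); plt.legend()
plt.show()
```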

22 nov 2023

Geospatial analysis is a powerful way to gain insight into data with a geographic component. It involves examining and interpreting information in relation to its spatial context, drawing on a variety of tools and techniques, such as GPS data, satellite imagery, and geographic information systems (GIS), to analyze and visualize spatial data. The integration of location-based data enables professionals from fields such as epidemiology, logistics, environmental science, and urban planning to gain a holistic understanding of complex problems. By using geospatial analysis, practitioners can identify patterns, correlations, and trends that may be hidden from traditional data analysis methods. A spatial perspective allows deeper exploration of the relationships between data points, leading to informed decision making. For example, in epidemiology, geographic monitoring of disease outbreaks can provide critical insights into disease spread and containment. One of the main strengths of geospatial analysis is its ability to display data visually on maps. This visualization helps identify spatial patterns, trends, and relationships between geographic features that may not be apparent in tabular data. As a result, experts can discover valuable information and connections that contribute to a deeper understanding of the underlying dynamics.
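As a small illustration of the mapping idea, here is a sketch using GeoPandas to plot hypothetical point data; the coordinates and case counts are invented for the example.

```python
# Minimal sketch: putting point data on a map with GeoPandas.
# Cities, coordinates, and case counts are made up for illustration.
import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    "city": ["Boston", "Lowell", "Worcester"],
    "lon": [-71.06, -71.32, -71.80],
    "lat": [42.36, 42.63, 42.26],
    "cases": [120, 45, 80],   # e.g. disease counts in an epidemiology study
})

gdf = gpd.GeoDataFrame(df, geometry=gpd.points_from_xy(df.lon, df.lat), crs="EPSG:4326")
gdf.plot(column="cases", cmap="Reds", markersize=gdf["cases"], legend=True)
plt.title("Hypothetical case counts by location")
plt.show()
```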

20 nov 2023

Time series analysis is an important method for understanding temporal data and includes components such as identifying trends, recurring patterns (seasonality), and long-term cyclical movements. Smoothing techniques such as moving averages and exponential smoothing aid analysis by highlighting trends. Decomposition breaks the data into trend, seasonal, and residual components for clarity. Ensuring stationarity, where statistical properties remain constant over time, often requires differencing or transformations. The autocorrelation and partial autocorrelation functions identify dependencies between observations at different lags. Forecasting methods are central: ARIMA models combine autoregressive, differencing (integrated), and moving-average components, exponential smoothing methods contribute to accurate forecasts, and advanced models such as Prophet and Long Short-Term Memory (LSTM) networks extend forecasting capabilities further. Applications of time series analysis include financial forecasting, demand forecasting for inventory management, and energy consumption optimization. Overall, time series analysis provides a comprehensive framework for gaining insights, making informed decisions, and accurately predicting trends in various time series data.
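To make the decomposition and autocorrelation steps concrete, here is a minimal sketch on a synthetic monthly series using statsmodels; the series itself is generated, not real data.

```python
# Minimal sketch: decomposition and autocorrelation on a synthetic series.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import seasonal_decompose
from statsmodels.graphics.tsaplots import plot_acf

# Synthetic monthly series: trend + seasonality + noise
rng = np.random.default_rng(0)
idx = pd.date_range("2018-01-01", periods=60, freq="MS")
series = pd.Series(np.arange(60) * 0.5
                   + 10 * np.sin(np.arange(60) * 2 * np.pi / 12)
                   + rng.normal(0, 1, 60), index=idx)

result = seasonal_decompose(series, model="additive", period=12)
result.plot()              # trend, seasonal, and residual panels
plot_acf(series, lags=24)  # autocorrelation at lags up to 24 months
plt.show()
```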

17 nov 2023

The ARIMA (AutoRegressive Integrated Moving Average) model, a powerful time series forecasting method, consists of three main components. The AutoRegressive (AR) element captures the relationship between an observation and its lagged counterparts, with "p" denoting the number of lagged observations considered; a larger p indicates a more complex structure involving longer-term dependencies. The Integrated (I) component applies differencing to achieve stationarity, which is crucial for time series analysis; "d" represents the order of differencing, i.e., how many times the series is differenced. The Moving Average (MA) component accounts for the correlation between an observation and past forecast errors, with "q" giving the number of lagged errors included. Expressed as ARIMA(p, d, q), the model can be applied to finance, environmental research, and any time-dependent data analysis. The modeling process includes data exploration, stationarity checking, parameter selection, model training, validation and testing, and finally forecasting. ARIMA models are invaluable tools for analysts and data scientists, providing a systematic framework for effective time series forecasting and analysis.
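A minimal sketch of fitting an ARIMA(p, d, q) model with statsmodels might look like this; the order (1, 1, 1) and the synthetic series are illustrative, not from a real dataset.

```python
# Minimal sketch: fitting an ARIMA(p, d, q) model with statsmodels.
# The order (1, 1, 1) is illustrative; in practice it would be chosen
# from ACF/PACF plots or an information criterion such as AIC.
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(42)
idx = pd.date_range("2020-01-01", periods=100, freq="D")
series = pd.Series(np.cumsum(rng.normal(0.2, 1.0, 100)), index=idx)  # trending series

model = ARIMA(series, order=(1, 1, 1))   # AR order p=1, differencing d=1, MA order q=1
fitted = model.fit()
print(fitted.summary())
forecast = fitted.forecast(steps=10)     # forecast the next 10 days
print(forecast)
```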

15 nov 2023

Today I learned about time series. A time series is a chronological sequence of data points consisting of measurements or observations made at uniform, regular intervals. This data format is widely used in fields such as environmental science, biology, finance, and economics. When dealing with time series, the main goal is to understand the patterns, trends, and behaviors that appear in the data over time; time series analysis involves modeling, interpreting, and predicting future values based on historical trends. The life cycle of a forecasting project covers predicting future trends or outcomes from historical data, typically through steps such as data collection, exploratory data analysis (EDA), model selection, model training, validation and testing, deployment, monitoring, and maintenance. This cyclical approach keeps forecasts accurate and up to date through regular checking and correction. Baseline models are simple reference points for more complex models: they provide a basic forecast that helps evaluate the performance of more advanced models.
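To illustrate what a baseline model can look like, here is a minimal sketch of a naive "last value" forecast on a made-up series; any more advanced model should beat this benchmark.

```python
# Minimal sketch: a naive "last value" baseline forecast.
import numpy as np

history = np.array([112, 118, 132, 129, 121, 135, 148, 148, 136, 119])
train, test = history[:-3], history[-3:]

naive_forecast = np.repeat(train[-1], len(test))  # repeat the last observed value
rmse = np.sqrt(np.mean((test - naive_forecast) ** 2))
print(f"Naive baseline RMSE: {rmse:.2f}")
```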

13 nov 2023

Today in class we delved into the exciting world of time series analysis. This field of statistics provides valuable insights into how data evolves over time and gives us the ability to predict future patterns from historical data. We explored key tools such as moving averages and autoregressive models, which act almost like magic for deciphering the patterns embedded in sequences of data points. The importance of time series analysis extends beyond mathematical concepts to real-world applications, such as identifying trends in the weather or the stock market. The ability to recognize data patterns, seasonal variations, and anomalies feels like a superpower in data science, one that lets us make informed decisions and plan for the future using information from historical data.
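As a small taste of one of those tools, here is a sketch of a moving average smoothing a noisy synthetic series with pandas; the data is generated, purely for illustration.

```python
# Minimal sketch: a moving average smoothing a noisy series with pandas.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
series = pd.Series(np.sin(np.linspace(0, 6, 120)) + rng.normal(0, 0.3, 120))

smoothed = series.rolling(window=12).mean()  # 12-point moving average
series.plot(alpha=0.5, label="raw")
smoothed.plot(label="moving average")
plt.legend(); plt.show()
```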

10 nov 2023

In our decision tree analysis, we tried to predict individual behavior patterns across different event scenarios, including cases where people did not flee and cases where they fled by vehicle, on foot, or by some other means. With an accuracy of almost 67 percent, the model predicted correctly in most cases, which highlights its overall effectiveness in interpreting different behavioral responses across event contexts. A closer examination of the confusion matrix, however, revealed some misclassifications. The model correctly predicted 676 escape episodes and identified 37 cases where people did not escape, but it also predicted an escape in 125 cases where none actually occurred, incorrectly predicted escape on foot in 33 cases, and incorrectly indicated escape by car in 136 cases. These misclassifications point to specific parts of the model that need improvement to increase its reliability and forecast accuracy.
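For context, a minimal sketch of this kind of evaluation with scikit-learn is shown below; the features and labels are random stand-ins rather than our actual dataset, so the numbers it prints will not match the ones above.

```python
# Minimal sketch: decision tree classification with a confusion matrix.
# Features and labels are random stand-ins, not the real event data.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4))                                       # stand-in event features
y = rng.choice(["not fleeing", "car", "foot", "other"], size=1000)   # flee-mode labels

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
clf = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X_train, y_train)
y_pred = clf.predict(X_test)

print("Accuracy:", accuracy_score(y_test, y_pred))
print(confusion_matrix(y_test, y_pred, labels=["not fleeing", "car", "foot", "other"]))
# Each row of the matrix is a true class; off-diagonal counts are the
# misclassifications of the kind discussed above.
```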