Data is all around us. It always has been. However, it is only with the development of the digital age that it has really come into focus. Prior to the past decade, data was mostly thought of as laboratory notes and school science projects, but now it is collected and traded as a valuable commodity. Technology and marketing companies are interested in collecting any information that can be observed and measured, in the hope that they can benefit from it through advertising and similar uses. Personal information is a prime target, something big business and governments collect and collate for future reference. While rarely discussed in the ‘90s, data is now an important part of many of our lives, and its collection and usage affect us on a daily basis.
Why is Data so Important?
So why is there so much focus on the collection of data for analysis? Forecasting is one of the major reasons. People always want to know what is going to happen in the future, and by extrapolating from up-to-date information, reasonably accurate predictions can be made. Data-based forecasting is so important that business proposals in the digital era without it are usually seen as incomplete.
A time series is a collection of measurements of a well-defined quantity, obtained by measuring it repeatedly over time. For example, measuring the value of retail sales at a particular store each month of the year would produce a time series. This is because sales revenue is well defined and consistently measured at equally spaced intervals. Data collected irregularly or only once is not a time series in this sense. A time series can be decomposed into three components: the trend component (long-term direction), the seasonal component (systematic, calendar-related movements) and the irregular component (unsystematic, short-term fluctuations). The decomposition is important in data analysis, because what we see in one part of a chart doesn’t necessarily reflect the whole picture. For instance, consider turkey sales over a year. If we looked only at turkey sales around Thanksgiving, we would estimate sales over the entire year to be much higher than they actually are. Conversely, if we looked only at turkey sales in February, our estimate would be lower than the actual figure. This is why it is important to look at an entire time series to get a real long-term picture.
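To make the three components concrete, here is a minimal sketch of a classical additive decomposition in plain Python. The monthly turkey sales figures are invented for illustration (a steady upward trend plus a November spike), and the function name `decompose_additive` is hypothetical. The sketch assumes an even period and at least two full cycles of data; real analyses would normally use a statistics library and real measurements.

```python
from statistics import mean


def decompose_additive(series, period=12):
    """Split a series into trend, seasonal and irregular parts.

    Assumes an additive model (value = trend + seasonal + irregular),
    an even period, and at least two full periods of data.
    """
    n = len(series)
    half = period // 2

    # Trend: centred moving average over one full period. The two end
    # points of each window get half weight so the window spans exactly
    # one period. Undefined for the first and last half-period.
    trend = [None] * n
    for t in range(half, n - half):
        window = series[t - half : t + half + 1]
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        trend[t] = total / period

    # Seasonal: average the detrended values for each position in the
    # cycle, then centre them so they sum to zero over one period.
    detrended = {p: [] for p in range(period)}
    for t in range(n):
        if trend[t] is not None:
            detrended[t % period].append(series[t] - trend[t])
    raw = [mean(detrended[p]) for p in range(period)]
    centre = mean(raw)
    seasonal = [raw[t % period] - centre for t in range(n)]

    # Irregular: whatever is left after removing trend and seasonal.
    irregular = [
        None if trend[t] is None else series[t] - trend[t] - seasonal[t]
        for t in range(n)
    ]
    return trend, seasonal, irregular


# Invented data: three years of monthly sales, index 0 = January.
# Sales grow by 2 units a month and jump by 300 every November.
sales = [100 + 2 * t + (300 if t % 12 == 10 else 0) for t in range(36)]

trend, seasonal, irregular = decompose_additive(sales)
print("Seasonal effect by month:", [round(s, 1) for s in seasonal[:12]])

# The turkey example: scaling one month up to a year is misleading.
actual_annual = sum(sales[:12])
naive_annual = 12 * sales[10]    # estimate based on November only
february_based = 12 * sales[1]   # estimate based on February only
print(actual_annual, naive_annual, february_based)
```

In this made-up series the November-based estimate comes out well above the true first-year total and the February-based estimate comes out below it, which is the pitfall described above.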
Data can behave in a large number of ways. Depending on the situation, it may follow a regular rhythm or be highly erratic. It may be stiff and insensitive to external influence, or it may be flexible and very sensitive to externalities. It may also simply behave erratically, with no major controlling parameters. We’ve discussed how data can affect a business or an organization through forecasting. Now we’re going to talk about the second reason why data is important in business. A collection of data can be seen as more than just a time series chart; people also need to know what correlations can be drawn from it.
Ellison and Marks presented a simple correlation case in a 2013 paper entitled Practical Data Analysis: Business Case Studies. They examined a data set containing measurements taken on customers at an Australian women’s clothing store that specialises in boho maxi dresses and other gypsy clothing. The variables included the number of clothing items purchased, the total amount spent, the gender of the bill payer, and the day of the week. They wanted to see which variable or variables had the strongest influence on purchases. They also compared the amounts spent by women with those spent by men who were shopping for gifts for women.
From their study we learned that women tended to purchase more expensive items from the store later in the week, possibly because the weekend was approaching, whereas men’s purchases were more erratic and did not follow any sort of pattern. For single purchases, men spent more on average than women; however, women were much more likely to purchase several items of clothing. Over the week the study was conducted, several women bought more than 5 items at a time, including boho dresses, pants, tops and accessories, although this was definitely not the norm. Much can be learned about consumer behavior in women’s clothing outlets by reviewing this study, and all of this information is useful to the owners of the store (and their competitors), who can adjust their services to work with the data and increase profitability.
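As a rough illustration of the kind of calculation behind such findings, the sketch below computes a Pearson correlation between items purchased and amount spent, and the average spend per transaction by gender. The purchase records are invented for this example and are not taken from the study; the helper name `pearson` and the tuple layout are likewise assumptions.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical records: (items purchased, total spent, gender of payer).
purchases = [
    (1, 60.0, "M"), (1, 45.0, "F"), (2, 80.0, "F"), (3, 120.0, "F"),
    (1, 70.0, "M"), (5, 210.0, "F"), (2, 95.0, "F"), (1, 55.0, "M"),
]

items = [p[0] for p in purchases]
spent = [p[1] for p in purchases]
r = pearson(items, spent)
print(f"Correlation between items and amount spent: {r:.2f}")

# Average spend per transaction, grouped by gender of the bill payer.
by_gender = {}
for _, amount, gender in purchases:
    by_gender.setdefault(gender, []).append(amount)
average_spend = {g: sum(v) / len(v) for g, v in by_gender.items()}
print(average_spend)
```

A coefficient close to 1 means the two variables rise together, close to -1 means one falls as the other rises, and close to 0 means no linear relationship. Correlation alone does not show that one variable causes the other.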
It is not just small businesses that can benefit from data. Big corporations have recently come under fire for their collection of people’s personal data, to the point where some people consider what they are doing to be an invasion of privacy. Corporations profit from the sale of personal information that can be used to target people for marketing purposes, to profile them, and for other more nefarious ends. Anybody who has heard of the allegations of Edward Snowden can attest to that.
The Future of Data
Around a decade ago, a new kind of profession emerged: the “data scientist”. A data scientist represents an evolution from the business or data analyst role. These people are familiar with statistical analysis and trend prediction and generally have a background in computer science and applications, modeling, statistics, analytics and math. We are talking about a powerful career where people can, in effect, predict the future through forecasting and analysis. A good data scientist will not just address business problems; they can also pick the right problems to solve, the ones that will bring the most benefit to the organisation.
The work of a data scientist generally covers the following aspects:
- Formulating context-relevant questions and hypotheses to guide scientific research
- Interpreting data sets to produce statistical evidence that can then be communicated in written form
- Building models based on new data types, experimental design, and statistical inference
Aside from proficiency in computer science, math, and statistics, a good data scientist must have curiosity, creativity, focus and attention to detail.
Data scientists are needed wherever data is involved in an operation. Organisations that hire data scientists include:
- Federal, provincial/state and municipal government departments
- Online marketing companies, especially those involved in advertising
- Transportation companies
- Telecommunications companies
- Insurance, finance, and banking organisations
- Real estate companies, especially those involved in property management
- Management consulting companies
- Advertising agencies
- Manufacturing companies
- Construction companies
- Retail companies with potential for repeat customers such as hairdressers
- Utility companies
- Oil, gas and mining companies
- Online retail stores selling repeat purchase items such as t-shirts
- Hospitals and health care organizations
- Colleges and universities
The list goes on. As long as it remains lucrative for people to collect data, there will be people collecting it. Right now you yourself are probably included in the datasheets of hundreds, if not thousands, of companies and associations. While you may not be aware of it, data is all around you. It can be used to benefit your life, your business and your education; however, it can also be used to do the opposite. Data is the next big thing in business and in our consumeristic societies. Numbers are not just numbers; they can speak. It’s up to us to listen.
Even if you’re not doing it, you can be sure somebody else is.